A recent Cochrane review assessed the efficacy of methylphenidate for attention-deficit/hyperactivity disorder (ADHD) in children and adolescents. Notwithstanding the moderate-to-large effect sizes for ADHD symptom reduction found in the meta-analysis, the authors concluded that the quality of the evidence is low and that the true magnitude of these effects therefore remains uncertain. We identified a number of major concerns with the review, in the domains of study inclusion, approaches to quality assessment, interpretation of data relating to serious adverse events and the clinical implications of the reported effects. We also found errors in the extraction of the data used to estimate the effect size of the primary outcome. Considering all these shortcomings, the conclusion in the Cochrane review that the status of the evidence is uncertain is misplaced. Professionals, parents and patients should refer to previous reviews and existing guidelines, which include methylphenidate as one of the safe and efficacious treatment strategies for ADHD.
Attention-deficit/hyperactivity disorder (ADHD) is a common disorder starting in childhood and frequently persisting across the lifespan. Current treatment guidelines, including those of the National Institute for Health and Care Excellence (NICE),1 identify methylphenidate (MPH) as a first-line treatment for ADHD. Furthermore, previous systematic reviews and meta-analyses (eg, refs 2, 3) have reported large effect sizes for the efficacy of MPH in the treatment of ADHD, at least in the short term.
In November 2015, Storebø et al4 published a Cochrane review on the efficacy and tolerability of MPH for the treatment of ADHD in children and adolescents. This review challenged the conclusions of previous reviews and guidelines. While the meta-analysis found effect sizes for efficacy similar to those previously reported, the authors concluded that the magnitude of this effect is uncertain due to the very low quality of the evidence. As such, the conclusions of Storebø et al4 could raise questions about the role of MPH as a core component of ADHD treatment. As an international group committed to the provision of evidence-based clinical guidance on the management of ADHD, we do not agree with the conclusions of Storebø et al.4 Rather, we argue that the Storebø et al4 review is flawed in a number of ways that lead to these incorrect conclusions:
Inappropriate selection of studies for inclusion;
Internal inconsistencies and idiosyncratic procedures in the risk of bias and overall study quality assessment;
Misinterpretations of evidence in relation to serious adverse events;
A misunderstanding of the meaning of effect sizes, and their clinical implications for individual patients.
Additionally, there are a number of errors in the calculation of standardised mean differences (SMDs) and meta-analytic weights.
Inappropriate study inclusion
As per their protocol, Storebø et al4 aimed to include randomised controlled trials (RCTs) comparing MPH with placebo or no intervention, allowing cointerventions provided that the compared intervention groups received the cointervention similarly. In fact, Storebø et al4 included three studies in which there was no placebo/no-treatment arm5,6 and/or MPH was used as an add-on intervention,5–7 among them the large Multimodal Treatment of ADHD (MTA) study, as well as a study (included only in secondary analyses) that was not randomised.8 Removing these studies increases the effect size for the primary outcome of teacher-reported ADHD symptom ratings from −0.77 (95% CIs −0.90, −0.64) to −0.83 (95% CIs −0.96, −0.70). The inclusion of the large MTA study5 has important implications for all subgroup analyses of long-term (>6 months) versus short-term MPH administration. Storebø et al4 reported a smaller effect for long-term administration according to teacher (but not parent or observer) reports. However, since the MTA is the only study meeting their definition of long-term administration, these analyses are misleading. We agree with Storebø et al4 that there is an absence of long-term data, but disagree that long-term placebo-controlled RCTs provide the solution. There are serious ethical problems in extending placebo-controlled RCTs into the longer term where there is evidence of strong benefit. In our view, a more appropriate design would be 'randomised discontinuation trials' as proposed in the NICE guideline.1 These, along with longer term safety studies, are now a requirement of the regulatory development programme of the European Medicines Agency for new ADHD medications.
Assessment of study quality
Storebø et al4 adopted the Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach (http://handbook.cochrane.org/chapter_12/12_2_1_the_grade_approach.html), which includes the Cochrane risk of bias (RoB) tool (http://handbook.cochrane.org/chapter_8/8_assessing_risk_of_bias_in_included_studies.html). Using GRADE with regard to the main outcome (teacher-rated ADHD symptoms), they downgraded the quality of evidence by one point for inconsistency of effects (heterogeneity) and by two points for RoB. Both these decisions are questionable. In relation to heterogeneity, I² for the meta-analysis of the primary outcome was 37%. The Cochrane handbook suggests that heterogeneity up to 40% 'might not be important'. Downgrading on this basis is not appropriate, especially considering that excluding the MTA study reduces heterogeneity to 25%.
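For readers unfamiliar with the statistic being graded here, I² is derived from Cochran's Q. The following minimal sketch shows the standard Higgins formula; the Q value used in the example is hypothetical and is not taken from the review.

```python
def i_squared(q: float, df: int) -> float:
    """Higgins' I-squared: the percentage of total variability across studies
    attributable to heterogeneity rather than chance, computed from
    Cochran's Q statistic and its degrees of freedom (number of studies - 1)."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0

# Hypothetical example: with 19 studies (df = 18), a Q of about 28.6
# corresponds to an I-squared of roughly 37%, the figure reported for
# the primary outcome.
print(round(i_squared(28.6, 18), 1))
```

As the formula makes clear, an I² of 37% sits below the 40% threshold under which the Cochrane handbook says heterogeneity 'might not be important'.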
With regard to the RoB assessment, we identified two major issues. First, Storebø et al4 added an additional domain, 'vested interests', which is not included in the current Cochrane RoB tool. The authors support their choice by citing a single work9 and the Assessing the Methodological Quality of Systematic Reviews (AMSTAR) approach (http://amstar.ca/). While there remains controversy over the best method to account for vested interests,10,11 the Cochrane handbook suggests that it should be reported in the 'characteristics of included studies' table rather than in the RoB assessment. Moreover, it is unclear how Storebø et al4 handled the assessment of vested interests in specific cases. They defined a trial as being at 'high RoB' when it was 'funded by parties that might have had a conflict of interest (eg, a manufacturer of MPH) or where potential conflicts of interest were reported by trial authors'. However, there appears to be no further specification, for example, the number or role of authors with supposed conflict of interest, the time frame of the putative conflicts or the type of industry support that would lead to a study being rated as biased. This may partially explain why this domain was inconsistently rated across the included trials (see online supplementary appendix table 1 for an example). In a BMJ online reply to the concerns of others about their use of this domain, the authors stated 'There were no trials with only the "vested interest bias" domain assessed as "unclear RoB" or "high RoB"'. This is incorrect; there are actually seven studies in which this was the only RoB domain rated as unclear or high (see online supplementary appendix table 2).
Storebø et al4 rated studies as being at high RoB if any domain (including their additional domain of vested interests) was rated as high or unclear. This approach is problematic because 'unclear' ratings reflect a lack of detailed information rather than demonstrated bias. Moreover, there is no evidence that the authors routinely attempted to contact trial authors for clarification regarding RoB domains (including vested interests) rated as 'unclear' (see online supplementary appendix table 3), although they did contact some authors for missing quantitative data. For example, in the study by Ashare et al12 five RoB domains were rated as 'unclear' due to a lack of information. While Storebø et al4 wrote that they contacted the authors for quantitative data, there is no evidence that they requested RoB information.
Storebø et al4 argue that the low quality of the studies (their assessment) casts doubt on the accuracy of the effect sizes. A more scientific approach is to test this opinion against the available data on RoB and study quality. The authors compared RCTs at high versus low overall risk of bias (their ratings) and found no significant difference in effect size (χ²(1)=2.43, p=0.12). Unfortunately, this finding is not included in their adjunct publications in the BMJ13 and JAMA.14
Storebø et al4 went on to suggest that even those studies where no item of the RoB was rated as high/unclear were likely to be biased, due to unblinding. They assumed that ‘people in the trial might know which treatments the children were taking’ because of adverse effects associated with MPH. We deem this unlikely, at least with their primary outcome ratings from teachers. The most common adverse effects reported are sleep difficulties and appetite reduction. There is no evidence to support the claim that teachers would be aware of these symptoms in their pupils and this is not our clinical experience.
Serious adverse effects (SAEs) of MPH
The authors used data from nine parallel group trials involving 1532 participants to explore SAEs. They reported a risk ratio (RR) of 0.98 (95% CIs 0.44, 2.22), which did not change significantly when 1712 participants from both arms of crossover trials were included (RR=1.50, 95% CIs 0.34, 7.71). They point to the findings from trial sequential analysis to show that the sample size is underpowered to detect SAEs of the frequency reported for MPH versus placebo. It is common knowledge that short-term RCTs are neither intended nor powered to evaluate rare adverse effects and long-term safety. While this is a limitation of the available evidence, a more balanced interpretation of the current evidence is that SAEs due to MPH are rare in clinical trials. The vast majority of published observational studies also suggest that SAEs are rare in clinical practice and that the causal associations between these SAEs and MPH (rather than with ADHD itself or other associated conditions) remain to be confirmed.
Effect sizes and clinical effectiveness
In a comparison of meta-analyses for common treatments in medicine and psychiatry, Leucht et al15 showed that the effect size for MPH compares favourably both with medications for other psychiatric conditions and with other commonly used interventions in medical conditions. Notable were the moderate effects for corticosteroids in asthma (0.54) and antihypertensives in high blood pressure (0.54 and 0.56 for systolic and diastolic blood pressure, respectively), with a small effect of the latter on long-term cardiovascular outcome/mortality (0.11). In this context, the efficacy of MPH should be viewed favourably.
A further difficulty with the interpretation of Storebø et al4 is their translation of group-based effect sizes into mean levels of symptom reduction in an attempt to quantify individual improvement. This is an uninformative and inaccurate measure of individual response, as it fails to account for baseline symptom severity and for the proportion of treated individuals showing clinical benefit, given the interindividual variability in response. Clinical impact for individuals is better described using the number needed to treat (NNT) and number needed to harm (NNH) metrics, which were not included in the review.
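To illustrate why NNT is a more clinically interpretable translation of a group-based effect size, one widely used conversion (the Kraemer–Kupfer area-under-the-curve method; this is our illustrative choice of formula, not a method used in the review) can be sketched as follows.

```python
from math import erf

def nnt_from_smd(d: float) -> float:
    """Kraemer-Kupfer conversion from a standardised mean difference (SMD)
    to a number needed to treat. Assumes approximately normal outcomes
    with equal variances: AUC = Phi(d / sqrt(2)); NNT = 1 / (2*AUC - 1)."""
    # Phi(d / sqrt(2)) expressed via the error function: 0.5 * (1 + erf(d / 2))
    auc = 0.5 * (1.0 + erf(abs(d) / 2.0))
    return 1.0 / (2.0 * auc - 1.0)

# For the review's primary-outcome SMD of 0.77, this conversion gives an
# NNT in the region of 2-3.
print(round(nnt_from_smd(0.77), 1))
```

Under these (simplifying) assumptions, an SMD of the magnitude reported would translate into only a handful of patients needing treatment for one to benefit, which is the kind of clinically meaningful statement the review's mean-symptom-reduction framing obscures.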
In addition to these major conceptual and methodological issues, we checked Storebø et al's4 primary analysis of teacher-reported ADHD symptoms. Among the 19 studies reported (three of which, we have argued, should not have been included), we found errors in the imputation of data and/or sample size in seven when compared with the published data. These errors included a systematic doubling of the sample sizes of all crossover trials (each arm counted as if it were an independent group) (see online supplementary appendix table 3). Since these errors occurred in both directions, correction of the data had little impact on the aggregate SMD for this outcome. Our limited check nevertheless raises concerns about the accuracy of the other quantitative findings and the overall quality and conduct of the review.
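The consequence of counting each crossover arm twice can be seen in the standard inverse-variance machinery. The sketch below uses the usual approximate variance formula for an SMD; the trial sizes are hypothetical and the code is a generic illustration, not the review's actual analysis.

```python
def smd_variance(d: float, n1: int, n2: int) -> float:
    """Approximate sampling variance of a standardised mean difference
    for two independent groups of sizes n1 and n2."""
    return (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))

def iv_weight(d: float, n1: int, n2: int) -> float:
    """Fixed-effect inverse-variance weight for one study."""
    return 1.0 / smd_variance(d, n1, n2)

# Hypothetical crossover trial with 20 participants per condition:
correct = iv_weight(-0.8, 20, 20)   # participants counted once
doubled = iv_weight(-0.8, 40, 40)   # both arms erroneously counted twice
print(round(doubled / correct, 6))  # the study's weight doubles: 2.0
```

Doubling the sample sizes halves each term of the variance, so such a study receives twice its legitimate weight in the pooled estimate, which is why this error matters even when the aggregate SMD happens to change little.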
In summary, we think that the analysis undertaken by Storebø et al4 is flawed and the conclusions misplaced. The additional benefits of MPH in improving both general behaviour (SMD −0.68, 95% CIs −0.78, −0.60) and quality of life (SMD 0.61, 95% CIs 0.48, 0.80) further support the value of MPH for youth with ADHD. We are therefore perplexed why the authors refer to their own findings as ‘apparent effects’ (our emphasis). While we agree that some of the trials included in this review are at RoB, this aspect of the evidence has not been assessed well. This has led to an overly negative interpretation of the evidence.
It is unfortunate that the authors focus on problems with the quality of the evidence when their own analyses do not support a significant impact of study RoB on the effect of MPH. Reviewers are becoming more transparent about the methods to assess study design and data analysis. However, it is still much more difficult to objectively address the introduction of bias through interpretation that appears in the discussion section of a paper. In this case, we believe that the interpretation given by Storebø et al4 is not warranted, given their errors and deviations from currently recommended methods.
The beneficial effects of MPH (and other medications for ADHD) need to be placed in the context of other non-pharmacological interventions for ADHD. In a systematic review and meta-analysis of RCTs, we found a relative lack of efficacy on blinded outcomes in reducing ADHD symptoms for many of these.16–19 Where these interventions lead to beneficial reductions in ADHD symptoms, the magnitude of these effects is considerably less than that reported for MPH and indeed other stimulant and non-stimulant ADHD medications. We therefore believe that there is continued support for the use of MPH as an effective and safe treatment for ADHD.
We read with interest the reply by Storebø and colleagues20 in relation to our Perspective article in this issue of EBMH. The purpose of our article is to describe in detail the methodological flaws and errors in data handling that have contributed to erroneous conclusions by Storebø et al in their review of methylphenidate in ADHD. There has already been much dialogue in scientific journals between Storebø et al and other scientists expressing concern regarding the methods and conclusions of their Cochrane review.4 We do not intend to enter into further dialogue, as it is unlikely to lead to consensus. However, we note that Storebø et al accept almost all of the errors we have identified. While it is reassuring that they will update their review accordingly, they should note that we have not externally checked any of their secondary analyses, which are likely, in our opinion, to show similar levels of inaccuracy. We strongly encourage Storebø and colleagues to check all of their data and publish an erratum. In relation to the Lufi et al 1997 study,8 we note that we also received information via e-mail from Dr Lufi (24 July 2016) stating that the study was not randomised (e-mail available upon request). In their response, Storebø et al suggest they have contacted many more study authors regarding possible vested interests than the information given in their Cochrane review indicates. We would encourage the authors also to revise and update their review with this information, since their response suggests the review is currently misleading on this point. Large systematic reviews such as this4 are potentially of clinical and scientific importance. A current and future challenge for the Cochrane Collaboration, as well as other bodies responsible for external reviews, is to ensure they are both accurate and balanced.
This report is written on behalf of the European ADHD Guidelines Group, whose members are: Philip Asherson, Tobias Banaschewski, Daniel Brandeis, Jan Buitelaar, David Coghill, Samuele Cortese, David Daley, Marina Danckaerts, Ralf Dittmann, Manfred Doepfner, Maite Ferrin, Chris Hollis, Martin Holtmann, Eric Konofal, Michel Lecendreux, Aribert Rothenberger, Paramala Santosh, ES, Edmund Sonuga-Barke, Cesar Soutullo, Hans-Christoph Steinhausen, Argyris Stringaris, Eric Taylor, Saskia Van der Oord, Ian CK Wong, Alessandro Zuddas.
Competing interests TB served in an advisory or consultancy role for Actelion, Hexal Pharma, Lilly, Medice, Novartis, Oxford outcomes, Otsuka, PCM scientific, Shire and Viforpharma. He received conference support or speaker's fees from Medice, Novartis and Shire. He is/has been involved in clinical trials conducted by Shire and Viforpharma. He received royalties from Hogrefe, Kohlhammer, CIP Medien and Oxford University Press. JB has, in the past 3 years, been a consultant to/member of the advisory board of/and/or speaker for Janssen Cilag BV, Eli-Lilly, Lundbeck, Shire, Roche, Medice, Novartis and Servier. He is not an employee or stock shareholder of any of these companies. He has no other financial or material support, including expert testimony, patents or royalties. DC reports grants and personal fees from Shire, personal fees from Eli-Lilly, grants from Vifor, personal fees from Novartis and personal fees from Oxford University Press. SC: since January 2016, SC has received reimbursement for travel and accommodation expenses from the Association for Child and Adolescent Mental Health (ACAMH), a non-profit organisation, in relation to lectures that he delivered for ACAMH. He declares the absence of any financial conflicts of interest. ICKW received grants from the European Union FP7 programme during the conduct of the study; grants from Shire, Janssen-Cilag, Eli-Lilly and Pfizer, outside the submitted work; ICKW is a member of the National Institute for Health and Clinical Excellence (NICE) ADHD Guideline Group and acted as an advisor to Shire.
Provenance and peer review Not commissioned; externally peer reviewed.