Effectiveness and cost-effectiveness of universal school-based mindfulness training compared with normal school provision in reducing risk of mental health problems and promoting well-being in adolescence: the MYRIAD cluster randomised controlled trial

Background Systematic reviews suggest school-based mindfulness training (SBMT) shows promise in promoting student mental health. Objective The My Resilience in Adolescence (MYRIAD) Trial evaluated the effectiveness and cost-effectiveness of SBMT compared with teaching-as-usual (TAU). Methods MYRIAD was a parallel-group, cluster-randomised controlled trial. Eighty-five eligible schools consented and were randomised 1:1 to TAU (43 schools, 4232 students) or SBMT (42 schools, 4144 students), stratified by school size, quality, type, deprivation and region. Schools and students (mean (SD) age 12.2 (0.6) years; range 11–14 years) were broadly UK population-representative. Forty-three schools (n=3678 pupils; 86.9%) delivering SBMT, and 41 schools (n=3572; 86.2%) delivering TAU, provided primary end-point data. SBMT comprised 10 lessons of psychoeducation and mindfulness practices. TAU comprised standard social-emotional teaching. Participant-level risk for depression, social-emotional-behavioural functioning and well-being at 1-year follow-up were the co-primary outcomes. Secondary and economic outcomes were also assessed. Findings Analysis of 84 schools (n=8376 participants) found no evidence that SBMT was superior to TAU at 1 year. Standardised mean differences (intervention minus control) were: 0.005 (95% CI −0.05 to 0.06) for risk for depression; 0.02 (−0.02 to 0.07) for social-emotional-behavioural functioning; and 0.02 (−0.03 to 0.07) for well-being. SBMT had a high probability of cost-effectiveness (83%) at a willingness-to-pay threshold of £20 000 per quality-adjusted life year. No intervention-related adverse events were observed. Conclusions Findings do not support the superiority of SBMT over TAU in promoting mental health in adolescence. Clinical implications There is a need to ask what works, for whom and how, as well as to consider key contextual and implementation factors. Trial registration Current Controlled Trials ISRCTN86619085.
This research was funded by the Wellcome Trust (WT104908/Z/14/Z and WT107496/Z/15/Z).

mindfulness input throughout year groups, suggested smartphone apps and using parts of the SBMT programme in core curriculum subjects.
Our approach to implementing the SBMT was informed by theory and implementation science (Tudor et al., under review), and was designed to be fully integrated into the school curriculum, over several years. Because implementation affects both reach and outcomes, all schools were supported with implementation guidance to increase the likelihood that it was introduced into the schools in ways that maintain its integrity and are sustainable. Implementation started with engaging the school leadership team, and then identifying a potential pool of teachers from within the school who could be trained and timetabled to deliver it to the pupils. The selected teachers then went through a training programme (see below).
All participating schools randomised to SBMT agreed to deliver the SBMT programme to a minimum of three classes within years 8 and/or 9 or equivalent year groups across the nations (pupils aged 12-14), but were also encouraged to consider how they might introduce mindfulness into the curriculum more broadly, for the potential benefit of other school pupils and the wider school climate.
The SBMT teacher training involved first participating in an 8-week personal mindfulness-based cognitive therapy for life (MBCT-L) programme. MBCT-L was developed as a mindfulness training for the general population that supports resilience and well-being. The programme comprises eight weekly two-hour sessions and an all-day mindfulness session, supported by a course handbook and online mindfulness practices (Kuyken et al., manuscript in preparation). Participants are encouraged to develop a daily mindfulness practice, both during the training and on an ongoing basis afterwards. From the pool of teachers undergoing personal mindfulness training, schools selected the teachers to go forward with the SBMT. Senior leadership teams in schools based selection on whether teachers would be willing and available to attend the further training and could be timetabled to teach the SBMT to participating study classes. Teachers who were selected to progress then attended a 4-day training workshop to learn how to deliver the SBMT curriculum to students. Following this 4-day training, participating teachers taught at least one complete SBMT curriculum to students, with support from an experienced mentor, before going on to teach the study students.
Within participating schools, as many teachers as possible were invited to attend the personal mindfulness training, to give schools the best opportunity to timetable the required number of teachers to teach the SBMT curriculum to study classes. Further embedding SBMT into the school included opening the training up to staff beyond the nominated teachers, helping schools integrate mindfulness into their school improvement plan by providing document templates, making mindfulness practice part of teacher catch-up days, professional development days, suggested schedules for progressive, regular mindfulness input throughout year groups (e.g., during assemblies), suggested free smartphone apps for both students and teachers and using mindfulness skills throughout the school curriculum.
We have reported separately on the acceptability, effectiveness and cost-effectiveness of this teacher training, as well as the relative merits of less and more intensive training for school teachers in terms of acceptability, effectiveness and mechanisms (Montero-Marin et al., in press). Based on the findings from this work, the teachers received the more intensive mindfulness curriculum as well as ongoing support to implement and deliver the SBMT curriculum to pupils.

Teaching as usual
The trial aimed to establish whether SBMT, when integrated into social-emotional teaching in secondary schools, adds value over and above current good practice. Recent UK Department for Education reports suggest that 60% of secondary schools offer Personal, Social, Health and Economic Education (PSHE) lessons that are 'good or better', and that this provision occurs across ages 11-16 years (Key Stages 3 and 4) through a variety of methods, including regular scheduled lessons, drop-down days, within other subjects, and in tutor/form time (Department for Education, 2010). Determining whether schools have good PSHE provision is challenging. In cohort 1, schools were eligible for inclusion if their provision of PSHE (or equivalent) met four criteria: (1) the presence of discrete, regular, named teaching time for PSHE; (2) a named PSHE lead; (3) a written policy for social and emotional learning (SEL); and (4) a named member of the senior leadership team responsible for PSHE. However, for cohort 2, the 'written SEL policy' criterion was modified to "documentation denoting clear strategic planning of SEL within the school". Experience in cohort 1 indicated that schools do not always use the term 'SEL policy' to denote strategic planning of SEL. Moreover, some schools have an extensive, well-established and well-documented SEL curriculum, indicative of a clear structure and strategy around SEL, but do not have this formalised as a school policy. TAU schools agreed not to provide the MT programme (or other curricula that include MT) until study completion. While SEL provision was not uniform in the TAU arm, its content is intended to prepare students with the knowledge, skills and attributes they need to manage their lives. It typically covers relationships, sex education, and physical and mental health education.
Supplemental material. BMJ Publishing Group Limited (BMJ) disclaims all liability and responsibility arising from any reliance placed on this supplemental material which has been supplied by the author(s).

Demographics
Pupil demographics, including gender (male, female) and ethnicity (White, Arab/Arab British, Black/African/Caribbean/Black British, Mixed/Multiple Ethnic Groups, Other Ethnic Group), were gathered via pupil self-report at baseline. Year group (year 7, year 8, year 9, year S1) and dates of birth were reported by the school, and the research group calculated the corresponding ages.

Co-primary outcomes
Center for Epidemiologic Studies Depression Scale (CES-D; Radloff, 1991).
The "Center for Epidemiologic Studies Depression Scale" (CES-D; Radloff, 1991) is a 20-item self-report questionnaire that assesses depressive symptoms over the past week (e.g., "I felt depressed"). It has been validated for use in adolescents (Radloff, 1991). Each item is rated on a rating-scale from 0 ("rarely or none of the time") to 3 ("most or all of the time"), yielding a total score between 0 and 60, with higher scores indicating greater risk for depression. Two cut-off points have been proposed: (a) a lower cut-point of 16, to identify pupils at risk of depression (Rushton, Forcier & Schectman, 2002), and (b) a higher cut-point of 28, to identify pupils with symptoms likely to meet diagnostic criteria for major depressive disorder (Radloff, 1991). The Cronbach's alpha of the CES-D in our study was α = 0.88 at baseline, α = 0.91 at pre-intervention, α = 0.92 at post-intervention, and α = 0.92 at 1-year follow-up.
Warwick-Edinburgh Mental Well-being Scale (WEMWBS; Tennant et al., 2007) The "Warwick-Edinburgh Mental Well-being Scale" (WEMWBS; Tennant et al., 2007) is a 14-item measure assessing both feeling and functioning aspects of mental well-being over the last two weeks (e.g., "I've been feeling useful"). Items are scored on a rating-scale from 1, "none of the time", to 5, "all of the time", yielding a total score between 14 and 70. Items are worded positively, so higher scores indicate greater mental well-being. The WEMWBS has been validated for use in adolescents (Clark et al., 2011). There are no established cut-offs for the WEMWBS, although using M±1SD has been suggested. The internal consistency of the WEMWBS in our study was α = 0.88 at baseline, α = 0.87 at pre-intervention, α = 0.89 at post-intervention, and α = 0.91 at 1-year follow-up.
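The internal-consistency coefficients reported throughout this supplement are Cronbach's alpha, α = k/(k−1) × (1 − Σ item variances / total-score variance). A minimal, dependency-free sketch of the computation (our own illustrative code, not the trial's analysis scripts):

```python
def cronbach_alpha(data):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(data[0])

    def sample_var(xs):
        # unbiased sample variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    item_vars = [sample_var([row[i] for row in data]) for i in range(k)]
    total_var = sample_var([sum(row) for row in data])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

Perfectly consistent items (every respondent answering all items identically) yield α = 1.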
Behaviour Rating Inventory of Executive Function, Second Edition (BRIEF2; Gioia et al., 2015) The "Behaviour Rating Inventory of Executive Function, Second Edition" (BRIEF-2; Gioia et al., 2015) is a 55-item self-report measure designed to assess self-perception of everyday behaviours associated with executive function in older children and adolescents (aged 11-18), e.g., "I have trouble sitting still". The BRIEF-2 assesses executive function over the past 6 months across seven domains: inhibit; self-monitor; shift; emotional control; task completion; working memory; and plan/organise. Items are rated 1 = "never", 2 = "sometimes", 3 = "often", and total scores are calculated by summing the sub-scores, with higher scores indicating greater executive dysfunction. The three items of the infrequency scale ("I forget my name", "I have trouble counting to three", "I cannot find the front door of my home") are used only as validity indicators and are not included in the raw scale scores, so the total score ranges between 52 and 156. The pupil-report form of this inventory was used in the present study. The internal consistency (Cronbach's alpha) of the BRIEF-2 was α = 0.97 at pre-intervention, α = 0.97 at post-intervention, and α = 0.97 at 1-year follow-up.
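The BRIEF-2 scoring rule above (52 scored items, plus 3 infrequency items used only as a validity check) might be sketched as follows; the item positions of the infrequency items are hypothetical placeholders, not the published scoring key:

```python
INFREQUENCY_IDX = {10, 25, 40}  # hypothetical positions of the 3 validity items

def score_brief2(responses):
    """responses: 55 items coded 1 = never, 2 = sometimes, 3 = often.

    Returns (total, validity_flags): total sums the 52 scored items
    (range 52-156, higher = worse executive functioning); validity_flags
    holds the raw infrequency-item responses for a validity check.
    """
    assert len(responses) == 55
    validity_flags = [responses[i] for i in sorted(INFREQUENCY_IDX)]
    total = sum(r for i, r in enumerate(responses) if i not in INFREQUENCY_IDX)
    return total, validity_flags
```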

Drug and alcohol use (risk measure)
The Risk Measure contains seven questions assessing risk behaviours split into three subscales: smoking, drinking, and cannabis and other substances (https://www.thereachstudy.com/uploads/7/3/2/1/73211845/reach_protocol.pdf and http://www.espad.org). The questions gauge frequency of cigarette, alcohol and cannabis use, and include binary "yes/no" questions, e.g., "Have you used cannabis ever?", as well as frequency scales, e.g., "Which of the following apply to you. I drink alcohol:", ranging from 0, "never", to 6, "every day or almost every day". The "Cannabis and Other Substances" subscale asks whether participants have tried a list of 11 substances/substance categories, e.g., "amphetamines/methamphetamine (e.g., speed, crystal meth)", with binary "yes/no" response options. It includes one "dummy" substance, "Decopan"; if a participant responded "yes" to this item, all positive drug responses in this section of the risk measure were excluded from analysis. In addition, if a participant responded "yes" to painkillers and to no other drugs in this section, the positive painkiller response was excluded from analysis.
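The exclusion rules for the "Cannabis and Other Substances" subscale can be illustrated as below (our own sketch, not the trial's data-cleaning code; substance keys are shorthand for the measure's items, and None marks a response excluded from analysis):

```python
def clean_substance_responses(responses):
    """responses: dict mapping substance name -> True ('yes') / False ('no')."""
    resp = dict(responses)
    if resp.get("Decopan"):
        # Dummy substance endorsed: exclude every positive response
        return {k: (None if v else v) for k, v in resp.items()}
    other_positives = [k for k, v in resp.items()
                       if v and k not in ("painkillers", "Decopan")]
    if resp.get("painkillers") and not other_positives:
        # Painkillers endorsed with no other drugs: exclude that response
        resp["painkillers"] = None
    return resp
```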
The Revised Children's Anxiety and Depression Scale (RCADS; Weiss & Chorpita, 2011) The "Revised Children's Anxiety and Depression Scale" (RCADS; Weiss & Chorpita, 2011) is a youth self-report questionnaire that measures separation anxiety disorder (SAD), social phobia (SP), generalized anxiety disorder (GAD), panic disorder (PD), obsessive compulsive disorder (OCD), and major depressive disorder (MDD). As adapted for the MYRIAD project, the questions making up the depression subscale were removed, leaving a 37-item questionnaire measuring anxiety that asks how often each item happens (e.g., "I worry when I think I have done poorly at something"). Items are rated on a 4-point Likert-type scale (0 = "Never", 1 = "Sometimes", 2 = "Often", 3 = "Always"), and the questionnaire's subscales are scored by summing their items. A total anxiety score can also be calculated by summing the five subscales. The internal consistency (Cronbach's alpha) of the RCADS total score was α = 0.96 at pre-intervention, α = 0.96 at post-intervention, and α = 0.96 at 1-year follow-up.

Suicide and Self-harm
Suicidal ideation and self-harm were measured using three self-report items (Madge et al., 2008). Participants were asked to consider their thoughts and behaviour since the last MYRIAD school visit. They could respond "yes", "no" or "prefer not to say" to the following statements: "Have you thought that life was not worth living, or that you would be better off dead?"; "Have you thought seriously about trying to harm yourself in some way (for example by cutting yourself or taking an overdose of pills or other medication)?"; and "Have you actually, deliberately harmed yourself in some way (for example by cutting yourself or taking an overdose of pills or other medication)?".
The Child and Adolescent Mindfulness Measure (CAMM; Greco et al., 2010) The "Child and Adolescent Mindfulness Measure" (CAMM; Greco et al., 2010) is a self-report mindfulness skills scale designed specifically for use with children and adolescents. It consists of 10 items, which measure awareness of the present moment as well as non-judgemental and non-avoidant responses to thoughts and feelings (e.g., "I keep myself busy so I don't notice my thoughts or feelings"). Participants are asked how often each sentence is true, and responses are given on a 5-point Likert-type rating-scale ranging from 0, "Never True", to 4, "Always True". Each item is reverse scored and summed, producing a total score of 0-40, with higher scores corresponding to higher levels of mindfulness. The CAMM has been validated for use in non-clinical samples of adolescents (de Bruin et al., 2013; Kuby et al., 2015), and has adequate psychometric properties (Greco et al., 2010). The internal consistency (Cronbach's alpha) of this measure of mindfulness skills was α = 0.84 at pre-intervention, α = 0.86 at post-intervention, and α = 0.88 at follow-up.
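Because every CAMM item is reverse scored, the total score can be computed as below (an illustrative sketch only; the function name is ours):

```python
def score_camm(responses):
    """10 CAMM items, each rated 0 ('Never True') to 4 ('Always True').

    All items are reverse scored (4 - r) and summed, giving 0-40;
    higher totals indicate higher levels of mindfulness.
    """
    assert len(responses) == 10
    assert all(0 <= r <= 4 for r in responses)
    return sum(4 - r for r in responses)
```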

School Climate and Connectedness Survey (SCCS; Association of Alaska School Boards, 2015)
The pupil version of the "School Climate and Connectedness Survey" (SCCS) measures aspects of school climate (social and environmental factors that contribute to the subjective experience of a school) and connectedness (perceptions and feelings about the people at school) for students, asking them to consider the way the school is 'most of the time' (e.g., "Teachers here are nice people"). The original SCCS has 63 questions (9 subscales), all with 5-point Likert-type responses (e.g., from 1, "strongly agree", to 5, "strongly disagree"). For the current study, 21 questions (4 subscales) from the original SCCS questionnaire were employed: "School Leadership and Student Involvement Scale", "Respectful Climate", "Peer Climate Scale" and "Caring Adults Scale". Total scores were calculated by summing the corresponding subscale scores, with higher scores reflecting a more positive school climate and connectedness. The internal consistency of this measure of school climate was α = 0.91 at pre-intervention, α = 0.91 at post-intervention, and α = 0.91 at 1-year follow-up.
The Child Health Utility 9D (CHU9D; Stevens, 2011) The "Child Health Utility 9D" (CHU9D) is a preference-based measure of health-related quality of life in young people. It is suitable for the calculation of Quality Adjusted Life Years (QALYs) and has been shown to be valid and responsive to change in adolescent populations (Furber & Segal, 2015). It includes nine dimensions: worried, sad, pain, tired, annoyed, schoolwork, sleep, daily routine, and ability to join in with activities. Each dimension has five levels of increasing severity (e.g., from 1 = "I don't feel worried today" to 5 = "I feel very worried today"). Utility scores were produced using the UK SPSS syntax for the CHU9D (Stevens, 2008). The measure was originally developed for use with ages 7-11 but has since been validated for use with adolescents (Stevens and Ratcliffe, 2012). The CHU9D has also been demonstrated to have face, content, and construct validity (Stevens, 2011).
Child and Adolescent Service Use Schedule (CA-SUS; Byford et al., 2007) Service use was recorded using a brief version of the "Child and Adolescent Service Use Schedule" (CA-SUS; Byford et al., 2007) in a format suitable for self-completion by adolescents in schools. The measure asked pupils to recall use of key services and resources over the last three months. Data were collected on the number of contacts with hospital services (e.g., inpatient stays, outpatient contacts, accident and emergency attendances), community health and social care services (e.g., GP, social worker, pharmacist, school nurse), accommodation services (e.g., foster care, residential care, respite care), and teaching support services, as well as prescribed psychotropic medication. The CA-SUS was based on previous versions that have been successfully applied in adolescent depression populations (Byford et al., 2007), in particular a brief version focused on key services (high cost and high volume of use) designed for self-completion by parents of primary school children in a school-based cluster RCT (Ford et al., 2018). Economic data were collected at pre- and post-intervention and at 1-year follow-up. Reported service use at each timepoint was scaled to cover the entire duration of the follow-up period.

Sessions Attended
The number of SBMT sessions each pupil attended (ranging between 0 and 10) was recorded by the SBMT teacher delivering the lesson, who was in charge of taking pupil attendance.

Pupil acceptability
We assessed pupils' ratings of the SBMT's acceptability at post-intervention using a 5-item rating-scale measure ("How much does what's being taught in these lessons make sense to you in helping you to deal with issues young people face?", "Do you think that these lessons will help you have a healthier lifestyle?", "Would you recommend these lessons to a friend?", "How important do you think it is that these lessons are available to young people?", "How successful do you believe these lessons are likely to be in decreasing problems or issues that young people have?"), answered on a Likert-type scale (0 = "not at all", 10 = "a great deal"). Total scores were calculated by summing all items and dividing by the number of items (mean total scores, ranging between 0 and 10). The internal consistency (Cronbach's alpha) of this measure at post-intervention was α = 0.95.
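The mean-total scoring used here (and for the engagement measure below) is simply the item sum divided by the number of items; a sketch with a hypothetical function name:

```python
def mean_total(items):
    """Mean total score: sum of item ratings divided by the number of items.

    For the 5 acceptability items rated 0-10, this yields a 0-10 score.
    """
    assert items
    return sum(items) / len(items)
```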

Pupil engagement
We assessed the extent (i.e., frequency) of pupil home-based mindfulness practice during and after the SBMT intervention, using a 6-item rating-scale measure that included the following items: "During the course you were taught a range of mindfulness practices. How often did you practice being mindful?", "During the course you were invited to pause and focus on your breathing by doing a 7-11 or FOFBOC or a .b (i.e. stop, breathe and be). How often did you do this?", "During the course you were taught to use 'beditation' as a way of helping you get to sleep. How often did you do this?", "During the course you were asked to be mindful in your everyday lives, for example walk a short distance mindfully, or eat a mouthful of food mindfully. How often did you do this?", "During the course you were asked to notice stress in your body, e.g. 'stress signature' in difficult times, noticing where in the body you were feeling stress. How often did you do this?", "During the course you were taught to think about your thoughts as passing objects such as buses, clouds or rivers that pass through your mind. How often did you do this?". Items were answered at post-intervention (covering the frequency of mindfulness practice during the SBMT intervention) and at 1-year follow-up (covering the frequency of mindfulness practice after the SBMT intervention), on a Likert-type scale ranging from 0 = "never" to 5 = "almost every day". Scores were calculated by summing all items and dividing by the number of items (mean total scores ranging from 0 to 5); higher scores therefore represent a higher frequency of pupil home-based mindfulness practice. The internal consistency (Cronbach's alpha) of this measure of pupil engagement with mindfulness practice was α = 0.89 at post-intervention and α = 0.89 at 1-year follow-up.

Teacher Measures

Adherence (Fidelity)
Fidelity to the original ".b" SBMT programme was measured as the percentage of the standardised curriculum covered in two randomly selected lessons per intervention class. All SBMT lessons were filmed, and a subset of 2 of the 10 possible lessons from each class was evaluated. The two lessons evaluated for each class were randomly chosen by a computer random number generator from a subset of combinations selected because they provided the best opportunity of observing a full practice (these combinations were: 3&6, 3&7, 4&6, where available; where these lessons were not available, e.g. not recorded, other appropriate lessons were reviewed). Teachers did not know in advance that these combinations would be chosen. For each randomly selected lesson observed, independent evaluators indicated whether key curriculum elements (essential and non-essential, as defined by the ".b" SBMT teaching materials) were delivered or not. These ratings were summarised as the percentage of curriculum elements covered for each lesson, and then averaged across the two randomly selected ".b" lessons to provide a percentage of elements covered per intervention class.
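The per-class fidelity summary reduces to the mean percentage of curriculum elements delivered across the two rated lessons; a minimal sketch (our own code, with hypothetical names):

```python
def fidelity_for_class(lesson_checklists):
    """lesson_checklists: one list of True/False flags per rated lesson,
    one flag per curriculum element delivered in that lesson.

    Returns the class-level fidelity: the mean of the per-lesson
    percentages of elements covered.
    """
    pcts = [100.0 * sum(flags) / len(flags) for flags in lesson_checklists]
    return sum(pcts) / len(pcts)
```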

MBI-TAC-Teach: Competency (Quality)
To assess the quality of the SBMT intervention we considered the competency of the teaching that pupils received. All SBMT lessons were filmed, and a randomly selected subset of 2 of the 10 possible lessons from each class was rated using the "Mindfulness-Based Interventions: Teaching Assessment Criteria" (MBI-TAC). The videos were anonymised, so external evaluators did not see the pupils; only the research team did. Lessons were rated by one of four assessors using a version of the MBI-TAC adapted for the teaching context (MBI-TAC-Teach). External evaluators were experienced MBI teachers who had followed a recognised mindfulness training pathway, were qualified to teach the SBMT (".b") with more than two years of experience, and were qualified classroom teachers. They were trained in the use of the MBI-TAC-Teach by taking part in two days of training in which the MBI-TAC-Teach was introduced by two experienced supervisors. All had experience of being rated with the MBI-TAC in their own teaching pathway, so all were familiar with that tool. The training focused on the specific aspects of the MBI-TAC-Teach and was based on collective discussions and evaluations of case studies to ensure standardisation. Evaluators were also allowed time to rate some examples independently, to check consistency and that their ratings fell within an acceptable range by the end of the training. All evaluators took part in regular supervision sessions aimed at ensuring and maximising assessment standardisation. A randomly chosen 'back-up' lesson was also used by evaluators if they felt that observing the first two lessons did not provide sufficient evidence for the overall ratings. Certain lessons were not used; for example, lesson 5 ('Moving Mindfully') was excluded because pupils move around in this lesson, meaning that it was not easy to fully capture the pupil and teacher interactions on film.
If videos were not available for the chosen combination of lessons, then a decision was made to use different lessons, based upon the videos available and the lessons that would provide the best opportunity to observe all domains. The MBI-TAC was developed in the context of Mindfulness-Based Stress Reduction and Mindfulness-Based Cognitive Therapy, and was adapted as the MBI-TAC-Teach to rate classroom teachers teaching mindfulness to young people in school contexts. Competence is rated across 6 domains on a 6-point scale (1 = "incompetent", 2 = "beginner", 3 = "advanced beginner", 4 = "competent", 5 = "proficient", and 6 = "advanced"). The domains assess: coverage, pacing and organisation of session curriculum; relational skills; embodiment of mindfulness; guiding mindfulness practices; conveying course themes through interactive enquiry and didactic teaching; and holding the group learning environment. Evaluators provided competency ratings on the 6 domains, and an overall competency rating per lesson (based on their own overall assessment rather than a sum of the 6 domain scores), for the two randomly selected lessons per intervention class. Based on the two lessons, an overall rating per domain for that class was completed, which evaluators then used to provide one overall final competency rating per class as a measure of the quality of the intervention delivery.

Implementation of mindfulness within the school curriculum
We measured the extent to which ".b" was: (i) additive to the PSHE curriculum, leaving it unaffected (either ".b" did not take place in PSHE lessons or, if it did, PSHE curriculum content was not removed, or was taught elsewhere if removed); (ii) substitutive, meaning PSHE content was removed to make room for ".b" and was not taught elsewhere; or (iii) partially additive/partially substitutive, meaning PSHE curriculum content was condensed to make room for ".b" content.

Broad context, community, and operational features
We gathered school characteristics such as region (England, Scotland, Wales, Northern Ireland), urbanity (urban, rural), school size (<1,000 pupils, ≥1,000 pupils), type of school (mixed, girls only), Ofsted school quality rating (does not require improvement: outstanding, good; requires improvement: requires improvement, inadequate), and school deprivation (% of pupils eligible for free school meals). We also described the quality of 'Personal, Social, Health and Economic Education' (PSHE) provision (see next paragraph).
School quality is measured differently in public and private schools and across the nations. We developed a measure that maps the different school inspection rating systems onto each other, allowing us to use the terms "outstanding" to "requires improvement" throughout. Social and emotional learning (SEL) in England is taught as part of 'Personal, Social, Health and Economic Education' (PSHE) lessons. Because delivering PSHE lessons is not mandatory in England, there is wide variation across schools in the delivery of PSHE (in terms of content covered and teaching time allocated). A literature review highlighted that there were no existing measures of PSHE that would allow the current study to assess which schools had a minimum level of good practice in PSHE to be considered for participation. Thus, a new PSHE assessment tool was devised for the current study. For inclusion in the trial, schools had to meet five criteria for their current PSHE provision: regular, discrete, named teaching time for PSHE (or equivalent); a designated PSHE lead; a named member of the Senior Leadership Team (SLT) responsible for PSHE; documentation denoting clear strategic planning of SEL within the school; and evaluation of pupil progress in PSHE. Once a school became a participating trial school, PSHE was assessed by discussing PSHE provision with the teacher responsible for PSHE at that school (or a member of the Senior Leadership Team). Sixteen quality indicators (listed below) were used to assess PSHE provision. They were created specifically for this trial and identified through a review of existing measures and via expert consultation (Department of

Supplement C Costs
For each item of service use reported in the CA-SUS, a nationally applicable unit cost was applied to calculate costs for each participant. Unit costs for hospital services were sourced from NHS reference costs (Department of Health, 2019). Costs from the annually published unit costs of health and social care compendium were applied to community-based health, social care and Local Authority accommodation services (Curtis & Burn, 2019). Costs applied to some community services were also sourced from the webpages of charitable organisations (NSPCC, 2021). The costs of medications were based on prices listed in the British National Formulary for Children (Royal Pharmaceutical Society of Great Britain, 2019) and Prescription Cost Analysis data (NHS Business Services Authority Statistics, 2019). Costs for teaching support were estimated from information published by the National Education Union (National Education Union, 2019). All unit costs applied were for the financial year 2018-19 (summarised in Table S1) and are reported in UK pounds sterling. Costs incurred more than 12 months after the start of the trial (for those whose 1-year follow-up was late) were discounted at 3.5% per annum, as recommended by NICE (National Institute for Health and Clinical Excellence, 2013).
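The 3.5% discounting of late-accruing costs follows the standard formula cost/(1 + rate)^t; a minimal sketch (the function name is ours):

```python
def discounted_cost(cost, years_beyond_first, rate=0.035):
    """Discount costs incurred beyond the first 12 months at 3.5%/year,
    per NICE guidance: cost / (1 + rate) ** years."""
    return cost / (1 + rate) ** years_beyond_first
```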
Resource inputs into SBMT training and delivery were costed using a micro-costing approach and included costs of training, materials, supply cover and subsistence. Data on intervention contacts and the costs of training and materials were collected directly from trial records. Total costs were calculated separately for phase 1 (self-mindfulness training) and phase 2 (syllabus training) for all teachers taking part in each phase of the trial.

Further details of economic methods
The use of all services is reported by trial arm as the mean, standard deviation and range, together with the percentage of the sample in each arm with at least one contact. Differences in resource use were not tested for statistical significance, to avoid excessive significance testing and to keep the focus of the economic analysis on cost and cost-effectiveness.
For each participant, all costs were summed to calculate total costs over the 1-year follow-up. Costs and outcomes, including costs per sector, were summarised using the mean and standard error for each trial arm and the differences between the two were compared using standard parametric t-tests. Despite the skewed nature of cost data, this method allows inferences to be made about the arithmetic mean, which are more meaningful from a decision-making perspective (Thompson & Barber, 2000).
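The per-arm comparison of mean costs described above can be sketched as follows. This is an illustration only: with samples in the thousands, the unpaired t-test is closely approximated by a large-sample z-test, which is what is implemented here; the function name and the example cost lists are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def compare_mean_costs(costs_a, costs_b):
    """Compare arithmetic mean costs between two trial arms.

    Illustrative sketch: the p-value uses a large-sample normal
    approximation to the unpaired (Welch) t-test, which is reasonable
    for samples of several thousand participants per arm.
    """
    m_a, m_b = mean(costs_a), mean(costs_b)
    # Standard error of each arm's mean, then of the difference.
    se_a = stdev(costs_a) / sqrt(len(costs_a))
    se_b = stdev(costs_b) / sqrt(len(costs_b))
    diff = m_a - m_b
    se_diff = sqrt(se_a ** 2 + se_b ** 2)
    p = 2 * (1 - NormalDist().cdf(abs(diff / se_diff)))  # two-sided
    return diff, se_diff, p

# Hypothetical skewed cost data: most participants incur little cost,
# a few incur a lot (typical of health service use data).
arm_a = [0, 0, 10, 20, 500, 30, 0, 15, 2000, 25]
arm_b = [0, 5, 0, 40, 300, 10, 0, 20, 1500, 35]
diff, se, p = compare_mean_costs(arm_a, arm_b)
```

Comparing arithmetic means directly, despite the skew, keeps the result interpretable for budgeting decisions, as the text notes.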
Cost-effectiveness was assessed using the net benefit approach (Stinnett & Mullahy, 1998), with joint effects of costs and outcomes estimated using mixed effects linear regression. A cost-effectiveness acceptability curve (CEAC) (Fenwick & Byford, 2005) was then constructed to examine the probability of the SBMT intervention being cost-effective compared to control for a range of possible values of willingness-to-pay per unit improvement in outcome. Cost-effectiveness was assessed first in terms of QALYs measured using the CHU9D (scenario #1). Secondary analyses explored cost-effectiveness in terms of the three co-primary clinical outcomes to assess the sensitivity of analyses to the alternative outcomes of interest (scenarios #2-4).
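The net benefit approach converts each participant's costs and effects into a single quantity, NB = λ × effect − cost, for a given willingness-to-pay λ; the CEAC then plots the probability that the incremental net benefit is positive as λ varies. The sketch below is a simplified illustration: it ignores the school-level clustering that the trial's mixed-effects regressions account for, uses a large-sample normal approximation, and all names and example figures are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def ceac_point(costs_tx, effects_tx, costs_ctl, effects_ctl, wtp):
    """Probability that the intervention is cost-effective at a given
    willingness-to-pay (wtp) per unit of effect, via the net benefit
    framework: NB_i = wtp * effect_i - cost_i.

    Simplified sketch: no clustering adjustment, normal approximation.
    """
    nb_tx = [wtp * e - c for c, e in zip(costs_tx, effects_tx)]
    nb_ctl = [wtp * e - c for c, e in zip(costs_ctl, effects_ctl)]
    inb = mean(nb_tx) - mean(nb_ctl)  # incremental net benefit
    se = sqrt(stdev(nb_tx) ** 2 / len(nb_tx)
              + stdev(nb_ctl) ** 2 / len(nb_ctl))
    return NormalDist().cdf(inb / se)  # P(incremental net benefit > 0)

# A CEAC is traced by evaluating this probability over a grid of
# willingness-to-pay values (e.g., £0 to £30,000 per QALY).
# Hypothetical illustration: similar costs, higher effects in intervention.
p = ceac_point(
    costs_tx=[100, 200, 150, 120, 180, 90, 160, 140],
    effects_tx=[0.90, 0.85, 0.95, 0.80, 0.88, 0.92, 0.87, 0.90],
    costs_ctl=[110, 190, 140, 130, 170, 100, 150, 145],
    effects_ctl=[0.72, 0.70, 0.78, 0.65, 0.74, 0.75, 0.71, 0.73],
    wtp=20_000,
)
```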
Pre-specified sensitivity analyses assessed the impact of missing data through a complete-case analysis (scenario #5), the impact of adopting the health and social care perspective preferred by NICE (excluding teaching support; scenario #6), and the impact of including only mental health services, to better explore the effect of the intervention on mental health (scenario #7).
Two post-hoc sensitivity analyses were added. First, due to difficulties in timetabling around school holidays, four schools (three in intervention, one in control) completed the 1-year follow-up assessment more than 14 months post-randomisation (Figure S1). We therefore assessed the impact of variable follow-up duration by excluding cases with follow-up assessments after 14 months post-randomisation (scenario #8). Second, reports of use of prescribed psychotropic medication were deemed unreliable, because reported use of antipsychotic medication was substantially higher than general population prescribing patterns. We therefore assessed the impact of the quality of these data by excluding service use data on prescribed psychotropic medication (scenario #9).

Further details on the CONSORT diagram
The baseline characteristics of the schools and pupils have previously been reported. Randomisation of schools (clusters) was followed by randomisation of study classes within schools, which took place over a period of up to approximately six months, up until the pre-intervention data collection. Discussions with schools centred on approximate classes, often using class identifiers that, at the time, did not refer to specific pupils; rather, class identifiers served as a proxy for pupils.
Pupils only became 'study pupils' at the point at which we received and confirmed who was in the selected study classes (i.e., confirmed the registers). For some schools, this was on our pre-intervention visit. Therefore, individual pupils did not have the opportunity to 'drop out' of the study between randomisation of schools, randomisation of study classes and pre-intervention data collection, as pupil selection was still ongoing up until that point.

Representativeness of study schools and pupils
Eighty-seven percent of schools were mixed (the remainder being girls-only schools); 13% required improvement based on their school quality rating (schools outside England were reviewed and scored against the England ratings to align them to the same scale); and just over a third of schools had a higher percentage of children eligible for free school meals than the national median (29·4%) (Department of Education, 2020). Descriptive data at baseline and pre-intervention suggest we recruited a sample of pupils broadly representative with respect to our primary outcomes of risk for depression (mean (SD) of 13·9 (9·7) in a previous study of Canadian students aged 12 or 13) (Briere et al., 2013) and well-being (mean (SD) of 48·8 (6·8) in a previous study of UK pupils aged 13 to 16), but with slightly poorer social-emotional-behavioural functioning than the wider population (normative mean (SD) of 10·3 (5·2), from the SDQ information website, accessed 26 April 2021: https://www.sdqinfo.org/norms/UKNorm3.pdf).

Number of schools
All school-level measures (e.g., admissions type, urbanicity and school size) were obtained from publicly available data published by the constituent nation in which each participating school resided. Data were usually obtained online from the relevant education and statistics departments (e.g., the Department for Education in England). Where data were not available online, they were obtained through email correspondence with the department. In all cases, publicly available data were collected according to their proximity to the year in which participating pupils provided Baseline (T0) questionnaire data. Specifically, characteristics of the school community and operational features of the school were obtained from data collected and released by the government as part of the annual school census. All state-funded schools in the UK are required to provide data to their local authority yearly, which are then processed and released by the relevant nation's education or statistics department.

School rating: Each country within the UK has its own schools inspectorate, each with a different quality rating system. To allow analysis of school quality ratings across schools, we mapped them all onto a single quality rating system. The majority of schools in the project are inspected by Ofsted (England's state-funded schools inspectorate), with only 12 schools evaluated by a different inspectorate. It was therefore decided that these 12 schools would have their ratings mapped onto the Ofsted rating system. Two researchers independently analysed the 12 schools' inspectorate reports and assigned each a quality rating from 1 to 4, in line with Ofsted quality ratings (1 being the highest quality rating and 4 the lowest). None of the schools were given a rating of 4 (inadequate); this is in line with the exclusion criteria for project participation, which stated that a school could not take part if it had an inspection rating of inadequate or equivalent.
Where possible, the inspection report published closest to, and before, the date the school entered the project was used. The researchers' mappings were informed by the Ofsted School Inspection Handbook 2018. The inter-rater reliability for mappings was 91%.
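The 91% inter-rater reliability figure reads as simple percent agreement between the two researchers' rating mappings. A minimal sketch, with hypothetical ratings (the supplement does not report the individual mappings; chance-corrected statistics such as Cohen's kappa would be a stricter alternative):

```python
def percent_agreement(rater_1, rater_2):
    """Percent agreement between two raters over the same items.

    Illustrative only: the example ratings below are hypothetical,
    chosen so that one disagreement out of 12 schools gives a figure
    near the reported 91%.
    """
    assert len(rater_1) == len(rater_2)
    matches = sum(a == b for a, b in zip(rater_1, rater_2))
    return 100 * matches / len(rater_1)

# Hypothetical mappings of the 12 non-Ofsted schools onto ratings 1-3
# (no school was rated 4, per the exclusion criteria).
r1 = [1, 2, 2, 1, 3, 2, 1, 2, 3, 1, 2, 2]
r2 = [1, 2, 2, 1, 3, 2, 1, 3, 3, 1, 2, 2]
agreement = percent_agreement(r1, r2)  # 11 of 12 items agree
```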
These are the websites from which data were sourced:

Availability of data
The availability of health economics data is summarised in Supplementary Table S5. Data for all assessment periods were available for 3313 (78%) participants in the intervention group and 3287 (79%) participants in the control group.

Service use
Supplementary Table S6 summarises the use of hospital, community health and social care, accommodation, and teaching support services over the 1-year follow-up period. Overall, all services were accessed by similar proportions of participants and mean service use was also similar across trial arms.
For hospital services, mean service use was highest for outpatient appointments related to injury in both trial arms (mean number of appointments 1.38 in intervention and 1.34 in control). Mean service use was low for attendances related to mental health problems for both inpatient (mean number of nights 0.04 in intervention and 0.03 in control) and outpatient (mean number of appointments 0.33 in intervention and 0.29 in control) services. GPs, pharmacists and school nurses were the most commonly used community services in both trial arms. Less than 1% of participants in either trial arm used Local Authority-provided accommodation (foster or residential care). Use of teaching support was the same across trial arms (14% of participants used teaching support on some days and 2% used it daily).
Mean use was higher in the intervention group for some services (e.g., outpatient (other), nurse/midwife) and higher in the control group for others (e.g., counselling, respite care). Overall, however, these differences were very small. The use of prescribed medication for mental health problems is summarised in Supplementary Table S7. Medication use was similar across trial arms, with 3% of participants using medication for anxiety and depression or eating disorders, 2% using medication for ADHD and 1% using medication for tics/Tourette's.
Medication for psychosis was the most commonly reported in both trial arms (5% in intervention and 4% in control). However, this is much higher than rates reported in the literature (antipsychotics were prescribed to 1.74% of pupils with intellectual disabilities and 0.12% of those without) (Henderson et al., 2021). We therefore assessed the impact of excluding these medication costs in a post-hoc sensitivity analysis (Supplementary Table S10; scenario #9).

Intervention costs
The average cost of delivering SBMT was £1,906·54 per teacher or £73·85 per pupil. A breakdown of the intervention cost is provided in Supplementary Table S8. Total average cost was £176·50 per teacher for phase 1 (self-mindfulness training) and £1,730·04 per teacher for phase 2 (syllabus training).
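The per-teacher figure is the sum of the two training phases, which the following sketch verifies; the per-pupil back-calculation at the end is our own inference, not a figure stated in the supplement.

```python
# Reported figures from Supplementary Table S8 (per teacher, GBP).
phase_1 = 176.50    # phase 1: self-mindfulness training
phase_2 = 1730.04   # phase 2: syllabus training
per_teacher = phase_1 + phase_2
print(round(per_teacher, 2))  # → 1906.54, matching the reported total

# The reported £73.85 per pupil implies roughly 26 pupils taught per
# trained teacher; this back-calculation is ours, not the supplement's.
per_pupil = 73.85
pupils_per_teacher = per_teacher / per_pupil
```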

Costs and outcomes
Data used in the economic analyses, including costs, CHU9D utilities and QALYs, and the three co-primary clinical outcome measures, are reported in Supplementary Table S9. Total mean costs and QALYs were higher in the intervention arm (mean costs £1333·57, SD=£2389·40; mean QALYs 0·871, SD=0·130) compared to the control arm (mean costs £1290·79, SD=£1379·13; mean QALYs 0·847, SD=0·131). However, differences between trial arms in both costs and QALYs were small in adjusted analyses (adjusted mean difference in cost £6·84, 95% CI -£128·04 to £141·72; adjusted mean difference in QALYs 0·012, 95% CI -0·015 to 0·038). Differences between trial arms in all three primary outcome measures were also small, with better outcomes (lower scores) reported in the control arm compared to the intervention arm.

Cost-effectiveness analyses
Incremental costs and effects over 1-year follow-up for all analyses are presented in Supplementary Table S10.
The main cost-utility analysis, using QALYs as the outcome of interest (scenario #1), suggests that SBMT has a higher probability (83%) of being cost-effective than control at the willingness-to-pay threshold range used by NICE (£20,000-£30,000 per QALY). All sensitivity tests of the main cost-utility analysis (three pre-specified and outlined in the main paper, two post-hoc and outlined in this supplement) support this suggestion that SBMT has a higher probability of being cost-effective than control (probability >50%) at willingness-to-pay thresholds of £20,000-£30,000 per QALY (Supplementary Figure S2). Secondary cost-effectiveness analyses, using the primary clinical outcomes as the measure of effect, suggest that SBMT has a lower probability of being cost-effective than control (<40%; Supplementary Figure S1) at all willingness-to-pay thresholds.