Methods
This study was prospectively registered on the Australian and New Zealand Clinical Trial Registry (ACTRN12619000855123) and followed the Consolidated Standards of Reporting Trials guidelines.13 A protocol paper,9 process evaluation14 and baseline characteristics paper15 have been published. The trial was underway during the COVID-19 pandemic, and data collection methods had to be adjusted during October 2020 and July–September 2021, when some assessments were conducted remotely via Zoom because students were learning from home during lockdowns. This affected 1% of participants, of whom 46% were in the intervention group and 54% in the control group.
Study design and participants
A cluster randomised, controlled, single-blind trial with two parallel arms (intervention and control) and 1:1 allocation was conducted. A cluster approach was selected to avoid potential contamination within schools. Recruitment occurred between March 2019 and March 2022. All schools located in the state of New South Wales (Australia) were invited to participate via email invitation. Non-government schools in capital cities around Australia were also invited to participate. A member of the research team contacted invited schools by phone to explore interest in participation and explain the study details. To be eligible, schools were required to have a counsellor or well-being staff member onsite during the data collection visits. Year 8 (13–14 years) students at participating schools were invited to take part. Eligibility criteria for students were attendance at a participating school, active informed consent from both the parent/guardian and the student, and having a mobile phone number and a smartphone with an iOS or Android operating system. There were no restrictions placed on whether participants could seek or change treatments or therapy during the study period. There were no exclusion criteria.
Randomisation and blinding
Schools were randomised with a 1:1 allocation using a computer-generated randomisation schedule, stratified by school size (smaller/larger than 400 students), school location (metropolitan vs regional), school type (coeducational or gender selective) and socioeconomic level (Index of Community Socio-Educational Advantage16; high vs low). The trial statistician conducted the randomisation and was unaware of school identity (and hence condition) throughout the trial and analyses. School allocation was communicated only to the trial manager and team running the trial day to day. The investigator team were unaware of allocation. Schools were not explicitly informed of their allocation but could deduce it from study activities. All outcome assessments were conducted electronically and therefore not subject to assessor bias.
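The stratified 1:1 allocation described above can be illustrated with a minimal sketch. It assumes that schools are shuffled within each stratum and then assigned alternately to the two arms; the text does not state the algorithm actually used by the trial statistician, and the function name, school identifiers and stratum labels below are hypothetical.

```python
import random
from collections import defaultdict

ARMS = ("intervention", "control")


def randomise_schools(schools, seed=0):
    """Allocate schools 1:1 within each stratum (illustrative only).

    `schools` is a list of (school_id, stratum) pairs, where a stratum is a
    tuple such as (size, location, type, socioeconomic level).
    Returns a dict mapping school_id to arm.
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for school_id, stratum in schools:
        by_stratum[stratum].append(school_id)

    allocation = {}
    for stratum in sorted(by_stratum):
        ids = by_stratum[stratum]
        rng.shuffle(ids)
        # Randomise which arm is assigned first, so that an odd-sized
        # stratum does not always favour the same arm.
        first = rng.randrange(2)
        for i, school_id in enumerate(ids):
            allocation[school_id] = ARMS[(first + i) % 2]
    return allocation


# Hypothetical schools, not trial data.
demo_schools = [
    ("S01", ("small", "metro", "coed", "high")),
    ("S02", ("small", "metro", "coed", "high")),
    ("S03", ("small", "metro", "coed", "high")),
    ("S04", ("large", "regional", "coed", "low")),
    ("S05", ("large", "regional", "coed", "low")),
    ("S06", ("large", "metro", "selective", "high")),
    ("S07", ("large", "metro", "selective", "high")),
    ("S08", ("large", "metro", "selective", "high")),
]
demo_allocation = randomise_schools(demo_schools, seed=2019)
```

Under this scheme the two arms differ by at most one school within any stratum, which is the balance property that stratification is intended to provide.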
Outcomes and data collection methods
Study outcomes were assessed on student laptops in class at baseline, post intervention (6 weeks post baseline) and follow-up (12 months post baseline; primary endpoint). The assessments were facilitated by our research team, who attended schools in person or via Zoom for data collection visits. Several strategies were put in place to minimise study attrition and maximise survey completion. These included reminding both schools and students about upcoming data collection sessions, having the team provide an overview of the Future Proofing Study and deliver a brief presentation about the value of participating in research at each data collection session, offering snacks for the students at the data collection visits and giving students who were away from school or who had moved school the option to complete surveys in their own time. Additionally, at 6 months post baseline, participants were sent a text message instructing them to complete the primary outcome measure independently, in their own time.
Primary outcome
The Patient Health Questionnaire-Adolescent Version (PHQ-A; 20) is a nine-item questionnaire assessing depressive symptoms in the preceding 2 weeks. It has high specificity (94%) and sensitivity (73%) for major depressive disorder.17 18 Internal consistency was α=0.88 at baseline and α=0.89 at endpoint.
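As an illustration of how a PHQ-A total score might be derived, the sketch below assumes the standard PHQ response format, in which each of the nine items is scored 0 to 3 to give a total of 0 to 27; this scoring is not stated in the text. The cut-off of 15 is the probable depression caseness threshold given under Statistical methods. The function names are hypothetical.

```python
def phqa_total(item_scores):
    """Sum nine PHQ-A item scores (each assumed to be 0, 1, 2 or 3)."""
    if len(item_scores) != 9:
        raise ValueError("PHQ-A has nine items")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each item must be scored 0-3")
    return sum(item_scores)


def probable_caseness(total, cutoff=15):
    """Flag probable depression caseness (PHQ-A >= 15)."""
    return total >= cutoff


# Hypothetical responses, not trial data.
example_total = phqa_total([2, 1, 3, 2, 1, 2, 2, 1, 1])
example_case = probable_caseness(example_total)
```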
Programme engagement and acceptability
Engagement was measured by the number of SPARX modules completed, and an analysis of the effect of engagement level on the primary outcome was planned.9 Guided by previous studies,11 participants were categorised into high (completed ≥4 modules), low (completed 1–3 modules) or no (completed 0 modules) engagement groups. Acceptability was measured at the 6-week post baseline assessment, in the intervention group only, using an instrument designed specifically for SPARX11 with 11 items assessing barriers to use, programme usefulness, how easy or difficult the programme was to understand, behaviour change and whether participants would recommend the programme to others.
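The engagement categories can be expressed as a simple rule. The following is a minimal sketch; the function name is hypothetical, and the upper bound of seven modules is taken from the description of SPARX under Study conditions.

```python
def engagement_group(modules_completed):
    """Map the number of SPARX modules completed (0-7) to an engagement group."""
    if not 0 <= modules_completed <= 7:
        raise ValueError("SPARX has seven modules")
    if modules_completed >= 4:
        return "high"
    if modules_completed >= 1:
        return "low"
    return "no"


# One label per possible module count, from 0 to 7.
groups = [engagement_group(n) for n in range(8)]
```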
Sample size
Separate sample size estimates were calculated for each stage of the study. For stage 1, 1244 students were needed to detect a 0.30 mean standardised difference between conditions, with 80% power and an α value of 0.05 (two tailed). A correlation of 0.5 between baseline and endpoint symptom scores was assumed. A design effect was calculated based on an intraclass correlation coefficient of 0.03 and a mean cluster size of 50 students, yielding an effect of 2.47. This estimate allowed for 30% attrition at the primary endpoint. Calculations were informed by our previous trial of SPARX.11
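The design effect quoted above follows from the standard formula 1 + (m − 1) × ICC, where m is the mean cluster size. The sketch below reproduces that figure and shows one common way to inflate an individually randomised sample size for clustering and attrition. The individually randomised sample size is not reported in the text, so it is left as an input, and the function names are hypothetical.

```python
import math


def design_effect(mean_cluster_size, icc):
    """Design effect for cluster randomisation: 1 + (m - 1) * ICC."""
    return 1 + (mean_cluster_size - 1) * icc


def inflate_sample_size(n_individual, deff, attrition):
    """Inflate a sample size for clustering and for expected attrition."""
    return math.ceil(n_individual * deff / (1 - attrition))


# Values stated in the text: mean cluster size 50, ICC 0.03.
deff = design_effect(50, 0.03)
print(round(deff, 2))
```

With the stated inputs the design effect is 1 + 49 × 0.03 = 2.47, matching the reported value.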
Study conditions
Intervention
SPARX is a web-based cognitive–behavioural treatment for adolescent depression.10 SPARX was adapted into a prevention programme and evaluated in schools, confirming effects on depression (d=0.29; 11). An app version was created for use in the current study. Core skills covered include emotion identification, emotion regulation, behavioural activation, challenging unhelpful thoughts and problem-solving. SPARX is delivered across seven 20 min modules in a game format in which participants navigate through a fantasy world. Participants were encouraged to complete one to two modules per week without human guidance and had access to SPARX for 6 weeks on their smartphones. Most intervention schools scheduled a long in-class session for the baseline assessment, immediately after which students were instructed to download SPARX and begin the first module. Schools were instructed to provide class time for completion of the subsequent intervention modules. Specifically, schools were required to schedule a minimum of four 20 min in-class sessions for students to complete the first four modules of SPARX, with the remaining modules to be completed in students’ own time or in further class time if provided by the school.
Control
Participants in the control group did not have access to any programme during the 6-week intervention period.
Participants in both groups were invited to use a data collection app during the 6 weeks between baseline and post intervention, which included cognitive tasks and passively collected sensor data.9 This was not part of the trial and will be published separately.
Statistical methods
The statistical approach was predetermined.25 Analyses were conducted in Stata (V.18). Analysis of the primary and secondary outcomes used an intention-to-treat approach, including all participants regardless of intervention received. The significance of the primary outcome was based on a planned contrast comparing change in depressive symptoms from baseline to 12 months between the trial arms, using a mixed-model repeated measures (MMRM) analysis. Secondary outcomes were analysed using the same method. Additional analyses were undertaken with the level of engagement with the intervention (high vs low vs no vs control) as a between-group factor. Acceptability data were analysed descriptively. Subgroup analyses were conducted to examine effects within groups defined by gender (female/male) and probable depression caseness at baseline (PHQ-A≥15, yes/no).
Missing outcome data were assumed to be missing at random. A random intercept for school accommodated potential clustering effects, and estimates of the intracluster correlation coefficients are reported. An unconstrained variance–covariance matrix was used to model within-individual dependencies. Degrees of freedom (df) were estimated using the Kenward-Roger method, with tests having more than 1000 df reported as z tests.
Findings
See figure 1 for study flow. A total of 200 schools consented to participate. However, baseline data collection coincided with multiple COVID‐19 lockdowns, and 66 schools withdrew from the study prior to baseline (see online supplemental file 2 for reasons). There were 134 schools allocated as clusters in the final sample, 77 (57.5%) of which were government schools and 57 (42.5%) non‐government schools. From these 134 schools, 20 533 Year 8 students were invited to participate. Consent was obtained from 7577 parents and baseline data collected from 6388 students (consent rate 31%). There were 3266 students (71 schools) randomised to the intervention group, and 3122 students (63 schools) randomised to the control group. Although schools were required to provide in-class time for intervention completion, many schools reported time constraints and school-level restrictions on in-class phone use as barriers. Therefore, schools reported that in these instances, they instructed students to complete SPARX in their own time, outside of school, with regular reminders provided by school staff. Follow-up data at the primary endpoint were provided by 4841 participants (24.2% attrition), with 62 students (0.97%) withdrawing during the study. Primary outcomes were assessed 12 months after baseline (mean=55.5 weeks). The intraclass correlation coefficient for schools at the primary endpoint was 0.051, 95% CI: 0.035 to 0.068.
Figure 1. Study flow.
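The percentages reported for the study flow can be reproduced from the counts given in the text. The sketch below is only an arithmetic check; the variable names are hypothetical. Note that the 31% figure corresponds to students with baseline data as a proportion of students invited.

```python
# Counts as reported in the text.
schools_consented = 200
schools_withdrawn = 66
government_schools = 77
non_government_schools = 57
students_invited = 20533
students_baseline = 6388
intervention_n, control_n = 3266, 3122
followup_n = 4841
withdrawn_n = 62

schools_allocated = schools_consented - schools_withdrawn
pct_government = round(100 * government_schools / schools_allocated, 1)
pct_non_government = round(100 * non_government_schools / schools_allocated, 1)
pct_baseline_of_invited = round(100 * students_baseline / students_invited)
pct_attrition = round(100 * (students_baseline - followup_n) / students_baseline, 1)
pct_withdrawn = round(100 * withdrawn_n / students_baseline, 2)

print(schools_allocated, pct_government, pct_non_government)
print(pct_baseline_of_invited, pct_attrition, pct_withdrawn)
```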
Participant characteristics
See table 1 for baseline characteristics.
Table 1. Baseline characteristics of study participants and schools by group
Primary outcome
Estimated marginal means and between-group and within-group effect sizes are presented in table 2. Intention-to-treat analyses showed no significant difference in change in depressive symptoms between the intervention and control groups from baseline to post intervention (mean change difference= −0.02, z= −0.18, 95% CI: −0.26 to 0.21, p=0.86), a borderline difference from baseline to 6 months, with the intervention group showing a decrease 0.36 units greater than the control group (z= −1.94, 95% CI: −0.73 to 0.00, p=0.05), and no significant between-group difference from baseline to the 12-month primary endpoint (mean change difference= −0.05, z= −0.32, 95% CI: −0.36 to 0.23, p=0.75). Notably, the response rate at 6 months, when questionnaires were administered out of class, was only 31%, so this result needs to be interpreted cautiously and is unlikely to reflect the experiences of the broader sample.
Table 2. Estimated marginal means, SE and within-group effect sizes for depression, anxiety, distress and insomnia symptoms
Within-group contrasts showed a significant reduction in depressive symptoms from baseline to post intervention, with a change of −0.43 units in the intervention group (z= −5.11, 95% CI: −0.59 to −0.26, p<0.001) and −0.41 units in the control group (z= −4.79, 95% CI: −0.57 to −0.24, p<0.001). From post intervention to 6 months, symptoms continued to reduce in both the intervention (mean change= −0.76, z= −5.94, 95% CI: −1.01 to −0.51, p<0.001) and control groups (mean change= −0.42, z= −3.36, 95% CI: −0.67 to −0.18, p<0.001), and then significantly increased from 6 months to 12 months (intervention, mean change=1.09, z=8.23, 95% CI: 0.83 to 1.35, p<0.001; control, mean change=0.77, z=5.98, 95% CI: 0.52 to 1.03, p<0.001), but not differentially so (t(172)= −1.28, 95% CI: −0.96 to 0.20, p=0.20). A contrast comparing symptoms from baseline to 12 months showed no significant change in either the intervention group (mean change= −0.10, z= −0.92, 95% CI: −0.32 to 0.12, p=0.36) or the control group (mean change= −0.05, z= −0.47, 95% CI: −0.27 to 0.17, p=0.64).
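The relationship between the reported z statistics, p values and confidence intervals can be checked with a short calculation. The sketch below assumes a standard normal reference distribution and symmetric Wald-type 95% confidence intervals, which may differ slightly from the Kenward-Roger-adjusted tests used in the analysis; the function names are hypothetical. The inputs are the between-group contrasts reported for the primary outcome.

```python
import math


def two_sided_p(z):
    """Two-sided p value for a z statistic under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2))


def se_from_ci(lower, upper, z_crit=1.96):
    """Approximate SE implied by a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * z_crit)


# Between-group contrasts for depressive symptoms, as reported.
p_post = two_sided_p(-0.18)      # baseline to post intervention
p_6_months = two_sided_p(-1.94)  # baseline to 6 months
p_12_months = two_sided_p(-0.32) # baseline to 12 months

# z implied by the 6-month estimate (-0.36) and its CI (-0.73 to 0.00).
z_implied = -0.36 / se_from_ci(-0.73, 0.00)
```

Rounded to two decimal places, these give p values of 0.86, 0.05 and 0.75, in line with the reported results, and the implied 6-month z is close to the reported −1.94 given rounding of the inputs.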
Secondary outcomes
See table 2 for full details. There were no significant differences in change in anxiety symptoms between groups from baseline to post intervention (mean change difference= −0.09, z= −0.70, 95% CI: −0.35 to 0.16, p=0.49), nor to 12 months (mean change difference=0.16, z=1.56, 95% CI: −0.04 to 0.35, p=0.12). Results for psychological distress were similar, with no between-group differences in change scores from baseline to post intervention (mean change difference=0.16, z=1.69, 95% CI: −0.03 to 0.35, p=0.09), nor to 12 months (mean change difference= −0.13, z= −1.02, 95% CI: −0.38 to 0.12, p=0.31). For insomnia, there were no between-group differences in change between baseline and post intervention (mean change difference=0.07, z=0.64, 95% CI: −0.14 to 0.28, p=0.52), nor at 12 months (mean change difference=0.00, z=0.00, 95% CI: −0.28 to 0.28, p=0.99). Within-group results are reported in online supplemental file 3.
Subgroup analyses
Analyses of the PHQ-A were repeated separately for males and females, and for participants who met criteria for probable depression caseness (PHQ-A≥15) at baseline. For males, mean scores fell by 0.53 points more in the intervention group than the control group from baseline to 6 months (95% CI: 0.01 to 1.07, z= −1.98, p=0.05). No other differences were observed.
Intervention engagement
Of participants allocated to SPARX, 87% installed the app, 57% completed the first module and 12% completed all seven modules. Overall, 43% of participants were in the ‘no engagement’ group (zero modules completed), 35% were in the ‘low engagement’ group (1–3 modules) and 22% were in the ‘high engagement’ group (4+ modules). Females were more likely than males to be high engagers (OR=1.26, 95% CI: 1.03 to 1.54), as were those with no previous diagnosis of a mental health condition (OR=1.37, 95% CI: 1.07 to 1.74).
An MMRM was conducted using these engagement categories and the control group as a between-group factor. There were no differences in depression symptom change between the intervention and the control group, nor between the no and low engagement groups at any of the assessment points (all p values>0.05). A contrast comparing the high engagement group with the control group showed a significant difference in depression symptom change from baseline to post intervention (t(342.16)= −0.52; 95% CI: 0.14 to 0.89, p<0.01). However, these groups already differed in symptoms at baseline (t(349.756)= −1.97; 95% CI: −1.40 to 0.00, p<0.05), which precludes firm conclusions being drawn from this finding. See online supplemental file 4 for full details.
Intervention acceptability
Of participants who accessed SPARX, approximately half (55.8%) reported finding SPARX ‘useful’ or ‘very useful’, and 50.9% of participants indicated they would use this kind of programme again in the future. Just over half (52.9%) of participants found the programme easy to understand, while 13.1% found it difficult to understand, with 34% finding it neither difficult nor easy to understand.