Telehealth/m-Health
Who uses online self-help? Using machine learning to predict treatment uptake
Gavin N. Rackoff, M.S.
Student
The Pennsylvania State University
State College, Pennsylvania
Michelle G. Newman, Ph.D.
Professor of Psychology and Psychiatry
The Pennsylvania State University
University Park, Pennsylvania
Online self-help interventions are efficacious for treating common mental health problems (Harrer et al., 2019), yet their public health impact is limited by low treatment uptake. For example, a systematic review suggested that between 21% and 88% of people offered online self-help interventions recorded at least minimal use (Fleming et al., 2018). Identifying who is more or less likely to access an intervention when offered is crucial to maximize these interventions’ effectiveness. If likely users and non-users can be differentiated, the likely users can be offered self-help programs, whereas likely non-users may require additional support to improve motivation to use self-help interventions or different interventions altogether.
This project used data from the treatment arm of a trial examining online self-help among college students reporting elevated stress during the COVID-19 pandemic (n = 301), of whom 52% accessed the self-help program at least once. The goal was to train a machine learning model to predict treatment uptake (i.e., accessing the program at least once) from demographics and self-report measures completed at the pre-treatment assessment. We trained several machine learning models on a training set of 155 observations and examined their performance in predicting treatment uptake in the remainder of the data. Principal component analysis was used to reduce the 11 pre-treatment symptom measures to a single symptom severity dimension, which served as a predictor alongside demographic data (11 other predictors). We compared methods ranging in modeling flexibility: logistic regression, generalized additive models, support vector machines, and random forests. Tuning parameters for all methods were selected using k-fold cross-validation.
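The abstract itself contains no analysis code, but a minimal sketch of this kind of pipeline in Python with scikit-learn might look like the following. File and column names are hypothetical, the tuning grids are illustrative, and the generalized additive models fit in the study would require a separate package (e.g., pyGAM); this is a sketch of the approach, not the study's actual analysis.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

df = pd.read_csv("pretreatment_data.csv")               # hypothetical file name
symptom_cols = [f"symptom_{i}" for i in range(1, 12)]   # 11 symptom measures (hypothetical names)
demo_cols = [f"demo_{i}" for i in range(1, 12)]         # 11 demographic/other predictors (hypothetical names)
X, y = df[symptom_cols + demo_cols], df["uptake"]       # uptake = accessed program at least once (0/1)

# Hold out a test set; the training set matches the 155 observations described above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=155, stratify=y, random_state=0)

# Reduce the 11 symptom measures to a single "severity" component;
# pass the remaining predictors through after standardization.
preprocess = ColumnTransformer([
    ("severity", Pipeline([("scale", StandardScaler()), ("pca", PCA(n_components=1))]), symptom_cols),
    ("demo", StandardScaler(), demo_cols),
])

# Candidate learners with illustrative tuning grids (GAMs omitted; see note above).
candidates = {
    "logistic_regression": (LogisticRegression(max_iter=1000), {"clf__C": [0.01, 0.1, 1, 10]}),
    "svm_linear": (SVC(kernel="linear"), {"clf__C": [0.01, 0.1, 1, 10]}),
    "random_forest": (RandomForestClassifier(random_state=0),
                      {"clf__n_estimators": [200, 500], "clf__max_depth": [2, 4, None]}),
}

for name, (clf, grid) in candidates.items():
    pipe = Pipeline([("prep", preprocess), ("clf", clf)])
    search = GridSearchCV(pipe, grid, cv=5, scoring="accuracy")   # k-fold CV for tuning
    search.fit(X_train, y_train)
    print(f"{name}: test accuracy = {search.score(X_test, y_test):.2f}")
```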
In terms of test set accuracy, a support vector machine with a linear kernel performed best at predicting program uptake, with 71% sensitivity and 62% specificity (67% overall accuracy), and was therefore selected as the final model. Examination of the model indicated that overall symptom severity, gender, and self-reported treatment interest were among the most influential predictors of treatment uptake; higher symptom severity, female gender, and interest in receiving treatment were all associated with a higher probability of accessing the program.
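Continuing the illustrative sketch above, the test-set sensitivity and specificity of the selected linear-kernel support vector machine, and a rough ranking of predictor influence from its coefficients, could be computed as follows. Variable names are carried over from the previous sketch and remain hypothetical.

```python
# Reuses preprocess, X_train/X_test, y_train/y_test, and demo_cols from the sketch above.
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Refit the selected model (linear-kernel SVM) with cross-validated tuning.
svm = Pipeline([("prep", preprocess), ("clf", SVC(kernel="linear"))])
svm_search = GridSearchCV(svm, {"clf__C": [0.01, 0.1, 1, 10]}, cv=5)
svm_search.fit(X_train, y_train)
best = svm_search.best_estimator_

# Sensitivity/specificity on the held-out test set.
tn, fp, fn, tp = confusion_matrix(y_test, best.predict(X_test)).ravel()
print(f"sensitivity = {tp / (tp + fn):.2f}")   # users correctly predicted to access the program
print(f"specificity = {tn / (tn + fp):.2f}")   # non-users correctly predicted not to access it
print(f"accuracy    = {(tp + tn) / (tp + tn + fp + fn):.2f}")

# With a linear kernel, coef_ holds one weight per (standardized) predictor;
# larger absolute weights suggest more influential predictors. Column order
# follows the ColumnTransformer: severity component first, then demographics.
weights = best.named_steps["clf"].coef_.ravel()
feature_names = ["symptom_severity"] + demo_cols
for name, w in sorted(zip(feature_names, weights), key=lambda t: -abs(t[1])):
    print(f"{name}: {w:+.3f}")
```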
Results suggest that data-driven machine learning approaches can be used to identify likely users and non-users of online self-help programs. This information may be used to maximize the public health impact of online self-help by supporting motivation among likely non-users or triaging them to other interventions.
References
Fleming, T., et al. (2018). Beyond the trial: Systematic review of real-world uptake and engagement with digital self-help interventions for depression, low mood, or anxiety. J Med Internet Res, 20, e199. doi:10.2196/jmir.9275
Harrer, M., et al. (2019). Internet interventions for mental health in university students: A systematic review and meta-analysis. Int J Methods Psychiatr Res, 28, e1759. doi:10.1002/mpr.1759