Ensemble machine learning model trained on a new synthesized dataset generalizes well for stress prediction using wearable devices
Journal of Biomedical Informatics (IF 4.5), Pub Date: 2023-12-02, DOI: 10.1016/j.jbi.2023.104556
Gideon Vos , Kelly Trinh , Zoltan Sarnyai , Mostafa Rahimi Azghadi

Introduction:

Advances in wearable sensor technology have enabled the collection of biomarkers that may correlate with elevated levels of stress. While significant research has been done in this domain, specifically in using machine learning to detect elevated levels of stress, the challenge of producing a machine learning model that generalizes well to new, unseen data remains. The acute stress response has both subjective, psychological components and objectively measurable, biological components that can be expressed differently from person to person, further complicating the development of a generic stress measurement model. Another challenge is the lack of large, publicly available datasets labeled for stress response that can be used to develop robust machine learning models. In this paper, we first investigate the generalization ability of models built on datasets containing a small number of subjects, recorded in single study protocols. Next, we propose and evaluate methods for combining these datasets into a single, large dataset to study the generalization capability of machine learning models built on larger datasets. Finally, we propose and evaluate the use of ensemble techniques by combining gradient boosting with an artificial neural network to measure predictive power on new, unseen data. In favor of reproducible research and to help the community advance the field, we make all our experimental data and code publicly available through GitHub at https://github.com/xalentis/Stress. This paper’s in-depth study of machine learning model generalization for stress detection provides an important foundation for the further study of stress response measurement using sensor biomarkers recorded with wearable technologies.

Methods:

Sensor biomarker data from six public datasets were utilized in this study. Exploratory data analysis was performed to understand the physiological variance between study subjects, and the complexity it introduces in building machine learning models capable of detecting elevated levels of stress on new, unseen data. To test model generalization, we developed a gradient boosting model trained on one dataset (SWELL) and tested its predictive power on two datasets previously used in other studies (WESAD, NEURO). Next, we merged four small datasets (SWELL, NEURO, WESAD, and UBFC-Phys) to provide a combined total of 99 subjects, and applied feature engineering to generate additional features from statistical summaries computed over sliding windows of 25 s. We name this large dataset StressData. In addition, we applied random sampling to StressData combined with another dataset (EXAM) to build a larger training dataset consisting of 200 synthesized subjects, which we name SynthesizedStressData. Finally, we developed an ensemble model that combines our gradient boosting model with an artificial neural network, and tested it using Leave-One-Subject-Out (LOSO) validation and on two additional, unseen publicly available stress biomarker datasets (WESAD and Toadstool).
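
As an illustration only, the sliding-window summaries and the LOSO splitting described above might be sketched as follows. The abstract gives the 25 s window length but not the sampling rate, the step size, or the exact statistics, so the 1 Hz rate, the one-sample step, the mean/std/min/max set, and the names `window_features` and `loso_splits` are all assumptions and are not taken from the authors' repository.

```python
from collections import defaultdict
from statistics import mean, stdev


def window_features(signal, sample_rate_hz=1.0, window_s=25.0):
    """Return one dict of summary statistics per sliding window.

    Assumptions (not from the paper): the window advances one sample at
    a time, and the statistics are mean, standard deviation, min and max.
    """
    size = int(round(window_s * sample_rate_hz))
    if size < 2:
        raise ValueError("window must span at least two samples")
    features = []
    for end in range(size, len(signal) + 1):
        window = signal[end - size:end]
        features.append({
            "mean": mean(window),
            "std": stdev(window),
            "min": min(window),
            "max": max(window),
        })
    return features


def loso_splits(subject_ids):
    """Yield (held_out, train_idx, test_idx), holding out one subject per fold."""
    by_subject = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        by_subject[sid].append(idx)
    for held_out, test_idx in by_subject.items():
        train_idx = [i for i, sid in enumerate(subject_ids) if sid != held_out]
        yield held_out, train_idx, test_idx


# Hypothetical heart-rate trace sampled once per second (30 samples).
heart_rate = [60 + (i % 5) for i in range(30)]
rows = window_features(heart_rate)

# Hypothetical subject label for each feature row of a small dataset.
subject_ids = ["s1", "s1", "s2", "s3", "s3"]
splits = list(loso_splits(subject_ids))
```

Holding out every row of one subject at a time means a fold never tests on a person the model has already seen, which is the property the generalization claims rest on.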

Results:

Our results show that previous models built on datasets containing a small number (<50) of subjects, recorded in single study protocols, cannot generalize well to new, unseen datasets. Our presented methodology for generating a large, synthesized training dataset, which uses random sampling to construct scenarios closely aligned with experimental conditions, demonstrates significant benefits. When combined with feature engineering and ensemble learning, our method delivers a robust stress measurement system capable of achieving 85% predictive accuracy on new, unseen validation data, a 25% performance improvement over single models trained on small datasets. The resulting model can be used as either a classification or a regression predictor for estimating the level of perceived stress when applied to specific sensor biomarkers recorded using a wearable device, while further allowing researchers to construct large, varied datasets for training machine learning models that closely emulate their exact experimental conditions.
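
The abstract does not spell out the sampling procedure, so the following is only one plausible reading of "random sampling to construct scenarios": labeled segments are drawn at random from a pool of real recordings and concatenated in the order of a chosen protocol. The function `synthesize_subject`, the pool layout, the rest/stress/rest protocol, and the toy values are hypothetical.

```python
import random


def synthesize_subject(pool, protocol, rng):
    """Build one synthetic subject by sampling labeled segments.

    pool:     dict mapping label -> list of segments (each a list of samples)
    protocol: ordered labels, e.g. [0, 1, 0] for rest / stress / rest
    rng:      a random.Random instance, passed in for reproducibility
    """
    samples, labels = [], []
    for label in protocol:
        segment = rng.choice(pool[label])  # pick one real segment with this label
        samples.extend(segment)
        labels.extend([label] * len(segment))
    return samples, labels


# Toy pool: label 0 = baseline, label 1 = stress (values are made up).
pool = {
    0: [[60, 61, 62], [58, 59, 60]],
    1: [[80, 82, 85], [90, 91, 93]],
}
rng = random.Random(0)
synthetic = [synthesize_subject(pool, [0, 1, 0], rng) for _ in range(200)]
```

Changing `protocol` is what would let a researcher mirror the phase order of their own experiment, under the assumptions stated above.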

Conclusion:

Models trained on small, single study protocol datasets do not generalize well to new, unseen data and lack statistical power. Machine learning models trained on a dataset containing a larger number of varied study subjects capture physiological variance better, resulting in more robust stress detection. Feature engineering assists in capturing this physiological variance, and this is further improved by ensemble techniques that combine the predictive power of different machine learning models, each capable of learning unique signals contained within the data. While there is a general lack of large, labeled public datasets that can be utilized for training machine learning models capable of accurately measuring levels of acute stress, random sampling techniques can successfully be applied to construct larger, varied datasets from these smaller sample datasets for building robust machine learning models.
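
How the gradient boosting model and the neural network are combined is not stated in the abstract. A common choice, assumed here purely for illustration, is soft voting: a weighted average of each model's predicted stress probability. The two model functions below are hand-written stand-ins, not trained models, and `ensemble_probability` is a hypothetical name.

```python
def ensemble_probability(models, x, weights=None):
    """Weighted average of each model's predicted stress probability."""
    if weights is None:
        weights = [1.0] * len(models)
    total = sum(weights)
    return sum(w * model(x) for w, model in zip(weights, models)) / total


# Stand-ins for trained models: each maps a heart-rate value to P(stress).
def boosted_model(x):
    return 0.8 if x > 75 else 0.2


def neural_model(x):
    return 0.6 if x > 70 else 0.1


models = [boosted_model, neural_model]
p_mid = ensemble_probability(models, 72)    # the two models disagree here
p_high = ensemble_probability(models, 90)   # both models lean toward stress
is_stressed = p_high >= 0.5
```

Averaging lets one model's confident signal offset the other's miss, which is one way to read "each capable of learning unique signals contained within the data".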




Updated: 2023-12-03