Identifying environmental exposure profiles associated with timing of menarche: A two-step machine learning approach to examine multiple environmental exposures
Environmental Research ( IF 7.7 ) Pub Date : 2020-11-26 , DOI: 10.1016/j.envres.2020.110524
Sabine Oskar 1 , Mary S Wolff 2 , Susan L Teitelbaum 2 , Jeanette A Stingone 3

Background

Variation in the timing of menarche has been linked with adverse health outcomes in later life. There is evidence that exposure to hormonally active agents (or endocrine disrupting chemicals; EDCs) during childhood may play a role in accelerating or delaying menarche. The goal of this study was to generate hypotheses on the relationship between exposure to multiple EDCs and timing of menarche by applying a two-stage machine learning approach.

Methods

We used data from the National Health and Nutrition Examination Survey (NHANES) for years 2005–2008. Data were analyzed for 229 female participants 12–16 years of age who had blood and urine biomarker measures of 41 environmental exposures, all with >70% of values above the limit of detection, in seven classes of chemicals. We modeled risk for earlier menarche (<12 years of age vs. 12 years or older) with exposure biomarkers. We applied a two-stage approach: a random forest (RF) to identify important exposure combinations associated with timing of menarche, followed by multivariable modified Poisson regression to quantify associations between exposure profiles (“combinations”) and timing of menarche.
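The first stage rests on recursive partitioning: each tree in a random forest repeatedly picks a biomarker threshold that splits participants into subgroups that are as homogeneous as possible in the outcome. The sketch below shows only that single building block, a Gini-impurity threshold search on one biomarker. It is not the authors' implementation; the function names (`gini`, `best_split`) and the eight data points are hypothetical, chosen purely for illustration.

```python
from typing import Sequence, Tuple


def gini(labels: Sequence[int]) -> float:
    """Gini impurity of a set of binary (0/1) outcome labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_split(values: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Find the 'value <= threshold' split with the lowest weighted Gini impurity.

    Returns (threshold, weighted_impurity). Candidate thresholds are the
    midpoints between consecutive distinct sorted values.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_thr, best_score = float("nan"), float("inf")
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no threshold can separate identical values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_thr, best_score = (lo + hi) / 2.0, score
    return best_thr, best_score


# Synthetic illustration only: hypothetical biomarker concentrations (ng/mL)
# and an indicator for earlier menarche (1 = yes, 0 = no).
biomarker = [0.8, 1.1, 1.9, 2.2, 2.5, 3.0, 4.1, 5.6]
early = [1, 1, 1, 0, 0, 1, 0, 0]

threshold, impurity = best_split(biomarker, early)
print(f"best threshold: {threshold:.2f} ng/mL, weighted Gini: {impurity:.3f}")
```

A full random forest repeats this search over bootstrap samples and random subsets of features, then aggregates many trees. Following the paths through those trees is what yields multi-biomarker profiles rather than a single cut point.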

Results

RF identified urinary concentrations of monoethylhexyl phthalate (MEHP) as the most important feature in partitioning girls into homogeneous subgroups, followed by bisphenol A (BPA) and 2,4-dichlorophenol (2,4-DCP). In this first stage, we identified 11 distinct exposure biomarker profiles, containing five different classes of EDCs associated with earlier menarche. MEHP appeared in all 11 exposure biomarker profiles and phenols appeared in five. Using these profiles in the second stage of analysis, we found a relationship between lower MEHP and earlier menarche (MEHP ≤ 2.36 ng/mL vs >2.36 ng/mL: adjusted PR = 1.36, 95% CI: 1.02, 1.80). Combinations of lower MEHP with benzophenone-3, 2,4-DCP, and BPA had similar associations with earlier menarche, though slightly weaker in those smaller subgroups. For girls not having lower MEHP, exposure profiles included other biomarkers (BPA, enterodiol, monobenzyl phthalate, triclosan, and 1-hydroxypyrene); these showed largely null associations in the second-stage analysis. Adjustment for covariates did not materially change the estimates or CIs of these models. We observed weak or null effect estimates for some exposure biomarker profiles, and relevant profiles consisted of no more than two EDCs, possibly due to small sample sizes in subgroups.
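To make the reported prevalence ratio (PR) and confidence interval concrete, the sketch below computes an unadjusted PR with a 95% CI from a 2×2 table. It assumes a single binary profile indicator and no covariates. In that special case, a log-link Poisson model with robust (sandwich) variance reduces to the closed form used here. The adjusted estimates in the abstract would need a full regression fit, which this does not reproduce. The function name `prevalence_ratio` and the counts are hypothetical, not taken from the study.

```python
import math
from typing import Tuple


def prevalence_ratio(a: int, n1: int, c: int, n0: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Unadjusted prevalence ratio with a Wald CI on the log scale.

    a  : outcome count in the profile (exposed) group, of n1 participants
    c  : outcome count in the reference group, of n0 participants
    The variance of log(PR) is 1/a - 1/n1 + 1/c - 1/n0, which is what the
    robust sandwich estimator gives for a Poisson model with one binary term.
    """
    p1 = a / n1
    p0 = c / n0
    pr = p1 / p0
    se_log = math.sqrt(1.0 / a - 1.0 / n1 + 1.0 / c - 1.0 / n0)
    lower = math.exp(math.log(pr) - z * se_log)
    upper = math.exp(math.log(pr) + z * se_log)
    return pr, lower, upper


# Hypothetical counts for illustration only (not the NHANES data):
# 30 of 60 girls in a profile group vs. 20 of 80 in the reference group
# reached menarche before age 12.
pr, lower, upper = prevalence_ratio(a=30, n1=60, c=20, n0=80)
print(f"PR = {pr:.2f}, 95% CI: {lower:.2f}, {upper:.2f}")
```

The interval is built on the log scale, so it is asymmetric around the PR on the original scale. The same pattern appears in the reported estimate, where 1.36 sits closer to the lower bound (1.02) than to the upper bound (1.80).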

Conclusion

A two-stage approach incorporating machine learning was able to identify interpretable combinations of biomarkers in relation to timing of menarche; these should be further explored in prospective studies. Machine learning methods can serve as a valuable tool to identify patterns within data and generate hypotheses that can be investigated within future, targeted analyses.




Updated: 2021-01-29