Development and comparison of forensic interval age prediction models by statistical and machine learning methods based on the methylation rates of ELOVL2 in blood DNA
Forensic Science International: Genetics (IF 3.1), Pub Date: 2023-12-25, DOI: 10.1016/j.fsigen.2023.103004
Takayuki Yamagishi, Wataru Sakurai, Ken Watanabe, Kochi Toyomane, Tomoko Akutsu

Age estimation can provide useful information for narrowing down candidate donors of unidentified samples in criminal investigations. Various age estimation models based on DNA methylation biomarkers have been developed for forensic use in the past decade. However, many of these models rely on ordinary least squares regression and cannot produce appropriate estimates, because prediction error increases in older age groups and prediction accuracy deteriorates. In the present study, to address this problem, we developed age estimation models that set an appropriate prediction interval for all age groups by two approaches: a statistical method using quantile regression (QR) and a machine learning method using an artificial neural network (ANN). Methylation datasets (n = 1280, age 0–91 years) of the promoter of the gene encoding ELOVL fatty acid elongase 2 (ELOVL2) were used to develop the QR and ANN models. In validation with several test datasets, both models widened their prediction intervals with increasing age and achieved a high rate of correct prediction (>90%) for older age groups. The QR and ANN models also generated point age predictions with high accuracy. The ANN model achieved a mean absolute error (MAE) of 5.3 years and a root mean square error (RMSE) of 7.3 years on the test dataset (n = 549), comparable to the QR model (MAE = 5.6 years, RMSE = 7.8 years). Applicability to casework was also confirmed using bloodstain samples stored for various periods of time (1–14 years), indicating that the models remain stable for aged bloodstain samples. These results suggest that the proposed models can provide more useful and effective age estimation in forensic settings.
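The evaluation quantities named in the abstract (MAE, RMSE, and the rate at which a prediction interval contains the true age) can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function names are hypothetical, the sample ages and intervals are made-up values, and the pinball loss is shown because it is the standard objective minimised by quantile regression, not because the abstract specifies how the QR model was fitted.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error between true and predicted ages."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error between true and predicted ages."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def pinball_loss(y_true, y_pred, tau):
    """Mean pinball (quantile) loss for quantile level tau in (0, 1).

    Under-prediction is weighted by tau and over-prediction by (1 - tau),
    so minimising it pushes predictions towards the tau-th quantile.
    """
    total = 0.0
    for t, p in zip(y_true, y_pred):
        diff = t - p
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)


def interval_coverage(y_true, lower, upper):
    """Fraction of true ages that fall inside [lower, upper]."""
    hits = sum(1 for t, lo, hi in zip(y_true, lower, upper) if lo <= t <= hi)
    return hits / len(y_true)


# Made-up illustrative values (NOT data from the study).
ages = [10.0, 30.0, 50.0, 70.0]
point = [12.0, 27.0, 55.0, 62.0]
# Hypothetical intervals that widen with predicted age.
lower = [7.0, 20.0, 42.0, 40.0]
upper = [17.0, 36.0, 64.0, 68.0]

if __name__ == "__main__":
    print("MAE:", mae(ages, point))
    print("RMSE:", rmse(ages, point))
    print("Coverage:", interval_coverage(ages, lower, upper))
    print("Pinball loss at tau=0.9:", pinball_loss(ages, point, 0.9))
```

In a sketch like this, a lower and an upper quantile model (for example at levels 0.05 and 0.95, chosen here purely as an assumption) would supply `lower` and `upper`, and the coverage would be compared against the nominal level for each age group.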




Updated: 2023-12-25