mAML: an automated machine learning pipeline with a microbiome repository for human disease classification.
Database: The Journal of Biological Databases and Curation (IF 5.8), Pub Date: 2020-06-25, DOI: 10.1093/database/baaa050
Fenglong Yang, Quan Zou

Concerted efforts to use microbial features to improve disease prediction have created strong demand for automated machine learning (AutoML) systems that remove the tedium of performing ML tasks manually. Here we developed mAML, an ML model-building pipeline that automatically and rapidly generates optimized, interpretable models for personalized microbiome-based classification tasks in a reproducible way. The pipeline is deployed on a web-based platform; the server is user-friendly and flexible, and is designed to scale according to specific requirements. The pipeline performs well on 13 benchmark datasets covering both binary and multi-class classification tasks. In addition, to facilitate the application of mAML and to expand the repository of human disease-related microbiome learning tasks, we developed the GMrepo ML repository (GMrepo Microbiome Learning repository) from the GMrepo database. The repository comprises 120 microbiome-based classification tasks for 85 human disease phenotypes, drawing on 12 429 metagenomic samples and 38 643 amplicon samples. The mAML pipeline and the GMrepo ML repository are expected to be important resources for research in microbiology and algorithm development.
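The general idea of automated model selection described in the abstract can be illustrated with a minimal sketch. This is not the mAML implementation: it assumes a toy setup in plain Python, with synthetic taxon counts, two simple candidate classifiers (nearest centroid and k-nearest neighbours), and selection by cross-validated accuracy. All function names, the candidate set, and the data generator are hypothetical and chosen only for illustration.

```python
import math
import random
from statistics import mean


def to_relative_abundance(counts):
    """Scale one sample's raw taxon counts so they sum to 1."""
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


def nearest_centroid(train_x, train_y, x):
    """Predict the label whose class centroid is closest to x."""
    best_label, best_dist = None, math.inf
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        centroid = [mean(col) for col in zip(*rows)]
        dist = math.dist(centroid, x)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def make_knn(k):
    """Build a k-nearest-neighbours predictor with majority voting."""
    def predict(train_x, train_y, x):
        ranked = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], x))
        votes = [y for _, y in ranked[:k]]
        return max(sorted(set(votes)), key=votes.count)
    return predict


def cross_val_accuracy(predict, xs, ys, folds=5):
    """Mean accuracy over interleaved folds (sample i is in fold i % folds)."""
    scores = []
    for f in range(folds):
        test_idx = [i for i in range(len(xs)) if i % folds == f]
        train_x = [xs[i] for i in range(len(xs)) if i % folds != f]
        train_y = [ys[i] for i in range(len(xs)) if i % folds != f]
        hits = sum(predict(train_x, train_y, xs[i]) == ys[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return mean(scores)


def select_best_model(xs, ys, candidates):
    """Score every candidate by cross-validation and return the best name."""
    scores = {name: cross_val_accuracy(fn, xs, ys) for name, fn in candidates.items()}
    return max(scores, key=scores.get), scores


def synthetic_dataset(n_samples=60, n_taxa=6, seed=0):
    """Hypothetical data: 'disease' samples are enriched in taxon 0."""
    rng = random.Random(seed)
    xs, ys = [], []
    for i in range(n_samples):
        counts = [rng.randint(5, 50) for _ in range(n_taxa)]
        diseased = i % 2 == 1
        if diseased:
            counts[0] += 500
        xs.append(to_relative_abundance(counts))
        ys.append("disease" if diseased else "healthy")
    return xs, ys


# Hypothetical candidate set; a real AutoML system would search far more
# preprocessing steps, model families, and hyperparameters.
CANDIDATES = {
    "nearest_centroid": nearest_centroid,
    "knn_k1": make_knn(1),
    "knn_k5": make_knn(5),
}

if __name__ == "__main__":
    xs, ys = synthetic_dataset()
    best, scores = select_best_model(xs, ys, CANDIDATES)
    print("cross-validated accuracy:", scores)
    print("selected model:", best)
```

The sketch keeps the three stages separate (normalization, candidate scoring, selection) so that each could be swapped out independently; fixing the random seed and the fold assignment makes a run repeatable, which is the sense of reproducibility assumed here.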

Updated: 2020-06-26