A Multivariate Poisson Deep Learning Model for Genomic Prediction of Count Data.
G3: Genes, Genomes, Genetics (IF 2.1), Pub Date: 2020-10-27, DOI: 10.1534/g3.120.401631
Osval Antonio Montesinos-López 1 , José Cricelio Montesinos-López 2 , Pawan Singh 3 , Nerida Lozano-Ramirez 3 , Alberto Barrón-López 4 , Abelardo Montesinos-López 5 , José Crossa 6, 7

The paradigm called genomic selection (GS) is a revolutionary way of developing new plants and animals. It is a predictive methodology, since it relies on learning methods to perform its task. Unfortunately, no universal model can be used for all types of predictions, so specific methodologies are required for each type of output (response variable). Since efficient methodologies for multivariate count outcomes are lacking, this paper proposes a multivariate Poisson deep neural network (MPDN) model for the simultaneous genomic prediction of several count outcomes. The MPDN model uses the negative log-likelihood of a Poisson distribution as its loss function. Its hidden layers use the rectified linear unit (ReLU) activation function to capture nonlinear patterns, and its output layer uses the exponential activation function to produce outputs on the same scale as the counts. The proposed MPDN model was compared to conventional generalized Poisson regression models and to univariate Poisson deep learning models on two experimental data sets of count data. We found that the proposed MPDN model outperformed the univariate Poisson deep neural network models but, in terms of prediction, did not outperform the univariate generalized Poisson regression models. All deep learning models were implemented with TensorFlow as the back-end and Keras as the front-end, which allows these models to be fitted to moderate and large data sets, a significant advantage over previous GS models for multivariate count data.
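The architecture described in the abstract (ReLU hidden layers, an exponential output layer with one unit per count trait, and a Poisson negative log-likelihood loss) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation, which used Keras and TensorFlow; the function names, layer sizes, and toy weights below are hypothetical and chosen only to make the forward pass and the loss easy to follow.

```python
import math


def relu(values):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, bias):
    """Affine layer: `weights` holds one row of coefficients per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, bias)
    ]


def forward(inputs, params):
    """Forward pass. `params` is a list of (weights, bias) pairs.

    All pairs except the last are hidden layers with ReLU; the last pair
    is the output layer with an exponential activation, so every predicted
    mean is strictly positive, as a Poisson mean must be.
    """
    hidden = inputs
    for weights, bias in params[:-1]:
        hidden = relu(dense(hidden, weights, bias))
    out_weights, out_bias = params[-1]
    return [math.exp(z) for z in dense(hidden, out_weights, out_bias)]


def poisson_nll(y_true, y_pred):
    """Negative Poisson log-likelihood summed over all count traits.

    Per trait: mu - y * log(mu) + log(y!). The log(y!) term does not
    depend on the network weights, so it can be dropped during training.
    """
    return sum(
        mu - y * math.log(mu) + math.lgamma(y + 1)
        for y, mu in zip(y_true, y_pred)
    )


# Hypothetical toy network: 2 markers -> 2 hidden units -> 2 count traits.
toy_params = [
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),  # hidden layer (ReLU)
    ([[0.0, 0.0], [1.0, 0.0]], [0.0, 0.0]),  # output layer (exponential)
]
predicted_means = forward([1.0, -2.0], toy_params)
loss = poisson_nll([0, 1], predicted_means)
print(predicted_means, loss)
```

In this sketch, the multivariate aspect comes from sharing the hidden layers across traits while summing the per-trait losses. A univariate model would instead fit one such network per trait.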




Updated: 2020-11-06