Effect of missing values in multi-environmental trials on variance component estimates, Crop Science

Effect of missing values in multi-environmental trials on variance component estimates
Crop Science (IF 2.0), Pub Date: 2021-08-17, DOI: 10.1002/csc2.20621
Jens Hartung, Hans‐Peter Piepho

A common task in the analysis of multi-environmental trials (MET) by linear mixed models (LMM) is the estimation of variance components (VCs). Most often, MET data are imbalanced (e.g., due to selection). The mechanism behind the imbalance can be missing completely at random (MCAR), missing at random (MAR), or missing not at random. If the missing-data pattern in MET was caused by selection, it is usually MAR. In this case, likelihood-based methods are preferred for analysis because they can account for a MAR data pattern. Likelihood-based methods used to estimate VCs in LMM constrain all VC estimates to be non-negative; thus, the estimators are generally biased. There are therefore two potential causes of bias in MET analysis: a data pattern that is not MCAR, and the bias of likelihood-based VC estimators. The current study tries to dissect and quantify both possible sources of bias. A simulation study with MET data typical of cultivar evaluation trials was conducted in which the missing-data pattern and the size of the VCs were varied. The results showed that, for the simulated MET, bias in VC estimates was similar under MCAR and MAR. Thus, the bias is solely due to the likelihood-based estimation. Bias increased when the ratio of genotype variance to error variance was small. It may be concluded that selection does not increase bias in VC estimation.
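The link between the non-negativity constraint and bias can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation: it assumes a complete, balanced one-way layout (genotypes by environments, no missing values) and uses the ANOVA estimator of the genotype variance, truncated at zero, as a simple stand-in for a constrained likelihood-based estimator. The function name `simulate_bias` and all parameter values are hypothetical.

```python
import random
import statistics


def simulate_bias(var_g, var_e, n_geno=20, n_env=4, n_rep=2000, seed=1):
    """Return (mean unconstrained, mean truncated) estimate of var_g.

    Illustrative model (an assumption, not the paper's design):
    y_ij = g_i + e_ij, with g_i ~ N(0, var_g) and e_ij ~ N(0, var_e).
    """
    rng = random.Random(seed)
    sd_g, sd_e = var_g ** 0.5, var_e ** 0.5
    raw, truncated = [], []
    for _ in range(n_rep):
        geno_means, within_ss = [], 0.0
        for _ in range(n_geno):
            g = rng.gauss(0.0, sd_g)
            ys = [g + rng.gauss(0.0, sd_e) for _ in range(n_env)]
            m = statistics.fmean(ys)
            geno_means.append(m)
            within_ss += sum((y - m) ** 2 for y in ys)
        # Mean squares of the balanced one-way ANOVA
        ms_between = n_env * statistics.variance(geno_means)
        ms_within = within_ss / (n_geno * (n_env - 1))
        # Unbiased ANOVA estimator; it can be negative
        est = (ms_between - ms_within) / n_env
        raw.append(est)
        # Non-negativity constraint: negative estimates are set to zero
        truncated.append(max(0.0, est))
    return statistics.fmean(raw), statistics.fmean(truncated)


if __name__ == "__main__":
    for var_g in (0.05, 1.0):
        raw_mean, trunc_mean = simulate_bias(var_g, var_e=1.0)
        print(f"var_g={var_g}: unconstrained={raw_mean:.4f}, "
              f"truncated={trunc_mean:.4f}")
```

Under these assumptions, the truncated estimator should average above the true genotype variance when the genotype-to-error variance ratio is small, because negative estimates are replaced by zero, whereas the two averages should nearly coincide when the ratio is large.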


Updated: 2021-08-17
