Principal variable selection to explain grain yield variation in winter wheat from features extracted from UAV imagery
Plant Methods (IF 5.1), Pub Date: 2019-11-01, DOI: 10.1186/s13007-019-0508-7
Jiating Li 1 , Arun-Narenthiran Veeranampalayam-Sivakumar 1 , Madhav Bhatta 2 , Nicholas D Garst 3 , Hannah Stoll 3 , P Stephen Baenziger 3 , Vikas Belamkar 3 , Reka Howard 4 , Yufeng Ge 1 , Yeyin Shi 1

Automated phenotyping technologies are continually advancing the breeding process. However, collecting various secondary traits throughout the growing season and processing massive amounts of data still take considerable effort and time. Selecting the minimum number of secondary traits with the maximum predictive power has the potential to reduce phenotyping effort. The objective of this study was to select the principal features extracted from UAV imagery, and the critical growth stages, that contributed the most to explaining winter wheat grain yield. Multispectral images on five dates and RGB images on seven dates were collected by a UAV system during the spring growing season in 2018. Two classes of features (variables), totaling 172 variables, were extracted for each plot from the vegetation index and plant height maps: pixel statistics and dynamic growth rates. A parametric algorithm, LASSO regression (least absolute shrinkage and selection operator), and a non-parametric algorithm, random forest, were applied for variable selection. The regression coefficients estimated by LASSO and the permutation importance scores provided by random forest were used to determine the ten most important variables influencing grain yield for each algorithm. Both selection algorithms assigned the highest importance scores to variables related to plant height around the grain filling stage. Some vegetation-index-related variables were also selected by the algorithms, mainly at early to mid growth stages and during senescence. Compared with yield prediction using all 172 variables derived from measured phenotypes, prediction using the selected variables performed comparably or even better. We also noticed that the prediction accuracy on the adapted NE lines (r = 0.58–0.81) was higher than on the other lines included in this study, which had different genetic backgrounds (r = 0.21–0.59).
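The LASSO-based ranking described above can be illustrated with a minimal sketch. It is not the authors' implementation: the coordinate-descent solver, the penalty value `alpha`, the synthetic data, and all function names are assumptions made for illustration. It fits a LASSO model on standardized features and ranks variables by the absolute value of their coefficients, which is one plausible reading of "regression coefficients estimated by LASSO" as an importance measure.

```python
import random


def standardize(X):
    """Center each column and scale it to unit (population) variance.

    Assumes no column is constant; a constant column would divide by zero.
    """
    n, p = len(X), len(X[0])
    cols = []
    for j in range(p):
        col = [row[j] for row in X]
        mean = sum(col) / n
        sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5
        cols.append([(v - mean) / sd for v in col])
    return [[cols[j][i] for j in range(p)] for i in range(n)]


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold, clipping at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimize (1/(2n)) * ||y - Xb||^2 + alpha * ||b||_1 on standardized X."""
    Xs = standardize(X)
    n, p = len(Xs), len(Xs[0])
    y_mean = sum(y) / n
    residual = [v - y_mean for v in y]  # all coefficients start at zero
    coefs = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(
                Xs[i][j] * (residual[i] + Xs[i][j] * coefs[j]) for i in range(n)
            ) / n
            # Columns have unit variance, so no further division is needed.
            new_coef = soft_threshold(rho, alpha)
            delta = new_coef - coefs[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= Xs[i][j] * delta
                coefs[j] = new_coef
    return coefs


def rank_by_abs_coefficient(coefs, k):
    """Return the indices of the k coefficients with the largest magnitude."""
    order = sorted(range(len(coefs)), key=lambda j: abs(coefs[j]), reverse=True)
    return order[:k]


def make_synthetic_plots(n_plots=60, n_features=6, seed=0):
    """Hypothetical data: yield depends only on features 0 and 3, plus noise."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_plots)]
    y = [3.0 * row[0] - 2.0 * row[3] + rng.gauss(0, 0.1) for row in X]
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic_plots()
    coefs = lasso_coordinate_descent(X, y, alpha=0.1)
    print("top variables:", rank_by_abs_coefficient(coefs, k=2))
```

In the study the same idea would be applied to the 172 plot-level variables with the ten largest coefficients retained. In practice a maintained library implementation and a cross-validated penalty would be preferable to this hand-rolled solver.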
With the ultra-high-resolution plot imagery obtained by UAS-based phenotyping, we are now able to derive more features, such as the variation of plant height or vegetation indices within a plot rather than just an average, that are potentially very useful for breeding. However, too many features or variables can be derived in this way. The promising results of this study suggest that a selected subset of those variables can achieve grain yield prediction accuracy comparable to that of the full set, while possibly allowing a better allocation of effort and resources for phenotypic data collection and processing.
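The two feature classes mentioned above, within-plot pixel statistics and dynamic growth rates, can be sketched as follows. The abstract does not list which statistics were computed or how growth rates were defined, so the particular statistics, the per-day rate formula, and the function names here are assumptions for illustration only.

```python
import statistics


def plot_pixel_statistics(pixels):
    """Summarize one plot's pixel values (e.g. plant height or a vegetation index).

    The chosen statistics are illustrative, not the study's feature list.
    """
    mean = statistics.fmean(pixels)
    sd = statistics.pstdev(pixels)  # within-plot variation
    return {
        "mean": mean,
        "median": statistics.median(pixels),
        "sd": sd,
        "cv": sd / mean if mean else float("nan"),
        "min": min(pixels),
        "max": max(pixels),
    }


def growth_rate(value_early, value_late, days_between):
    """Change in a plot-level value per day between two flight dates (assumed definition)."""
    if days_between <= 0:
        raise ValueError("days_between must be positive")
    return (value_late - value_early) / days_between


if __name__ == "__main__":
    # Hypothetical plant-height pixels (metres) for one plot on two flight dates.
    early = plot_pixel_statistics([0.40, 0.45, 0.50, 0.55, 0.60])
    late = plot_pixel_statistics([0.80, 0.90, 1.00, 1.10, 1.20])
    print(late)
    print(growth_rate(early["mean"], late["mean"], days_between=10))
```

Computing several such statistics per map and per date is how the variable count grows quickly, which is the motivation for the selection step.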

Updated: 2019-11-01