Evaluating the Quality of Ecoinformatics Data Derived From Commercial Agriculture: A Repeatability Analysis of Pest Density Estimates
Journal of Economic Entomology (IF 2.2), Pub Date: 2021-06-05, DOI: 10.1093/jee/toab127
Jay A Rosenheim

Each year, consultants and field scouts working in commercial agriculture undertake a massive, decentralized data collection effort as they monitor insect populations to make real-time pest management decisions. These data, if integrated into a database, offer rich opportunities for applying big data or ecoinformatics methods in agricultural entomology research. However, questions have been raised about whether or not the underlying quality of these data is sufficiently high to be a foundation for robust research. Here I suggest that repeatability analysis can be used to quantify the quality of data collected from commercial field scouting, without requiring any additional data gathering by researchers. In this context, repeatability quantifies the proportion of total variance across all insect density estimates that is explained by differences across populations and is thus a measure of the underlying reliability of observations. Repeatability was moderately high for cotton fields scouted commercially for total Lygus hesperus Knight densities (R = 0.631) and further improved by accounting for observer effects (R = 0.697). Repeatabilities appeared to be somewhat lower than those computed for a comparable, but much smaller, researcher-generated data set. In general, the much larger sizes of ecoinformatics data sets are likely to more than compensate for modest reductions in measurement precision. Tools for evaluating data quality are important for building confidence in the growing applications of ecoinformatics methods.
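To make the definition concrete, the following is a minimal sketch of one common way to estimate repeatability: a one-way ANOVA variance decomposition, in which R is the among-population variance divided by the sum of the among-population and within-population variances. The abstract does not state which estimator the paper uses, so this is an illustrative assumption rather than the author's method. The function name `repeatability`, the field identifiers, and the density values are all hypothetical, and observer effects are not modelled here.

```python
from statistics import mean


def repeatability(groups):
    """ANOVA-based repeatability (intraclass correlation).

    `groups` maps a population identifier (e.g. a field) to a list of
    repeated density estimates for that population. Hypothetical helper
    for illustration; not taken from the paper.
    """
    samples = [g for g in groups.values() if len(g) > 0]
    a = len(samples)                      # number of populations
    n_total = sum(len(g) for g in samples)
    if a < 2 or n_total <= a:
        raise ValueError("need >= 2 populations and some repeated estimates")

    grand_mean = sum(sum(g) for g in samples) / n_total

    # Sums of squares among populations and within populations.
    ss_among = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in samples)
    ss_within = sum((x - mean(g)) ** 2 for g in samples for x in g)

    ms_among = ss_among / (a - 1)
    ms_within = ss_within / (n_total - a)

    # Effective group size; equals the common group size when balanced.
    n0 = (n_total - sum(len(g) ** 2 for g in samples) / n_total) / (a - 1)

    # Among-population variance component. It can come out negative in
    # noisy data; it is returned unclamped so that case stays visible.
    var_among = (ms_among - ms_within) / n0
    return var_among / (var_among + ms_within)


if __name__ == "__main__":
    # Made-up density estimates for three hypothetical fields.
    example = {"field_A": [1.0, 2.0], "field_B": [3.0, 4.0], "field_C": [5.0, 6.0]}
    print(f"R = {repeatability(example):.3f}")
```

A value near 1 means that repeated estimates of the same population agree closely relative to the differences between populations. A value near 0 means that measurement noise dominates.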

Updated: 2021-06-05