Imputation of incomplete large-scale monitoring count data via penalized estimation, Methods in Ecology and Evolution

Methods in Ecology and Evolution (IF 6.6), Pub Date: 2021-03-22, DOI: 10.1111/2041-210x.13594
Mohamed Dakki ₁, Geneviève Robin ₂, Marie Suet ₃, Abdeljebbar Qninba ₁, Mohammed A. El Agbani ₁, Asmâa Ouassou ₁, Rhimou El Hamoumi ₄, Hichem Azafzaf ₅, Sami Rebah ₅, Claudia Feltrup‐Azafzaf ₅, Naoufel Hamouda ₅, Wed A.L. Ibrahim ₆, Hosni H. Asran ₆, Amr A. Elhady ₆, Haitham Ibrahim ₆, Khaled Etayeb ₇,₈, Essam Bouras ₈,₉, Almokhtar Saied ₈,₉, Ashrof Glidan ₈,₉, Bakar M. Habib ₁₀, Mohamed S. Sayoud ₁₁, Nadjiba Bendjedda ₁₂, Laura Dami ₃, Clemence Deschamps ₃, Elie Gaget ₃, Jean‐Yves Mondain‐Monval ₁₃, Pierre Defos du Rau ₁₃

Affiliations

1. Institut Scientifique Université Mohammed V de Rabat Morocco
2. Inria ‐ Université Gustave Eiffel CERMICS (ENPC) F‐77455 Marne‐la‐Vallée France
3. Centre de Recherche de la Tour du Valat Le Sambuc 13200 Arles France
4. Faculté des Sciences Ben M'sik Univ. Hassan II Casablanca Morocco
5. Association "Les Amis des Oiseaux" (AAO/BirdLife en Tunisie) 14, Rue Ibn El Heni, 2ème étage ‐ Bureau N° 4 2080 Ariana Tunisia
6. Egyptian Environmental Affairs Agency 30 Misr/Helwan Road PO 11728 El Maadi Helwan Egypt
7. Zoology Dept Tripoli University PO Box: 13227 Tripoli Libya
8. Libyan Society for Birds P.O. Box 81417 Tripoli Libya
9. Environment General Authority Ganzor Algheran PO Box 13793 Tripoli Libya
10. Conservation des Forêts de la Wilaya d’Oran 31000 Oran Algeria
11. Centre Cynégétique de Réghaia Direction Générale des Forets BP 54/02 Réghaia 16112 Alger Algeria
12. Direction Générale des Forêts 11 Chemin Doudou Mokhtar Ben Aknoun 16000 Alger Algeria
13. Office Français pour la Biodiversité Unité Avifaune Migratrice Le Sambuc 13200 Arles France

In biodiversity monitoring, large datasets are becoming widely available and are increasingly used worldwide to estimate species trends and conservation status. These large-scale datasets challenge existing statistical analysis methods, many of which are not adapted to their size, incompleteness and heterogeneity. Developing scalable methods to impute missing data in such datasets is crucial to balance sampling in time or space and thus better inform conservation policies.
We developed a new method based on penalized Poisson models to impute and analyse incomplete monitoring data in a large-scale framework. The method allows parameterization of (a) space and time factors, (b) the main effects of predictor covariates, as well as (c) space–time interactions. It also benefits from robust statistical and computational capability in large-scale settings.
The method was tested extensively on both simulated and real-life waterbird data, where it outperformed six existing methods in terms of missing-data imputation error. Applying the method to 16 waterbird species, we estimated their long-term trends for the first time at the scale of the whole of North Africa, a region where monitoring data suffer from many gaps in space and time series.
This new approach opens promising perspectives for increasing the accuracy of species-abundance trend estimations. We made it freely available in the R package ‘lori’ (https://CRAN.R-project.org/package=lori) and recommend its use for large-scale count data, particularly in citizen science monitoring programmes.
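
To make the idea of imputing counts with a penalized Poisson model concrete, here is a minimal sketch in plain Python. It is not the authors' estimator and does not call the ‘lori’ package. It assumes a much simpler model with only a site effect and a year effect on the log scale, a ridge penalty in place of whatever penalties the paper uses, and no covariates or space–time interactions. All names (`solve_1d`, `fit_poisson_main_effects`, `impute_counts`, `lam`) and the toy counts are hypothetical.

```python
import math

def solve_1d(c, s, lam, x=0.0, steps=30):
    """Minimise c*exp(x) - s*x + lam*x**2/2 over x with clipped Newton steps.
    Assumes lam > 0 so the second derivative is always positive."""
    for _ in range(steps):
        grad = c * math.exp(x) - s + lam * x
        hess = c * math.exp(x) + lam
        x -= max(-1.0, min(1.0, grad / hess))  # clip to avoid large jumps
    return x

def fit_poisson_main_effects(counts, lam=0.01, sweeps=200):
    """Fit log E[y_ij] = mu + a_i + b_j on observed cells only (None = missing),
    with a ridge penalty lam/2 * (sum a_i^2 + sum b_j^2), by coordinate descent.
    Assumes at least one observed count is positive."""
    n, m = len(counts), len(counts[0])
    obs = [(i, j) for i in range(n) for j in range(m) if counts[i][j] is not None]
    total = sum(counts[i][j] for i, j in obs)
    mu, a, b = 0.0, [0.0] * n, [0.0] * m
    for _ in range(sweeps):
        # mu is unpenalised, so its update has a closed form.
        mu = math.log(total / sum(math.exp(a[i] + b[j]) for i, j in obs))
        for i in range(n):  # site (row) effects
            cols = [j for r, j in obs if r == i]
            c = sum(math.exp(mu + b[j]) for j in cols)
            s = sum(counts[i][j] for j in cols)
            a[i] = solve_1d(c, s, lam, a[i])
        for j in range(m):  # year (column) effects
            rows = [i for i, col in obs if col == j]
            c = sum(math.exp(mu + a[i]) for i in rows)
            s = sum(counts[i][j] for i in rows)
            b[j] = solve_1d(c, s, lam, b[j])
    return mu, a, b

def impute_counts(counts, lam=0.01):
    """Return a copy of counts with missing cells replaced by fitted Poisson means."""
    mu, a, b = fit_poisson_main_effects(counts, lam)
    return [[y if y is not None else math.exp(mu + a[i] + b[j])
             for j, y in enumerate(row)] for i, row in enumerate(counts)]

# Toy sites-by-years table (hypothetical data); None marks a missing survey.
toy_counts = [
    [2, 3, 5, 10],
    [4, 6, None, 20],
    [None, 12, 20, 40],
]
filled = impute_counts(toy_counts)
for row in filled:
    print([round(v, 1) for v in row])
```

Observed counts are left untouched and only the missing cells are filled with fitted means. In this sketch the penalty mainly keeps the site and year effects identifiable and stable when a site or year has few observations.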

Updated: 2021-03-22
