The unbearable lightness of sequenced-based identification
Fungal Diversity (IF 20.3), Pub Date: 2019-05-24, DOI: 10.1007/s13225-019-00428-3
Valérie Hofstetter, Bart Buyck, Guillaume Eyssartier, Sylvain Schnee, Katia Gindro

When the Basic Local Alignment Search Tool (BLAST) is used against GenBank to identify fungi collected in a recently protected beech forest at Montricher (Switzerland), the proportion of ITS sequences associated with the wrong taxon name appears to be around 30%, even higher than previously estimated. This result rests on an in-depth re-examination of BLAST results for the most interesting species collected, viz. first records for Switzerland, rare or patrimonial species, and problematic species (those for which BLAST top scores were equally high for different species), all belonging to Agaricomycotina. This paper dissects for the first time a number of sequence-based identifications, showing in detail, particularly for the user community of taxonomic information, why sequence-based identification in the context of a fungal inventory can easily go wrong. Our first conclusion is that in-depth examination of BLAST results is too time-consuming to be considered a routine approach for future inventories: we spent two months verifying approximately 20 identifications. Poor taxon coverage in public repositories remains the principal impediment to successful species identification; beyond that, it is regrettable that even very recent fungal sequence deposits in GenBank involve an uncomfortably high number of misidentifications or errors in the associated metadata. A positive consequence of checking the original publications associated with the top-scoring sequences for the few examples re-examined here is that we uncovered over 80 type sequences that were not annotated as types in GenBank. Advantages and pitfalls of sequence-based identification are discussed, particularly in the light of undertaking fungal inventories. Recommendations are made to avoid or reduce some of the major problems with sequence-based identification. Nevertheless, the prospects for more reliable sequence-based identification of fungi remain quite dim unless authors are ready to check and update the metadata associated with previously deposited sequences in their publications.

Updated: 2019-05-24