On classification trees and random forests in corpus linguistics: Some words of caution and suggestions for improvement
Corpus Linguistics and Linguistic Theory (IF 1.0), Pub Date: 2019-04-16, DOI: 10.1515/cllt-2018-0078
Stefan Th. Gries

Abstract This paper is a discussion of methodological problems that (can) arise in the analysis of multifactorial data analyzed with tree-based or forest-based classifiers in (corpus) linguistics. I showcase a data set that highlights where such methods can fail at providing optimal results and then discuss solutions to this problem as well as the interpretation of random forests more generally.

Updated: 2019-04-16