Feasibility as a mechanism for model identification and validation
Journal of Applied Statistics (IF 1.2), Pub Date: 2020-06-29, DOI: 10.1080/02664763.2020.1783522
Corrine F Elliott 1, Joshua W Lambert 1,2, Arnold J Stromberg 1, Pei Wang 1, Ting Zeng 1, Katherine L Thompson 1

As new technologies permit the generation of hitherto unprecedented volumes of data (e.g. genome-wide association study data), researchers struggle to keep up with the added complexity and time commitment required for its analysis. For this reason, model selection commonly relies on machine learning and data-reduction techniques, which tend to afford models with obscure interpretations. Even in cases with straightforward explanatory variables, the so-called ‘best’ model produced by a given model-selection technique may fail to capture information of vital importance to the domain-specific questions at hand. Herein we propose a new concept for model selection, feasibility, for use in identifying multiple models that are in some sense optimal and may unite to provide a wider range of information relevant to the topic of interest, including (but not limited to) interaction terms. We further provide an R package and associated Shiny Applications for use in identifying or validating feasible models, the performance of which we demonstrate on both simulated and real-life data.



