Facilitating the Calculation of the Efficient Score Using Symbolic Computing
The American Statistician (IF 1.8), Pub Date: 2018-04-03, DOI: 10.1080/00031305.2017.1392361
Alexander Sibley 1, Zhiguo Li 2, Yu Jiang 2, Yi-Ju Li 2, Cliburn Chan 2, Andrew Allen 2, Kouros Owzar 3

ABSTRACT The score statistic continues to be a fundamental tool for statistical inference. In the analysis of data from high-throughput genomic assays, inference based on the score is usually more stable and considerably more computationally efficient than the asymptotically equivalent Wald or likelihood ratio tests, and it lends itself more readily to resampling methods. The score function often depends on a set of unknown nuisance parameters, which have to be replaced by estimators; inference can be improved by calculating the efficient score, which accounts for the variability induced by estimating these parameters. Manual derivation of the efficient score is tedious and error-prone, so we illustrate the use of computer algebra to facilitate this derivation. We demonstrate this process within the context of a standard example from genetic association analyses, though the techniques shown here could be applied to any derivation and have a place in the toolbox of any modern statistician. We further show how the resulting symbolic expressions can be readily ported to compiled languages to develop fast numerical algorithms for high-throughput genomic analysis. We conclude by considering extensions of this approach. The code featured in this report is available online as part of the supplementary material.
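
The kind of derivation the abstract describes can be sketched with a computer algebra system. The example below is only an illustration under stated assumptions: it uses SymPy and a toy normal linear model y_i ~ N(alpha + beta * g_i, sigma^2) with three observations, where beta is the parameter of interest and alpha is the nuisance parameter. Neither the model nor the variable names are taken from the paper or its supplementary code. The efficient score is formed as U_beta - I_{beta,alpha} I_{alpha,alpha}^{-1} U_alpha, and the resulting expression is then printed as C code to show how a symbolic result might be ported to a compiled language.

```python
import sympy as sp

# Hypothetical toy setup (an assumption, not the paper's example):
# n = 3 observations with y_i ~ N(alpha + beta * g_i, sigma^2).
n = 3
alpha, beta = sp.symbols("alpha beta", real=True)
sigma = sp.symbols("sigma", positive=True)
y = sp.symbols(f"y1:{n + 1}", real=True)  # y1, y2, y3
g = sp.symbols(f"g1:{n + 1}", real=True)  # g1, g2, g3

# Log-likelihood up to an additive constant.
loglik = sum(-(yi - alpha - beta * gi) ** 2 / (2 * sigma ** 2)
             for yi, gi in zip(y, g))

# Score components for the parameter of interest and the nuisance parameter.
U_beta = sp.diff(loglik, beta)
U_alpha = sp.diff(loglik, alpha)

# Information blocks as the negative Hessian. In this model they do not
# involve y, so observed and expected information coincide.
I_beta_alpha = -sp.diff(loglik, beta, alpha)
I_alpha_alpha = -sp.diff(loglik, alpha, alpha)

# Efficient score for beta, then evaluated under the null beta = 0.
efficient_score = sp.simplify(U_beta - I_beta_alpha / I_alpha_alpha * U_alpha)
efficient_score_null = sp.simplify(efficient_score.subs(beta, 0))

# Sanity check against the hand-derived form: sum_i (g_i - g_bar)(y_i - alpha) / sigma^2.
g_bar = sum(g) / n
closed_form = sum((gi - g_bar) * (yi - alpha) for yi, gi in zip(y, g)) / sigma ** 2
assert sp.simplify(efficient_score_null - closed_form) == 0

# Emit the expression as C source text for use in compiled code.
c_source = sp.ccode(efficient_score_null)
print(c_source)
```

In this toy model the adjustment simply centers the covariate g around its sample mean. Models with more nuisance parameters would need the matrix form of the same projection, which is where a symbolic derivation saves the most manual effort.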

Updated: 2018-04-03