A unified approach for outliers and influential data detection: The value of information in retrospect
Stat ( IF 0.7 ) Pub Date : 2021-12-06 , DOI: 10.1002/sta4.442
Jacob Parsons 1 , Le Bao 1

Identifying influential and outlying data is important because it guides the effective collection of future data and the proper use of existing information. We develop a unified approach for outlier detection and influence analysis. Our proposed method is grounded in intuitive value-of-information concepts and has a distinct advantage in interpretability and flexibility over existing methods: it decomposes the influence of data into the leverage effect (influence that is expected) and the outlying effect (influence beyond what is expected), and it applies to all decision problems, such as estimation, prediction and hypothesis testing. We study the theoretical properties of three value-of-information quantities, establish the relationship between the proposed measures and classic measures in the linear regression setting, and provide real data analysis examples showing how to apply the new value-of-information approach to linear regression, generalized linear mixed models and hypothesis testing.
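For orientation, the classic linear regression measures that the abstract refers to can be sketched as follows. This is a minimal illustration of standard textbook diagnostics (hat-matrix leverage, internally studentized residuals and Cook's distance), not the value-of-information measures proposed in the paper. The function name `classic_influence` and the simulated data are assumptions made for this example.

```python
import numpy as np


def classic_influence(X, y):
    """Classic OLS diagnostics (textbook formulas, not the paper's measures).

    Returns leverage, internally studentized residuals and Cook's distance.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    # Reduced QR: the hat matrix is H = Q Q^T, so diag(H) is the
    # row-wise sum of squares of Q.
    Q, R = np.linalg.qr(X)
    leverage = np.sum(Q**2, axis=1)
    beta = np.linalg.solve(R, Q.T @ y)
    resid = y - X @ beta
    s2 = resid @ resid / (n - p)  # residual variance estimate
    student = resid / np.sqrt(s2 * (1.0 - leverage))
    # Cook's distance combines the outlying part (student**2) with the
    # leverage part (h / (1 - h)).
    cooks = student**2 * leverage / (p * (1.0 - leverage))
    return leverage, student, cooks


# Simulated data (hypothetical): y = 1 + 2x + noise.
rng = np.random.default_rng(0)
n = 30
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=n)
x[0], y[0] = 6.0, 13.0  # far out in x but on the true line: high leverage
x[1], y[1] = 0.0, 10.0  # typical x but far from the line: outlying

X = np.column_stack([np.ones(n), x])  # design matrix with intercept
leverage, student, cooks = classic_influence(X, y)
print("highest leverage:", int(np.argmax(leverage)))
print("largest |studentized residual|:", int(np.argmax(np.abs(student))))
print("largest Cook's distance:", int(np.argmax(cooks)))
```

In this sketch the two planted points separate the two effects: the first is unusual only in its covariate value, the second only in its response. Cook's distance mixes both into a single number, which is the kind of confounding that a leverage/outlying decomposition is meant to untangle.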

Updated: 2021-12-06