Imputed spatial data: Cautions arising from response and covariate imputation measurement error
Spatial Statistics ( IF 2.3 ) Pub Date : 2020-02-03 , DOI: 10.1016/j.spasta.2020.100419
Daniel A. Griffith , Yan-Ting Liau

When data for observations are missing, scientists often remove those observations from an analysis, or replace them with imputations, with few other options available to these analysts. Confining an analysis to those observations with complete data can waste resources expended for, especially, those observations with near-complete data. Selectively retaining variables with complete data for observations with incomplete data can compromise mathematical properties of data analyses (e.g., covariance matrices constructed with pairwise deletion having negative eigenvalues). Unless missing data observations are a random subsample of a given sample, entirely removing these observations can result in biased statistical results. Imputations, such as those furnished by kriging, are best linear unbiased predictors (BLUPs), which can be conditional expectations. As such, they are smoothed values that may be viewed as representing attribute values with measurement error, and hence estimated attribute variance based upon them tends to be biased downward. Because, for example, regression coefficients use variance estimates in their calculations, this downward bias can propagate through a data analysis and its statistical inferences. This paper compares spatial regression analyses between complete datasets and the same datasets in which data values are suppressed and then these missing data are imputed, investigating the presence of such imputations in a response variable as well as in covariates. This study employs the following three imputation methods: kriging, spatial autoregression, and Moran eigenvector spatial filtering. Its emphasis is on predictive modeling as well as spatial data quality and uncertainty.
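The variance shrinkage described above can be illustrated with a small simulation. This is a minimal sketch, not the authors' procedure. It assumes a 1-D transect of equally spaced sites, a zero-mean Gaussian process with a known unit-variance exponential covariance, values suppressed completely at random, and simple kriging (the conditional expectation given the observed values) as the imputation. The function name `simulate_and_impute` and all of its parameters are hypothetical.

```python
import numpy as np

def simulate_and_impute(n=400, range_param=2.0, frac_missing=0.3, seed=0):
    """Simulate a spatial variable, suppress values, and impute them by simple kriging.

    All settings here are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)

    # Sites on a 1-D transect with unit-variance exponential covariance (assumed known).
    coords = np.arange(n, dtype=float)
    dist = np.abs(coords[:, None] - coords[None, :])
    cov = np.exp(-dist / range_param)

    # Draw one realization of the zero-mean process via a Cholesky factor.
    y = np.linalg.cholesky(cov) @ rng.standard_normal(n)

    # Suppress a random subset of sites.
    missing = np.zeros(n, dtype=bool)
    n_missing = int(frac_missing * n)
    missing[rng.choice(n, size=n_missing, replace=False)] = True
    obs = ~missing

    # Simple kriging weights: solve C_oo W = C_om, so the predictor is W^T y_o.
    c_oo = cov[np.ix_(obs, obs)]
    c_om = cov[np.ix_(obs, missing)]
    weights = np.linalg.solve(c_oo, c_om)

    y_imputed = y.copy()
    y_imputed[missing] = weights.T @ y[obs]

    # Theoretical variance of each predictor: diag(C_mo C_oo^{-1} C_om).
    # It cannot exceed the process variance (1.0 here), which is the smoothing effect.
    predictor_var = np.sum(weights * c_om, axis=0)

    return y, y_imputed, missing, predictor_var


if __name__ == "__main__":
    y, y_imp, missing, predictor_var = simulate_and_impute()
    print("sample variance, true values at suppressed sites:", y[missing].var())
    print("sample variance, imputed values at those sites:  ", y_imp[missing].var())
    print("mean theoretical predictor variance:             ", predictor_var.mean())
```

Under these assumptions the imputed values carry less variance than the values they replace, so any variance estimate computed from the completed dataset tends to be too small. The same comparison could be repeated with other imputation methods, such as spatial autoregression or Moran eigenvector spatial filtering, which are not sketched here.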




Updated: 2020-02-03