A graphical model for skewed matrix-variate non-randomly missing data.
Biostatistics ( IF 1.8 ) Pub Date : 2018-10-26 , DOI: 10.1093/biostatistics/kxy056
Lin Zhang 1 , Dipankar Bandyopadhyay 2

Epidemiological studies on periodontal disease (PD) collect relevant biomarkers, such as the clinical attachment level (CAL) and the probed pocket depth (PPD), at pre-specified tooth sites clustered within a subject's mouth, along with various other demographic and biological risk factors. Routine cross-sectional evaluations are conducted under a linear mixed model (LMM) framework with underlying normality assumptions on the random terms. However, a careful investigation reveals considerable non-normality in those random terms, in the form of skewness and tail behavior. In addition, PD progression is hypothesized to be spatially referenced, i.e. disease status at proximal tooth sites may differ from that at distally located sites, and tooth missingness is non-random (or informative), given that the number and location of missing teeth are informative about the periodontal health in that region. To mitigate these complexities, we consider a matrix-variate skew-$t$ formulation of the LMM with a Markov graphical embedding to handle the site-level spatial associations of the bivariate (PPD and CAL) responses. Within the same framework, the non-randomly missing responses are imputed via a latent probit regression of the missingness indicator over the responses. Our hierarchical Bayesian framework, powered by relevant Markov chain Monte Carlo steps, addresses the aforementioned complexities within a unified paradigm, and estimates model parameters with seamless sharing of information across the various stages of the hierarchy. Using both synthetic and real clinical data assessing PD status, we demonstrate a significantly improved fit of our proposed model over several alternative models.
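
To make the idea of a latent probit regression of the missingness indicator on the response concrete, the following is a minimal toy sketch, not the authors' model. It assumes a single normally distributed response per site (no skew-$t$ errors, no graph structure, no Bayesian estimation), and the function names, intercept, slope, and response distribution are all hypothetical values chosen for illustration. A site is marked missing when a latent normal variable, whose mean depends linearly on the response, is positive.

```python
import random
from statistics import NormalDist, fmean


def missing_probability(y, intercept=-2.0, slope=0.5):
    """Probit link: P(missing | y) = Phi(intercept + slope * y).

    The intercept and slope are illustrative assumptions, not estimates.
    """
    return NormalDist().cdf(intercept + slope * y)


def simulate_probit_missingness(n_sites=5000, intercept=-2.0, slope=0.5, seed=0):
    """Simulate responses whose chance of being missing grows with their value."""
    rng = random.Random(seed)
    responses, missing = [], []
    for _ in range(n_sites):
        # Hypothetical site-level response, e.g. a CAL-like measurement in mm.
        y = rng.gauss(3.0, 1.5)
        # Latent variable of the probit model: linear predictor plus N(0, 1) noise.
        latent = intercept + slope * y + rng.gauss(0.0, 1.0)
        responses.append(y)
        # The response is treated as missing when the latent variable is positive.
        missing.append(latent > 0.0)
    return responses, missing


responses, missing = simulate_probit_missingness()
observed = [y for y, m in zip(responses, missing) if not m]

print("mean of all responses:     ", round(fmean(responses), 3))
print("mean of observed responses:", round(fmean(observed), 3))
print("fraction missing:          ", round(sum(missing) / len(missing), 3))
```

Because larger responses are more likely to be missing under these assumed parameters, the mean of the observed responses falls below the mean of all responses. This is the kind of bias that motivates modeling the missingness mechanism jointly with the responses rather than ignoring it.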

Updated: 2020-04-17