-
Mendelian randomization analysis using multiple biomarkers of an underlying common exposure Biostatistics (IF 2.1) Pub Date : 2024-03-09 Jin Jin, Guanghao Qi, Zhi Yu, Nilanjan Chatterjee
Summary Mendelian randomization (MR) analysis is increasingly popular for testing the causal effect of exposures on disease outcomes using data from genome-wide association studies. In some settings, the underlying exposure, such as systematic inflammation, may not be directly observable, but measurements can be available on multiple biomarkers or other types of traits that are co-regulated by the
-
Projection-based two-sample inference for sparsely observed multivariate functional data Biostatistics (IF 2.1) Pub Date : 2024-02-28 Salil Koner, Sheng Luo
Summary Modern longitudinal studies collect multiple outcomes as the primary endpoints to understand the complex dynamics of the diseases. Oftentimes, especially in clinical trials, the joint variation among the multidimensional responses plays a significant role in assessing the differential characteristics between two or more groups, rather than drawing inferences based on a single outcome. We develop
-
A Bayesian approach for investigating the pharmacogenetics of combination antiretroviral therapy in people with HIV Biostatistics (IF 2.1) Pub Date : 2024-02-17 Wei Jin, Yang Ni, Amanda B Spence, Leah H Rubin, Yanxun Xu
Summary Combination antiretroviral therapy (ART) with at least three different drugs has become the standard of care for people with HIV (PWH) due to its exceptional effectiveness in viral suppression. However, many ART drugs have been reported to associate with neuropsychiatric adverse effects including depression, especially when certain genetic polymorphisms exist. Pharmacogenetics is an important
-
A Bayesian nonparametric approach for multiple mediators with applications in mental health studies Biostatistics (IF 2.1) Pub Date : 2024-02-09 Samrat Roy, Michael J Daniels, Jason Roy
Summary Mediation analysis with contemporaneously observed multiple mediators is a significant area of causal inference. Recent approaches for multiple mediators are often based on parametric models and thus may suffer from model misspecification. Also, much of the existing literature either only allows estimation of the joint mediation effect or estimates the joint mediation effect just as the sum of
-
Estimation of optimal treatment regimes with electronic medical record data using the residual life value estimator Biostatistics (IF 2.1) Pub Date : 2024-02-09 Grace Rhodes, Marie Davidian, Wenbin Lu
Summary Clinicians and patients must make treatment decisions at a series of key decision points throughout disease progression. A dynamic treatment regime is a set of sequential decision rules that return treatment decisions based on accumulating patient information, like that commonly found in electronic medical record (EMR) data. When applied to a patient population, an optimal treatment regime
-
DP2LM: leveraging deep learning approach for estimation and hypothesis testing on mediation effects with high-dimensional mediators and complex confounders Biostatistics (IF 2.1) Pub Date : 2024-02-08 Shuoyang Wang, Yuan Huang
Summary Traditional linear mediation analysis has inherent limitations when it comes to handling high-dimensional mediators. Particularly, accurately estimating and rigorously inferring mediation effects is challenging, primarily due to the intertwined nature of the mediator selection issue. Despite recent developments, the existing methods are inadequate for addressing the complex relationships introduced
-
Uncertainty directed factorial clinical trials Biostatistics (IF 2.1) Pub Date : 2024-02-08 Gopal Kotecha, Steffen Ventz, Sandra Fortini, Lorenzo Trippa
Summary The development and evaluation of novel treatment combinations is a key component of modern clinical research. The primary goals of factorial clinical trials of treatment combinations range from the estimation of intervention-specific effects, or the discovery of potential synergies, to the identification of combinations with the highest response probabilities. Most factorial studies use balanced
-
Bayesian semiparametric model for sequential treatment decisions with informative timing Biostatistics (IF 2.1) Pub Date : 2024-01-17 Arman Oganisian, Kelly D Getz, Todd A Alonzo, Richard Aplenc, Jason A Roy
Summary We develop a Bayesian semiparametric model for the impact of dynamic treatment rules on survival among patients diagnosed with pediatric acute myeloid leukemia (AML). The data consist of a subset of patients enrolled in a phase III clinical trial in which patients move through a sequence of four treatment courses. At each course, they undergo treatment that may or may not include anthracyclines
-
Covariate-guided Bayesian mixture of spline experts for the analysis of multivariate high-density longitudinal data Biostatistics (IF 2.1) Pub Date : 2023-12-23 Haoyi Fu, Lu Tang, Ori Rosen, Alison E Hipwell, Theodore J Huppert, Robert T Krafty
Summary With rapid development of techniques to measure brain activity and structure, statistical methods for analyzing modern brain-imaging data play an important role in the advancement of science. Imaging data that measure brain function are usually multivariate high-density longitudinal data and are heterogeneous across both imaging sources and subjects, which lead to various statistical and computational
-
Scalable kernel balancing weights in a nationwide observational study of hospital profit status and heart attack outcomes Biostatistics (IF 2.1) Pub Date : 2023-12-21 Kwangho Kim, Bijan A Niknam, José R Zubizarreta
Summary Weighting is a general and often-used method for statistical adjustment. Weighting has two objectives: first, to balance covariate distributions, and second, to ensure that the weights have minimal dispersion and thus produce a more stable estimator. A recent, increasingly common approach directly optimizes the weights toward these two objectives. However, this approach has not yet been feasible
-
A Bayesian multivariate factor analysis model for causal inference using time-series observational data on mixed outcomes Biostatistics (IF 2.1) Pub Date : 2023-12-07 Pantelis Samartsidis, Shaun R Seaman, Abbie Harrison, Angelos Alexopoulos, Gareth J Hughes, Christopher Rawlinson, Charlotte Anderson, André Charlett, Isabel Oliver, Daniela De Angelis
Summary Assessing the impact of an intervention by using time-series observational data on multiple units and outcomes is a frequent problem in many fields of scientific research. Here, we propose a novel Bayesian multivariate factor analysis model for estimating intervention effects in such settings and develop an efficient Markov chain Monte Carlo algorithm to sample from the high-dimensional and
-
Similarity-based multimodal regression Biostatistics (IF 2.1) Pub Date : 2023-12-07 Andrew A Chen, Sarah M Weinstein, Azeez Adebimpe, Ruben C Gur, Raquel E Gur, Kathleen R Merikangas, Theodore D Satterthwaite, Russell T Shinohara, Haochang Shou
Summary To better understand complex human phenotypes, large-scale studies have increasingly collected multiple data modalities across domains such as imaging, mobile health, and physical activity. The properties of each data type often differ substantially and require either separate analyses or extensive processing to obtain comparable features for a combined analysis. Multimodal data fusion enables
-
Evaluating dynamic and predictive discrimination for recurrent event models: use of a time-dependent C-index. Biostatistics (IF 2.1) Pub Date : 2023-11-10 Jian Wang, Xinyang Jiang, Jing Ning
Interest in analyzing recurrent event data has increased over the past few decades. One essential aspect of a risk prediction model for recurrent event data is to accurately distinguish individuals with different risks of developing a recurrent event. Although the concordance index (C-index) effectively evaluates the overall discriminative ability of a regression model for recurrent event data, a local
-
Analyzing microbial evolution through gene and genome phylogenies Biostatistics (IF 2.1) Pub Date : 2023-10-28 Sarah Teichman, Michael D Lee, Amy D Willis
Microbiome scientists critically need modern tools to explore and analyze microbial evolution. Often this involves studying the evolution of microbial genomes as a whole. However, different genes in a single genome can be subject to different evolutionary pressures, which can result in distinct gene-level evolutionary histories. To address this challenge, we propose to treat estimated gene-level phylogenies
-
Signal detection statistics of adverse drug events in hierarchical structure for matched case-control data. Biostatistics (IF 2.1) Pub Date : 2023-10-26 Seok-Jae Heo, Sohee Jeong, Dagyeom Jung, Inkyung Jung
The tree-based scan statistic is a data mining method used to identify signals of adverse drug reactions in a database of spontaneous reporting systems. It is particularly beneficial when dealing with hierarchical data structures. One may use a retrospective case-control study design from spontaneous reporting systems (SRS) to investigate whether a specific adverse event of interest is associated with
-
Multiscale analysis of count data through topic alignment. Biostatistics (IF 2.1) Pub Date : 2023-10-18 Julia Fukuyama, Kris Sankaran, Laura Symul
Topic modeling is a popular method used to describe biological count data. With topic models, the user must specify the number of topics $K$. Since there is no definitive way to choose $K$ and since a true value might not exist, we develop a method, which we call topic alignment, to study the relationships across models with different $K$. In addition, we present three diagnostics based on the alignment
-
Spatial Difference Boundary Detection for Multiple Outcomes Using Bayesian Disease Mapping. Biostatistics (IF 2.1) Pub Date : 2023-10-18 Leiwen Gao, Sudipto Banerjee, Beate Ritz
Regional aggregates of health outcomes over delineated administrative units (e.g., states, counties, and zip codes), or areal units, are widely used by epidemiologists to map mortality or incidence rates and capture geographic variation. To capture health disparities over regions, we seek "difference boundaries" that separate neighboring regions with significantly different spatial effects. Matters
-
A controlled effects approach to assessing immune correlates of protection Biostatistics (IF 2.1) Pub Date : 2023-10-18 Peter B Gilbert, Youyi Fong, Avi Kenny, Marco Carone
Summary An immune correlate of risk (CoR) is an immunologic biomarker in vaccine recipients associated with an infectious disease clinical endpoint. An immune correlate of protection (CoP) is a CoR that can be used to reliably predict vaccine efficacy (VE) against the clinical endpoint and hence is accepted as a surrogate endpoint that can be used for accelerated approval or guide use of vaccines.
-
Joint modeling in presence of informative censoring on the retrospective time scale with application to palliative care research. Biostatistics (IF 2.1) Pub Date : 2023-10-06 Quran Wu, Michael Daniels, Areej El-Jawahri, Marie Bakitas, Zhigang Li
Joint modeling of longitudinal data such as quality of life data and survival data is important for palliative care researchers to draw efficient inferences because it can account for the associations between those two types of data. Modeling quality of life on a retrospective-from-death time scale is useful for investigators to interpret the analysis results of palliative care studies which have relatively
-
A Bayesian nonparametric approach to correct for underreporting in count data. Biostatistics (IF 2.1) Pub Date : 2023-09-16 Serena Arima, Silvia Polettini, Giuseppe Pasculli, Loreto Gesualdo, Francesco Pesce, Deni-Aldo Procaccini
We propose a nonparametric compound Poisson model for underreported count data that introduces a latent clustering structure for the reporting probabilities. The latter are estimated with the model's parameters based on experts' opinion and exploiting a proxy for the reporting process. The proposed model is used to estimate the prevalence of chronic kidney disease in Apulia, Italy, based on a unique
-
Semi-supervised mixture multi-source exchangeability model for leveraging real-world data in clinical trials Biostatistics (IF 2.1) Pub Date : 2023-09-12 Lillian M F Haine, Thomas A Murray, Raquel Nahra, Giota Touloumi, Eduardo Fernández-Cruz, Kathy Petoumenos, Joseph S Koopmeiners
Summary The traditional trial paradigm is often criticized as being slow, inefficient, and costly. Statistical approaches that leverage external trial data have emerged to make trials more efficient by augmenting the sample size. However, these approaches assume that external data are from previously conducted trials, leaving a rich source of untapped real-world data (RWD) that cannot yet be effectively
-
Bayesian joint models for multi-regional clinical trials Biostatistics (IF 2.1) Pub Date : 2023-09-05 Nathan W Bean, Joseph G Ibrahim, Matthew A Psioda
Summary In recent years, multi-regional clinical trials (MRCTs) have increased in popularity in the pharmaceutical industry due to their ability to accelerate the global drug development process. To address potential challenges with MRCTs, the International Council for Harmonisation released the E17 guidance document which suggests the use of statistical methods that utilize information borrowing across
-
Variable selection in high dimensions for discrete-outcome individualized treatment rules: Reducing severity of depression symptoms. Biostatistics (IF 2.1) Pub Date : 2023-08-31 Erica E M Moodie, Zeyu Bian, Janie Coulombe, Yi Lian, Archer Y Yang, Susan M Shortreed
Despite growing interest in estimating individualized treatment rules, little attention has been given to the binary outcome setting. Estimation is challenging with nonlinear link functions, especially when variable selection is needed. We use a new computational approach to solve a recently proposed doubly robust regularized estimating equation to accomplish this difficult task in a case study of depression
-
Identifying predictors of resilience to stressors in single-arm studies of pre-post change. Biostatistics (IF 2.1) Pub Date : 2023-08-05 Ravi Varadhan, Jiafeng Zhu, Karen Bandeen-Roche
Many older adults experience a major stressor at some point in their lives. The ability to recover well after a major stressor is known as resilience. An important goal of geriatric research is to identify factors that influence resilience to stressors. Studies of resilience in older adults are typically conducted with a single-arm design in which everyone experiences the stressor. The simplistic approach of
-
Correction to: A transformation perspective on marginal and conditional models. Biostatistics (IF 2.1) Pub Date : 2023-08-02
-
Blurring cluster randomized trials and observational studies: Two-Stage TMLE for subsampling, missingness, and few independent units. Biostatistics (IF 2.1) Pub Date : 2023-08-02 Joshua R Nugent, Carina Marquez, Edwin D Charlebois, Rachel Abbott, Laura B Balzer
Cluster randomized trials (CRTs) often enroll large numbers of participants; yet due to resource constraints, only a subset of participants may be selected for outcome assessment, and those sampled may not be representative of all cluster members. Missing data also present a challenge: if sampled individuals with measured outcomes are dissimilar from those with missing outcomes, unadjusted estimates
-
Fast and flexible inference for joint models of multivariate longitudinal and survival data using integrated nested Laplace approximations Biostatistics (IF 2.1) Pub Date : 2023-08-02 Denis Rustand, Janet van Niekerk, Elias Teixeira Krainski, Håvard Rue, Cécile Proust-Lima
Modeling longitudinal and survival data jointly offers many advantages such as addressing measurement error and missing data in the longitudinal processes, understanding and quantifying the association between the longitudinal markers and the survival events, and predicting the risk of events based on the longitudinal markers. A joint model involves multiple submodels (one for each longitudinal/survival
-
An integrative latent class model of heterogeneous data modalities for diagnosing kidney obstruction Biostatistics (IF 2.1) Pub Date : 2023-07-26 Jeong Hoon Jang, Changgee Chang, Amita K Manatunga, Andrew T Taylor, Qi Long
Summary Radionuclide imaging plays a critical role in the diagnosis and management of kidney obstruction. However, most practicing radiologists in US hospitals have insufficient time and resources to acquire training and experience needed to interpret radionuclide images, leading to increased diagnostic errors. To tackle this problem, Emory University embarked on a study that aims to develop a computer-assisted
-
A scalable approach for continuous time Markov models with covariates Biostatistics (IF 2.1) Pub Date : 2023-07-12 Farhad Hatami, Alex Ocampo, Gordon Graham, Thomas E Nichols, Habib Ganjgahi
Existing methods for fitting continuous time Markov models (CTMM) in the presence of covariates suffer from scalability issues due to high computational cost of matrix exponentials calculated for each observation. In this article, we propose an optimization technique for CTMM which uses a stochastic gradient descent algorithm combined with differentiation of the matrix exponential using a Padé approximation
-
Multivariate spatiotemporal functional principal component analysis for modeling hospitalization and mortality rates in the dialysis population Biostatistics (IF 2.1) Pub Date : 2023-06-20 Qi Qian, Danh V Nguyen, Donatello Telesca, Esra Kurum, Connie M Rhee, Sudipto Banerjee, Yihao Li, Damla Senturk
Summary Dialysis patients experience frequent hospitalizations and a higher mortality rate compared to other Medicare populations, in whom hospitalizations are a major contributor to morbidity, mortality, and healthcare costs. Patients also typically remain on dialysis for the duration of their lives or until kidney transplantation. Hence, there is growing interest in studying the spatiotemporal trends
-
Quantification and statistical modeling of droplet-based single-nucleus RNA-sequencing data Biostatistics (IF 2.1) Pub Date : 2023-05-31 Albert Kuo, Kasper D Hansen, Stephanie C Hicks
Summary In complex tissues containing cells that are difficult to dissociate, single-nucleus RNA-sequencing (snRNA-seq) has become the preferred experimental technology over single-cell RNA-sequencing (scRNA-seq) to measure gene expression. To accurately model these data in downstream analyses, previous work has shown that droplet-based scRNA-seq data are not zero-inflated, but whether droplet-based
-
Modeling biomarker variability in joint analysis of longitudinal and time-to-event data Biostatistics (IF 2.1) Pub Date : 2023-05-25 Chunyu Wang, Jiaming Shen, Christiana Charalambous, Jianxin Pan
Summary The role of visit-to-visit variability of a biomarker in predicting related disease has been recognized in medical science. Existing measures of biological variability are criticized for being entangled with random variability resulting from measurement error or being unreliable due to limited measurements per individual. In this article, we propose a new measure to quantify the biological variability
-
Multiple imputation of more than one environmental exposure with nondifferential measurement error Biostatistics (IF 2.1) Pub Date : 2023-05-25 Yuanzhi Yu, Roderick J Little, Matthew Perzanowski, Qixuan Chen
Summary Measurement error is common in environmental epidemiologic studies, but methods for correcting measurement error in regression models with multiple environmental exposures as covariates have not been well investigated. We consider a multiple imputation approach, combining external or internal calibration samples that contain information on both true and error-prone exposures with the main study
-
Historical controls in clinical trials: a note on linking Pocock's model with the robust mixture priors. Biostatistics (IF 2.1) Pub Date : 2023-04-14 Andrea Callegaro, Nicholas Galwey, Juan J Abellan
Several Bayesian methods have been proposed to borrow information dynamically from historical controls in clinical trials. In this note, we identify key features of the relationship between the first method proposed, the bias-variance method, which is strongly related to the commensurate prior approach, and a more recent and widely used approach called robust mixture priors (RMP). We focus on the two
-
Bayesian finite mixture of regression analysis for cancer based on histopathological imaging-environment interactions. Biostatistics (IF 2.1) Pub Date : 2023-04-14 Yunju Im, Yuan Huang, Aixin Tan, Shuangge Ma
Cancer is a heterogeneous disease. Finite mixture of regression (FMR), an important heterogeneity analysis technique when an outcome variable is present, has been extensively employed in cancer research, revealing important differences in the associations between a cancer outcome/phenotype and covariates. Cancer FMR analysis has been based on clinical, demographic, and omics variables. A relatively
-
Differential transcript usage analysis incorporating quantification uncertainty via compositional measurement error regression modeling Biostatistics (IF 2.1) Pub Date : 2023-04-11 Amber M Young, Scott Van Buren, Naim U Rashid
Summary Differential transcript usage (DTU) occurs when the relative expression of multiple transcripts arising from the same gene changes between different conditions. Existing approaches to detect DTU often rely on computational procedures that can have speed and scalability issues as the number of samples increases. Here we propose a new method, CompDTU, that uses compositional regression to model
-
Identifying covariate-related subnetworks for whole-brain connectome analysis Biostatistics (IF 2.1) Pub Date : 2023-04-10 Shuo Chen, Yuan Zhang, Qiong Wu, Chuan Bi, Peter Kochunov, L Elliot Hong
Summary Whole-brain connectome data characterize the connections among distributed neural populations as a set of edges in a large network, and neuroscience research aims to systematically investigate associations between brain connectome and clinical or experimental conditions as covariates. A covariate is often related to a number of edges connecting multiple brain areas in an organized structure
-
Systematically missing data in causally interpretable meta-analysis Biostatistics (IF 2.1) Pub Date : 2023-03-29 Jon A Steingrimsson, David H Barker, Ruofan Bie, Issa J Dahabreh
Summary Causally interpretable meta-analysis combines information from a collection of randomized controlled trials to estimate treatment effects in a target population in which experimentation may not be possible but from which covariate information can be obtained. In such analyses, a key practical challenge is the presence of systematically missing data when some trials have collected data on one
-
Cohort-based smoothing methods for age-specific contact rates Biostatistics (IF 2.1) Pub Date : 2023-03-21 Yannick Vandendijck, Oswaldo Gressani, Christel Faes, Carlo G Camarda, Niel Hens
Summary The use of social contact rates is widespread in infectious disease modeling since it has been shown that they are key driving forces of important epidemiological parameters. Quantification of contact patterns is crucial to parameterize dynamic transmission models and to provide insights on the (basic) reproduction number. Information on social interactions can be obtained from population-based
-
Multi-trait analysis of gene-by-environment interactions in large-scale genetic studies Biostatistics (IF 2.1) Pub Date : 2023-03-10 Lan Luo, Devan V Mehrotra, Judong Shen, Zheng-Zheng Tang
Summary Identifying genotype-by-environment interaction (GEI) is challenging because the GEI analysis generally has low power. Large-scale consortium-based studies are ultimately needed to achieve adequate power for identifying GEI. We introduce Multi-Trait Analysis of Gene–Environment Interactions (MTAGEI), a powerful, robust, and computationally efficient framework to test gene–environment interactions
-
A Bayesian approach to estimating COVID-19 incidence and infection fatality rates Biostatistics (IF 2.1) Pub Date : 2023-03-07 Justin J Slater, Aiyush Bansal, Harlan Campbell, Jeffrey S Rosenthal, Paul Gustafson, Patrick E Brown
Summary Naive estimates of incidence and infection fatality rates (IFR) of coronavirus disease 2019 suffer from a variety of biases, many of which relate to preferential testing. This has motivated epidemiologists from around the globe to conduct serosurveys that measure the immunity of individuals by testing for the presence of SARS-CoV-2 antibodies in the blood. These quantitative measures (titer
-
Assessing the causal effects of a stochastic intervention in time series data: are heat alerts effective in preventing deaths and hospitalizations? Biostatistics (IF 2.1) Pub Date : 2023-02-23 Xiao Wu, Kate R Weinberger, Gregory A Wellenius, Francesca Dominici, Danielle Braun
Summary The methodological development of this article is motivated by the need to address the following scientific question: does the issuance of heat alerts prevent adverse health effects? Our goal is to address this question within a causal inference framework in the context of time series data. A key challenge is that causal inference methods require the overlap assumption to hold: each unit (i
-
Estimating the overall fraction of phenotypic variance attributed to high-dimensional predictors measured with error Biostatistics (IF 2.1) Pub Date : 2023-02-17 Soutrik Mandal, Do Hyun Kim, Xing Hua, Shilan Li, Jianxin Shi
Summary In prospective genomic studies (e.g., DNA methylation, metagenomics, and transcriptomics), it is crucial to estimate the overall fraction of phenotypic variance (OFPV) attributed to the high-dimensional genomic variables, a concept similar to heritability analyses in genome-wide association studies (GWAS). Unlike genetic variants in GWAS, these genomic variables are typically measured with
-
DeLIVR: a deep learning approach to IV regression for testing nonlinear causal effects in transcriptome-wide association studies Biostatistics (IF 2.1) Pub Date : 2023-01-07 Ruoyu He, Mingyang Liu, Zhaotong Lin, Zhong Zhuang, Xiaotong Shen, Wei Pan
Summary Transcriptome-wide association studies (TWAS) have been increasingly applied to identify (putative) causal genes for complex traits and diseases. TWAS can be regarded as a two-sample two-stage least squares method for instrumental variable (IV) regression for causal inference. The standard TWAS (called TWAS-L) only considers a linear relationship between a gene’s expression and a trait in stage
-
Bayesian semiparametric Markov renewal mixed models for vocalization syntax. Biostatistics (IF 2.1) Pub Date : 2022-12-30 Yutong Wu, Erich D Jarvis, Abhra Sarkar
Speech and language play an important role in human vocal communication. Studies have shown that vocal disorders can result from genetic factors. In the absence of high-quality data on humans, mouse vocalization experiments in laboratory settings have proven useful in providing valuable insights into mammalian vocal development, especially the impact of certain genetic mutations. Such
-
Flexible evaluation of surrogacy in platform studies Biostatistics (IF 2.1) Pub Date : 2022-12-29 Michael C Sachs, Erin E Gabriel, Alessio Crippa, Michael J Daniels
Summary Trial-level surrogates are useful tools for improving the speed and cost effectiveness of trials but surrogates that have not been properly evaluated can cause misleading results. The evaluation procedure is often contextual and depends on the type of trial setting. There have been many proposed methods for trial-level surrogate evaluation, but none, to our knowledge, for the specific setting
-
An imputation approach for a time-to-event analysis subject to missing outcomes due to noncoverage in disease registries Biostatistics (IF 2.1) Pub Date : 2022-12-19 Joanna H Shih, Paul S Albert, Jason Fine, Danping Liu
Summary Disease incidence data in a national-based cohort study would ideally be obtained through a national disease registry. Unfortunately, no such registry currently exists in the United States. Instead, the results from individual state registries need to be combined to ascertain certain disease diagnoses in the United States. The National Cancer Institute has initiated a program to assemble all
-
A transformation perspective on marginal and conditional models Biostatistics (IF 2.1) Pub Date : 2022-12-19 Luisa Barbanti, Torsten Hothorn
Summary Clustered observations are ubiquitous in controlled and observational studies and arise naturally in multicenter trials or longitudinal surveys. We present a novel model for the analysis of clustered observations where the marginal distributions are described by a linear transformation model and the correlations by a joint multivariate normal distribution. The joint model provides an analytic
-
Inference after latent variable estimation for single-cell RNA sequencing data Biostatistics (IF 2.1) Pub Date : 2022-12-13 Anna Neufeld, Lucy L Gao, Joshua Popp, Alexis Battle, Daniela Witten
Summary In the analysis of single-cell RNA sequencing data, researchers often characterize the variation between cells by estimating a latent variable, such as cell type or pseudotime, representing some aspect of the cell’s state. They then test each gene for association with the estimated latent variable. If the same data are used for both of these steps, then standard methods for computing p-values
-
Shifting-corrected regularized regression for 1H NMR metabolomics identification and quantification. Biostatistics (IF 2.1) Pub Date : 2022-12-12 Thao Vu, Yuhang Xu, Yumou Qiu, Robert Powers
The process of identifying and quantifying metabolites in complex mixtures plays a critical role in metabolomics studies to obtain an informative interpretation of underlying biological processes. Manual approaches are time-consuming and heavily reliant on the knowledge and assessment of nuclear magnetic resonance (NMR) experts. We propose a shifting-corrected regularized regression method, which identifies
-
Spatiotemporal varying coefficient model for respiratory disease mapping in Taiwan Biostatistics (IF 2.1) Pub Date : 2022-12-09 Feifei Wang, Congyuan Duan, Yang Li, Hui Huang, Ben-Chang Shia
Summary Respiratory diseases have been global public health problems for a long time. In recent years, air pollutants as important risk factors have drawn lots of attention. In this study, we investigate the influence of PM$_{2.5}$ (particulate matter with a diameter of less than 2.5 $\mu$m) on hospital visit rates for respiratory diseases in Taiwan. To reveal the spatiotemporal pattern of data, we
-
Longitudinal regression of covariance matrix outcomes. Biostatistics (IF 2.1) Pub Date : 2022-12-01 Yi Zhao, Brian S Caffo, Xi Luo
In this study, a longitudinal regression model for covariance matrix outcomes is introduced. The proposal considers a multilevel generalized linear model for regressing covariance matrices on (time-varying) predictors. This model simultaneously identifies covariate-associated components from covariance matrices, estimates regression coefficients, and captures the within-subject variation in the covariance
-
Time-to-event surrogate endpoint validation using mediation analysis and meta-analytic data Biostatistics (IF 2.1) Pub Date : 2022-11-18 Quentin Le Coënt, Catherine Legrand, Virginie Rondeau
Summary With the ongoing development of treatments and the resulting increase in survival in oncology, clinical trials based on endpoints such as overall survival may require long follow-up periods to observe sufficient events and ensure adequate statistical power. This increase in follow-up time may compromise the feasibility of the study. The use of surrogate endpoints instead of final endpoints
-
Joint modeling of longitudinal and competing-risk data using cumulative incidence functions for the failure submodels accounting for potential failure cause misclassification through double sampling. Biostatistics (IF 2.1) Pub Date : 2022-11-04 Christos Thomadakis, Loukia Meligkotsidou, Constantin T Yiannoutsos, Giota Touloumi
Most of the literature on joint modeling of longitudinal and competing-risk data is based on cause-specific hazards, although modeling of the cumulative incidence function (CIF) is an easier and more direct approach to evaluate the prognosis of an event. We propose a flexible class of shared parameter models to jointly model a normally distributed marker over time and multiple causes of failure using
-
An online framework for survival analysis: reframing Cox proportional hazards model for large data sets and neural networks. Biostatistics (IF 2.1) Pub Date : 2022-10-26 Aliasghar Tarkhan, Noah Simon
In many biomedical applications, outcome is measured as a "time-to-event" (e.g., disease progression or death). To assess the connection between features of a patient and this outcome, it is common to assume a proportional hazards model and fit a proportional hazards regression (or Cox regression). To fit this model, a log-concave objective function known as the "partial likelihood" is maximized. For
-
Multilayer Exponential Family Factor models for integrative analysis and learning disease progression. Biostatistics (IF 2.1) Pub Date : 2022-09-19 Qinxia Wang, Yuanjia Wang
Current diagnosis of neurological disorders often relies on late-stage clinical symptoms, which poses barriers to developing effective interventions at the premanifest stage. Recent research suggests that biomarkers and subtle changes in clinical markers may occur in a time-ordered fashion and can be used as indicators of early disease. In this article, we tackle the challenges to leverage multidomain
-
Multiple exposure distributed lag models with variable selection Biostatistics (IF 2.1) Pub Date : 2022-09-08 Joseph Antonelli, Ander Wilson, Brent A Coull
Summary Distributed lag models are useful in environmental epidemiology as they allow the user to investigate critical windows of exposure, defined as the time periods during which exposure to a pollutant adversely affects health outcomes. Recent studies have focused on estimating the health effects of a large number of environmental exposures, or an environmental mixture, on health outcomes. In such
-
A scalable and unbiased discordance metric with H+ Biostatistics (IF 2.1) Pub Date : 2022-09-05 Nathan Dyjack, Daniel N Baker, Vladimir Braverman, Ben Langmead, Stephanie C Hicks
Summary A standard unsupervised analysis is to cluster observations into discrete groups using a dissimilarity measure, such as Euclidean distance. If there does not exist a ground-truth label for each observation necessary for external validity metrics, then internal validity metrics, such as the tightness or separation of the clusters, are often used. However, the interpretation of these internal
-
Cross-direct effects in settings with two mediators. Biostatistics (IF 2.1) Pub Date : 2023-10-18 Erin E Gabriel, Arvid Sjölander, Dean Follmann, Michael C Sachs
When multiple mediators are present, there are additional effects that may be of interest beyond the well-known natural (NDE) and controlled direct effects (CDE). These effects cross the type of control on the mediators, setting one to a constant level and one to its natural level, which differs across subjects. We introduce five such estimands for the cross-CDE and -NDE when two mediators are measured
-
Differences in set-based tests for sparse alternatives when testing sets of outcomes compared to sets of explanatory factors in genetic association studies Biostatistics (IF 2.1) Pub Date : 2022-08-24 Ryan Sun, Andy Shi, Xihong Lin
Summary Set-based association tests are widely popular in genetic association settings for their ability to aggregate weak signals and reduce multiple testing burdens. In particular, a class of set-based tests including the Higher Criticism, Berk–Jones, and other statistics have recently been popularized for reaching a so-called detection boundary when signals are rare and weak. Such tests have been