A computationally efficient Bayesian seemingly unrelated regressions model for high‐dimensional quantitative trait loci discovery J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20210508
Leonardo Bottolo, Marco Banterle, Sylvia Richardson, Mika Ala‐Korpela, Marjo‐Riitta Järvelin, Alex Lewin
Our work is motivated by the search for metabolite quantitative trait loci (QTL) in a cohort of more than 5000 people. There are 158 metabolites measured by NMR spectroscopy in the 31‐year follow‐up of the Northern Finland Birth Cohort 1966 (NFBC66). These metabolites, as with many multivariate phenotypes produced by high‐throughput biomarker technology, exhibit strong correlation structures. Existing

A two‐field geostatistical model combining point and areal observations—A case study of annual runoff predictions in the Voss area J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20210507
Thea Roksvåg, Ingelin Steinsland, Kolbjørn Engeland
We estimate annual runoff by using a Bayesian geostatistical model for interpolation of hydrological data of different spatial support: streamflow observations from catchments (areal data), and precipitation and evaporation data (point data). The model contains one climatic spatial effect that is common for all years under study, and one year‐specific spatial effect. Hence, the framework enables a quantification

Censored regression for modelling small arms trade volumes and its ‘Forensic’ use for exploring unreported trades J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20210506
Michael Lebacher, Paul W. Thurner, Göran Kauermann
In this paper, we use a censored regression model to investigate data on the international trade of small arms and ammunition provided by the Norwegian Initiative on Small Arms Transfers. Taking a network‐based view on the transfers, we do not only rely on exogenous covariates but also estimate endogenous network effects. We apply a spatial autocorrelation gravity model with multiple weight matrices

Functional data analysis and visualisation of three‐dimensional surface shape J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20210506
Stanislav Katina, Liberty Vittert, Adrian W. Bowman
The advent of high‐resolution imaging has made data on surface shape widespread. Methods for the analysis of shape based on landmarks are well established but high‐resolution data require a functional approach. The starting point is a systematic and consistent description of each surface shape and a method for creating this is described. Three innovative forms of analysis are then introduced. The first

Clustering and automatic labelling within time series of categorical observations—with an application to marine log messages J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20210503
Emanuele Gramuglia, Geir Storvik, Morten Stakkeland
System logs or log files containing textual messages with associated time stamps are generated by many technologies and systems. The clustering technique proposed in this paper provides a tool to discover and identify patterns or macrolevel events in this data. The motivating application is logs generated by frequency converters in the propulsion system on a ship, while the general setting is fault

Bayesian criterion‐based variable selection J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20210427
Arnab Kumar Maity, Sanjib Basu, Santu Ghosh
Bayesian approaches for criterion based selection include the marginal likelihood based highest posterior model (HPM) and the deviance information criterion (DIC). The DIC is popular in practice as it can often be estimated from sampling‐based methods with relative ease and DIC is readily available in various Bayesian software. We find that sensitivity of DIC‐based selection can be high, in the range

Accelerating Bayesian estimation for network Poisson models using frequentist variational estimates J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20210416
Sophie Donnet, Stéphane Robin
This work is motivated by the analysis of ecological interaction networks. Poisson stochastic block models are widely used in this field to decipher the structure that underlies a weighted network, while accounting for covariate effects. Efficient algorithms based on variational approximations exist for frequentist inference, but without statistical guarantees as for the resulting estimates. In the

Likelihood‐free parameter estimation for dynamic queueing networks: Case study of passenger flow in an international airport terminal J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20210414
Anthony Ebert, Ritabrata Dutta, Kerrie Mengersen, Antonietta Mira, Fabrizio Ruggeri, Paul Wu
Dynamic queueing networks (DQN) model queueing systems where demand varies strongly with time, such as airport terminals. With rapidly rising global air passenger traffic placing increasing pressure on airport terminals, efficient allocation of resources is more important than ever. Parameter inference and quantification of uncertainty are key challenges for developing decision support tools. The DQN

Mapping malaria by sharing spatial information between incidence and prevalence data sets J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20210405
Tim C. D. Lucas, Anita K. Nandi, Elisabeth G. Chestnutt, Katherine A. Twohig, Suzanne H. Keddie, Emma L. Collins, Rosalind E. Howes, Michele Nguyen, Susan F. Rumisha, Andre Python, Rohan Arambepola, Amelia Bertozzi‐Villa, Penelope Hancock, Punam Amratia, Katherine E. Battle, Ewan Cameron, Peter W. Gething, Daniel J. Weiss
As malaria incidence decreases and more countries move towards elimination, maps of malaria risk in low‐prevalence areas are increasingly needed. For low‐burden areas, disaggregation regression models have been developed to estimate risk at high spatial resolution from routine surveillance reports aggregated by administrative unit polygons. However, in areas with both routine surveillance data and

Bayesian modelling for spatially misaligned health areal data: A multiple membership approach J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20210401
Marco Gramatica, Peter Congdon, Silvia Liverani
Diabetes prevalence is on the rise in the United Kingdom, and for public health strategy, estimation of relative disease risk and subsequent mapping is important. We consider an application to London data on diabetes prevalence and mortality. In order to improve the estimation of relative risks, we analyse jointly prevalence and mortality data to ensure borrowing strength over the two outcomes. The

Adjusting for population differences using machine learning methods J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20210401
Lauren Cappiello, Zhiwei Zhang, Changyu Shen, Neel M. Butala, Xinping Cui, Robert W. Yeh
The use of real‐world data for medical treatment evaluation frequently requires adjusting for population differences. We consider this problem in the context of estimating mean outcomes and treatment differences in a well‐defined target population, using clinical data from a study population that overlaps with but differs from the target population in terms of patient characteristics. The current literature

Phase I clinical trials in adoptive T‐cell therapies J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20210329
Sean M. Devlin, Alexia Iasonos, John O’Quigley
We develop three approaches to phase I dose finding designs for engineered T cells in oncology. Our goal is to address a very particular difficulty in this clinical setting: an inability to fully administer the dose allocated to some patients. Current designs can be biased as a result of this incomplete information being ignored or discarded from the analysis. The performance of the three proposed

Clustering based on Kolmogorov–Smirnov statistic with application to bank card transaction data J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20210325
Yingqiu Zhu, Qiong Deng, Danyang Huang, Bingyi Jing, Bo Zhang
Rapid developments in third‐party online payment platforms now make it possible to record massive bank card transaction data. Clustering on such transaction data is of great importance for the analysis of merchant behaviours. However, traditional methods based on generated features inevitably lead to much loss of information. To make better use of bank card transaction data, this study investigates

Estimation of the size of informal employment based on administrative records with non‐ignorable selection mechanism J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20210325
Maciej Berȩsewicz, Dagmara Nikulin
In this study, we used company level administrative data from the National Labour Inspectorate and The Polish Social Insurance Institution in order to estimate the prevalence of informal employment in Poland in 2016. Since the selection mechanism is non‐ignorable, we employed a generalization of Heckman’s sample selection model assuming non‐Gaussian correlation of errors and clustering by incorporation

Time matters: How default resolution times impact final loss rates J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20210312
Jennifer Betz, Ralf Kellner, Daniel Rösch
Using access to a unique bank loss database, we find positive dependencies of default resolution times (DRTs) of defaulted bank loan contracts and final loan loss rates (losses given default, LGDs). Due to this interconnection, LGD predictions made at the time of default and during resolution are subject to censoring. Pure (standard) LGD models are not able to capture effects of censoring. Accordingly

Recurrent events modelling of haemophilia bleeding events J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20210107
Andrew C. Titman, Martin J. Wolfsegger, Thomas F. Jaki
A pharmacokinetic–pharmacodynamic (PK‐PD) approach is developed for modelling the recurrent bleeding events in patients with severe haemophilia to investigate the relationship between factor VIII plasma activity level and the instantaneous risk of a bleed. The model incorporates patient‐level pharmacokinetic (PK) information obtained through measurements taken prior to the study which are used to fit

Multiscale null hypothesis testing for network‐valued data: Analysis of brain networks of patients with autism J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20210122
Ilenia Lovato, Alessia Pini, Aymeric Stamm, Maxime Taquet, Simone Vantini
Networks are a natural way of representing the human brain for studying its structure and function and, as such, have been extensively used. In this framework, case–control studies for understanding autism pertain to comparing samples of healthy and autistic brain networks. In order to understand the biological mechanisms involved in the pathology, it is key to localize the differences on the brain

Bayesian semi‐parametric G‐computation for causal inference in a cohort study with MNAR dropout and death J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20210106
Maria Josefsson, Michael J. Daniels
Causal inference with observational longitudinal data and time‐varying exposures is often complicated by time‐dependent confounding and attrition. The G‐computation formula is one approach for estimating a causal effect in this setting. The parametric modelling approach typically used in practice relies on strong modelling assumptions for valid inference and moreover depends on an assumption of missing

Inferring bivariate association from respondent‐driven sampling data J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20210121
Dongah Kim, Krista J. Gile, Honoria Guarino, Pedro Mateu‐Gelabert
Respondent‐driven sampling (RDS) is an effective method of collecting data from many hard‐to‐reach populations. Valid statistical inference for these data relies on many strong assumptions. In standard samples, we assume observations from pairs of individuals are independent. In RDS, this assumption is violated by the sampling dependence between individuals. We propose a method to semi‐parametrically

Long‐term trend analysis of extreme coastal sea levels with changepoint detection J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20210122
Mintaek Lee, Jaechoul Lee
Sea level rise can bring disastrous outcomes to people living in coastal regions by increasing flood risk or inducing stronger storm surges. We study long‐term linear trends in monthly maximum sea levels by applying extreme value methods. The monthly maximum sea levels are extracted from multiple tide gauges around the coastal regions of the world over a period of as long as 169 years. Due to instrument

A Bayesian nonparametric model for textural pattern heterogeneity J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20210212
Xiao Li, Michele Guindani, Chaan S. Ng, Brian P. Hobbs
Cancer radiomics is an emerging discipline promising to elucidate lesion phenotypes and tumour heterogeneity through patterns of enhancement, texture, morphology and shape. The prevailing technique for image texture analysis relies on the construction and synthesis of grey‐level co‐occurrence matrices (GLCM). Practice currently reduces the structured count data of a GLCM to reductive and redundant

Finding your feet: A Gaussian process model for estimating the abilities of batsmen in test cricket J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20210210
Oliver G. Stevenson, Brendon J. Brewer
In the sport of cricket, a player’s batting ability is traditionally measured using the batting average. However, the batting average fails to measure both short‐term changes in ability that occur during an innings and long‐term changes in ability that occur between innings due to factors such as age and experience in various match conditions. We derive and fit a Bayesian parametric model that employs

Optimal block designs for experiments on networks J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20210301
Vasiliki Koutra, Steven G. Gilmour, Ben M. Parker
We propose a method for constructing optimal block designs for experiments on networks. The response model for a given network interference structure extends the linear network effects model to incorporate blocks. The optimality criteria are chosen to reflect the experimental objectives and an exchange algorithm is used to search across the design space for obtaining an efficient design when an exhaustive

Assessing daily patterns using home activity sensors and within period changepoint detection J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20210224
Simon A. C. Taylor, Rebecca Killick, Jonathan Burr, Louise Rogerson
We consider the problem of ascertaining daily patterns using passive sensors to establish a baseline for elderly people living alone. The data are whether or not some movement, or human related activity, has occurred in the previous 15 min. We seek to segment the broad patterns within a day, for example, awake/sleep times or potentially more activity around meal‐times. To address this problem we use

A model‐free approach for testing association J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20210224
Saptarshi Chatterjee, Shrabanti Chowdhury, Sanjib Basu
The question of association between outcome and feature is generally framed in the context of a model based on functional and distributional forms. Our motivating application is that of identifying serum biomarkers of angiogenesis, energy metabolism, apoptosis and inflammation, predictive of recurrence after lung resection in node‐negative non‐small cell lung cancer patients with tumour stage T2a or

Bayesian hierarchical factor regression models to infer cause of death from verbal autopsy data J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20210223
Kelly R. Moran, Elizabeth L. Turner, David Dunson, Amy H. Herring
In low‐resource settings where vital registration of death is not routine it is often of critical interest to determine and study the cause of death (COD) for individuals and the cause‐specific mortality fraction (CSMF) for populations. Post‐mortem autopsies, considered the gold standard for COD assignment, are often difficult or impossible to implement due to deaths occurring outside the hospital

Bayesian varying coefficient model with selection: An application to functional mapping J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20201120
Benjamin Heuclin, Frédéric Mortier, Catherine Trottier, Marie Denis
How does the genetic architecture of quantitative traits evolve over time? Answering this question is crucial for many applied fields such as human genetics and plant or animal breeding. In the last decades, high‐throughput genome techniques have been used to better understand links between genetic information and quantitative traits. Recently, high‐throughput phenotyping methods are also being used

Stacked inverse probability of censoring weighted bagging: A case study in the InfCareHIV Register J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20201122
Pablo Gonzalez Ginestet, Ales Kotalik, David M. Vock, Julian Wolfson, Erin E. Gabriel
We propose an inverse probability of censoring weighted (IPCW) bagging (bootstrap aggregation) pre‐processing that enables the application of any machine learning procedure for classification to be used to predict the cause‐specific cumulative incidence, properly accounting for right‐censored observations and competing risks. We consider the IPCW area under the time‐dependent ROC curve (IPCW‐AUC) as

A non‐parametric Hawkes process model of primary and secondary accidents on a UK smart motorway J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20201111
Kieran Kalair, Colm Connaughton, Pierfrancesco Alaimo Di Loro
A self‐exciting spatiotemporal point process is fitted to incident data from the UK National Traffic Information Service to model the rates of primary and secondary accidents on the M25 motorway in a 12‐month period during 2017–2018. This process uses a background component to represent primary accidents, and a self‐exciting component to represent secondary accidents. The background consists of periodic

Quantifying the trendiness of trends J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20201120
Andreas Kryger Jensen, Claus Thorn Ekstrøm
News media often report that the trend of some public health outcome has changed. These statements are frequently based on longitudinal data, and the change in trend is typically found to have occurred at the most recent data collection time point—if no change had occurred the story is less likely to be reported. Such claims may potentially influence public health decisions on a national level.

M‐quantile regression for multivariate longitudinal data with an application to the Millennium Cohort Study J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20201125
Marco Alfò, Maria Francesca Marino, Maria Giovanna Ranalli, Nicola Salvati, Nikos Tzavidis
Motivated by the analysis of data from the UK Millennium Cohort Study on emotional and behavioural disorders, we develop an M‐quantile regression model for multivariate longitudinal responses. M‐quantile regression is an appealing alternative to standard regression models; it combines features of quantile and expectile regression and it may produce a detailed picture of the conditional response variable

Correcting misclassification errors in crowdsourced ecological data: A Bayesian perspective J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20201111
Edgar Santos‐Fernandez, Erin E. Peterson, Julie Vercelloni, Em Rushworth, Kerrie Mengersen
Many research domains use data elicited from ‘citizen scientists’ when a direct measure of a process is expensive or infeasible. However, participants may report incorrect estimates or classifications due to their lack of skill. We demonstrate how Bayesian hierarchical models can be used to learn about latent variables of interest, while accounting for the participants’ abilities. The model is described

A Bayesian approach for determining player abilities in football J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20201125
Gavin A. Whitaker, Ricardo Silva, Daniel Edwards, Ioannis Kosmidis
We consider the task of determining a football player’s ability for a given event type, for example, scoring a goal. We propose an interpretable Bayesian model which is fit using variational inference methods. We implement a Poisson model to capture occurrences of event types, from which we infer player abilities. Our approach also allows the visualisation of differences between players, for a specific

Sequential aggregation of probabilistic forecasts—Application to wind speed ensemble forecasts J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20201122
Michaël Zamo, Liliane Bel, Olivier Mestre
In numerical weather prediction (NWP), the uncertainty about the future state of the atmosphere is described by a set of forecasts (called an ensemble). All ensembles have deficiencies that can be corrected via statistical post‐processing methods. Several ensembles, based on different NWP models, exist and may be corrected using different statistical methods. These raw or post‐processed ensembles can

Future proofing a building design using history matching inspired level‐set techniques J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20201219
Evan Baker, Peter Challenor, Matt Eames
How can one design a building that will be sufficiently protected against overheating and sufficiently energy efficient, whilst considering the expected increases in temperature due to climate change? We successfully manage to address this question—greatly reducing a large set of initial candidate building designs down to a small set of acceptable buildings. We do this using a complex computer model

Robust estimation for small domains in business surveys J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20201211
Paul A. Smith, Chiara Bocci, Nikos Tzavidis, Sabine Krieg, Marc J. E. Smeets
Small area (or small domain) estimation is still rarely applied in business statistics, because of challenges arising from the skewness and variability of variables such as turnover. We examine a range of small area estimation methods as the basis for estimating the activity of industries within the retail sector in the Netherlands. We use tax register data and a sampling procedure which replicates

Finite mixtures of semiparametric Bayesian survival kernel machine regressions: Application to breast cancer gene pathway subgroup analysis J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20201201
Lin Zhang, Inyoung Kim
A gene pathway is defined as a set of genes that functionally work together to regulate a certain biological process. Gene pathway expression data, which is a special case of highly correlated high‐dimensional data, exhibits the ‘small n and large p’ problem. Pathway analysis can take into account the dependency structures among genes and the possibility that several moderately regulated genes may

Threshold‐based subgroup testing in logistic regression models in two‐phase sampling designs J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20201128
Ying Huang, Juhee Cho, Youyi Fong
The effect of treatment on binary disease outcome can differ across subgroups characterised by other covariates. Testing for the existence of subgroups that are associated with heterogeneous treatment effects can provide valuable insight regarding the optimal treatment recommendation in practice. Our research in this paper is motivated by the question of whether host genetics could modify a vaccine's

Quantile‐frequency analysis and spectral measures for diagnostic checks of time series with nonlinear dynamics J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20201122
Ta‐Hsin Li
Nonlinear dynamic volatility has been observed in many financial time series. The recently proposed quantile periodogram offers an alternative way to examine this phenomenon in the frequency domain. The quantile periodogram is constructed from trigonometric quantile regression of time series data at different frequencies and quantile levels, enabling the quantile‐frequency analysis (QFA) of nonlinear

Modelling time‐varying mobility flows using function‐on‐function regression: Analysis of a bike sharing system in the city of Milan J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20201108
Agostino Torti, Alessia Pini, Simone Vantini
In today's world, bike sharing systems are becoming increasingly common in all main cities around the world. To understand the spatiotemporal patterns of how people move by bike through the city of Milan, we apply functional data analysis to study the flows of a bike sharing mobility network. We introduce a complete pipeline to properly analyse and model functional data through a concurrent functional‐on‐functional

Functional ensemble survival tree: Dynamic prediction of Alzheimer’s disease progression accommodating multiple time‐varying covariates J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20201107
Shu Jiang, Yijun Xie, Graham A. Colditz
With the exponential growth in data collection, multiple time‐varying biomarkers are commonly encountered in clinical studies, along with a rich set of baseline covariates. This paper is motivated by addressing a critical issue in the field of Alzheimer’s disease (AD) in which we aim to predict the time for AD conversion in people with mild cognitive impairment to inform prevention and early treatment

Random effects dynamic panel models for unequally spaced multivariate categorical repeated measures: an application to child–parent exchanges of support J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20201107
Fiona Steele, Emily Grundy
Exchanges of practical or financial help between people living in different households are a major component of intergenerational exchanges within families and an increasingly important source of support for individuals in need. Using longitudinal data, bivariate dynamic panel models can be applied to study the effects of changes in individual circumstances on help given to and received from non‐coresident

Linear mixed effects models for non‐Gaussian continuous repeated measurement data J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20200909
Özgür Asar, David Bolin, Peter J. Diggle, Jonas Wallin
We consider the analysis of continuous repeated measurement outcomes that are collected longitudinally. A standard framework for analysing data of this kind is a linear Gaussian mixed effects model within which the outcome variable can be decomposed into fixed effects, time invariant and time‐varying random effects, and measurement noise. We develop methodology that, for the first time, allows any

Burglary in London: insights from statistical heterogeneous spatial point processes J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20200805
Jan Povala, Seppo Virtanen, Mark Girolami
To obtain operational insights regarding the crime of burglary in London, we consider the estimation of the effects of covariates on the intensity of spatial point patterns. Inspired by localized properties of criminal behaviour, we propose a spatial extension to mixtures of generalized linear models from the mixture modelling literature. The Bayesian model proposed is a finite mixture of Poisson generalized

Sensitivity analysis for publication bias in meta‐analyses J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20200828
Maya B. Mathur, Tyler J. VanderWeele
We propose sensitivity analyses for publication bias in meta‐analyses. We consider a publication process such that ‘statistically significant’ results are more likely to be published than negative or ‘non‐significant’ results by an unknown ratio, η. Our proposed methods also accommodate some plausible forms of selection based on a study's standard error. Using inverse probability weighting and robust

A hierarchical mixed effect hurdle model for spatiotemporal count data and its application to identifying factors impacting health professional shortages J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20200808
Soutik Ghosal, Timothy S. Lau, Jeremy Gaskins, Maiying Kong
Count data are common in many fields such as public health. Hurdle models have been developed to model count data when the zero count could be either inflated or deflated. However, when data are repeatedly collected over time and spatially correlated, it is very challenging to model the data appropriately. For example, to study health professional shortage areas, the number of primary care physicians

Landmark proportional subdistribution hazards models for dynamic prediction of cumulative incidence functions J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20200805
Qing Liu, Gong Tang, Joseph P. Costantino, Chung‐Chou H. Chang
An individualized dynamic risk prediction model that incorporates all available information collected over the follow‐up can be used to choose an optimal treatment strategy in realtime, although existing methods have not been designed to handle competing risks. In this study, we developed a landmark proportional subdistribution hazard (PSH) model and a comprehensive supermodel for dynamic risk prediction

Nested g‐computation: a causal approach to analysis of censored medical costs in the presence of time‐varying treatment J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20200825
Andrew J. Spieker, Emily M. Ko, Jason A. Roy, Nandita Mitra
Rising medical costs are an emerging challenge in policy decisions and resource allocation planning. When cumulative medical cost is the outcome, right censoring induces informative missingness due to heterogeneity in cost accumulation rates across subjects. Inverse weighting approaches have been developed to address the challenge of informative cost trajectories in mean cost estimation, though these

One‐class classification with application to forensic analysis J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20200826
Francesca Fortunato, Laura Anderlucci, Angela Montanari
The analysis of broken glass is forensically important to reconstruct the events of a criminal act. In particular, the comparison between the glass fragments found on a suspect (recovered cases) and those collected at the crime scene (control cases) may help the police to identify the offender(s) correctly. The forensic issue can be framed as a one‐class classification problem. One‐class classification

Adding measurement error to location data to protect subject confidentiality while allowing for consistent estimation of exposure effects J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20200815
Mahesh Karra, David Canning, Ryoko Sato
In public use data sets, it is desirable not to report a respondent's location precisely to protect subject confidentiality. However, the direct use of perturbed location data to construct explanatory exposure variables for regression models will generally make naive estimates of all parameters biased and inconsistent. We propose an approach where a perturbation vector, consisting of a random distance

Bayesian analysis of tests with unknown specificity and sensitivity J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20200813
Andrew Gelman, Bob Carpenter
When testing for a rare disease, prevalence estimates can be highly sensitive to uncertainty in the specificity and sensitivity of the test. Bayesian inference is a natural way to propagate these uncertainties, with hierarchical modelling capturing variation in these parameters across experiments. Another concern is the people in the sample not being representative of the general population. Statistical

A calibrated sensitivity analysis for matched observational studies with application to the effect of second‐hand smoke exposure on blood lead levels in children J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20200828
Bo Zhang, Dylan S. Small
We conducted a matched observational study to investigate the causal relationship between second‐hand smoke and blood lead levels in children. Our first analysis that assumes no unmeasured confounding suggests evidence of a detrimental effect of second‐hand smoke. However, unmeasured confounding is a concern in our study as in other observational studies of second‐hand smoke's effects. A sensitivity

A Bayesian quest for finding a unified model for predicting volleyball games J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20200901
Leonardo Egidi, Ioannis Ntzoufras
Volleyball is a team sport with unique and specific characteristics. We introduce a new two‐level hierarchical Bayesian model which accounts for these volleyball‐specific characteristics. In the first level, we model the set outcome with a simple logistic regression model. Conditionally on the winner of the set, in the second level, we use a truncated negative binomial distribution for the points earned

Markov switching modelling of shooting performance variability and teammate interactions in basketball J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20200827
Marco Sandri, Paola Zuccolotto, Marica Manisera
In basketball, measures of individual player performance provide critical guidance for a broad spectrum of decisions related to training and game strategy. However, most studies on this topic focus on performance level measurement, neglecting other important factors, such as performance variability. Here we model shooting performance variability by using Markov switching models, assuming the existence

Circular regression trees and forests with an application to probabilistic wind direction forecasting J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20200925
Moritz N. Lang, Lisa Schlosser, Torsten Hothorn, Georg J. Mayr, Reto Stauffer, Achim Zeileis
Although circular data occur in a wide range of scientific fields, the methodology for distributional modelling and probabilistic forecasting of circular response variables is quite limited. Most of the existing methods are built on generalized linear and additive models, which are often challenging to optimize and interpret. Specifically, capturing abrupt changes or interactions is not straightforward

Causal mechanism of extreme river discharges in the upper Danube basin network J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20200523
Linda Mhalla, Valérie Chavez‐Demoulin, Debbie J. Dupuis
Extreme hydrological events in the Danube river basin may severely impact human populations, aquatic organisms and economic activity. One often characterizes the joint structure of extreme events by using the theory of multivariate and spatial extremes and its asymptotically justified models. There is interest, however, in cascading extreme events and whether one event causes another. We argue that

Inference for extreme values under threshold‐based stopping rules J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20200612
Anna Maria Barlow, Chris Sherlock, Jonathan Tawn
There is a propensity for an extreme value analysis to be conducted as a consequence of a large flooding event. This timing of the analysis introduces bias and poor coverage probabilities into the associated risk assessments and leads subsequently to inefficient flood protection schemes. We explore these problems through studying stochastic stopping criteria and propose new likelihood‐based inferences

A hybrid approach for the stratified mark‐specific proportional hazards model with missing covariates and missing marks, with application to vaccine efficacy trials J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20200522
Yanqing Sun, Li Qi, Fei Heng, Peter B. Gilbert
Deployment of the recently licensed tetravalent dengue vaccine based on a chimeric yellow fever virus, CYD‐TDV, requires understanding of how the risk of dengue disease in vaccine recipients depends jointly on a host biomarker measured after vaccination (neutralization titre – neutralizing antibodies) and on a ‘mark’ feature of the dengue disease failure event (the amino acid sequence distance of the

Global household energy model: a multivariate hierarchical approach to estimating trends in the use of polluting and clean fuels for cooking J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20200707
Oliver Stoner, Gavin Shaddick, Theo Economou, Sophie Gumy, Jessica Lewis, Itzel Lucio, Giulia Ruggeri, Heather Adair‐Rohani
In 2017 an estimated 3 billion people used polluting fuels and technologies as their primary cooking solution, with 3.8 million deaths annually attributed to household exposure to the resulting fine particulate matter air pollution. Currently, health burdens are calculated by using aggregations of fuel types, e.g. solid fuels, as country level estimates of the use of specific fuel types, e.g. wood

Modelling fuel injector spray characteristics in jet engines by using vine copulas J. R. Stat. Soc. Ser. C Appl. Stat. (IF 1.59) Pub Date : 20200615
Maximilian Coblenz, Simon Holz, Hans‐Jörg Bauer, Oliver Grothe, Rainer Koch
The emission requirements for jet engines are becoming more stringent and the combustion process determines pollutant emissions. Therefore, we model the distribution of fuel drops generated by a fuel injector in a jet engine, which can be assumed to be a five‐dimensional problem in terms of drop size, x‐position, y‐position, x‐velocity and y‐velocity. The data are generated by numerical simulations