-
Federated Multiple Imputation for Variables that Are Missing Not At Random in Distributed Electronic Health Records medRxiv. Health Inform. Pub Date : 2024-09-16 Yi Lian, Xiaoqian Jiang, Qi Long
Large electronic health records (EHR) have been widely implemented and are available for research activities. The magnitude of such databases often requires storage and computing infrastructure that are distributed at different sites. Restrictions on data-sharing due to privacy concerns have been another driving force behind the development of a large class of distributed and/or federated machine learning
-
Characterizing the connection between Parkinson's disease progression and healthcare utilization medRxiv. Health Inform. Pub Date : 2024-09-16 Lane Fitzsimmons, Francesca Frau, Sylvie Bozzi, Karen Chandross, Brett Beaulieu-Jones
Background and Objectives: Parkinson's disease (PD) progression can be characterized in terms of healthcare utilization by analyzing clinical events across different stages of disease. Methods: PD progression was measured by the Hoehn & Yahr (H&Y) clinical rating scale and clinical events at each stage were evaluated. Natural language processing and a large language model were used to extract H&Y values
-
Predicting survival time for critically ill patients with heart failure using conformalized survival analysis medRxiv. Health Inform. Pub Date : 2024-09-16 Xiaomeng Wang, Zhimei Ren, Jiancheng Ye
Heart failure (HF) is a critical public health issue, particularly for critically ill patients in intensive care units (ICUs). Predicting survival outcome in critically ill patients is a difficult yet crucially important task for timely treatment. This study utilizes a novel approach, conformalized survival analysis (CSA), designed to construct lower bounds on the survival time in critically ill HF
-
Generative AI and Large Language Models in Reducing Medication Related Harm and Adverse Drug Events - A Scoping Review medRxiv. Health Inform. Pub Date : 2024-09-14 Jasmine Chiat Ling Ong, Michael Chen, Ning Ng, Kabilan Elangovan, Nichole Yue Ting Tan, Liyuan Jin, Qihuang Xie, Daniel Shu Wei Ting, Rosa Rodriguez-Monguio, David Bates, Nan Liu
Background: Medication-related harm has a significant impact on global healthcare costs and patient outcomes, accounting for deaths in 4.3 per 1000 patients. Generative artificial intelligence (GenAI) has emerged as a promising tool in mitigating risks of medication-related harm. In particular, large language models (LLMs) and well-developed generative adversarial networks (GANs) showing promise for
-
A case is not a case is not a case - challenges and solutions in determining urolithiasis caseloads using the digital infrastructure of a clinical data warehouse medRxiv. Health Inform. Pub Date : 2024-09-14 Martin Schoenthaler, Noah Hempen, Maria Weymann, Maximilian Ferry von Bargen, Maximilian Glienke, Antonia Elsaesser, Max Behrens, Harald Binder, Nadine Binder
Background: To provide more evidence in urolithiasis research, we have established the German Nationwide Register for RECurrent URolithiasis (RECUR) using local clinical data warehouses (CDWH). For RECUR and other registers relying on digitalized clinical data, it is crucial to ensure the data's reliability for answering scientific questions. In this work, we aim to compare the results of different
-
The Breast Cancer Genetic Testing Experience: Probing the Potential Utility of an Online Decision Aid in Risk Perception and Decision Making medRxiv. Health Inform. Pub Date : 2024-09-14 Anna Vaynrub, Brian Salazar, Yilin Eileen Feng, Harry West, Alissa Michel, Subiksha Umakanth, Katherine Crew, Rita Kukafka
Background: Despite the role of pathogenic variants (PVs) in cancer predisposition genes conferring significantly increased risk of breast cancer (BC), uptake of genetic testing (GT) remains low, especially among ethnic minorities. Our prior study identified that a patient decision aid, RealRisks, improved patient-reported outcomes relative to standard educational materials
-
Accuracy of Online Symptom-Assessment Applications, Large Language Models, and Laypeople for Self-Triage Decisions: A Systematic Review medRxiv. Health Inform. Pub Date : 2024-09-14 Marvin Kopka, Niklas von Kalckreuth, Markus A. Feufel
Symptom-Assessment Applications (SAAs, e.g., NHS 111 online) that assist medical laypeople in deciding if and where to seek care (self-triage) are gaining popularity, and their accuracy has been examined in numerous studies. With the public release of Large Language Models (LLMs, e.g., ChatGPT), their use in such decision-making processes is growing as well. However, there is currently no comprehensive
-
Optimizing Large Language Models for Discharge Prediction: Best Practices in Leveraging Electronic Health Record Audit Logs medRxiv. Health Inform. Pub Date : 2024-09-13 Xinmeng Zhang, Chao Yan, Yuyang Yang, Zhuohang Li, Yubo Feng, Bradley A. Malin, You Chen
Electronic Health Record (EHR) audit log data are increasingly utilized for clinical tasks, from workflow modeling to predictive analyses of discharge events, adverse kidney outcomes, and hospital readmissions. These data encapsulate user-EHR interactions, reflecting both healthcare professionals' behavior and patients' health statuses. To harness this temporal information effectively, this study explores
-
Technology-Supported Self-Triage Decision Making: A Mixed-Methods Study medRxiv. Health Inform. Pub Date : 2024-09-13 Marvin Kopka, Sonja Mei Wang, Samira Kunz, Christine Schmid, Markus A. Feufel
Symptom-Assessment Applications (SAAs) and Large Language Models (LLMs) are increasingly used by laypeople to navigate care options. Although humans ultimately make a final decision when using these systems, previous research has typically examined the performance of humans and SAAs/LLMs separately. Thus, it is unclear how decision-making unfolds in such hybrid human-technology teams and if SAAs/LLMs
-
Gut-Brain Nexus: Mapping Multi-Modal Links to Neurodegeneration at Biobank Scale medRxiv. Health Inform. Pub Date : 2024-09-13 Mohammad Shafieinouri, Samantha Hong, Artur Schuh, Mary B. Makarious, Rodrigo Sandon, Paul Suhwan Lee, Emily Simmonds, Hirotaka Iwaki, Gracelyn Hill, Cornelis Blauwendraat, Valentina Escott-Price, Yue A. Qi, Alastair J. Noyce, Armando Reyes-Palomares, Hampton Leonard, Malu Tansey, Andrew Singleton, Mike A. Nalls, Kristin S. Levine, Sara Bandres-Ciga
Alzheimer's disease (AD) and Parkinson's disease (PD) are influenced by genetic and environmental factors. Using data from UK Biobank, SAIL Biobank, and FinnGen, we conducted an unbiased, population-scale study to: 1) Investigate how 155 endocrine, nutritional, metabolic, and digestive system disorders are associated with AD and PD risk prior to their diagnosis, considering known genetic influences;
-
Forecasting local surges in COVID-19 hospitalizations through adaptive decision tree classifiers medRxiv. Health Inform. Pub Date : 2024-09-13 Rachel E Murray-Watson, Alyssa Bilinski, Reza Yaesoubi
During the COVID-19 pandemic, many communities across the US experienced surges in hospitalizations, which strained the local hospital capacity and affected the overall quality of care. Even when effective vaccines became available, many communities remained at high risk of surges in COVID-19-related hospitalizations due to waning immunity, low uptake of booster vaccinations, and the continual emergence
-
A Framework to Assess Clinical Safety and Hallucination Rates of LLMs for Medical Text Summarisation medRxiv. Health Inform. Pub Date : 2024-09-13 Elham Asgari, Nina Montana-Brown, Magda Dubois, Saleh Khalil, Jasmine Balloch, Dominic Pimenta
The integration of large language models (LLMs) into healthcare settings holds great promise for improving clinical workflow efficiency and enhancing patient care, with the potential to automate tasks such as text summarisation during consultations. The fidelity between LLM outputs and ground truth information is therefore paramount in healthcare, as errors in medical summary generation can lead to
-
Evidence-based XAI of clinical decision support systems for differential diagnosis: Design, implementation, and evaluation medRxiv. Health Inform. Pub Date : 2024-09-13 Yasuhiko Miyachi, Osamu Ishii, Keijiro Torigoe
Introduction: We propose an Explainable AI (XAI) model for Clinical Decision Support Systems (CDSSs). It supports physicians' Differential Diagnosis (DDx) with Evidence-based Medicine (EBM). It identifies instances of the case data contributing to predicted diseases. Each case data is linked to the sourced medical literature. Therefore, this model can provide medical professionals with evidence of
-
Enhancing Dietary Supplement Question Answer via Retrieval-Augmented Generation (RAG) with LLM medRxiv. Health Inform. Pub Date : 2024-09-12 Yu Hou, Rui Zhang
Objective: To enhance the accuracy and reliability of dietary supplement (DS) question answering by integrating a novel Retrieval-Augmented Generation (RAG) LLM system with an updated and integrated DS knowledge base and providing a user-friendly interface. Materials and Methods: We developed iDISK2.0 by integrating updated data from multiple trusted sources, including NMCD, MSKCC, DSLD, and
-
Performance evaluation of an under-mattress sleep sensor versus polysomnography in >400 nights with healthy and unhealthy sleep medRxiv. Health Inform. Pub Date : 2024-09-11 Jack Manners, Eva Kemps, Bastien Lechat, Peter Catcheside, Danny Eckert, Hannah Scott
Consumer sleep trackers provide useful insight into sleep. However, large scale performance evaluation studies are needed to properly understand sleep tracker accuracy. This study evaluated performance of an under-mattress sensor to estimate sleep and wake versus polysomnography in a large sample, including individuals with and without sleep disorders and during day versus night sleep opportunities
-
Universal coordinate on wave-shape manifold of cardiovascular waveform signal for dynamic quantification and cross-subject comparison medRxiv. Health Inform. Pub Date : 2024-09-11 Hau-Tieng Wu, Ruey-Hsing Chou, Shen-Chih Wang, Cheng-Hsi Chang, Yu-Ting Lin
Objective: Quantifying physiological dynamics from nonstationary time series for clinical decision-making is challenging, especially when comparing data across different subjects. We propose a solution and validate it using two real-world surgical databases, focusing on underutilized arterial blood pressure (ABP) signals. Method: We apply a manifold learning algorithm, Dynamic Diffusion Maps (DDMap)
-
ROBI: a Robust and Optimized Biomarker Identifier to increase the likelihood of discovering relevant radiomic features. medRxiv. Health Inform. Pub Date : 2024-09-10 Louis Rebaud, Nicolo Capobianco, Clementine Sarkozy, Anne-Segolene Cottereau, Laetitia Vercellino, Olivier Casasnovas, Catherine Thieblemont, Bruce Spottiswoode, Irene Buvat
Objectives: The Robust and Optimized Biomarker Identifier (ROBI) feature selection pipeline is introduced to improve the identification of informative biomarkers coding information not already captured by existing features. It aims to accurately maximize the number of discoveries while minimizing and estimating the number of false positives (FP) with an adjustable selection stringency. Methods: 500
-
Benchmarking the Confidence of Large Language Models in Clinical Questions medRxiv. Health Inform. Pub Date : 2024-09-10 Mahmud Omar, Reem Agbareia, Benjamin S Glicksberg, Girish Nadkarni, Eyal Klang
Background and Aim: Large language models (LLMs) show promise in healthcare, but their self-assessment capabilities remain unclear. This study evaluates the confidence levels and performance of 12 LLMs across five medical specialties to assess their ability to accurately judge their responses. Methods: We used 1965 multiple-choice questions from internal medicine, obstetrics and gynecology, psychiatry
-
Knowledge mobilization with and for equity-deserving communities invested in research: A scoping review protocol medRxiv. Health Inform. Pub Date : 2024-09-07 Ramy Barhouche, Samson Tse, Fiona Inglis, Debbie Chaves, Erin Allison, Tina Colaco, Melody E. Morton Ninomiya
The practice of putting research into action is known by various names, depending on disciplinary norms. Knowledge mobilization, translation, and transfer (collectively referred to as K*) are three common terminologies used in research literature. Knowledge-to-action opportunities and gaps in academic research often remain obscure to non-academic researchers in communities, policy and decision makers
-
Dynamic Graph Transformer for Brain Disorder Diagnosis medRxiv. Health Inform. Pub Date : 2024-09-06 Ahsan Shehzad, Dongyu Zhang, Shuo Yu, Shagufta Abid, Feng Xia
Dynamic brain networks are crucial for diagnosing brain disorders, as they reveal changes in brain activity and connectivity over time. Previous methods exploit the sliding window approach on fMRI data to construct these networks. However, this approach encounters two major issues: fixed temporal length, which inadequately captures the temporal dynamics of brain activity, and global spatial scope,
-
Forecasting High-Risk Behavioral and Medical Events in Children with Autism Using Digital Behavioral Records medRxiv. Health Inform. Pub Date : 2024-09-06 Yashar Kiarashi, Johanna Lantz, Matthew A Reyna, Conor Anderson, Ali Bahrami Rad, Jenny Foster, Tania Villavicencio, Theresa Hamlin, Gari D Clifford
Individuals with Autism Spectrum Disorder may display interfering behaviors that limit their inclusion in educational and community settings, negatively impacting their quality of life. These behaviors may also signal potential medical conditions or indicate upcoming high-risk behaviors. This study explores behavior patterns that precede high-risk, challenging behaviors or seizures the following day
-
A Novel Digital Twin Strategy to Examine the Implications of Randomized Clinical Trials for Real-World Populations medRxiv. Health Inform. Pub Date : 2024-09-06 Phyllis M. Thangaraj, Sumukh Vasisht Shankar, Sicong Huang, Girish N. Nadkarni, Bobak J. Mortazavi, Evangelos K. Oikonomou, Rohan Khera
Randomized clinical trials (RCTs) are essential to guide medical practice; however, their generalizability to a given population is often uncertain. We developed a statistically informed Generative Adversarial Network (GAN) model, RCT-Twin-GAN, that leverages relationships between covariates and outcomes and generates a digital twin of an RCT (RCT-Twin) conditioned on covariate distributions from a
-
Digital risk score sensitively identifies presence of α-synuclein aggregation or dopaminergic deficit medRxiv. Health Inform. Pub Date : 2024-09-06 Ann-Kathrin Schalkamp, Kathryn J Peall, Neil A Harrison, Valentina Escott-Price, Payam Barnaghi, Cynthia Sandor
Background: Use of digital sensors to passively collect long-term data offers a step change in our ability to screen for early signs of disease in the general population. Smartwatch data has been shown to identify Parkinson’s disease (PD) several years before the clinical diagnosis; however, it has not been evaluated in comparison to biological and pathological markers such as dopaminergic imaging (DaTscan)
-
Development and initial evaluation of a conversational agent for Alzheimer’s disease medRxiv. Health Inform. Pub Date : 2024-09-06 Natalia Castano-Villegas, Isabella Llano, Maria Camila Villa, Julian Martinez, Jose Zea, Tatiana Urrea, Alejandra Maria Bañol, Carlos Bohorquez, Nelson Martinez
Background: Conversational Agents have attracted attention for personal and professional use. Their specialisation in the medical field is being explored. Conversational Agents (CA) have accomplished passing-level performance in medical school examinations and shown empathy when responding to patient questions. Alzheimer’s disease is characterized by the progression of cognitive and somatic decline
-
CODE - XAI: Construing and Deciphering Treatment Effects via Explainable AI using Real-world Data medRxiv. Health Inform. Pub Date : 2024-09-06 Mingyu Lu, Ian Covert, Nathan J. White, Su-In Lee
Determining which features drive the treatment effect for individual patients has long been a complex and critical question in clinical decision-making. Evidence from randomized controlled trials (RCTs) is the gold standard for guiding treatment decisions. However, individual patient differences often complicate the application of RCT findings, leading to imperfect treatment options. Traditional subgroup
-
Analyzing Diversity in Healthcare LLM Research: A Scientometric Perspective medRxiv. Health Inform. Pub Date : 2024-09-05 David Restrepo, Chenwei Wu, Constanza Vásquez-Venegas, João Matos, Jack Gallifant, Leo Anthony Celi, Danielle S. Bitterman, Luis Filipe Nakayama
The deployment of large language models (LLMs) in healthcare has demonstrated substantial potential for enhancing clinical decision-making, administrative efficiency, and patient outcomes. However, the underrepresentation of diverse groups in the development and application of these models can perpetuate biases, leading to inequitable healthcare delivery. This paper presents a comprehensive scientometric
-
A Common Longitudinal Intensive Care Unit data Format (CLIF) to enable multi-institutional federated critical illness research medRxiv. Health Inform. Pub Date : 2024-09-04 Juan C. Rojas, Patrick G. Lyons, Kaveri Chhikara, Vaishvik Chaudhari, Sivasubramanium V. Bhavani, Muna Nour, Kevin G. Buell, Kevin D. Smith, Catherine A. Gao, Saki Amagai, Chengsheng Mao, Yuan Luo, Anna K Barker, Mark Nuppnau, Haley Beck, Rachel Baccile, Michael Hermsen, Zewei Liao, Brenna Park-Egan, Kyle A Carey, Xuan Han, Chad H Hochberg, Nicholas E Ingraham, William F Parker
Background: Critical illness, or acute organ failure requiring life support, threatens over five million American lives annually. Electronic health record (EHR) data are a source of granular information that could generate crucial insights into the nature and optimal treatment of critical illness. However, data management, security, and standardization are barriers to large-scale critical illness EHR
-
Designing a computer-assisted diagnosis system for cardiomegaly detection and radiology report generation medRxiv. Health Inform. Pub Date : 2024-09-04 Tianhao Zhu, Kexin Xu, Wonchan Son, Kristofer Linton-Reid, Marc Boubnovski-Martell, Matt Grech-Sollars, Antoine D. Lain, Joram M. Posma
Chest X-ray (CXR) is a conventional diagnostic tool for cardiothoracic assessment, boasting a high degree of cost-effectiveness and versatility. However, with an increasing number of scans to be evaluated by radiologists, they can suffer from fatigue, which might impede diagnostic accuracy and slow down report generation. We describe a prototype computer-assisted diagnosis (CAD) pipeline employing computer
-
Similar performance of 8 machine learning models on 71 censored medical datasets: a case for simplicity medRxiv. Health Inform. Pub Date : 2024-09-04 Louis Rebaud, Nicolò Capobianco, Nicolas Captier, Thibault Escobar, Bruce Spottiswoode, Irène Buvat
In the analysis of medical data with censored outcomes, identifying the optimal machine learning pipeline is a challenging task, often requiring extensive preprocessing, feature selection, model testing, and tuning. To investigate the impact of the choice of pipeline on prediction performance, we evaluated 9 machine learning models on 71 medical datasets with censored targets. Only the decision tree
-
Visual-Textual Integration in LLMs for Medical Diagnosis: A Quantitative Analysis medRxiv. Health Inform. Pub Date : 2024-09-03 Reem Agbareia, Mahmud Omar, Shelly Soffer, Benjamin S Glicksberg, Girish N Nadkarni, Eyal Klang
Background and Aim: Visual data from images is essential for many medical diagnoses. This study evaluates the performance of multimodal Large Language Models (LLMs) in integrating textual and visual information for diagnostic purposes.
-
Understanding technology-related prescribing errors for system optimisation: the Technology-Related Error Mechanism (TREM) classification medRxiv. Health Inform. Pub Date : 2024-09-03 Magdalena Z. Raban, Alison Merchant, Erin Fitzpatrick, Melissa T. Baysari, Ling Li, Peter J. Gates, Johanna I. Westbrook
Objectives: Technology-related prescribing errors curtail the positive impacts of computerised provider order entry (CPOE) on medication safety. Understanding how technology-related errors occur can inform CPOE optimisation. Previously, we developed a classification of the underlying mechanisms of technology-related errors using prescribing error data from two adult hospitals. Our objective was to update
-
Dengue nowcasting in Brazil by combining official surveillance data and Google Trends information medRxiv. Health Inform. Pub Date : 2024-09-03 Yang Xiao, Guilherme Soares, Leonardo Bastos, Rafael Izbicki, Paula Moraga
Dengue is a mosquito-borne viral disease that poses significant public health challenges in tropical and sub-tropical regions worldwide. Surveillance systems are essential for dengue prevention and control. However, traditional systems often rely on delayed data, limiting their effectiveness. To address this, nowcasting methods are needed to estimate underreported cases, enabling more timely decision-making
-
LLM-AIx: An open source pipeline for Information Extraction from unstructured medical text based on privacy preserving Large Language Models medRxiv. Health Inform. Pub Date : 2024-09-03 Isabella Catharina Wiest, Fabian Wolf, Marie-Elisabeth Leßmann, Marko van Treeck, Dyke Ferber, Jiefu Zhu, Heiko Boehme, Keno K. Bressem, Hannes Ulrich, Matthias P. Ebert, Jakob Nikolas Kather
In clinical science and practice, text data, such as clinical letters or procedure reports, is stored in an unstructured way. This type of data is not a quantifiable resource for any kind of quantitative investigations and any manual review or structured information retrieval is time-consuming and costly. The capabilities of Large Language Models (LLMs) mark a paradigm shift in natural language processing
-
Multinational attitudes towards AI in healthcare and diagnostics among hospital patients medRxiv. Health Inform. Pub Date : 2024-09-02 Felix Busch, Lena Hoffmann, Lina Xu, Longjiang Zhang, Bin Hu, Ignacio García-Juárez, Liz N Toapanta-Yanchapaxi, Natalia Gorelik, Valérie Gorelik, Gaston A Rodriguez-Granillo, Carlos Ferrarotti, Nguyen N Cuong, Chau AP Thi, Murat Tuncel, Gürsan Kaya, Sergio M Solis-Barquero, Maria C Mendez Avila, Nevena G Ivanova, Felipe C Kitamura, Karina YI Hayama, Monserrat L Puntunet Bates, Pedro Iturralde Torres
The successful implementation of artificial intelligence (AI) in healthcare is dependent upon the acceptance of this technology by key stakeholders, particularly patients, who are the primary beneficiaries of AI-driven outcomes. This international, multicenter, cross-sectional study assessed the attitudes of hospital patients towards AI in healthcare across 43 countries. A total of 13806 patients at
-
The Use of Conversational Agents in Self-Management: A Retrospective Analysis medRxiv. Health Inform. Pub Date : 2024-09-02 Selahattin Colakoglu, Mustafa Durmus, Zeynep Pelin Polat, Asli Yildiz, Emre Sezgin
Background: Understanding user engagement with conversational agents (CAs) in mobile health apps is crucial for improving sustained usage. We analyzed CA interactions in a mobile health app to identify usage patterns and potential barriers.
-
GenAI Exceeds Clinical Experts in Predicting Acute Kidney Injury following Paediatric Cardiopulmonary Bypass medRxiv. Health Inform. Pub Date : 2024-09-02 Mansour Sharabiani, Alireza Mahani, Alex Bottle, Yadav Srinivasan, Richard Issitt, Serban Stoica
The emergence of large language models (LLMs) offers new opportunities to leverage often-unused information in clinical text. This study examines the utility of text embeddings generated by LLMs in predicting postoperative acute kidney injury (AKI) in paediatric cardiopulmonary bypass (CPB) patients using electronic health record (EHR) text, and explores methods for explaining their output. AKI
-
What Constitutes High Risk for Venous Thromboembolism? Comparing Approaches to Determining an Appropriate Threshold medRxiv. Health Inform. Pub Date : 2024-09-01 Benjamin G Mittman, Bo Hu, Rebecca Schulte, Phuc Le, Matthew A Pappas, Aaron Hamilton, Michael B Rothberg
Background: Guidelines recommend pharmacological venous thromboembolism (VTE) prophylaxis only for high-risk patients, but the probability of VTE considered “high-risk” is not specified. Our objective was to define an appropriate probability threshold (or range) for VTE risk stratification and corresponding prophylaxis in medical inpatients.
-
CarD-T: Interpreting Carcinomic Lexicon via Transformers medRxiv. Health Inform. Pub Date : 2024-08-31 Jamey O’Neill, Gudur Ashrith Reddy, Nermeeta Dhillon, Osika Tripathi, Ludmil Alexandrov, Parag Katira
The identification and classification of carcinogens is critical in cancer epidemiology, necessitating updated methodologies to manage the burgeoning biomedical literature. Current systems, like those run by the International Agency for Research on Cancer (IARC) and the National Toxicology Program (NTP), face challenges due to manual vetting and disparities in carcinogen classification spurred by the
-
Open Access Data Repository and Common Data Model for Pulse Oximeter Performance Data medRxiv. Health Inform. Pub Date : 2024-08-31 Nicholas Fong, Michael S. Lipnick, Ella Behnke, Yu Chou, Seif Elmankabadi, Lily Ortiz, Christopher S. Almond, Isabella Auchus, Garrett W. Burnett, Ronald Bisegerwa, Desireé R Conrad, Carolyn M. Hendrickson, Shubhada Hooli, Robert Kopotic, Gregory Leeb, Daniel Martin, Eric D. McCollum, Ellis P. Monk, Kelvin L. Moore, Leonid Shmuylovich, J. Brady Scott, An-Kwok Ian Wong, Tianyue Zhou, Romain Pirracchio
The OpenOximetry Repository is a structured database storing clinical and lab pulse oximetry data, serving as a centralized repository and data model for pulse oximetry initiatives. It supports measurements of arterial oxygen saturation (SaO2) by arterial blood gas co-oximetry and pulse oximetry (SpO2), alongside processed and unprocessed photoplethysmography (PPG) data and other metadata. This includes
-
Ubie Symptom Checker: A Clinical Vignette Simulation Study medRxiv. Health Inform. Pub Date : 2024-08-31 N. Kenji Taylor, Takashi Nishibayashi
Background: AI-driven symptom checkers (SC) are increasingly adopted in healthcare for their potential to provide users with accessible and immediate preliminary health education. These tools, powered by advanced artificial intelligence algorithms, assist patients in quickly assessing their symptoms. Previous studies using clinical vignette approaches have evaluated SC accuracy, highlighting both strengths
-
BrainGT: Multifunctional Brain Graph Transformer for Brain Disorder Diagnosis medRxiv. Health Inform. Pub Date : 2024-08-31 Ahsan Shehzad, Shuo Yu, Dongyu Zhang, Shagufta Abid, Xinrui Cheng, Jingjing Zhou, Feng Xia
Brain networks play a crucial role in the diagnosis of brain disorders by enabling the identification of abnormal patterns and connections in brain activities. Previous studies exploit the Pearson’s correlation coefficient to construct functional brain networks from fMRI data and use graph learning to diagnose brain diseases. However, correlation-based brain networks are overly dense (often fully connected)
-
Natural language processing to evaluate texting conversations between patients and healthcare providers during COVID-19 Home-Based Care in Rwanda at scale medRxiv. Health Inform. Pub Date : 2024-08-31 Richard T Lester, Matthew Manson, Muhammed Semakula, Hyeju Jang, Hassan Mugabo, Ali Magzari, Junhong Ma Blackmer, Fanan Fattah, Simon Pierre Niyonsenga, Edson Rwagasore, Charles Ruranga, Eric Remera, Jean Claude S. Ngabonziza, Giuseppe Carenini, Sabin Nsanzimana
Isolation of patients with communicable infectious diseases limits spread of pathogens but can be difficult to manage outside hospitals. Rwanda deployed a digital health service nationally to assist public health clinicians to remotely monitor and support SARS-CoV-2 cases via their mobile phones using daily interactive short message service (SMS) check-ins. We aimed to assess the texting patterns and
-
LSD600: the first corpus of biomedical abstracts annotated with lifestyle–disease relations medRxiv. Health Inform. Pub Date : 2024-08-31 Esmaeil Nourani, Evangelia-Mantelena Makri, Xiqing Mao, Sampo Pyysalo, Søren Brunak, Katerina Nastou, Lars Juhl Jensen
Lifestyle factors (LSFs) are increasingly recognized as instrumental in both the development and control of diseases. Despite their importance, there is a lack of methods to extract relations between LSFs and diseases from the literature, a step necessary to consolidate the currently available knowledge into a structured form. As simple co-occurrence-based relation extraction (RE) approaches are unable
-
Studying Veteran food insecurity longitudinally using electronic health record data and natural language processing medRxiv. Health Inform. Pub Date : 2024-08-31 Alec B. Chapman, Talia Panadero, Rachel Dalrymple, Alicia Cohen, Nipa Kamdar, Farhana Pethani, Andrea Kalvesmaki, Richard E. Nelson, Jorie Butler
Food insecurity is an important social risk factor that is directly linked to patient health and well-being. The Department of Veterans Affairs (VA) aims to identify and resolve food insecurity through social and clinical interventions. However, evaluating the impact of such interventions is made challenging by the lack of follow-up data on Veteran food insecurity status. One potential solution is
-
The challenges of replication: a worked example of methods reproducibility using electronic health record data medRxiv. Health Inform. Pub Date : 2024-08-30 Richard Williams, Thomas Bolton, David Jenkins, Mehrdad A Mizani, Matthew Sperrin, Cathie Sudlow FMedSci, Angela Wood, Adrian Heald, Niels Peek, CVD-COVID-UK/COVID-IMPACT Consortium
Objective: The ability to reproduce the work of others is an essential part of the scientific disciplines. Replicating observational studies using electronic health record (EHR) data can be challenging due to complexities in data access, variations in EHR systems across institutions, and the potential for unaccounted confounding variables. Our aim is to identify the barriers to methods reproducibility
-
Statistical refinement of case vignettes for digital health research medRxiv. Health Inform. Pub Date : 2024-08-30 Marvin Kopka, Markus A. Feufel
Digital health research often relies on case vignettes (descriptions of fictitious or real patients) to navigate ethical and practical challenges. Despite their utility, the quality and lack of standardization of these vignettes has often been criticized, especially in studies on symptom-assessment applications (SAAs) and triage decision-making. To address this, our paper introduces a method to refine
-
Replicating a COVID-19 study in a national England database to assess the generalisability of research with regional electronic health record data medRxiv. Health Inform. Pub Date : 2024-08-30 Richard Williams, David Jenkins, Thomas Bolton, Adrian Heald, Mehrdad A Mizani, Matthew Sperrin, Niels Peek, the CVD-COVID-UK/COVID-IMPACT Consortium
Objectives: To assess the degree to which we can replicate a study between a regional and a national database of electronic health record data in the United Kingdom.
-
Perceptions about the Use of Virtual Assistants for Seeking Health Information among Caregivers of Young Childhood Cancer Survivors medRxiv. Health Inform. Pub Date : 2024-08-29 Emre Sezgin, Daniel I. Jackson, Kate Kaufman, Micah Skeens, Cynthia A. Gerhardt, Emily L. Moscato
Purpose: This study examined the perceptions of caregivers of young childhood cancer survivors (YCCS) regarding the use of virtual assistant (VA) technology for health information seeking and care management. The study aim was to understand how VAs can support caregivers, especially those from underserved communities, in navigating health information related to cancer survivorship.
-
Accurate Skin Lesion Classification Using Multimodal Learning on the HAM10000 Dataset medRxiv. Health Inform. Pub Date : 2024-08-28 Abdulmateen Adebiyi, Nader Abdalnabi, Emily Hoffman Smith, Jesse Hirner, Eduardo J. Simoes, Mirna Becevic, Praveen Rao
Objectives: Our aim is to evaluate the performance of multimodal deep learning to classify skin lesions using both images and textual descriptions compared to learning only on images.
-
A Scoping Review of Privacy and Utility Metrics in Medical Synthetic Data medRxiv. Health Inform. Pub Date : 2024-08-28 Bayrem Kaabachi, Jérémie Despraz, Thierry Meurers, Karen Otte, Mehmed Halilovic, Bogdan Kulynych, Fabian Prasser, Jean Louis Raisaro
The use of synthetic data is a promising solution to facilitate the sharing and reuse of health-related data beyond its initial collection while addressing privacy concerns. However, there is still no consensus on a standardized approach for systematically evaluating the privacy and utility of synthetic data, impeding its broader adoption. In this work, we present a comprehensive review and systematization
-
Development of a cloud framework for training and deployment of deep learning models in Radiology: automatic segmentation of the human spine from CT-scans as a case-study medRxiv. Health Inform. Pub Date : 2024-08-28 Rui Santos, Nicholas Bünger, Benedikt Herzog, Sebastiano Caprara
Advancements in artificial intelligence (AI) and the digitalization of healthcare are revolutionizing clinical practices, with the deployment of AI models playing a crucial role in enhancing diagnostic accuracy and treatment outcomes. Our current study aims to bridge image data collected in a clinical setting with the deployment of deep learning algorithms for the segmentation of the human spine. The
-
Large Language Model Augmented Clinical Trial Screening medRxiv. Health Inform. Pub Date : 2024-08-28 Jacob Beattie, Dylan Owens, Ann Marie Navar, Luiza Giuliani Schmitt, Kimberly Taing, Sarah Neufeld, Daniel Yang, Christian Chukwuma, Ahmed Gul, Dong Soo Lee, Neil Desai, Dominic Moon, Jing Wang, Steve Jiang, Michael Dohopolski
Purpose: Identifying potential participants for clinical trials using traditional manual screening methods is time-consuming and expensive. Structured data in electronic health records (EHR) are often insufficient to capture trial inclusion and exclusion criteria adequately. Large language models (LLMs) offer the potential for improved participant screening by searching text notes in the EHR, but optimal
-
Evaluation synthesis analysis can be accelerated through text mining, searching, and highlighting: A case-study on data extraction from 631 UNICEF evaluation reports medRxiv. Health Inform. Pub Date : 2024-08-28 Lena Schmidt, Pauline Addis, Erica Mattellone, Hannah OKeefe, Kamilla Nabiyeva, Uyen Kim Huynh, Nabamallika Dehingia, Dawn Craig, Fiona Campbell
Background: The United Nations Children's Fund (UNICEF) is the United Nations agency dedicated to promoting and advocating for the protection of children's rights, meeting their basic needs, and expanding their opportunities to reach their full potential. They achieve this by working with governments, communities, and other partners via programmes that safeguard children from violence, provide access
-
MedSegBench: A Comprehensive Benchmark for Medical Image Segmentation in Diverse Data Modalities medRxiv. Health Inform. Pub Date : 2024-08-28 Zeki Kuş, Musa Aydin
MedSegBench is a comprehensive benchmark designed to evaluate deep learning models for medical image segmentation across a wide range of modalities. It includes 35 datasets with over 60,000 images from ultrasound, MRI, and X-ray. The benchmark addresses challenges in medical imaging by providing standardized datasets with train/validation/test splits, considering
-
Evaluating Anti-LGBTQIA+ Medical Bias in Large Language Models medRxiv. Health Inform. Pub Date : 2024-08-27 Crystal Tin-Tin Chang, Neha Srivathsa, Charbel Bou-Khalil, Akshay Swaminathan, Mitchell R Lunn, Kavita Mishra, Roxana Daneshjou, Sanmi Koyejo
From drafting responses to patient messages to clinical decision support to patient-facing educational chatbots, Large Language Models (LLMs) present many opportunities for use in clinical situations. In these applications, we must consider potential harms to minoritized groups through the propagation of medical misinformation or previously-held misconceptions. In this work, we evaluate the potential
-
Enhancing Clinical Documentation Workflow with Ambient Artificial Intelligence: Clinician Perspectives on Work Burden, Burnout, and Job Satisfaction medRxiv. Health Inform. Pub Date : 2024-08-26 Michael Albrecht, Denton Shanks, Tina Shah, Taina Hudson, Jeffrey Thompson, Tanya Filardi, Kelli Wright, Greg Ator, Timothy Ryan Smith
Objective: This study assessed the effects of an ambient artificial intelligence (AI) documentation platform on clinicians' perceptions of documentation workflow. Materials and Methods: A pre- and post-implementation survey evaluated ambulatory clinician perceptions of the impact of Abridge, an ambient AI documentation platform. Outcomes included clinical documentation burden, work after-hours, clinician
-
The MSPTDfast photoplethysmography beat detection algorithm: Design, benchmarking, and open-source distribution medRxiv. Health Inform. Pub Date : 2024-08-26 Peter H Charlton, Erick Javier Arguello Prada, Jonathan Mant, Panicos A Kyriacou
Objective: Photoplethysmography is widely used for physiological monitoring, whether in clinical devices such as pulse oximeters, or consumer devices such as smartwatches. A key step in the analysis of photoplethysmogram (PPG) signals is detecting heartbeats. The MSPTD algorithm has been found to be one of the most accurate PPG beat detection algorithms, but is less computationally efficient than other
-
Health Data Nexus: An Open Data Platform for AI Research and Education in Medicine medRxiv. Health Inform. Pub Date : 2024-08-23 January L Adams, Rafal Cymerys, Karol Szuster, Daniel Hekman, Zoryana Salo, Rutvik Solanki, Muhammad Mamdani, Alistair Johnson, Katarzyna Ryniak, Tom L Pollard, David Rotenberg, Benjamin Haibe-Kains
We outline the development of the Health Data Nexus, a data platform which enables data storage and access management with a cloud-based computational environment. We describe the importance of this secure platform in an evolving public sector research landscape that utilizes significant quantities of data, particularly clinical data acquired from health systems, as well as the importance of providing
-
U-Net as a deep learning-based method for platelets segmentation in microscopic images medRxiv. Health Inform. Pub Date : 2024-08-23 Eva Maria Valerio de Sousa, Ajay Kumar, Charlie Coupland, Tânia F. Vaz, Will Jones, Rubén Valcarce-Diñeiro, Simon D.J. Calaminus
Manual counting of platelets in microscopy images is highly time-consuming. Our goal was to automatically segment and count platelets in images using a deep learning approach, applying U-Net and Fully Convolutional Network (FCN) modelling. Data preprocessing was done by creating binary masks and utilizing supervised learning with ground-truth labels. Data augmentation was implemented, for improved
-
Leveraging Large Language Models for Identifying Interpretable Linguistic Markers and Enhancing Alzheimer's Disease Diagnostics medRxiv. Health Inform. Pub Date : 2024-08-23 Tingyu Mo, Jacqueline Lam, Victor Li, Lawrence Cheung
Alzheimer's disease (AD) is a progressive and irreversible neurodegenerative disorder. Early detection of AD is crucial for timely disease intervention. This study proposes a novel LLM framework, which extracts interpretable linguistic markers from LLMs and incorporates them into supervised AD detection models, while evaluating their model performance and interpretability. Our work consists of