-
MOCAT: multi-omics integration with auxiliary classifiers enhanced autoencoder Biodata Min. (IF 4.5) Pub Date : 2024-03-05 Xiaohui Yao, Xiaohan Jiang, Haoran Luo, Hong Liang, Xiufen Ye, Yanhui Wei, Shan Cong
Integrating multi-omics data is emerging as a critical approach in enhancing our understanding of complex diseases. Innovative computational methods capable of managing high-dimensional and heterogeneous datasets are required to unlock the full potential of such rich and diverse data. We propose a Multi-Omics integration framework with auxiliary Classifiers-enhanced AuToencoders (MOCAT) to utilize
-
Interpreting drug synergy in breast cancer with deep learning using target-protein inhibition profiles Biodata Min. (IF 4.5) Pub Date : 2024-02-29 Thanyawee Srithanyarat, Kittisak Taoma, Thana Sutthibutpong, Marasri Ruengjitchatchawalya, Monrudee Liangruksa, Teeraphan Laomettachit
Breast cancer is the most common malignancy among women worldwide. Despite advances in treating breast cancer over the past decades, drug resistance and adverse effects remain challenging. Recent therapeutic progress has shifted toward using drug combinations for better treatment efficiency. However, with a growing number of potential small-molecule cancer inhibitors, in silico strategies to predict
-
Interaction models matter: an efficient, flexible computational framework for model-specific investigation of epistasis Biodata Min. (IF 4.5) Pub Date : 2024-02-28 Sandra Batista, Vered Senderovich Madar, Philip J. Freda, Priyanka Bhandary, Attri Ghosh, Nicholas Matsumoto, Apurva S. Chitre, Abraham A. Palmer, Jason H. Moore
Epistasis, the interaction between two or more genes, is integral to the study of genetics and is present throughout nature. Yet, it is seldom fully explored as most approaches primarily focus on single-locus effects, partly because analyzing all pairwise and higher-order interactions requires significant computational resources. Furthermore, existing methods for epistasis detection only consider a
-
Assessment of the causal relationship between gut microbiota and cardiovascular diseases: a bidirectional Mendelian randomization analysis Biodata Min. (IF 4.5) Pub Date : 2024-02-26 Xiao-Ce Dai, Yi Yu, Si-Yu Zhou, Shuo Yu, Mei-Xiang Xiang, Hong Ma
Previous studies have shown an association between gut microbiota and cardiovascular diseases (CVDs). However, the underlying causal relationship remains unclear. This study aims to elucidate the causal relationship between gut microbiota and CVDs and to explore the pathogenic role of gut microbiota in CVDs. In this two-sample Mendelian randomization study, we used genetic instruments from publicly
-
A network-based drug prioritization and combination analysis for the MEK5/ERK5 pathway in breast cancer Biodata Min. (IF 4.5) Pub Date : 2024-02-21 Regan Odongo, Asuman Demiroglu-Zergeroglu, Tunahan Çakır
Prioritizing candidate drugs based on genome-wide expression data is an emerging approach in systems pharmacology due to its holistic perspective for preclinical drug evaluation. In the current study, a network-based approach was proposed and applied to prioritize plant polyphenols and identify potential drug combinations in breast cancer. We focused on MEK5/ERK5 signalling pathway genes, a recently
-
m1A-Ensem: accurate identification of 1-methyladenosine sites through ensemble models Biodata Min. (IF 4.5) Pub Date : 2024-02-15 Muhammad Taseer Suleman, Fahad Alturise, Tamim Alkhalifah, Yaser Daanial Khan
1-methyladenosine (m1A) is a variant of methyladenosine that carries a methyl substituent at the 1st position and has a prominent role in RNA stability and human metabolites. Traditional approaches, such as mass spectrometry and site-directed mutagenesis, proved to be time-consuming and complicated. The present research focused on the identification of m1A sites within RNA sequences using novel feature
-
Revealing third-order interactions through the integration of machine learning and entropy methods in genomic studies Biodata Min. (IF 4.5) Pub Date : 2024-01-30 Burcu Yaldız, Onur Erdoğan, Sevda Rafatov, Cem Iyigün, Yeşim Aydın Son
Non-linear relationships at the genotype level are essential in understanding the genetic interactions of complex disease traits. Genome-Wide Association Studies (GWAS) have revealed statistical associations of SNPs with many complex diseases. As GWAS results could not thoroughly reveal the genetic background of these disorders, Genome-Wide Interaction Studies have started to gain importance. In recent
-
Antibody selection strategies and their impact in predicting clinical malaria based on multi-sera data Biodata Min. (IF 4.5) Pub Date : 2024-01-25 André Fonseca, Mikolaj Spytek, Przemysław Biecek, Clara Cordeiro, Nuno Sepúlveda
Nowadays, the chance of discovering the best antibody candidates for predicting clinical malaria has notably increased due to the availability of multi-sera data. The analysis of these data is typically divided into a feature selection phase followed by a predictive one where several models are constructed for predicting the outcome of interest. A key question in the analysis is to determine which
-
Machine learning approaches to identify systemic lupus erythematosus in anti-nuclear antibody-positive patients using genomic data and electronic health records Biodata Min. (IF 4.5) Pub Date : 2024-01-05 Chih-Wei Chung, Seng-Cho Chou, Tzu-Hung Hsiao, Grace Joyce Zhang, Yu-Fang Chung, Yi-Ming Chen
Although the 2019 EULAR/ACR classification criteria for systemic lupus erythematosus (SLE) require at least a positive anti-nuclear antibody (ANA) titer (≥ 1:80), it remains challenging for clinicians to identify patients with SLE. This study aimed to develop a machine learning (ML) approach to assist in the detection of SLE patients using genomic data and electronic health records. Participants
-
Optimizing age-related hearing risk predictions: an advanced machine learning integration with HHIE-S Biodata Min. (IF 4.5) Pub Date : 2023-12-14 Tzong-Hann Yang, Yu-Fu Chen, Yen-Fu Cheng, Jue-Ni Huang, Chuan-Song Wu, Yuan-Chia Chu
The elderly are disproportionately affected by age-related hearing loss (ARHL). Despite being a well-known tool for ARHL evaluation, the Hearing Handicap Inventory for the Elderly Screening version (HHIE-S) has traditionally been used only for direct screening using self-reported outcomes. This work uses a novel integration of machine learning approaches to improve the predictive accuracy of the HHIE-S
-
6mA-StackingCV: an improved stacking ensemble model for predicting DNA N6-methyladenine site Biodata Min. (IF 4.5) Pub Date : 2023-11-27 Guohua Huang, Xiaohong Huang, Wei Luo
DNA N6-adenine methylation (N6-methyladenine, 6mA) plays a key regulatory role in cellular processes. Precisely recognizing 6mA sites is important for further exploring its biological functions. Although many computational methods for 6mA site prediction have been developed over the past decades, there is still considerable room for improvement. We presented a cross validation-based stacking ensemble model
-
Endoscopy-based IBD identification by a quantized deep learning pipeline Biodata Min. (IF 4.5) Pub Date : 2023-11-25 Massimiliano Datres, Elisa Paolazzi, Marco Chierici, Matteo Pozzi, Antonio Colangelo, Marcello Dorian Donzella, Giuseppe Jurman
Discrimination between patients affected by inflammatory bowel diseases and healthy controls on the basis of endoscopic imaging is a challenging problem for machine learning models. This task is used here as the testbed for a novel deep learning classification pipeline, powered by a set of solutions enhancing characterising elements such as reproducibility, interpretability, reduced computational
-
DeepAutoGlioma: a deep learning autoencoder-based multi-omics data integration and classification tools for glioma subtyping Biodata Min. (IF 4.5) Pub Date : 2023-11-15 Sana Munquad, Asim Bikas Das
The classification of glioma subtypes is essential for precision therapy. Due to the heterogeneity of gliomas, the subtype-specific molecular pattern can be captured by integrating and analyzing high-throughput omics data from different genomic layers. The development of a deep-learning framework enables the integration of multi-omics data to classify the glioma subtypes to support the clinical diagnosis
-
Research agenda for using artificial intelligence in health governance: interpretive scoping review and framework Biodata Min. (IF 4.5) Pub Date : 2023-10-31 Maryam Ramezani, Amirhossein Takian, Ahad Bakhtiari, Hamid R. Rabiee, Sadegh Ghazanfari, Saharnaz Sazgarnejad
The governance of health systems is complex in nature due to several intertwined and multi-dimensional factors contributing to it. Recent challenges of health systems reflect the need for innovative approaches that can minimize adverse consequences of policies. Hence, there is compelling evidence of a distinct outlook on the health ecosystem using artificial intelligence (AI). Therefore, this study
-
Correction: A prognostic model based on seven immune-related genes predicts the overall survival of patients with hepatocellular carcinoma Biodata Min. (IF 4.5) Pub Date : 2023-10-26 Qian Yan, Wenjiang Zheng, Boqing Wang, Baoqian Ye, Huiyan Luo, Xinqian Yang, Ping Zhang, Xiongwen Wang
Correction: BioData Mining 14, 29 (2021) https://doi.org/10.1186/s13040-021-00261-y Following publication of the original article [1], the authors found that the pathology images of normal liver tissue and liver cancer tissue of FABP6 in Fig. 13B were duplicated. The pathological images of normal FABP6 and liver cancer patients were both sourced from public databases (The Human Protein Atlas, https://www
-
Prescription pattern analysis of Type 2 Diabetes Mellitus: a cross-sectional study in Isfahan, Iran Biodata Min. (IF 4.5) Pub Date : 2023-10-20 Elnaz Ziad, Somayeh Sadat, Farshad Farzadfar, Mohammad-Reza Malekpour
Patients with Type 2 Diabetes Mellitus (T2DM) are at a higher risk of polypharmacy and more susceptible to irrational prescriptions; therefore, it is important to monitor pharmacological therapy patterns. The primary objective of this study was to highlight current prescription patterns in T2DM patients and compare them with existing Standards of Medical Care in Diabetes. The second objective was
-
Attention-based dual-path feature fusion network for automatic skin lesion segmentation Biodata Min. (IF 4.5) Pub Date : 2023-10-09 Zhenxiang He, Xiaoxia Li, Yuling Chen, Nianzu Lv, Yong Cai
Automatic segmentation of skin lesions is a critical step in Computer Aided Diagnosis (CAD) of melanoma. However, blurred lesion boundaries, uneven color distribution, and low image contrast lead to poor segmentation results. To address the difficulty of segmenting skin lesions, this paper proposes an Attention-based Dual-path Feature Fusion Network (ADFFNet) for automatic
-
Quantum analysis of squiggle data Biodata Min. (IF 4.5) Pub Date : 2023-10-06 Naya Nagy, Matthew Stuart-Edwards, Marius Nagy, Liam Mitchell, Athanasios Zovoilis
Squiggle data is the numerical output of DNA and RNA sequencing by the Nanopore next generation sequencing platform. Nanopore sequencing offers expanded applications compared to previous sequencing techniques but produces a large amount of data in the form of current measurements over time. The analysis of these segments of current measurements requires more complex and computationally intensive algorithms
-
Disclosing transcriptomics network-based signatures of glioma heterogeneity using sparse methods Biodata Min. (IF 4.5) Pub Date : 2023-09-26 Sofia Martins, Roberta Coletti, Marta B. Lopes
Gliomas are primary malignant brain tumors with poor survival and high resistance to available treatments. Improving the molecular understanding of glioma and disclosing novel biomarkers of tumor development and progression could help to find novel targeted therapies for this type of cancer. Public databases such as The Cancer Genome Atlas (TCGA) provide an invaluable source of molecular information
-
STAR_outliers: a python package that separates univariate outliers from non-normal distributions Biodata Min. (IF 4.5) Pub Date : 2023-09-04 John T. Gregg, Jason H. Moore
There are not currently any univariate outlier detection algorithms that transform and model arbitrarily shaped distributions to remove univariate outliers. Some algorithms model skew, even fewer model kurtosis, and none of them model bimodality and monotonicity. To overcome these challenges, we have implemented an algorithm for Skew and Tail-heaviness Adjusted Removal of Outliers (STAR_outliers) that
-
Machine learning based study for the classification of Type 2 diabetes mellitus subtypes Biodata Min. (IF 4.5) Pub Date : 2023-08-22 Nelson E. Ordoñez-Guillen, Jose Luis Gonzalez-Compean, Ivan Lopez-Arevalo, Miguel Contreras-Murillo, Edwin Aldana-Bobadilla
Data-driven diabetes research has shown increasing interest in exploring the heterogeneity of the disease, aiming to support the development of more specific prognoses and treatments within so-called precision medicine. Recently, one of these studies found five diabetes subgroups with varying risks of complications and treatment responses. Here, we tackle the development and assessment of different
-
Assessment of emerging pretraining strategies in interpretable multimodal deep learning for cancer prognostication Biodata Min. (IF 4.5) Pub Date : 2023-07-22 Zarif L. Azher, Anish Suvarna, Ji-Qing Chen, Ze Zhang, Brock C. Christensen, Lucas A. Salas, Louis J. Vaickus, Joshua J. Levy
Deep learning models can infer cancer patient prognosis from molecular and anatomic pathology information. Recent studies that leveraged information from complementary multimodal data improved prognostication, further illustrating the potential utility of such methods. However, current approaches: 1) do not comprehensively leverage biological and histomorphological relationships and 2) make use of
-
Inverse problem for parameters identification in a modified SIRD epidemic model using ensemble neural networks Biodata Min. (IF 4.5) Pub Date : 2023-07-18 Marian Petrica, Ionel Popescu
In this paper, we propose a parameter identification methodology of the SIRD model, an extension of the classical SIR model, that considers the deceased as a separate category. In addition, our model includes one parameter which is the ratio between the real total number of infected and the number of infected that were documented in the official statistics. Due to many factors, like governmental decisions
-
Neural network-based prognostic predictive tool for gastric cardiac cancer: the worldwide retrospective study Biodata Min. (IF 4.5) Pub Date : 2023-07-18 Wei Li, Minghang Zhang, Siyu Cai, Liangliang Wu, Chao Li, Yuqi He, Guibin Yang, Jinghui Wang, Yuanming Pan
The incidence of gastric cardiac cancer (GCC), which has a poor prognosis, has clearly increased recently. It is necessary to compare the prognosis of GCC with that of carcinomas at other gastric sites and to set up an effective prognostic model based on a neural network to predict the survival of GCC patients. In the population-based cohort study, we first enrolled the clinical features from the Surveillance, Epidemiology and
-
ChatGPT and large language models in academia: opportunities and challenges Biodata Min. (IF 4.5) Pub Date : 2023-07-13 Jesse G. Meyer, Ryan J. Urbanowicz, Patrick C. N. Martin, Karen O’Connor, Ruowang Li, Pei-Chen Peng, Tiffani J. Bright, Nicholas Tatonetti, Kyoung Jae Won, Graciela Gonzalez-Hernandez, Jason H. Moore
The introduction of large language models (LLMs) that allow iterative “chat” in late 2022 is a paradigm shift that enables generation of text often indistinguishable from that written by humans. LLM-based chatbots have immense potential to improve academic work efficiency, but the ethical implications of their fair use and inherent bias must be considered. In this editorial, we discuss this technology
-
Overlapping filter bank convolutional neural network for multisubject multicategory motor imagery brain-computer interface Biodata Min. (IF 4.5) Pub Date : 2023-07-11 Jing Luo, Jundong Li, Qi Mao, Zhenghao Shi, Haiqin Liu, Xiaoyong Ren, Xinhong Hei
Motor imagery brain-computer interfaces (BCIs) are a classic and promising BCI technology for achieving brain-computer integration. In motor imagery BCI, the operational frequency band of the EEG greatly affects the performance of the motor imagery EEG recognition model. However, as most algorithms use a broad frequency band, the discriminative information from multiple sub-bands is not fully utilized. Thus, using convolutional
-
Comparison of cancer subtype identification methods combined with feature selection methods in omics data analysis Biodata Min. (IF 4.5) Pub Date : 2023-07-07 JiYoon Park, Jae Won Lee, Mira Park
Cancer subtype identification is important for the early diagnosis of cancer and the provision of adequate treatment. Prior to identifying the subtype of cancer in a patient, feature selection is also crucial for reducing the dimensionality of the data by detecting genes that contain important information about the cancer subtype. Numerous cancer subtyping methods have been developed, and their performance
-
ScInfoVAE: interpretable dimensional reduction of single cell transcription data with variational autoencoders and extended mutual information regularization Biodata Min. (IF 4.5) Pub Date : 2023-06-10 Weiquan Pan, Faning Long, Jian Pan
Single-cell RNA-sequencing (scRNA-seq) data can serve as a good indicator of cell-to-cell heterogeneity and can aid in the study of cell growth by identifying cell types. Recently, advances in Variational Autoencoder (VAE) have demonstrated their ability to learn robust feature representations for scRNA-seq. However, it has been observed that VAEs tend to ignore the latent variables when combined with
-
Changing word meanings in biomedical literature reveal pandemics and new technologies Biodata Min. (IF 4.5) Pub Date : 2023-05-05 David N. Nicholson, Faisal Alquaddoomi, Vincent Rubinetti, Casey S. Greene
While we often think of words as having a fixed meaning that we use to describe a changing world, words are also dynamic and changing. Scientific research can also be remarkably fast-moving, with new concepts or approaches rapidly gaining mind share. We examined scientific writing, both preprint and pre-publication peer-reviewed text, to identify terms that have changed and examine their use. One particular
-
A self-inspected adaptive SMOTE algorithm (SASMOTE) for highly imbalanced data classification in healthcare Biodata Min. (IF 4.5) Pub Date : 2023-04-25 Tanapol Kosolwattana, Chenang Liu, Renjie Hu, Shizhong Han, Hua Chen, Ying Lin
In many healthcare applications, datasets for classification may be highly imbalanced due to the rare occurrence of target events such as disease onset. The SMOTE (Synthetic Minority Over-sampling Technique) algorithm has been developed as an effective resampling method for imbalanced data classification by oversampling samples from the minority class. However, samples generated by SMOTE may be ambiguous
-
Automated quantitative trait locus analysis (AutoQTL) Biodata Min. (IF 4.5) Pub Date : 2023-04-10 Philip J. Freda, Attri Ghosh, Elizabeth Zhang, Tianhao Luo, Apurva S. Chitre, Oksana Polesskaya, Celine L. St. Pierre, Jianjun Gao, Connor D. Martin, Hao Chen, Angel G. Garcia-Martinez, Tengfei Wang, Wenyan Han, Keita Ishiwari, Paul Meyer, Alexander Lamparelli, Christopher P. King, Abraham A. Palmer, Ruowang Li, Jason H. Moore
Quantitative Trait Locus (QTL) analysis and Genome-Wide Association Studies (GWAS) have the power to identify variants that capture significant levels of phenotypic variance in complex traits. However, effort and time are required to select the best methods and optimize parameters and pre-processing steps. Although machine learning approaches have been shown to greatly assist in optimization and data
-
Reference-free phylogeny from sequencing data Biodata Min. (IF 4.5) Pub Date : 2023-03-27 Petr Ryšavý, Filip Železný
Clustering of genetic sequences is one of the key parts of bioinformatics analyses. Resulting phylogenetic trees are beneficial for solving many research questions, including tracing the history of species, studying migration in the past, or tracing a source of a virus outbreak. At the same time, biologists provide more data in the raw form of reads or only on contig-level assembly. Therefore, tools
-
Algorithm-based detection of acute kidney injury according to full KDIGO criteria including urine output following cardiac surgery: a descriptive analysis Biodata Min. (IF 4.5) Pub Date : 2023-03-16 Nico Schmid, Mihnea Ghinescu, Moritz Schanz, Micha Christ, Severin Schricker, Markus Ketteler, Mark Dominik Alscher, Ulrich Franke, Nora Goebel
Automated data analysis and processing has the potential to assist, improve and guide decision making in medical practice. However, it has not yet been fully integrated into clinical settings. Herein we present the first results of applying algorithm-based detection to the diagnosis of postoperative acute kidney injury (AKI) comprising patient data from a cardiac surgical intensive care unit
-
Clinical assistant decision-making model of tuberculosis based on electronic health records Biodata Min. (IF 4.5) Pub Date : 2023-03-16 Mengying Wang, Cuixia Lee, Zhenhao Wei, Hong Ji, Yingyun Yang, Cheng Yang
Tuberculosis is a dangerous infectious disease with the largest number of reported cases in China every year. Preventing missed diagnosis has an important impact on the prevention, treatment, and recovery of tuberculosis. The earliest pulmonary tuberculosis prediction models mainly used traditional image data combined with neural network models. However, a single data source tends to miss important
-
Unsupervised encoding selection through ensemble pruning for biomedical classification Biodata Min. (IF 4.5) Pub Date : 2023-03-16 Sebastian Spänig, Alexander Michel, Dominik Heider
Owing to the rising levels of multi-resistant pathogens, antimicrobial peptides, an alternative strategy to classic antibiotics, have gained more attention. A crucial part is their costly identification and validation. With the ever-growing amount of annotated peptides, researchers leverage artificial intelligence to circumvent the cumbersome, wet-lab-based identification and automate the detection of
-
Genetics and precision health: the ecological fallacy and artificial intelligence solutions Biodata Min. (IF 4.5) Pub Date : 2023-03-13 Scott M. Williams, Jason H. Moore
Human genetics began as a field that used family-based linkage studies to identify genes associated with specific phenotypes. This approach was highly successful in finding genes of high penetrance that were essentially Mendelian in their inheritance patterns and highly predictive of phenotypes. This approach truly identified individuals at risk of disease within families based usually on a single
-
Prediction of the risk of developing end-stage renal diseases in newly diagnosed type 2 diabetes mellitus using artificial intelligence algorithms Biodata Min. (IF 4.5) Pub Date : 2023-03-10 Shuo-Ming Ou, Ming-Tsun Tsai, Kuo-Hua Lee, Wei-Cheng Tseng, Chih-Yu Yang, Tz-Heng Chen, Pin-Jie Bin, Tzeng-Ji Chen, Yao-Ping Lin, Wayne Huey-Herng Sheu, Yuan-Chia Chu, Der-Cherng Tarng
Type 2 diabetes mellitus (T2DM) imposes a great burden on healthcare systems, and these patients experience higher long-term risks for developing end-stage renal disease (ESRD). Managing diabetic nephropathy becomes more challenging when kidney function starts declining. Therefore, developing predictive models for the risk of developing ESRD in newly diagnosed T2DM patients may be helpful in clinical
-
Signature literature review reveals AHCY, DPYSL3, and NME1 as the most recurrent prognostic genes for neuroblastoma Biodata Min. (IF 4.5) Pub Date : 2023-03-04 Davide Chicco, Tiziana Sanavia, Giuseppe Jurman
Neuroblastoma is a childhood neurological tumor which affects hundreds of thousands of children worldwide, and information about its prognosis can be pivotal for patients, their families, and clinicians. One of the main goals in the related bioinformatics analyses is to provide stable genetic signatures able to include genes whose expression levels can be effective to predict the prognosis of the patients
-
Ten simple rules for providing bioinformatics support within a hospital Biodata Min. (IF 4.5) Pub Date : 2023-02-23 Davide Chicco, Giuseppe Jurman
Bioinformatics has become a key aspect of the biomedical research programmes of many hospitals’ scientific centres, and the establishment of bioinformatics facilities within hospitals has become a common practice worldwide. Bioinformaticians working in these facilities provide computational biology support to medical doctors and principal investigators who are daily dealing with data of patients to
-
iU-Net: a hybrid structured network with a novel feature fusion approach for medical image segmentation Biodata Min. (IF 4.5) Pub Date : 2023-02-21 Yun Jiang, Jinkun Dong, Tongtong Cheng, Yuan Zhang, Xin Lin, Jing Liang
In recent years, convolutional neural networks (CNNs) have made great achievements in the field of medical image segmentation, especially full convolutional neural networks based on U-shaped structures and skip connections. However, limited by the inherent limitations of convolution, CNNs-based methods usually exhibit limitations in modeling long-range dependencies and are unable to extract large amounts
-
The Matthews correlation coefficient (MCC) should replace the ROC AUC as the standard metric for assessing binary classification Biodata Min. (IF 4.5) Pub Date : 2023-02-17 Davide Chicco, Giuseppe Jurman
Binary classification is a common task for which machine learning and computational statistics are used, and the area under the receiver operating characteristic curve (ROC AUC) has become the common standard metric to evaluate binary classifications in most scientific fields. The ROC curve has true positive rate (also called sensitivity or recall) on the y axis and false positive rate on the x axis
-
LoFTK: a framework for fully automated calculation of predicted Loss-of-Function variants and genes Biodata Min. (IF 4.5) Pub Date : 2023-02-02 Abdulrahman Alasiri, Konrad J. Karczewski, Brian Cole, Bao-Li Loza, Jason H. Moore, Sander W. van der Laan, Folkert W. Asselbergs, Brendan J. Keating, Jessica van Setten
Loss-of-Function (LoF) variants in human genes are important due to their impact on clinical phenotypes and frequent occurrence in the genomes of healthy individuals. The association of LoF variants with complex diseases and traits may lead to the discovery and validation of novel therapeutic targets. Current approaches predict high-confidence LoF variants without identifying the specific genes or
-
Detection of iron deficiency anemia by medical images: a comparative study of machine learning algorithms Biodata Min. (IF 4.5) Pub Date : 2023-01-24 Peter Appiahene, Justice Williams Asare, Emmanuel Timmy Donkoh, Giovanni Dimauro, Rosalia Maglietta
Anemia is one of the global public health problems that affect children and pregnant women. Anemia occurs when the level of red blood cells within the body decreases, when the structure of the red blood cells is destroyed, or when the Hb level in the red blood cells is below the normal threshold, which results from one or more of increased red cell destruction, blood loss, defective cell production or
-
Bacteria spatial tracking in Urban Park soils with MALDI-TOF Mass Spectrometry and Specific PCR Biodata Min. (IF 4.5) Pub Date : 2023-01-14 Diego Arnal, Celeste Moya, Luigi Filippelli, Jaume Segura-Garcia, Sergi Maicas
Urban parks constitute one of the main leisure areas, especially for the most vulnerable people in our society: children and the elderly. Contact with soils can pose a health risk. Microbiological testing is a key aspect in determining whether they are suitable for public use. The aim of this work is to map the spatial distribution of potentially dangerous Enterobacteria but also bioremediation useful
-
Robust and rigorous identification of tissue-specific genes by statistically extending tau score Biodata Min. (IF 4.5) Pub Date : 2022-12-09 Hatice Büşra Lüleci, Alper Yılmaz
In this study, we aimed to identify tissue-specific genes for various human tissues/organs more robustly and rigorously by extending the tau score algorithm. Tissue-specific genes are a class of genes whose function and expression are largely restricted to one or several tissues. Identification of tissue-specific genes is essential for discovering multi-cellular biological processes such as tissue-specific
-
Classification of breast cancer recurrence based on imputed data: a simulation study Biodata Min. (IF 4.5) Pub Date : 2022-12-07 Rahibu A. Abassi, Amina S. Msengwa
Several studies have been conducted to classify various real-life events, but few are in medical fields, particularly on breast cancer recurrence using statistical techniques. To our knowledge, there is no reported comparison of statistical classification accuracy and classifiers’ discriminative ability on breast cancer recurrence in the presence of imputed missing data. Therefore, this article aims to fill
-
Detecting diseases in medical prescriptions using data mining methods Biodata Min. (IF 4.5) Pub Date : 2022-11-24 Sana Nazari Nezhad, Mohammad H. Zahedi, Elham Farahani
Every year, the health of millions of people around the world is compromised by misdiagnosis, which sometimes could even lead to death. In addition, it entails huge financial costs for patients, insurance companies, and governments. Furthermore, many physicians’ professional lives are adversely affected by unintended errors in prescribing medication or misdiagnosing a disease. Our aim in this paper is
-
Towards a potential pan-cancer prognostic signature for gene expression based on probesets and ensemble machine learning Biodata Min. (IF 4.5) Pub Date : 2022-11-03 Davide Chicco, Abbas Alameer, Sara Rahmati, Giuseppe Jurman
Cancer is one of the leading causes of death worldwide and can be caused by environmental aspects (for example, exposure to asbestos), by human behavior (such as smoking), or by genetic factors. To understand which genes might be involved in patients’ survival, researchers have invented prognostic genetic signatures: lists of genes that can be used in scientific analyses to predict if a patient will
-
An unsupervised image segmentation algorithm for coronary angiography Biodata Min. (IF 4.5) Pub Date : 2022-10-21 Zong-Xian Yin, Hong-Ming Xu
Computer visual systems can rapidly obtain a large amount of data and automatically process them with ease. These characteristics constitute advantages for the application of such systems in the automatic analysis of medical images, as well as in processing technology. The precision of image segmentation, which plays a critical role in computer visual systems, directly affects the quality of processing
-
Expanding a database-derived biomedical knowledge graph via multi-relation extraction from biomedical abstracts Biodata Min. (IF 4.5) Pub Date : 2022-10-18 David N. Nicholson, Daniel S. Himmelstein, Casey S. Greene
Knowledge graphs support biomedical research efforts by providing contextual information for biomedical entities, constructing networks, and supporting the interpretation of high-throughput analyses. These databases are populated via manual curation, which is challenging to scale with an exponentially rising publication rate. Data programming is a paradigm that circumvents this arduous manual process
-
EZCancerTarget: an open-access drug repurposing and data-collection tool to enhance target validation and optimize international research efforts against highly progressive cancers Biodata Min. (IF 4.5) Pub Date : 2022-10-01 David Dora, Timea Dora, Gabor Szegvari, Csongor Gerdán, Zoltan Lohinai
The expanding body of potential therapeutic targets requires easily accessible, structured, and transparent real-time interpretation of molecular data. Open-access genomic, proteomic and drug-repurposing databases transformed the landscape of cancer research, but most of them are difficult and time-consuming for casual users. Furthermore, to conduct systematic searches and data retrieval on multiple
-
Effective hybrid feature selection using different bootstrap enhances cancers classification performance Biodata Min. (IF 4.5) Pub Date : 2022-09-30 Noura Mohammed Abdelwahed, Gh. S. El-Tawel, M. A. Makhlouf
Machine learning can be used to predict the different onset of human cancers. Highly dimensional data have enormous, complicated problems. One of these is an excessive number of genes plus over-fitting, fitting time, and classification accuracy. Recursive Feature Elimination (RFE) is a wrapper method for selecting the best subset of features that cause the best accuracy. Despite the high performance
-
Polygenic risk modeling of tumor stage and survival in bladder cancer Biodata Min. (IF 4.5) Pub Date : 2022-09-30 Mauro Nascimben, Lia Rimondini, Davide Corà, Manolo Venturin
Bladder cancer assessment with non-invasive gene expression signatures facilitates the detection of patients at risk and surveillance of their status, bypassing the discomforts given by cystoscopy. To achieve accurate cancer estimation, analysis pipelines for gene expression data (GED) may integrate a sequence of several machine learning and bio-statistical techniques to model complex characteristics
-
A Gated Recurrent Unit based architecture for recognizing ontology concepts from biological literature Biodata Min. (IF 4.5) Pub Date : 2022-09-28 Pratik Devkota, Somya D. Mohanty, Prashanti Manda
Annotating scientific literature with ontology concepts is a critical task in biology and several other domains for knowledge discovery. Ontology based annotations can power large-scale comparative analyses in a wide range of applications ranging from evolutionary phenotypes to rare human diseases to the study of protein functions. Computational methods that can tag scientific text with ontology terms
-
Interpretable recurrent neural network models for dynamic prediction of the extubation failure risk in patients with invasive mechanical ventilation in the intensive care unit Biodata Min. (IF 4.5) Pub Date : 2022-09-27 Zhixuan Zeng, Xianming Tang, Yang Liu, Zhengkun He, Xun Gong
The clinical decision to extubate is a challenge in the treatment of patients with invasive mechanical ventilation (IMV), since existing extubation protocols are not capable of precisely predicting extubation failure (EF). This study aims to develop and validate interpretable recurrent neural network (RNN) models for dynamically predicting EF risk. A retrospective cohort study was conducted on IMV patients
-
Machine Learning Algorithms for understanding the determinants of under-five Mortality Biodata Min. (IF 4.5) Pub Date : 2022-09-24 Rakesh Kumar Saroj, Pawan Kumar Yadav, Rajneesh Singh, Obvious N. Chilyabanyama
Under-five mortality is a matter of serious concern for child health as well as the social development of any country. The paper aimed to find the accuracy of machine learning models in predicting under-five mortality and identify the most significant factors associated with under-five mortality. The data was taken from the National Family Health Survey (NFHS-IV) of Uttar Pradesh. First, we used multivariate
-
ParticleChromo3D: a Particle Swarm Optimization algorithm for chromosome 3D structure prediction from Hi-C data Biodata Min. (IF 4.5) Pub Date : 2022-09-21 David Vadnais, Michael Middleton, Oluwatosin Oluwadare
The three-dimensional (3D) structure of chromatin has a massive effect on its function. Because of this, it is desirable to have an understanding of the 3D structural organization of chromatin. To gain greater insight into the spatial organization of chromosomes and genomes and the functions they perform, chromosome conformation capture (3C) techniques, particularly Hi-C, have been developed. The Hi-C
-
Learning and visualizing chronic latent representations using electronic health records Biodata Min. (IF 4.5) Pub Date : 2022-09-05 David Chushig-Muzo, Cristina Soguero-Ruiz, Pablo de Miguel Bohoyo, Inmaculada Mora-Jiménez
Nowadays, patients with chronic diseases such as diabetes and hypertension have reached alarming numbers worldwide. These diseases increase the risk of developing acute complications and involve a substantial economic burden and demand for health resources. The widespread adoption of Electronic Health Records (EHRs) is opening great opportunities for supporting decision-making. Nevertheless, data extracted
-
Analysis of risk factors progression of preterm delivery using electronic health records Biodata Min. (IF 4.5) Pub Date : 2022-08-17 Zeineb Safi, Neethu Venugopal, Haytham Ali, Michel Makhlouf, Faisal Farooq, Sabri Boughorbel
Preterm deliveries have many negative health implications on both mother and child. Identifying the population level factors that increase the risk of preterm deliveries is an important step in the direction of mitigating the impact and reducing the frequency of occurrence of preterm deliveries. The purpose of this work is to identify preterm delivery risk factors and their progression throughout the
-
Neural network methods for diagnosing patient conditions from cardiopulmonary exercise testing data Biodata Min. (IF 4.5) Pub Date : 2022-08-13 Donald E. Brown, Suchetha Sharma, James A. Jablonski, Arthur Weltman
Cardiopulmonary exercise testing (CPET) provides a reliable and reproducible approach to measuring fitness in patients and diagnosing their health problems. However, the data from CPET consist of multiple time series that require training to interpret. Part of this training teaches the use of flow charts or nested decision trees to interpret the CPET results. This paper investigates the use of two