-
Drosophila melanogaster Set8 and L(3)mbt function in gene expression independently of histone H4 lysine 20 methylation bioRxiv. Genom. Pub Date : 2024-03-18 Aaron T Crain, Megan B Butler, Christina A Hill, Mai Huynh, Robert Kendall McGinty, Robert J Duronio
Mono-methylation of Lysine 20 of histone H4 (H4K20me1) is catalyzed by Set8 and thought to play important roles in many aspects of genome function that are mediated by H4K20me-binding proteins. We interrogated this model in a developing animal by comparing in parallel the transcriptomes of Set8null, H4K20R/A, and l(3)mbt mutant Drosophila melanogaster. We found that the gene expression profiles of
-
Long-read single-cell RNA sequencing enables the study of cancer subclone-specific genotype and phenotype in chronic lymphocytic leukemia bioRxiv. Genom. Pub Date : 2024-03-18 Gage S. Black, Xiaomeng Huang, Yi Qiao, Philip Moos, Deepa Sampath, Deborah M. Stephens, Jennifer A. Woyach, Gabor T. Marth
Bruton's tyrosine kinase (BTK) inhibitors are effective for the treatment of chronic lymphocytic leukemia (CLL) due to BTK's role in B cell survival and proliferation. Treatment resistance is most commonly caused by the emergence of the hallmark BTK C481S mutation that inhibits drug binding. In this study, we aimed to investigate whether the presence of additional CLL driver mutations in cancer subclones
-
Biosurfer for systematic tracking of regulatory mechanisms leading to protein isoform diversity bioRxiv. Genom. Pub Date : 2024-03-18 Mayank Murali, Jamie Saquing, Senbao Lu, Ziyang Gao, Ben Jordan, Zachary Wakefield, Ana Fiszbein, David Cooper, Peter Castaldi, Dmitry Korkin, Gloria Sheynkman
Long-read RNA sequencing has shed light on transcriptomic complexity, but questions remain about the functionality of downstream protein products. We introduce Biosurfer, a computational approach for comparing protein isoforms, while systematically tracking the transcriptional, splicing, and translational variations that underlie differences in the sequences of the protein products. Using Biosurfer
-
Delineating the Role of ITGAM in Macrophage Dynamics and Cardiac Modulation during Sepsis-Induced Cardiomyopathy bioRxiv. Genom. Pub Date : 2024-03-18 Qinxue Wang, Haobin Huang
Background: Sepsis-induced cardiomyopathy (SIC) represents a critical complication of sepsis, characterized by reversible myocardial dysfunction and alterations. Despite extensive research, the molecular mechanisms underlying SIC remain poorly understood. Methods: Utilizing bioinformatics analysis of RNA-seq and scRNA-seq data from the GEO database, we identified key immune cell populations and molecular
-
Integrated Analysis Reveals Immunogenic Cell Death in Sepsis-induced Cardiomyopathy bioRxiv. Genom. Pub Date : 2024-03-18 Qinxue Wang, Haobin Huang
Background: Sepsis-induced cardiomyopathy (SIC) poses a significant challenge in critical care, necessitating comprehensive understanding and innovative diagnostic approaches. This study explores the immune-related molecular intricacies underlying SIC, employing bioinformatics analyses and machine learning techniques. Methods: RNA-seq and scRNA-seq datasets (GSE79962 and GSE190856) were obtained from
-
CROPseq-multi: a versatile solution for multiplexed perturbation and decoding in pooled CRISPR screens bioRxiv. Genom. Pub Date : 2024-03-17 Russell T Walton, Yue Qin, Paul C Blainey
Forward genetic screens seek to dissect complex biological systems by systematically perturbing genetic elements and observing the resulting phenotypes. While standard screening methodologies introduce individual perturbations, multiplexing perturbations improves the performance of single-target screens and enables combinatorial screens for the study of genetic interactions. Current tools for multiplexing
-
Large-scale composite hypothesis testing for omics analyses. bioRxiv. Genom. Pub Date : 2024-03-17 Annaïg De Walsche, Franck Gauthier, Alain Charcosset, Tristan Mary-Huard
Composite Hypothesis Testing (CHT) based on summary statistics has become a popular strategy to assess the effect of the same marker (or gene) jointly across multiple traits or at different omics levels. Although significant efforts have been made to develop efficient CHT procedures, most approaches face scalability constraints in terms of the number of traits/omics and markers to handle, or fail to
-
EPA induces an anti-inflammatory transcriptomic landscape in T cells implicating a pathway independent of triglyceride lowering in CVD risk reduction bioRxiv. Genom. Pub Date : 2024-03-17 Nathalie Amara Reilly, Koen Dekkers, Jeroen Molenaar, Sinthuja Arumugam, Thomas B. Kuipers, Yavuz Ariyurek, Marten A. Hoeksema, J. Wouter Jukema, Bastiaan T Heijmans
A twice-daily dose of highly purified eicosapentaenoic acid (EPA) reduces the risk of atherosclerotic cardiovascular disease among patients with high triglycerides and either known cardiovascular disease or those at high risk for developing it. However, the process by which EPA exerts its beneficial effects remains poorly understood. Here, we show that EPA can induce an anti-inflammatory transcriptional
-
Interpretable and predictive models to harness the life science data revolution bioRxiv. Genom. Pub Date : 2024-03-17 Joshua P Jahner, C. Alex Buerkle, Dustin G Gannon, Eliza M Grames, S. Eryn McFarlane, Andrew Siefert, Katherine L Bell, Victoria L DeLeo, Matthew L Forister, Joshua G Harrison, Daniel C Laughlin, Amy C Patterson, Breanna F Powers, Chhaya M Werner, Isabella A Oleksy
The proliferation of biological data with large numbers of samples and many dimensions is kindling hope that life scientists will be able to fit statistical and machine learning models that are highly predictive and interpretable. However, large biological data sets are commonly burdened with an inherent trade-off: in-sample prediction will improve as additional predictors are included in the model
-
Identification of novel genes in cattle (Bos taurus) and biological insights into their function in embryo development bioRxiv. Genom. Pub Date : 2024-03-17 Gustavo P Schettini, Michael Morozyuk, Fernando H Biase
Appropriate regulation of genes expressed in oocytes and embryos is essential for acquisition of developmental competence in mammals. Here, we hypothesized that several genes expressed in oocytes and pre-implantation embryos remain unknown. Our goal was to reconstruct the transcriptome of oocytes (germinal vesicle and metaphase II) and pre-implantation cattle embryos (blastocysts) using short-read
-
Systematic Analysis of Biological Processes Reveals Gene Co-expression Modules Driving Pathway Dysregulation in Alzheimer's Disease bioRxiv. Genom. Pub Date : 2024-03-17 Temitope O Adeoye, Syed I Shah, Ghanim Ullah
Alzheimer's disease (AD) manifests as a complex systems pathology with intricate interplay among various genes and biological processes. Traditional differential gene expression (DEG) analysis, while commonly employed to characterize AD-driven perturbations, does not sufficiently capture the full spectrum of underlying biological processes. Utilizing single-nucleus RNA-sequencing data from postmortem
-
A transcription factor (TF) inference method that broadly measures TF activity and identifies mechanistically distinct TF networks bioRxiv. Genom. Pub Date : 2024-03-16 Taylor Jones, Rutendo F Sigauke, Lynn Sanford, Dylan J Taatjes, Mary A Allen, Robin D Dowell
TF profiler is a method of inferring transcription factor regulatory activity, i.e. when a TF is present and actively regulating transcription, directly from nascent sequencing assays such as PRO-seq and GRO-seq. Transcription factors orchestrate transcription and play a critical role in cellular maintenance, identity and response to external stimuli. While ChIP assays have measured DNA
-
A first look at the genome structure of hexaploid 'Black Mitcham' peppermint (Mentha piperita L.) bioRxiv. Genom. Pub Date : 2024-03-16 Samuel C. Talbot, Kelly J. Vining, B. Markus Lange, Iovanna Pandelova
Peppermint, Mentha × piperita L., is a hexaploid (2n = 6x = 72) and the predominant cultivar of commercial mint oil production in the US. This cultivar is threatened because of high susceptibility to the fungal disease Verticillium wilt, caused by Verticillium dahliae. This report details the first draft polyploid chromosome-level genome assembly for this mint species. The Black Mitcham genome resource
-
Transposable element methylation state predicts age and disease bioRxiv. Genom. Pub Date : 2024-03-16 Francesco Morandini, Jinlong Yuyang Lu, Cheyenne Rechsteiner, Aladdin H. Shadyab, Ramon Casanova, Beverly Snively, Andrei Seluanov, Vera Gorbunova
Transposable elements (TEs) are DNA sequences that expand selfishly in the genome, possibly causing severe cellular damage. While normally silenced, TEs have been shown to activate during aging. DNA methylation is one of the main mechanisms by which TEs are silenced and has been used to train highly accurate age predictors. Yet, one common criticism of such predictors is that they lack interpretability
-
Exploring the probiotic potential, antioxidant capacity, and healthy aging based on whole genome analysis of Lactiplantibacillus plantarum LPJBC5 isolated from fermented milk product bioRxiv. Genom. Pub Date : 2024-03-16 Anupam Bhattacharya, Tulsi K. Joishy, Mojibur R. Khan
Lactiplantibacillus plantarum is a beneficial bacterium commonly found in fermented foods, including fermented milk products. In the present study, we reported the whole genome sequence of L. plantarum LPJBC5. The complete genome sequence of LPJBC5 was 3.23Mb, and the average GC% was found to be 44.55% encoding a total of 3016 genes. A comprehensive analysis of the LPJBC5 genome detected major carbohydrate-active
-
Sparse Testing Designs for Optimizing Predictive Ability in Sugarcane Populations bioRxiv. Genom. Pub Date : 2024-03-16 Julian Garcia-Abadillo, Paul Adunola, Fernando S Aguilar, Jhon Henry Trujillo-Montenegro, John Jaime Riascos, Reyna Persa, Julio Isidro Y Sanchez, Diego Jarquin
Sugarcane is a crucial crop for sugar and bioenergy production. Saccharose content and total weight are the two key commercial traits that compose sugarcane yield. These traits are under complex genetic control and their response patterns are influenced by the genotype-by-environment (G*E) interaction. Efficient breeding of sugarcane demands an accurate assessment of the genotype stability
-
Dissection of core promoter syntax through single nucleotide resolution modeling of transcription initiation bioRxiv. Genom. Pub Date : 2024-03-16 Adam Y He, Charles G Danko
Our understanding of how the DNA sequences of cis-regulatory elements encode transcription initiation patterns remains limited. Here we introduce CLIPNET, a deep learning model trained on population-scale PRO-cap data that accurately predicts the position and quantity of transcription initiation with single nucleotide resolution from DNA sequence. Interpretation of CLIPNET revealed a complex regulatory
-
Pivotal role of biallelic frequency analysis in identifying copy number alterations using genome-wide methods in tumors with a high level of aneuploidy bioRxiv. Genom. Pub Date : 2024-03-16 Julia Rymuza, Renata Woroniecka, Beata Grygalewicz, Mateusz Bujko
Chromosome number abnormalities are one of the hallmarks of cancer. DNA copy number alterations (CNA) are studied using various genome-wide methods. In our study we investigated CNA in human pituitary tumors using three platforms: CytoSNP-850K microarrays, low-pass whole-genome sequencing (average 7x coverage, LPWGS), and Infinium Methylation EPIC array. Virtual karyotypes based on each dataset
-
SegmentNT: annotating the genome at single-nucleotide resolution with DNA foundation models bioRxiv. Genom. Pub Date : 2024-03-15 Bernardo P de Almeida, Hugo Dalla-Torre, Guillaume Richard, Christopher Blum, Lorenz Hexemer, Maxence Gelard, Priyanka Pandey, Stefan Laurent, Alexandre Laterre, Maren Lang, Ugur Sahin, Karim Beguir, Thomas Pierrot
Foundation models have achieved remarkable success in several fields such as natural language processing, computer vision and more recently biology. DNA foundation models in particular are emerging as a promising approach for genomics. However, so far no model has delivered granular, nucleotide-level predictions across a wide range of genomic and regulatory elements, limiting its practical usefulness
-
Comparison of Single-cell Long-read and Short-read Transcriptome Sequencing of Patient-derived Organoid Cells of ccRCC: Quality Evaluation of the MAS-ISO-seq Approach bioRxiv. Genom. Pub Date : 2024-03-15 Natalia Zajac, Qin Zhang, Anna Bratus-Neuenschwander, Weihong Qi, Hella Anna Bolck, Tulay Karakulak, Tamara Carrasco Oltra, Holger Moch, Abdullah Kahraman, Hubert Rehrauer
Single-cell RNA sequencing is used in profiling gene expression differences between cells. Short-read sequencing platforms provide high throughput and high-quality information at the gene level, but the technique is hindered by limited read length, failing to provide an understanding of cell heterogeneity at the isoform level. This gap has recently been addressed by the long-read sequencing platforms
-
Development of a novel GWAS method to detect QTL effects interacting with the discrete and continuous population structure bioRxiv. Genom. Pub Date : 2024-03-15 Kosuke Hamazaki, Hiroyoshi Iwata, Tristan Mary-Huard
Although GWAS has been a key technology to identify causal genes, the current standard GWAS model still has problems that need to be solved. Among them, the population structure is one of the most severe problems when detecting QTLs in GWAS since the GWAS model is statistically confounded by effects derived from the population structure. Further, the existence of QTLs, whose effects depend on the genetic
-
Spaceflight causes strain-dependent gene expression changes associated with lipid and extracellular matrix dysregulation in the mouse kidney in vivo bioRxiv. Genom. Pub Date : 2024-03-15 Rebecca H. Finch, Geraldine Vitry, Keith Siew, Stephen B Walsh, Afshin Beheshti, Gary Hardiman, Willian Abraham da Silveira
To explore new worlds we must ensure humans can survive and thrive in the space environment. Incidence of kidney stones in astronauts is a major risk factor associated with long term missions, caused by increased blood calcium levels due to bone demineralisation triggered by microgravity and space radiation. Transcriptomic changes have been observed in other tissues during spaceflight, including the
-
Genome sequences of four Ixodes species expands understanding of tick evolution bioRxiv. Genom. Pub Date : 2024-03-15 Alexandra Cerqueira de Araujo, Benjamin Noel, Anthony Bretaudeau, Karine Labadie, Mateo Boudet, Nachida Tadrent, Benjamin Istace, Salima Kritli, Corinne Cruaud, Robert Olaso, Jean-Francois Deleuze, Maarten Voordouw, Caroline Hervet, Olivier Plantard, Aya Zamoto-Niikura, Thomas Chertemps, Martine Maibeche, Frederique Hilliou, Gaelle Le Goff, Petr Kopacek, Jan Perner, Jindrich Chmelar, Vilem Mazak, Mohammed
Ticks, hematophagous acari, pose a significant threat by transmitting various pathogens to their vertebrate hosts during feeding. Despite advances in tick genomics, high-quality genomes were lacking until recently, particularly in the genus Ixodes, which includes the main vectors of Lyme disease. Here, we present the complete genome sequences of four tick species, derived from a single female individual
-
Transcriptomic and epigenomic consequences of heterozygous loss of function mutations in AKAP11, the first large-effect shared risk gene for bipolar disorder and schizophrenia bioRxiv. Genom. Pub Date : 2024-03-15 Nargess Farhangdoost, Calwing Liao, Yumin Liu, Martin Alda, Patrick A. Dion, Guy Rouleau, Anouar Khayachi, Boris Chaumette
The gene A-kinase anchoring protein 11 (AKAP11) recently emerged as a shared risk factor between bipolar disorder and schizophrenia, driven by large-effect loss-of-function (LoF) variants. Recent research has uncovered the neurophysiological characteristics and synapse proteomics profile of Akap11-mutant mouse models. Considering the role of AKAP11 in binding cAMP-dependent protein kinase A (PKA) and
-
Gene expression signatures of response to fluoxetine treatment: systematic review and meta-analyses bioRxiv. Genom. Pub Date : 2024-03-15 David G. Cooper, J. Paige Cowden, Parker A. Stanley, Jack T. Karbowski, Victoria S. Gaertig, Caiden J. Lukan, Patrick M. Vo, Ariel D. Worthington, Caleb A. Class
Background: Selecting the best antidepressant for a patient with major depressive disorder (MDD) remains a challenge, and some have turned to genomic (and other 'omic) data to identify an optimal therapy. In this work, we synthesized gene expression data for fluoxetine treatment in both human patients and rodent models, to better understand biological pathways affected by treatment, as well as those
-
The Histone Chaperone Spn1 Preserves Subnucleosomal Structures at Promoters and Nucleosome Positioning in Open Reading Frames bioRxiv. Genom. Pub Date : 2024-03-14 Andrew J Tonsager, Alexis Zukowski, Catherine A Radebaugh, Abigail Weirich, Laurie A Stargell, Srinivas Ramachandran
Spn1 is a multifunctional histone chaperone essential for life in eukaryotes. While previous work has elucidated regions of the protein important for its many interactions, it is unknown how these domains contribute to the maintenance of chromatin structure. Here, we employ digestion by micrococcal nuclease followed by single-stranded library preparation and sequencing (MNase-SSP) to characterize chromatin
-
Exploring the Genomic Landscape of the GP63 family in Trypanosoma cruzi: Evolutionary Dynamics and Functional Peculiarities bioRxiv. Genom. Pub Date : 2024-03-14 Luisa Berna, Maria Laura Chiribao, Sebastian Pita, Fernando Alvarez-Valin, Adriana Parodi-Talice
We analyzed the complete set of GP63 sequences from the parasitic protozoa Trypanosoma cruzi. Our analysis allowed us to refine annotation of sequences previously identified as functional and pseudogenes. Concerning the latter, we unified pseudogenic fragments derived from the same functional gene and excluded sequences incorrectly annotated as GP63 pseudogenes. We were able to identify eleven GP63
-
GEGA (Gallus Enriched Gene Annotation): an online tool providing genomics and functional information across 47 tissues for a chicken gene-enriched atlas gathering Ensembl & Refseq genome annotations bioRxiv. Genom. Pub Date : 2024-03-14 Fabien Degalez, Philippe Bardou, Sandrine Lagarrigue
GEGA is a user-friendly tool to navigate through different genomics and functional information related to an enriched gene atlas in chicken that unifies the gene catalogues from the two reference databases, NCBI-RefSeq & EMBL-Ensembl/GENCODE, and four additional rich resources such as FAANG and NONCODE. Using the latest GRCg7b genome assembly, GEGA offers a total of 78,323 genes, including 24,102
-
Spotlight on 10x Visium: a multi-sample protocol comparison of spatial technologies bioRxiv. Genom. Pub Date : 2024-03-14 Mei R.M. Du, Changqing Wang, Charity W. Law, Daniela Amann-Zalcenstein, Casey J.A. Anttila, Ling Ling, Peter F. Hickey, Callum J. Sargeant, Yunshun Chen, Lisa J. Ioannidis, Pradeep Rajasekhar, Raymond K.H. Yip, Kelly L. Rogers, Diana S. Hansen, Rory Bowden, Matthew E. Ritchie
Background: Spatial transcriptomics allows gene expression to be measured within complex tissue contexts. Among the array of spatial capture technologies available is 10x Genomics' Visium platform, a popular method which enables transcriptome-wide profiling of tissue sections. Visium offers a range of sample handling and library construction methods which introduces a need for benchmarking to compare
-
Dimeric R25CPTH(1-34) Activates the Parathyroid Hormone-1 Receptor in vitro and Stimulates Bone Formation in Osteoporotic Female Mice bioRxiv. Genom. Pub Date : 2024-03-14 Minsoo Noh, Xiangguo Che, Xian Jin, Dong-Kyo Lee, Hyun-Ju Kim, Doo Ri Park, Soo Young Lee, Hunsang Lee, Thomas J. Gardella, Je-Yong Choi, Sihoon Lee
Osteoporosis, characterized by reduced bone density and strength, increases fracture risk and pain and limits mobility. Although established therapies based on parathyroid hormone (PTH) analogs effectively promote bone formation and reduce fractures in severe osteoporosis, their use is limited by potential adverse effects. In the pursuit of safer osteoporosis treatments, we investigated R25CPTH, a PTH variant wherein
-
WGBS of Differentiating Adipocytes Reveals Variations in DMRs and Context-Dependent Gene Expression bioRxiv. Genom. Pub Date : 2024-03-14 Binduma Yadav, Dalwinder Singh, Shrikant Mantri, Vikas Rishi
Obesity, characterised by the accumulation of excess fat, is a complex condition resulting from the combination of genetic and epigenetic factors. Recent studies have found correspondence between DNA methylation and cell differentiation, suggesting a role of the former in cell fate determination. There is a lack of comprehensive understanding concerning the underpinnings of preadipocyte differentiation
-
Automated high-throughput profiling of single-cell total transcriptome with scComplete-seq bioRxiv. Genom. Pub Date : 2024-03-14 Fatma Betul Dincaslan, Shaun Wei Yang Ngang, Rui Zhen Tan, Lih Feng Cheow
Detecting the complete portrait of the transcriptome is essential to understanding the roles of both polyadenylated and non-polyadenylated RNA species. However, current efforts to investigate the heterogeneity of the total cellular transcriptome in single cells are limited by the lack of an automated, high-throughput assay that can be carried out on existing platforms. To address this issue, we developed
-
Deciphering Immune Complexity: Single-Cell Insights into Autoimmune Myocarditis Progression bioRxiv. Genom. Pub Date : 2024-03-14 Farag Mamdouh, Waleed K. Abdulsahib, Dalal Sulaiman Alshaya, Eman Fayad, Refaat A. Eid, Ghadi Alsharif, Mohammed A. Alshehri, Hassan M. Otifi, Mohamed A. Soltan, Muhammad Alaa Eldeen
Autoimmune myocarditis is a complex inflammatory response in the heart caused by abnormal immune system activity. We used modern single-cell technologies to analyze the complex gene expression patterns in autoimmune myocarditis tissue samples during several stages of inflammation: acute, subacute, and chronic. We identified the presence of T cell-monocyte complexes in both control and myocarditis samples
-
SLRfinder: a method to detect candidate sex-linked regions with linkage disequilibrium clustering bioRxiv. Genom. Pub Date : 2024-03-14 Xueling Yi, Petri Kemppainen, Juha Merila
Despite their critical roles in genetic sex determination, sex chromosomes remain unknown in many non-model organisms. In contrast to conserved sex chromosomes in mammals and birds, studies of fish, amphibians, and reptiles have found highly labile sex chromosomes with newly evolved sex-linked regions (SLRs). These labile sex chromosomes are important for understanding early sex chromosome evolution
-
The GC-content at the 5' ends of human protein-coding genes is undergoing mutational decay bioRxiv. Genom. Pub Date : 2024-03-14 Yi Qiu, Yoon Mo Kang, Christopher Korfmann, Fanny Pouyet, Andrew Eckford, Alexander F Palazzo
In vertebrates, most protein-coding genes have a peak of GC-content near their 5' transcriptional start site (TSS). This feature promotes both the efficient nuclear export and translation of mRNAs. Despite the importance of GC-content for RNA metabolism, its general features, origin, and maintenance remain mysterious. We investigated the evolutionary forces shaping GC-content at the transcriptional
-
The dynamic genomes of Hydra and the anciently active repeat complement of animal chromosomes bioRxiv. Genom. Pub Date : 2024-03-14 Koto Kon-Nanjo, Tetsuo Kon, Tracy Chih-Ting Koubkova Yu, Diego Rodriguez-Terrones, Francisco Falcon, Daniel E. Martinez, Robert E. Steele, Elly Margaret Tanaka, Thomas W Holstein, Oleg Simakov
Many animal genomes are characterized by highly conserved chromosomal homologies that pre-date the ancient origin of this clade. Despite such conservation, the evolutionary forces behind the retention, expansion, and contraction of chromosomal elements, and the resulting macro-evolutionary implications, are unknown. Here we present a comprehensive stem-cell resolved genomic and transcriptomic study
-
The repetitive genome of the Ixodes ricinus tick reveals transposable elements have driven genome evolution in ticks bioRxiv. Genom. Pub Date : 2024-03-14 Isobel Ronai, Rodrigo de Paula Baptista, Nicole S. Paulat, Julia Frederick, Tal Azagi, Julian Bakker, Katie C. Dillon, Hein Sprong, David A Ray, Travis Glenn
Ticks are obligate blood-feeding parasites associated with a huge diversity of diseases globally. The hard tick Ixodes ricinus is the key vector of Lyme borreliosis and tick-borne encephalitis in Western Eurasia. Ixodes ticks have large and repetitive genomes that are not yet well characterized. Here we generate two high-quality I. ricinus genome assemblies, with haploid genome sizes of approximately
-
QClus: A droplet-filtering algorithm for enhanced snRNA-seq data quality in challenging samples bioRxiv. Genom. Pub Date : 2024-03-13 Eloi Schmauch, Johannes Ojanen, Kyriakitsa Galani, Juho Jalkanen, Kristiina Harju, Maija Hollmen, Hannu Kokki, Jarmo Gunn, Jari Halonen, Juha Hartikainen, Tuomas Kiviniemi, Pasi Tavi, Minna U Kaikkonen, Manolis Kellis, Suvi Linna-Kuosmanen
Single nuclei RNA sequencing (snRNA-seq) remains a challenge for many human tissues, as incomplete removal of background signal masks cell-type-specific signals and interferes with downstream analyses. Here, we present QClus, a droplet-filtering algorithm targeted toward challenging samples, using cardiac tissue as an example. QClus uses specific metrics such as cell-type-specific marker gene expression
-
Identification of novel myelodysplastic syndromes prognostic subgroups by integration of inflammation, cell-type composition, and immune signatures in the bone marrow bioRxiv. Genom. Pub Date : 2024-03-13 Sila Gerlevik, Shan Hama, Nogayhan Seymen, Warisha Mumtaz, Ian Richard Thompson, Seyed R Jalili, Deniz E Kaya, Alfredo Iacoangeli, Andrea Pellagatti, Jacqueline Boultwood, Giorgio Napolitani, Ghulam J. Mufti, Mohammad M Karimi
Mutational profiles of Myelodysplastic syndromes (MDS) have established that a relatively small number of genetic aberrations, including SF3B1 and SRSF2 spliceosome mutations, lead to specific phenotypes and prognostic subgrouping. We performed a Multi-Omics Factor Analysis (MOFA) on two published MDS cohorts of bone marrow mononuclear cells (BMMNCs) and CD34+ cells with three data modalities (clinical
-
Copy number variants underlie the major selective sweeps in insecticide resistance genes in Anopheles arabiensis from Tanzania. bioRxiv. Genom. Pub Date : 2024-03-13 Eric R Lucas, Sanjay C Nagi, Bilali Kabula, Bernard Batengana, William Kisinza, Alexander Egyir-Yawson, John Essandoh, Sam Dadzie, Joseph Chabi, Arjen E Van't Hof, Emily J Rippon, Dimitra Pipini, Nicholas J Harding, Naomi A Dyer, Chris S Clarkson, Alistair Miles, David Weetman, Martin J Donnelly
To keep ahead of the evolution of resistance to insecticides in mosquitoes, national malaria control programmes must make use of a range of insecticides, both old and new, while monitoring resistance mechanisms. Knowledge of the mechanisms of resistance remains limited in Anopheles arabiensis, which in many parts of Africa is of increasing importance because it is apparently less susceptible to many
-
NEMO: Improved and accurate models for identification of 6mA using Nanopore sequencing bioRxiv. Genom. Pub Date : 2024-03-13 Onkar Vasantrao Kulkarni, Lamuk Zaveri, Nitesh Kumar Singh, Reuben Jacob Mathew, Sreenivas Ara, Shambhavi Garde, Manjula Reddy, Karthik Bharadwaj Tallapaka, Divya Tej Sowpati
DNA methylation plays a key role in epigenetic regulation across lifeforms. Nanopore sequencing enables direct detection of base modifications. While multiple tools are currently available for studying 5-methylcytosine (5mC), there is a paucity of models that can detect 6-methyladenine (6mA) from raw nanopore data. Leveraging the motif-driven nature of bacterial methylation systems, we generated 6mA
-
Unravelling the Transcriptomic Symphony of Sarcopenia: Key Pathways and Hub Genes Altered by Muscle Ageing and Caloric Restriction Revealed by RNA Sequencing bioRxiv. Genom. Pub Date : 2024-03-13 Gulam Altab, Brian J. Merry, Charles W. Beckett, Priyanka Raina, Ines Lopes, Katarzyna Goljanek-Whysall, Joao Pedro de Magalhaes
Sarcopenia is a disease involving extensive loss of muscle mass and strength with age and is a major cause of disability and accidents in the elderly. Mechanisms purported to be involved in muscle ageing and sarcopenia are numerous but poorly understood, necessitating deeper study. Hence, we employed high-throughput RNA sequencing to explicate the global changes in protein-coding gene expression occurring
-
Transcriptional bursting, gene activation, and roles of SAGA and Mediator Tail measured using nucleotide recoding single cell RNA-seq bioRxiv. Genom. Pub Date : 2024-03-13 Jeremy A Schofield, Steven Hahn
A time resolved nascent single-cell RNA-seq approach was developed to dissect gene-specific transcriptional bursting and the roles of SAGA and Mediator Tail (the activator-binding module). Most yeast genes show near-constitutive behavior while only a subset of genes show high mRNA variance suggestive of transcription bursting. Bursting behavior is highest in the coactivator redundant (CR) gene class
-
Chromosome-scale genome assembly of Apocynum pictum, a drought-tolerant medicinal plant from the Tarim Basin bioRxiv. Genom. Pub Date : 2024-03-13 Wenlong Xie, Baowei Bai, Yanqing Wang
Apocynum pictum Schrenk is a semi-shrub of the Apocynaceae family with a wide distribution throughout the Tarim Basin that holds significant ecological, medicinal, and economic values. Here, we report the assembly of its chromosome-level reference genome using Nanopore long-read, Illumina HiSeq paired-end, and high-throughput chromosome conformation capture sequencing. The final assembly is 225.32
-
Genomic context sensitizes regulatory elements to genetic disruption bioRxiv. Genom. Pub Date : 2024-03-12 Raquel Ordoñez, Weimin Zhang, Gwen Ellis, Yinan Zhu, Hannah J Ashe, André M Ribeiro-dos-Santos, Ran Brosh, Emily Huang, Megan S Hogan, Jef D Boeke, Matthew T Maurano
Enhancer function is frequently investigated piecemeal using truncated reporter assays or single deletion analysis. Thus it remains unclear to what extent enhancer function at native loci relies on surrounding genomic context. Using the Big-IN technology for targeted integration of large DNAs, we analyzed the regulatory architecture of the murine Igf2/H19 locus, a paradigmatic model of enhancer selectivity
-
Globally distributed bacteriophage genomes reveal mechanisms of tripartite phage-bacteria-coral interactions bioRxiv. Genom. Pub Date : 2024-03-12 Bailey A. Wallace, Natascha S Varona, Poppy Hesketh-Best, Cynthia Silveira
Reef-building corals depend on an intricate community of microorganisms for functioning and resilience. Bacteriophages are the most abundant and diverse members of these communities, yet very little is known about their functions in the holobiont due to methodological limitations that have prevented the recovery of high-quality viral genomes and bacterial host assignment from coral samples. Here, we
-
De novo assembly and characterization of a highly degenerated ZW sex chromosome in the fish Megaleporinus macrocephalus bioRxiv. Genom. Pub Date : 2024-03-12 Carolina Heloisa de Souza Borges, Ricardo Utsunomia, Alessandro M Varani, Marcela Uliano da Silva, Lieschen Valeria Guerra, Arno Juliano Butzge, John Fredy Gomez Agudelo, Shisley Manso, Milena Vieira Freitas, Raquel Belini Ariede, Vito Antonio Mastrochirico-Filho, Carolina Penaloza, Agustin Barria, Fabio Porto-Foresti, Fausto Foresti, Ricardo Shohei Hattori, Yann Guiguen, Ross D. Houston, Diogo Teruo
Background: Megaleporinus macrocephalus (piaucu) is a Neotropical fish within Characoidei that presents a well-established heteromorphic ZZ/ZW sex-determination system and thus constitutes a good model for studying W and Z chromosomes in fishes. We used PacBio reads and Hi-C to assemble a chromosome-level reference genome for M. macrocephalus. We generated family segregation information to construct
-
Inference of Transcriptional Regulation From STARR-seq Data bioRxiv. Genom. Pub Date : 2024-03-12 Amin Safaeesirat, Hoda Taeb, Emirhan Tekoglu, Tunc Morova, Nathan Lack, Eldon Emberly
One of the primary regulatory processes in cells is transcription, during which RNA polymerase II (Pol-II) transcribes DNA into RNA. The binding of Pol-II to its site is regulated through interactions with transcription factors (TFs) that bind to DNA at enhancer cis-regulatory elements. Measuring the enhancer activity of large libraries of distinct DNA sequences is now possible using Massively Parallel
-
Revisiting Y-chromosome detection methods: R-CQ and KAMY efficiently identify Y chromosome sequences in Tephritidae insect pests bioRxiv. Genom. Pub Date : 2024-03-11 Dimitris Rallis, Konstantina T Tsoumani, Flavia Krsticevic, Philippos Aris Papathanos, Kostas D Mathiopoulos, Alexie Papanicolaou
Background: The repetitive and heterochromatic nature of Y chromosomes poses challenges for genome assembly methods, which can lead to fragmented or misassembled scaffolds. While new sequencing technologies and assembly techniques are becoming popular, tools for improving the generation of an accurate Y chromosome are limited, especially for species, such as insects, with a frequent occurrence of heterochromatic
-
Imaging genetics of language network functional connectivity reveals links with language-related abilities, dyslexia and handedness bioRxiv. Genom. Pub Date : 2024-03-11 Jitse S. Amelink, Merel C. Postema, Xiang-Zhen Kong, Dick Schijven, Amaia Carrion-Castillo, Sourena Soheili-Nezhad, Zhiqiang Sha, Barbara Molz, Marc Joliot, Simon E. Fisher, Clyde Francks
Language is supported by a distributed network of brain regions with a particular contribution from the left hemisphere. A multi-level understanding of this network requires studying its genetic architecture. We used resting-state imaging data from 29,681 participants (UK Biobank) to measure connectivity between 18 left-hemisphere regions involved in multimodal sentence-level processing, as well as
-
MotifScope: a multi-sample motif discovery and visualization tool for tandem repeats bioRxiv. Genom. Pub Date : 2024-03-11 Yaran Zhang, Marc Hulsman, Alex Salazar, Niccolo Tesi, Lydian Knoop, Sven van der Lee, Sanduni Wijesekera, Jana Krizova, Erik-Jan Kamsteeg, Henne Holstege
Tandem repeats (TRs) constitute a significant portion of the human genome, exhibiting high levels of polymorphism due to variations in size and motif composition. These variations have been associated with various neuropathological disorders, underscoring the clinical importance of TRs. Furthermore, the motif structure of these repeats can offer valuable insights into evolutionary dynamics and population
-
Scalable summary statistics-based heritability estimation method with individual genotype level accuracy bioRxiv. Genom. Pub Date : 2024-03-11 Moonseong Jeong, Ali Pazokitoroudi, Zhengtong Liu, Sriram Sankararaman
SNP heritability, the proportion of phenotypic variation explained by genotyped SNPs, is an important parameter in understanding the genetic architecture underlying various diseases and traits. Methods that aim to estimate SNP heritability from individual genotype and phenotype data are limited by their ability to scale to Biobank-scale datasets and by the restrictions in access to individual-level
-
Introducing CHiDO a No Code Genomic Prediction Software implementation for the Characterization & Integration of Driven Omics bioRxiv. Genom. Pub Date : 2024-03-11 Francisco Gonzalez, Julian Garcia-Abadillo, Diego Jarquin
Climate change represents a significant challenge to global food security by altering environmental conditions critical to crop growth. Plant breeders can play a key role in mitigating these challenges by developing more resilient crop varieties; however, these efforts require significant investments in resources and time. In response, it is imperative to use current technologies that assimilate large
-
A single cell RNA sequence atlas of the early Drosophila larval eye bioRxiv. Genom. Pub Date : 2024-03-10 Komal Kumar Bollepogu Raja, Kelvin Yeung, Yumei Li, Rui Chen, Graeme Mardon
The Drosophila eye has been an important model to understand principles of differentiation, proliferation, apoptosis and tissue morphogenesis. However, a single cell RNA sequence resource that captures gene expression dynamics from the initiation of differentiation to the specification of different cell types in the larval eye disc is lacking. Here, we report transcriptomic data from 13,000 cells that
-
A metagenomics pipeline reveals insertion sequence-driven evolution of the microbiota bioRxiv. Genom. Pub Date : 2024-03-09 Joshua M Kirsch, Andrew J Hryckowian, Breck A Duerkop
Insertion sequence (IS) elements are mobile genetic elements in bacterial genomes that support adaptation. We developed a database of IS elements coupled to a computational pipeline that identifies IS element insertions in the microbiota. We discovered that diverse IS elements insert into the genomes of intestinal bacteria regardless of human host lifestyle. These insertions target bacterial accessory
-
Discovering Root Causal Genes with High Throughput Perturbations bioRxiv. Genom. Pub Date : 2024-03-09 Eric V Strobl, Eric R Gamazon
Root causal gene expression levels -- or root causal genes for short -- correspond to the initial changes to gene expression that generate patient symptoms as a downstream effect. Identifying root causal genes is critical towards developing treatments that modify disease near its onset, but no existing algorithms attempt to identify root causal genes from data. RNA-sequencing (RNA-seq) data introduces
-
Signatures of transposon-mediated genome inflation, host specialization, and photoentrainment in Entomophthora muscae and allied entomophthoralean fungi bioRxiv. Genom. Pub Date : 2024-03-09 Jason E Stajich, Brian Lovett, Emily Lee, Angie M Macias, Ann E Hajek, Benjamin L de Bivort, Matt T Kasson, Henrik H De Fine Licht, Carolyn Elya
Despite over a century of observations, the obligate insect parasites within the order Entomophthorales remain poorly characterized at the genetic level. This is in part due to their large genome sizes and difficulty in obtaining sequenceable material. In this manuscript, we leveraged a recently-isolated, laboratory-tractable Entomophthora muscae isolate and improved long-read sequencing to obtain
-
Regional Plasmodium falciparum subpopulations and malaria transmission connectivity in Africa were detected with an enlarged panel of genome-wide microsatellite loci bioRxiv. Genom. Pub Date : 2024-03-08 Martha Anita Demba, Edwin Kamau, Jaishree Raman, Karim Mane, Lucas Amenga-Etego, Tobias Apinjo, Deus Judith Ishengoma, Lemu Golassa, Oumou Maiga, Anita Ghansa, Marielle Bouyou-Akotet, William Yavo, Milijoana Randrianarivelojosia, Fadel Muhammadou Diop, Eniyou Cheryll Oriero, David Jeffries, Umberto D'Alessandro, Abdoulaye Djimde, Alfred Amambua-Ngwa
Unravelling the genetic diversity of the Plasmodium falciparum malaria parasite provides critical information on how populations are affected by interventions and the environment, especially the evolution of molecular markers associated with parasite fitness and adaptation to drugs and vaccines. This study expands previous studies based on small sets of microsatellite loci, which often showed limited substructure
-
HyperGen: Compact and Efficient Genome Sketching using Hyperdimensional Vectors bioRxiv. Genom. Pub Date : 2024-03-08 Weihong Xu, Po-Kai Hsu, Niema Moshiri, Shimeng Yu, Tajana Rosing
Motivation: Genomic distance estimation is a critical workload since exact computation for whole-genome similarity metrics such as Average Nucleotide Identity (ANI) incurs prohibitive runtime overhead. Genome sketching is a fast and memory-efficient solution to estimate ANI similarity by distilling representative k-mers from the original sequences. In this work, we present HyperGen that improves accuracy
-
Addressing technical pitfalls in pursuit of molecular factors that mediate immunoglobulin gene regulation bioRxiv. Genom. Pub Date : 2024-03-08 Eric T Engelbrecht, Oscar L. Rodriguez, Corey T. Watson
The expressed antibody repertoire is a critical determinant of immune-related phenotypes. Antibody-encoding transcripts are distinct from other expressed genes because they are transcribed from somatically rearranged gene segments. Human antibodies are composed of two identical heavy and light chain polypeptides derived from genes in the immunoglobulin heavy chain (IGH) locus and one of two light chain