Gene prediction in the immunoglobulin loci

  1. Pavel A. Pevzner1
  1. 1Computer Science and Engineering Department, University of California San Diego, San Diego, California 92093, USA;
  2. 2The Department of Computer Science, Johns Hopkins University, Baltimore, Maryland 21218, USA
  • Corresponding author: ppevzner{at}ucsd.edu
  • Abstract

    The V(D)J recombination process rearranges the variable (V), diversity (D), and joining (J) genes in the immunoglobulin (IG) loci to generate antibody repertoires. Annotation of these loci across various species and predicting the V, D, and J genes (IG genes) are critical for studies of the adaptive immune system. However, because the standard gene finding algorithms are not suitable for predicting IG genes, they have been semimanually annotated in very few species. We developed the IGDetective algorithm for predicting IG genes and applied it to species with the assembled IG loci. IGDetective generated the first large collection of IG genes across many species and enabled their evolutionary analysis, including the analysis of the “bat IG diversity” hypothesis. This analysis revealed extremely conserved V genes in evolutionary distant species, indicating that these genes may be subjected to the same selective pressure, for example, pressure driven by common pathogens. IGDetective also revealed extremely diverged V genes and a new family of evolutionary conserved V genes in bats with unusual noncanonical cysteines. Moreover, unlike all other previously reported antibodies, these cysteines are located within complementarity-determining regions. Because cysteines form disulfide bonds, we hypothesize that these cysteine-rich V genes might generate antibodies with noncanonical conformations and could potentially form a unique part of the immune repertoire in bats. We also analyzed the diversity landscape of the recombination signal sequences and revealed their features that trigger the high/low usage of the IG genes.

    Footnotes

    • Received February 9, 2022.
    • Accepted May 6, 2022.

    This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.

    | Table of Contents

    Preprint Server