Trends in Chemistry
Volume 2, Issue 7, July 2020, Pages 609-622
Opinion
Topology of Folded Molecular Chains: From Single Biomolecules to Engineered Origami

https://doi.org/10.1016/j.trechm.2020.04.009

Highlights

  • Circuit topology and knot theory are mathematically rigorous ways of describing the topology of a folded molecular chain. Conversions between topological states can be understood in terms of simple rules within developed mathematical frameworks.

  • The circuit topology of proteins and changes to their topology can be readily extracted from Protein Data Bank structures. The circuit topology of proteins underlies their evolution, folding, functionally relevant structures, and dynamics.

  • Knotted proteins exhibit distinct cellular, thermodynamic, and kinetic properties and are evolutionarily conserved. Studies of knotted polymers yield information about folding and molecular structure more generally.

  • Protein origami design principles have been defined and made available as a computational platform for designing arbitrarily complex coiled-coil protein origami (CCPO) polyhedra.

The topology of biological polymers such as proteins and nucleic acids is an important aspect of their 3D structure. Recently, two applications of topology to molecular chains have emerged as important theoretical developments that are beginning to find utility in heteropolymer characterization and design: namely, circuit topology (CT) and knot theory. Here, we review the application of these two theories to protein, RNA, and DNA/genome structure, focusing on connections to conventional 3D structural information and relevance to function and highlighting recent experimental findings. We conclude with a discussion of recent applications to molecular origami and engineering.

Section snippets

Topology: A Key Property to Disentangle Folding Complexity

Despite their apparent simplicity, linear heteropolymer chains may fold into topologically diverse structures. In polymer chemistry, the collection of linear polymers is supplemented by branched and cyclic structures, while in biological chemistry linear protein and nucleic acid chains adopt various topologies via chain folding. Folding involves rearrangements of the chain and the formation of contacts. In biology, we encounter a vast multiplicity of folded polymer

Knot Theory and CT: Basic Definitions

Formally, a knot is an embedding of the circle in 3D space. A knot may be equivalent (through stretching and bending operations, without allowing the knot to pass through itself) to the trivial knot, or circle, or to other knots with greater minimal numbers of crossings in their projections onto the plane. In contrast to proteins, RNA, and linear DNA, such knots lack a start and end point. However, linear molecules, on connecting the endpoints across an external arc traversing the 3D surface,

Topological Analysis of Proteins

Proteins, known as the primary machinery of life [14], often need to fold transiently or permanently into one or more specific spatial conformations, mostly driven by noncovalent interactions [15,16]. Among the unlimited possible arrangements, nature exhibits only a limited number of motifs and domains, which points to general rules governing the complexity of protein structure [17]. Various theoretical methods, including knot theory [18], knotoids [19], and, recently, CT [5],

Topological Analysis of Nucleic Acids

Cellular nucleic acids often fold into globular structures to achieve function. Folding happens at various scales, from small RNA molecules to large eukaryotic genomes. Various topological concepts, including supercoiling, knot theory, and contact arrangement, have been developed to describe folded nucleic acids. In what follows, we summarize these developments and discuss how CT can be used as a universal topology framework.
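
As an illustration of how a contact-based description can apply to any folded linear chain, the sketch below classifies pairs of intrachain contacts by their arrangement along the chain. It assumes the three pairwise relations commonly used in the CT literature (series, parallel, cross), which are not spelled out in this text, and it assumes that every contact is given as a pair of distinct site indices with no site shared between contacts. The function names `ct_relation` and `ct_pairs` are hypothetical and chosen only for this example.

```python
from itertools import combinations


def ct_relation(a, b):
    """Classify two intrachain contacts as 'series', 'parallel', or 'cross'.

    Each contact is a pair of chain positions (site indices).
    Assumption: the four sites are all distinct; contacts that share
    a site are outside the scope of this sketch.
    """
    # Order the sites inside each contact, then order the two contacts
    # so that (i, j) starts first along the chain.
    (i, j), (k, l) = sorted((tuple(sorted(a)), tuple(sorted(b))))
    if len({i, j, k, l}) < 4:
        raise ValueError("contacts sharing a site are not handled here")
    if j < k:
        return "series"    # i < j < k < l: loops lie one after the other
    if l < j:
        return "parallel"  # i < k < l < j: one loop is nested in the other
    return "cross"         # i < k < j < l: the loops interleave


def ct_pairs(contacts):
    """Return the relation for every unordered pair of contacts."""
    return {(a, b): ct_relation(a, b) for a, b in combinations(contacts, 2)}


if __name__ == "__main__":
    # Hypothetical contact list for a short chain (site indices are made up).
    example = [(1, 12), (4, 8), (6, 15), (20, 25)]
    for pair, relation in ct_pairs(example).items():
        print(pair, relation)
```

The same classification works whether the site indices come from residue numbers in a protein structure or from genomic bins in a chromatin contact map; only the rule used to decide which sites are in contact would differ.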

Topology of Organic and Bioinspired Polymers

Advances in molecular engineering have enabled the synthesis of molecular knots and topological polymers, leading the way towards applications in several fields, including chemical biology, medicine, and materials science.

Concluding Remarks and Future Perspectives

Contact-based CT and knot theory form two complementary frameworks for describing, understanding, and engineering linear biopolymers such as proteins and nucleic acids, as summarized in the Outstanding Questions. An important future development will be further integration of these two applied theories and the establishment of how they can be more generally utilized in prediction and design. Towards this goal, it is likely that machine learning and artificial intelligence (AI), including recent

References (91)

  • C. Cubeñas-Potts et al.

    Architectural proteins, transcription, and the three-dimensional organization of the genome

    FEBS Lett.

    (2015)
  • R.T. Dame et al.

    Bacterial chromatin: converging views at different scales

    Curr. Opin. Cell Biol.

    (2016)
  • Z. Tang

    CTCF-mediated human 3D genome architecture reveals chromatin topology for transcription

    Cell

    (2015)
  • J.R. Dixon

    Chromatin domains: the unit of chromosome organization

    Mol. Cell

    (2016)
  • P. Dabrowski-Tumanski et al.

    Topological knots and links in proteins

    Proc. Natl. Acad. Sci. U. S. A.

    (2017)
  • A. Mugler

    Circuit topology of self-interacting chains: implications for folding and unfolding dynamics

    Phys. Chem. Chem. Phys.

    (2014)
  • E. Flapan

    Topological descriptions of protein folding

    Proc. Natl. Acad. Sci. U. S. A.

    (2019)
  • O. Schullian

    A circuit topology approach to categorizing changes in biomolecular structure

    Front. Phys.

    (2020)
  • D.D. Nguyen

    A review of mathematical representations of biomolecular data

    Phys. Chem. Chem. Phys.

    (2020)
  • K. Xia et al.

    Persistent homology analysis of protein structure, flexibility, and folding

    Int. J. Numer. Method. Biomed. Eng.

    (2014)
  • Z. Cang et al.

    TopologyNet: topology based deep convolutional and multi-task neural networks for biomolecular property predictions

    PLoS Comput. Biol.

    (2017)
  • D.D. Nguyen

    Mathematical deep learning for pose and binding affinity prediction and ranking in D3R Grand Challenges

    J. Comput. Aided Mol. Des.

    (2019)
  • S.K. Verovšek et al.

    Extended topological persistence and contact arrangements in folded linear molecules

    Front. Appl. Math. Stat.

    (2016)
  • L. Brocchieri

    Protein length in eukaryotic and prokaryotic proteomes

    Nucleic Acids Res.

    (2005)
  • L. Mollica

    Binding mechanisms of intrinsically disordered proteins: theory, simulation, and experiment

    Front. Mol. Biosci.

    (2016)
  • D.-H. Kim et al.

    Transient secondary structures as general target-binding motifs in intrinsically disordered proteins

    Int. J. Mol. Sci.

    (2018)
  • J. Tubiana

    Learning protein constitutive motifs from sequence data

    Elife

    (2019)
  • R. Mishra et al.

    Knot theory in understanding proteins

    J. Math. Biol.

    (2012)
  • D. Goundaroulis

    Topological models for open-knotted protein chains using the concepts of knotoids and bonded knotoids

    Polymers (Basel)

    (2017)
  • J.I. Sułkowska

    Conservation of complex knotting and slipknotting patterns in proteins

    Proc. Natl. Acad. Sci. U. S. A.

    (2012)
  • V.P. Patil

    Topological mechanics of knots and tangles

    Science

    (2020)
  • P. Dabrowski-Tumanski

    In search of functional advantages of knots in proteins

    PLoS One

    (2016)
  • K. Alexander

    Proteins analysed as virtual knots

    Sci. Rep.

    (2017)
  • V. Turaev

    Knotoids

    Osaka J. Math.

    (2012)
  • D. Goundaroulis

    Studies of global and local entanglements of individual protein chains using the concept of knotoids

    Sci. Rep.

    (2017)
  • P. Dabrowski-Tumanski

    KnotProt 2.0: a database of proteins with knots and other entangled structures

    Nucleic Acids Res.

    (2019)
  • J. Dorier

    Knoto-ID: a tool to study the entanglement of open protein chains using the concept of knotoids

    Bioinformatics

    (2018)
  • T. Suren

    Single-molecule force spectroscopy reveals folding steps associated with hormone binding and activation of the glucocorticoid receptor

    Proc. Natl. Acad. Sci. U. S. A.

    (2018)
  • A. Mashaghi

    Misfolding of luciferase at the single-molecule level

    Angew. Chem. Int. Ed.

    (2014)
  • M. Heidari

    Mapping a single-molecule folding process onto a topological space

    Phys. Chem. Chem. Phys.

    (2019)
  • A. Mashaghi et al.

    Distance measures and evolution of polymer chains in their topological space

    Soft Matter

    (2015)
  • V. Satarifard

    Topology of polymer chains under nanoscale confinement

    Nanoscale

    (2017)
  • M. Heidari

    Topology of internally constrained polymer chains

    Phys. Chem. Chem. Phys.

    (2017)
  • A. Mashaghi

    Reshaping of the conformational search of a protein by the chaperone trigger factor

    Nature

    (2013)
  • Cited by (19)

    • Circuit topology analysis of cellular genome reveals signature motifs, conformational heterogeneity, and scaling

      2022, iScience
      Citation Excerpt :

      Both network topology and persistent homology are mostly focused on connectivity, which cannot describe the actual arrangement of the fold and provide a qualitative description of three-dimensional motifs. Our aim is to propose a topological toolbox based on circuit topology (CT) (Golovnev and Mashaghi, 2020; Heidari et al., 2020; Mashaghi et al., 2014; Scalvini et al., 2020; Schullian et al., 2020), capable of detecting not only recurring topological features in genome structure but also of quantifying cell-to-cell variability. CT is the only topology framework for folded linear polymers that categorizes the arrangement of polymer loops or their associated contacts and complements the well-established knot theory (where contacts are typically ignored).

    • Macromolecular Topology Engineering

      2021, Trends in Chemistry
      Citation Excerpt :

      In biology, topology is often loosely used to describe spatial relationships. For example, genome topology refers to the spatial genome organization as shaped by the long-range interactions of chromatins in the intact cell nucleus and protein topology refers to the mutual orientations of secondary structural elements of proteins in 3D space [3,4]. In chemistry, since its first introduction by Frisch and Wasserman in 1961 [5], chemical topology has grown into a diverse and popular research field covering rather broad and distinct topics, including the supercoiling of macromolecules, shapes of curves and surfaces, and molecules with nonplanar graphs [6–11].

    • Generalized Circuit Topology of Folded Linear Chains

      2020, iScience
      Citation Excerpt :

      In order to describe the immense structural diversity of proteins, nucleic acids or other linear molecular chains, the concept of circuit topology was recently introduced to formally categorize the arrangement of intrachain contacts (Heidari et al., 2020) (Scalvini et al., 2020) (Mashaghi et al., 2014).
