Elsevier

Mitochondrion

Volume 60, September 2021, Pages 121-128
Multifractal and cross-correlation analysis on mitochondrial genome sequences using chaos game representation

https://doi.org/10.1016/j.mito.2021.08.006

Highlights

  • We have studied the cross-correlation behaviour among 32 mitochondrial genomes.

  • Integrative CGR and MF-X-DFA approach was used to quantify the cross-correlations.

  • We found the existence of multifractal behaviour between all these time series.

  • Cluster analysis was performed to find the class affiliation.

Abstract

We characterized the multifractality and power-law cross-correlation of mitochondrial genomes from various species using a recently developed method that combines chaos game representation with two-dimensional multifractal detrended cross-correlation analysis. We analyzed 32 mitochondrial genomes of different species, and the results show that all of the analyzed data exhibit multifractal nature and power-law cross-correlation behaviour. We then performed a cluster analysis on the calculated scaling exponents to identify class affiliation, and the outcome is represented as a dendrogram. We suggest that this integrative approach may help researchers understand the phylogeny of any kingdom despite varying genome lengths, and that it may also find applications in characterizing protein sequences, mRNA sequences, and next-generation sequencing data, and in drug development.

Introduction

Understanding the dynamics of long and complex genomic sequences has always been intriguing, because a subtle change in the sequence can bring about considerable variation in physiology, behaviour, anatomy, and evolution as a whole. Although this problem is cumbersome and complicated to address, the advent of bioinformatics tools has not only made it easier but has also provided multifaceted solutions and perspectives. Genome sequences show fractal nature, and several methods have been put forward to comprehend and visualize the dynamics of genomic sequences (Almassalha et al., 2017, Campbell et al., 1999, Sandberg et al., 2001). Since its inception, Chaos Game Representation (CGR) has been widely used to visualize linear DNA sequences in graphical form (Jeffrey, 1990). It allows easy recognition of patterns in long nucleotide sequences and supports both alignment-free and alignment-based comparison of different genomes (Joseph and Sasikumar, 2006). CGR has also been applied to protein sequences and structures (Fiser et al., 1994).
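
As an illustration of the CGR idea described above, the following minimal sketch maps a DNA string to points in the unit square by repeatedly moving halfway towards the corner assigned to each nucleotide. The corner assignment (A, C, G, T at the four corners, as below), the starting point at the centre, and the function name `cgr_points` are assumptions made for this example; they follow a common convention and are not taken from the present paper.

```python
# Assumed corner convention for the unit square (illustrative only).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(sequence):
    """Return the CGR trajectory of a DNA string as a list of (x, y) points."""
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in sequence.upper():
        cx, cy = CORNERS[base]  # raises KeyError for symbols other than A/C/G/T
        # Move halfway from the current point towards the corner of this base.
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


if __name__ == "__main__":
    for point in cgr_points("ACGT"):
        print(point)
```

Plotting these points for a whole genome gives the familiar CGR image, in which the density of each sub-square reflects how often the corresponding short subsequence occurs.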

Several methods, such as the structure function, R/S analysis, the average wavelet coefficient method, WTMM, detrended moving average analysis and its variants, and detrended fluctuation analysis and its variants, have been used to characterize the fractal nature and correlation behaviour of one-dimensional and two-dimensional data sets (Hurst, 1951, Muzy et al., 1991, Peng et al., 1994, Simonsen et al., 1998, Guihong et al., 2001, Kantelhardt et al., 2002, Alessio et al., 2002, Patrick, 2002, Manimaran et al., 2005, Jose et al., 2006, Jose et al., 2008, Manimaran et al., 2008, Manimaran et al., 2009, Gu and Zhou, 2010, Zhou et al., 2013, Alpatov et al., 2013, Wang et al., 2015). Later, various multifractal cross-correlation analysis methods were introduced to explore the multifractal nature of power-law cross-correlations between non-stationary time series (Podobnik and Stanley, 2008, Zhou, 2008, Podobnik et al., 2009, Jiang and Zhou, 2011, Kristoufek, 2011, Xie et al., 2015, Qian et al., 2015). Multifractal detrended cross-correlation analysis (MF-X-DFA) has been widely used in many fields, including biology, the social sciences, and physics. However, MF-X-DFA requires data sets of equal length (Manimaran and Narayana, 2018, Hema Sri Sai et al., 2019). Recently, Pal and co-workers developed an integrative approach that combines CGR and 2D MF-X-DFA to examine multifractal behaviour in the power-law cross-correlations of coding and non-coding DNA sequences of varying lengths in various prokaryotes (Pal et al., 2015). We have also examined some prokaryote genomes and showed the existence of multifractal nature and power-law cross-correlation behaviour among them (Pal et al., 2016).
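
For orientation, the scaling relations conventionally used in multifractal detrended (cross-correlation) analysis can be summarised as below. The symbols $F_q(s)$ (the $q$th-order fluctuation function at scale $s$), $h_{xy}(q)$, $\tau_{xy}(q)$, $\alpha$ and $f(\alpha)$ are standard notation assumed here for illustration; this excerpt does not show the definitions actually used in the paper.

```latex
% Conventional multifractal scaling relations (notation assumed, not quoted from the source).
\begin{align}
F_q(s) &\sim s^{h_{xy}(q)}, \\
\tau_{xy}(q) &= q\,h_{xy}(q) - D_f, \qquad D_f = 2 \ \text{for two-dimensional data}, \\
\alpha &= \frac{d\tau_{xy}(q)}{dq}, \qquad f(\alpha) = q\,\alpha - \tau_{xy}(q).
\end{align}
```

In this picture, a scaling exponent $h_{xy}(q)$ that varies with $q$ indicates multifractal cross-correlation, whereas a $q$-independent exponent corresponds to monofractal behaviour.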

In the current paper, we extend our approach to characterize the cross-correlation and multifractal nature of mitochondrial genomes in various species. Mitochondria, the powerhouse of the cell, are thought to descend from an endosymbiotic α-proteobacterium that was engulfed by a eukaryotic or archaebacteria-like cell (Bullerwell and Lang, 2005). Mitochondria mainly serve in energy homeostasis and have double-stranded circular DNA. Alongside the nuclear genome, the mitochondrial genome is used for phylogenetic analysis (van de Sande, 2012). The mitochondrial genome is regarded as a high-fidelity marker for phylogenetic analysis because it does not undergo recombination, in contrast to the nuclear genome, which shows high rates of recombination and mutation (da Silva et al., 2020). On the other hand, for these same reasons, mitochondrial DNA has also been considered the worst alternative for phylogenetic analysis (Galtier et al., 2009). Nevertheless, several studies have shown the importance of the mitochondrial genome for phylogenetic analysis in vertebrates and invertebrates (da Silva et al., 2020, Avise, 2009). Most phylogenetic studies on mitochondria rely on alignment-based methods. Here, we use an alignment-free method to analyse the mitochondrial genomes of various species and construct a phylogenetic tree using the novel approach of combining chaos game representation and two-dimensional multifractal detrended cross-correlation analysis.

Section snippets

Methods

We combined the Chaos game representation (CGR) algorithm and the two-dimensional multifractal detrended cross-correlation analysis (2D MF-X-DFA) method (Pal et al., 2015) to analyse mitochondrial genomes for class affiliation and classification. The details of the methodology steps are described here.
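
One reason CGR pairs naturally with a two-dimensional method is that genomes of different lengths can be reduced to frequency matrices of identical size. The sketch below counts CGR points in a 2^k × 2^k grid. The grid resolution `k`, the corner convention, and the function name `fcgr_matrix` are assumptions for illustration; the parameters used in the paper are not shown in this excerpt.

```python
# Assumed corner convention for the unit square (illustrative only).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def fcgr_matrix(sequence, k=3):
    """Count CGR points of a DNA string in a 2**k x 2**k grid (list of rows)."""
    size = 2 ** k
    matrix = [[0] * size for _ in range(size)]
    x, y = 0.5, 0.5  # start at the centre of the unit square
    for i, base in enumerate(sequence.upper()):
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        # From the k-th base onwards, the grid cell depends only on the last
        # k bases, so each cell count equals the count of one k-mer.
        if i >= k - 1:
            col = min(int(x * size), size - 1)  # guard against rounding up to 1.0
            row = min(int(y * size), size - 1)
            matrix[row][col] += 1
    return matrix


if __name__ == "__main__":
    short = fcgr_matrix("ACGTACGTAC", k=2)
    longer = fcgr_matrix("ACG" * 7, k=2)
    # Both matrices are 4 x 4 even though the sequences differ in length.
    print(len(short), len(short[0]), len(longer), len(longer[0]))
```

Because every sequence yields a matrix of the same shape, any pair of genomes can be compared cell by cell, which is what an equal-size requirement such as that of MF-X-DFA needs.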

Results and discussion

We obtained the mitochondrial genomes of various species from the National Center for Biotechnology Information (NCBI) database (http://www.ncbi.nlm.nih.gov/). The details of the species are given in Table 1. Note that we removed the ‘N’ at position 3107 in the human mitochondrial genome (NC_012920.1) before performing CGR analysis.
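
A preprocessing step of this kind can be sketched as follows. The helper names `ambiguous_positions` and `strip_ambiguous` are hypothetical, and the toy string merely stands in for a real sequence; positions are reported 1-based, so position 3107 would correspond to Python index 3106.

```python
def ambiguous_positions(sequence, allowed="ACGT"):
    """Return the 1-based positions of symbols that are not in `allowed`."""
    allowed_set = set(allowed)
    return [i + 1 for i, base in enumerate(sequence.upper()) if base not in allowed_set]


def strip_ambiguous(sequence, allowed="ACGT"):
    """Return the sequence in upper case with all non-`allowed` symbols removed."""
    allowed_set = set(allowed)
    return "".join(base for base in sequence.upper() if base in allowed_set)


if __name__ == "__main__":
    toy = "ACGNTTAG"  # toy stand-in, not real genome data
    print(ambiguous_positions(toy))
    print(strip_ambiguous(toy))
```

Removing such symbols before CGR matters because the construction assigns a corner only to the four unambiguous nucleotides.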

We applied CGR to all the mitochondrial genome sequences to obtain 2D images, a couple of which are shown as representative examples in Fig. 1. The frequency CGR matrix

Conclusions

In conclusion, by combining chaos game representation and 2D multifractal detrended cross-correlation analysis, we observed that the mitochondrial genomes of all the analysed species exhibit multifractal nature. Our work corroborates that the mitochondrial genome can be used as a marker for phylogenetic study across species and to generate a phylogenetic tree. However, thorough research within a genus or family is imperative for a detailed understanding of evolution. We suggest that the

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgement

The author PM would like to thank the Department of Science and Technology, Government of India (DST-MATRICS GoI Project No. SERB/F/506/2019-2020, dated 15th May 2019), for their financial support.

References (47)

  • R. Sandberg et al., Capturing whole-genome characteristics in short sequences using a naive Bayesian classifier, Genome Res. (2001)
  • H.J. Jeffrey, Chaos game representation of gene structure, Nucleic Acids Res. (1990)
  • J. Joseph et al., Chaos game representation for comparison of whole genomes, BMC Bioinf. (2006)
  • H.E. Hurst, Long-term storage capacity of reservoirs, Trans. Am. Soc. Civ. Eng. (1951)
  • J.F. Muzy et al., Wavelets and multifractal formalism for singular signals: Application to turbulence data, Phys. Rev. Lett. (1991)
  • C.K. Peng et al., Mosaic organization of DNA nucleotides, Phys. Rev. E (1994)
  • I. Simonsen et al., Determination of the Hurst exponent by use of wavelet transforms, Phys. Rev. E (1998)
  • Q.u. Guihong et al., Medical image fusion by wavelet transform modulus maxima, Opt. Express (2001)
  • E. Alessio et al., Second-order moving average and scaling of stochastic time series, Eur. Phys. J. B (2002)
  • T. Patrick, Two-dimensional turbulence: A Physicist approach, Phys. Rep. (2002)
  • P. Manimaran et al., Wavelet analysis and scaling properties of time series, Phys. Rev. E (2005)
  • A.-R. Jose et al., Scaling properties of image textures: A detrending fluctuation analysis approach, Physica A (2006)
  • A.-R. Jose et al., Performance of a high-dimensional R/S method for Hurst exponent estimation, Physica A (2008)