Databases/Web Servers
DEPICTER: Intrinsic Disorder and Disorder Function Prediction Server

https://doi.org/10.1016/j.jmb.2019.12.030

Highlights

  • A new DisorderEd PredictIon CenTER (DEPICTER) server was developed.

  • DEPICTER provides accurate predictions of disorder and disorder functions.

  • DEPICTER covers protein and nucleic acid binding, linker, and moonlighting functions.

  • DEPICTER makes predictions in under 1 min for an average-size protein sequence.

Abstract

Computational predictions of intrinsic disorder and its functions are instrumental in facilitating annotation of the millions of unannotated proteins. However, access to these predictors is fragmented, and substantial effort is required to find them and to collect and combine their results. The DEPICTER (DisorderEd PredictIon CenTER) server provides first-of-its-kind centralized access to 10 popular disorder and disorder function predictors that cover protein and nucleic acid binding, linkers, and moonlighting regions. It automates the prediction process, runs user-selected methods on the server side, visualizes the results, and outputs all predictions in a consistent and easy-to-parse format. DEPICTER also includes two accurate consensus predictors of disorder and disordered protein binding. Empirical tests on an independent (low similarity) benchmark dataset reveal that the computational tools included in DEPICTER generate accurate predictions that are significantly better than the results obtained using sequence alignment. The DEPICTER server is freely available at http://biomine.cs.vcu.edu/servers/DEPICTER/.

Introduction

Intrinsically disordered proteins (IDPs) and intrinsically disordered protein regions (IDRs) lack stable tertiary structure and form dynamic conformational ensembles under physiological conditions [1–3]. Recent computational studies estimate that they are highly abundant in nature: up to 17% of eukaryotic proteins (depending on the organism) are entirely disordered [4], and between 30% and 50% have at least one long IDR (≥30 consecutive residues) [5,6]. IDPs and IDRs are instrumental for a wide range of cellular functions that include signaling and molecular recognition [7,8], translation [9–11], regulation of transcription [12], cell death processes [13–15], innate immune response [16], viral life cycle [17–19], and many others. They are implicated in the dark proteome [20,21], are often found to contribute to human diseases [22,23], and are being considered as an attractive new class of targets for drug discovery [24,25]. However, experimental annotations of IDRs and their functions are limited to only about 1600 proteins that are stored in the DisProt database [26,27]. This gave rise to the development of a large collection of over 70 computational methods that predict IDRs and IDPs from protein sequences [3,28–33]. Recent empirical studies have shown that some of these methods provide highly accurate predictions [34–37]. Moreover, two dozen computational tools that predict several functions of IDRs were published and released over the last decade [32,38,39]. These methods address sequence-based prediction of molecular partners that interact with IDRs, including proteins, DNA, and RNA, and of a selected set of cellular functions, such as flexible linkers and moonlighting regions. These computational predictors can be used to identify and functionally annotate IDPs and IDRs accurately, and in a cost- and runtime-efficient way, for the millions of proteins that lack these annotations.
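The definition of a long IDR used above (at least 30 consecutive disordered residues) can be made concrete with a short sketch. It assumes a per-residue binary annotation given as a string of '0' (ordered) and '1' (disordered) characters; this encoding and the function names are illustrative assumptions and are not taken from DEPICTER or from any specific predictor.

```python
from itertools import groupby


def find_long_idrs(disorder_labels, min_length=30):
    """Return (start, end) index pairs for long disordered regions.

    disorder_labels: string of '0'/'1' characters, one per residue
                     (assumed encoding: '1' = disordered).
    Indices are 0-based and the end index is exclusive.
    """
    regions = []
    position = 0
    for label, run in groupby(disorder_labels):
        run_length = len(list(run))
        # Keep only runs of disordered residues that meet the length cutoff.
        if label == "1" and run_length >= min_length:
            regions.append((position, position + run_length))
        position += run_length
    return regions


def disorder_content(disorder_labels):
    """Fraction of residues annotated as disordered (0.0 for an empty input)."""
    if not disorder_labels:
        return 0.0
    return disorder_labels.count("1") / len(disorder_labels)


if __name__ == "__main__":
    # Hypothetical annotation: one 35-residue IDR and one short 12-residue segment.
    labels = "0" * 10 + "1" * 35 + "0" * 5 + "1" * 12
    print(find_long_idrs(labels))  # → [(10, 45)]
    print(disorder_content(labels))
```

Under this convention, a protein counts as having a long IDR when `find_long_idrs` returns a non-empty list, and as entirely disordered when `disorder_content` equals 1.0.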

The predictions produced by these methods can be collected either by means of the webservers/implementations provided and supported by the authors or by using two popular and large databases of precomputed predictions: D2P2 [40] and MobiDB [41]. Both databases offer access to results generated by a large pool of disorder predictors for millions of protein sequences. More specifically, D2P2 covers 9 disorder predictions for about 10.5 million proteins, while MobiDB provides 10 disorder predictions for the contents of the October 2017 version of the UniProt repository [42], which includes around 90 million proteins. While these two resources provide unrivaled coverage of the disorder predictions, they offer a rather limited selection of the disorder function predictions that includes only the protein-binding regions predicted by the ANCHOR method [43]. Moreover, they are constrained to the set of proteins that they currently include. This means that users must rely on the webservers or implementations of the available predictors for the huge number of proteins that are not currently included in these databases. For instance, the current version of UniProt includes over 171 million proteins compared to 90 million in MobiDB. Collecting these predictions is rather difficult since it demands finding the websites/implementations, running the predictions using multiple different interfaces that require reformatting the input protein data, and assembling the results that are provided in a wide range of formats. These issues can be alleviated by the development of a predictive resource that provides integrated access to a comprehensive set of disorder and disorder function predictors. While there are no such resources for disorder prediction, they are already available for the prediction of various aspects of protein structure, including PSIPRED workbench [44], SCRATCH [45], PredictProtein [46], and MULTICOM [47].

To this end, we provide a first-of-its-kind webserver for the sequence-based prediction of intrinsic disorder and disorder functions. The DEPICTER (DisorderEd PredictIon CenTER) server integrates predictions of intrinsic disorder and several disorder functions using several popular and runtime-efficient methods. The server automates the entire process of prediction across all these tools, visualizes the results, and outputs an easy-to-parse text file that provides all predictions in a consistent format. DEPICTER is freely available at http://biomine.cs.vcu.edu/servers/DEPICTER/.
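Once several predictors report per-residue results in a consistent format, their outputs can be combined into a consensus. The sketch below shows one generic way to do this: averaging per-residue propensity scores and applying a threshold. It is an illustration only and is not the consensus method implemented in DEPICTER, which this text does not describe; the predictor names, the score range of 0 to 1, and the 0.5 threshold are all assumptions.

```python
def consensus_scores(predictions):
    """Average per-residue scores across predictors.

    predictions: dict mapping a predictor name (hypothetical) to a list of
                 per-residue propensity scores, assumed to lie in [0, 1].
    """
    if not predictions:
        raise ValueError("at least one predictor is required")
    lengths = {len(scores) for scores in predictions.values()}
    if len(lengths) != 1:
        raise ValueError("all predictors must score the same number of residues")
    count = len(predictions)
    # zip(*...) walks the predictors residue by residue.
    return [sum(column) / count for column in zip(*predictions.values())]


def binarize(scores, threshold=0.5):
    """Convert scores to a '0'/'1' string; the 0.5 cutoff is an assumption."""
    return "".join("1" if score >= threshold else "0" for score in scores)


if __name__ == "__main__":
    # Hypothetical scores for a three-residue sequence from three predictors.
    example = {
        "predictor_a": [0.9, 0.2, 0.6],
        "predictor_b": [0.7, 0.4, 0.2],
        "predictor_c": [0.8, 0.3, 0.9],
    }
    print(binarize(consensus_scores(example)))  # → 101
```

Simple averaging weights every predictor equally; a weighted or trained combination would be a natural alternative when the predictors differ in accuracy.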

Section snippets

Selection of predictors for inclusion into the DEPICTER webserver

There are over 70 disorder predictors and another 25 predictors of disorder function [28,29,31–33,38]. It would be infeasible and unnecessary to develop a platform that provides access to all these tools. We select a collection of tools that are fast (requiring short runtime to make predictions), recently published, and empirically shown to provide accurate predictions, and that together predict disorder and provide comprehensive coverage of the currently predicted disorder functions. In total, the

Assessment of predictive quality

Table 1 summarizes the quality of the results produced by the predictors included in the DEPICTER server on the test dataset. This dataset shares low (<30%) similarity to the proteins that were used to develop these predictors. We compare the results generated by the server against a sequence alignment-based prediction. Details concerning the calculation of the alignment-based predictor and the experimental setup are described in the Supplement. Table 1 reveals that each of the 10 predictors

Summary

Access to the current methods that predict disorder and disorder functions is fragmented and requires a substantial amount of effort. Users must find and visit multiple websites that require inputs in different formats, collect their results, and reformat and combine these results. The exceptions are the two comprehensive databases of disorder predictions, D2P2 and MobiDB. However, they offer a limited scope of function predictions and are constrained to a subset of proteins that

Acknowledgments

This research was supported in part by the United States National Science Foundation (grant 1617369) and the Robert J. Mattauch Endowment funds to LK, and by the Australian Research Council (DP180102060) to YZ and KP.

References (69)

  • H.J. Dyson et al.

    Intrinsically unstructured proteins and their functions

    Nat. Rev. Mol. Cell Biol.

    (2005)
  • V.N. Uversky et al.

    Showing your ID: intrinsic disorder as an ID for recognition, regulation and cell signaling

    J. Mol. Recognit.

    (2005)
  • Z. Peng et al.

    A creature with a hundred waggly tails: intrinsically disordered proteins in the ribosome

    Cell. Mol. Life Sci.

    (2014)
  • Z. Peng et al.

    More than just tails: intrinsic disorder in histone proteins

    Mol. Biosyst.

    (2012)
  • C. Wang et al.

    Disordered nucleiome: abundance of intrinsic disorder in the DNA- and RNA-binding proteins in 1121 species from Eukaryota, Bacteria and Archaea

    Proteomics

    (2016)
  • J. Liu et al.

    Intrinsic disorder in transcription factors

    Biochemistry

    (2006)
  • I. Na et al.

    Autophagy-related intrinsically disordered proteins in intra-nuclear compartments

    Mol. Biosyst.

    (2016)
  • Z. Peng et al.

    Resilience of death: intrinsic disorder in proteins involved in the programmed cell death

    Cell Death Differ.

    (2013)
  • A.V. Uversky et al.

    On the intrinsic disorder status of the major players in programmed cell death pathways

    F1000Res

    (2013)
  • B. Xue et al.

    Protein intrinsic disorder as a flexible armor and a weapon of HIV-1

    Cell. Mol. Life Sci.

    (2012)
  • X. Fan et al.

    The intrinsic disorder status of the human hepatitis C virus proteome

    Mol. Biosyst.

    (2014)
  • F. Meng et al.

    Unstructural biology of the Dengue virus proteins

    FEBS J.

    (2015)
  • G. Hu et al.

    Taxonomic landscape of the dark proteomes: whole-proteome scale interplay between structural darkness, intrinsic disorder, and crystallization propensity

    Proteomics

    (2018)
  • A. Bhowmick et al.

    Finding our way in the dark proteome

    J. Am. Chem. Soc.

    (2016)
  • V.N. Uversky et al.

    Pathological unfoldomics of uncontrolled chaos: intrinsically disordered proteins and human diseases

    Chem. Rev.

    (2014)
  • V.N. Uversky et al.

    Intrinsically disordered proteins in human diseases: introducing the D2 concept

    Annu. Rev. Biophys.

    (2008)
  • G. Hu et al.

    Untapped potential of disordered proteins in current druggable human proteome

    Curr. Drug Targets

    (2016)
  • S. Ambadipudi et al.

    Targeting intrinsically disordered proteins in rational drug discovery

    Expert Opin. Drug Discov.

    (2015)
  • D. Piovesan et al.

    DisProt 7.0: a major update of the database of disordered proteins

    Nucleic Acids Res.

    (2016)
  • M. Sickmeier et al.

    DisProt: the database of disordered proteins

    Nucleic Acids Res.

    (2007)
  • B. He et al.

    Predicting intrinsic disorder in proteins: an overview

    Cell Res.

    (2009)
  • Z. Dosztányi et al.

    Bioinformatical approaches to characterize intrinsically disordered/unstructured proteins

    Briefings Bioinf.

    (2010)
  • M. Pentony et al.

    Computational resources for the prediction and analysis of native disorder in proteins

  • X. Deng et al.

    A comprehensive overview of computational protein disorder prediction methods

    Mol. Biosyst.

    (2012)

    These authors contributed equally.
