Dual threshold optimization and network inference reveal convergent evidence from TF binding locations and TF perturbation responses

  1. Michael R. Brent1,2
  1. 1Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA;
  2. 2Department of Computer Science and Engineering, Washington University, St. Louis, Missouri 63130, USA;
  3. 3Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA;
  4. 4Calico Life Sciences LLC, South San Francisco, California 94080, USA
  1. 5 These authors contributed equally to this work.

  • Corresponding author: brent{at}wustl.edu
  • Abstract

    A high-confidence map of the direct, functional targets of each transcription factor (TF) requires convergent evidence from independent sources. Two significant sources of evidence are TF binding locations and the transcriptional responses to direct TF perturbations. Systematic data sets of both types exist for yeast and human, but they rarely converge on a common set of direct, functional targets for a TF. Even the few genes that are both bound and responsive may not be direct functional targets. Our analysis shows that when there are many nonfunctional binding sites and many indirect targets, nonfunctional sites are expected to occur in the cis-regulatory DNA of indirect targets by chance. To address this problem, we introduce dual threshold optimization (DTO), a new method for setting significance thresholds on binding and perturbation-response data, and show that it improves convergence. It also enables comparison of binding data to perturbation-response data that have been processed by network inference algorithms, which further improves convergence. The combination of dual threshold optimization and network inference greatly expands the high-confidence TF network map in both yeast and human. Next, we analyze a comprehensive new data set measuring the transcriptional response shortly after inducing overexpression of a yeast TF. We also present a new yeast binding location data set obtained by transposon calling cards and compare it to recent ChIP-exo data. These new data sets improve convergence and expand the high-confidence network synergistically.

    Footnotes

    • [Supplemental material is available for this article.]

    • Article published online before print. Article, supplemental material, and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.259655.119.

    • Freely available online through the Genome Research Open Access option.

    • Received November 27, 2019.
    • Accepted February 11, 2020.

    This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.

    | Table of Contents
    OPEN ACCESS ARTICLE

    Preprint Server