A proximity-dependent biotinylation map of a human cell

Go, Christopher D.; Knight, James D. R.; Rajasekharan, Archita; Rathod, Bhavisha; Hesketh, Geoffrey G.; Abe, Kento T.; Youn, Ji-Young; Samavarchi-Tehrani, Payman; Zhang, Hui; Zhu, Lucie Y.; Popiel, Evelyn; Lambert, Jean-Philippe; Coyaud, Étienne; Cheung, Sally W. T.; Rajendran, Dushyandi; Wong, Cassandra J.; Antonicka, Hana; Pelletier, Laurence; Palazzo, Alexander F.; Shoubridge, Eric A.; Raught, Brian; Gingras, Anne-Claude

doi:10.1038/s41586-021-03592-2

Article
Published: 02 June 2021

Nature volume 595, pages 120–124 (2021)

An Author Correction to this article was published on 11 January 2022

This article has been updated

Abstract

Compartmentalization is a defining characteristic of eukaryotic cells, and partitions distinct biochemical processes into discrete subcellular locations. Microscopy¹ and biochemical fractionation coupled with mass spectrometry²,³,⁴ have defined the proteomes of a variety of different organelles, but many intracellular compartments have remained refractory to such approaches. Proximity-dependent biotinylation techniques such as BioID provide an alternative approach to define the composition of cellular compartments in living cells⁵,⁶,⁷. Here we present a BioID-based map of a human cell on the basis of 192 subcellular markers, and define the intracellular locations of 4,145 unique proteins in HEK293 cells. Our localization predictions exceed the specificity of previous approaches, and enabled the discovery of proteins at the interface between the mitochondrial outer membrane and the endoplasmic reticulum that are crucial for mitochondrial homeostasis. On the basis of this dataset, we created humancellmap.org as a community resource that provides online tools for localization analysis of user BioID data, and demonstrate how this resource can be used to understand BioID results better.

**Fig. 1: Generation and analysis of BioID dataset, and validation strategy.**

**Fig. 2: Localization of proteins using prey-centric analysis.**

**Fig. 3: Discovery of new connections using the humancellmap.**

Data availability

Mass spectrometry datasets consisting of raw files and associated peak lists and results files have been deposited in ProteomeXchange through the partner repository MassIVE (Mass spectrometry Interactive Virtual Environment; http://proteomics.ucsd.edu/ProteoSAFe/datasets.jsp) as complete submissions. Other files include the sample description, the peptide/protein evidence and the complete SAINTexpress output for each dataset, as well as a ‘README’ file that describes the dataset composition and the experimental procedures associated with each submission. The different datasets generated here were submitted as independent entries.

Dataset 1 (Supplementary Table 2): Go_BioID_humancellmap_HEK293_lowSDS_core_dataset_2019, MassIVE ID MSV000084359 and PXD015530.
Dataset 2 (Supplementary Table 2): Go_BioID_humancellmap_HEK293_highSDS_core_dataset_2019, MassIVE ID MSV000084360 and PXD015531.
Dataset 3 (Supplementary Table 18): Go_BioID_humancellmap_HEK293_prediction_2019, MassIVE ID MSV000084369 and PXD015554.
Dataset 4 (Supplementary Table 17): Go_BioID_humancellmap_HEK293_ER-mito_candidates_2019, MassIVE ID MSV000084357 and PXD015528.

Negative-control samples were deposited in the Contaminant Repository for Affinity Purification⁴⁶ (CRAPome.org) and assigned sample numbers CC1100 to CC1185 (Supplementary Table 2); these will be part of the next release of the database.

The BioGRID⁴⁷ human database v3.5.169 was downloaded on 13 February 2019 (https://downloads.thebiogrid.org/BioGRID/Release-Archive/BIOGRID-3.5.169/). Human gene annotations were downloaded from the GO on 15 February 2019 (GO version date 1 February 2019, http://release.geneontology.org/2019-02-01/annotations/goa_human.gaf.gz). The GO hierarchy (release date 1 February 2019) was downloaded from GO⁴⁸,⁴⁹ on 15 February 2019 (http://release.geneontology.org/2019-02-01/ontology/go-basic.obo). The UniProt database⁵⁰ release 2019_2 was downloaded on 21 February 2019 (ftp.uniprot.org/pub/databases/uniprot/previous_releases/release-2019_02/knowledgebase/uniprot_sprot-only2019_02.tar.gz). The IntAct⁵¹ human database release 2018_11_30 was downloaded on 13 February 2019 (ftp.ebi.ac.uk/pub/databases/intact/2018-11-30/psimitab/intact.txt). Human protein domain annotations and motifs were retrieved from Pfam⁵² (version 32) on 21 February 2019 (ftp.ebi.ac.uk/pub/databases/Pfam/releases/Pfam32.0/proteomes/9606.tsv.gz). ProteomicsDB⁵³ was queried for protein expression information on 14 January 2020 using their API. Text-mining data were downloaded from the Compartments database⁵⁴ on 21 January 20 (https://compartments.jensenlab.org/Downloads). Source data are provided with this paper.

Code availability

Source code used for analysis can be accessed from https://github.com/knightjdr/cellmap-scripts.

Change history

11 January 2022
A Correction to this paper has been published: https://doi.org/10.1038/s41586-021-04308-2

References

Thul, P. J. et al. A subcellular map of the human proteome. Science 356, eaal3321 (2017).
Christoforou, A. et al. A draft map of the mouse pluripotent stem cell spatial proteome. Nat. Commun. 7, 8992 (2016).
Itzhak, D. N., Tyanova, S., Cox, J. & Borner, G. H. Global, quantitative and dynamic mapping of protein subcellular localization. eLife 5, e16950 (2016).
Orre, L. M. et al. SubCellBarCode: Proteome-wide mapping of protein localization and relocalization. Mol. Cell 73, 166–182 (2019).
Roux, K. J., Kim, D. I., Raida, M. & Burke, B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. J. Cell Biol. 196, 801–810 (2012).
Gupta, G. D. et al. A dynamic protein interaction landscape of the human centrosome-cilium interface. Cell 163, 1484–1499 (2015).
Youn, J. Y. et al. High-density proximity mapping reveals the subcellular organization of mRNA-associated granules and bodies. Mol. Cell 69, 517–532 (2018).
Rhee, H. W. et al. Proteomic mapping of mitochondria in living cells via spatially restricted enzymatic tagging. Science 339, 1328–1331 (2013).
Gingras, A. C., Abe, K. T. & Raught, B. Getting to know the neighborhood: using proximity-dependent biotinylation to characterize protein complexes and map organelles. Curr. Opin. Chem. Biol. 48, 44–54 (2019).
Kim, D. I. et al. Probing nuclear pore complex architecture with proximity-dependent biotinylation. Proc. Natl Acad. Sci. USA 111, E2453–E2461 (2014).
Antonicka, H. et al. A high-density human mitochondrial proximity interaction network. Cell Metab. 32, 479–497 (2020).
Botham, A. et al. Global interactome mapping of mitochondrial intermembrane space proteases identifies a novel function for HTRA2. Proteomics 19, e1900139 (2019).
Chapple, C. E. et al. Extreme multifunctional proteins identified from a human protein interaction network. Nat. Commun. 6, 7412 (2015).
Eisenberg-Bord, M., Shai, N., Schuldiner, M. & Bohnert, M. A tether is a tether is a tether: tethering at membrane contact sites. Dev. Cell 39, 395–409 (2016).
Branon, T. C. et al. Efficient proximity labeling in living cells and organisms with TurboID. Nat. Biotechnol. 36, 880–887 (2018).
Lee, S. & Min, K. T. The interface between ER and mitochondria: molecular compositions and functions. Mol. Cells 41, 1000–1007 (2018).
Prudent, J. & McBride, H. M. The mitochondria-endoplasmic reticulum contact sites: a signalling platform for cell death. Curr. Opin. Cell Biol. 47, 52–63 (2017).
Rowland, A. A. & Voeltz, G. K. Endoplasmic reticulum-mitochondria contacts: function of the junction. Nat. Rev. Mol. Cell Biol. 13, 607–625 (2012).
Ackema, K. B. et al. Sar1, a novel regulator of ER-mitochondrial contact sites. PLoS ONE 11, e0154280 (2016).
Kalia, R. et al. Structural basis of mitochondrial receptor binding and constriction by DRP1. Nature 558, 401–405 (2018).
Korobova, F., Ramabhadran, V. & Higgs, H. N. An actin-dependent step in mitochondrial fission mediated by the ER-associated formin INF2. Science 339, 464–467 (2013).
Bersuker, K. et al. A proximity labeling strategy provides insights into the composition and dynamics of lipid droplet proteomes. Dev. Cell 44, 97–112 (2018).
Xu, S. et al. Mitochondrial E3 ubiquitin ligase MARCH5 controls mitochondrial fission and cell sensitivity to stress-induced apoptosis through regulation of MiD49 protein. Mol. Biol. Cell 27, 349–359 (2016).
Chatr-Aryamontri, A. et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 45 (D1), D369–D379 (2017).
Lambert, J. P. et al. Interactome rewiring following pharmacological targeting of BET bromodomains. Mol. Cell 73, 621–638 (2019).
Kastritis, P. L. et al. Capturing protein communities by structural proteomics in a thermophilic eukaryote. Mol. Syst. Biol. 13, 936 (2017).
Liu, X. et al. An AP-MS- and BioID-compatible MAC-tag enables comprehensive mapping of protein interactions and subcellular localizations. Nat. Commun. 9, 1188 (2018).
Omasits, U., Ahrens, C. H., Müller, S. & Wollscheid, B. Protter: interactive protein feature visualization and integration with experimental proteomic data. Bioinformatics 30, 884–886 (2014).
Couzens, A. L. et al. Protein interaction network of the mammalian Hippo pathway reveals mechanisms of kinase-phosphatase interactions. Sci. Signal. 6, rs15 (2013).
Banks, C. A., Boanca, G., Lee, Z. T., Florens, L. & Washburn, M. P. Proteins interacting with cloning scars: a source of false positive protein-protein interactions. Sci. Rep. 5, 8530 (2015).
Allen, M. D. & Zhang, J. Subcellular dynamics of protein kinase A activity visualized by FRET-based reporters. Biochem. Biophys. Res. Commun. 348, 716–721 (2006).
Kessner, D., Chambers, M., Burke, R., Agus, D. & Mallick, P. ProteoWizard: open source software for rapid proteomics tools development. Bioinformatics 24, 2534–2536 (2008).
Shteynberg, D. et al. iProphet: multi-level integrative analysis of shotgun proteomic data improves peptide and protein identification rates and error estimates. Mol. Cell. Proteomics 10, M111.007690 (2011).
Liu, G. et al. ProHits: integrated software for mass spectrometry-based interaction proteomics. Nat. Biotechnol. 28, 1015–1017 (2010).
Eng, J. K., Jahan, T. A. & Hoopmann, M. R. Comet: an open-source MS/MS sequence database search tool. Proteomics 13, 22–24 (2013).
Keller, A., Nesvizhskii, A. I., Kolker, E. & Aebersold, R. Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search. Anal. Chem. 74, 5383–5392 (2002).
Nesvizhskii, A. I., Keller, A., Kolker, E. & Aebersold, R. A statistical model for identifying proteins by tandem mass spectrometry. Anal. Chem. 75, 4646–4658 (2003).
Teo, G. et al. SAINTexpress: improvements and additional features in Significance Analysis of INTeractome software. J. Proteomics 100, 37–43 (2014).
Raudvere, U. et al. g:Profiler: a web server for functional enrichment analysis and conversions of gene lists (2019 update). Nucleic Acids Res. 47 (W1), W191–W198 (2019).
Knight, J. D. R. et al. ProHits-viz: a suite of web tools for visualizing interaction proteomics data. Nat. Methods 14, 645–646 (2017).
Baryshnikova, A. Spatial Analysis of Functional Enrichment (SAFE) in large biological networks. Methods Mol. Biol. 1819, 249–268 (2018).
Shannon, P. et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498–2504 (2003).
Lee, D. D. & Seung, H. S. Learning the parts of objects by non-negative matrix factorization. Nature 401, 788–791 (1999).
Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
van der Maaten, L. J. P. & Hinton, G. E. Visualizing high-dimensional data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008).
Uhlén, M. et al. Proteomics. Tissue-based map of the human proteome. Science 347, 1260419 (2015).
Mellacheruvu, D. et al. The CRAPome: a contaminant repository for affinity purification-mass spectrometry data. Nat. Methods 10, 730–736 (2013).
Stark, C. et al. BioGRID: a general repository for interaction datasets. Nucleic Acids Res. 34, D535–D539 (2006).
The Gene Ontology Consortium. Gene Ontology: tool for the unification of biology. Nat. Genet. 25, 25–29 (2000).
The Gene Ontology Consortium. Expansion of the Gene Ontology knowledgebase and resources. Nucleic Acids Res. 45 (D1), D331–D338 (2017).
UniProt Consortium. UniProt: a worldwide hub of protein knowledge. Nucleic Acids Res. 47 (D1), D506–D515 (2019).
Orchard, S. et al. The MIntAct project—IntAct as a common curation platform for 11 molecular interaction databases. Nucleic Acids Res. 42, D358–D363 (2014).
Finn, R. D. et al. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res. 44 (D1), D279–D285 (2016).
Samaras, P. et al. ProteomicsDB: a multi-omics and multi-organism resource for life science research. Nucleic Acids Res. 48 (D1), D1153–D1163 (2020).
Binder, J. X. et al. COMPARTMENTS: unification and visualization of protein subcellular localization evidence. Database (Oxford) 2014, bau012 (2014).
Zecha, J. et al. Peptide level turnover measurements enable the study of proteoform dynamics. Mol. Cell. Proteomics 17, 974–992 (2018).
Burkhardt, J. K. In search of membrane receptors for microtubule-based motors - is kinectin a kinesin receptor? Trends Cell Biol. 6, 127–131 (1996).
St-Denis, N. et al. Phenotypic and interaction profiling of the human phosphatases identifies diverse mitotic regulators. Cell Rep. 17, 2488–2501 (2016).
Li, X. et al. Defining the protein-protein interaction network of the human protein tyrosine phosphatase family. Mol. Cell. Proteomics 15, 3030–3044 (2016).
Rasila, T. et al. Astroprincin (FAM171A1, C10orf38): a regulator of human cell shape and invasive growth. Am. J. Pathol. 189, 177–189 (2019).
Monticone, M. et al. The nuclear genes Mtfr1 and Dufd1 regulate mitochondrial dynamic and cellular respiration. J. Cell. Physiol. 225, 767–776 (2010).

Acknowledgements

We thank Z.-Y. Lin for profiling PIK3R1 and members of the Gingras laboratory for discussion and advice throughout the project, Q. Morris and S. W. M. Eng for help with NMF, and J. Zhang and many cell biologists for suggestions about bait selection. Work in the Gingras laboratory was supported by a Canadian Institutes of Health Research (CIHR) Foundation Grant (FDN 143301). E.A.S. is supported by a grant from the CIHR (MOP-133530). Proteomics work was performed at the Network Biology Collaborative Centre at the Lunenfeld-Tanenbaum Research Institute, a facility supported by Canada Foundation for Innovation funding, by the Ontario Government, and by Genome Canada and Ontario Genomics (OGI-139). This research was enabled in part by support provided by Compute Canada (www.computecanada.ca). C.D.G. was supported by a CIHR Banting studentship. A.-C.G. is the Canada Research Chair in Functional Proteomics and the Lea Reichmann Chair in Cancer Proteomics.

Author information

Ji-Young Youn
Present address: Peter Gilgan Centre for Research and Learning, Hospital for Sick Children, Toronto, Ontario, Canada
Jean-Philippe Lambert
Present address: Department of Molecular Medicine, Cancer Research Centre, Big Data Research Centre, Université Laval, Quebec City, Quebec, Canada
Jean-Philippe Lambert
Present address: CHU de Québec-Université Laval Research Center (CHUL), Quebec City, Quebec, Canada
Étienne Coyaud
Present address: PRISM INSERM U1192, Université de Lille, Villeneuve d’Ascq, France
These authors contributed equally: Christopher D. Go, James D. R. Knight

Authors and Affiliations

Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Sinai Health System, Toronto, Ontario, Canada
Christopher D. Go, James D. R. Knight, Bhavisha Rathod, Geoffrey G. Hesketh, Kento T. Abe, Ji-Young Youn, Payman Samavarchi-Tehrani, Jean-Philippe Lambert, Sally W. T. Cheung, Dushyandi Rajendran, Cassandra J. Wong, Laurence Pelletier & Anne-Claude Gingras
Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
Christopher D. Go, Kento T. Abe, Ji-Young Youn, Evelyn Popiel, Laurence Pelletier & Anne-Claude Gingras
Montreal Neurological Institute and Department of Human Genetics, McGill University, Montreal, Quebec, Canada
Archita Rajasekharan, Hana Antonicka & Eric A. Shoubridge
Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada
Hui Zhang, Lucie Y. Zhu & Alexander F. Palazzo
Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
Étienne Coyaud & Brian Raught
Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
Brian Raught

Contributions

A.-C.G., C.D.G. and J.D.R.K. conceived the project. A.-C.G., C.D.G., J.D.R.K. and B.R. wrote the paper with input from G.G.H., P.S.-T., J.-Y.Y., J.-P.L. and E.C. C.D.G. generated most of the BioID constructs and cell lines and performed BioID experiments and immunofluorescence studies. J.D.R.K., C.D.G., G.G.H. and A.-C.G. performed data analysis. J.D.R.K. created the humancellmap.org website. K.T.A. contributed the cell model and illustrations. C.J.W. helped with mass spectrometry data acquisition. A.R., H.A. and E.A.S. performed mitochondrial morphology experiments and analysed results. B.R. generated constructs and cell lines for BioID and testing predictions. G.G.H., K.T.A., J.-Y.Y., P.S.-T., H.Z., L.Y.Z., E.P., J.-P.L., D.R., E.C., S.W.T.C., L.P., B.R. and A.F.P. contributed constructs and cell lines. A.-C.G. supervised the project.

Corresponding author

Correspondence to Anne-Claude Gingras.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature thanks Luca Scorrano and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Overview of the dataset.

a, Cellular compartments targeted for profiling by BioID. Bold numbers on the schematic correspond to the indices in the legend. Italicized numbers in brackets next to the compartment name indicate the number of baits used to profile that compartment after quality control. b, Bait similarity and localization. The Jaccard index was calculated between each pair of baits in the core dataset using the list of high confidence (1% FDR) interactors. Baits were clustered using the Euclidean distance and complete linkage method, and clusters optimized using the CBA package in R. The colour gradient next to the bait labels indicates whether a bait shares an expected localization with both adjacent baits (red), one adjacent bait (light red) or neither adjacent bait (white). Major clusters were manually annotated on the basis of the expected localization of the components.
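
The bait similarity measure in panel b can be illustrated with a minimal sketch. It assumes each bait is represented by the set of its high-confidence (1% FDR) interactors; the bait and prey names below are hypothetical placeholders, and the clustering step (Euclidean distance, complete linkage, CBA optimization in R) is not reproduced here.

```python
from itertools import combinations


def jaccard_index(a, b):
    """Jaccard index of two collections: |A ∩ B| / |A ∪ B| (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def pairwise_jaccard(interactors_by_bait):
    """Return {(bait1, bait2): Jaccard index} for every unordered pair of baits."""
    return {
        (b1, b2): jaccard_index(interactors_by_bait[b1], interactors_by_bait[b2])
        for b1, b2 in combinations(sorted(interactors_by_bait), 2)
    }


# Hypothetical high-confidence prey sets (illustrative only, not data from the paper).
example = {
    "BAIT_A": {"P1", "P2", "P3"},
    "BAIT_B": {"P2", "P3", "P4"},
    "BAIT_C": {"P9"},
}
scores = pairwise_jaccard(example)
```

Baits that share most of their interactors score close to 1, and baits with no shared interactors score 0, which is what lets baits from the same compartment cluster together.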

Extended Data Fig. 2 Factors affecting prey labelling and rationale for prey-wise analysis.

a, After sorting preys by proximity order and grouping by order across baits, the proportion of previously reported preys was calculated for the nth proximity order for n = 1, 2,… 200. b–f, For each bait, the relative proximity of every prey (proximity order) was calculated from the control-subtracted length-adjusted spectral counts (CLSC) (Methods), such that the prey with the highest CLSC value was considered to be the ‘interactor’ most proximal to the bait and the prey with the lowest CLSC value the most distal. b, Number of baits with a minimum of n preys at a 1% FDR, for n = 1, 2,… 200. c, Proximity order versus protein turnover rate (hours) in HeLa cells (turnover data are from ref. ⁵⁵). d, Proximity order versus protein expression as represented by the log₁₀-normalized MS1 iBAQ intensity from ProteomicsDB⁵³. e, Proximity order versus the number of lysine residues per protein. f, The log₁₀-normalized MS1 iBAQ intensity of proteins expressed in HEK293 versus HeLa cells from ProteomicsDB⁵³. The similarity of the proteomes supports the use of HeLa data in c, as suitable HEK293 data were not available. Values along the x axis could reflect zero expression or missing data in HeLa cells. These values were ignored when calculating the R² value. g, Bait comparisons for a pair of mitochondrial matrix proteins. Control-subtracted spectral counts are plotted for all high-confidence preys (1% FDR) detected with either bait of the pair under comparison. AARS2 preferentially enriches components of the mitochondrial ribosome and proteins involved in translation, such as GFM1, MRPS9 and TRMT10C, whereas PDHA1 preferentially interacts with the pyruvate dehydrogenase complex component DLAT and the mitochondrial membrane ATP synthase ATP5F1B. h, Pipelines for localizing prey proteins using SAFE⁴⁰ and NMF⁴². In our SAFE pipeline, preys with a correlation across baits ≥ 0.65 are considered interactors and these pairs are used to generate a network that is annotated for GO:CC terms (Methods).
In NMF, the bait–prey spectral counts matrix is reduced to a compartment-prey matrix and compartments are then defined using GO:CC for the compartment’s most abundant preys. A 2D network is generated in parallel from the compartment–prey matrix using t-SNE⁴⁴.
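
The first step of the SAFE pipeline in panel h, linking preys whose profiles across baits correlate at 0.65 or higher, might be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: it uses a Pearson correlation over hypothetical per-bait spectral count profiles, and it stops at the edge list without performing the GO:CC annotation.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    denom = sqrt(sum(v * v for v in dx) * sum(v * v for v in dy))
    if denom == 0:
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / denom


def correlation_edges(profiles, threshold=0.65):
    """Return (prey1, prey2, r) for each prey pair whose correlation is >= threshold."""
    edges = []
    for p1, p2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[p1], profiles[p2])
        if r >= threshold:
            edges.append((p1, p2, r))
    return edges


# Hypothetical prey profiles: one value per bait (illustrative only).
profiles = {
    "PREY_1": [10, 0, 5, 0],
    "PREY_2": [20, 0, 10, 0],
    "PREY_3": [0, 8, 0, 4],
}
edges = correlation_edges(profiles)
```

Here only PREY_1 and PREY_2 are linked, because they are recovered by the same baits in the same proportions, whereas PREY_3 is recovered by a different set of baits.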

Extended Data Fig. 3 SAFE-based map of the cell and motif enrichment.

a, SAFE-based map of the cell generated from preys with a Pearson correlation score of 0.65 or higher and plotted using Cytoscape with a spring-embedded layout. Each prey is coloured to indicate its primary localization (domain in SAFE terminology) as indicated in the legend. An interactive version of the map can be viewed at humancellmap.org/explore/maps by toggling from NMF to SAFE on the bottom menu. b, Pfam regions or motifs enriched in the indicated SAFE domains. The heat map value represents the log₂-transformed fold change between the genes localized to the domain and all preys in the dataset. Only compartments or domains with a significant fold change for at least one motif are displayed on the heat map.

Extended Data Fig. 4 NMF-based correlation map of the cell and motif enrichment.

a, NMF-based map of the cell generated from preys with a Pearson correlation score across NMF ranks of 0.9 or higher and plotted using Cytoscape with a spring-embedded layout. Each prey is coloured to indicate its primary localization (rank in NMF terminology) as indicated in the legend. An interactive version of the map can be viewed at humancellmap.org/explore/maps by toggling from t-SNE to correlation on the bottom menu. b, Pfam regions or motifs enriched in the indicated NMF ranks. The heat map value represents the log₂-transformed fold change between the genes localized to the rank and all preys in the dataset. Only compartments or ranks with a significant fold change for at least one motif are displayed on the heat map.
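
The heat map value in panel b can be illustrated with a short sketch. The caption does not spell out the exact quantity being compared, so this example assumes it is the fraction of genes carrying a given Pfam motif among the genes localized to a rank, relative to the same fraction among all preys; the significance test is not shown, and all gene names are hypothetical.

```python
from math import log2


def motif_log2_fold_change(rank_genes, all_genes, motif_genes):
    """log2 of (motif frequency in the rank) / (motif frequency among all preys).

    Returns None when either frequency is zero or a gene set is empty,
    since the fold change is then undefined on a log scale.
    """
    rank_genes, all_genes, motif_genes = set(rank_genes), set(all_genes), set(motif_genes)
    if not rank_genes or not all_genes:
        return None
    frac_rank = len(rank_genes & motif_genes) / len(rank_genes)
    frac_all = len(all_genes & motif_genes) / len(all_genes)
    if frac_rank == 0 or frac_all == 0:
        return None
    return log2(frac_rank / frac_all)


# Hypothetical example: 8 preys in total, 4 carry the motif, 2 of them sit in one rank.
all_preys = {f"G{i}" for i in range(1, 9)}
with_motif = {"G1", "G2", "G3", "G4"}
rank = {"G1", "G2"}
value = motif_log2_fold_change(rank, all_preys, with_motif)
```

In this toy case every gene in the rank carries the motif against half of all preys, a two-fold enrichment, so the value is 1.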

Extended Data Fig. 5 Localization benchmarking and experimental validation.

a, Percentage of genes localized to a previously known compartment for each specificity tier using our NMF and SAFE pipelines, compared with the HPA¹ (www.proteinatlas.org) and the fractionation studies of Christoforou² and Itzhak³. Specificity tiers were defined by binning GO:CC terms on the basis of their information content (Methods). Tier 1 terms are the most specific, and tier 5 the least specific. b, Percentage of preys localized to a previously known compartment relative to the number of baits they were detected with for NMF and SAFE, respectively. c, Percentage of preys localized to a previously known compartment relative to the average number of spectral counts they were seen with for NMF and SAFE. Preys were binned by spectral counts. The left tick mark for each data point indicates the lower bound for the bin (inclusive) and the right tick mark the upper bound (exclusive). d, Localization prediction validation strategy and examples. Confidence rankings are as defined in Fig. 2d. Representative immunofluorescence images are shown. NMF scores across the defined ranks, categories and compartments are displayed as seen on humancellmap.org with the highest NMF category corresponding to the localization prediction.
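
The binning convention in panel c (lower bound inclusive, upper bound exclusive) can be made concrete with a small sketch. The bin edges and prey records below are hypothetical, and treating the last bin as open-ended is an assumption of this example, not something stated in the caption.

```python
from bisect import bisect_right


def bin_index(value, lower_bounds):
    """Index i of the half-open bin [lower_bounds[i], lower_bounds[i + 1]) holding value.

    lower_bounds must be sorted; the last bin is treated as open-ended (assumption).
    """
    i = bisect_right(lower_bounds, value) - 1
    if i < 0:
        raise ValueError(f"{value} is below the first bin")
    return i


def percent_known_by_bin(preys, lower_bounds):
    """preys: iterable of (average_spectral_count, is_known) pairs.

    Returns the percentage of known localizations per bin (None for empty bins).
    """
    totals = [0] * len(lower_bounds)
    known = [0] * len(lower_bounds)
    for count, is_known in preys:
        i = bin_index(count, lower_bounds)
        totals[i] += 1
        known[i] += bool(is_known)
    return [100.0 * k / t if t else None for k, t in zip(known, totals)]


# Hypothetical bins [0, 5), [5, 10), [10, inf) and prey records (illustrative only).
bounds = [0, 5, 10]
records = [(2.0, True), (4.9, False), (5.0, True), (12.0, True)]
percentages = percent_known_by_bin(records, bounds)
```

A prey with an average of exactly 5 spectral counts falls into the second bin, not the first, which is the inclusive-lower, exclusive-upper behaviour the tick marks describe.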

Extended Data Fig. 6 Topology and moonlighting analysis.

a–c, Predicted versus annotated proportion of protein exposed to the cytosol or lumen for ER transmembrane proteins. a, Hypothetical examples of proteins with varying proportions of their sequence exposed to the cytosol or lumen. The extent of labelling by cytosolic or lumenal baits should be directly related to the proportion of the sequence, and hence the lysine residues available for biotinylation, exposed to the respective faces of the membrane that the protein spans. b, All transmembrane-domain-containing prey proteins localized to the cytosolic face of the ER (NMF compartments 3 and 15) and the lumenal face (NMF compartment 6) were assigned a CLR score on the basis of their NMF profile (313 proteins). The CLR score of a prey is calculated by dividing its score in the cytosolic-facing compartment by the maximum score in that compartment, and subtracting the corresponding normalized score in the lumenal compartment. A score closer to 1 would indicate a protein with a signature at the cytosolic face of the ER membrane but little or no signature in the lumen and a score of −1 would indicate the opposite. A similar sequence-based score was calculated as the fraction of the sequence annotated as cytosolic minus the fraction that is lumenal according to UniProt. KTN1 is mis-annotated in UniProt⁵⁶ and should have a sequence score of +0.9742. c, Three examples of proteins with their topology, CLR and sequence scores. Green examples have predictions matching annotated topology. d, e, Moonlighting and connections between compartments. d, Primary and secondary localizations of moonlighting preys. Preys with a score of at least 0.15 in each of two non-contiguous NMF compartments were considered to moonlight (a list of non-contiguous compartments is in Supplementary Table 15). The number of preys with a primary localization defined on the vertical axis and a secondary localization defined on the horizontal axis is shown (maximum 18).
e, Inter-compartment edges were counted for each NMF rank/compartment. An interaction edge was defined between prey pairs having a correlation score across all NMF compartments of at least 0.9. Edges were then defined as ‘intra-compartment’ (if the primary localization for the two preys was the same compartment) or ‘inter-compartment’ (if the primary localization for the two preys was in different compartments) (Supplementary Table 15). Most organelles displayed a much greater proportion of intra-compartment interactions, with the extreme case of the mitochondrial matrix having only 15 inter-compartmental edges out of a total of 37,387 edges. The proportion of inter-compartment edges from the source to each target compartment is shown here. Inter-compartmental edges generally conformed with expectations, for example with edges from the chromatin compartment connecting to other nuclear substructures with which they may exchange components. The NMF rank number is shown in brackets next to the source compartment name.
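
The CLR and sequence scores described in panel b can be written out as a minimal sketch. It assumes each prey has a single cytosolic-face score and a single lumenal-face score; how the two cytosolic-facing NMF compartments (3 and 15) are combined into one value is not stated in the caption, so that choice is left to the caller. Prey names and values are hypothetical.

```python
def clr_scores(cytosolic, lumenal):
    """Return {prey: CLR score}, where CLR = cyt / max(cyt) - lum / max(lum).

    cytosolic, lumenal: {prey: NMF score} for the cytosolic and lumenal faces.
    Scores fall in [-1, 1]; missing preys are treated as having a score of 0.
    """
    max_cyt = max(cytosolic.values(), default=0.0)
    max_lum = max(lumenal.values(), default=0.0)
    scores = {}
    for prey in cytosolic.keys() | lumenal.keys():
        cyt = cytosolic.get(prey, 0.0) / max_cyt if max_cyt else 0.0
        lum = lumenal.get(prey, 0.0) / max_lum if max_lum else 0.0
        scores[prey] = cyt - lum
    return scores


def sequence_score(n_cytosolic, n_lumenal, length):
    """Fraction of residues annotated as cytosolic minus the fraction annotated as lumenal."""
    return (n_cytosolic - n_lumenal) / length


# Hypothetical NMF scores (illustrative only, not values from the paper).
cyt = {"PREY_A": 0.8, "PREY_B": 0.2, "PREY_C": 0.0}
lum = {"PREY_A": 0.0, "PREY_B": 0.1, "PREY_C": 0.5}
clr = clr_scores(cyt, lum)
```

PREY_A scores +1 (purely cytosolic signature), PREY_C scores −1 (purely lumenal) and PREY_B sits near 0, so the CLR score can be compared directly with the sequence-based score, which lies on the same −1 to +1 scale.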

Source data

Extended Data Fig. 7 Comparison of prey profiles for LMNA tagged with BioID, miniTurbo and TurboID.

a, Spectral counts for significant preys (FDR ≤ 0.01) were plotted for LMNA-BioID versus LMNA-miniTurbo. The average spectral count detected in controls was subtracted from the spectral count detected for each prey, and the resulting value was plotted. Zero values were set to 0.05 so that the axes could be log-transformed. b, LMNA-BioID versus LMNA-TurboID. c, LMNA-miniTurbo versus LMNA-TurboID.
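
The control subtraction and zero replacement used for these plots can be sketched as below. The function names are hypothetical, and the handling of negative values is an assumption: the legend only states that zero values were set to 0.05.

```python
import math

FLOOR = 0.05  # value substituted for zeros so that log axes are defined


def control_subtracted(prey_count, control_mean, floor=FLOOR):
    """Subtract the control average from a prey's spectral count."""
    value = prey_count - control_mean
    # The legend mentions only zeros; treating negative values the
    # same way is an assumption made for this sketch.
    return value if value > 0 else floor


def log10_axis_value(prey_count, control_mean):
    """Position of a prey on a log10-transformed axis."""
    return math.log10(control_subtracted(prey_count, control_mean))
```

A prey detected with 10 spectral counts against a control average of 4 would be plotted at 6, whereas a prey whose count equals the control average would be plotted at 0.05.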

Source data

Extended Data Fig. 8 Analysis of mitochondria–ER contact site candidates.

a, Heat map of genes with a primary localization at the mitochondrial outer membrane and ER membrane/nuclear outer membrane and a secondary localization to the other compartment as computed by NMF. To be included on the heat map, genes required an NMF score of at least 0.15 in the compartments of interest, a score ratio of at least 0.4 between the primary and secondary localization, and a score ratio of at least 2 between the compartments of interest and all other compartments. Genes in bold were selected for mitochondrial morphology assays in the following panels. A grey dot on the right side of the plot indicates proteins involved in lipid and cholesterol homeostasis, and a pink dot indicates calcium signalling. b, Dot plot view of BioID data for mito–ER contact site candidates highlighting recovery of mitochondrial fission machinery, mito–ER tethers and outer mitochondrial membrane proteins. Asterisks on the heat map indicate spectral counts for prey genes corresponding to the bait; these were ignored by SAINT because peptides derived from the bait itself prevent accurate estimation of its abundance as an interactor. c, Mitochondrial morphology is altered by transient expression of GFP-tagged CHMP7 and C18orf32, as monitored by confocal immunofluorescence microscopy in HeLa cells. Cells were fixed and probed with antibodies directed against GFP and COXIV (Methods). The white box indicates the zoomed area displayed in the rightmost panels. Scale bars, 10 μm.

Extended Data Fig. 9 Analysis module at humancellmap.org.

a, Screenshot of the analysis report for the bait PIK3R1. Red circles indicate the following: (1) Baits from the humancellmap are sorted from most similar to least similar as calculated by the Jaccard distance. (2) The ten most similar baits to the query in the humancellmap. (3) The spectral counts for each prey averaged across all baits in the humancellmap database. (4) Expected localizations of the ten most similar baits. (5) Overlap or similarity metrics between the query bait and the top ten most similar baits in the humancellmap. The distance is the Jaccard distance, with a score of 0 for complete prey overlap and 1 for no overlap. The intersection refers to the number of shared preys, and the union refers to the combined number of preys between the query and the indicated bait. (6) The most specific preys for the query. The specificity score is calculated as the fold enrichment of a prey in the query relative to the average across the humancellmap baits used for the comparison. (7) The specificity score calculated against the top ten most similar baits to the query. (8) The specificity score calculated against all baits in the humancellmap. (9) Links to open the heat map or specificity plots at the interactive viewer at ProHits-viz³⁹. (10) Links for data downloads. b, Specificity plot for RNGTT showing the control-subtracted spectral counts versus the specificity score (calculation of the specificity score is described in the Methods). RNGTT is a nuclear protein involved in mRNA capping previously profiled by BioID⁷. Humancellmap analysis reported a nuclear localization, with bait-specific interactions including several RNA polymerase II subunits and components of the catalytic subunit of the PP4 phosphatase, as previously reported^57,58. c, Exploratory analysis of FAM171A1 reveals links to the cytoskeleton. FAM171A1 was predicted by our NMF and SAFE analyses to localize to the cell junction and plasma membrane.
Consistent with this prediction, its BioID profile when screened as a bait was most similar to junctional and plasma membrane baits, whereas bait-specific preys included several cytoskeletal proteins, in line with a previous study⁵⁹ that reported a reduction of actin stress fibres after knockdown of FAM171A1. d, Specificity plot of MTFR2 showing the high specificity of proteins involved in mitochondrial dynamics. MTFR2 was associated with the mitochondrial outer membrane and peroxisome as a prey protein, with a weak signature at the mitochondrial inner membrane or mitochondrial intermembrane space. When profiled as a bait, the analysis module reports that it is most similar to peroxisomal baits, followed by mitochondrial outer and inner membrane baits, supporting its predicted localization. Interactions with MTFR1, SLC25A46 and VPS13D were found to be highly specific to MTFR2, consistent with the mitochondrial fragmentation previously observed after overexpression of GFP–MTFR2⁶⁰. e, BRD3 relocalization after JQ1 treatment. BirA-tagged BRD3 was treated with vehicle or JQ1 for 24 h (data from ref. ²⁵) and analysed using the analysis module at humancellmap.org. The Jaccard indices (1 − Jaccard distance) for the top 20 most similar baits were used to create networks in Cytoscape⁴¹ using an edge-weighted spring-embedded layout. Humancellmap baits are coloured on the basis of their expected localization to chromatin or the nucleolus.
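
The similarity metrics reported by the analysis module (panel a, items 5 and 6) can be illustrated with a short sketch. The function names are hypothetical, and the specificity function follows only the one-sentence description in this legend (fold enrichment relative to the average across comparison baits). The full calculation is described in the Methods and may differ; returning infinity when the background average is zero is an assumption.

```python
import math


def jaccard_distance(query_preys, bait_preys):
    """0 for complete prey overlap, 1 for no overlap."""
    query, bait = set(query_preys), set(bait_preys)
    union = query | bait
    if not union:
        return 0.0  # two empty prey lists are treated as identical
    return 1 - len(query & bait) / len(union)


def fold_enrichment(query_count, comparison_counts):
    """Prey abundance in the query relative to its average across other baits."""
    background = sum(comparison_counts) / len(comparison_counts)
    if background == 0:
        return math.inf  # assumption: prey absent from every comparison bait
    return query_count / background
```

The Jaccard index used to build the networks in panel e is `1 - jaccard_distance(...)`. For example, prey sets {A, B, C} and {B, C, D} share two of four preys, giving a distance of 0.5 and an index of 0.5.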

Source data

Extended Data Fig. 10 BirA–Flag and GFP–BirA–Flag control stable cell line, and LMNA-BirA–Flag and AIFM1-BirA–Flag bait stable cell line immunofluorescence.

Cell lines were probed by confocal immunofluorescence microscopy in HEK293 Flp-In T-REx stable cells to assay for localization of the fusion construct and general biotinylation. Cells were fixed and then probed with an antibody to the Flag epitope and streptavidin for biotinylated proteins (Methods). The green channel represents nuclear or mitochondrial staining, the red channel denotes Flag and the blue channel represents streptavidin (biotinylated proteins). Scale bars, 10 μm.

Supplementary information

Reporting Summary

Supplementary Tables 1–21

This folder contains 21 .xlsx files (Supplementary Tables 1–21) and an SI guide containing table legends.

Source data

Source Data Fig. 2

Source Data Fig. 3

Source Data Extended Data Fig. 2

Source Data Extended Data Fig. 5

Source Data Extended Data Fig. 6

Source Data Extended Data Fig. 7

Source Data Extended Data Fig. 9

About this article

Cite this article

Go, C.D., Knight, J.D.R., Rajasekharan, A. et al. A proximity-dependent biotinylation map of a human cell. Nature 595, 120–124 (2021). https://doi.org/10.1038/s41586-021-03592-2

Received: 25 September 2019
Accepted: 29 April 2021
Published: 02 June 2021
Issue Date: 01 July 2021
DOI: https://doi.org/10.1038/s41586-021-03592-2
