Metric Labeling and Semimetric Embedding for Protein Annotation Prediction
Journal of Computational Biology (IF 1.4), Pub Date: 2021-05-20, DOI: 10.1089/cmb.2020.0425
Emre Sefer, Carl Kingsford

Computational techniques have been successful at predicting protein function from relational data (functional or physical interactions). These techniques have been used to generate hypotheses and to direct experimental validation. With few exceptions, the task is modeled as a multilabel classification problem in which the labels (functions) are treated independently or semi-independently. However, databases such as the Gene Ontology provide information about the similarities between functions. We explore the use of the Metric Labeling combinatorial optimization problem, which lets us exploit heuristically computed distances between functions, to predict protein function more accurately in networks derived from both physical interactions and a combination of other data types. Because Metric Labeling requires the label distances to form a metric, we give a new technique (based on convex optimization) for converting heuristic semimetric distances into a metric with minimum least-squared distortion (LSD). The Metric Labeling approach is shown to outperform five existing techniques for inferring function from networks. These results suggest that Metric Labeling is useful for protein function prediction, and that LSD minimization can help solve the problem of converting heuristic distances to a metric.
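
To make the LSD idea concrete, the following is a minimal sketch of turning a semimetric into a nearby metric, not the authors' method: the abstract says only that their technique is based on convex optimization. The sketch assumes a symmetric, zero-diagonal dissimilarity matrix and substitutes a standard cyclic-projection scheme (Dykstra/Hildreth style) that repeatedly enforces each triangle inequality while minimizing the sum of squared changes over all pairs. The function names and the toy distances are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def triangle_violations(m, eps=1e-9):
    """List every (i, j, k) with m[i][j] > m[i][k] + m[k][j] + eps."""
    n = len(m)
    return [(i, j, k)
            for i, j in combinations(range(n), 2)
            for k in range(n)
            if k not in (i, j) and m[i][j] > m[i][k] + m[k][j] + eps]


def squared_distortion(d, m):
    """Sum of squared differences over unordered pairs i < j."""
    return sum((m[i][j] - d[i][j]) ** 2
               for i, j in combinations(range(len(d)), 2))


def nearest_metric(d, max_sweeps=10000, tol=1e-10):
    """Least-squares repair of a symmetric, zero-diagonal dissimilarity matrix.

    Illustrative assumption: cyclic projection onto each half-space
    m[i][j] <= m[i][k] + m[k][j], remembering one correction per constraint.
    """
    n = len(d)
    m = [[float(x) for x in row] for row in d]
    correction = {}  # (i, j, k) -> size of the last correction for that constraint

    def shift(a, b, amount):
        # Keep the matrix symmetric while changing one pairwise distance.
        m[a][b] += amount
        m[b][a] = m[a][b]

    for _ in range(max_sweeps):
        largest_step = 0.0
        for i, j in combinations(range(n), 2):
            for k in range(n):
                if k in (i, j):
                    continue
                old = correction.get((i, j, k), 0.0)
                # Violation as it would be with this constraint's last correction undone.
                violation = m[i][j] - m[i][k] - m[k][j] + 3.0 * old
                new = max(0.0, violation / 3.0)
                step = new - old
                if step != 0.0:
                    shift(i, j, -step)  # shrink the long side
                    shift(i, k, step)   # stretch the two short sides
                    shift(k, j, step)
                correction[(i, j, k)] = new
                largest_step = max(largest_step, abs(step))
        if largest_step < tol:
            break
    return m


if __name__ == "__main__":
    # Hypothetical distances between four functions; 5 > 1 + 1 breaks the triangle inequality.
    toy = [[0, 5, 1, 1],
           [5, 0, 1, 1],
           [1, 1, 0, 1],
           [1, 1, 1, 0]]
    repaired = nearest_metric(toy)
    print(repaired)
    print("violations:", triangle_violations(repaired, eps=1e-6))
    print("squared distortion:", squared_distortion(toy, repaired))
```

In the three-point case with distances 5, 1, and 1, the single violated inequality is repaired by moving each of the three distances by the same amount, giving 4, 2, and 2. Spreading the change evenly is what a least-squares criterion favors, in contrast to a shortest-path closure, which would shrink only the long distance.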

Updated: 2021-05-22