Hydrophobicity diversity in globular and nonglobular proteins measured with the Gini index.
Protein Engineering, Design and Selection (IF 2.4), Pub Date: 2017-12-12, DOI: 10.1093/protein/gzx060
Oliviero Carugo

Amino acids and their properties are variably distributed in proteins, and different compositions determine all protein features, ranging from solubility to stability and functionality. The Gini index, a tool to estimate distribution uniformity, is widely used in macroeconomics and has numerous statistical applications. Here, the Gini index is used to analyze the distribution of hydrophobicity in proteins and to compare hydrophobicity distribution in globular and intrinsically disordered proteins. Based on the analysis of carefully selected high-quality data sets of proteins extracted from the Protein Data Bank (http://www.rcsb.org) and from the DisProt database (http://www.disprot.org/), it is observed that hydrophobicity is distributed in a more diverse way in intrinsically disordered proteins than in folded and soluble globular proteins. This correlates with the observation that the amino acid composition deviates from uniformity (estimated with the Shannon and the Gini-Simpson indices) more in intrinsically disordered proteins than in globular and soluble proteins. Although statistical tools like the Gini index have received little attention in molecular biology, these results show that they allow one to estimate sequence diversity and that they are useful to delineate trends that can hardly be described otherwise in simple and concise ways.
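For readers unfamiliar with the three indices named in the abstract, the following is a minimal sketch of how they could be computed for a single sequence. It is not the author's procedure: the abstract does not state which hydrophobicity scale was used or how values were preprocessed. The Kyte-Doolittle scale, the shift that makes all values non-negative before the Gini index is taken, the natural-log base for the Shannon index, and the function names (`gini`, `hydrophobicity_gini`, `shannon`, `gini_simpson`) are all assumptions made for illustration.

```python
import math
from collections import Counter

# Kyte-Doolittle hydropathy values (ASSUMED scale; the abstract names none).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gini(values):
    """Gini index of non-negative values: 0 means perfectly uniform."""
    xs = sorted(values)
    n = len(xs)
    if n == 0:
        return 0.0
    if xs[0] < 0:
        raise ValueError("Gini index requires non-negative values")
    total = sum(xs)
    if total == 0:
        return 0.0  # all values are zero, hence uniform
    # Rank-weighted sum over the ascending-sorted values (ranks start at 1).
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def hydrophobicity_gini(seq):
    """Gini index of per-residue hydrophobicity along a sequence.

    ASSUMPTION: values are shifted so the scale minimum maps to 0,
    because the Gini index is only defined for non-negative data.
    """
    shift = -min(KD.values())
    return gini([KD[aa] + shift for aa in seq])


def shannon(seq):
    """Shannon index of amino acid composition (natural log, ASSUMED base)."""
    n = len(seq)
    return -sum((c / n) * math.log(c / n) for c in Counter(seq).values())


def gini_simpson(seq):
    """Gini-Simpson index: probability that two random residues differ."""
    n = len(seq)
    return 1.0 - sum((c / n) ** 2 for c in Counter(seq).values())
```

Under these assumptions, a sequence made of a single residue type scores 0 on all three indices, while a larger hydrophobicity Gini value indicates that hydrophobicity is spread less evenly across the residues of the sequence.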

Updated: 2019-11-01