SCNrank: spectral clustering for network-based ranking to reveal potential drug targets and its application in pancreatic ductal adenocarcinoma
BMC Medical Genomics ( IF 2.1 ) Pub Date : 2020-04-03 , DOI: 10.1186/s12920-020-0681-6
Enze Liu , Zhuang Zhuang Zhang , Xiaolin Cheng , Xiaoqi Liu , Lijun Cheng

Pancreatic ductal adenocarcinoma (PDAC) is the most common pancreatic malignancy. Owing to its wide heterogeneity, PDAC is aggressive and responds poorly to most chemotherapies, creating an urgent need for new therapeutic strategies. Cell lines have been used as the foundation for drug development and disease modeling. CRISPR-Cas9 plays a key role in every step of drug discovery, from target identification and validation to preclinical cancer cell testing. Using cell-line models and CRISPR-Cas9 technology together makes drug target prediction feasible. However, there is still a large gap between predicted results and actionable targets in real tumors. Biological network models provide a good means to mimic genetic interactions in real biological systems, which can benefit gene perturbation studies and potential target identification for treating PDAC. Nevertheless, building a network model that takes cell-line data and CRISPR-Cas9 data as input and accurately predicts potential targets that will respond well in real tissue remains an unsolved problem. We developed a novel algorithm, 'Spectral Clustering for Network-based target Ranking' (SCNrank), that systematically integrates three types of data to prioritize potential drug targets for PDAC: expression profiles from tumor tissue, normal tissue and PDAC cell lines; a protein-protein interaction (PPI) network; and CRISPR-Cas9 data. The algorithm can be divided into three steps: 1. Using the STRING PPI network as a skeleton, SCNrank constructs tissue-specific networks for PDAC tumor and normal pancreas tissues from expression profiles. 2. With the same network skeleton, SCNrank constructs cell-line-specific networks using the PDAC cell-line expression profiles and CRISPR-Cas9 data from pancreatic cancer cell lines. 3. SCNrank applies a novel spectral clustering approach to reduce data dimension and generate gene clusters that carry common features from both networks.
Finally, SCNrank applies a scoring scheme called the 'Target Influence score' (TI), which estimates a given target's influence on the cluster it belongs to, to score and rank each drug target. We applied SCNrank to 263 expression profiles, CRISPR-Cas9 data from 22 different pancreatic cancer cell lines, and the STRING PPI network. With SCNrank, we constructed an integrated tissue PDAC network and an integrated cell-line PDAC network, both of which contain 4414 selected genes that are overexpressed in tumor tissue samples. After clustering, the 4414 genes are distributed into 198 clusters, which include 367 targets of FDA-approved drugs. These drug targets are all scored and ranked by their TI scores, which we defined to measure their influence on the network. We validated the top-ranked targets in three ways. First, we mapped them onto the existing clinical drug targets of PDAC to measure concordance. Second, we performed enrichment analysis on these drug targets and the clusters they belong to, to reveal functional associations between clusters and PDAC. Third, we performed survival analysis for the top-ranked targets to connect targets with clinical outcomes. Survival analysis reveals that overexpression of three top-ranked genes, PGK1, HMMR and POLE2, significantly increases the risk of death in PDAC patients. SCNrank is an unbiased algorithm that systematically integrates multiple types of omics data to select and rank potential drug targets. SCNrank shows great capability in predicting drug targets for PDAC. Pancreatic cancer-associated gene candidates predicted by our SCNrank approach have the potential to guide genetics-based anti-pancreatic drug discovery.
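
To make the clustering-then-scoring idea concrete, here is a minimal sketch of generic spectral clustering on a small weighted gene network, followed by a simple within-cluster score. It is not the authors' implementation. The abstract does not specify the Laplacian variant, how edge weights are derived from expression or CRISPR-Cas9 data, or the TI formula, so the normalized Laplacian, the toy edge weights, the node indices, and the `within_cluster_influence` stand-in are all assumptions made for illustration.

```python
import numpy as np


def kmeans(points, k, n_iter=50):
    """Lloyd's k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    for _ in range(1, k):
        # Squared distance from every point to its nearest chosen center.
        nearest = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(nearest)])
    centers = np.array(centers)
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        updated = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(updated, centers):
            break
        centers = updated
    return labels


def spectral_clusters(adjacency, n_clusters):
    """Cluster nodes of a weighted undirected graph via the normalized Laplacian."""
    A = np.asarray(adjacency, dtype=float)
    degree = A.sum(axis=1)
    # Isolated nodes (degree 0) get a zero scaling factor instead of dividing by zero.
    inv_sqrt = np.where(degree > 0, 1.0 / np.sqrt(np.where(degree > 0, degree, 1.0)), 0.0)
    laplacian = np.eye(len(A)) - inv_sqrt[:, None] * A * inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; keep the smallest n_clusters vectors.
    _, vectors = np.linalg.eigh(laplacian)
    embedding = vectors[:, :n_clusters]
    norms = np.linalg.norm(embedding, axis=1, keepdims=True)
    embedding = embedding / np.where(norms > 0, norms, 1.0)
    return kmeans(embedding, n_clusters)


def within_cluster_influence(adjacency, labels):
    """Hypothetical stand-in score (NOT the paper's TI): a node's share of the
    edge weight that stays inside its own cluster."""
    A = np.asarray(adjacency, dtype=float)
    same_cluster = labels[:, None] == labels[None, :]
    intra = (A * same_cluster).sum(axis=1)
    totals = np.array([intra[labels == lab].sum() for lab in labels])
    return np.divide(intra, totals, out=np.zeros_like(intra), where=totals > 0)


# Toy network (made-up weights): two tightly connected gene triangles
# joined by one weak edge between node 2 and node 3.
A = np.zeros((6, 6))
edges = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
         (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0), (2, 3, 0.05)]
for i, j, w in edges:
    A[i, j] = A[j, i] = w

labels = spectral_clusters(A, n_clusters=2)
scores = within_cluster_influence(A, labels)
print(labels, scores)
```

In this toy case the weak bridge separates the two triangles, and every node carries an equal share of its cluster's internal weight. A real analysis would replace the toy matrix with edge weights derived from the expression and CRISPR-Cas9 data, and the stand-in score with the TI definition from the paper.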

Updated: 2020-04-22