Clustering and curation of electropherograms: an efficient method for analyzing large cohorts of capillary electrophoresis glycomic profiles for bioprocessing operations.
Beilstein Journal of Organic Chemistry (IF 2.2), Pub Date: 2020-08-27, DOI: 10.3762/bjoc.16.176
Ian Walsh 1, Matthew S F Choo 1, Sim Lyn Chiin 1, Amelia Mak 1, Shi Jie Tay 1, Pauline M Rudd 1, 2, Yang Yuansheng 3, Andre Choo 4, 5, Ho Ying Swan 5, Terry Nguyen-Khuong 1

The accurate assessment of antibody glycosylation during bioprocessing requires the high-throughput generation of large amounts of glycomics data. This allows bioprocess engineers to identify critical process parameters that control the glycosylation critical quality attributes. Advances in protocols for capillary electrophoresis-laser-induced fluorescence (CE-LIF) measurements of antibody N-glycans have increased the potential for generating large datasets of N-glycosylation values for assessment. With large cohorts of CE-LIF data, peak picking and peak area calculation remain obstacles to fast and accurate quantitation, despite the use of internal and external standards to reduce misalignment for the qualitative analysis. These problems are often due to fluctuations introduced by varying process conditions, which result in heterogeneous peak shapes. Additionally, co-eluting glycans can produce non-Gaussian peaks in some process conditions and not in others. Here, we describe an approach to quantitatively and qualitatively curate large-cohort CE-LIF glycomics data. For glycan identification, a previously reported method based on internal triple standards is used. To determine relative glycan quantities, our method uses a clustering algorithm to ‘divide and conquer’ highly heterogeneous electropherograms into similar groups, making it easier to define peaks manually. Open-source software is then used to determine the areas of the manually defined peaks. We successfully applied this semi-automated method to a dataset (containing 391 glycoprofiles) of monoclonal antibody biosimilars from a bioreactor optimization study. The key advantage of this computational approach is that all runs can be analyzed simultaneously with high accuracy in glycan identification and quantitation, and there is no theoretical limit to the scale of the method.
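
The 'divide and conquer' idea can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm or the open-source software, so everything below is an assumption made for illustration only: traces are grouped by greedy leader clustering on Pearson correlation, peak areas are computed by trapezoidal integration over manually chosen time windows, and the function names (`cluster_profiles`, `peak_area`, `relative_areas`) and the `min_corr` threshold are hypothetical. It also assumes the traces are already aligned and sampled on a common time axis.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length signal traces."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat trace carries no shape information
    return cov / sqrt(var_a * var_b)


def cluster_profiles(profiles, min_corr=0.9):
    """Greedy leader clustering (an assumed stand-in, not the paper's algorithm).

    Each trace joins the first cluster whose leader trace correlates with it
    at or above min_corr; otherwise it starts a new cluster as its leader.
    Returns one integer cluster label per trace.
    """
    leaders, labels = [], []
    for trace in profiles:
        for k, leader in enumerate(leaders):
            if pearson(trace, leader) >= min_corr:
                labels.append(k)
                break
        else:
            leaders.append(trace)
            labels.append(len(leaders) - 1)
    return labels


def peak_area(times, signal, start, end):
    """Trapezoidal area over the sample intervals lying fully inside [start, end]."""
    area = 0.0
    for i in range(1, len(times)):
        if times[i - 1] >= start and times[i] <= end:
            area += 0.5 * (signal[i - 1] + signal[i]) * (times[i] - times[i - 1])
    return area


def relative_areas(times, signal, windows):
    """Percent area of each manually defined peak window.

    windows maps a peak name to a (start, end) pair on the time axis.
    """
    areas = {name: peak_area(times, signal, s, e) for name, (s, e) in windows.items()}
    total = sum(areas.values())
    if total == 0:
        raise ValueError("no signal inside the given peak windows")
    return {name: 100.0 * a / total for name, a in areas.items()}
```

In this sketch, peak windows would be defined once per cluster and then passed to `relative_areas` for every trace carrying that cluster label, which is what makes manual peak definition tractable for a large cohort. A real pipeline would likely also need baseline correction and migration-time alignment, which are omitted here.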

Updated: 2020-08-27