A fuzzy granular sparse learning model for identifying antigenic variants of influenza viruses
Applied Soft Computing (IF 5.472), Pub Date: 2021-06-10, DOI: 10.1016/j.asoc.2021.107573
Yumin Chen, Zhiwen Cai, Lei Shi, Wei Li

Sparse learning has significant applications in statistics, big data, bioinformatics and machine learning. In big data systems, large amounts of redundant, missing and noisy data cause sparsity, and rapid changes in information result in uncertainty. Because traditional sparse learning models have difficulty handling uncertain data, we propose a Fuzzy Granular Sparse Learning (FGSL) model for identifying antigenic variants of influenza viruses. First, fuzzy set theory is introduced to measure and granulate the influenza viruses. Fuzzy granules are induced by fuzzy granulation on single features. A fuzzy granular vector is then constructed from these fuzzy granules, and fuzzy granular regression is presented. Constraint norms are proposed for granules and granular vectors: two granule norms and four granular vector norms. The FGSL model is then constructed from granular regression and these constraint norms; it includes granular ridge and lasso regressions under different constraint norms. Furthermore, we prove the derivative forms of the two granular regression functions, guaranteeing the convergence of the FGSL model. The optimization problem of the FGSL model is discussed, and two gradient descent algorithms are designed for it. Finally, we apply the FGSL model to serologic data and hemagglutinin sequences to learn antigenicity-associated mutations and infer antigenic variants. The experimental results confirm advantages of the FGSL model: fast convergence, low RMSE and strong feature selection ability. We successfully identify antigenic variants of influenza viruses with the FGSL model.
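
The pipeline outlined in the abstract (single-feature fuzzy granulation, granular vectors, then ridge or lasso regression fitted by gradient descent) can be illustrated with a rough sketch. This is not the authors' FGSL formulation: the abstract does not give the granule definitions, the granule norms, or the two algorithms. The similarity measure `1 - |a - b|` on features scaled to [0, 1], the use of reference samples, the plain L1/L2 penalties on ordinary weights, and all function names and the toy data below are assumptions made only for illustration.

```python
def similarity_granule(value, references):
    # Assumed fuzzy similarity for features scaled to [0, 1]:
    # 1 - |value - reference| for each reference value of one feature.
    return [1.0 - abs(value - r) for r in references]


def granular_vector(sample, reference_samples):
    # Concatenate the single-feature granules of one sample.
    vec = []
    for j, value in enumerate(sample):
        refs = [ref[j] for ref in reference_samples]
        vec.extend(similarity_granule(value, refs))
    return vec


def predict(x, w):
    return sum(wj * xj for wj, xj in zip(w, x))


def fit(X, y, penalty="ridge", lam=0.01, lr=0.1, epochs=2000):
    # Full-batch gradient descent on mean squared error plus a penalty.
    # "ridge" adds lam * ||w||^2; anything else uses an L1 subgradient.
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        residuals = [predict(x, w) - t for x, t in zip(X, y)]
        grads = []
        for j in range(d):
            g = 2.0 / n * sum(r * x[j] for r, x in zip(residuals, X))
            if penalty == "ridge":
                g += 2.0 * lam * w[j]
            else:
                g += lam * ((w[j] > 0) - (w[j] < 0))
            grads.append(g)
        w = [wj - lr * gj for wj, gj in zip(w, grads)]
    return w


def rmse(X, y, w):
    n = len(X)
    return (sum((predict(x, w) - t) ** 2 for x, t in zip(X, y)) / n) ** 0.5


# Hypothetical toy data: four samples, two features scaled to [0, 1].
samples = [[0.0, 0.2], [0.5, 0.9], [1.0, 0.4], [0.25, 0.6]]
targets = [0.0, 0.5, 1.0, 0.25]
references = samples[:2]  # assumed choice of reference samples

X = [granular_vector(s, references) for s in samples]
w_ridge = fit(X, targets, penalty="ridge")
w_lasso = fit(X, targets, penalty="lasso")
baseline = rmse(X, targets, [0.0] * len(X[0]))
ridge_error = rmse(X, targets, w_ridge)
```

In this sketch each sample becomes a vector of similarities to the reference samples, one block per feature, and the penalty choice switches between ridge-style shrinkage and lasso-style sparsity. The granule-level norms and convergence guarantees described in the abstract are not reproduced here.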




Updated: 2021-06-10