Feature selection in a neighborhood decision information system with application to single cell RNA data classification
Applied Soft Computing (IF 8.7), Pub Date: 2021-09-09, DOI: 10.1016/j.asoc.2021.107876
Jie Zhang, Gangqiang Zhang, Zhaowen Li, Liangdong Qu, Ching-Feng Wen

A neighborhood information system (NIS) handles an information system (IS) by means of neighborhoods, which sometimes gives it advantages over a plain IS. A neighborhood decision information system (NDIS) is an NIS with decision attributes. Single cell RNA (scRNA) data are characterized by high dimensionality, small sample size, unbalanced distribution, heavy noise and high redundancy, so selecting suitable and effective genes has become an important research topic. This paper studies feature selection in an NDIS and considers its application to scRNA data classification. We first give the distance between information values on each attribute in an NDIS. Then, based on this distance, we present tolerance relations on the object set of an NDIS. Next, we define rough approximations in an NDIS by means of these tolerance relations. Furthermore, we put forward the notions of δ-dependence degree, δ-information entropy, δ-conditional information entropy and δ-joint information entropy in an NDIS. Based on Kryszkiewicz's idea, we introduce the δ-generalized decision and consider feature selection in a consistent NDIS by decision. Finally, we study feature selection in a consistent NDIS by using dependence degree and information entropy, and design the corresponding algorithms. Experimental results on several scRNA datasets demonstrate that the designed algorithms perform well.
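The pipeline outlined in the abstract (per-attribute distance, tolerance relation, rough approximation, dependence degree, feature selection) can be illustrated with a minimal sketch. The abstract does not state the paper's actual definitions, so everything below is an assumption made for illustration: the per-attribute distance is taken to be the absolute difference of values scaled to [0, 1], two objects are treated as tolerant when that distance is at most δ on every chosen attribute, the dependence degree is the fraction of objects whose δ-neighborhood carries a single decision value, and the selection loop is a plain greedy forward search. The function names `tolerance_classes`, `dependence_degree` and `greedy_select` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Set, Tuple


def tolerance_classes(X: Sequence[Sequence[float]],
                      attrs: Sequence[int],
                      delta: float) -> List[Set[int]]:
    """For each object i, collect the objects j that lie within delta of i
    on every attribute in attrs (assumed distance: absolute difference)."""
    n = len(X)
    return [
        {j for j in range(n)
         if all(abs(X[i][a] - X[j][a]) <= delta for a in attrs)}
        for i in range(n)
    ]


def dependence_degree(X: Sequence[Sequence[float]],
                      y: Sequence[int],
                      attrs: Sequence[int],
                      delta: float) -> float:
    """Fraction of objects whose delta-neighborhood has one decision value,
    i.e. the relative size of the positive region."""
    classes = tolerance_classes(X, attrs, delta)
    positive = sum(
        1 for i, neighbors in enumerate(classes)
        if all(y[j] == y[i] for j in neighbors)
    )
    return positive / len(X)


def greedy_select(X: Sequence[Sequence[float]],
                  y: Sequence[int],
                  delta: float) -> Tuple[List[int], float]:
    """Greedy forward search: repeatedly add the attribute that raises the
    dependence degree most, and stop when no attribute improves it."""
    remaining = list(range(len(X[0])))
    selected: List[int] = []
    best = dependence_degree(X, y, selected, delta)
    while remaining:
        scored = [(dependence_degree(X, y, selected + [a], delta), a)
                  for a in remaining]
        # Ties are broken in favor of the lower attribute index.
        score, attr = max(scored, key=lambda t: (t[0], -t[1]))
        if score <= best:
            break
        selected.append(attr)
        remaining.remove(attr)
        best = score
    return selected, best


# Toy data (invented for illustration): attribute 0 separates the two
# decision classes, attribute 1 does not.
X = [[0.10, 0.50], [0.15, 0.90], [0.80, 0.55], [0.85, 0.95]]
y = [0, 0, 1, 1]
print(greedy_select(X, y, delta=0.2))
```

On the toy data the search keeps only attribute 0, since it alone already makes every neighborhood consistent with the decision. The entropy-based criteria named in the abstract would replace `dependence_degree` as the scoring function; they are omitted here because their definitions are not given in the text.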




Updated: 2021-09-13