Granular Mining and Big Data Analytics: Rough Models and Challenges
Proceedings of the National Academy of Sciences, India Section A: Physical Sciences (IF 0.8) Pub Date: 2019-01-11, DOI: 10.1007/s40010-018-0578-3
Sankar K. Pal

Data analytics in a granular computing framework is considered for several mining applications, such as video analysis, bioinformatics and online social networks, which have all the characteristics of Big data. The role of granulation, lower approximation and rf information measure is exhibited. While the lower approximation over a video sequence signifies the object model for unsupervised tracking, it characterizes the probability (relative frequency) of definite regions in ranking miRNAs for normal and cancer classification. For neural learning, the information on the definite region is used as the initial knowledge for encoding while generating the networks through evolution. The granules considered are of different sizes and dimensions, with fuzzy and crisp boundaries. The tracking method is effective in handling different ambiguous situations in unsupervised mode, e.g., overlapping objects, newly appeared object(s), and multiple objects moving in different directions and at different speeds. The ranking algorithm could select just 1% of the miRNAs and still yield a significantly higher F-score than the entire set. Fuzzy–rough communities detected over the granular model of social networks are suitable for dealing with overlapping virtual communities in Big data. Knowledge encoding based on fuzzy–rough sets performs better than encoding based on rough sets. Future directions of research and challenges, including the significance of z-numbers in the precisiation of granules, are stated. The article includes some results published elsewhere.
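
For readers unfamiliar with the terms, the following is a minimal sketch of granulation and lower approximation in the classical (crisp) rough-set sense. It is not the article's method: the function names, the toy universe, and the indiscernibility key are all assumptions made for illustration, and the fuzzy granules mentioned in the abstract are not modeled. The "definite fraction" computed at the end is only a simple relative frequency of the definite region, shown to make the idea concrete.

```python
from collections import defaultdict


def granulate(objects, key):
    """Group objects into granules: objects with the same key are indiscernible.

    `key` is a hypothetical indiscernibility function chosen by the caller.
    """
    granules = defaultdict(set)
    for obj in objects:
        granules[key(obj)].add(obj)
    return list(granules.values())


def lower_approximation(granules, target):
    """Union of granules wholly contained in `target` (the definite region)."""
    result = set()
    for granule in granules:
        if granule <= target:
            result |= granule
    return result


def upper_approximation(granules, target):
    """Union of granules that overlap `target` (the possible region)."""
    result = set()
    for granule in granules:
        if granule & target:
            result |= granule
    return result


# Toy universe of 10 objects, granulated into pairs: {0,1}, {2,3}, ..., {8,9}.
universe = range(10)
granules = granulate(universe, key=lambda x: x // 2)

# An arbitrary target concept to approximate (illustrative only).
target = {2, 3, 4, 6, 7}

lower = lower_approximation(granules, target)
upper = upper_approximation(granules, target)

# Relative frequency of objects that definitely belong to the target.
definite_fraction = len(lower) / len(universe)

print(sorted(lower))      # [2, 3, 6, 7]
print(sorted(upper))      # [2, 3, 4, 5, 6, 7]
print(definite_fraction)  # 0.4
```

Object 4 falls in the boundary region here: its granule {4, 5} overlaps the target without being contained in it, so it appears in the upper approximation but not the lower one.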

Updated: 2019-01-11