Analysis of Simple K-Mean and Parallel K-Mean Clustering for Software Products and Organizational Performance Using Education Sector Dataset, Scientific Programming


Analysis of Simple K-Mean and Parallel K-Mean Clustering for Software Products and Organizational Performance Using Education Sector Dataset
Scientific Programming, Pub Date: 2021-05-17, DOI: 10.1155/2021/9988318
Rui Shang ₁, Balqees Ara ₂, Islam Zada ₂, Shah Nazir ₃, Zaid Ullah ₄, Shafi Ullah Khan ₅


Context. Educational Data Mining (EDM) is an emerging research area. Data mining techniques are applied in education to extract useful information about the progress and behavior of employees or students. The recent increase in the availability of learning data has given educational data mining importance and momentum as a way to better understand and optimize the learning process and the environments in which it takes place.

Objective. Data are the most valuable commodity for any organization, yet it is very difficult to extract useful information from a large collection of data. Data mining techniques are used to forecast and evaluate the academic performance of students based on their academic records and their participation in forums. Although several studies have evaluated the academic performance of students worldwide, few have assessed the factors that can boost that performance.

Methodology. The current study sought to weigh the factors that contribute to improving student academic performance in Pakistan. In this paper, both the simple and the parallel clustering techniques are implemented and analyzed to point out their best features. The Parallel K-Mean algorithm overcomes the problems of the simple algorithm, and its outcomes are always the same from run to run, which improves cluster quality, the number of iterations, and elapsed time.

Results. Both algorithms are tested and compared on datasets of 10,000 and 5,000 integer data items. Each dataset is evaluated 10 times for minimum elapsed time, with the value of K varying from 1 to 10. The authors state that the proposed approach is useful for sorting scientific research data and yields more accurate statistics on such data.
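The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`init_centroids`, `partial_sums`, `kmeans`, `min_elapsed`), the deterministic starting centroids, and the use of a thread pool for the assignment step are all assumptions made for illustration. The sketch assumes a non-empty list of integers, runs a one-dimensional Lloyd iteration either serially (`workers=1`) or with the data split into chunks that are processed concurrently, and reports the minimum elapsed time over 10 runs.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def init_centroids(data, k):
    """Deterministic start (an assumption): k evenly spaced values of the sorted data."""
    s = sorted(data)
    return [float(s[(2 * i + 1) * len(s) // (2 * k)]) for i in range(k)]


def partial_sums(chunk, centroids):
    """Assign each point in the chunk to its nearest centroid.

    Returns per-cluster integer sums and counts for this chunk only.
    """
    k = len(centroids)
    sums, counts = [0] * k, [0] * k
    for x in chunk:
        j = min(range(k), key=lambda c: abs(x - centroids[c]))
        sums[j] += x
        counts[j] += 1
    return sums, counts


def kmeans(data, k, workers=1, max_iter=100):
    """1-D K-means; workers > 1 spreads the assignment step over a pool."""
    centroids = init_centroids(data, k)
    size = -(-len(data) // workers)  # ceiling division
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for iteration in range(1, max_iter + 1):
            # Map step: each chunk is reduced to partial sums and counts.
            parts = list(pool.map(partial_sums, chunks, [centroids] * len(chunks)))
            # Reduce step: merge the partial results into new centroids.
            new = []
            for j in range(k):
                total = sum(p[0][j] for p in parts)
                count = sum(p[1][j] for p in parts)
                # An empty cluster keeps its previous centroid.
                new.append(total / count if count else centroids[j])
            if new == centroids:
                break
            centroids = new
    return centroids, iteration


def min_elapsed(data, k, workers, repeats=10):
    """Minimum wall-clock time over `repeats` runs of kmeans."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        kmeans(data, k, workers)
        best = min(best, time.perf_counter() - start)
    return best
```

Calling `min_elapsed(data, k, workers=1)` and `min_elapsed(data, k, workers=4)` for each `k` from 1 to 10, on lists of 5,000 and 10,000 integers, would mirror the experiment outlined above. Because the inputs are integers and the starting centroids are fixed, the serial and chunked variants return identical centroids. In CPython, threads are unlikely to shorten the elapsed time of this CPU-bound loop; swapping in `ProcessPoolExecutor` from the same module is one way to obtain real parallelism.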


Updated: 2021-05-17
