Recognising innovative companies by using a diversified stacked generalisation method for website classification
Applied Intelligence (IF 5.3), Pub Date: 2019-06-22, DOI: 10.1007/s10489-019-01509-1
Marcin Michał Mirończuk, Jarosław Protasiewicz

In this paper, we propose a classification system that decides whether a company is innovative based only on its public website. As innovativeness plays a crucial role in the development of many branches of the modern economy, an increasing number of entities are expending effort to be innovative. Thus, a new issue has appeared: how can we recognise them? Grasping the idea of innovativeness is not only challenging for humans but also impossible for any known machine learning algorithm. Therefore, we propose a new indirect technique: a diversified stacked generalisation method, which is based on a combination of a multi-view approach and a genetic algorithm. The proposed approach achieves better performance than the other classification methods considered, namely (i) models trained on single datasets and (ii) a simple voting method on these models. Furthermore, in this study, we check whether an unaligned feature space improves classification results. The proposed solution has been extensively evaluated on real data collected from companies' websites. The experimental results verify that the proposed method improves the quality of classifying websites that might represent innovative companies.
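
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of stacked generalisation over several views, with a genetic algorithm choosing which base models feed the meta-learner. Everything in it is an assumption made for illustration: the synthetic data, the per-view accuracies, the perceptron meta-learner, and all function names (`make_data`, `fit_perceptron`, `evolve_mask`) are hypothetical and are not taken from the paper.

```python
import random

def make_data(n, accuracies, rng):
    """Synthetic labels plus 0/1 outputs of hypothetical base models, one per view."""
    labels = [rng.randint(0, 1) for _ in range(n)]
    outputs = [[y if rng.random() < acc else 1 - y for acc in accuracies]
               for y in labels]
    return outputs, labels

def fit_perceptron(rows, labels, epochs=20, lr=0.1):
    """Level-1 (meta) learner: a perceptron over the selected base outputs."""
    weights, bias = [0.0] * len(rows[0]), 0.0
    for _ in range(epochs):
        for row, y in zip(rows, labels):
            pred = int(sum(w * x for w, x in zip(weights, row)) + bias > 0)
            if pred != y:
                step = lr * (y - pred)
                weights = [w + step * x for w, x in zip(weights, row)]
                bias += step
    return weights, bias

def predict(model, row):
    weights, bias = model
    return int(sum(w * x for w, x in zip(weights, row)) + bias > 0)

def select(rows, mask):
    """Keep only the base-model outputs whose mask bit is 1."""
    return [[x for x, keep in zip(row, mask) if keep] for row in rows]

def accuracy(preds, labels):
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def fitness(mask, train, valid):
    """Validation accuracy of a meta-learner trained on the masked base outputs."""
    if not any(mask):
        return 0.0
    model = fit_perceptron(select(train[0], mask), train[1])
    preds = [predict(model, row) for row in select(valid[0], mask)]
    return accuracy(preds, valid[1])

def evolve_mask(n_models, train, valid, rng, pop_size=12, generations=15):
    """Genetic search over binary masks; elitism keeps the best mask found so far."""
    population = [[1] * n_models] + [
        [rng.randint(0, 1) for _ in range(n_models)] for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        ranked = sorted(population, key=lambda m: fitness(m, train, valid),
                        reverse=True)
        parents = ranked[:pop_size // 2]
        children = [ranked[0]]  # elitism: carry the current best over unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_models)   # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.2:             # occasional bit-flip mutation
                flip = rng.randrange(n_models)
                child[flip] = 1 - child[flip]
            children.append(child)
        population = children
    return max(population, key=lambda m: fitness(m, train, valid))

rng = random.Random(0)
accuracies = [0.80, 0.75, 0.70, 0.55, 0.50]  # made-up quality of each view's model
train = make_data(200, accuracies, rng)
valid = make_data(200, accuracies, rng)

best_mask = evolve_mask(len(accuracies), train, valid, rng)
vote_acc = accuracy([int(2 * sum(r) > len(r)) for r in valid[0]], valid[1])
stack_acc = fitness(best_mask, train, valid)

print("selected base models:", best_mask)
print("simple voting accuracy:", round(vote_acc, 3))
print("stacked accuracy:", round(stack_acc, 3))
```

Because the mask is chosen and scored on the same validation split, the stacked accuracy printed here is optimistic; a fair comparison with simple voting would need a third, untouched test split.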

Updated: 2020-01-04