An Adjective Selection Personality Assessment Method Using Gradient Boosting Machine Learning, Processes


Processes (IF 3.5), Pub Date: 2020-05-21, DOI: 10.3390/pr8050618
Bruno Fernandes, Alfonso González-Briones, Paulo Novais, Miguel Calafate, Cesar Analide, José Neves

Goldberg’s 100 Unipolar Markers remains one of the most popular ways to measure personality traits, in particular the Big Five. An important reduction was later performed by Saucier, using a subset of 40 markers. Both assessments present a set of markers, or adjectives, to the subject and ask them to rate each marker on a 9-point scale. The goal of this study is to conduct experiments and propose a shorter alternative in which the subject is only required to identify which adjectives describe them best. To this end, a web platform was developed for data collection, asking subjects to rate each adjective and to select those that describe them best. Based on a Gradient Boosting approach, two distinct Machine Learning architectures were conceived, tuned and evaluated. The first uses regressors to provide an exact score for each of the Big Five, while the second uses classifiers to provide a binned output. As input, both receive the one-hot encoded selection of adjectives. Both architectures performed well. The first is able to quantify the Big Five with an approximate error of 5 units of measure, while the second shows a micro-averaged F1-score of 83%. Since all adjectives are used to compute all traits, the models are able to harness inter-trait relationships, making it possible to further reduce the set of adjectives by removing those of lower importance.
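To make the regression architecture more concrete, the following is a minimal sketch of gradient boosting over a one-hot encoded adjective selection. It is not the authors' implementation: the six-adjective vocabulary, the selections, the trait scores, and every function name are hypothetical, and the weak learner is a single-split stump with squared loss chosen only to keep the example small.

```python
from typing import List, Sequence, Tuple

# Hypothetical six-adjective vocabulary, for illustration only.
ADJECTIVES = ["talkative", "shy", "organized", "careless", "kind", "moody"]

Stump = Tuple[int, float, float]          # (feature index, value if 0, value if 1)
Model = Tuple[float, float, List[Stump]]  # (base score, learning rate, stumps)


def one_hot(selected: Sequence[str]) -> List[int]:
    """Encode the adjectives a subject selected as a 0/1 vector."""
    chosen = set(selected)
    return [1 if adj in chosen else 0 for adj in ADJECTIVES]


def _mean(values: Sequence[float]) -> float:
    return sum(values) / len(values) if values else 0.0


def fit_stump(X: List[List[int]], residuals: List[float]) -> Stump:
    """Pick the single adjective whose on/off split best fits the residuals."""
    best: Stump = (0, 0.0, 0.0)
    best_err = float("inf")
    for j in range(len(X[0])):
        off = [r for x, r in zip(X, residuals) if x[j] == 0]
        on = [r for x, r in zip(X, residuals) if x[j] == 1]
        v0, v1 = _mean(off), _mean(on)
        err = sum((r - v0) ** 2 for r in off) + sum((r - v1) ** 2 for r in on)
        if err < best_err:
            best, best_err = (j, v0, v1), err
    return best


def fit_boosted(X: List[List[int]], y: List[float],
                rounds: int = 100, lr: float = 0.3) -> Model:
    """Gradient boosting with squared loss: each stump fits the current residuals."""
    base = _mean(y)
    preds = [base] * len(y)
    stumps: List[Stump] = []
    for _ in range(rounds):
        # For squared loss, the negative gradient is the plain residual.
        residuals = [t - p for t, p in zip(y, preds)]
        j, v0, v1 = fit_stump(X, residuals)
        stumps.append((j, v0, v1))
        preds = [p + lr * (v1 if x[j] else v0) for x, p in zip(X, preds)]
    return base, lr, stumps


def predict(model: Model, x: List[int]) -> float:
    base, lr, stumps = model
    return base + sum(lr * (v1 if x[j] else v0) for j, v0, v1 in stumps)


def mse(model: Model, X: List[List[int]], y: List[float]) -> float:
    return _mean([(predict(model, x) - t) ** 2 for x, t in zip(X, y)])


# Made-up selections and made-up scores for a single trait.
SELECTIONS = [
    ["talkative", "kind"], ["shy", "moody"], ["talkative", "shy"],
    ["organized"], ["talkative", "organized", "kind"], ["shy", "careless"],
]
SCORES = [30.0, 12.0, 22.0, 20.0, 30.0, 12.0]
X_TRAIN = [one_hot(s) for s in SELECTIONS]

model = fit_boosted(X_TRAIN, SCORES)
print("training MSE:", mse(model, X_TRAIN, SCORES))
```

In this sketch one such model would be trained per trait, all sharing the same adjective vector as input. The second architecture described in the abstract would replace the regressor with a classifier over binned scores.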


Updated: 2020-05-21
