Communication Methods and Measures (IF 11.4), Pub Date: 2019-08-08, DOI: 10.1080/19312458.2019.1650166
Andrew Pilny, Kelly McAninch, Amanda Slone, Kelsey Moore
ABSTRACT
This research aims to make progress toward using supervised machine learning for automated content analysis that requires complex interpretation of text. In Step 1, two human coders coded a sub-sample of online forum posts for relational uncertainty. In Step 2, we evaluated reliability by training three different classifiers to learn from those subjective human interpretations. Reliability was established when two different metrics of inter-coder reliability could not distinguish whether a human or a machine had coded the text in a separate hold-out set. Finally, in Step 3, we assessed validity. We administered a survey in which participants described their own relational uncertainty or certainty in text and completed a questionnaire. The machine's classifications of the participants' text correlated positively with the participants' self-reported relational uncertainty and relational satisfaction. We discuss our results in relation to computational communication science, content analysis, and interpersonal communication.
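The reliability check in Step 2 can be illustrated with a minimal sketch. The abstract does not name the two inter-coder reliability metrics, so Cohen's kappa is used here purely as an assumption, and the labels below are hypothetical toy data, not the study's. The idea is to compare human-human agreement with human-machine agreement on the same hold-out items and see whether the two values are distinguishable.

```python
from collections import Counter


def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(a) != len(b) or not a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(a)
    # Observed proportion of items on which the two coders agree.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance, from each coder's marginal label frequencies.
    freq_a, freq_b = Counter(a), Counter(b)
    categories = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[k] * freq_b[k] for k in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical hold-out labels (1 = relational uncertainty present, 0 = absent).
human_1 = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
human_2 = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]
machine = [1, 0, 1, 1, 0, 1, 1, 0, 0, 0]  # assumed classifier output

kappa_human_human = cohen_kappa(human_1, human_2)
kappa_human_machine = cohen_kappa(human_1, machine)

print(f"human vs human:   {kappa_human_human:.2f}")
print(f"human vs machine: {kappa_human_machine:.2f}")
```

In this toy data the two kappa values coincide, which is the pattern the abstract describes as establishing reliability. A real analysis would use the study's actual metrics and a much larger hold-out set.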
Title: Using Supervised Machine Learning in Automated Content Analysis: An Example Using Relational Uncertainty