Design aspects of rating scales in questionnaires
Mathematical Population Studies (IF 1.4), Pub Date: 2018-04-03, DOI: 10.1080/08898480.2018.1439240
Natalja Menold, Christof Wolf, Kathrin Bogner

Since their introduction by Thurstone (1929) and Likert (1932), rating scales have been a key component of questionnaires. A rating scale usually defines graduations along a continuum such as agreement, intensity, frequency, or satisfaction. Respondents evaluate questions and items, which usually concern personal characteristics, opinions, and behavior, by marking the appropriate category. Parducci (1983) defines responses as functions of the continuum of a rating scale. They range between the end poles and depend on the graduation of the scale; however, their quality should not be influenced by the characteristics of the rating scale. Menold and Bogner (2016) review the following characteristics of rating scales: total number of categories, use of middle and “do not know” options, category labeling, scale orientation (starting with a negative or a positive value, or with a lower or a higher value), scale polarity (use of verbal opposites), and visual presentation. The best design of rating scales remains controversial. Moreover, characteristics of rating scales can affect the quality of measurement (Krosnick and Fabrigar, 1997). Menold and Tausch (2016) demonstrate that scales with different total numbers of categories have different psychometric properties and that verbalization affects measurement. Data can no longer be compared if they have been produced with different rating scales. Graduations used in rating scales involve metric properties, because they are supposed to correspond to equal differences between categories. Orth (1982) and Westermann (1985) criticized this assumption of equidistance. The so-called “visual design” (Christian and Dillman, 2004; Tourangeau et al., 2007) was introduced to examine the influence of the graphical presentation of rating scales on responses. Responses could, consciously or not, be biased by the graphical features.
According to Schaeffer and Dykema (2011: 912), “although past research often allows us to predict how a marginal distribution will be affected ... we are too often unable to say which version of a question is more reliable or valid.” This is why the reliability and validity of rating scales remain an open issue. This special

Updated: 2018-04-03