Contrastive Multiple Correspondence Analysis (cMCA): Applying the Contrastive Learning Method to Identify Political Subgroups
arXiv - CS - Social and Information Networks. Pub Date: 2020-07-09, DOI: arxiv-2007.04540
Takanori Fujiwara, Tzu-Ping Liu

Ideal point estimation and dimensionality reduction have long been utilized to simplify and cluster complex, high-dimensional political data (e.g., roll-call votes and surveys) for use in analysis and visualization. These methods often work by finding the directions or principal components (PCs) on which either the data varies the most or respondents make the fewest decision errors. However, these PCs, which usually reflect the left-right political spectrum, are sometimes uninformative in explaining significant differences in the distribution of the data (e.g., how to categorize a set of highly moderate voters). To tackle this issue, we adopt an emerging analysis approach called contrastive learning. Contrastive learning, e.g., contrastive principal component analysis (cPCA), works by first splitting the data by predefined groups, and then deriving PCs on which the target group varies the most but the background group varies the least. As a result, cPCA can often find 'hidden' patterns, such as subgroups within the target group, which PCA cannot reveal when some variables are the dominant source of variation across the groups. We contribute to the field of contrastive learning by extending it to multiple correspondence analysis (MCA) to enable an analysis of data often encountered by social scientists, namely binary, ordinal, and nominal variables. We demonstrate the utility of contrastive MCA (cMCA) by analyzing three different surveys: the 2015 Cooperative Congressional Election Study, the 2012 UTokyo-Asahi Elite Survey, and the 2018 European Social Survey. Our results suggest that, first, for the cases when ordinary MCA depicts differences between groups, cMCA can further identify characteristics that divide the target group; second, for the cases when MCA does not show clear differences, cMCA can successfully identify meaningful directions and subgroups, which traditional methods overlook.
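
The cPCA step described in the abstract (directions on which the target group varies the most but the background group varies the least) can be illustrated with a minimal NumPy sketch. This is not the authors' code, and it does not implement cMCA. It assumes the commonly used cPCA formulation, which takes the top eigenvectors of the target covariance minus `alpha` times the background covariance. The function name `cpca_directions`, the contrast parameter `alpha`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def cpca_directions(target, background, alpha=1.0, n_components=2):
    """Top eigenvectors of C_target - alpha * C_background (illustrative sketch)."""
    # Center each group on its own mean before computing covariances.
    t = target - target.mean(axis=0)
    b = background - background.mean(axis=0)
    c_t = t.T @ t / (len(t) - 1)
    c_b = b.T @ b / (len(b) - 1)
    # The contrast matrix is symmetric, so eigh applies; it returns
    # eigenvalues in ascending order, hence the explicit descending sort.
    eigvals, eigvecs = np.linalg.eigh(c_t - alpha * c_b)
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, order]

# Synthetic data (hypothetical): feature 0 has large variance in BOTH groups,
# while feature 1 separates two hidden subgroups in the target group only.
rng = np.random.default_rng(0)
n = 400
background = np.column_stack([
    rng.normal(0, 5, n),
    rng.normal(0, 0.5, n),
    rng.normal(0, 0.5, n),
])
labels = rng.integers(0, 2, n)  # hidden subgroup membership
target = np.column_stack([
    rng.normal(0, 5, n),
    rng.normal(0, 0.5, n) + 3 * (2 * labels - 1),
    rng.normal(0, 0.5, n),
])

# alpha = 0 reduces to ordinary PCA on the target group.
pca_dir = cpca_directions(target, background, alpha=0.0, n_components=1)[:, 0]
cpca_dir = cpca_directions(target, background, alpha=2.0, n_components=1)[:, 0]
print("PCA direction: ", np.round(pca_dir, 2))
print("cPCA direction:", np.round(cpca_dir, 2))
```

In this toy setup, ordinary PCA aligns with the shared high-variance feature, while the contrastive direction aligns with the feature that splits the target group. The value of `alpha` controls how strongly background variance is penalized and would need tuning on real data.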

Updated: 2020-07-10