Playing with matches: An assessment of accuracy in linked historical data
Historical Methods: A Journal of Quantitative and Interdisciplinary History (IF 1.6). Pub Date: 2017-03-13. DOI: 10.1080/01615440.2017.1288598
Catherine G. Massey

ABSTRACT

This article evaluates linkage quality achieved by various record linkage techniques used in historical demography. The author creates benchmark, or truth, data by linking the 2005 Current Population Survey Annual Social and Economic Supplement to the Social Security Administration's numeric identification system by social security number. By comparing simulated linkages to the benchmark data, she examines the value added (in terms of number and quality of links) from incorporating text-string comparators, adjusting age, and using a probabilistic matching algorithm. She finds that text-string comparators and probabilistic approaches are useful for increasing the linkage rate, but use of text-string comparators may decrease accuracy in some cases. Overall, probabilistic matching offers the best balance between linkage rates and accuracy.
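The trade-off described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the author's procedure: the toy records are invented, the shared id stands in for the social security number used to build the benchmark, and `difflib.SequenceMatcher` from the Python standard library stands in for whichever text-string comparator the article evaluates. The sketch covers only exact matching versus a string comparator combined with an age band; it does not implement probabilistic matching.

```python
from difflib import SequenceMatcher

# Hypothetical toy records: (id, first name, last name, age).
# The id plays the role of the benchmark ("truth") identifier and is
# used only to score links, never to create them.
SURVEY = [
    (1, "john", "smith", 40),
    (2, "catherine", "jones", 31),
    (3, "william", "brown", 55),
    (4, "mary", "davis", 28),
]
ADMIN = [
    (1, "john", "smith", 40),
    (2, "katherine", "jones", 32),
    (3, "william", "browne", 55),
    (5, "mary", "davies", 29),  # a different person from survey id 4
]


def similarity(a, b):
    """Similarity in [0, 1]; a stand-in for a text-string comparator."""
    return SequenceMatcher(None, a, b).ratio()


def link(survey, admin, name_threshold=1.0, age_band=0):
    """Link each survey record to an admin record if exactly one candidate
    agrees on both names (similarity >= threshold) and on age (within band)."""
    links = {}
    for sid, sfirst, slast, sage in survey:
        candidates = [
            aid
            for aid, afirst, alast, aage in admin
            if abs(sage - aage) <= age_band
            and similarity(sfirst, afirst) >= name_threshold
            and similarity(slast, alast) >= name_threshold
        ]
        if len(candidates) == 1:  # keep unique matches only
            links[sid] = candidates[0]
    return links


def evaluate(links, n_records):
    """Return (linkage rate, accuracy) scored against the benchmark ids."""
    correct = sum(1 for sid, aid in links.items() if sid == aid)
    rate = len(links) / n_records
    accuracy = correct / len(links) if links else 0.0
    return rate, accuracy


exact = evaluate(link(SURVEY, ADMIN), len(SURVEY))
fuzzy = evaluate(link(SURVEY, ADMIN, name_threshold=0.8, age_band=1), len(SURVEY))
print(exact)  # (0.25, 1.0)
print(fuzzy)  # (1.0, 0.75)
```

In this toy data, exact matching links one record in four, all correctly, while the looser rule links every record but introduces one false link ("davis" to "davies"). That mirrors the direction of the abstract's finding, although the numbers themselves carry no empirical weight.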




Updated: 2017-03-13