Challenges in administrative data linkage for research
Big Data & Society (IF 6.5), Pub Date: 2017-12-01, DOI: 10.1177/2053951717745678
Katie Harron, Chris Dibben, James Boyd, Anders Hjern, Mahmoud Azimaee, Mauricio L Barreto, Harvey Goldstein

Linkage of population-based administrative data is a valuable tool for combining detailed individual-level information from different sources for research. While not a substitute for classical studies based on primary data collection, analyses of linked administrative data can answer questions that require large sample sizes or detailed data on hard-to-reach populations, and generate evidence with a high level of external validity and applicability for policy making. There are unique challenges in the appropriate research use of linked administrative data, for example with respect to bias from linkage errors where records cannot be linked or are linked together incorrectly. For confidentiality and other reasons, the separation of data linkage processes and analysis of linked data is generally regarded as best practice. However, the ‘black box’ of data linkage can make it difficult for researchers to judge the reliability of the resulting linked data for their required purposes. This article aims to provide an overview of challenges in linking administrative data for research. We aim to increase understanding of the implications of (i) the data linkage environment and privacy preservation; (ii) the linkage process itself (including data preparation, and deterministic and probabilistic linkage methods) and (iii) linkage quality and potential bias in linked data. We draw on examples from a number of countries to illustrate a range of approaches for data linkage in different contexts.

Updated: 2017-12-01