Working toward effective anonymization for surveillance data: innovation at South Africa’s Agincourt Health and Socio-Demographic Surveillance Site
Population and Environment (IF 4.283), Pub Date: 2021-03-24, DOI: 10.1007/s11111-020-00372-4
Lori M Hunter 1, 2 , Catherine Talbot 1, 2 , Wayne Twine 3, 4 , Joe McGlinchy 5 , Chodziwadziwa Kabudula 4 , Daniel Ohene-Kwofie 4

Linking people and places is essential for population-health-environment research. Yet, this data integration requires geographic coding such that information reflecting individuals or households can appropriately be connected with characteristics of their proximate environments. However, offering access to such geocoding greatly increases the risk of respondent identification and, therefore, holds the potential to breach confidentiality. In response, a variety of “geographic masking” techniques have been developed to introduce error into geographic coding and thereby reduce the likelihood of identification. We report findings from analyses of the error introduced by several masking techniques applied to data from the Agincourt Health and Socio-Demographic Surveillance System in rural South Africa. Using a vegetation index (Normalized Difference Vegetation Index (NDVI)) at the household scale, comparisons are made between the “true” NDVI values and those calculated after masking. We also examine the tradeoffs between accuracy and protecting respondent privacy. The exploration suggests that in this study setting and for NDVI, geomasking approaches that use buffers and account for population density produce the most accurate results. However, the exploration also clearly demonstrates the tradeoff between accuracy and privacy, with more accuracy resulting in a higher level of potential respondent identification. It is important to note that these analyses illustrate a process that should characterize spatially informed research but within which particular decisions must be shaped by the research setting and objectives. In the long run, we aim to provide insight into masking’s potential and perils to facilitate population-environment-health research.
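To make the masking idea concrete, the following is a minimal Python sketch of one common family of geographic masks: a "donut" displacement whose outer radius grows as population density falls. It is an illustration under stated assumptions, not the procedure used in the paper. The function names (`ndvi`, `radius_for_k`, `donut_mask`), the k-household rule for choosing the radius, and all numeric values are hypothetical. Coordinates are assumed to be in a projected system with metre units. The NDVI function uses the standard definition, (NIR - Red) / (NIR + Red).

```python
import math
import random


def ndvi(nir, red):
    """Standard NDVI from near-infrared and red reflectance."""
    total = nir + red
    if total == 0:
        # NDVI is undefined here; returning 0.0 is a convention of this sketch.
        return 0.0
    return (nir - red) / total


def radius_for_k(density_per_km2, k):
    """Radius (m) of a circle expected to hold k households at this density.

    Assumption of this sketch: households are spread uniformly, so
    pi * r^2 * density = k.
    """
    density_per_m2 = density_per_km2 / 1_000_000
    return math.sqrt(k / (math.pi * density_per_m2))


def donut_mask(x, y, r_min, r_max, rng):
    """Move (x, y) a random distance in [r_min, r_max] in a random direction."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    # Sample the squared radius so points are uniform over the annulus area.
    distance = math.sqrt(rng.uniform(r_min ** 2, r_max ** 2))
    return x + distance * math.cos(angle), y + distance * math.sin(angle)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the sketch is repeatable
    # Hypothetical household location and local density (households per km^2).
    x, y, density = 1000.0, 2000.0, 100.0
    r_max = radius_for_k(density, k=10)
    r_min = 0.25 * r_max  # inner radius is an arbitrary illustrative choice
    masked_x, masked_y = donut_mask(x, y, r_min, r_max, rng)
    print("displacement (m):", math.hypot(masked_x - x, masked_y - y))
    # Hypothetical reflectance values at the true and masked locations.
    print("NDVI error:", ndvi(0.45, 0.12) - ndvi(0.50, 0.10))
```

The sketch shows the tradeoff the abstract describes: a smaller `k` or a smaller inner radius keeps the masked NDVI closer to the true value, but it also leaves the masked point closer to the household it is meant to protect.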




Updated: 2021-03-25