A Generic Method and Implementation to Evaluate and Improve Data Quality in Distributed Research Networks.
Methods of Information in Medicine (IF 1.7), Pub Date: 2019-09-01, DOI: 10.1055/s-0039-1693685
D Juárez 1,2, E E Schmidt 1,2, S Stahl-Toyota 3, F Ückert 3, M Lablans 1,2

BACKGROUND: With the increasing personalization of clinical therapies, translational research is ever more dependent on multisite research collaborations to obtain sufficient data and biomaterial. Distributed research networks rely on the availability of high-quality data stored in local databases operated by their member institutions. However, reusing data that independent health providers documented for the purpose of care rather than research ("secondary use") reveals high variability in data formats, as well as poor data quality, across network sites.

OBJECTIVES: This work aims to provide a process for assessing data quality, with regard to completeness and syntactic accuracy, across independently operated data warehouses, using common definitions stored in a central (network-wide) metadata repository (MDR).

METHODS: To assess data quality across multiple sites, we employ a framework of so-called bridgeheads. These are federated data warehouses that allow the sites to participate in a research network. A central MDR stores the definitions of the commonly agreed data elements and their permissible values.

RESULTS: We present the design for a generator of quality reports within a bridgehead, which allows data in the local data warehouse to be validated against a research network's central MDR. A standardized quality report can be produced at each network site, providing a means to compare data quality across sites and to channel feedback to the local data source systems and local documentation personnel. A reference implementation of this concept has been used successfully at 10 sites across the German Cancer Consortium.

CONCLUSIONS: We have shown that comparable data quality assessment across different partners of a distributed research network is feasible when a central metadata repository is combined with locally installed assessment processes. To achieve this, we designed a quality report and the process for generating such a report. The final step was the implementation in a German research network.
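
The idea of validating local records against centrally agreed definitions can be illustrated with a minimal sketch. It is not the authors' reference implementation: the `ElementDefinition` class, the `permitted` and `pattern` helpers, the `quality_report` function, the element names, and the sample records are all hypothetical and were invented for illustration. The sketch assumes that completeness means the share of records with a non-missing value, and that syntactic accuracy means the share of non-missing values that conform to the definition.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class ElementDefinition:
    """Hypothetical stand-in for one data element definition held in an MDR."""
    name: str
    is_valid: Callable[[str], bool]  # syntactic check for a non-missing value


def permitted(values):
    """Build a validator for an enumerated list of permissible values."""
    allowed = frozenset(values)
    return lambda v: v in allowed


def pattern(regex):
    """Build a validator for values that must fully match a regular expression."""
    compiled = re.compile(regex)
    return lambda v: compiled.fullmatch(v) is not None


def quality_report(records, definitions):
    """Return per-element completeness and syntactic accuracy for local records."""
    report = {}
    for d in definitions:
        values = [r.get(d.name) for r in records]
        # A value counts as present if it is neither None nor an empty string.
        present = [v for v in values if v not in (None, "")]
        valid = [v for v in present if d.is_valid(v)]
        report[d.name] = {
            "completeness": len(present) / len(values) if values else None,
            "syntactic_accuracy": len(valid) / len(present) if present else None,
            # Distinct offending values, useful as feedback to the source system.
            "invalid_values": sorted(set(present) - set(valid)),
        }
    return report


# Illustrative definitions and records (invented for this sketch).
MDR = [
    ElementDefinition("gender", permitted(["female", "male", "other", "unknown"])),
    ElementDefinition("diagnosis_date", pattern(r"\d{4}-\d{2}-\d{2}")),
]

RECORDS = [
    {"gender": "female", "diagnosis_date": "2018-03-14"},
    {"gender": "f", "diagnosis_date": "14.03.2018"},
    {"gender": "male", "diagnosis_date": ""},
    {"gender": None, "diagnosis_date": "2017-11-02"},
]

REPORT = quality_report(RECORDS, MDR)
```

Because every site would evaluate its own records against the same definitions, the resulting figures are comparable across sites, and the list of invalid values gives local documentation personnel something concrete to correct.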

Updated: 2019-09-01