Questioning causality on sex, gender and COVID-19, and identifying bias in large-scale data-driven analyses: the Bias Priority Recommendations and Bias Catalog for Pandemics

arXiv - CS - Information Retrieval. Pub Date: 2021-04-29, DOI: arxiv-2104.14492
Natalia Díaz-Rodríguez, Rūta Binkytė-Sadauskienė, Wafae Bakkali, Sannidhi Bookseller, Paola Tubaro, Andrius Bacevicius, Raja Chatila

The COVID-19 pandemic has spurred a large number of observational studies reporting links between sex and gender and the risk of developing severe COVID-19 or dying from it. By reviewing a large body of related literature and conducting a fine-grained analysis based on sex-disaggregated data from 61 countries spanning 5 continents, we identify several confounding factors that could explain the supposed male vulnerability to COVID-19. We thus highlight the challenge of making causal claims based on available data, given the lack of statistical significance and the potential existence of biases. Informed by our findings on potential variables acting as confounders, we contribute a broad overview of the issues that bias, explainability and fairness entail in data-driven analyses. We then outline a set of policy consequences that, if based on such results, could lead to unintended discrimination. To raise awareness of the dimensionality of such foreseen impacts, we have compiled an encyclopedia-like reference guide, the Bias Catalog for Pandemics (BCP), which provides definitions and emphasizes realistic examples of bias, both in general and within the COVID-19 pandemic context. These are categorized into bias families and a 2-level priority scale, together with preventive steps. In addition, we provide the Bias Priority Recommendations on how best to use and apply this catalog, along with guidelines for addressing real-world research questions. The objective is to anticipate and avoid disparate impact and discrimination by considering causality, explainability, bias, and techniques to mitigate the latter. With these, we hope to 1) contribute to designing and conducting fair and equitable data-driven studies and research; and 2) interpret and draw meaningful and actionable conclusions from these.

Updated: 2021-04-30
