Investigating Gender Bias in BERT


Abstract

In this work, we analyze the gender bias induced by BERT in downstream tasks, and we propose solutions to reduce it. Contextual language models (CLMs) have pushed NLP benchmarks to new heights, and it has become the norm to utilize CLM-provided word embeddings in downstream tasks such as text classification. However, unless addressed, CLMs are prone to learning the intrinsic gender bias present in their training data. As a result, the predictions of downstream NLP models can vary noticeably when gender words are varied, e.g., when “he” is replaced with “she”, or even when gender-neutral words are changed. In this paper, we focus our analysis on a popular CLM, i.e., \(\text{BERT}\). We analyze the gender bias it induces in five downstream tasks related to emotion and sentiment intensity prediction. For each task, we train a simple regressor on \(\text{BERT}\)’s word embeddings and then evaluate the regressor’s gender bias using an equity evaluation corpus. Ideally, by design, the models should discard gender-informative features from the input. However, the results show a significant dependence of the system’s predictions on gender-particular words and phrases. We claim that such biases can be reduced by removing gender-specific features from the word embeddings. Hence, for each layer in BERT, we identify directions that primarily encode gender information; the space spanned by such directions is referred to as the gender subspace in the semantic space of word embeddings. We propose an algorithm that finds fine-grained gender directions, i.e., one primary direction for each BERT layer. This obviates the need to realize the gender subspace in multiple dimensions and prevents other crucial information from being discarded. Experiments show that removing embedding components along the gender directions substantially reduces the BERT-induced bias in the downstream tasks. Overall, the investigation reveals the significant gender bias that a contextualized language model (i.e., \(\text{BERT}\)) induces in downstream tasks, and the proposed solution appears promising in reducing such bias.
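
The debiasing step can be made concrete with a short sketch. The Python/NumPy code below assumes the per-layer BERT embeddings of gendered word pairs (e.g., “he”/“she”) have already been extracted; the principal-direction estimate follows the general projection-removal idea of Bolukbasi et al. [7] and is only a stand-in for the paper’s fine-grained per-layer algorithm, and the function names are illustrative rather than the authors’ implementation.

```python
import numpy as np

def gender_direction(male_embs, female_embs):
    """Estimate one gender direction for a single BERT layer.

    male_embs / female_embs: arrays of shape (n_pairs, hidden_dim) holding
    the layer's embeddings of gendered word pairs such as ("he", "she").
    The dominant direction of the mean-centered pair differences (first
    right singular vector) is taken as the layer's gender direction.
    """
    diffs = male_embs - female_embs
    diffs = diffs - diffs.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(diffs, full_matrices=False)
    direction = vt[0]
    return direction / np.linalg.norm(direction)

def remove_gender_component(embedding, direction):
    """Project out the embedding's component along the gender direction."""
    return embedding - np.dot(embedding, direction) * direction

# Toy usage with random vectors standing in for per-layer BERT embeddings.
rng = np.random.default_rng(0)
male = rng.normal(size=(10, 768))    # e.g., embeddings of "he", "man", ...
female = rng.normal(size=(10, 768))  # e.g., embeddings of "she", "woman", ...
d = gender_direction(male, female)
debiased = remove_gender_component(rng.normal(size=768), d)
print(np.dot(debiased, d))           # ~0: no residual component along d
```

In practice, one such direction would be estimated for each BERT layer, and the projection applied to the token embeddings before they are passed to the downstream regressor.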


Notes

  1. e.g., {angry, enraged} represents a common emotion, i.e., anger.

  2. \(W_k^{CLS}\) and \(t_k^{cls}\) represent the same vector.

  3. Following https://competitions.codalab.org/competitions/17751#results

  4. Delta values can be compared with those of the models studied in [23].

  5. We focus on words with low word-sense ambiguity.

  6. https://github.com/uclanlp/gn_glove/tree/master/wordlist

  7. https://github.com/ecmonsen/gendered_words

  8. The cosine of the angle between two random vectors in high dimensions is close to zero with high probability; a quick numerical check is sketched below.
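
A rough numerical check of this footnote (a sketch, with the dimension set to BERT’s hidden size of 768; the exact figures are illustrative):

```python
import numpy as np

# Cosine similarity of i.i.d. Gaussian vector pairs concentrates near zero
# as the dimension grows; its standard deviation is roughly 1/sqrt(dim).
rng = np.random.default_rng(0)
dim, trials = 768, 10_000
a = rng.normal(size=(trials, dim))
b = rng.normal(size=(trials, dim))
cos = np.sum(a * b, axis=1) / (np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1))
print(round(cos.mean(), 4), round(cos.std(), 4))  # mean ~ 0.0, std ~ 0.036
```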

References

  1. Hussain A, Tahir A, Hussain Z, Sheikh Z, Gogate M, Dashtipour K, Ali A, Sheikh A. Artificial intelligence-enabled analysis of public attitudes on Facebook and Twitter toward COVID-19 vaccines in the United Kingdom and the United States: observational study. J Med Internet Res. 2021;23(4):e26627.

  2. Basta C, Costa-jussà MR, Casas N. Evaluating the underlying gender bias in contextualized word embeddings. In Proceedings of the First Workshop on Gender Bias in Natural Language Processing. Association for Computational Linguistics. 2019:33–39.

  3. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I. Attention is all you need. In Advances in neural information processing systems. 2017:5998–6008.

  4. Caliskan A, Bryson JJ, Narayanan A. Semantics derived automatically from language corpora contain human-like biases. Science. 2017;356(6334):183–186.

  5. Dashtipour K, Gogate M, Cambria E, Hussain A. A novel context-aware multimodal framework for Persian sentiment analysis. 2021. https://arxiv.org/abs/2103.02636.

  6. Devlin J, Chang MW, Lee K, Toutanova K. Bert: Pre-training of deep bidirectional transformers for language understanding. 2018. https://arxiv.org/abs/1810.04805.

  7. Bolukbasi T, Chang KW, Zou JY, Saligrama V, Kalai AT. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In Advances in neural information processing systems. 2016:4349–4357.

  8. Gonen H, Goldberg Y. Lipstick on a pig: Debiasing methods cover up systematic gender biases in word embeddings but do not remove them. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Association for Computational Linguistics. 2019:609–614.

  9. Caliskan A, Bryson JJ, Narayanan A. Semantics derived automatically from language corpora contain human-like biases. Science. 2017;356(6334):183–6.

  10. Hussain A, Tahir A, Hussain Z, Sheikh Z, Gogate M, Dashtipour K, Ali A, Sheikh A. Artificial intelligence-enabled analysis of public attitudes on Facebook and Twitter toward COVID-19 vaccines in the United Kingdom and the United States: observational study. J Med Internet Res. 2021;23(4):e26627.

  11. Kim Y. Convolutional neural networks for sentence classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics. 2014:1746–1751.

  12. Kiritchenko S, Mohammad S. Examining gender and race bias in two hundred sentiment analysis systems. In Proceedings of the Seventh Joint Conference on Lexical and Computational Semantics. Association for Computational Linguistics. 2018:43–53.

  13. Kurita K, Vyas N, Pareek A, Black AW, Tsvetkov Y. Measuring bias in contextualized word representations. In Proceedings of the First Workshop on Gender Bias in Natural Language Processing. Association for Computational Linguistics. 2019:166–172.

  14. Liu Y, Ott M, Goyal N, Du J, Joshi M, Chen D, Levy O, Lewis M, Zettlemoyer L, Stoyanov V. Roberta: A robustly optimized bert pretraining approach. 2019. https://arxiv.org/abs/1907.11692.

  15. Lu J, Batra D, Parikh D, Lee S. Vilbert: Pretraining task-agnostic visiolinguistic representations for vision-and-language tasks. In Advances in Neural Information Processing Systems. 2019:13–23.

  16. Mohammad S, Bravo-Marquez F, Salameh M, Kiritchenko S. SemEval-2018 task 1: Affect in tweets. In Proceedings of The 12th International Workshop on Semantic Evaluation. Association for Computational Linguistics. 2018:1–17.

  17. Morcos A, Raghu M, Bengio S. Insights on representational similarity in neural networks with canonical correlation. In Advances in Neural Information Processing Systems. 2018:5727–5736.

  18. Peters M, Neumann M, Iyyer M, Gardner M, Clark C, Lee K, Zettlemoyer L. Deep contextualized word representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). Association for Computational Linguistics. 2018:2227–2237.

  19. Elazar Y, Ravfogel S, Jacovi A, Goldberg Y. Amnesic probing: Behavioral explanation with amnesic counterfactuals. Transactions of the Association for Computational Linguistics. 2021;9:160–75.

  20. Morcos A, Raghu M, Bengio S. Insights on representational similarity in neural networks with canonical correlation. In Advances in Neural Information Processing Systems. 2018:5727–5736.

  21. Tenney I, Xia P, Chen B, Wang A, Poliak A, McCoy RT, Kim N, Van Durme B, Bowman SR, Das D, et al. What do you learn from context? probing for sentence structure in contextualized word representations. 2019. https://arxiv.org/abs/1905.06316.

  22. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I. Attention is all you need. In Advances in neural information processing systems. 2017:5998–6008.

  23. Voita E, Titov I. Information-theoretic probing with minimum description length. 2020. https://arxiv.org/abs/2003.12298.

  24. Wolf T, Debut L, Sanh V, Chaumond J, Delangue C, Moi A, Cistac P, Rault T, Louf R, Funtowicz M, Brew J. Huggingface’s transformers: State-of-the-art natural language processing. 2019. https://arxiv.org/abs/1910.03771.

  25. Yang Z, Dai Z, Yang Y, Carbonell J, Salakhutdinov R, Le QV. Xlnet: Generalized autoregressive pretraining for language understanding. 2019. https://arxiv.org/abs/1906.08237.

  26. Zhao J, Wang T, Yatskar M, Cotterell R, Ordonez V, Chang KW. Gender bias in contextualized word embeddings. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Association for Computational Linguistics. 2019:629–634.

  27. Zhao J, Wang T, Yatskar M, Ordonez V, Chang KW. Gender bias in coreference resolution: Evaluation and debiasing methods. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers). Association for Computational Linguistics. 2018:15–20.

  28. Zhao J, Zhou Y, Li Z, Wang W, Chang KW. Learning gender-neutral word embeddings. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics. 2018:4847–4853.

  29. Zhao J, Zhou Y, Li Z, Wang W, Chang KW. Learning gender-neutral word embeddings. 2018. https://arxiv.org/abs/1809.01496.

Author information

Corresponding author

Correspondence to Soujanya Poria.

Ethics declarations

Funding

The authors did not receive support from any organization for the submitted work.

Ethical Standard

This article does not contain any studies with human participants or animals performed by any of the authors.

Conflicts of Interest

All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.

About this article

Cite this article

Bhardwaj, R., Majumder, N. & Poria, S. Investigating Gender Bias in BERT. Cogn Comput 13, 1008–1018 (2021). https://doi.org/10.1007/s12559-021-09881-2
