
A flexible framework for evaluating user and item fairness in recommender systems

Published in: User Modeling and User-Adapted Interaction

Abstract

One common characteristic of research on fairness evaluation in machine learning is that it calls for some form of parity (equality), either in treatment, meaning the information about users' memberships in protected classes is ignored during training, or in impact, by enforcing proportionally beneficial outcomes for users in different protected classes. In the recommender systems community, fairness has been studied with respect to both users' and items' memberships in protected classes defined by sensitive attributes (e.g., gender or race for users, revenue in a multi-stakeholder setting for items). Here, too, the concept has commonly been interpreted as some form of equality, i.e., the degree to which the system meets the information needs of all its users equally. In this work, we propose a probabilistic framework based on generalized cross entropy (GCE) to measure the fairness of a given recommendation model. The framework has several advantages: first, it allows the system designer to define and measure fairness for both users and items, and it can be applied to any classification task; second, it can incorporate various notions of fairness, since it does not rely on specific predefined probability distributions, which can instead be chosen at design time; finally, its design includes a gain factor that can be flexibly defined to accommodate different accuracy-related metrics, so that fairness can be measured upon decision-support metrics (e.g., precision, recall) or rank-based measures (e.g., NDCG, MAP). An experimental evaluation on four real-world datasets shows the nuances captured by the proposed metric regarding fairness on different user and item attributes, where nearest-neighbor recommenders tend to obtain good results under equality constraints. We also observed that when users are clustered based on both their interactions with the system and other sensitive attributes, such as age or gender, algorithms with similar performance values exhibit different behaviors with respect to user fairness, owing to the different ways in which they process the data of each user cluster.

Fig. 1
Fig. 2


Notes

  1. http://techblog.netflix.com/2012/04/netflix-recommendations-beyond-5-stars.html.

  2. https://sifter.org/~simon/journal/20061211.html.

  3. The terms “poverty,” “welfare,” and “inequality” have been used interchangeably in the economics literature (Cowell 2000; Cowell and Kuga 1981) when referring to discrimination or unfairness.

  4. https://www.eeoc.gov/statutes/title-vii-civil-rights-act-1964.

  5. https://www.strategy-business.com/article/What-is-fair-when-it-comes-to-AI-bias?gko=827c0.

  6. https://www.etsy.com.

  7. These scenarios are becoming increasingly realistic, especially in edge computing settings, where computational resources are often quite limited.

  8. http://jmcauley.ucsd.edu/data/amazon/.

  9. https://github.com/sisinflab/DatasetsSplits/.

  10. We need to resort to a binary classification for gender since this is the information available in this dataset.

  11. http://ranksys.org/.

  12. Please note that in Sect. 3 we defined an unfairness metric \(\omega\) that produces a nonnegative value, such that if \(\omega (m, a) < \omega (m', a)\) we can conclude that model m is less unfair (i.e., more fair) than model \(m'\). This makes our unfairness metric consistent with the literature; see, e.g., Speicher et al. (2018), Sect. 2.3, “Axioms for measuring inequality,” where the authors define inequality as a nonnegative value. Our GCE metric reports values that are all negative, with the maximum occurring when GCE \(\approx 0\). The proposed GCE can thus be seen as a fairness metric, while its absolute form |GCE| represents unfairness (always nonnegative). For simplicity when discussing the results, however, we keep reporting the raw values for |GCE|, taking the sign into account when saying larger or smaller.

  13. In this challenge, the users correspond to the items being recommended.

References

  • Abdollahpouri, H., Adomavicius, G., Burke, R., Guy, I., Jannach, D., Kamishima, T., Krasnodebski, J., Pizzato, L.: Multistakeholder recommendation: survey and research directions. User Model. User-Adapt. Interact. 30(1), 127–158 (2020)


  • Abdollahpouri, H., Burke, R., Mobasher, B.: Controlling popularity bias in learning-to-rank recommendation. In: Proceedings of the 11th ACM Conference on Recommender Systems, pp. 42–46 (2017)

  • Abdollahpouri, H., Burke, R., Mobasher, B.: Recommender systems as multistakeholder environments. In: Proceedings of the 25th Conference on User Modeling, Adaptation and Personalization, UMAP 2017, Bratislava, Slovakia, July 09–12, 2017, pp. 347–348 (2017). https://doi.org/10.1145/3079628.3079657

  • Abebe, R., Kleinberg, J.M., Parkes, D.C.: Fair division via social comparison. In: Larson, K., Winikoff, M., Das, S., Durfee, E.H. (eds.) Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems, AAMAS 2017, São Paulo, Brazil, May 8–12, 2017, pp. 281–289. ACM, (2017). http://dl.acm.org/citation.cfm?id=3091171

  • Abel, F., Deldjoo, Y., Elahi, M., Kohlsdorf, D.: Recsys challenge 2017: offline and online evaluation. In: Proceedings of the 11th ACM Conference on Recommender Systems, RecSys ’17, pp. 372–373. ACM, New York (2017). https://doi.org/10.1145/3109859.3109954

  • Adomavicius, G., Zhang, J.: Impact of data characteristics on recommender systems performance. ACM Trans. Manag. Inf. Syst. 3(1), 3:1–3:17 (2012). https://doi.org/10.1145/2151163.2151166

  • Akoglu, L., Faloutsos, C.: Valuepick: towards a value-oriented dual-goal recommender system. In: Fan, W., Hsu, W., Webb, G.I., Liu, B., Zhang, C., Gunopulos, D., Wu, X. (eds.) ICDMW 2010, The 10th IEEE International Conference on Data Mining Workshops, Sydney, Australia, 13 December 2010, pp. 1151–1158. IEEE Computer Society, (2010). https://doi.org/10.1109/ICDMW.2010.68

  • Anelli, V.W., Bellini, V., Di Noia, T., Bruna, W.L., Tomeo, P., Di Sciascio, E.: An analysis on time- and session-aware diversification in recommender systems. In: UMAP, pp. 270–274. ACM (2017)

  • Anelli, V.W., Noia, T.D., Sciascio, E.D., Ragone, A., Trotta, J.: Time-aware personalized popularity in top-n recommendation. In: Workshop on Recommendation in Complex Scenarios co-located with 12th ACM Conference on Recommender Systems (RecSys 2018), Vancouver, BC, Canada, October 2–7, 2018 (2018)

  • Anelli, V.W., Noia, T.D., Sciascio, E.D., Ragone, A., Trotta, J.: Local popularity and time in top-n recommendation. In: L. Azzopardi, B. Stein, N. Fuhr, P. Mayr, C. Hauff, D. Hiemstra (eds.) Advances in Information Retrieval: 41st European Conference on IR Research, ECIR 2019, Cologne, Germany, April 14–18, 2019, Proceedings, Part I, Lecture Notes in Computer Science, vol. 11437, pp. 861–868. Springer (2019). https://doi.org/10.1007/978-3-030-15712-8_63

  • Azaria, A., Hassidim, A., Kraus, S., Eshkol, A., Weintraub, O., Netanely, I.: Movie recommender system for profit maximization. In: Yang, Q., King, I., Li, Q., Pu, P., Karypis, G. (eds.) 7th ACM Conference on Recommender Systems, RecSys ’13, Hong Kong, China, October 12–16, 2013, pp. 121–128. ACM (2013). https://doi.org/10.1145/2507157.2507162

  • Backstrom, L., Leskovec, J.: Supervised random walks: predicting and recommending links in social networks. In: Proceedings of the 4th International Conference on Web Search and Web Data Mining, WSDM 2011, Hong Kong, China, February 9–12, 2011, pp. 635–644 (2011)

  • Balabanovic, M., Shoham, Y.: Content-based, collaborative recommendation. Commun. ACM 40(3), 66–72 (1997). https://doi.org/10.1145/245108.245124

  • Barocas, S., Selbst, A.D.: Big data’s disparate impact. Cal. L. Rev. 104, 671 (2016)


  • Belkin, N.J., Robertson, S.E.: Some ethical and political implications of theoretical research in information science. In: Proceedings of the Association for Information Science, ASIS ’76, pp. 597–605 (1976)

  • Bellogín, A., Castells, P., Cantador, I.: Statistical biases in information retrieval metrics for recommender systems. Inf. Retr. J. 20(6), 606–634 (2017). https://doi.org/10.1007/s10791-017-9312-z


  • Biega, A.J., Gummadi, K.P., Weikum, G.: Equity of attention: amortizing individual fairness in rankings. In: Collins-Thompson, K., Mei, Q., Davison, B.D., Liu, Y., Yilmaz, E. (eds.) The 41st International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2018, Ann Arbor, MI, USA, July 08–12, 2018, pp. 405–414. ACM (2018). https://doi.org/10.1145/3209978.3210063

  • Billsus, D., Pazzani, M.J.: User modeling for adaptive news access. User Model. User-Adapt. Interact. 10(2–3), 147–180 (2000). https://doi.org/10.1023/A:1026501525781

  • Boratto, L., Fenu, G., Marras, M.: The effect of algorithmic bias on recommender systems for massive open online courses. In: Azzopardi, L., Stein, B., Fuhr, N., Mayr, P., Hauff, C., Hiemstra, D. (eds.) Advances in Information Retrieval–41st European Conference on IR Research, ECIR 2019, Cologne, Germany, April 14–18, 2019, Proceedings, Part I, Lecture Notes in Computer Science, vol. 11437, pp. 457–472. Springer (2019). https://doi.org/10.1007/978-3-030-15712-8_30

  • Botev, Z.I., Kroese, D.P.: The generalized cross entropy method, with applications to probability density estimation. Methodol. Comput. Appl. Probab. 13(1), 1–27 (2011)


  • Breese, J.S., Heckerman, D., Kadie, C.M.: Empirical analysis of predictive algorithms for collaborative filtering. In: Cooper, G.F., Moral, S. (eds.) UAI ’98: proceedings of the 14th Conference on Uncertainty in Artificial Intelligence, University of Wisconsin Business School, Madison, Wisconsin, USA, July 24–26, 1998, pp. 43–52. Morgan Kaufmann (1998). https://dslpitt.org/uai/displayArticleDetails.jsp?mmnu=1&smnu=2&article_id=231&proceeding_id=14

  • Burke, R.: Multisided fairness for recommendation. arXiv preprint arXiv:1707.00093 (2017)

  • Burke, R.D., Abdollahpouri, H., Mobasher, B., Gupta, T.: Towards multi-stakeholder utility evaluation of recommender systems. In: Cena, F., Desmarais, M.C., Dicheva, D. (eds.) Late-breaking Results, Posters, Demos, Doctoral Consortium and Workshops Proceedings of the 24th ACM Conference on User Modeling, Adaptation and Personalisation (UMAP 2016), Halifax, Canada, July 13-16, 2016., CEUR Workshop Proceedings, vol. 1618. CEUR-WS.org (2016). http://ceur-ws.org/Vol-1618/SOAP_paper2.pdf

  • Burke, R., Sonboli, N., Ordonez-Gauger, A.: Balanced neighborhoods for multi-sided fairness in recommendation. In: Friedler, S.A., Wilson, C. (eds.) Conference on Fairness, Accountability and Transparency, FAT 2018, 23–24 February 2018, New York, NY, USA, Proceedings of Machine Learning Research, vol. 81, pp. 202–214. PMLR (2018). http://proceedings.mlr.press/v81/burke18a.html

  • Campos, P.G., Díez, F., Cantador, I.: Time-aware recommender systems: a comprehensive survey and analysis of existing evaluation protocols. User Model. User-Adapt. Interact. 24(1–2), 67–119 (2014). https://doi.org/10.1007/s11257-012-9136-x


  • Chen, J., Dong, H., Wang, X., Feng, F., Wang, M., He, X.: Bias and debias in recommender system: A survey and future directions. arXiv preprint arXiv:2010.03240 (2020)

  • Chen, L., Hsu, F., Chen, M., Hsu, Y.: Developing recommender systems with the consideration of product profitability for sellers. Inf. Sci. 178(4), 1032–1048 (2008). https://doi.org/10.1016/j.ins.2007.09.027

  • Christakopoulou, K., Kawale, J., Banerjee, A.: Recommendation with capacity constraints. In: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, CIKM 2017, Singapore, November 06–10, 2017, pp. 1439–1448. ACM (2017). https://doi.org/10.1145/3132847.3133034

  • Cowell, F.A.: Measurement of inequality. Handb. Income Distrib. 1, 87–166 (2000)


  • Cowell, F.A., Kuga, K.: Inequality measurement: an axiomatic approach. Eur. Econ. Rev. 15(3), 287–305 (1981)


  • Cremonesi, P., Koren, Y., Turrin, R.: Performance of recommender algorithms on top-n recommendation tasks. In: Amatriain, X., Torrens, M., Resnick, P., Zanker, M. (eds.) Proceedings of the 2010 ACM Conference on Recommender Systems, RecSys 2010, Barcelona, Spain, September 26–30, 2010, pp. 39–46. ACM (2010). https://doi.org/10.1145/1864708.1864721

  • Csiszár, I.: A class of measures of informativity of observation channels. Period. Math. Hung. 2(1–4), 191–213 (1972)


  • Dalton, H.: The measurement of the inequality of incomes. Econ. J. 30(119), 348–361 (1920)


  • Das, A., Mathieu, C., Ricketts, D.: Maximizing profit using recommender systems. CoRR arXiv:0908.3633 (2009)

  • Deldjoo, Y., Anelli, V.W., Zamani, H., Kouki, A.B., Noia, T.D.: Recommender systems fairness evaluation via generalized cross entropy. In: Proceedings of the Workshop on Recommendation in Multi-stakeholder Environments co-located with the 13th ACM Conference on Recommender Systems (RecSys 2019), Copenhagen, Denmark, September 20, 2019, CEUR Workshop Proceedings, vol. 2440. CEUR-WS.org (2019). http://ceur-ws.org/Vol-2440/short3.pdf

  • Deldjoo, Y., Noia, T.D., Merra, F.A.: Adversarial machine learning in recommender systems (aml-recsys). In: WSDM ’20: The 13th ACM International Conference on Web Search and Data Mining, Houston, TX, USA, February 3–7, 2020, pp. 869–872. ACM (2020a). https://doi.org/10.1145/3336191.3371877

  • Deldjoo, Y., Noia, T.D., Sciascio, E.D., Merra, F.A.: How dataset characteristics affect the robustness of collaborative recommendation models. In: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2020, Virtual Event, China, July 25–30, 2020, pp. 951–960. ACM (2020b). https://doi.org/10.1145/3397271.3401046

  • Deldjoo, Y., Schedl, M., Cremonesi, P., Pasi, G.: Recommender systems leveraging multimedia content. ACM Comput. Surv. 53(5), 1–38 (2020c)


  • Deldjoo, Y., Di Noia, T., Merra, F.A.: A survey on adversarial recommender systems: from attack/defense strategies to generative adversarial networks. ACM Comput. Surv. (2021). https://doi.org/10.1145/3439729


  • Dong, W., Moses, C., Li, K.: Efficient k-nearest neighbor graph construction for generic similarity measures. In: Proceedings of the 20th International Conference on World Wide Web, pp. 577–586. ACM (2011)

  • Ekstrand, M.D., Tian, M., Azpiazu, I.M., Ekstrand, J.D., Anuyah, O., McNeill, D., Pera, M.S.: All the cool kids, how do they fit in?: popularity and demographic biases in recommender evaluation and effectiveness. In: Conference on Fairness, Accountability and Transparency, pp. 172–186 (2018)

  • Grgic-Hlaca, N., Zafar, M.B., Gummadi, K.P., Weller, A.: Beyond distributive fairness in algorithmic decision making: Feature selection for procedurally fair learning. In: McIlraith, S.A., Weinberger, K.Q. (eds.) Proceedings of the 32nd AAAI Conference on Artificial Intelligence, (AAAI-18), the 30th innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orleans, Louisiana, USA, February 2–7, 2018, pp. 51–60. AAAI Press (2018). https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16523

  • Gunawardana, A., Shani, G.: Evaluating recommender systems. In: Ricci, F., Rokach, L., Shapira, B. (eds.) Recommender Systems Handbook, pp. 265–308. Springer, (2015). https://doi.org/10.1007/978-1-4899-7637-6_8

  • Hardt, M., Price, E., Srebro, N.: Equality of opportunity in supervised learning. In: Lee, D.D., Sugiyama, M., von Luxburg, U., Guyon, I., Garnett, R. (eds.) Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, December 5–10, 2016, Barcelona, Spain, pp. 3315–3323 (2016). http://papers.nips.cc/paper/6374-equality-of-opportunity-in-supervised-learning

  • Havrda, J., Charvát, F.: Quantification method of classification processes. Concept of structural \(a\)-entropy. Kybernetika 3(1), 30–35 (1967)


  • He, X., Liao, L., Zhang, H., Nie, L., Hu, X., Chua, T.: Neural collaborative filtering. In: Barrett, R., Cummings, R., Agichtein, E., Gabrilovich, E. (eds.) Proceedings of the 26th International Conference on World Wide Web, WWW 2017, Perth, Australia, April 3–7, 2017, pp. 173–182. ACM (2017). https://doi.org/10.1145/3038912.3052569

  • He, R., McAuley, J.J.: Ups and downs: modeling the visual evolution of fashion trends with one-class collaborative filtering. In: Bourdeau, J., Hendler, J., Nkambou, R., Horrocks, I., Zhao, B.Y. (eds.) Proceedings of the 25th International Conference on World Wide Web, WWW 2016, Montreal, Canada, April 11–15, 2016, pp. 507–517. Springer, (2016). https://doi.org/10.1145/2872427.2883037

  • Herlocker, J.L., Konstan, J.A., Riedl, J.: An empirical analysis of design choices in neighborhood-based collaborative filtering algorithms. Inf. Retr. 5(4), 287–310 (2002). https://doi.org/10.1023/A:1020443909834


  • Hu, Y., Koren, Y., Volinsky, C.: Collaborative filtering for implicit feedback datasets. In: Proceedings of the 8th IEEE International Conference on Data Mining (ICDM 2008), December 15–19, 2008, Pisa, Italy, pp. 263–272. IEEE Computer Society (2008). https://doi.org/10.1109/ICDM.2008.22

  • Jannach, D., Adomavicius, G.: Price and profit awareness in recommender systems. CoRR arXiv:1707.08029 (2017)

  • Jannach, D., Lerche, L., Kamehkhosh, I., Jugovac, M.: What recommenders recommend: an analysis of recommendation biases and possible countermeasures. User Model. User-Adapt. Interact. 25(5), 427–491 (2015). https://doi.org/10.1007/s11257-015-9165-3


  • Jannach, D., Resnick, P., Tuzhilin, A., Zanker, M.: Recommender systems: beyond matrix completion. Commun. ACM 59(11), 94–102 (2016). https://doi.org/10.1145/2891406


  • Järvelin, K., Kekäläinen, J.: Cumulated gain-based evaluation of IR techniques. ACM Trans. Inf. Syst. (TOIS) 20(4), 422–446 (2002)


  • Kapur, J.N., Kesavan, H.K.: The Generalized Maximum Entropy Principle (with Applications). Sandford Educational Press, Waterloo, Ontario (1987)


  • Kim, Y., Stratos, K., Sarikaya, R.: Frustratingly easy neural domain adaptation. In: Calzolari, N., Matsumoto, Y., Prasad, R. (eds.) COLING 2016, 26th International Conference on Computational Linguistics, Proceedings of the Conference: Technical Papers, December 11–16, 2016, Osaka, Japan, pp. 387–396. ACL (2016). http://aclweb.org/anthology/C/C16/C16-1038.pdf

  • Koren, Y.: Factorization meets the neighborhood: a multifaceted collaborative filtering model. In: Li, Y., Liu, B., Sarawagi, S. (eds.) Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Las Vegas, Nevada, USA, August 24–27, 2008, pp. 426–434. ACM (2008). https://doi.org/10.1145/1401890.1401944

  • Koren, Y., Bell, R., Volinsky, C.: Matrix factorization techniques for recommender systems. Computer 42(8), 30–37 (2009)


  • Lang, K.: Newsweeder: learning to filter netnews. In: Proceedings of the 12th International Machine Learning Conference (ML95) (1995)

  • Liu, W., Burke, R.: Personalizing fairness-aware re-ranking. In: 2nd FATREC Workshop on Responsible Recommendation (2018)

  • McAuley, J.J., Targett, C., Shi, Q., van den Hengel, A.: Image-based recommendations on styles and substitutes. In: Baeza-Yates, R.A., Lalmas, M., Moffat, A., Ribeiro-Neto, B.A. (eds.) Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, Santiago, Chile, August 9–13, 2015, pp. 43–52. ACM, (2015). https://doi.org/10.1145/2766462.2767755

  • McNee, S.M., Riedl, J., Konstan, J.A.: Being accurate is not enough: how accuracy metrics have hurt recommender systems. In: Olson, G.M., Jeffries, R. (eds.) Extended Abstracts Proceedings of the 2006 Conference on Human Factors in Computing Systems, CHI 2006, Montréal, Québec, Canada, April 22–27, 2006, pp. 1097–1101. ACM (2006). https://doi.org/10.1145/1125451.1125659

  • Mehrotra, R., Anderson, A., Diaz, F., Sharma, A., Wallach, H.M., Yilmaz, E.: Auditing search engines for differential satisfaction across demographics. In: Barrett, R., Cummings, R., Agichtein, E., Gabrilovich, E. (eds.) Proceedings of the 26th International Conference on World Wide Web Companion, Perth, Australia, April 3–7, 2017, pp. 626–633. ACM, (2017). https://doi.org/10.1145/3041021.3054197

  • Mehrotra, R., McInerney, J., Bouchard, H., Lalmas, M., Diaz, F.: Towards a fair marketplace: Counterfactual evaluation of the trade-off between relevance, fairness & satisfaction in recommendation systems. In: Cuzzocrea, A., Allan, J., Paton, N.W., Srivastava, D., Agrawal, R., Broder, A.Z., Zaki, M.J., Candan, K.S., Labrinidis, A., Schuster, A., Wang, H. (eds.) Proceedings of the 27th ACM International Conference on Information and Knowledge Management, CIKM 2018, Torino, Italy, October 22–26, 2018, pp. 2243–2251. ACM (2018). https://doi.org/10.1145/3269206.3272027

  • Ning, X., Karypis, G.: SLIM: sparse linear methods for top-n recommender systems. In: Cook, D.J., Pei, J., Wang, W., Zaïane, O.R., Wu, X. (eds.) 11th IEEE International Conference on Data Mining, ICDM 2011, Vancouver, BC, Canada, December 11–14, 2011, pp. 497–506. IEEE Computer Society (2011). https://doi.org/10.1109/ICDM.2011.134

  • Pan, R., Zhou, Y., Cao, B., Liu, N.N., Lukose, R.M., Scholz, M., Yang, Q.: One-class collaborative filtering. In: Proceedings of the 8th IEEE International Conference on Data Mining (ICDM 2008), December 15–19, 2008, Pisa, Italy, pp. 502–511. IEEE Computer Society (2008). https://doi.org/10.1109/ICDM.2008.16

  • Panniello, U., Gorgoglione, M., Hill, S., Hosanagar, K.: Incorporating profit margins into recommender systems: a randomized field experiment of purchasing behavior and consumer trust. University of Pennsylvania, Scholarly Commons (2014)

  • Pigou, A.: Wealth and Welfare, PCMI Collection. Macmillan and Company, Limited, Canada (1912)


  • Qamar, A.M., Gaussier, E., Chevallet, J.P., Lim, J.H.: Similarity learning for nearest neighbor classification. In: Data Mining, 2008. ICDM’08. 8th IEEE International Conference on, pp. 983–988. IEEE (2008)

  • Rendle, S., Freudenthaler, C., Gantner, Z., Schmidt-Thieme, L.: BPR: bayesian personalized ranking from implicit feedback. In: Bilmes, J.A., Ng, A.Y. (eds.) UAI 2009, Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence, Montreal, QC, Canada, June 18-21, 2009, pp. 452–461. AUAI Press (2009). https://dslpitt.org/uai/displayArticleDetails.jsp?mmnu=1&smnu=2&article_id=1630&proceeding_id=25

  • Sapiezynski, P., Kassarnig, V., Wilson, C.: Academic performance prediction in a gender-imbalanced environment. In: 1st FATREC Workshop on Responsible Recommendation (2017)

  • Sarwar, B., Karypis, G., Konstan, J., Riedl, J.: Analysis of recommendation algorithms for e-commerce. In: Proceedings of the 2nd ACM Conference on Electronic Commerce, pp. 158–167. ACM (2000)

  • Sarwar, B.M., Karypis, G., Konstan, J.A., Riedl, J.: Item-based collaborative filtering recommendation algorithms. In: Shen, V.Y., Saito, N., Lyu, M.R., Zurko, M.E. (eds.) Proceedings of the 10th International World Wide Web Conference, WWW 10, Hong Kong, China, May 1–5, 2001, pp. 285–295. ACM (2001). https://doi.org/10.1145/371920.372071

  • Shani, G., Heckerman, D., Brafman, R.I.: An mdp-based recommender system. J. Mach. Learn. Res. 6, 1265–1295 (2005). http://jmlr.org/papers/v6/shani05a.html

  • Singh, A., Joachims, T.: Fairness of exposure in rankings. In: Guo, Y., Farooq, F. (eds.) Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD 2018, London, UK, August 19–23, 2018, pp. 2219–2228. ACM (2018). https://doi.org/10.1145/3219819.3220088

  • Speicher, T., Heidari, H., Grgic-Hlaca, N., Gummadi, K.P., Singla, A., Weller, A., Zafar, M.B.: A unified approach to quantifying algorithmic unfairness: Measuring individual & group unfairness via inequality indices. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 2239–2248. ACM (2018)

  • Sühr, T., Biega, A.J., Zehlike, M., Gummadi, K.P., Chakraborty, A.: Two-sided fairness for repeated matchings in two-sided markets: A case study of a ride-hailing platform. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD 2019, Anchorage, AK, USA, August 4–8, 2019, pp. 3082–3092 (2019). https://doi.org/10.1145/3292500.3330793

  • Sürer, Ö., Burke, R., Malthouse, E.C.: Multistakeholder recommendation with provider constraints. In: Proceedings of the 12th ACM Conference on Recommender Systems, RecSys 2018, Vancouver, BC, Canada, October 2–7, 2018, pp. 54–62 (2018). https://doi.org/10.1145/3240323.3240350

  • Tsintzou, V., Pitoura, E., Tsaparas, P.: Bias disparity in recommendation systems. In: Proceedings of the Workshop on Recommendation in Multi-stakeholder Environments co-located with the 13th ACM Conference on Recommender Systems (RecSys 2019), Copenhagen, Denmark, September 20, 2019, CEUR Workshop Proceedings, vol. 2440. CEUR-WS.org (2019). http://ceur-ws.org/Vol-2440/short4.pdf

  • Verma, S., Rubin, J.: Fairness definitions explained. In: Proceedings of the International Workshop on Software Fairness, FairWare@ICSE 2018, Gothenburg, Sweden, May 29, 2018, pp. 1–7. ACM (2018). https://doi.org/10.1145/3194770.3194776

  • Wang, X., He, X., Chua, T.: Learning and reasoning on graph for recommendation. In: WSDM ’20: The Thirteenth ACM International Conference on Web Search and Data Mining, Houston, TX, USA, February 3–7, 2020, pp. 890–893. ACM (2020). https://doi.org/10.1145/3336191.3371873

  • Wang, H., Wu, C.: A mathematical model for product selection strategies in a recommender system. Expert Syst. Appl. 36(3), 7299–7308 (2009). https://doi.org/10.1016/j.eswa.2008.09.006


  • Yao, S., Huang, B.: Beyond parity: Fairness objectives for collaborative filtering. In: Guyon, I., von Luxburg, U., Bengio, S., Wallach, H.M., Fergus, R., Vishwanathan, S.V.N., Garnett, R. (eds.) Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 4–9 December 2017, Long Beach, CA, USA, pp. 2921–2930 (2017). http://papers.nips.cc/paper/6885-beyond-parity-fairness-objectives-for-collaborative-filtering

  • Zafar, M.B., Valera, I., Gomez-Rodriguez, M., Gummadi, K.P.: Fairness beyond disparate treatment & disparate impact: Learning classification without disparate mistreatment. In: Barrett, R., Cummings, R., Agichtein, E., Gabrilovich, E. (eds.) Proceedings of the 26th International Conference on World Wide Web, WWW 2017, Perth, Australia, April 3-7, 2017, pp. 1171–1180. ACM, (2017). https://doi.org/10.1145/3038912.3052660

  • Zafar, M.B., Valera, I., Gomez-Rodriguez, M., Gummadi, K.P.: Fairness constraints: a flexible approach for fair classification. J. Mach. Learn. Res. 20, 75:1–75:42 (2019). http://jmlr.org/papers/v20/18-262.html

  • Zafar, M.B., Valera, I., Gomez-Rodriguez, M., Gummadi, K.P., Weller, A.: From parity to preference-based notions of fairness in classification. In: Guyon, I., von Luxburg, U., Bengio, S., Wallach, H.M., Fergus, R., Vishwanathan, S.V.N., Garnett, R. (eds.) Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 4–9 December 2017, Long Beach, CA, USA, pp. 229–239 (2017). http://papers.nips.cc/paper/6627-from-parity-to-preference-based-notions-of-fairness-in-classification

  • Zamani, H., Croft, W.B.: Learning a joint search and recommendation model from user-item interactions. In: Proceedings of the 13th International Conference on Web Search and Data Mining, WSDM ’20, p. 717–725. Association for Computing Machinery, New York, NY, USA (2020). https://doi.org/10.1145/3336191.3371818

  • Zehlike, M., Bonchi, F., Castillo, C., Hajian, S., Megahed, M., Baeza-Yates, R.A.: Fa*ir: A fair top-k ranking algorithm. In: Lim, E., Winslett, M., Sanderson, M., Fu, A.W., Sun, J., Culpepper, J.S., Lo, E., Ho, J.C., Donato, D., Agrawal, R., Zheng, Y., Castillo, C., Sun, A., Tseng, V.S., Li, C. (eds.) Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, CIKM 2017, Singapore, November 06–10, 2017, pp. 1569–1578. ACM (2017). https://doi.org/10.1145/3132847.3132938

  • Zhai, C., Lafferty, J.D.: A study of smoothing methods for language models applied to ad hoc information retrieval. In: Croft, W.B., Harper, D.J., Kraft, D.H., Zobel, J. (eds.) SIGIR 2001: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, September 9–13, 2001, New Orleans, Louisiana, USA, pp. 334–342. ACM, (2001). https://doi.org/10.1145/383952.384019

  • Zheng, Y., Ghane, N., Sabouri, M.: Personalized educational learning with multi-stakeholder optimizations. In: Papadopoulos, G.A., Samaras, G., Weibelzahl, S., Jannach, D., Santos, O.C. (eds.) Adjunct Publication of the 27th Conference on User Modeling, Adaptation and Personalization, UMAP 2019, Larnaca, Cyprus, June 09–12, 2019, pp. 283–289. ACM (2019). https://doi.org/10.1145/3314183.3323843

  • Zhu, Z., Hu, X., Caverlee, J.: Fairness-aware tensor-based recommendation. In: Cuzzocrea, A., Allan, J., Paton, N.W., Srivastava, D., Agrawal, R., Broder, A.Z., Zaki, M.J., Candan, K.S., Labrinidis, A., Schuster, A., Wang, H. (eds.) Proceedings of the 27th ACM International Conference on Information and Knowledge Management, CIKM 2018, Torino, Italy, October 22–26, 2018, pp. 1153–1162. ACM (2018). https://doi.org/10.1145/3269206.3271795


Acknowledgements

The authors thank the reviewers for their thoughtful comments and suggestions. This work was supported in part by the Ministerio de Ciencia, Innovación y Universidades (Reference: PID2019-108965GB-I00) and in part by the Center for Intelligent Information Retrieval. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect those of the sponsors.

Corresponding author

Correspondence to Yashar Deldjoo.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Theoretical analysis of GCE properties

In this appendix, we provide a theoretical analysis of the proposed probabilistic metric, GCE, for measuring unfairness. Previous work (Speicher et al. 2018) identified four properties that inequality indices, including measures of unfairness, should satisfy: (1) anonymity, (2) population invariance, (3) the transfer principle, and (4) zero normalization.

We claim that GCE satisfies all four properties. For the sake of clarity, we prove that they hold for a simplified version of the proposed probabilistic unfairness metric, namely GCE when \(p_f\) is uniform. Our proofs build on the GCE formulation for discrete attributes presented in Eq. (3). Assuming \(p_f\) is uniform, the GCE formulation becomes:

$$\begin{aligned} I_{\text {uniform}}(m, a)&= \frac{1}{\beta \cdot (1-\beta )} \left[ \sum _{j=1}^n{\left( \frac{1}{n}\right) ^\beta \cdot p_m^{(1-\beta )}(a_j)} - 1 \right] \nonumber \\&= \frac{1}{\beta \cdot (1-\beta )} \left[ \sum _{j=1}^n{\left( \frac{1}{n}\right) ^\beta \cdot \left( \frac{v_j}{Z}\right) ^{(1-\beta )}} - 1 \right] \nonumber \\&= \frac{1}{n \beta \cdot (1-\beta )} \left[ \sum _{j=1}^n{ \left( \frac{v_j}{\mu }\right) ^{(1-\beta )}} - n \right] \end{aligned}$$
(9)

where \(p_m(a_j)=v_j/Z\), i.e., \(Z = \sum _{j=1}^n{v_j}\), and \(\mu = Z / n\) denotes the average value. For brevity, we denote \(I_{\text {uniform}}(m, a)\) as \(I_{\text {uniform}}({\mathbf {v}})\) where \({\mathbf {v}} = [v_1, v_2, \cdots , v_n] \in {\mathbb {R}}^n\) is the vector of all values corresponding to the attribute a obtained by the model m.
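As a sanity check (not part of the paper), Eq. (9) is straightforward to implement; the following is a minimal NumPy sketch, where the helper name `gce_uniform` is our own choice:

```python
import numpy as np

def gce_uniform(v, beta):
    """Simplified GCE of Eq. (9), i.e., GCE with a uniform fair
    distribution p_f. `v` holds the benefit values v_j and `beta`
    must differ from 0 and 1."""
    v = np.asarray(v, dtype=float)
    n = len(v)
    mu = v.mean()  # mu = Z / n, with Z the normalization constant
    return (np.sum((v / mu) ** (1.0 - beta)) - n) / (n * beta * (1.0 - beta))

print(gce_uniform([2.0, 2.0, 2.0, 2.0], beta=2.0))  # 0.0: all-equal values
print(gce_uniform([1.0, 3.0], beta=2.0))            # ≈ -0.1667: unequal values
```

For an all-equal vector each term \(v_j/\mu\) equals one, so the bracketed sum equals \(n\) and the measure is exactly zero, in line with the zero normalization property discussed below.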

Anonymity. According to the anonymity property, the inequality measure should not depend on any characteristic of the attributes other than the values obtained by the model. As shown in Eq. (9), the measure depends only on the attribute values, i.e., the \(v_j\)s, and on the average value \(\mu\), which is itself computed from the values as \(\frac{\sum _{j=1}^n{v_j}}{n}\). Therefore, this property is satisfied by \(I_{\text {uniform}}\).

Population invariance. This property indicates that the inequality measure is independent of the population size.

Proof

To prove that \(I_{\text {uniform}}\) satisfies the population invariance property, let \({\mathbf {v}}' = <{\mathbf {v}}, {\mathbf {v}}, \cdots , {\mathbf {v}}> \in {\mathbb {R}}^{nk}\) denote a k-replication of the vector \({\mathbf {v}}\). Note that the average value of \({\mathbf {v}}'\) is still \(\mu\), since both the sum of the values and the population size are multiplied by k. Therefore, \(I_{\text {uniform}}({\mathbf {v}}')\) is computed as:

$$\begin{aligned} I_{\text {uniform}}({\mathbf {v}}')&= \frac{1}{nk \beta \cdot (1-\beta )} \left[ \sum _{j=1}^{nk}{ \left( \frac{v'_j}{\mu } \right) ^{(1-\beta )}} - nk \right] \\&= \frac{1}{nk \beta \cdot (1-\beta )} \left[ k \sum _{j=1}^{n}{ \left( \frac{v_j}{\mu } \right) ^{(1-\beta )}} - nk \right] \\&= \frac{1}{n \beta \cdot (1-\beta )} \left[ \sum _{j=1}^{n}{ \left( \frac{v_j}{\mu } \right) ^{(1-\beta )}} - n \right] \\&= I_{\text {uniform}}({\mathbf {v}}) \end{aligned}$$

\(\square\)
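The replication argument above can also be verified numerically. The sketch below assumes a hypothetical `gce_uniform` helper implementing Eq. (9) in NumPy (our own naming, not from the paper):

```python
import numpy as np

def gce_uniform(v, beta):
    # Simplified GCE of Eq. (9) with a uniform fair distribution p_f.
    v = np.asarray(v, dtype=float)
    n = len(v)
    return (np.sum((v / v.mean()) ** (1.0 - beta)) - n) / (n * beta * (1.0 - beta))

v = np.array([1.0, 3.0, 4.0])
for k in (2, 5, 10):
    v_rep = np.tile(v, k)  # the k-replication v' used in the proof
    # population invariance: the measure is unchanged by replication
    assert np.isclose(gce_uniform(v, beta=2.0), gce_uniform(v_rep, beta=2.0))
```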

The transfer principle. According to the transfer principle, also known as the Pigou–Dalton principle (Dalton 1920; Pigou 1912), transferring benefit from an attribute value with a higher benefit to one with a lower benefit must decrease the inequality, provided the transfer does not reverse the relative order of the two values.

Proof

Assume we transfer \(\delta\) from \(v_j\) to \(v_{j'}\), such that \(v_j > v_{j'}\) and \(0< \delta < \frac{v_j - v_{j'}}{2}\) so this transfer does not reverse the relative position of these two attribute values. This results in \({\mathbf {v}}' = [v_1, v_2, \ldots , v_j-\delta , \ldots , v_{j'}+\delta , \ldots , v_n] \in {\mathbb {R}}^n\). Therefore,

$$\begin{aligned}&I_{\text {uniform}}({\mathbf {v}}') - I_{\text {uniform}}({\mathbf {v}}) \nonumber \\&\quad = \frac{1}{n \beta \cdot (1-\beta )} \left[ \sum _{j=1}^n{ \left( \frac{v'_j}{\mu }\right) ^{(1-\beta )}} - n \right] - \frac{1}{n \beta \cdot (1-\beta )} \left[ \sum _{j=1}^n{ \left( \frac{v_j}{\mu }\right) ^{(1-\beta )}} - n \right] \nonumber \\&\quad = \frac{1}{n \beta \cdot (1-\beta )} \left[ \sum _{j=1}^n{\left( \left( \frac{v'_j}{\mu }\right) ^{(1-\beta )} - \left( \frac{v_j}{\mu }\right) ^{(1-\beta )}\right) } \right] \nonumber \\&\quad = \frac{1}{n \beta \cdot (1-\beta ) \cdot \mu ^{(1-\beta )}} \left[ (v_j - \delta )^{(1-\beta )} + (v_{j'} + \delta )^{(1-\beta )} - v_j^{(1-\beta )} - v_{j'}^{(1-\beta )} \right] \end{aligned}$$
(10)

To obtain the maximum value of this difference, we compute its derivative with respect to \(\delta\) and set it to zero, as follows:

$$\begin{aligned}&\frac{\partial (I_{\text {uniform}}({\mathbf {v}}') - I_{\text {uniform}}({\mathbf {v}}))}{\partial \delta } = 0 \\&\quad \Rightarrow \frac{1}{n \beta \cdot (1-\beta ) \cdot \mu ^{(1-\beta )}} \left[ -(1-\beta )(v_j - \delta )^{-\beta } + (1-\beta )(v_{j'} + \delta )^{-\beta } \right] = 0 \\&\quad \Rightarrow -(v_j - \delta )^{-\beta } + (v_{j'} + \delta )^{-\beta } = 0 \\&\quad \Rightarrow \delta = \frac{v_j-v_{j'}}{2} \end{aligned}$$

Since \(\frac{\partial ^2(I_{\text {uniform}}({\mathbf {v}}') - I_{\text {uniform}}({\mathbf {v}}))}{\partial \delta ^2} < 0\), the computed \(\delta\) gives us the maximum value for the given function. Therefore, according to Eq. (10), since \(0< \delta < \frac{v_j-v_{j'}}{2}\), we have:

$$\begin{aligned}&I_{\text {uniform}}({\mathbf {v}}') - I_{\text {uniform}}({\mathbf {v}})\\&\quad< \frac{1}{n \beta \cdot (1-\beta ) \cdot \mu ^{(1-\beta )}} \left[ \left( v_j - \frac{v_j-v_{j'}}{2} \right) ^{(1-\beta )} + \left( v_{j'} + \frac{v_j-v_{j'}}{2} \right) ^{(1-\beta )} - v_j^{(1-\beta )} - v_{j'}^{(1-\beta )} \right] \\&\quad = \frac{1}{n \beta \cdot (1-\beta ) \cdot \mu ^{(1-\beta )}} \left[ \left( \frac{v_j+v_{j'}}{2} \right) ^{(1-\beta )} + \left( \frac{v_j+v_{j'}}{2} \right) ^{(1-\beta )} - v_j^{(1-\beta )} - v_{j'}^{(1-\beta )} \right] \\&\quad = \frac{1}{n \beta \cdot (1-\beta ) \cdot \mu ^{(1-\beta )}} \left[ 2^\beta (v_j+v_{j'})^{(1-\beta )} - v_j^{(1-\beta )} - v_{j'}^{(1-\beta )} \right] \\&\quad< \frac{1}{n \beta \cdot (1-\beta ) \cdot \mu ^{(1-\beta )}} \left[ 2^\beta (2v_{j'})^{(1-\beta )} - v_j^{(1-\beta )} - v_{j'}^{(1-\beta )} \right] \\&\quad = \frac{1}{n \beta \cdot (1-\beta ) \cdot \mu ^{(1-\beta )}} \left[ 2(v_{j'})^{(1-\beta )} - v_j^{(1-\beta )} - v_{j'}^{(1-\beta )} \right] \\&\quad = \frac{1}{n \beta \cdot (1-\beta ) \cdot \mu ^{(1-\beta )}} \left[ (v_{j'})^{(1-\beta )} - v_j^{(1-\beta )} \right] \\&\quad < 0 \end{aligned}$$

Therefore, \(I_{\text {uniform}}({\mathbf {v}}') < I_{\text {uniform}}({\mathbf {v}})\), and thus, \(I_{\text {uniform}}\) satisfies the transfer principle. \(\square\)

Zero normalization. According to this property, the inequality measure should be minimized when all attribute values are equal (i.e., \(p_m\) is the uniform distribution), and the minimum value should be zero.

Proof

To prove this property, we use the Lagrange multiplier approach. The Lagrange function is defined as:

$$\begin{aligned} {\mathcal {L}}({\mathbf {v}}, \lambda ) = \frac{1}{n \beta \cdot (1-\beta )} \left[ \sum _{j=1}^n{ \left( \frac{v_j}{\mu }\right) ^{(1-\beta )}} - n \right] - \lambda \left( \sum _{j=1}^n{\frac{v_j}{n}} - \mu \right) \end{aligned}$$
(11)

where \(\lambda\) is the Lagrange multiplier. Therefore, we have:

$$\begin{aligned} {\left\{ \begin{array}{ll} \frac{\partial {\mathcal {L}}({\mathbf {v}}, \lambda )}{\partial v_j} = \frac{1}{n \beta \mu } \cdot \left( \frac{v_j}{\mu }\right) ^{-\beta } - \frac{\lambda }{n} \\ \frac{\partial {\mathcal {L}}({\mathbf {v}}, \lambda )}{\partial \lambda } = \sum _{j=1}^n{\frac{v_j}{n}} - \mu \end{array}\right. } \end{aligned}$$
(12)

Setting the above partial derivatives to zero results in \(v_1 = v_2 = \cdots = v_n = \mu\). Therefore, we have:

$$\begin{aligned}&\min _{{\mathbf {v}}}{\frac{1}{n \beta \cdot (1-\beta )} \left[ \sum _{j=1}^n{ \left( \frac{v_j}{\mu }\right) ^{(1-\beta )}} - n \right] } \\&\quad = \frac{1}{n \beta \cdot (1-\beta )} \left[ \sum _{j=1}^n{ \left( \frac{\mu }{\mu }\right) ^{(1-\beta )}} - n \right] \\&\quad = \frac{1}{n \beta \cdot (1-\beta )} \left[ n - n \right] \\&\quad = 0 \end{aligned}$$

Therefore, \(I_{\text {uniform}}\) satisfies the zero normalization property. \(\square\)
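As a numerical illustration of zero normalization, the following sketch (again assuming our hypothetical `gce_uniform` helper for Eq. (9)) checks that an all-equal benefit vector attains exactly zero while unequal vectors do not:

```python
import numpy as np

def gce_uniform(v, beta):
    # Simplified GCE of Eq. (9) with a uniform fair distribution p_f.
    v = np.asarray(v, dtype=float)
    n = len(v)
    return (np.sum((v / v.mean()) ** (1.0 - beta)) - n) / (n * beta * (1.0 - beta))

# An all-equal benefit vector attains exactly zero ...
assert abs(gce_uniform([5.0] * 4, beta=2.0)) < 1e-12
# ... while unequal vectors yield a nonzero value.
assert abs(gce_uniform([1.0, 3.0, 4.0], beta=2.0)) > 1e-6
```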

Summary. In this appendix, we theoretically studied GCE, and the provided proofs show that it satisfies the anonymity, population invariance, transfer principle, and zero normalization properties under the uniformity assumption for the fair distribution. The proofs can be extended to the general case by relaxing this assumption, since they do not exploit any specific property of the uniform distribution; we only used its simple form to improve readability and clarity.

Appendix B: Full results

In this section, we present the results for all the datasets and for the item and user attributes that were not included in the main text because of space constraints. First, Table 14 reports the item GCE based on the price attribute for all the datasets (rather than only for Toys & Games, as in Sect. 5.1).

Second, our results on user attributes (interactions, helpfulness, and happiness for the Amazon datasets; age and gender for MovieLens) are presented for the remaining datasets, complementing the analysis of Amazon Toys & Games already shown in Sect. 5.2: Amazon Electronics is described in Table 15, Amazon Video Games in Table 16, and MovieLens-1M in Table 17.

Table 14 ItemGCE using price as feature on the tested datasets
Table 15 UserGCE for Amazon Electronics dataset using the three user features considered
Table 16 UserGCE for Amazon Video Games dataset using the three user features considered
Table 17 UserGCE for MovieLens-1M dataset using the two user features considered for the rest of the datasets, plus age and gender

About this article

Cite this article

Deldjoo, Y., Anelli, V.W., Zamani, H. et al. A flexible framework for evaluating user and item fairness in recommender systems. User Model User-Adap Inter 31, 457–511 (2021). https://doi.org/10.1007/s11257-020-09285-1
