
A flexible framework for evaluating user and item fairness in recommender systems

Published in: User Modeling and User-Adapted Interaction

Abstract

One common characteristic of research on fairness evaluation in machine learning is that it calls for some form of parity (equality), either in treatment, meaning the information about users' memberships in protected classes is ignored during training, or in impact, by enforcing proportionally beneficial outcomes for users in different protected classes. In the recommender systems community, fairness has been studied with respect to both users' and items' memberships in protected classes defined by sensitive attributes (e.g., gender or race for users, revenue in a multi-stakeholder setting for items). Here, too, the concept has commonly been interpreted as some form of equality, i.e., the degree to which the system meets the information needs of all its users equally. In this work, we propose a probabilistic framework based on generalized cross entropy (GCE) to measure the fairness of a given recommendation model. The framework has several advantages: first, it allows the system designer to define and measure fairness for both users and items, and it can be applied to any classification task; second, it can incorporate various notions of fairness, since it does not rely on specific predefined probability distributions, which can instead be chosen at design time; finally, its design includes a gain factor that can be flexibly defined to accommodate different accuracy-related metrics, so that fairness can be measured upon decision-support metrics (e.g., precision, recall) or rank-based measures (e.g., NDCG, MAP). An experimental evaluation on four real-world datasets shows the nuances captured by the proposed metric regarding fairness on different user and item attributes, where nearest-neighbor recommenders tend to obtain good results under equality constraints. We also observed that when users are clustered based on both their interactions with the system and other sensitive attributes, such as age or gender, algorithms with similar performance values exhibit different behaviors with respect to user fairness, owing to the different ways in which they process the data of each user cluster.

Fig. 1
Fig. 2


Notes

  1. http://techblog.netflix.com/2012/04/netflix-recommendations-beyond-5-stars.html.

  2. https://sifter.org/~simon/journal/20061211.html.

  3. The terms “poverty,” “welfare,” and “inequality” have been used interchangeably in the economics literature (Cowell 2000; Cowell and Kuga 1981) when referring to discrimination or unfairness.

  4. https://www.eeoc.gov/statutes/title-vii-civil-rights-act-1964.

  5. https://www.strategy-business.com/article/What-is-fair-when-it-comes-to-AI-bias?gko=827c0.

  6. https://www.etsy.com.

  7. These scenarios are becoming increasingly realistic, especially in edge computing settings, where computational resources are often quite limited.

  8. http://jmcauley.ucsd.edu/data/amazon/.

  9. https://github.com/sisinflab/DatasetsSplits/.

  10. We need to resort to a binary classification for gender since this is the information available in this dataset.

  11. http://ranksys.org/.

  12. Please note that in Sect. 3 we defined an unfairness metric \(\omega\) that produces a nonnegative value, such that if \(\omega (m, a) < \omega (m', a)\) we can conclude that model m is less unfair (i.e., more fair) than model \(m'\). This makes our unfairness metric consistent with the literature; see, e.g., Speicher et al. (2018), Sect. 2.3, “Axioms for measuring inequality,” where the authors define inequality as a nonnegative value. Our GCE metric reports values that are all negative, with the maximum occurring when GCE \(\approx 0\). The proposed GCE can thus be seen as a fairness metric, while its absolute form |GCE| represents unfairness (always nonnegative). For simplicity when discussing the results, however, we keep reporting the raw values for |GCE|, taking the sign into account when saying larger or smaller.

  13. In this challenge, the users correspond to the items being recommended.

References

  • Abdollahpouri, H., Adomavicius, G., Burke, R., Guy, I., Jannach, D., Kamishima, T., Krasnodebski, J., Pizzato, L.: Multistakeholder recommendation: survey and research directions. User Model. User-Adapt. Interact. 30(1), 127–158 (2020)


  • Abdollahpouri, H., Burke, R., Mobasher, B.: Controlling popularity bias in learning-to-rank recommendation. In: Proceedings of the 11th ACM Conference on Recommender Systems, pp. 42–46 (2017)

  • Abdollahpouri, H., Burke, R., Mobasher, B.: Recommender systems as multistakeholder environments. In: Proceedings of the 25th Conference on User Modeling, Adaptation and Personalization, UMAP 2017, Bratislava, Slovakia, July 09–12, 2017, pp. 347–348 (2017). https://doi.org/10.1145/3079628.3079657

  • Abebe, R., Kleinberg, J.M., Parkes, D.C.: Fair division via social comparison. In: Larson, K., Winikoff, M., Das, S., Durfee, E.H. (eds.) Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems, AAMAS 2017, São Paulo, Brazil, May 8–12, 2017, pp. 281–289. ACM, (2017). http://dl.acm.org/citation.cfm?id=3091171

  • Abel, F., Deldjoo, Y., Elahi, M., Kohlsdorf, D.: Recsys challenge 2017: offline and online evaluation. In: Proceedings of the 11th ACM Conference on Recommender Systems, RecSys ’17, pp. 372–373. ACM, New York (2017). https://doi.org/10.1145/3109859.3109954

  • Adomavicius, G., Zhang, J.: Impact of data characteristics on recommender systems performance. ACM Trans. Manag. Inf. Syst. 3(1), 3:1–3:17 (2012). https://doi.org/10.1145/2151163.2151166

  • Akoglu, L., Faloutsos, C.: Valuepick: towards a value-oriented dual-goal recommender system. In: Fan, W., Hsu, W., Webb, G.I., Liu, B., Zhang, C., Gunopulos, D., Wu, X. (eds.) ICDMW 2010, The 10th IEEE International Conference on Data Mining Workshops, Sydney, Australia, 13 December 2010, pp. 1151–1158. IEEE Computer Society, (2010). https://doi.org/10.1109/ICDMW.2010.68

  • Anelli, V.W., Bellini, V., Di Noia, T., Bruna, W.L., Tomeo, P., Di Sciascio, E.: An analysis on time- and session-aware diversification in recommender systems. In: UMAP, pp. 270–274. ACM (2017)

  • Anelli, V.W., Noia, T.D., Sciascio, E.D., Ragone, A., Trotta, J.: Time-aware personalized popularity in top-n recommendation. In: Workshop on Recommendation in Complex Scenarios co-located with 12th ACM Conference on Recommender Systems (RecSys 2018), Vancouver, BC, Canada, October 2–7, 2018 (2018)

  • Anelli, V.W., Noia, T.D., Sciascio, E.D., Ragone, A., Trotta, J.: Local popularity and time in top-n recommendation. In: L. Azzopardi, B. Stein, N. Fuhr, P. Mayr, C. Hauff, D. Hiemstra (eds.) Advances in Information Retrieval: 41st European Conference on IR Research, ECIR 2019, Cologne, Germany, April 14–18, 2019, Proceedings, Part I, Lecture Notes in Computer Science, vol. 11437, pp. 861–868. Springer (2019). https://doi.org/10.1007/978-3-030-15712-8_63

  • Azaria, A., Hassidim, A., Kraus, S., Eshkol, A., Weintraub, O., Netanely, I.: Movie recommender system for profit maximization. In: Yang, Q., King, I., Li, Q., Pu, P., Karypis, G. (eds.) 7th ACM Conference on Recommender Systems, RecSys ’13, Hong Kong, China, October 12–16, 2013, pp. 121–128. ACM (2013). https://doi.org/10.1145/2507157.2507162

  • Backstrom, L., Leskovec, J.: Supervised random walks: predicting and recommending links in social networks. In: Proceedings of the 4th International Conference on Web Search and Web Data Mining, WSDM 2011, Hong Kong, China, February 9–12, 2011, pp. 635–644 (2011)

  • Balabanovic, M., Shoham, Y.: Content-based, collaborative recommendation. Commun. ACM 40(3), 66–72 (1997). https://doi.org/10.1145/245108.245124

  • Barocas, S., Selbst, A.D.: Big data’s disparate impact. Cal. L. Rev. 104, 671 (2016)


  • Belkin, N.J., Robertson, S.E.: Some ethical and political implications of theoretical research in information science. In: Proceedings of the Association for Information Science, ASIS ’76, pp. 597–605 (1976)

  • Bellogín, A., Castells, P., Cantador, I.: Statistical biases in information retrieval metrics for recommender systems. Inf. Retr. J. 20(6), 606–634 (2017). https://doi.org/10.1007/s10791-017-9312-z


  • Biega, A.J., Gummadi, K.P., Weikum, G.: Equity of attention: amortizing individual fairness in rankings. In: Collins-Thompson, K., Mei, Q., Davison, B.D., Liu, Y., Yilmaz, E. (eds.) The 41st International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2018, Ann Arbor, MI, USA, July 08–12, 2018, pp. 405–414. ACM (2018). https://doi.org/10.1145/3209978.3210063

  • Billsus, D., Pazzani, M.J.: User modeling for adaptive news access. User Model. User-Adapt. Interact. 10(2–3), 147–180 (2000). https://doi.org/10.1023/A:1026501525781

  • Boratto, L., Fenu, G., Marras, M.: The effect of algorithmic bias on recommender systems for massive open online courses. In: Azzopardi, L., Stein, B., Fuhr, N., Mayr, P., Hauff, C., Hiemstra, D. (eds.) Advances in Information Retrieval–41st European Conference on IR Research, ECIR 2019, Cologne, Germany, April 14–18, 2019, Proceedings, Part I, Lecture Notes in Computer Science, vol. 11437, pp. 457–472. Springer (2019). https://doi.org/10.1007/978-3-030-15712-8_30

  • Botev, Z.I., Kroese, D.P.: The generalized cross entropy method, with applications to probability density estimation. Methodol. Comput. Appl. Probab. 13(1), 1–27 (2011)


  • Breese, J.S., Heckerman, D., Kadie, C.M.: Empirical analysis of predictive algorithms for collaborative filtering. In: Cooper, G.F., Moral, S. (eds.) UAI ’98: proceedings of the 14th Conference on Uncertainty in Artificial Intelligence, University of Wisconsin Business School, Madison, Wisconsin, USA, July 24–26, 1998, pp. 43–52. Morgan Kaufmann (1998). https://dslpitt.org/uai/displayArticleDetails.jsp?mmnu=1&smnu=2&article_id=231&proceeding_id=14

  • Burke, R.: Multisided fairness for recommendation. arXiv preprint arXiv:1707.00093 (2017)

  • Burke, R.D., Abdollahpouri, H., Mobasher, B., Gupta, T.: Towards multi-stakeholder utility evaluation of recommender systems. In: Cena, F., Desmarais, M.C., Dicheva, D. (eds.) Late-breaking Results, Posters, Demos, Doctoral Consortium and Workshops Proceedings of the 24th ACM Conference on User Modeling, Adaptation and Personalisation (UMAP 2016), Halifax, Canada, July 13-16, 2016., CEUR Workshop Proceedings, vol. 1618. CEUR-WS.org (2016). http://ceur-ws.org/Vol-1618/SOAP_paper2.pdf

  • Burke, R., Sonboli, N., Ordonez-Gauger, A.: Balanced neighborhoods for multi-sided fairness in recommendation. In: Friedler, S.A., Wilson, C. (eds.) Conference on Fairness, Accountability and Transparency, FAT 2018, 23–24 February 2018, New York, NY, USA, Proceedings of Machine Learning Research, vol. 81, pp. 202–214. PMLR (2018). http://proceedings.mlr.press/v81/burke18a.html

  • Campos, P.G., Díez, F., Cantador, I.: Time-aware recommender systems: a comprehensive survey and analysis of existing evaluation protocols. User Model. User-Adapt. Interact. 24(1–2), 67–119 (2014). https://doi.org/10.1007/s11257-012-9136-x


  • Chen, J., Dong, H., Wang, X., Feng, F., Wang, M., He, X.: Bias and debias in recommender system: A survey and future directions. arXiv preprint arXiv:2010.03240 (2020)

  • Chen, L., Hsu, F., Chen, M., Hsu, Y.: Developing recommender systems with the consideration of product profitability for sellers. Inf. Sci. 178(4), 1032–1048 (2008). https://doi.org/10.1016/j.ins.2007.09.027

  • Christakopoulou, K., Kawale, J., Banerjee, A.: Recommendation with capacity constraints. In: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, CIKM 2017, Singapore, November 06–10, 2017, pp. 1439–1448. ACM (2017). https://doi.org/10.1145/3132847.3133034

  • Cowell, F.A.: Measurement of inequality. Handb. Income Distrib. 1, 87–166 (2000)


  • Cowell, F.A., Kuga, K.: Inequality measurement: an axiomatic approach. Eur. Econ. Rev. 15(3), 287–305 (1981)


  • Cremonesi, P., Koren, Y., Turrin, R.: Performance of recommender algorithms on top-n recommendation tasks. In: Amatriain, X., Torrens, M., Resnick, P., Zanker, M. (eds.) Proceedings of the 2010 ACM Conference on Recommender Systems, RecSys 2010, Barcelona, Spain, September 26–30, 2010, pp. 39–46. ACM (2010). https://doi.org/10.1145/1864708.1864721

  • Csiszár, I.: A class of measures of informativity of observation channels. Period. Math. Hung. 2(1–4), 191–213 (1972)


  • Dalton, H.: The measurement of the inequality of incomes. Econ. J. 30(119), 348–361 (1920)


  • Das, A., Mathieu, C., Ricketts, D.: Maximizing profit using recommender systems. CoRR arXiv:0908.3633 (2009)

  • Deldjoo, Y., Anelli, V.W., Zamani, H., Kouki, A.B., Noia, T.D.: Recommender systems fairness evaluation via generalized cross entropy. In: Proceedings of the Workshop on Recommendation in Multi-stakeholder Environments co-located with the 13th ACM Conference on Recommender Systems (RecSys 2019), Copenhagen, Denmark, September 20, 2019, CEUR Workshop Proceedings, vol. 2440. CEUR-WS.org (2019). http://ceur-ws.org/Vol-2440/short3.pdf

  • Deldjoo, Y., Noia, T.D., Merra, F.A.: Adversarial machine learning in recommender systems (aml-recsys). In: WSDM ’20: The 13th ACM International Conference on Web Search and Data Mining, Houston, TX, USA, February 3–7, 2020, pp. 869–872. ACM (2020a). https://doi.org/10.1145/3336191.3371877

  • Deldjoo, Y., Noia, T.D., Sciascio, E.D., Merra, F.A.: How dataset characteristics affect the robustness of collaborative recommendation models. In: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2020, Virtual Event, China, July 25–30, 2020, pp. 951–960. ACM (2020b). https://doi.org/10.1145/3397271.3401046

  • Deldjoo, Y., Schedl, M., Cremonesi, P., Pasi, G.: Recommender systems leveraging multimedia content. ACM Comput. Surv. 53(5), 1–38 (2020c)


  • Deldjoo, Y., Di Noia, T., Merra, F.A.: A survey on adversarial recommender systems: from attack/defense strategies to generative adversarial networks. ACM Comput. Surv. (2021). https://doi.org/10.1145/3439729


  • Dong, W., Moses, C., Li, K.: Efficient k-nearest neighbor graph construction for generic similarity measures. In: Proceedings of the 20th International Conference on World Wide Web, pp. 577–586. ACM (2011)

  • Ekstrand, M.D., Tian, M., Azpiazu, I.M., Ekstrand, J.D., Anuyah, O., McNeill, D., Pera, M.S.: All the cool kids, how do they fit in?: popularity and demographic biases in recommender evaluation and effectiveness. In: Conference on Fairness, Accountability and Transparency, pp. 172–186 (2018)

  • Grgic-Hlaca, N., Zafar, M.B., Gummadi, K.P., Weller, A.: Beyond distributive fairness in algorithmic decision making: Feature selection for procedurally fair learning. In: McIlraith, S.A., Weinberger, K.Q. (eds.) Proceedings of the 32nd AAAI Conference on Artificial Intelligence, (AAAI-18), the 30th innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orleans, Louisiana, USA, February 2–7, 2018, pp. 51–60. AAAI Press (2018). https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16523

  • Gunawardana, A., Shani, G.: Evaluating recommender systems. In: Ricci, F., Rokach, L., Shapira, B. (eds.) Recommender Systems Handbook, pp. 265–308. Springer, (2015). https://doi.org/10.1007/978-1-4899-7637-6_8

  • Hardt, M., Price, E., Srebro, N.: Equality of opportunity in supervised learning. In: Lee, D.D., Sugiyama, M., von Luxburg, U., Guyon, I., Garnett, R. (eds.) Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, December 5–10, 2016, Barcelona, Spain, pp. 3315–3323 (2016). http://papers.nips.cc/paper/6374-equality-of-opportunity-in-supervised-learning

  • Havrda, J., Charvát, F.: Quantification method of classification processes. Concept of structural \(a\)-entropy. Kybernetika 3(1), 30–35 (1967)


  • He, X., Liao, L., Zhang, H., Nie, L., Hu, X., Chua, T.: Neural collaborative filtering. In: Barrett, R., Cummings, R., Agichtein, E., Gabrilovich, E. (eds.) Proceedings of the 26th International Conference on World Wide Web, WWW 2017, Perth, Australia, April 3–7, 2017, pp. 173–182. ACM (2017). https://doi.org/10.1145/3038912.3052569

  • He, R., McAuley, J.J.: Ups and downs: modeling the visual evolution of fashion trends with one-class collaborative filtering. In: Bourdeau, J., Hendler, J., Nkambou, R., Horrocks, I., Zhao, B.Y. (eds.) Proceedings of the 25th International Conference on World Wide Web, WWW 2016, Montreal, Canada, April 11–15, 2016, pp. 507–517. Springer, (2016). https://doi.org/10.1145/2872427.2883037

  • Herlocker, J.L., Konstan, J.A., Riedl, J.: An empirical analysis of design choices in neighborhood-based collaborative filtering algorithms. Inf. Retr. 5(4), 287–310 (2002). https://doi.org/10.1023/A:1020443909834


  • Hu, Y., Koren, Y., Volinsky, C.: Collaborative filtering for implicit feedback datasets. In: Proceedings of the 8th IEEE International Conference on Data Mining (ICDM 2008), December 15–19, 2008, Pisa, Italy, pp. 263–272. IEEE Computer Society (2008). https://doi.org/10.1109/ICDM.2008.22

  • Jannach, D., Adomavicius, G.: Price and profit awareness in recommender systems. CoRR arXiv:1707.08029 (2017)

  • Jannach, D., Lerche, L., Kamehkhosh, I., Jugovac, M.: What recommenders recommend: an analysis of recommendation biases and possible countermeasures. User Model. User-Adapt. Interact. 25(5), 427–491 (2015). https://doi.org/10.1007/s11257-015-9165-3


  • Jannach, D., Resnick, P., Tuzhilin, A., Zanker, M.: Recommender systems: beyond matrix completion. Commun. ACM 59(11), 94–102 (2016). https://doi.org/10.1145/2891406


  • Järvelin, K., Kekäläinen, J.: Cumulated gain-based evaluation of IR techniques. ACM Trans. Inf. Syst. (TOIS) 20(4), 422–446 (2002)


  • Kapur, J.N., Kesavan, H.K.: The Generalized Maximum Entropy Principle (with Applications). Sandford Educational Press, Waterloo, Ontario (1987)


  • Kim, Y., Stratos, K., Sarikaya, R.: Frustratingly easy neural domain adaptation. In: Calzolari, N., Matsumoto, Y., Prasad, R. (eds.) COLING 2016, 26th International Conference on Computational Linguistics, Proceedings of the Conference: Technical Papers, December 11–16, 2016, Osaka, Japan, pp. 387–396. ACL (2016). http://aclweb.org/anthology/C/C16/C16-1038.pdf

  • Koren, Y.: Factorization meets the neighborhood: a multifaceted collaborative filtering model. In: Li, Y., Liu, B., Sarawagi, S. (eds.) Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Las Vegas, Nevada, USA, August 24–27, 2008, pp. 426–434. ACM (2008). https://doi.org/10.1145/1401890.1401944

  • Koren, Y., Bell, R., Volinsky, C.: Matrix factorization techniques for recommender systems. Computer 42(8), 30–37 (2009)


  • Lang, K.: Newsweeder: learning to filter netnews. In: Proceedings of the 12th International Machine Learning Conference (ML95) (1995)

  • Liu, W., Burke, R.: Personalizing fairness-aware re-ranking. In: 2nd FATREC Workshop on Responsible Recommendation (2018)

  • McAuley, J.J., Targett, C., Shi, Q., van den Hengel, A.: Image-based recommendations on styles and substitutes. In: Baeza-Yates, R.A., Lalmas, M., Moffat, A., Ribeiro-Neto, B.A. (eds.) Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, Santiago, Chile, August 9–13, 2015, pp. 43–52. ACM, (2015). https://doi.org/10.1145/2766462.2767755

  • McNee, S.M., Riedl, J., Konstan, J.A.: Being accurate is not enough: how accuracy metrics have hurt recommender systems. In: Olson, G.M., Jeffries, R. (eds.) Extended Abstracts Proceedings of the 2006 Conference on Human Factors in Computing Systems, CHI 2006, Montréal, Québec, Canada, April 22–27, 2006, pp. 1097–1101. ACM (2006). https://doi.org/10.1145/1125451.1125659

  • Mehrotra, R., Anderson, A., Diaz, F., Sharma, A., Wallach, H.M., Yilmaz, E.: Auditing search engines for differential satisfaction across demographics. In: Barrett, R., Cummings, R., Agichtein, E., Gabrilovich, E. (eds.) Proceedings of the 26th International Conference on World Wide Web Companion, Perth, Australia, April 3–7, 2017, pp. 626–633. ACM, (2017). https://doi.org/10.1145/3041021.3054197

  • Mehrotra, R., McInerney, J., Bouchard, H., Lalmas, M., Diaz, F.: Towards a fair marketplace: Counterfactual evaluation of the trade-off between relevance, fairness & satisfaction in recommendation systems. In: Cuzzocrea, A., Allan, J., Paton, N.W., Srivastava, D., Agrawal, R., Broder, A.Z., Zaki, M.J., Candan, K.S., Labrinidis, A., Schuster, A., Wang, H. (eds.) Proceedings of the 27th ACM International Conference on Information and Knowledge Management, CIKM 2018, Torino, Italy, October 22–26, 2018, pp. 2243–2251. ACM (2018). https://doi.org/10.1145/3269206.3272027

  • Ning, X., Karypis, G.: SLIM: sparse linear methods for top-n recommender systems. In: Cook, D.J., Pei, J., Wang, W., Zaïane, O.R., Wu, X. (eds.) 11th IEEE International Conference on Data Mining, ICDM 2011, Vancouver, BC, Canada, December 11–14, 2011, pp. 497–506. IEEE Computer Society (2011). https://doi.org/10.1109/ICDM.2011.134

  • Pan, R., Zhou, Y., Cao, B., Liu, N.N., Lukose, R.M., Scholz, M., Yang, Q.: One-class collaborative filtering. In: Proceedings of the 8th IEEE International Conference on Data Mining (ICDM 2008), December 15–19, 2008, Pisa, Italy, pp. 502–511. IEEE Computer Society (2008). https://doi.org/10.1109/ICDM.2008.16

  • Panniello, U., Gorgoglione, M., Hill, S., Hosanagar, K.: Incorporating profit margins into recommender systems: a randomized field experiment of purchasing behavior and consumer trust. University of Pennsylvania, Scholarly Commons (2014)

  • Pigou, A.: Wealth and Welfare, PCMI Collection. Macmillan and Company, Limited, Canada (1912)


  • Qamar, A.M., Gaussier, E., Chevallet, J.P., Lim, J.H.: Similarity learning for nearest neighbor classification. In: Data Mining, 2008. ICDM’08. 8th IEEE International Conference on, pp. 983–988. IEEE (2008)

  • Rendle, S., Freudenthaler, C., Gantner, Z., Schmidt-Thieme, L.: BPR: bayesian personalized ranking from implicit feedback. In: Bilmes, J.A., Ng, A.Y. (eds.) UAI 2009, Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence, Montreal, QC, Canada, June 18-21, 2009, pp. 452–461. AUAI Press (2009). https://dslpitt.org/uai/displayArticleDetails.jsp?mmnu=1&smnu=2&article_id=1630&proceeding_id=25

  • Sapiezynski, P., Kassarnig, V., Wilson, C.: Academic performance prediction in a gender-imbalanced environment. In: 1st FATREC Workshop on Responsible Recommendation (2017)

  • Sarwar, B., Karypis, G., Konstan, J., Riedl, J.: Analysis of recommendation algorithms for e-commerce. In: Proceedings of the 2nd ACM Conference on Electronic Commerce, pp. 158–167. ACM (2000)

  • Sarwar, B.M., Karypis, G., Konstan, J.A., Riedl, J.: Item-based collaborative filtering recommendation algorithms. In: Shen, V.Y., Saito, N., Lyu, M.R., Zurko, M.E. (eds.) Proceedings of the 10th International World Wide Web Conference, WWW 10, Hong Kong, China, May 1–5, 2001, pp. 285–295. ACM (2001). https://doi.org/10.1145/371920.372071

  • Shani, G., Heckerman, D., Brafman, R.I.: An mdp-based recommender system. J. Mach. Learn. Res. 6, 1265–1295 (2005). http://jmlr.org/papers/v6/shani05a.html

  • Singh, A., Joachims, T.: Fairness of exposure in rankings. In: Guo, Y., Farooq, F. (eds.) Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD 2018, London, UK, August 19–23, 2018, pp. 2219–2228. ACM (2018). https://doi.org/10.1145/3219819.3220088

  • Speicher, T., Heidari, H., Grgic-Hlaca, N., Gummadi, K.P., Singla, A., Weller, A., Zafar, M.B.: A unified approach to quantifying algorithmic unfairness: Measuring individual & group unfairness via inequality indices. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 2239–2248. ACM (2018)

  • Sühr, T., Biega, A.J., Zehlike, M., Gummadi, K.P., Chakraborty, A.: Two-sided fairness for repeated matchings in two-sided markets: A case study of a ride-hailing platform. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD 2019, Anchorage, AK, USA, August 4–8, 2019, pp. 3082–3092 (2019). https://doi.org/10.1145/3292500.3330793

  • Sürer, Ö., Burke, R., Malthouse, E.C.: Multistakeholder recommendation with provider constraints. In: Proceedings of the 12th ACM Conference on Recommender Systems, RecSys 2018, Vancouver, BC, Canada, October 2–7, 2018, pp. 54–62 (2018). https://doi.org/10.1145/3240323.3240350

  • Tsintzou, V., Pitoura, E., Tsaparas, P.: Bias disparity in recommendation systems. In: Proceedings of the Workshop on Recommendation in Multi-stakeholder Environments co-located with the 13th ACM Conference on Recommender Systems (RecSys 2019), Copenhagen, Denmark, September 20, 2019, CEUR Workshop Proceedings, vol. 2440. CEUR-WS.org (2019). http://ceur-ws.org/Vol-2440/short4.pdf

  • Verma, S., Rubin, J.: Fairness definitions explained. In: Proceedings of the International Workshop on Software Fairness, FairWare@ICSE 2018, Gothenburg, Sweden, May 29, 2018, pp. 1–7. ACM (2018). https://doi.org/10.1145/3194770.3194776

  • Wang, X., He, X., Chua, T.: Learning and reasoning on graph for recommendation. In: WSDM ’20: The Thirteenth ACM International Conference on Web Search and Data Mining, Houston, TX, USA, February 3–7, 2020, pp. 890–893. ACM (2020). https://doi.org/10.1145/3336191.3371873

  • Wang, H., Wu, C.: A mathematical model for product selection strategies in a recommender system. Expert Syst. Appl. 36(3), 7299–7308 (2009). https://doi.org/10.1016/j.eswa.2008.09.006


  • Yao, S., Huang, B.: Beyond parity: Fairness objectives for collaborative filtering. In: Guyon, I., von Luxburg, U., Bengio, S., Wallach, H.M., Fergus, R., Vishwanathan, S.V.N., Garnett, R. (eds.) Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 4–9 December 2017, Long Beach, CA, USA, pp. 2921–2930 (2017). http://papers.nips.cc/paper/6885-beyond-parity-fairness-objectives-for-collaborative-filtering

  • Zafar, M.B., Valera, I., Gomez-Rodriguez, M., Gummadi, K.P.: Fairness beyond disparate treatment & disparate impact: Learning classification without disparate mistreatment. In: Barrett, R., Cummings, R., Agichtein, E., Gabrilovich, E. (eds.) Proceedings of the 26th International Conference on World Wide Web, WWW 2017, Perth, Australia, April 3-7, 2017, pp. 1171–1180. ACM, (2017). https://doi.org/10.1145/3038912.3052660

  • Zafar, M.B., Valera, I., Gomez-Rodriguez, M., Gummadi, K.P.: Fairness constraints: a flexible approach for fair classification. J. Mach. Learn. Res. 20, 75:1–75:42 (2019). http://jmlr.org/papers/v20/18-262.html

  • Zafar, M.B., Valera, I., Gomez-Rodriguez, M., Gummadi, K.P., Weller, A.: From parity to preference-based notions of fairness in classification. In: Guyon, I., von Luxburg, U., Bengio, S., Wallach, H.M., Fergus, R., Vishwanathan, S.V.N., Garnett, R. (eds.) Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 4–9 December 2017, Long Beach, CA, USA, pp. 229–239 (2017). http://papers.nips.cc/paper/6627-from-parity-to-preference-based-notions-of-fairness-in-classification

  • Zamani, H., Croft, W.B.: Learning a joint search and recommendation model from user-item interactions. In: Proceedings of the 13th International Conference on Web Search and Data Mining, WSDM ’20, p. 717–725. Association for Computing Machinery, New York, NY, USA (2020). https://doi.org/10.1145/3336191.3371818

  • Zehlike, M., Bonchi, F., Castillo, C., Hajian, S., Megahed, M., Baeza-Yates, R.A.: Fa*ir: A fair top-k ranking algorithm. In: Lim, E., Winslett, M., Sanderson, M., Fu, A.W., Sun, J., Culpepper, J.S., Lo, E., Ho, J.C., Donato, D., Agrawal, R., Zheng, Y., Castillo, C., Sun, A., Tseng, V.S., Li, C. (eds.) Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, CIKM 2017, Singapore, November 06–10, 2017, pp. 1569–1578. ACM (2017). https://doi.org/10.1145/3132847.3132938

  • Zhai, C., Lafferty, J.D.: A study of smoothing methods for language models applied to ad hoc information retrieval. In: Croft, W.B., Harper, D.J., Kraft, D.H., Zobel, J. (eds.) SIGIR 2001: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, September 9–13, 2001, New Orleans, Louisiana, USA, pp. 334–342. ACM, (2001). https://doi.org/10.1145/383952.384019

  • Zheng, Y., Ghane, N., Sabouri, M.: Personalized educational learning with multi-stakeholder optimizations. In: Papadopoulos, G.A., Samaras, G., Weibelzahl, S., Jannach, D., Santos, O.C. (eds.) Adjunct Publication of the 27th Conference on User Modeling, Adaptation and Personalization, UMAP 2019, Larnaca, Cyprus, June 09–12, 2019, pp. 283–289. ACM (2019). https://doi.org/10.1145/3314183.3323843

  • Zhu, Z., Hu, X., Caverlee, J.: Fairness-aware tensor-based recommendation. In: Cuzzocrea, A., Allan, J., Paton, N.W., Srivastava, D., Agrawal, R., Broder, A.Z., Zaki, M.J., Candan, K.S., Labrinidis, A., Schuster, A., Wang, H. (eds.) Proceedings of the 27th ACM International Conference on Information and Knowledge Management, CIKM 2018, Torino, Italy, October 22–26, 2018, pp. 1153–1162. ACM (2018). https://doi.org/10.1145/3269206.3271795


Acknowledgements

The authors thank the reviewers for their thoughtful comments and suggestions. This work was supported in part by the Ministerio de Ciencia, Innovación y Universidades (Reference: PID2019-108965GB-I00) and in part by the Center for Intelligent Information Retrieval. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect those of the sponsors.

Corresponding author

Correspondence to Yashar Deldjoo.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Theoretical analysis of GCE properties

In this appendix, we provide a theoretical analysis of the proposed probabilistic metric, GCE, for measuring unfairness. Previous work (Speicher et al. 2018) identified four properties that inequality indices, including measures of unfairness, should satisfy: (1) anonymity, (2) population invariance, (3) the transfer principle, and (4) zero normalization.

We claim that GCE satisfies all four properties. For the sake of clarity, we prove that they hold for a simplified version of the proposed probabilistic unfairness metric, namely GCE when \(p_f\) is uniform. Our proofs build on the GCE formulation for discrete attributes presented in Eq. (3). Assuming \(p_f\) is uniform, the GCE formulation becomes:

$$\begin{aligned} I_{\text {uniform}}(m, a)&= \frac{1}{\beta \cdot (1-\beta )} \left[ \sum _{j=1}^n{\left( \frac{1}{n}\right) ^\beta \cdot p_m^{(1-\beta )}(a_j)} - 1 \right] \nonumber \\&= \frac{1}{\beta \cdot (1-\beta )} \left[ \sum _{j=1}^n{\left( \frac{1}{n}\right) ^\beta \cdot \left( \frac{v_j}{Z}\right) ^{(1-\beta )}} - 1 \right] \nonumber \\&= \frac{1}{n \beta \cdot (1-\beta )} \left[ \sum _{j=1}^n{ \left( \frac{v_j}{\mu }\right) ^{(1-\beta )}} - n \right] \end{aligned}$$
(9)

where \(p_m(a_j)=v_j/Z\), i.e., \(Z = \sum _{j=1}^n{v_j}\), and \(\mu = Z / n\) denotes the average value. For brevity, we denote \(I_{\text {uniform}}(m, a)\) as \(I_{\text {uniform}}({\mathbf {v}})\) where \({\mathbf {v}} = [v_1, v_2, \cdots , v_n] \in {\mathbb {R}}^n\) is the vector of all values corresponding to the attribute a obtained by the model m.
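As a sanity check (not part of the paper), Eq. (9) is straightforward to implement; the following is a minimal NumPy sketch, where the helper name `gce_uniform` is our own choice:

```python
import numpy as np

def gce_uniform(v, beta):
    """Simplified GCE of Eq. (9), i.e., GCE with a uniform fair
    distribution p_f. `v` holds the benefit values v_j and `beta`
    must differ from 0 and 1."""
    v = np.asarray(v, dtype=float)
    n = len(v)
    mu = v.mean()  # mu = Z / n, with Z the normalization constant
    return (np.sum((v / mu) ** (1.0 - beta)) - n) / (n * beta * (1.0 - beta))

print(gce_uniform([2.0, 2.0, 2.0, 2.0], beta=2.0))  # 0.0: all-equal values
print(gce_uniform([1.0, 3.0], beta=2.0))            # ≈ -0.1667: unequal values
```

For an all-equal vector each term \(v_j/\mu\) equals one, so the bracketed sum equals \(n\) and the measure is exactly zero, in line with the zero normalization property discussed below.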

Anonymity. According to the anonymity property, the inequality measure should not depend on any characteristic of the attributes other than the values obtained by the model. As shown in Eq. (9), the measure depends only on the attribute values, i.e., the \(v_j\)s, and on the average value \(\mu\), which is itself computed from the values as \(\frac{\sum _{j=1}^n{v_j}}{n}\). Therefore, this property is satisfied by \(I_{\text {uniform}}\).

Population invariance. This property indicates that the inequality measure is independent of the population size.

Proof

To prove that \(I_{\text {uniform}}\) satisfies the population invariance property, let \({\mathbf {v}}' = <{\mathbf {v}}, {\mathbf {v}}, \cdots , {\mathbf {v}}> \in {\mathbb {R}}^{nk}\) denote a k-replication of the vector \({\mathbf {v}}\). Note that the average value of \({\mathbf {v}}'\) is still \(\mu\), since both the sum of the values and the population size are multiplied by k. Therefore, \(I_{\text {uniform}}({\mathbf {v}}')\) is computed as:

$$\begin{aligned} I_{\text {uniform}}({\mathbf {v}}')&= \frac{1}{nk \beta \cdot (1-\beta )} \left[ \sum _{j=1}^{nk}{ \left( \frac{v'_j}{\mu } \right) ^{(1-\beta )}} - nk \right] \\&= \frac{1}{nk \beta \cdot (1-\beta )} \left[ k \sum _{j=1}^{n}{ \left( \frac{v_j}{\mu } \right) ^{(1-\beta )}} - nk \right] \\&= \frac{1}{n \beta \cdot (1-\beta )} \left[ \sum _{j=1}^{n}{ \left( \frac{v_j}{\mu } \right) ^{(1-\beta )}} - n \right] \\&= I_{\text {uniform}}({\mathbf {v}}) \end{aligned}$$

\(\square\)
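The replication argument above can also be verified numerically. The sketch below assumes a hypothetical `gce_uniform` helper implementing Eq. (9) in NumPy (our own naming, not from the paper):

```python
import numpy as np

def gce_uniform(v, beta):
    # Simplified GCE of Eq. (9) with a uniform fair distribution p_f.
    v = np.asarray(v, dtype=float)
    n = len(v)
    return (np.sum((v / v.mean()) ** (1.0 - beta)) - n) / (n * beta * (1.0 - beta))

v = np.array([1.0, 3.0, 4.0])
for k in (2, 5, 10):
    v_rep = np.tile(v, k)  # the k-replication v' used in the proof
    # population invariance: the measure is unchanged by replication
    assert np.isclose(gce_uniform(v, beta=2.0), gce_uniform(v_rep, beta=2.0))
```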

The transfer principle. According to the transfer principle, also known as the Pigou–Dalton principle (Dalton 1920; Pigou 1912), transferring benefit from an attribute value with a higher benefit to one with a lower benefit must decrease the inequality, provided the transfer does not reverse the relative order of the two values.

Proof

Assume we transfer \(\delta\) from \(v_j\) to \(v_{j'}\), such that \(v_j > v_{j'}\) and \(0< \delta < \frac{v_j - v_{j'}}{2}\) so this transfer does not reverse the relative position of these two attribute values. This results in \({\mathbf {v}}' = [v_1, v_2, \ldots , v_j-\delta , \ldots , v_{j'}+\delta , \ldots , v_n] \in {\mathbb {R}}^n\). Therefore,

$$\begin{aligned}&I_{\text {uniform}}({\mathbf {v}}') - I_{\text {uniform}}({\mathbf {v}}) \nonumber \\&\quad = \frac{1}{n \beta \cdot (1-\beta )} \left[ \sum _{j=1}^n{ \left( \frac{v'_j}{\mu }\right) ^{(1-\beta )}} - n \right] - \frac{1}{n \beta \cdot (1-\beta )} \left[ \sum _{j=1}^n{ \left( \frac{v_j}{\mu }\right) ^{(1-\beta )}} - n \right] \nonumber \\&\quad = \frac{1}{n \beta \cdot (1-\beta )} \left[ \sum _{j=1}^n{\left( \left( \frac{v'_j}{\mu }\right) ^{(1-\beta )} - \left( \frac{v_j}{\mu }\right) ^{(1-\beta )}\right) } \right] \nonumber \\&\quad = \frac{1}{n \beta \cdot (1-\beta ) \cdot \mu ^{(1-\beta )}} \left[ (v_j - \delta )^{(1-\beta )} + (v_{j'} + \delta )^{(1-\beta )} - v_j^{(1-\beta )} - v_{j'}^{(1-\beta )} \right] \end{aligned}$$
(10)

To obtain the maximum value of this difference, we compute its derivative with respect to \(\delta\) and set it to zero, as follows:

$$\begin{aligned}&\frac{\partial (I_{\text {uniform}}({\mathbf {v}}') - I_{\text {uniform}}({\mathbf {v}}))}{\partial \delta } = 0 \\&\quad \Rightarrow \frac{1}{n \beta \cdot (1-\beta ) \cdot \mu ^{(1-\beta )}} \left[ -(1-\beta )(v_j - \delta )^{-\beta } + (1-\beta )(v_{j'} + \delta )^{-\beta } \right] = 0 \\&\quad \Rightarrow -(v_j - \delta )^{-\beta } + (v_{j'} + \delta )^{-\beta } = 0 \\&\quad \Rightarrow \delta = \frac{v_j-v_{j'}}{2} \end{aligned}$$

Since \(\frac{\partial ^2(I_{\text {uniform}}({\mathbf {v}}') - I_{\text {uniform}}({\mathbf {v}}))}{\partial \delta ^2} < 0\), the computed \(\delta\) gives us the maximum value for the given function. Therefore, according to Eq. (10), since \(0< \delta < \frac{v_j-v_{j'}}{2}\), we have:

$$\begin{aligned}&I_{\text {uniform}}({\mathbf {v}}') - I_{\text {uniform}}({\mathbf {v}})\\&\quad< \frac{1}{n \beta \cdot (1-\beta ) \cdot \mu ^{(1-\beta )}} \left[ \left( v_j - \frac{v_j-v_{j'}}{2} \right) ^{(1-\beta )} + \left( v_{j'} + \frac{v_j-v_{j'}}{2} \right) ^{(1-\beta )} - v_j^{(1-\beta )} - v_{j'}^{(1-\beta )} \right] \\&\quad = \frac{1}{n \beta \cdot (1-\beta ) \cdot \mu ^{(1-\beta )}} \left[ \left( \frac{v_j+v_{j'}}{2} \right) ^{(1-\beta )} + \left( \frac{v_j+v_{j'}}{2} \right) ^{(1-\beta )} - v_j^{(1-\beta )} - v_{j'}^{(1-\beta )} \right] \\&\quad = \frac{1}{n \beta \cdot (1-\beta ) \cdot \mu ^{(1-\beta )}} \left[ 2^\beta (v_j+v_{j'})^{(1-\beta )} - v_j^{(1-\beta )} - v_{j'}^{(1-\beta )} \right] \\&\quad< \frac{1}{n \beta \cdot (1-\beta ) \cdot \mu ^{(1-\beta )}} \left[ 2^\beta (2v_{j'})^{(1-\beta )} - v_j^{(1-\beta )} - v_{j'}^{(1-\beta )} \right] \\&\quad = \frac{1}{n \beta \cdot (1-\beta ) \cdot \mu ^{(1-\beta )}} \left[ 2(v_{j'})^{(1-\beta )} - v_j^{(1-\beta )} - v_{j'}^{(1-\beta )} \right] \\&\quad = \frac{1}{n \beta \cdot (1-\beta ) \cdot \mu ^{(1-\beta )}} \left[ (v_{j'})^{(1-\beta )} - v_j^{(1-\beta )} \right] \\&\quad < 0 \end{aligned}$$

Therefore, \(I_{\text {uniform}}({\mathbf {v}}') < I_{\text {uniform}}({\mathbf {v}})\), and thus, \(I_{\text {uniform}}\) satisfies the transfer principle. \(\square\)

Zero normalization. According to this property, the inequality measure should be minimized when all attribute values are equal (i.e., \(p_m\) is the uniform distribution), and the minimum value should be zero.

Proof

To prove this property, we use the Lagrange multiplier approach. The Lagrange function is defined as:

$$\begin{aligned} {\mathcal {L}}({\mathbf {v}}, \lambda ) = \frac{1}{n \beta \cdot (1-\beta )} \left[ \sum _{j=1}^n{ \left( \frac{v_j}{\mu }\right) ^{(1-\beta )}} - n \right] - \lambda \left( \sum _{j=1}^n{\frac{v_j}{n}} - \mu \right) \end{aligned}$$
(11)

where \(\lambda\) is the Lagrange multiplier. Therefore, we have:

$$\begin{aligned} {\left\{ \begin{array}{ll} \frac{\partial {\mathcal {L}}({\mathbf {v}}, \lambda )}{\partial v_j} = \frac{1}{n \beta \mu } \cdot \left( \frac{v_j}{\mu }\right) ^{-\beta } - \frac{\lambda }{n} \\ \frac{\partial {\mathcal {L}}({\mathbf {v}}, \lambda )}{\partial \lambda } = \sum _{j=1}^n{\frac{v_j}{n}} - \mu \end{array}\right. } \end{aligned}$$
(12)

Setting the above partial derivatives to zero results in \(v_1 = v_2 = \cdots = v_n = \mu\). Therefore, we have:

$$\begin{aligned}&\min _{{\mathbf {v}}}{\frac{1}{n \beta \cdot (1-\beta )} \left[ \sum _{j=1}^n{ \left( \frac{v_j}{\mu }\right) ^{(1-\beta )}} - n \right] } \\&\quad = \frac{1}{n \beta \cdot (1-\beta )} \left[ \sum _{j=1}^n{ \left( \frac{\mu }{\mu }\right) ^{(1-\beta )}} - n \right] \\&\quad = \frac{1}{n \beta \cdot (1-\beta )} \left[ n - n \right] \\&\quad = 0 \end{aligned}$$

Therefore, \(I_{\text {uniform}}\) satisfies the zero normalization property. \(\square\)
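As a numerical illustration of zero normalization, the following sketch (again assuming our hypothetical `gce_uniform` helper for Eq. (9)) checks that an all-equal benefit vector attains exactly zero while unequal vectors do not:

```python
import numpy as np

def gce_uniform(v, beta):
    # Simplified GCE of Eq. (9) with a uniform fair distribution p_f.
    v = np.asarray(v, dtype=float)
    n = len(v)
    return (np.sum((v / v.mean()) ** (1.0 - beta)) - n) / (n * beta * (1.0 - beta))

# An all-equal benefit vector attains exactly zero ...
assert abs(gce_uniform([5.0] * 4, beta=2.0)) < 1e-12
# ... while unequal vectors yield a nonzero value.
assert abs(gce_uniform([1.0, 3.0, 4.0], beta=2.0)) > 1e-6
```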

Summary. In this appendix, we theoretically studied GCE, and the provided proofs show that it satisfies the anonymity, population invariance, transfer principle, and zero normalization properties under the uniformity assumption for the fair distribution. The proofs can be extended to the general case by relaxing this assumption, since they do not exploit any specific property of the uniform distribution; we only used its simple form to improve readability and clarity.

Appendix B: Full results

In this section, we present the results for all the datasets and for the item and user attributes that were not included in the main text because of space constraints. First, Table 14 reports the item GCE based on the price attribute for all the datasets (rather than only for Toys & Games, as in Sect. 5.1).

Second, our results on user attributes (interactions, helpfulness, and happiness for the Amazon datasets; age and gender for MovieLens) are presented for the remaining datasets, complementing the analysis of Amazon Toys & Games already shown in Sect. 5.2: Amazon Electronics is described in Table 15, Amazon Video Games in Table 16, and MovieLens-1M in Table 17.

Table 14 ItemGCE using price as feature on the tested datasets
Table 15 UserGCE for Amazon Electronics dataset using the three user features considered
Table 16 UserGCE for Amazon Video Games dataset using the three user features considered
Table 17 UserGCE for MovieLens-1M dataset using the two user features considered for the rest of the datasets, plus age and gender

About this article

Cite this article

Deldjoo, Y., Anelli, V.W., Zamani, H. et al. A flexible framework for evaluating user and item fairness in recommender systems. User Model User-Adap Inter 31, 457–511 (2021). https://doi.org/10.1007/s11257-020-09285-1
