The effect of strategic noise in linear regression

Autonomous Agents and Multi-Agent Systems

Abstract

We build on an emerging line of work which studies strategic manipulations in training data provided to machine learning algorithms. Specifically, we focus on the ubiquitous task of linear regression. Prior work focused on the design of strategyproof algorithms, which aim to prevent such manipulations altogether by aligning the incentives of data sources. However, algorithms used in practice are often not strategyproof, which induces a strategic game among the agents. We focus on a broad class of non-strategyproof algorithms for linear regression, namely \(\ell _p\) norm minimization (\(p > 1\)) with convex regularization. We show that when manipulations are bounded, every algorithm in this class admits a unique pure Nash equilibrium outcome. We also shed light on the structure of this equilibrium by uncovering a surprising connection between strategyproof algorithms and pure Nash equilibria of non-strategyproof algorithms in a broader setting, which may be of independent interest. Finally, we analyze the quality of equilibria under these algorithms in terms of the price of anarchy.
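As a rough illustration of the algorithm class named in the abstract, the sketch below fits a one-feature, no-intercept model by minimizing an \(\ell _p\) loss (\(p > 1\)) plus a ridge penalty. The function name `lp_regression_1d`, the restriction to a single slope parameter, the choice of ridge as the convex regularizer, and the use of ternary search are all assumptions made for this example; they are not taken from the article.

```python
def lp_regression_1d(xs, ys, p=2.0, lam=0.0, iters=200):
    """Minimize sum_i |y_i - w * x_i|**p + lam * w**2 over a scalar slope w.

    Hypothetical helper for illustration only: one feature, no intercept,
    ridge penalty standing in for a generic convex regularizer.
    """
    if p <= 1:
        raise ValueError("this sketch assumes p > 1")
    if lam < 0:
        raise ValueError("lam must be non-negative")
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")

    def loss(w):
        return sum(abs(y - w * x) ** p for x, y in zip(xs, ys)) + lam * w * w

    # Each term is convex in w and minimized at y_i / x_i (or at 0 for the
    # penalty), so the overall minimizer lies between the extremes of these.
    candidates = [y / x for x, y in zip(xs, ys) if x != 0] + [0.0]
    lo, hi = min(candidates), max(candidates)

    # Ternary search: valid because the objective is convex in w.
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if loss(m1) < loss(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2.0


# Honest reports that lie exactly on y = 3x, then one inflated report.
honest = lp_regression_1d([1.0, 2.0, 3.0], [3.0, 6.0, 9.0], p=1.5)
shifted = lp_regression_1d([1.0, 2.0, 3.0], [3.0, 6.0, 12.0], p=1.5)
```

In this toy setting, raising the third report pulls the fitted slope upward, which is the kind of manipulation incentive that makes such algorithms non-strategyproof.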

Notes

  1. In the regression literature, these are often known as independent and dependent variables, respectively.

  2. Following standard convention, we assume the last component of each \(\varvec{x_i}\) is a constant, say 1.

  3. Equilibria can generally depend on the full preferences, but results in Sect. 4 show only peaks matter.

  4. This is a mild condition which requires treating the agents symmetrically.

  5. A function is proper if the domain on which it is finite is non-empty.

  6. \(\mathcal {L}^{\infty }(\varvec{0},\beta )\) is known as the horizon function of \(\mathcal {L}\).

  7. This is weaker because for a strategyproof rule f, \(f(\varvec{y})\) is a dominant strategy equilibrium outcome (and thus also a Nash equilibrium outcome) under f itself.

  8. We slightly abuse notation here, with \(\lambda ^*(y'_i)\) referring to the unique value satisfying Eq. (8) for \(y'_i\).

  9. Whether agents 1 and 4 are strategic or honest does not matter in this example. We chose them to be strategic for ease of exposition.

  10. We abuse the terminology slightly for simplicity. The average PPoA refers to the average ratio of the loss under the PNE outcome of a mechanism to the loss under the OLS with honest reporting in our experiments.

  11. Our work allows for an agent to be honest or strategic while in most relevant literature, all agents are considered strategic. Thus we chose \(\alpha =1\) to be the default value.
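
The average PPoA described in note 10 can be read as a mean of per-experiment loss ratios. The sketch below follows that reading; the function name `average_ppoa` and the interpretation as a mean of ratios (rather than a ratio of means) are assumptions for illustration, and the numbers in the usage line are made up.

```python
def average_ppoa(pne_losses, ols_honest_losses):
    """Mean over experiments of (loss at PNE outcome) / (loss of OLS with honest reports).

    Hypothetical helper: assumes one pair of losses per experiment and
    strictly positive OLS losses.
    """
    if len(pne_losses) != len(ols_honest_losses) or not pne_losses:
        raise ValueError("need two non-empty lists of equal length")
    if any(b <= 0 for b in ols_honest_losses):
        raise ValueError("OLS losses must be strictly positive")
    ratios = [a / b for a, b in zip(pne_losses, ols_honest_losses)]
    return sum(ratios) / len(ratios)


# Two illustrative experiments with ratios 2.0 and 1.5.
example = average_ppoa([2.0, 3.0], [1.0, 2.0])
```

Since OLS with honest reporting minimizes the squared loss on the honest data, each ratio is at least 1 when losses are measured in squared error on that data.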

Author information

Corresponding author

Correspondence to Safwan Hossain.

Cite this article

Hossain, S., Shah, N. The effect of strategic noise in linear regression. Auton Agent Multi-Agent Syst 35, 21 (2021). https://doi.org/10.1007/s10458-021-09502-0

  • DOI: https://doi.org/10.1007/s10458-021-09502-0
