Abstract
We build on an emerging line of work which studies strategic manipulations in training data provided to machine learning algorithms. Specifically, we focus on the ubiquitous task of linear regression. Prior work focused on the design of strategyproof algorithms, which aim to prevent such manipulations altogether by aligning the incentives of data sources. However, algorithms used in practice are often not strategyproof, which induces a strategic game among the agents. We focus on a broad class of non-strategyproof algorithms for linear regression, namely \(\ell _p\) norm minimization (\(p > 1\)) with convex regularization. We show that when manipulations are bounded, every algorithm in this class admits a unique pure Nash equilibrium outcome. We also shed light on the structure of this equilibrium by uncovering a surprising connection between strategyproof algorithms and pure Nash equilibria of non-strategyproof algorithms in a broader setting, which may be of independent interest. Finally, we analyze the quality of equilibria under these algorithms in terms of the price of anarchy.
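For concreteness, the class of algorithms described above can be written as a single optimization problem. This is only a sketch under assumed notation that the abstract does not fix: \(n\) agents, feature vectors \(\varvec{x_i}\), reported responses \(\tilde{y}_i\), a coefficient vector \(\varvec{\beta}\), and a convex regularizer \(R\).

```latex
\[
  \varvec{\beta}^{*} \in \operatorname*{arg\,min}_{\varvec{\beta}}
  \; \sum_{i=1}^{n} \bigl| \tilde{y}_i - \varvec{\beta}^{\top} \varvec{x_i} \bigr|^{p}
  + R(\varvec{\beta}),
  \qquad p > 1 .
\]
```

Under this reading, \(p = 2\) with \(R \equiv 0\) is ordinary least squares, and \(p = 2\) with a squared \(\ell _2\) penalty on \(\varvec{\beta}\) is ridge regression.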
Notes
In the regression literature, these are often known as independent and dependent variables, respectively.
Following standard convention, we assume the last component of each \(\varvec{x_i}\) is a constant, say 1.
Equilibria can generally depend on the full preferences, but results in Sect. 4 show only peaks matter.
This is a mild condition which requires treating the agents symmetrically.
A function is proper if the domain on which it is finite is non-empty.
\(\mathcal {L}^{\infty }(\varvec{0},\beta )\) is known as the horizon function of \(\mathcal {L}\).
This is weaker because for a strategyproof rule f, \(f(\varvec{y})\) is a dominant strategy equilibrium outcome (and thus also a Nash equilibrium outcome) under f itself.
We slightly abuse notation here, with \(\lambda ^*(y'_i)\) referring to the unique value satisfying Eq. (8) for \(y'_i\).
Whether agents 1 and 4 are strategic or honest does not matter in this example. We chose them to be strategic for ease of exposition.
We abuse the terminology slightly for simplicity. The average PPoA refers to the average ratio of the loss under the PNE outcome of a mechanism to the loss under the OLS with honest reporting in our experiments.
Our work allows each agent to be either honest or strategic, whereas most of the related literature treats all agents as strategic. We therefore chose \(\alpha =1\) as the default value.
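The ratio described in the note on the average PPoA can be sketched in a few lines. This is an illustration only: the use of mean squared error as the loss, the function names `squared_loss` and `average_loss_ratio`, and the toy numbers are all assumptions, and the equilibrium predictions are taken as given rather than computed.

```python
import numpy as np

def squared_loss(y_honest, y_pred):
    """Mean squared error against the honest responses (assumed loss)."""
    y_honest = np.asarray(y_honest, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_honest - y_pred) ** 2))

def average_loss_ratio(trials):
    """Average, over trials, of loss(PNE outcome) / loss(OLS with honest reports).

    Each trial is a tuple (y_honest, pred_pne, pred_ols), where pred_pne and
    pred_ols are hypothetical, precomputed predictions on the same points.
    """
    ratios = [
        squared_loss(y, pred_pne) / squared_loss(y, pred_ols)
        for y, pred_pne, pred_ols in trials
    ]
    return float(np.mean(ratios))

# Toy data, invented purely for illustration.
trials = [
    (np.array([1.0, 2.0, 3.0]), np.array([1.0, 2.0, 5.0]), np.array([1.0, 2.0, 4.0])),
    (np.array([0.0, 0.0]), np.array([3.0, 0.0]), np.array([1.0, 0.0])),
]
avg_ratio = average_loss_ratio(trials)
```

The sketch averages per-trial ratios rather than dividing average losses, matching the wording "average ratio" in the note. It does not guard against a trial in which the honest OLS loss is exactly zero.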
Cite this article
Hossain, S., Shah, N. The effect of strategic noise in linear regression. Auton Agent Multi-Agent Syst 35, 21 (2021). https://doi.org/10.1007/s10458-021-09502-0
DOI: https://doi.org/10.1007/s10458-021-09502-0