当前位置: X-MOL 学术Mach. Learn. › 论文详情
Sparse hierarchical regression with polynomials
Machine Learning ( IF 2.809 ) Pub Date : 2020-01-24 , DOI: 10.1007/s10994-020-05868-6
Dimitris Bertsimas, Bart Van Parys

We present a novel method for sparse polynomial regression. We are interested in that degree r polynomial which depends on at most k inputs, counting at most \(\ell\) monomial terms, and minimizes the sum of the squares of its prediction errors. Such highly structured sparse regression was denoted by Bach (Advances in neural information processing systems, pp 105–112, 2009) as sparse hierarchical regression in the context of kernel learning. Hierarchical sparse specification aligns well with modern big data settings where many inputs are not relevant for prediction purposes and the functional complexity of the regressor needs to be controlled as to avoid overfitting. We propose an efficient two-step approach to this hierarchical sparse regression problem. First, we discard irrelevant inputs using an extremely fast input ranking heuristic. Secondly, we take advantage of modern cutting plane methods for integer optimization to solve the remaining reduced hierarchical \((k, \ell )\)-sparse problem exactly. The ability of our method to identify all k relevant inputs and all \(\ell\) monomial terms is shown empirically to experience a phase transition. Crucially, the same transition also presents itself in our ability to reject all irrelevant features and monomials as well. In the regime where our method is statistically powerful, its computational complexity is interestingly on par with Lasso based heuristics. Hierarchical sparsity can retain the flexibility of general nonparametric methods such as nearest neighbors or regression trees (CART), without sacrificing much statistical power. The presented work hence fills a void in terms of a lack of powerful disciplined nonlinear sparse regression methods in high-dimensional settings. Our method is shown empirically to scale to regression problems with \(n\approx 10{,}000\) observations for input dimension \(p\approx 1000\).
更新日期:2020-01-24

 

全部期刊列表>>
智控未来
聚焦商业经济政治法律
跟Nature、Science文章学绘图
控制与机器人
招募海内外科研人才,上自然官网
中洪博元
隐藏1h前已浏览文章
课题组网站
新版X-MOL期刊搜索和高级搜索功能介绍
ACS材料视界
x-mol收录
湖南大学化学化工学院刘松
上海有机所
李旸
南方科技大学
西湖大学
伊利诺伊大学香槟分校
徐明华
中山大学化学工程与技术学院
试剂库存
天合科研
down
wechat
bug