Abstract
In this contribution, we review boosting, one of the most effective machine learning methods for classification and regression. Most of the article takes the gradient descent point of view, although we also include the margin point of view. In particular, AdaBoost in classification and various versions of L2boosting in regression are covered. Advice on how to choose base (weak) learners and loss functions and pointers to software are also given for practitioners. Copyright © 2009 John Wiley & Sons, Inc.

This article is categorized under: Statistical Learning and Exploratory Methods of the Data Sciences > Classification and Regression Trees (CART)
Details
Published: Dec 31, 2009
Vol/Issue: 2(1)
Pages: 69-74
Cite This Article
Peter Bühlmann, Bin Yu (2009). Boosting. WIREs Computational Statistics, 2(1), 69-74. https://doi.org/10.1002/wics.55