journal article Dec 31, 2009

Statistical data mining

Abstract
Data mining is widely used in modern science to extract signal from complex data sets. This article summarizes some of the key intellectual issues in the development of this field, largely from a historical perspective. There is particular emphasis on the Curse of Dimensionality, and its implications for non-parametric regression, classification, and cluster analysis. Copyright © 2009 John Wiley & Sons, Inc.

This article is categorized under: Statistical Learning and Exploratory Methods of the Data Sciences > Clustering and Classification

Details
Published: Dec 31, 2009
Vol/Issue: 2(1)
Pages: 9-25
Cite This Article
David L. Banks (2009). Statistical data mining. WIREs Computational Statistics, 2(1), 9-25. https://doi.org/10.1002/wics.53