Principal component analysis for interval data

L. Billard; J. Le‐Rademacher

doi:10.1002/wics.1231

journal article Sep 18, 2012

WIREs Computational Statistics Vol. 4 No. 6 pp. 535-540 · Wiley

Abstract

Principal component analysis for classical data is a method used frequently to reduce the effective dimension underlying a data set from p random variables to s ≪ p linear functions of those p random variables and their observed values. With contemporary large data sets, the data are often aggregated in some scientifically meaningful way, so that the resulting data are symbolic data (such as lists, intervals, histograms, and the like), though symbolic data can and do occur naturally and in smaller data sets. Since symbolic data have internal variations along with the familiar (between observations) variation of classical data, direct application of classical methods to symbolic data will ignore much of the information contained in the data. Our focus is to describe and illustrate principal component methodology for interval data. The significance of symbolic data in general, and of this article in particular, is illustrated by its applicability to the analysis of three key 21st-century challenges: networks, security data, and translational medicine. It is relatively easy to visualize the applicability to security data and translational medicine, though less easy to visualize its applicability to networks. Since an interval is typically denoted by (a,b), in a network interval, we let a be a pair of nodes and b be their edge with characteristics c and d, respectively. If this representation of a network interval is valid, then we can more easily visualize its applicability to networks also.

WIREs Comput Stat 2012, 4:535–540. doi: 10.1002/wics.1231

This article is categorized under:

Statistical and Graphical Methods of Data Analysis > Multivariate Analysis
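
The abstract's point that classical methods ignore the internal variation of symbolic data can be illustrated with a minimal sketch. It is not the article's estimator: it assumes values are uniformly distributed within each interval, and the function name and data are hypothetical. Under that assumption the total variance of an interval-valued variable splits into the variance of the interval midpoints (all a classical analysis of midpoints would see) plus the average internal variance (b - a)^2 / 12.

```python
def interval_variance(intervals):
    """Return (between, within, total) variance for a list of (a, b) intervals.

    Assumes values are uniformly distributed inside each interval
    (an illustrative assumption, not taken from the article).
    """
    n = len(intervals)
    centers = [(a + b) / 2 for a, b in intervals]
    mean_center = sum(centers) / n
    # Variation between observations: what a classical analysis of midpoints sees.
    between = sum((c - mean_center) ** 2 for c in centers) / n
    # Internal variation: a uniform distribution on [a, b] has variance (b - a)^2 / 12.
    within = sum((b - a) ** 2 / 12 for a, b in intervals) / n
    return between, within, between + within


# Hypothetical data: three interval-valued observations of one variable.
data = [(0.0, 6.0), (2.0, 8.0), (4.0, 10.0)]
between, within, total = interval_variance(data)
print(f"between={between:.3f} within={within:.3f} total={total:.3f}")
```

In this toy data set the internal variation is larger than the between-observation variation, so replacing each interval by its midpoint would discard more than half of the total variance.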

Details

Published: Sep 18, 2012
Vol/Issue: 4(6)
Pages: 535-540

Cite This Article

L. Billard, J. Le‐Rademacher (2012). Principal component analysis for interval data. WIREs Computational Statistics, 4(6), 535-540. https://doi.org/10.1002/wics.1231
