Function File: [coeff] = pca(X)
Function File: [coeff] = pca(X, Name, Value)
Function File: [coeff,score,latent] = pca(…)
Function File: [coeff,score,latent,tsquared] = pca(…)
Function File: [coeff,score,latent,tsquared,explained,mu] = pca(…)

Performs a principal component analysis on a data matrix X.

A principal component analysis of a data matrix of n observations in a p-dimensional space returns a p-by-p transformation matrix that performs a change of basis on the data. The first component of the new basis is the direction that maximizes the variance of the projected data.
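The change of basis described above can be illustrated without calling pca at all. The following is a minimal sketch, assuming a small made-up data set and using the eigenvectors of the sample covariance matrix as the new basis; it is only an illustration of the idea, not the code path pca itself follows (its default algorithm is svd).

```octave
% Toy data (made up for illustration): 5 observations (rows), 2 variables (columns).
X = [2 0; 0 1; -2 0; 0 -1; 0 0];

% Center each variable, then form the p-by-p sample covariance matrix.
Xc = X - mean(X);
C  = (Xc' * Xc) / (size(X, 1) - 1);

% The eigenvectors of C form the new basis; order them by decreasing eigenvalue.
[V, D] = eig(C);
[lambda, order] = sort(diag(D), 'descend');
V = V(:, order);

% The variance of the data projected on the first basis vector
% equals the largest eigenvalue.
proj_var = var(Xc * V(:, 1));
```

For this data the first direction lies along the first variable, since that is where the observations are most spread out.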

Input argument:

  • x : an n-by-p data matrix, with one observation per row

Pair arguments:

  • Algorithm : the algorithm to use, either eig, for eigenvalue decomposition, or svd (default), for singular value decomposition
  • Centered : boolean indicator for centering the observation data; true by default
  • Economy : boolean indicator for the economy-size output; true by default. When true, pca returns only the elements of latent that are not necessarily zero, and the corresponding columns of coeff and score; that is, when n <= p, only the first n - 1
  • NumComponents : the number of components k to return; if k < p, only the first k columns of coeff and score are returned
  • Rows : action to take with missing values, one of: complete (default), missing values are removed before computation; pairwise (only with algorithm eig), the covariance of rows with missing data is computed using the available data, but the resulting covariance matrix may not be positive definite, in which case pca terminates; all, missing values are not allowed and pca terminates with an error if there are any
  • Weights : observation weights, a vector of positive values of length n
  • VariableWeights : variable weights, either a vector of positive values of length p or the string variance to use the sample variance as weights
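As a rough illustration of how the pair arguments are passed, the sketch below calls pca with a few of the options listed above. The data matrix is made up for the example, and the statistics package is assumed to be installed.

```octave
pkg load statistics

% Hypothetical data: 6 observations (rows) of 3 variables (columns).
X = [1 2 3; 2 1 4; 3 5 2; 4 3 6; 5 6 5; 6 4 7];

% Keep only the first two components: coeff2 is 3-by-2, score2 is 6-by-2.
[coeff2, score2] = pca(X, "NumComponents", 2);

% Select the decomposition explicitly; both algorithms estimate the same variances.
[~, ~, latent_svd] = pca(X, "Algorithm", "svd");
[~, ~, latent_eig] = pca(X, "Algorithm", "eig");

% Skip centering; mu is then reported as zero.
[~, ~, ~, ~, ~, mu0] = pca(X, "Centered", false);
```

The two values of latent should agree up to rounding error, so the choice of Algorithm mostly matters when a specific option requires it, such as pairwise handling of missing values, which is only available with eig.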

Return values:

  • coeff : the principal component coefficients, a p-by-p transformation matrix
  • score : the principal component scores, the representation of x in the principal component space
  • latent : the principal component variances, i.e., the eigenvalues of the covariance matrix of x
  • tsquared : Hotelling’s T-squared statistic for each observation in x
  • explained : the percentage of the variance explained by each principal component
  • mu : the estimated mean of each variable of x; it is zero if the data are not centered
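The return values are related to each other in ways that are easy to check numerically. The sketch below assumes default options, a made-up full-rank data matrix with more observations than variables, and an installed statistics package; the names ending in _check are hypothetical helpers introduced only for the comparison.

```octave
pkg load statistics

% Hypothetical data: 6 observations (rows) of 3 variables (columns).
X = [1 2 3; 2 1 4; 3 5 2; 4 3 6; 5 6 5; 6 4 7];

[coeff, score, latent, tsquared, explained, mu] = pca(X);

% score is the centered data expressed in the basis given by coeff.
% mu(:)' forces a row vector so the subtraction broadcasts over rows.
score_check = (X - mu(:)') * coeff;

% latent holds the variance of each column of score.
latent_check = var(score);

% explained is each variance as a percentage of the total variance.
explained_check = 100 * latent / sum(latent);
```

Each _check value should match the corresponding output of pca up to rounding error.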

Matlab compatibility note: the alternating least squares algorithm ’als’ and the associated options ’Coeff0’, ’Score0’, and ’Options’ are not yet implemented.

References

  1. Jolliffe, I. T., Principal Component Analysis, 2nd Edition, Springer, 2002

Package: statistics