CORRCOEF calculates the correlation matrix from pairwise correlations.
  The input data can contain missing values encoded as NaN.
  Missing data (NaNs) are handled by pairwise deletion [15].
  To avoid possible pitfalls, use case-wise deletion or check the
  correlation of the NaNs with your data (see below).
  A significance test of the hypothesis
  'correlation coefficient R is significantly different from zero'
  is included.

[...] = CORRCOEF(X);
      calculates the (auto-)correlation matrix of X
[...] = CORRCOEF(X,Y);
      calculates the cross-correlation between X and Y
  NOTE: matlab's CORRCOEF(X,Y) returns the result of CORRCOEF([X,Y]);
      use CORRCOEF([X,Y]) if your software should be compatible with both.

[...] = CORRCOEF(..., Mode);
      Mode='Pearson' or 'parametric' [default]
              gives the correlation coefficient, also known as the
              'product-moment coefficient of correlation' or
              'Pearson's correlation' [1].
              Currently, the numerically unstable one-pass (single-pass)
              method [7,8] is implemented. If this is a problem, use the
              two-pass method instead by calling
                      corrcoef(center(X),center(Y))
      Mode='Spearman'
              gives 'Spearman's Rank Correlation Coefficient'.
              This replaces SPEARMAN.M
      Mode='Rank'
              gives a nonparametric rank correlation coefficient.
              This is the "Spearman rank correlation with proper handling of ties".
              This replaces RANKCORR.M

[...] = CORRCOEF(..., param1, value1, param2, value2, ... );
      param       value
      'Mode'      type of correlation
                      'Pearson','parametric'
                      'Spearman'
                      'rank'
      'rows'      how to deal with missing values encoded as NaNs
                      'complete': remove all rows with at least one NaN
                      'pairwise': [default]
      'alpha'     0.01 : significance level used to compute the confidence interval

[R,p,ci1,ci2,nan_sig] = CORRCOEF(...);
  R       is the correlation matrix;
          R(i,j) is the correlation coefficient r between X(:,i) and Y(:,j)
  p       gives the significance of R.
          It tests the null hypothesis that the product-moment correlation
          coefficient is zero, using Student's t-test on the statistic
                  t = r*sqrt(N-2)/sqrt(1-r^2)
          where N is the number of samples (Statistics, M. Spiegel, Schaum series).
          p > alpha: do not reject the null hypothesis 'R is zero'.
          p < alpha: reject the null hypothesis at significance level alpha
                  in favor of the alternative hypothesis 'R is different from zero'.
  ci1     lower bound of the (1-alpha) confidence interval
  ci2     upper bound of the (1-alpha) confidence interval
          If no alpha is provided, the default alpha is 0.01. This can be
          changed with the function flag_implicit_significance.
  nan_sig p-value for the null hypothesis H0: 'NaNs are not correlated'.
          If nan_sig < alpha, H0 is rejected in favor of
          H1 ('NaNs are correlated').

The result is only valid if the occurrence of NaNs is uncorrelated with
the data. To avoid this pitfall, check the correlation of the NaNs or
apply case-wise deletion.

Case-wise deletion can be implemented with
      ix = ~any(isnan([X,Y]),2);
      [...] = CORRCOEF(X(ix,:),Y(ix,:),...);

Correlation (non-random distribution) of NaNs can be checked with
      [nan_R,nan_sig]=corrcoef(X,isnan(X))
  or  [nan_R,nan_sig]=corrcoef([X,Y],isnan([X,Y]))
  or  [R,p,ci1,ci2,nan_sig] = CORRCOEF(...);

Further recommendations related to the correlation coefficient:
+ LOOK AT THE SCATTERPLOTS to make sure that the relationship is linear.
+ Correlation is not causation: it is not clear which parameter is 'cause'
  and which is 'effect', and the observed correlation between two variables
  might be due to the action of other, unobserved variables.
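
Example (illustrative sketch, not the code used inside CORRCOEF):
  The case-wise deletion and the statistic t = r*sqrt(N-2)/sqrt(1-r^2)
  described above can be reproduced by hand with core functions only.
  The toy data X and Y are hypothetical. The two-sided p-value is obtained
  here from the regularized incomplete beta function (betainc), which is
  an assumption of this sketch and not necessarily how CORRCOEF computes p.
  With only two columns, case-wise and pairwise deletion select the same rows.

```octave
% Hypothetical toy data; NaN marks a missing value.
X = [1; 2; 3; 4; 5; NaN; 7; 8];
Y = [2.1; 3.9; NaN; 8.2; 9.8; 12.1; 13.9; 16.2];

% Case-wise deletion: keep only the rows without any NaN.
ix = ~any(isnan([X, Y]), 2);
x = X(ix);
y = Y(ix);
N = sum(ix);                      % number of complete samples

% Two-pass Pearson correlation: centre first, then accumulate products.
xc = x - mean(x);
yc = y - mean(y);
r = sum(xc .* yc) / sqrt(sum(xc .^ 2) * sum(yc .^ 2));

% Test statistic from the help text, with N-2 degrees of freedom.
df = N - 2;
t = r * sqrt(df) / sqrt(1 - r ^ 2);

% Two-sided p-value of Student's t distribution, written with the
% regularized incomplete beta function so that no toolbox is needed.
p = betainc(df / (df + t ^ 2), df / 2, 0.5);

fprintf('N = %d, r = %.4f, t = %.2f, p = %.2g\n', N, r, t, p);
```

  If p is smaller than the chosen alpha (default 0.01), the null hypothesis
  'R is zero' is rejected for these rows.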

see also: SUMSKIPNAN, COVM, COV, COR, SPEARMAN, RANKCORR, RANKS,
      PARTCORRCOEF, flag_implicit_significance

REFERENCES:
on the correlation coefficient
[ 1] http://mathworld.wolfram.com/CorrelationCoefficient.html
[ 2] http://www.geography.btinternet.co.uk/spearman.htm
[ 3] Hogg, R. V. and Craig, A. T. Introduction to Mathematical Statistics,
     5th ed. New York: Macmillan, pp. 338 and 400, 1995.
[ 4] Lehmann, E. L. and D'Abrera, H. J. M. Nonparametrics: Statistical
     Methods Based on Ranks, rev. ed. Englewood Cliffs, NJ: Prentice-Hall,
     pp. 292, 300, and 323, 1998.
[ 5] Press, W. H.; Flannery, B. P.; Teukolsky, S. A.; and Vetterling, W. T.
     Numerical Recipes in FORTRAN: The Art of Scientific Computing, 2nd ed.
     Cambridge, England: Cambridge University Press, pp. 634-637, 1992.
[ 6] http://mathworld.wolfram.com/SpearmanRankCorrelationCoefficient.html
[ 7] https://stats.stackexchange.com/questions/94056/instability-of-one-pass-algorithm-for-correlation-coefficient
[ 8] https://en.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient#For_a_sample
on the significance test of the correlation coefficient
[11] http://www.met.rdg.ac.uk/cag/STATS/corr.html
[12] http://www.janda.org/c10/Lectures/topic06/L24-significanceR.htm
[13] http://faculty.vassar.edu/lowry/ch4apx.html
[14] http://davidmlane.com/hyperstat/B134689.html
[15] http://www.statsoft.com/textbook/stbasic.html%Correlations
others
[20] http://www.tufts.edu/~gdallal/corr.htm
[21] Fisher transformation http://en.wikipedia.org/wiki/Fisher_transformation
Package: nan