11/26/2019 Correlation coefficients - MATLAB corrcoef

corrcoef
Correlation coefficients

Syntax

R = corrcoef(A)
R = corrcoef(A,B)
[R,P] = corrcoef( ___ )
[R,P,RL,RU] = corrcoef( ___ )
___ = corrcoef( ___ ,Name,Value)

Description
R = corrcoef(A) returns the matrix of correlation coefficients for A, where the columns of A represent
random variables and the rows represent observations.

R = corrcoef(A,B) returns coefficients between two random variables A and B.

[R,P] = corrcoef( ___ ) returns the matrix of correlation coefficients and the matrix of p-values for
testing the hypothesis that there is no relationship between the observed phenomena (null hypothesis).
Use this syntax with any of the arguments from the previous syntaxes. If an off-diagonal element of P is
smaller than the significance level (default is 0.05), then the corresponding correlation in R is considered
significant. This syntax is invalid if R contains complex elements.

[R,P,RL,RU] = corrcoef( ___ ) includes matrices containing lower and upper bounds for a 95%
confidence interval for each coefficient. This syntax is invalid if R contains complex elements.

___ = corrcoef( ___ ,Name,Value) returns any of the output arguments from the previous syntaxes
with additional options specified by one or more Name,Value pair arguments. For example,
corrcoef(A,'Alpha',0.1) specifies a 90% confidence interval, and
corrcoef(A,'Rows','complete') omits all rows of A containing one or more NaN values.

Examples

Random Columns of Matrix

Compute the correlation coefficients for a matrix with two normally distributed, random columns and one
column that is defined in terms of another. Since the third column of A is a linear function of the second
with positive slope (2*y+3), these two variables are perfectly correlated, so the (2,3) and (3,2) entries
of R are 1.

x = randn(6,1);
y = randn(6,1);
A = [x y 2*y+3];
R = corrcoef(A)

R = 3×3

    1.0000   -0.6237   -0.6237
   -0.6237    1.0000    1.0000
   -0.6237    1.0000    1.0000
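
The value 1 in the (2,3) entry does not depend on the particular constants 2 and 3. Any linear function of y with positive slope gives a correlation of 1, and a negative slope gives -1. The following is a minimal sketch of that property; the variable names up and down are illustrative and not part of the original example.

```matlab
% Correlation is unchanged by shifting and positive scaling, and its sign
% flips when the slope is negative. Names "up" and "down" are illustrative.
y    = randn(6,1);
up   = 2*y + 3;       % positive slope
down = -0.5*y + 10;   % negative slope

Rup   = corrcoef(y, up);
Rdown = corrcoef(y, down);

fprintf('positive slope: %.4f\n', Rup(1,2));
fprintf('negative slope: %.4f\n', Rdown(1,2));
```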

 Two Random Variables

Compute the correlation coefficient matrix between two normally distributed, random vectors of 10
observations each.

A = randn(10,1);
B = randn(10,1);
R = corrcoef(A,B)

R = 2×2

1.0000 0.4518
0.4518 1.0000

 P-Values of Matrix

Compute the correlation coefficients and p-values of a normally distributed, random matrix, with an added
fourth column equal to the sum of the other three columns. Since the last column of A is a linear
combination of the others, a correlation is introduced between the fourth variable and each of the other three
variables. Therefore, the fourth row and fourth column of P contain very small p-values, identifying them as
significant correlations.

A = randn(50,3);
A(:,4) = sum(A,2);
[R,P] = corrcoef(A)

R = 4×4

    1.0000    0.1135    0.0879    0.7314
    0.1135    1.0000   -0.1451    0.5082
    0.0879   -0.1451    1.0000    0.5199
    0.7314    0.5082    0.5199    1.0000

P = 4×4

    1.0000    0.4325    0.5438    0.0000
    0.4325    1.0000    0.3146    0.0002
    0.5438    0.3146    1.0000    0.0001
    0.0000    0.0002    0.0001    1.0000
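
The comparison of P against the significance level can be done programmatically. The sketch below lists each column pair whose p-value falls below a chosen level; the names alpha, isSig, rowIdx, and colIdx are illustrative, and 0.05 is used only because it matches the default mentioned in the description.

```matlab
A = randn(50,3);
A(:,4) = sum(A,2);
[R,P] = corrcoef(A);

alpha = 0.05;                    % significance level used for the comparison
isSig = triu(P < alpha, 1);      % upper triangle only, so each pair appears once
[rowIdx, colIdx] = find(isSig);

for k = 1:numel(rowIdx)
    fprintf('columns %d and %d: r = %.4f, p = %.4g\n', ...
        rowIdx(k), colIdx(k), R(rowIdx(k),colIdx(k)), P(rowIdx(k),colIdx(k)));
end
```

Because the data are random, the set of pairs that is printed varies from run to run.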

 Correlation Bounds

Create a normally distributed, random matrix, with an added fourth column equal to the sum of the other
three columns, and compute the correlation coefficients, p-values, and lower and upper bounds on the
coefficients.

A = randn(50,3);
A(:,4) = sum(A,2);
[R,P,RL,RU] = corrcoef(A)

R = 4×4

    1.0000    0.1135    0.0879    0.7314
    0.1135    1.0000   -0.1451    0.5082
    0.0879   -0.1451    1.0000    0.5199
    0.7314    0.5082    0.5199    1.0000

P = 4×4

    1.0000    0.4325    0.5438    0.0000
    0.4325    1.0000    0.3146    0.0002
    0.5438    0.3146    1.0000    0.0001
    0.0000    0.0002    0.0001    1.0000

RL = 4×4

    1.0000   -0.1702   -0.1952    0.5688
   -0.1702    1.0000   -0.4070    0.2677
   -0.1952   -0.4070    1.0000    0.2825
    0.5688    0.2677    0.2825    1.0000

RU = 4×4

    1.0000    0.3799    0.3575    0.8389
    0.3799    1.0000    0.1388    0.6890
    0.3575    0.1388    1.0000    0.6974
    0.8389    0.6890    0.6974    1.0000

The matrices RL and RU give lower and upper bounds, respectively, on each correlation coefficient according to a
95% confidence interval by default. You can change the confidence level by specifying the value of Alpha, which
defines the percent confidence, 100*(1-Alpha)%. For example, use an Alpha value equal to 0.01 to compute a
99% confidence interval, which is reflected in the bounds RL and RU. The intervals defined by the coefficient
bounds in RL and RU are wider for 99% confidence than for 95%, since higher confidence requires a more
inclusive range of potential correlation values.

[R,P,RL,RU] = corrcoef(A,'Alpha',0.01)

R = 4×4

    1.0000    0.1135    0.0879    0.7314
    0.1135    1.0000   -0.1451    0.5082
    0.0879   -0.1451    1.0000    0.5199
    0.7314    0.5082    0.5199    1.0000

P = 4×4

    1.0000    0.4325    0.5438    0.0000
    0.4325    1.0000    0.3146    0.0002
    0.5438    0.3146    1.0000    0.0001
    0.0000    0.0002    0.0001    1.0000

RL = 4×4

    1.0000   -0.2559   -0.2799    0.5049
   -0.2559    1.0000   -0.4792    0.1825
   -0.2799   -0.4792    1.0000    0.1979
    0.5049    0.1825    0.1979    1.0000

RU = 4×4

    1.0000    0.4540    0.4332    0.8636
    0.4540    1.0000    0.2256    0.7334
    0.4332    0.2256    1.0000    0.7407
    0.8636    0.7334    0.7407    1.0000
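
The bounds printed in this example are numerically consistent with a confidence interval built from the Fisher z-transformation. The sketch below assumes that construction (it is not taken from the corrcoef source) and applies it to the displayed (1,4) coefficient, r = 0.7314 with n = 50 observations. The helper name fisherBounds is hypothetical. Because r is taken from the rounded display, the results agree with the displayed RL(1,4) and RU(1,4) only to about three decimal places.

```matlab
% Assumed construction: transform r with atanh, add and subtract a normal
% quantile times 1/sqrt(n-3), then transform back with tanh.
r = 0.7314;   % displayed R(1,4), rounded to four decimals
n = 50;       % number of observations (rows of A)

% sqrt(2)*erfinv(1-alpha) is the two-sided standard normal critical value.
fisherBounds = @(r, n, alpha) ...
    tanh(atanh(r) + [-1 1]*sqrt(2)*erfinv(1 - alpha)/sqrt(n - 3));

b95 = fisherBounds(r, n, 0.05);   % [lower upper] for 95% confidence
b99 = fisherBounds(r, n, 0.01);   % [lower upper] for 99% confidence

fprintf('95%%: [%.4f, %.4f]\n', b95(1), b95(2));
fprintf('99%%: [%.4f, %.4f]\n', b99(1), b99(2));
```

The 99% interval is wider than the 95% interval, which matches the comparison of the two sets of RL and RU matrices above.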

 NaN Values

Create a normally distributed matrix involving NaN values, and compute the correlation coefficient matrix,
excluding any rows that contain NaN.

A = randn(5,3);
A(1,3) = NaN;
A(3,2) = NaN;
A

A = 5×3

    0.5377   -1.3077       NaN
    1.8339   -0.4336    3.0349
   -2.2588       NaN    0.7254
    0.8622    3.5784   -0.0631
    0.3188    2.7694    0.7147

R = corrcoef(A,'Rows','complete')

R = 3×3

    1.0000   -0.8506    0.8222
   -0.8506    1.0000   -0.9987
    0.8222   -0.9987    1.0000

Use 'all' to include all NaN values in the calculation.

R = corrcoef(A,'Rows','all')

R = 3×3

     1   NaN   NaN
   NaN   NaN   NaN
   NaN   NaN   NaN

Use 'pairwise' to compute each two-column correlation coefficient on a pairwise basis. For each pair of
columns, any row in which either of the two columns contains a NaN is omitted from that pair's calculation.

R = corrcoef(A,'Rows','pairwise')

R = 3×3

    1.0000   -0.3388    0.4649
   -0.3388    1.0000   -0.9987
    0.4649   -0.9987    1.0000
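
The row-selection rules for 'complete' and 'pairwise' can be reproduced by filtering rows by hand before calling corrcoef. This is a sketch of the selection logic only; the names keep, keep12, Rmanual, and R12 are illustrative.

```matlab
A = randn(5,3);
A(1,3) = NaN;
A(3,2) = NaN;

% 'complete': drop every row that has a NaN in any column, then correlate.
keep    = ~any(isnan(A), 2);
Rmanual = corrcoef(A(keep,:));
Rc      = corrcoef(A,'Rows','complete');

% 'pairwise', columns 1 and 2: drop only rows with a NaN in either of those
% two columns, so row 1 (NaN in column 3) is still used for this pair.
keep12 = ~any(isnan(A(:,[1 2])), 2);
R12    = corrcoef(A(keep12,1), A(keep12,2));
Rp     = corrcoef(A,'Rows','pairwise');

fprintf('complete, max difference: %.2g\n', max(abs(Rmanual(:) - Rc(:))));
fprintf('pairwise (1,2) difference: %.2g\n', abs(R12(1,2) - Rp(1,2)));
```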

Input Arguments

A — Input array
matrix

Input array, specified as a matrix.

• If A is a scalar, corrcoef(A) returns NaN.
• If A is a vector, corrcoef(A) returns 1.

Data Types: single | double
Complex Number Support: Yes

B — Additional input array
vector | matrix | multidimensional array

Additional input array, specified as a vector, matrix, or multidimensional array.

• A and B must be the same size.
• If A and B are scalars, then corrcoef(A,B) returns 1. If A and B are equal, however, corrcoef(A,B)
returns NaN.
• If A and B are matrices or multidimensional arrays, then corrcoef(A,B) converts each input into its vector
representation and is equivalent to corrcoef(A(:),B(:)) or corrcoef([A(:) B(:)]).
• If A and B are 0-by-0 empty arrays, corrcoef(A,B) returns a 2-by-2 matrix of NaN values.

Data Types: single | double
Complex Number Support: Yes
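
The vectorization rule for matrix inputs can be checked directly. The sketch below uses arbitrary 3-by-4 inputs (the sizes and variable names are illustrative) and compares the three equivalent calls named in the list above.

```matlab
A = randn(3,4);
B = randn(3,4);

Rab  = corrcoef(A, B);             % matrix inputs are treated as vectors
Rvec = corrcoef(A(:), B(:));       % explicit column vectors
Rcat = corrcoef([A(:) B(:)]);      % single two-column matrix

disp(size(Rab))                    % 2-by-2 regardless of the input shape
fprintf('max difference: %.2g\n', ...
    max([abs(Rab(:) - Rvec(:)); abs(Rab(:) - Rcat(:))]));
```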

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the
corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any
order as Name1,Value1,...,NameN,ValueN.

Example: R = corrcoef(A,'Alpha',0.03)

'Alpha' — Significance level
0.05 (default) | number between 0 and 1

Significance level, specified as a number between 0 and 1. The value of the 'Alpha' parameter defines the
percent confidence level, 100*(1-Alpha)%, for the correlation coefficients, which determines the bounds in RL and
RU.

Data Types: single | double

'Rows' — Use of NaN option
'all' (default) | 'complete' | 'pairwise'

Use of NaN option, specified as one of these values:

• 'all' — Include all NaN values in the input before computing the correlation coefficients.
• 'complete' — Omit any rows of the input containing NaN values before computing the correlation
coefficients. This option always returns a positive semi-definite matrix.
• 'pairwise' — Omit any rows containing NaN only on a pairwise basis for each two-column correlation
coefficient calculation. This option can return a matrix that is not positive semi-definite.

Data Types: char

Output Arguments

R — Correlation coefficients
matrix

Correlation coefficients, returned as a matrix.

• For one matrix input, R has size [size(A,2) size(A,2)] based on the number of random variables
(columns) represented by A. The diagonal entries are set to one by convention, while the off-diagonal entries
are correlation coefficients of variable pairs. The values of the coefficients can range from -1 to 1, with -1
representing a direct, negative correlation, 0 representing no correlation, and 1 representing a direct, positive
correlation. R is symmetric.
• For two input arguments, R is a 2-by-2 matrix with ones along the diagonal and the correlation coefficients
along the off-diagonal.
• If any random variable is constant, its correlation with all other variables is undefined, and the
corresponding row and column entries of R are NaN.
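
The constant-variable case is easy to reproduce. In the sketch below the first column of A is constant (the matrix itself is illustrative), so the entries of R that pair column 1 with the other columns are NaN, while the correlation between the two non-constant columns is unaffected.

```matlab
A = [ones(5,1) randn(5,2)];   % first column is constant
R = corrcoef(A);
disp(R)

% Entries pairing the constant column with the others are undefined.
undefinedRow = isnan(R(1,2:3));
undefinedCol = isnan(R(2:3,1));
```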

P — P-values
matrix

P-values, returned as a matrix. P is symmetric and is the same size as R. The diagonal entries are all ones and the
off-diagonal entries are the p-values for each variable pair. P-values range from 0 to 1, where values close to 0
correspond to a significant correlation in R, that is, a low probability of observing a correlation at least this
strong if the null hypothesis were true.
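
For real inputs, the p-value of a Pearson correlation is conventionally obtained from a t statistic with n-2 degrees of freedom. The sketch below assumes that standard test and compares its result with the P output; whether corrcoef performs exactly this computation internally is an assumption. The two-sided tail probability is evaluated with betainc so that no toolbox function is required, and the names nu, t, and pManual are illustrative.

```matlab
x = randn(50,1);
y = x + randn(50,1);          % correlated with x by construction
[R,P] = corrcoef(x, y);

n  = numel(x);
r  = R(1,2);
nu = n - 2;                   % degrees of freedom
t  = r*sqrt(nu/(1 - r^2));    % t statistic for testing zero correlation

% Two-sided tail probability of Student's t with nu degrees of freedom,
% written with the regularized incomplete beta function.
pManual = betainc(nu/(nu + t^2), nu/2, 0.5);

fprintf('corrcoef p = %.6g, manual p = %.6g\n', P(1,2), pManual);
```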

RL — Lower bound for correlation coefficient
matrix

Lower bound for correlation coefficient, returned as a matrix. RL is symmetric and is the same size as R. The
diagonal entries are all ones and the off-diagonal entries are the lower bound of the confidence interval (95% by
default, or 100*(1-Alpha)% if 'Alpha' is specified) for the corresponding coefficient in R. The syntax returning RL
is invalid if R contains complex values.

RU — Upper bound for correlation coefficient
matrix

Upper bound for correlation coefficient, returned as a matrix. RU is symmetric and is the same size as R. The
diagonal entries are all ones and the off-diagonal entries are the upper bound of the confidence interval (95% by
default, or 100*(1-Alpha)% if 'Alpha' is specified) for the corresponding coefficient in R. The syntax returning RU
is invalid if R contains complex values.

More About

 Correlation Coefficient
The correlation coefficient of two random variables is a measure of their linear dependence. If each variable has N
scalar observations, then the Pearson correlation coefficient is defined as

\[
\rho(A,B) = \frac{1}{N-1} \sum_{i=1}^{N} \overline{\left(\frac{A_i - \mu_A}{\sigma_A}\right)} \left(\frac{B_i - \mu_B}{\sigma_B}\right),
\]

where μ_A and σ_A are the mean and standard deviation of A, respectively, and μ_B and σ_B are the mean and
standard deviation of B. The overline denotes the complex conjugate, which has no effect for real data.
standard deviation of B. Alternatively, you can define the correlation coefficient in terms of the covariance of A and
B:

\[
\rho(A,B) = \frac{\operatorname{cov}(A,B)}{\sigma_A \sigma_B}.
\]

The correlation coefficient matrix of two random variables is the matrix of correlation coefficients for each pairwise
variable combination,

\[
R = \begin{pmatrix} \rho(A,A) & \rho(A,B) \\ \rho(B,A) & \rho(B,B) \end{pmatrix}.
\]

Since A and B are always directly correlated to themselves, the diagonal entries are just 1, that is,

\[
R = \begin{pmatrix} 1 & \rho(A,B) \\ \rho(B,A) & 1 \end{pmatrix}.
\]
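
Both definitions of the coefficient can be evaluated by hand and compared with the off-diagonal entry returned by corrcoef. The following is a sketch for real-valued data, where the complex conjugate in the sum has no effect; the names zA, zB, rhoSum, and rhoCov are illustrative.

```matlab
A = randn(10,1);
B = randn(10,1);
N = numel(A);

% Sum-of-products definition: standardize each variable with its sample
% mean and its (N-1)-normalized standard deviation.
zA = (A - mean(A))/std(A);
zB = (B - mean(B))/std(B);
rhoSum = sum(zA.*zB)/(N - 1);

% Covariance definition: cov(A,B) returns a 2-by-2 covariance matrix.
C = cov(A, B);
rhoCov = C(1,2)/(std(A)*std(B));

R = corrcoef(A, B);
fprintf('sum: %.6f  cov: %.6f  corrcoef: %.6f\n', rhoSum, rhoCov, R(1,2));
```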

Extended Capabilities

Tall Arrays
Calculate with arrays that have more rows than fit in memory.

C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.

GPU Arrays
Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.

Distributed Arrays
Partition large arrays across the combined memory of your cluster using Parallel Computing Toolbox™.

See Also
cov | mean | plotmatrix | std

Introduced before R2006a

https://www.mathworks.com/help/matlab/ref/corrcoef.html