
Spearman’s Rank Correlation Coefficient:

Karl Pearson’s coefficient of correlation is calculated from the actual
observations, on the assumption that the population is normally
distributed. When the population is not normal, or the shape of its
distribution is not known, the coefficient of correlation is calculated
not from the actual observations but from the ranks of the two variables,
assigned in either ascending or descending order. This method, developed
by Charles Edward Spearman, gives the rank correlation coefficient

R = 1 - 6∑D² ÷ [N(N² - 1)]

where
R denotes the rank correlation coefficient,
D denotes the difference in ranks between paired items of the two series,
N denotes the number of pairs of observations.

Just as Karl Pearson’s coefficient of correlation lies between +1 & -1,
Spearman’s rank correlation coefficient R also lies between +1 & -1.

Rank correlation coefficient when ranks are given:

Steps:
1) Find the difference in ranks of the N paired items: D = R1 - R2.
2) Calculate the squares of the rank differences & add them to get ∑D².
3) Use the formula R = 1 - 6∑D² ÷ [N(N² - 1)] to get the value of the
   rank correlation coefficient.

Example 22: The rankings of 10 students in two subjects A & B are as
follows:

A:   6   5   3  10   2   4   9   7   8   1
B:   3   8   4   9   1   6  10   7   5   2

Calculate the rank correlation coefficient.

Solution:

Rank R1 (A)   Rank R2 (B)   D = R1 - R2   D²
     6             3              3        9
     5             8             -3        9
     3             4             -1        1
    10             9              1        1
     2             1              1        1
     4             6             -2        4
     9            10             -1        1
     7             7              0        0
     8             5              3        9
     1             2             -1        1

∑D² = 36

R = 1 - 6∑D² ÷ [N(N² - 1)]
  = 1 - 6×36 ÷ [10(10² - 1)]
  = 1 - 216 ÷ 990
  = 1 - 0.2182
  = 0.7818
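
The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `spearman_from_ranks` is an assumption made here, not something from the text, and it assumes that both series already hold untied ranks. The data are the ranks of Example 22.

```python
def spearman_from_ranks(ranks_a, ranks_b):
    """Spearman's R for two equal-length lists of untied ranks."""
    if len(ranks_a) != len(ranks_b):
        raise ValueError("both series must have the same number of items")
    n = len(ranks_a)
    # Steps 1 and 2: rank differences D = R1 - R2, squared and summed.
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    # Step 3: R = 1 - 6*sum(D^2) / (N*(N^2 - 1)).
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Ranks of the 10 students in subjects A and B (Example 22).
a = [6, 5, 3, 10, 2, 4, 9, 7, 8, 1]
b = [3, 8, 4, 9, 1, 6, 10, 7, 5, 2]
print(round(spearman_from_ranks(a, b), 4))  # 0.7818
```

Identical rankings give R = 1 and exactly reversed rankings give R = -1, which matches the stated range of the coefficient.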

Example 23: Two judges in a beauty competition rank the 12 entries as
follows:

X:   1   2   3   4   5   6   7   8   9  10  11  12
Y:  12   9   6  10   3   5   4   7   8   2  11   1

What degree of agreement is there between the two judges?

Solution:

Rank R1 (X)   Rank R2 (Y)   D = R1 - R2   D²
     1            12            -11       121
     2             9             -7        49
     3             6             -3         9
     4            10             -6        36
     5             3              2         4
     6             5              1         1
     7             4              3         9
     8             7              1         1
     9             8              1         1
    10             2              8        64
    11            11              0         0
    12             1             11       121

∑D² = 416

R = 1 - 6∑D² ÷ [N(N² - 1)]
  = 1 - 6×416 ÷ [12(12² - 1)]
  = 1 - 2496 ÷ (12×143)
  = 1 - 1.4545
  = -0.4545

The degree of agreement between the two judges is measured by the rank
correlation coefficient, which is negative in this case, indicating
disagreement.

Example 24: The ranks of the same 15 students in two subjects A & B are
given below. The two numbers within brackets denote the ranks of the same
student in A & B respectively.

(1,10), (2,7), (3,2), (4,6), (5,4), (6,8), (7,3), (8,1), (9,11),
(10,15), (11,9), (12,5), (13,14), (14,12), (15,13).

Find Spearman’s rank correlation coefficient.

Solution:

R1 (A)   R2 (B)   D = R1 - R2   D²
   1       10         -9        81
   2        7         -5        25
   3        2          1         1
   4        6         -2         4
   5        4          1         1
   6        8         -2         4
   7        3          4        16
   8        1          7        49
   9       11         -2         4
  10       15         -5        25
  11        9          2         4
  12        5          7        49
  13       14         -1         1
  14       12          2         4
  15       13          2         4

∑D² = 272

Spearman’s correlation coefficient is given by

R = 1 - 6∑D² ÷ [N(N² - 1)]
  = 1 - 6×272 ÷ [15(15² - 1)]
  = 1 - 1632 ÷ (15×224)
  = 1 - 0.4857
  = 0.5143

Example 25: Ten competitors in a beauty contest are ranked by three judges
in the following order:

1st judge:   1   6   5  10   3   2   4   9   7   8
2nd judge:   3   5   8   4   7  10   2   1   6   9
3rd judge:   6   4   9   8   1   2   3  10   5   7

Use the rank correlation coefficient to determine which pair of judges has
the nearest approach to common taste in beauty.

Solution: If R1, R2 & R3 are the respective rankings of the three judges,
there are three pairs of judgments, namely R1R2, R1R3 & R2R3. Let
D1 = R1 - R2, D2 = R1 - R3 & D3 = R2 - R3.

R1   R2   R3    D1    D2    D3   D1²   D2²   D3²
 1    3    6    -2    -5    -3     4    25     9
 6    5    4     1     2     1     1     4     1
 5    8    9    -3    -4    -1     9    16     1
10    4    8     6     2    -4    36     4    16
 3    7    1    -4     2     6    16     4    36
 2   10    2    -8     0     8    64     0    64
 4    2    3     2     1    -1     4     1     1
 9    1   10     8    -1    -9    64     1    81
 7    6    5     1     2     1     1     4     1
 8    9    7    -1     1     2     1     1     4

∑D1² = 200, ∑D2² = 60, ∑D3² = 214

The rank correlation coefficient between the judgments of the 1st & 2nd
judges is given by

R12 = 1 - 6∑D1² ÷ [N(N² - 1)]
    = 1 - 6×200 ÷ [10(10² - 1)]
    = 1 - 1200 ÷ 990
    = 1 - 1.2121
    = -0.2121

The rank correlation coefficient between the judgments of the 1st & 3rd
judges is given by

R13 = 1 - 6∑D2² ÷ [N(N² - 1)]
    = 1 - 6×60 ÷ [10(10² - 1)]
    = 1 - 360 ÷ 990
    = 1 - 0.3636
    = 0.6364

The rank correlation coefficient between the judgments of the 2nd & 3rd
judges is given by

R23 = 1 - 6∑D3² ÷ [N(N² - 1)]
    = 1 - 6×214 ÷ [10(10² - 1)]
    = 1 - 1284 ÷ 990
    = 1 - 1.2970
    = -0.2970

Of R12, R13 & R23, only R13 is positive. Hence the first & third judges
have the nearest approach to common taste in beauty.
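
The pairwise comparison in Example 25 can be checked with a short Python sketch. The helper `spearman_from_ranks` and the dictionary layout are assumptions made for illustration; the rankings are those of the three judges given in the example, and the pair with the largest coefficient is taken as the closest in taste.

```python
from itertools import combinations


def spearman_from_ranks(r1, r2):
    """R = 1 - 6*sum(D^2) / (N*(N^2 - 1)) for untied ranks."""
    n = len(r1)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Rankings of the ten competitors by each judge (Example 25).
judges = {
    "1st": [1, 6, 5, 10, 3, 2, 4, 9, 7, 8],
    "2nd": [3, 5, 8, 4, 7, 10, 2, 1, 6, 9],
    "3rd": [6, 4, 9, 8, 1, 2, 3, 10, 5, 7],
}

# One coefficient per pair of judges.
results = {
    (p, q): spearman_from_ranks(judges[p], judges[q])
    for p, q in combinations(judges, 2)
}
closest_pair = max(results, key=results.get)

for pair, r in results.items():
    print(pair, round(r, 4))
print("closest pair:", closest_pair)  # ('1st', '3rd')
```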
Example 26: If the sum of squares of the rank differences of 9 pairs of
values is 80, find the correlation coefficient between them.

Solution:
∑D² = 80, N = 9
Rank correlation coefficient R = 1 - 6∑D² ÷ [N(N² - 1)]
= 1 - 6×80 ÷ [9(9² - 1)]
= 1 - 6×80 ÷ (9×80)
= 1 - 6 ÷ 9
= 1/3
= 0.333

Example 27: In bivariate data of N pairs of observations, the sum of
squares of the differences between the ranks of the observed values of the
two variables is 231 & the rank correlation coefficient is -0.4. Find the
value of N.

Solution:
∑D² = 231, R = -0.4
R = 1 - 6∑D² ÷ [N(N² - 1)]
or
-0.4 = 1 - 6×231 ÷ [N(N² - 1)]
or
6×231 ÷ [N(N² - 1)] = 1 + 0.4 = 1.4
or
N(N² - 1) = 6×231 ÷ 1.4
= 990
= 10×99
= 10(100 - 1)
= 10(10² - 1)
Therefore N = 10
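
Working backwards from R and ∑D² to N, as in this example, amounts to finding the whole number N for which N(N² - 1) matches 6∑D² ÷ (1 - R). A minimal Python sketch follows. The function name `pairs_from_r` and the search limit `max_n` are assumptions made for illustration. It picks the nearest match rather than an exact one, because R is often quoted rounded.

```python
def pairs_from_r(sum_d2, r, max_n=1000):
    """Return the integer N whose N*(N^2 - 1) is closest to 6*sum_d2/(1 - r)."""
    if r >= 1:
        raise ValueError("r must be below 1 for the formula to be inverted")
    target = 6 * sum_d2 / (1 - r)
    # N*(N^2 - 1) grows steadily with N, so the nearest value identifies N.
    return min(range(2, max_n + 1), key=lambda n: abs(n * (n * n - 1) - target))


print(pairs_from_r(231, -0.4))  # 10
```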

Rank correlation coefficient when ranks are not given:

Steps:
1. Assign ranks to all the items in one series (X) & separately to all the
items in the other series (Y). Ranks can start from either the highest or
the lowest value, but the same criterion is to be followed for both
variables.
2. Find the difference in ranks of the N paired items: D = R1 - R2.
3. Calculate the squares of the rank differences & add them to get ∑D².
4. Use the formula R = 1 - 6∑D² ÷ [N(N² - 1)] to get the value of the rank
correlation coefficient.

Example 28: The coefficient of rank correlation of the marks obtained by
10 students in statistics and accountancy was found to be 0.2. It was
later discovered that the difference in ranks in the two subjects obtained
by one of the students was wrongly taken as 9 instead of 7. Find the
correct coefficient of rank correlation.

Solution:

Let Rc & Rw be the correct & wrong coefficients of rank correlation
respectively, & Dc & Dw be the correct & wrong differences respectively.

R = 1 - 6∑D² ÷ [N(N² - 1)]
so
Rw = 1 - 6∑Dw² ÷ [N(N² - 1)]
or
0.2 = 1 - 6∑Dw² ÷ [10(10² - 1)]
    = 1 - 6∑Dw² ÷ (10×99)
or
6∑Dw² ÷ (10×99) = 1 - 0.2 = 0.8

∑Dw² = 0.8×10×99 ÷ 6
     = 132

Now
∑Dc² = ∑Dw² - (wrong rank difference)² + (correct rank difference)²
     = 132 - 9² + 7²
     = 132 - 81 + 49
     = 100

So
Rc = 1 - 6∑Dc² ÷ [N(N² - 1)]
   = 1 - 6×100 ÷ [10(10² - 1)]
   = 1 - 600 ÷ (10×99)
   = 1 - 0.606
   = 0.394
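
The correction in Example 28 follows a fixed pattern: recover the wrong ∑D² from the wrong coefficient, swap the squared wrong difference for the squared correct one, and recompute. A small Python sketch of that pattern is given below. The function name `corrected_r` and its parameter names are assumptions made for illustration.

```python
def corrected_r(r_wrong, n, d_wrong, d_correct):
    """Recompute Spearman's R after one rank difference was misrecorded."""
    denom = n * (n ** 2 - 1)
    # Invert R = 1 - 6*sum(D^2)/denom to recover the wrong sum of squares.
    sum_d2_wrong = (1 - r_wrong) * denom / 6
    # Remove the wrong squared difference and add the correct one.
    sum_d2_correct = sum_d2_wrong - d_wrong ** 2 + d_correct ** 2
    return 1 - 6 * sum_d2_correct / denom


# Example 28: R = 0.2 for N = 10, difference taken as 9 instead of 7.
print(round(corrected_r(0.2, 10, 9, 7), 3))  # 0.394
```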

Example 29: A test in statistics was taken by 7 students. The teacher
ranked his students according to their academic achievements. The order of
achievement from high to low, together with the family income of each
pupil, is given as follows:

Rai (Rs 8700), Bhatnagar (Rs 4200), Tuli (Rs 5700), Desai (Rs 8200),
Gupta (Rs 20000), Choudhary (Rs 18000) & Singh (Rs 17500)

Compute Spearman’s coefficient of rank correlation between academic
achievement & family income.

Solution:

The 7 students, whose academic achievement & family income are to be
correlated, have been ranked from high to low in academic achievement as
Rai, Bhatnagar, Tuli, Desai, Gupta, Choudhary, Singh.

Their ranking from high to low as per family income will be
Gupta, Choudhary, Singh, Rai, Desai, Tuli, Bhatnagar.

Name of      Rank as per      Rank as per family   D = R1 - R2   D²
student      academics (R1)   income (R2)
Rai               1                  4                 -3         9
Bhatnagar         2                  7                 -5        25
Tuli              3                  6                 -3         9
Desai             4                  5                 -1         1
Gupta             5                  1                  4        16
Choudhary         6                  2                  4        16
Singh             7                  3                  4        16

∑D² = 92

Spearman’s coefficient of rank correlation is

R = 1 - 6∑D² ÷ [N(N² - 1)]
  = 1 - 6×92 ÷ [7(7² - 1)]
  = 1 - 552 ÷ (7×48)
  = 1 - 1.6429
  = -0.6429

Example 30: Quotations of index numbers of security prices of a certain
joint stock company are given below:

Year   Debenture price   Share price
 1          97.8             73.2
 2          99.2             85.8
 3          98.8             78.9
 4          98.3             75.8
 5          98.4             77.2
 6          96.7             87.2
 7          97.1             83.8

Using the rank correlation method, determine the relationship between
debenture prices & share prices.

Solution:

Debenture & share price data are given for 7 years, so N = 7.

Ranking from highest to lowest for both debenture & share prices &
tabulating:

Debenture   Debenture price   Share   Share price   D = R1 - R2   D²
price       rank (R1)         price   rank (R2)
  97.8           5            73.2        7             -2         4
  99.2           1            85.8        2             -1         1
  98.8           2            78.9        4             -2         4
  98.3           4            75.8        6             -2         4
  98.4           3            77.2        5             -2         4
  96.7           7            87.2        1              6        36
  97.1           6            83.8        3              3         9

∑D² = 62

The coefficient of rank correlation is

R = 1 - 6∑D² ÷ [N(N² - 1)]
  = 1 - 6×62 ÷ [7(7² - 1)]
  = 1 - 372 ÷ (7×48)
  = 1 - 1.1071
  = -0.1071

The coefficient is negative & small, indicating a weak negative
relationship between debenture prices & share prices.

Example 31: Calculate Spearman’s coefficient of correlation between the
marks assigned to 10 students by judges X & Y in a certain competitive
test, as shown below:

No.:               1   2   3   4   5   6   7   8   9  10
Marks by judge X: 52  53  42  60  45  41  37  38  25  27
Marks by judge Y: 65  68  43  38  77  48  35  30  25  50

Solution:

There are 10 students marked by judges X & Y, so N = 10.

Ranking the students by the marks given by both judges from lowest to
highest & tabulating:

Marks by   Rank by        Marks by   Rank by        D = R1 - R2   D²
judge X    judge X (R1)   judge Y    judge Y (R2)
   52           8            65           8               0        0
   53           9            68           9               0        0
   42           6            43           5               1        1
   60          10            38           4               6       36
   45           7            77          10              -3        9
   41           5            48           6              -1        1
   37           3            35           3               0        0
   38           4            30           2               2        4
   25           1            25           1               0        0
   27           2            50           7              -5       25

∑D² = 76

Spearman’s coefficient of correlation

R = 1 - 6∑D² ÷ [N(N² - 1)]
  = 1 - 6×76 ÷ [10(10² - 1)]
  = 1 - 456 ÷ 990
  = 1 - 0.4606
  = 0.5394
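
When ranks are not given, the only extra work is assigning them. The sketch below does this in Python for series without repeated values, ranking from lowest to highest as in Example 31. The function names `ranks_ascending` and `spearman_from_values` are assumptions made for illustration, and tied values are deliberately rejected because they need average ranks.

```python
def ranks_ascending(values):
    """Rank distinct values from lowest (rank 1) to highest (rank N)."""
    if len(set(values)) != len(values):
        raise ValueError("repeated values need average ranks")
    position = {v: i + 1 for i, v in enumerate(sorted(values))}
    return [position[v] for v in values]


def spearman_from_values(x, y):
    """Assign ranks to both series, then apply R = 1 - 6*sum(D^2)/(N*(N^2-1))."""
    rx, ry = ranks_ascending(x), ranks_ascending(y)
    n = len(x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Marks given by judges X and Y (Example 31).
x = [52, 53, 42, 60, 45, 41, 37, 38, 25, 27]
y = [65, 68, 43, 38, 77, 48, 35, 30, 25, 50]
print(ranks_ascending(x))                    # [8, 9, 6, 10, 7, 5, 3, 4, 1, 2]
print(round(spearman_from_values(x, y), 4))  # 0.5394
```

Ranking both series from highest to lowest instead would change every rank but leave each difference D, and therefore R, the same in magnitude and sign.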

Rank correlation coefficient when ranks are equal:

This is a special case of finding the rank correlation coefficient when
ranks are not given and, at the same time, two or more items in a series
have equal values, so that their ranks are repeated.

The steps for calculating the rank correlation coefficient are the same as
in the previous case (ranks not given); however, the ranks are assigned in
the following manner.

For a pair of repeated items in a series, if one were assigned rank r then
the other would get rank (r+1), on the assumption that it has a marginally
higher value (ranking being on the basis of increasing values of the
items). In reality the two values are equal, hence both repeated items are
assigned the average of the two ranks, [r + (r+1)] ÷ 2. The next rank, for
the following non-repeated item in the series, will be r+2. For an item
repeated n times in the series, the average rank will be
[r + (r+1) + ... + (r+n-1)] ÷ n & the subsequent rank for a non-repeated
item in the series will be (r+n).

The rank correlation coefficient in this case is given by the formula

R = 1 - 6[∑D² + (1/12)(m1³ - m1) + (1/12)(m2³ - m2) + (1/12)(m3³ - m3) + ...] ÷ [N(N² - 1)]

where D is the rank difference of the N paired items in the two series
X & Y, and m1, m2, m3, ... are the numbers of times each repeated item
occurs, one term for every repeated value in either series.

Example 32: The relationship between height and weight of a batch of 10
students is given in the following table:

Height (inches):  48   49   50   51   52   53   54   55   56   57
Weight (lbs):    100  105  105  104  111  115  125  130  132  137

Calculate the rank correlation coefficient.

Solution:

There are 10 pairs of observations, so N = 10, & m1 = 2 as 105 lbs figures
twice in the weight series.

There is no repetition in the data for height; however, the weight 105 lbs
is repeated. Arranging the weights in ascending order: 100, 104, 105, 105,
111, 115, 125, 130, 132, 137.

The ranks of 100 & 104 are 1 & 2. The rank of the repeated weight 105 lbs
will be (3+4) ÷ 2 = 3.5. Tabulating the data:

X (height)   Rank (R1)   Y (weight)   Rank (R2)   D = R1 - R2   D²
    48           1          100          1             0        0
    49           2          105          3.5          -1.5      2.25
    50           3          105          3.5          -0.5      0.25
    51           4          104          2             2        4
    52           5          111          5             0        0
    53           6          115          6             0        0
    54           7          125          7             0        0
    55           8          130          8             0        0
    56           9          132          9             0        0
    57          10          137         10             0        0

∑D² = 6.5

Rank correlation coefficient

R = 1 - 6[∑D² + (1/12)(m1³ - m1)] ÷ [N(N² - 1)]
  = 1 - 6[6.5 + (1/12)(2³ - 2)] ÷ [10(10² - 1)]
  = 1 - 6(6.5 + 0.5) ÷ (10×99)
  = 1 - 6×7 ÷ 990
  = 1 - 0.0424
  = 0.9576

Example 33: Calculate the rank correlation coefficient of the following
data:

Marks in 1st subject: 40  46  54  60  70  80  82  85  85  90  95
Marks in 2nd subject: 45  45  50  43  40  75  55  72  65  42  70

Solution:

There are 11 pairs of observations, so N = 11. Marks in both subjects have
been repeated: in the 1st subject 85 is repeated & in the 2nd subject 45
is repeated, so m1 = 2 & m2 = 2.

Arranging the marks in ascending order:

1st subject: 40  46  54  60  70  80  82  85  85  90  95
85 lies in 8th & 9th places, so average rank = (8+9) ÷ 2 = 8.5

2nd subject: 40  42  43  45  45  50  55  65  70  72  75
45 lies in 4th & 5th places, so average rank = (4+5) ÷ 2 = 4.5

1st subject (X)   Rank (R1)   2nd subject (Y)   Rank (R2)   D = R1 - R2   D²
      40             1              45             4.5         -3.5      12.25
      46             2              45             4.5         -2.5       6.25
      54             3              50             6           -3         9
      60             4              43             3            1         1
      70             5              40             1            4        16
      80             6              75            11           -5        25
      82             7              55             7            0         0
      85             8.5            72            10           -1.5       2.25
      85             8.5            65             8            0.5       0.25
      90            10              42             2            8        64
      95            11              70             9            2         4

∑D² = 140

Rank correlation coefficient

R = 1 - 6[∑D² + (1/12)(m1³ - m1) + (1/12)(m2³ - m2)] ÷ [N(N² - 1)]
  = 1 - 6[140 + (1/12)(2³ - 2) + (1/12)(2³ - 2)] ÷ [11(11² - 1)]
  = 1 - 6(140 + 1/2 + 1/2) ÷ (11×120)
  = 1 - 6×141 ÷ 1320
  = 1 - 0.6409
  = 0.3591

Example 34: Obtain the rank correlation coefficient between the variables
X & Y from the following pairs of observed values.

X:  50   55   65   50   55   60   50   65   70   75
Y: 110  110  115  125  140  115  130  120  115  160

Solution:

There are 10 pairs of observations, so N = 10.

In the X series 50 figures 3 times, 55 figures twice & 65 figures twice,
so m1 = 3, m2 = 2 & m3 = 2.

In the Y series 115 figures thrice & 110 figures twice, so m4 = 3 & m5 = 2.

Arranging in ascending order:

X series: 50  50  50  55  55  60  65  65  70  75
50 lies in first, second & third places, so average rank = (1+2+3) ÷ 3 = 2
55 lies in fourth & fifth places, so average rank = (4+5) ÷ 2 = 4.5
65 lies in seventh & eighth places, so average rank = (7+8) ÷ 2 = 7.5

Y series: 110  110  115  115  115  120  125  130  140  160
110 lies in first & second places, so average rank = (1+2) ÷ 2 = 1.5
115 lies in third, fourth & fifth places, so average rank = (3+4+5) ÷ 3 = 4

 X    Rank (R1)    Y    Rank (R2)   D = R1 - R2   D²
50       2        110      1.5          0.5       0.25
55       4.5      110      1.5          3         9
65       7.5      115      4            3.5      12.25
50       2        125      7           -5        25
55       4.5      140      9           -4.5      20.25
60       6        115      4            2         4
50       2        130      8           -6        36
65       7.5      120      6            1.5       2.25
70       9        115      4            5        25
75      10        160     10            0         0

∑D² = 134

Rank correlation coefficient

R = 1 - 6[∑D² + (1/12)(m1³ - m1) + (1/12)(m2³ - m2) + ... + (1/12)(m5³ - m5)] ÷ [N(N² - 1)]
  = 1 - 6[134 + (1/12)(3³ - 3) + (1/12)(2³ - 2) + (1/12)(2³ - 2) + (1/12)(3³ - 3) + (1/12)(2³ - 2)] ÷ [10(10² - 1)]
  = 1 - 6(134 + 2 + 1/2 + 1/2 + 2 + 1/2) ÷ (10×99)
  = 1 - 6×139.5 ÷ 990
  = 1 - 0.8455
  = 0.1545
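
The tied-rank procedure used in Examples 32 to 34 can be sketched in Python as below. The function names `average_ranks`, `tie_correction` and `spearman_with_ties` are assumptions made for illustration. The sketch follows the correction formula exactly as stated in this section; statistical libraries may handle ties by a different route, so their results need not agree to every decimal place.

```python
from collections import Counter


def average_ranks(values):
    """Ascending ranks; repeated values share the mean of the places they fill."""
    places = {}
    for place, v in enumerate(sorted(values), start=1):
        places.setdefault(v, []).append(place)
    return [sum(places[v]) / len(places[v]) for v in values]


def tie_correction(values):
    """Sum of (m^3 - m)/12 over every value that occurs m > 1 times."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values() if m > 1)


def spearman_with_ties(x, y):
    """R = 1 - 6*[sum(D^2) + tie corrections] / (N*(N^2 - 1))."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    adjusted = sum_d2 + tie_correction(x) + tie_correction(y)
    return 1 - 6 * adjusted / (n * (n ** 2 - 1))


# Observed values of Example 34.
x = [50, 55, 65, 50, 55, 60, 50, 65, 70, 75]
y = [110, 110, 115, 125, 140, 115, 130, 120, 115, 160]
print(average_ranks(x))                    # [2.0, 4.5, 7.5, 2.0, 4.5, 6.0, 2.0, 7.5, 9.0, 10.0]
print(round(spearman_with_ties(x, y), 4))  # 0.1545
```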

Example 35: Calculate the coefficient of correlation from the following
data by the method of rank differences.

Rank of X: 10  4  2  5  8  5  6  9
Rank of Y: 10  6  2  5  8  4  5  9

Solution:

N = 8 as there are only 8 pairs of observations, yet the data mention
ranks 9 & 10, which is not possible with only 8 items.

If the rank correlation coefficient is to be calculated, the ranks in the
data are to be treated as observations & not as ranks. Ranks are then to
be assigned to these values. The values in ascending order will be:

X series: 2  4  5  5  6  8  9  10  &  Y series: 2  4  5  5  6  8  9  10

Both the X & Y series have the repeated item 5, so m1 = 2 & m2 = 2. In
both series the repeated items are placed at the third & fourth positions.
Hence for both series their average rank will be (3+4) ÷ 2 = 3.5.

 X    Rank (R1)    Y    Rank (R2)   D = R1 - R2   D²
10       8        10       8            0         0
 4       2         6       5           -3         9
 2       1         2       1            0         0
 5       3.5       5       3.5          0         0
 8       6         8       6            0         0
 5       3.5       4       2            1.5       2.25
 6       5         5       3.5          1.5       2.25
 9       7         9       7            0         0

∑D² = 13.5

Rank correlation coefficient

R = 1 - 6[∑D² + (1/12)(m1³ - m1) + (1/12)(m2³ - m2)] ÷ [N(N² - 1)]
  = 1 - 6[13.5 + (1/12)(2³ - 2) + (1/12)(2³ - 2)] ÷ [8(8² - 1)]
  = 1 - 6(13.5 + 0.5 + 0.5) ÷ (8×63)
  = 1 - 6×14.5 ÷ 504
  = 1 - 0.1726
  = 0.8274

Example 36: The coefficient of rank correlation between the debenture
prices & share prices of a company is found to be 0.143. If the sum of the
squares of the differences in ranks is 48, find the value of N.

Solution:

∑D² = 48 & R = 0.143

The rank correlation coefficient R = 1 - 6∑D² ÷ [N(N² - 1)]

Substituting the values from the data:

0.143 = 1 - 6×48 ÷ [N(N² - 1)]
or
6×48 ÷ [N(N² - 1)] = 1 - 0.143 = 0.857

N(N² - 1) = 6×48 ÷ 0.857
          = 336 (approximately)
          = 7×48
          = 7(49 - 1)
          = 7(7² - 1)

Therefore N = 7

Concurrent deviation method:

This is the easiest of all the methods of studying correlation. The basis
of this method is to study the direction of change, in other words, to
find whether each value of the variables X and Y has increased or
decreased relative to the value before it. The concurrent deviation, which
is the product of the changes in X and Y, is then observed; only its
positive or negative sign is considered and not the actual magnitude of
the change.

The coefficient of correlation by the concurrent deviation method is given
by

Rc = ±√[±(2c - n) ÷ n]

where c denotes the number of concurrent deviations, that is, the number
of +ve signs obtained as the product of the deviations Dx and Dy (signs
only, not the actual deviation values) in the variables X & Y
respectively, and n is one less than N, the number of pairs of
observations. This is because in both the X and Y series no value precedes
the first value, so its change (deviation) cannot be found.

Calculated by this method, the value of the correlation coefficient, also
termed the coefficient of concurrent deviation, lies between +1 & -1.

Steps:
1. In the X variable, find the deviation or direction of change Dx. The
change for the first value cannot be determined because it has no
predecessor, so it is left blank. Compare the first & second values of the
X series. If the second value is more than the first, mark a +ve sign in
the second place of the Dx column. If the second value is less than the
first, mark a -ve sign, & if both values are equal mark zero. In the same
manner compare the second & third values, and subsequently all the
remaining adjacent values of the X variable, marking the Dx column
accordingly.

2. Give the same treatment to the values of the Y variable & mark the +ve
sign, -ve sign or zero, as the case may be, in the Dy column.

3. Find the product of each Dx & the corresponding Dy marking & record it
in the Dx.Dy column.

4. Count all the +ve signs in the Dx.Dy column to get c.

5. To obtain the value of the coefficient of correlation, substitute the
values of c & n in the formula Rc = ±√[±(2c - n) ÷ n].

The +ve & -ve signs inside and outside the square root sign have a
significance. The square roots of a +ve number are real numbers, which
could be either +ve or -ve but are of the same magnitude. The square root
of a -ve number is not a real number. If 2c - n is -ve then (2c - n) ÷ n
will also be -ve, as n is a +ve number. So, to make the quantity under the
root positive, the negative (2c - n) ÷ n has to be multiplied by -1;
otherwise a real value of the correlation coefficient cannot be obtained.
Once the negative sign inside the square root sign has been used, the
negative sign outside the square root sign must also be used, thereby
establishing -ve correlation. In other words, if 2c - n is -ve the
correlation is -ve; otherwise +ve correlation exists. There is therefore
no ambiguity on account of the +ve & -ve signs.

Example 37: Calculate the coefficient of concurrent deviation from the
following:

X: 60  55  50  56  30  70  40  35  80  80  75
Y: 65  40  35  75  63  80  35  20  80  60  60

Solution:

 X    Dx    Y    Dy   Dx.Dy
60          65
55    -     40    -     +
50    -     35    -     +
56    +     75    +     +
30    -     63    -     +
70    +     80    +     +
40    -     35    -     +
35    -     20    -     +
80    +     80    +     +
80    0     60    -     0
75    -     60    0     0

c = number of +ve signs in the Dx.Dy column = 8

N = 11, n = N - 1 = 10, c = 8

Coefficient of concurrent deviation

Rc = ±√[±(2c - n) ÷ n]
   = ±√[±(2×8 - 10) ÷ 10]
   = +√(6 ÷ 10)
   = 0.7746

Example 38: Calculate the coefficient of concurrent deviation from the
following data:

Price:   368  384  385  361  347  384  395  403  400  385
Imports:  22   21   24   20   22   26   24   29   28   27

Solution:

 X     Dx    Y    Dy   Dx.Dy
368          22
384    +     21    -     -
385    +     24    +     +
361    -     20    -     +
347    -     22    +     -
384    +     26    +     +
395    +     24    -     -
403    +     29    +     +
400    -     28    -     +
385    -     27    -     +

c = number of +ve signs in the Dx.Dy column = 6

N = 10, n = N - 1 = 10 - 1 = 9, c = 6

Coefficient of concurrent deviation

Rc = ±√[±(2c - n) ÷ n]
   = ±√[±(2×6 - 9) ÷ 9]
   = +√(1 ÷ 3)
   = 0.5774

Example 39: Calculate the coefficient of correlation using the method of
concurrent deviation from the following data:

Year:   1998  1999  2000  2001  2002  2003  2004
Supply:  150   154   160   172   160   165   180
Demand:  200   180   170   160   190   180   172

Solution:

Year   Supply X   Dx   Demand Y   Dy   Dx.Dy
1998     150                200
1999     154      +         180    -     -
2000     160      +         170    -     -
2001     172      +         160    -     -
2002     160      -         190    +     -
2003     165      +         180    -     -
2004     180      +         172    -     -

c = number of +ve signs in the Dx.Dy column = 0

N = 7, n = N - 1 = 7 - 1 = 6, c = 0 as there is not a single +ve sign

Coefficient of concurrent deviation

Rc = ±√[±(2c - n) ÷ n]
   = ±√[±(2×0 - 6) ÷ 6]
   = ±√[±(-6 ÷ 6)]
   = ±√[±(-1)]

The square root of -1 is not a real number, so take the -ve sign both
inside & outside the square root sign:

Rc = -√[-(-1)] = -√1 = -1

This indicates perfect -ve correlation between supply & demand.
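
The concurrent deviation steps can be sketched in Python as follows. The function names `sign` and `concurrent_deviation` are assumptions made for illustration. The sign rule is implemented by taking the square root of the magnitude of (2c - n) ÷ n and then attaching the sign of that quantity, which is one way of reading the ± convention described above.

```python
import math


def sign(value):
    """Return +1, -1 or 0 according to the direction of a change."""
    return (value > 0) - (value < 0)


def concurrent_deviation(x, y):
    """Coefficient of concurrent deviation for two equal-length series."""
    # Directions of change between adjacent values (first value has none).
    dx = [sign(b - a) for a, b in zip(x, x[1:])]
    dy = [sign(b - a) for a, b in zip(y, y[1:])]
    n = len(dx)  # one less than the number of pairs N
    # c counts the positions where the product of the signs is positive.
    c = sum(1 for p, q in zip(dx, dy) if p * q > 0)
    ratio = (2 * c - n) / n
    # Root of the magnitude, carrying the sign of (2c - n)/n.
    return math.copysign(math.sqrt(abs(ratio)), ratio)


# Supply and demand series of Example 39.
supply = [150, 154, 160, 172, 160, 165, 180]
demand = [200, 180, 170, 160, 190, 180, 172]
print(concurrent_deviation(supply, demand))  # -1.0
```

Because only directions of change are used, rescaling either series by a positive factor leaves the coefficient unchanged.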
