
Joint statistics: The straight forward approach

By Wijit Anusasananan, 2012.


Many people look at statistics as a sophisticated monster rather than as a branch of
mathematics built upon simple quantitative methods. Some problems in real life do not
need a fancy method at all. However, wrong assumptions and theory phobia can lead people
to solve a problem the hard way, or to waste their time looking for a supporting theory.
Some even refuse to follow a method that has no name or has not been published.

Yesterday I gave my students some in-class exercises. One of them was about a dispute
between a police department and a lorry company over whether the total weight of a truck
would exceed the legal limit. Each party showed its statistical data to the court. The
court decided to combine both sets of data and draw its conclusion from the combined set.
Because the class time was too short to explain the joint standard deviation, I asked the
students to follow this how-to on the university e-learning website. The data and
instructions were as follows.

                             Police    Company   Unit
Number of trucks weighed     20        30        Trucks
Average weight               21.4      20.7      Tons
Standard deviation           1.7       1.2       Tons

The legislation states that the total weight of a ten-wheel lorry shall not exceed
21 tons.

What we have to do is find a new set of statistics for hypothesis testing. Since each
truck in a set can be assumed to weigh that set's average, there are at least 2 approaches
to finding a shared average for both sets of data. One approach is to find the total sum
of the weights of all trucks (20 × 21.4 + 30 × 20.7 = 1049 tons) and divide the sum by 50.
The other is to sum the average weights, each weighted by its set's number of trucks as a
proportion of the overall number, i.e., 21.4 × 2/5 + 20.7 × 3/5. Both approaches give the
same figure, 20.98 tons.
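
The two routes to the shared average can be checked numerically. What follows is a
minimal Python sketch using the figures from the table; the variable names are
illustrative only and are not part of the original exercise.

```python
# Summary figures from the table, in the order (police, company).
sizes = [20, 30]        # number of trucks weighed
means = [21.4, 20.7]    # average weight in tons

total_trucks = sum(sizes)

# Approach 1: rebuild the total weight of all trucks, then divide by the count.
total_weight = sum(n * m for n, m in zip(sizes, means))
joint_mean_a = total_weight / total_trucks

# Approach 2: weight each average by its set's share of all trucks.
joint_mean_b = sum(m * n / total_trucks for n, m in zip(sizes, means))

print(round(joint_mean_a, 2), round(joint_mean_b, 2))
```

Both printed values should agree, since the second approach is only the first with the
division by the total count carried inside the sum.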

We cannot use a simple proportional or weighted average for the standard deviations.
First, a standard deviation measures the typical distance of the data from their own
mean, and the new joint mean differs from both of the old means. Second, an SD is a
Euclidean distance (a root of summed squares), so we are reluctant to conclude that a
weighted average of SDs would give a correct answer. We have no individual pieces of
data, but we can go halfway back and recover each set's original sum of squared errors:
its variance times (n - 1). We can also find, for each set, the difference between its
sum of squares taken about the new joint mean and its original sum of squares; this
difference equals the set's size times the squared gap between the set's mean and the
joint mean. We then add all the original sums of squares and all the differences together
to arrive at the new sum of squares. That figure divided by 50 - 1 is the new variance,
whose square root is the new standard deviation. Detailed procedures are shown below.

1. Find the original sums of squares
   Police's set:  1.7^2 × (20 - 1) = 54.91
   Company's set: 1.2^2 × (30 - 1) = 41.76
2. Find the differences between the new sums of squares and the originals
   Police's set:  (21.4 - 20.98)^2 × 20 = 3.528
   Company's set: (20.7 - 20.98)^2 × 30 = 2.352
3. Sum them up:
   54.91 + 41.76 + 3.528 + 2.352 = 102.55
4. Divide the result by (50 - 1) to get a sample variance:
   102.55 / 49 = 2.093
5. Take the square root of the variance to get the new SD:
   sqrt(2.093) = 1.44667 tons (using the unrounded variance)
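
The five steps above can be sketched as one small Python function. This is only an
illustration under the article's assumptions (each set is summarised by its size, mean
and sample SD); the function name `joint_mean_sd` is hypothetical and does not come from
the original exercise.

```python
import math

def joint_mean_sd(sizes, means, sds):
    """Hypothetical helper: combine per-set size, mean and sample SD."""
    total_n = sum(sizes)
    joint_mean = sum(n * m for n, m in zip(sizes, means)) / total_n

    # Step 1: original sum of squares of each set, about its own mean.
    own_ss = [s ** 2 * (n - 1) for n, s in zip(sizes, sds)]
    # Step 2: extra sum of squares from measuring about the joint mean instead.
    shift_ss = [n * (m - joint_mean) ** 2 for n, m in zip(sizes, means)]
    # Steps 3 to 5: add everything, divide by (N - 1), take the square root.
    joint_var = (sum(own_ss) + sum(shift_ss)) / (total_n - 1)
    return joint_mean, math.sqrt(joint_var)

# Police and company figures from the table.
mean, sd = joint_mean_sd([20, 30], [21.4, 20.7], [1.7, 1.2])
print(f"joint mean = {mean:.2f} tons, joint SD = {sd:.5f} tons")
```

Because the function loops over lists, the same call works for any number of sets; only
the three summary lists need to be the same length.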

Supposing the court allows a Type I error of no more than 1%, we are now ready for
hypothesis testing.

What was demonstrated here is a straightforward way to get a joint mean and a joint
standard deviation when the sample size, sample mean and sample standard deviation of
each set of data are available. There is no need to deal with the original raw data.
There is no restriction on whether the dispersions differ significantly between sets,
since all the procedures rest on elementary mathematics. There is also no limitation on
the number of sets of data or on their sizes. Whether the data can be merged, and whether
the new joint statistics are meaningful, remains the responsibility of the user. Any
comments are welcome. Citation to the author would be appreciated.
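
The claim that the raw data are not needed can be checked on a small example. The sketch
below uses made-up weights (they are not data from the exercise) for three sets of
different sizes, applies the procedure to the per-set summaries only, and compares the
result with a direct computation on the merged raw data.

```python
import math
import statistics

# Made-up raw weights, purely for illustration (not data from the exercise).
set_a = [20.1, 21.3, 22.0, 19.8, 21.6]
set_b = [20.5, 20.9, 19.7, 21.2]
set_c = [22.4, 21.8, 20.6]
groups = [set_a, set_b, set_c]

# Per-set summaries: the only inputs the procedure needs.
sizes = [len(g) for g in groups]
means = [statistics.mean(g) for g in groups]
sds = [statistics.stdev(g) for g in groups]   # sample SD, (n - 1) divisor

total_n = sum(sizes)
joint_mean = sum(n * m for n, m in zip(sizes, means)) / total_n
sum_sq = sum(s ** 2 * (n - 1) + n * (m - joint_mean) ** 2
             for n, m, s in zip(sizes, means, sds))
joint_sd = math.sqrt(sum_sq / (total_n - 1))

# Direct computation on the merged raw data, for comparison.
merged = set_a + set_b + set_c
print(math.isclose(joint_mean, statistics.mean(merged)))
print(math.isclose(joint_sd, statistics.stdev(merged)))
```

Agreement here only confirms the arithmetic; it says nothing about whether merging the
sets is sensible, which the article leaves to the user.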
