Analysis
Sample Variance:
$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$
Box Plots:
lower fence (LF) = LQ - 1.5*IQR
upper fence (UF) = UQ + 1.5*IQR
Find limits of whiskers (LX = smallest value > LF, UX = greatest value < UF)
IQR = UQ - LQ = spread
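A minimal Python sketch of the fence and whisker calculation above. The function name `box_plot_fences` and the sample data are illustrative only, and the quartile convention (the standard library's "inclusive" method) is an assumption, since textbooks define LQ and UQ in slightly different ways.

```python
import statistics


def box_plot_fences(data):
    """Return quartiles, fences, whisker limits and outliers for a box plot."""
    values = sorted(data)
    # Assumption: "inclusive" quartiles; other conventions give slightly
    # different LQ and UQ on small samples.
    lq, _median, uq = statistics.quantiles(values, n=4, method="inclusive")
    iqr = uq - lq
    lf = lq - 1.5 * iqr  # lower fence
    uf = uq + 1.5 * iqr  # upper fence
    lx = min(v for v in values if v > lf)  # lower whisker limit
    ux = max(v for v in values if v < uf)  # upper whisker limit
    outliers = [v for v in values if v <= lf or v >= uf]
    return {"lq": lq, "uq": uq, "iqr": iqr, "lf": lf, "uf": uf,
            "lx": lx, "ux": ux, "outliers": outliers}


# Illustrative data with one obvious outlier.
summary = box_plot_fences([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(summary)
```

Any value on or outside a fence is reported as an outlier, so the whiskers stop at LX and UX rather than at the minimum and maximum.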
Standardised score (z-score):
$z = \frac{x - \bar{x}}{s}$
measures the # of s.d. that x lies above or below the mean
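As a rough illustration, the sample variance and z-score formulas can be written directly in Python. The function names and the data are hypothetical.

```python
import math


def sample_variance(xs):
    """s^2 = sum((x_i - x_bar)^2) / (n - 1)."""
    n = len(xs)
    x_bar = sum(xs) / n
    return sum((x - x_bar) ** 2 for x in xs) / (n - 1)


def z_scores(xs):
    """z = (x - x_bar) / s for every observation."""
    x_bar = sum(xs) / len(xs)
    s = math.sqrt(sample_variance(xs))
    return [(x - x_bar) / s for x in xs]


data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values
print(sample_variance(data))
print(z_scores(data))
```

An observation equal to the sample mean gets z = 0, and the z-scores of a sample always sum to zero.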
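Linear Transformations:
$y_i = a + c x_i$
$\bar{y} = a + c\bar{x}$
$s_y^2 = c^2 s_x^2$
$s_y = |c| s_x$
TOPIC 2 Probability
Permutations:
${}^{n}P_{r} = \frac{n!}{(n-r)!}$
Combinations:
${}^{n}C_{r} = \frac{n!}{r!\,(n-r)!}$
Conditional probability:
$P(A|B) = \frac{P(A \cap B)}{P(B)}$
Bayes Rule:
$P(A|B) = \frac{P(B|A)\,P(A)}{P(B)}$
Statistical Independence
Events A & B stat. inde if:
P(A|B) = P(A), i.e. knowledge that B occurred has no effect on P(A), and vice versa
Variance of r.v. X (sum over all possible values $x_i$):
$\sigma^2 = E[(X-\mu)^2] = \sum_{i} x_i^2 P(X = x_i) - \mu^2$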
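The counting formulas, Bayes rule and the independence condition from this topic can be checked numerically. This is a small sketch: the helper names are hypothetical and the probabilities in the usage example are made-up numbers, not values from the notes.

```python
import math


def n_permutations(n, r):
    """Ordered selections of r items from n: n! / (n - r)!."""
    return math.factorial(n) // math.factorial(n - r)


def n_combinations(n, r):
    """Unordered selections of r items from n: n! / (r! (n - r)!)."""
    return math.factorial(n) // (math.factorial(r) * math.factorial(n - r))


def bayes(p_b_given_a, p_a, p_b):
    """P(A|B) = P(B|A) P(A) / P(B)."""
    return p_b_given_a * p_a / p_b


def is_independent(p_a_given_b, p_a, tol=1e-12):
    """A and B are independent when P(A|B) equals P(A)."""
    return abs(p_a_given_b - p_a) <= tol


# Made-up example: P(A) = 0.01, P(B|A) = 0.9, P(B|not A) = 0.05.
p_a = 0.01
p_b = 0.9 * p_a + 0.05 * (1 - p_a)  # total probability of B
p_a_given_b = bayes(0.9, p_a, p_b)
print(n_permutations(5, 2), n_combinations(5, 2), p_a_given_b)
print(is_independent(p_a_given_b, p_a))
```

In this example P(A|B) differs from P(A), so A and B are not independent.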
R.v. Transformations:
Y = a + c*X
Mean = E[Y] = a + c*E[X]
Variance = Var[a + c*X] = $c^2 Var(X)$
Sd. = $\sigma_Y = |c| \sigma_X$
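A short sketch that checks the transformation rules on a small discrete distribution. The distribution, the constants a and c, and the function name `mean_and_variance` are illustrative assumptions.

```python
import math


def mean_and_variance(values, probs):
    """E[X] and Var(X) = sum(x^2 P(X = x)) - mu^2 for a discrete r.v."""
    mu = sum(x * p for x, p in zip(values, probs))
    var = sum(x * x * p for x, p in zip(values, probs)) - mu ** 2
    return mu, var


# Illustrative r.v. X and transformation Y = a + c*X.
xs = [0, 1, 2]
ps = [0.25, 0.5, 0.25]
a, c = 3, -2

mu_x, var_x = mean_and_variance(xs, ps)
mu_y, var_y = mean_and_variance([a + c * x for x in xs], ps)

print(mu_y, a + c * mu_x)        # E[Y] against a + c*E[X]
print(var_y, c ** 2 * var_x)     # Var(Y) against c^2 Var(X)
print(math.sqrt(var_y), abs(c) * math.sqrt(var_x))
```

The shift a moves the mean but drops out of the variance, and a negative c still gives a positive standard deviation because of the absolute value.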
Bernoulli Distributions: exp. has 2
mutually exclusive outcomes
(success, fail), inde trials,
constant π (prob. of success)
Binomial Distrib: fixed # identical
trials, trials independent,
constant π for exp., 2 poss.
outcomes on each trial, binom.
variable x = # successes in n trials
Poisson Distrib: events occur
randomly over a continuum of
time at constant rate, each event
occurs inde of the other events,
expected # events each unit = λ,
events occur at low frequency
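The notes list the conditions for these distributions but not their probability functions. As a hedged sketch, the standard binomial and Poisson probability mass functions are shown below, writing the success probability as `pi` and the expected number of events per unit as `lam` (both symbol choices are assumptions).

```python
import math


def binomial_pmf(x, n, pi):
    """P(X = x) for x successes in n independent trials with constant pi."""
    return math.comb(n, x) * pi ** x * (1 - pi) ** (n - x)


def poisson_pmf(x, lam):
    """P(X = x) for events occurring at a constant expected rate lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


# Illustrative values only.
print(binomial_pmf(2, 4, 0.5))   # 2 successes in 4 fair trials
print(poisson_pmf(0, 2.0))       # no events when 2 are expected
```

A Bernoulli trial is the n = 1 case of the binomial, so `binomial_pmf(1, 1, pi)` returns `pi`.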
Poisson Approx to Binom.
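Normal approx. (continuity correction):
$P(X \le x) \approx P\left(Z \le \frac{x + \frac{1}{2} - n\pi}{\sigma_X}\right)$
$\sigma_X^2 = Var(X) = n\pi(1-\pi)$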
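The sketch below compares an exact binomial probability with two approximations: a Poisson approximation with rate n*pi (the usual choice, assumed here since the notes do not state it) and a normal approximation with a continuity correction and variance n*pi*(1 - pi). Function names and the numbers in the example are illustrative.

```python
import math
from statistics import NormalDist


def binomial_cdf(x, n, pi):
    """Exact P(X <= x) for a binomial variable."""
    return sum(math.comb(n, k) * pi ** k * (1 - pi) ** (n - k)
               for k in range(x + 1))


def poisson_cdf(x, lam):
    """P(X <= x) for a Poisson variable with mean lam."""
    return sum(math.exp(-lam) * lam ** k / math.factorial(k)
               for k in range(x + 1))


def normal_approx_cdf(x, n, pi):
    """Continuity-corrected normal approximation to P(X <= x)."""
    mean = n * pi
    sd = math.sqrt(n * pi * (1 - pi))
    return NormalDist().cdf((x + 0.5 - mean) / sd)


# Rare events, many trials: Poisson approximation with lam = n * pi.
print(binomial_cdf(3, 1000, 0.002), poisson_cdf(3, 1000 * 0.002))

# Many trials, moderate pi: normal approximation.
print(binomial_cdf(55, 100, 0.5), normal_approx_cdf(55, 100, 0.5))
```

The Poisson approximation suits large n with small pi, while the normal approximation works best when neither n*pi nor n*(1 - pi) is small.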
Hypergeometric: N = populatn
size, n = # elements drawn, r = #
successes in pop,
Hyperge variable x = # successes in
sample
Normal Distn:
If X ~ N(μ, σ²), standardizing:
$z = \frac{x - \mu}{\sigma}$
Then Z ~ N(0,1)
Empirical Rule (Normdist) and
Chebyshev (any prob. dist):
Within        Empirical   Chebyshev
1 sd of μ     68.26%      at least 0
2 sd of μ     95.44%      at least 3/4
3 sd of μ     99.7%       at least 8/9
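Both columns can be reproduced with a few lines of Python. This sketch uses `statistics.NormalDist` for the normal percentages and the Chebyshev bound 1 - 1/k² for the general case; the function names are hypothetical. The exact normal values round to 68.27%, 95.45% and 99.73%, so the 68.26% and 95.44% figures appear to come from four-decimal z tables.

```python
from statistics import NormalDist


def normal_within(k):
    """P(|Z| <= k): share of a normal distribution within k sd of the mean."""
    z = NormalDist()  # standard normal, mean 0 and sd 1
    return z.cdf(k) - z.cdf(-k)


def chebyshev_bound(k):
    """Lower bound on P(|X - mu| <= k sd) for any distribution."""
    return 1 - 1 / k ** 2


for k in (1, 2, 3):
    print(k, round(100 * normal_within(k), 2), chebyshev_bound(k))
```

The Chebyshev bound is much weaker than the normal figures because it has to hold for every distribution with a finite variance.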
TOPIC 5 Sampling
Distributions
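If X ~ N(μ, σ²) then
$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
Test statistics (H0: μ = μ0):
$z_{obs} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$ (σ known)
$t_{obs} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$ (σ unknown)
df = n-1; as df → ∞, t → Z
C.I. for μ:
$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$ (σ known)
$\bar{x} \pm t_{n-1,\,\alpha/2} \frac{s}{\sqrt{n}}$ (σ unknown)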
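A rough sketch of the one-sample calculations for a mean. The Python standard library has no t distribution, so the t critical value is passed in by the caller (for example from a t table); that design choice, the function names and the data are all assumptions made for illustration.

```python
import math
from statistics import NormalDist, mean, stdev


def z_interval(xs, sigma, confidence=0.95):
    """C.I. for the mean when the population sd sigma is known."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z_crit * sigma / math.sqrt(len(xs))
    return mean(xs) - half_width, mean(xs) + half_width


def t_statistic(xs, mu0):
    """t_obs = (x_bar - mu0) / (s / sqrt(n)), with df = n - 1."""
    return (mean(xs) - mu0) / (stdev(xs) / math.sqrt(len(xs)))


def t_interval(xs, t_crit):
    """C.I. for the mean when sigma is unknown; t_crit comes from a t table."""
    half_width = t_crit * stdev(xs) / math.sqrt(len(xs))
    return mean(xs) - half_width, mean(xs) + half_width


sample = [10, 12, 11, 13, 14]      # illustrative data
print(z_interval(sample, sigma=2.0))
print(t_statistic(sample, mu0=10))
print(t_interval(sample, t_crit=2.776))  # tabulated t for df = 4, 95% C.I.
```

With the same data the t interval is wider than a z interval built from the same sd, which reflects the extra uncertainty from estimating σ by s.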
Investigating Normality:
IQR/S ≈ 1.3
Construct normal probability plot: if data approx. normdist, points fall approx. on a straight line within confidence bounds
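The IQR/S check can be sketched as follows. For an exactly normal distribution the ratio of the interquartile range to the standard deviation is about 1.35, which is where the rule of thumb comes from. The function name and both data sets are illustrative, and the quartiles use the standard library's default convention.

```python
from statistics import NormalDist, quantiles, stdev


def iqr_over_s(data):
    """Ratio of the sample IQR to the sample standard deviation."""
    q1, _median, q3 = quantiles(data, n=4)
    return (q3 - q1) / stdev(data)


# Theoretical ratio for a normal distribution (sd = 1).
theoretical_ratio = NormalDist().inv_cdf(0.75) - NormalDist().inv_cdf(0.25)

# Roughly normal data: evenly spaced normal quantiles (deterministic).
normal_like = [NormalDist().inv_cdf((i + 0.5) / 200) for i in range(200)]
# Clearly non-normal data: a flat, uniform spread.
uniform_like = list(range(1, 101))

print(theoretical_ratio)
print(iqr_over_s(normal_like), iqr_over_s(uniform_like))
```

A ratio far from 1.3 suggests non-normal data, but a ratio near 1.3 does not prove normality, so the probability plot is still worth constructing.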
TOPIC 8 Inference Regarding
two Populatn Means
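$z_{obs} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
$t_{obs} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$
Pooled Variance:
$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
Confidence interval:
$(\bar{x}_1 - \bar{x}_2) \pm t_{n_1+n_2-2,\,\alpha/2}\; s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
Assumptions:
Data come from normal or approx. normdist
2-Sample Test
σ1 = σ2 if $\frac{S_{larger}}{S_{smaller}} < 2$
T-test Requirements:
Parent populatn normdist
2 samples inde,
observations within each sample inde,
The 2 populatn s.d. are same: σ1 = σ2
C.I. for π:
$P \pm z_{\alpha/2}\sqrt{\frac{P(1-P)}{n}}$
$z_{obs} = \frac{P - \pi_0}{\sqrt{\frac{\pi_0(1-\pi_0)}{n}}}$
Two proportions:
$\hat{\pi} = \frac{\text{successes in sample 1} + \text{successes in sample 2}}{\text{total } n}$
$z_{obs} = \frac{p_1 - p_2}{\sqrt{\hat{\pi}(1-\hat{\pi})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
Confidence interval:
$(p_1 - p_2) \pm z_{\alpha/2}\sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}$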
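A minimal sketch of the pooled two-sample t statistic and the two-proportion z statistic, both under the null hypothesis of no difference between the populations. The function names and example numbers are hypothetical.

```python
import math
from statistics import mean, variance


def pooled_t(x1, x2):
    """Pooled-variance t statistic for H0: mu1 = mu2, with its df."""
    n1, n2 = len(x1), len(x2)
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)) / (n1 + n2 - 2)
    se = math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2)
    return (mean(x1) - mean(x2)) / se, n1 + n2 - 2


def two_proportion_z(successes1, n1, successes2, n2):
    """z statistic for H0: pi1 = pi2, using the pooled proportion."""
    p1, p2 = successes1 / n1, successes2 / n2
    pooled = (successes1 + successes2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Illustrative data only.
print(pooled_t([5, 7, 9], [1, 3, 5]))
print(two_proportion_z(30, 100, 20, 100))
```

The pooled t statistic is only appropriate when the equal-variance requirement is plausible, for example when the larger sample sd is less than twice the smaller one.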
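Assumptions:
Linearity: true linear trend for the
conditional expected value of Y
given X
Normality: residuals normdist
Constant Variance: variability
about regression line constant
Independence: response values
inde
$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$
regression coeff: $\hat{\beta}_1 = \frac{SS_{XY}}{SS_{XX}}$
sample intercept: $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$
$SS_{XX} = \sum x_i^2 - n\bar{x}^2 = (n-1)S_x^2$
$SS_{XY} = \sum x_i y_i - n\bar{x}\bar{y}$
$SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = SS_{YY} - \hat{\beta}_1 SS_{XY}$
$SS_{tot} = SS_{YY} = \frac{SS_{XY}^2}{SS_{XX}} + (SS_{YY} - \hat{\beta}_1 SS_{XY})$
$\hat{\sigma} = s = \sqrt{\frac{SS_{YY} - \hat{\beta}_1 SS_{XY}}{n-2}}$
Coeff of Determination:
Proportion of total var in Y
explained by linear relation of Y
and X
$R^2 = \frac{SS_{reg}}{SS_{tot}} = \frac{SS_{XY}^2}{SS_{XX}\,SS_{YY}}$
Correlation Coeff:
$r = \hat{\rho} = \frac{SS_{XY}}{\sqrt{SS_{XX}\,SS_{YY}}}$
$t_{obs} = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$
$t_{obs} = \frac{\hat{\beta}_1 - 0}{\text{est.se.}(b)}$
$\text{est.se.}(b) = \frac{s}{\sqrt{SS_{XX}}}$
Confidence interval of B:
$\hat{\beta}_1 \pm t_{n-2,\,\alpha/2}\;\text{est.se.}(b)$
Predicted Values:
$\hat{y}_0 = \hat{\beta}_0 + \hat{\beta}_1 x_c$
Est. standard error:
$\widehat{se}\left(\hat{E}(Y|X=x_c)\right) = s\sqrt{\frac{1}{n} + \frac{(x_c - \bar{x})^2}{SS_{XX}}}$
c.i.:
$\hat{y}_0 \pm t_{n-2,\,\alpha/2}\;\widehat{se}\left(\hat{E}(Y|X=x_c)\right)$
Prediction interval:
$\hat{y}_0 \pm t_{n-2,\,\alpha/2}\; s\sqrt{1 + \frac{1}{n} + \frac{(x_c - \bar{x})^2}{SS_{XX}}}$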
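The regression formulas chain together naturally, so a compact sketch may help. It computes the sums of squares, the fitted coefficients, s, r, R² and the slope t statistic from raw data. SS_YY is computed the same way as SS_XX (an assumption, since the notes do not spell it out), and the function name and data are illustrative.

```python
import math


def simple_linear_regression(xs, ys):
    """Least-squares fit of y on x using the sums-of-squares formulas."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xx = sum(x * x for x in xs) - n * x_bar ** 2
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar
    ss_yy = sum(y * y for y in ys) - n * y_bar ** 2  # assumed analogue of SS_XX

    b1 = ss_xy / ss_xx                    # slope
    b0 = y_bar - b1 * x_bar               # intercept
    sse = max(ss_yy - b1 * ss_xy, 0.0)    # guard against rounding below zero
    s = math.sqrt(sse / (n - 2))          # residual standard error
    r = ss_xy / math.sqrt(ss_xx * ss_yy)  # correlation coefficient
    se_b1 = s / math.sqrt(ss_xx)          # est. standard error of the slope
    # t statistic for H0: slope = 0 (undefined for a perfect fit).
    t_obs = b1 / se_b1 if se_b1 > 0 else math.inf
    return {"b0": b0, "b1": b1, "sse": sse, "s": s, "r": r,
            "r_squared": r ** 2, "se_b1": se_b1, "t_obs": t_obs}


fit = simple_linear_regression([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(fit)
```

The slope t statistic and the correlation t statistic r√(n-2)/√(1-r²) give the same value, which is a useful consistency check on hand calculations.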
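Goodness of Fit:
$E_i$ = expected count under H0 in
cell i
$\chi^2_{obs} = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$
Contingency Tables
If H0 true $\pi_{ij} = \pi_i \times \pi_j$
But we do not have $\pi_i$ or $\pi_j$
Estimate:
$\hat{\pi}_i = \frac{\text{row}(i)\text{ total}}{N} = P_i$
$\hat{\pi}_j = \frac{\text{column}(j)\text{ total}}{N} = P_j$
Therefore if H0 true
$\hat{\pi}_{ij} = \hat{\pi}_i \hat{\pi}_j$
$E_{ij} = \hat{\pi}_i \hat{\pi}_j N = \frac{\text{row}(i)\text{ total}}{N} \cdot \frac{\text{column}(j)\text{ total}}{N} \cdot N$
Test Statistic:
$\chi^2_{obs} = \sum_{i}\sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$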
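Both chi-square statistics can be sketched in a few lines of Python. The function names and the counts in the example are hypothetical, and the code only computes the statistic; the critical value still has to come from a chi-square table.

```python
def chi_square_gof(observed, expected):
    """Goodness-of-fit statistic: sum over cells of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_independence(table):
    """Statistic and expected counts for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Under H0 (independence): E_ij = row total * column total / N.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    return chi2, expected


# Illustrative counts only.
print(chi_square_gof([20, 30, 50], [25, 25, 50]))
print(chi_square_independence([[10, 20], [20, 10]]))
```

Large values of either statistic indicate that the observed counts sit far from what H0 predicts.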