Brian Franczak
MacEwan University
Winter 2018
Introduction
The Course Website
Statistics
Statistics: Formal Definitions
Population vs. Sample
- A population is the collection of all elements, or units (e.g., humans, animals, etc.), under consideration.
- A sample is a subset, or group of units, taken from the population under consideration.
- Consider the following simple illustration:
Population of interest (e.g., heights of all 1st-year students) → Sample (a randomly selected group of students).
Assumptions
* Model: most often a normal distribution with mean $\mu$.
Parameters vs. Statistics
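Quantity: Sample (statistic) | Population (parameter)
Mean: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$ | $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$
Standard Deviation: $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$ | $\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$
Variance: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$ | $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$
Proportion: $\hat{p} = \frac{\sum_{i=1}^{n} x_i}{n}$ | $p = \frac{\sum_{i=1}^{N} x_i}{N}$
Notation: $\bar{x}$ is read "xbar", $\mu$ is "mu", and $\sigma$ is "sigma". Each sample formula is an estimator.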
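The sample and population formulas differ only in their divisor, which a few lines of Python can make concrete. This is a minimal sketch: the function names and the toy data are hypothetical and chosen only for illustration.

```python
import math

def mean(values):
    """Arithmetic mean: sum of the values divided by their count."""
    return sum(values) / len(values)

def variance(values, sample=True):
    """Variance; divides by n - 1 for a sample, by N for a full population."""
    centre = mean(values)
    squared_deviations = sum((x - centre) ** 2 for x in values)
    divisor = len(values) - 1 if sample else len(values)
    return squared_deviations / divisor

def standard_deviation(values, sample=True):
    """Square root of the corresponding variance."""
    return math.sqrt(variance(values, sample=sample))

def proportion(indicators):
    """Proportion of 1s in a list of 0/1 indicators."""
    return sum(indicators) / len(indicators)

# Hypothetical toy data, not taken from the slides.
toy_values = [2, 4, 4, 4, 5, 5, 7, 9]
toy_indicators = [1, 0, 1, 1, 0]

x_bar = mean(toy_values)                      # 40 / 8
s2 = variance(toy_values)                     # sample variance, divides by n - 1
sigma2 = variance(toy_values, sample=False)   # population variance, divides by N
s = standard_deviation(toy_values)
p_hat = proportion(toy_indicators)            # 3 / 5

print(x_bar, s2, sigma2, s, p_hat)
```

Treating the same eight numbers first as a sample and then as a whole population shows that the n - 1 divisor always gives the larger variance.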
Parameters vs. Statistics: Exercise
1. A randomly selected sample of final exam scores in a large
introductory statistics course is as follows:
88 67 64 76 86 85 82 39 75 34
90 63 89 90 84 81 96 100 70 96
a. Identify the population of interest.
b. Estimate the center of the population of interest and interpret your estimate(s).
c. Estimate the population standard deviation and interpret your estimate.
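Solution:
a. The population of interest is the final exam scores for all students in this intro stats course.
b. With $n = 20$, the sample mean is
$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{88 + 67 + 64 + \cdots + 70 + 96}{20} = 77.75$ (verify),
and the sample median is $\mathrm{Med}(x) = 83$. Here $\bar{x}$ is the estimator and 77.75 is the estimate.
Interpretation: the average exam score of the 20 observed students is 77.75%.
c. The sample variance is
$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{(88 - 77.75)^2 + (67 - 77.75)^2 + (64 - 77.75)^2 + \cdots + (96 - 77.75)^2}{20 - 1} \approx 309.99$,
so $s = \sqrt{s^2} \approx 17.61$.
Interpretation: for these 20 students, the final exam scores deviate about 17.61% on each side of the mean.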
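The arithmetic in this exercise can be checked with Python's standard `statistics` module. The sketch below assumes only the 20 scores listed in the exercise; the variable names are illustrative.

```python
import statistics

# The 20 randomly selected final exam scores from the exercise.
scores = [88, 67, 64, 76, 86, 85, 82, 39, 75, 34,
          90, 63, 89, 90, 84, 81, 96, 100, 70, 96]

n = len(scores)
sample_mean = statistics.mean(scores)          # sum of the scores divided by n
sample_median = statistics.median(scores)      # average of the two middle values
sample_variance = statistics.variance(scores)  # uses the n - 1 divisor
sample_sd = statistics.stdev(scores)           # square root of the sample variance

print(f"n = {n}")
print(f"mean = {sample_mean:.2f}, median = {sample_median}")
print(f"variance = {sample_variance:.2f}, sd = {sample_sd:.2f}")
```

`statistics.variance` and `statistics.stdev` use the n - 1 divisor, matching the sample formulas, while `statistics.pvariance` and `statistics.pstdev` would use the population divisor.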
Estimation
- Recall: $E[X] = \sum_{x} x \, p(x)$.
- As such, the expected value of an unbiased estimator, $\hat{\theta}$, is $E[\hat{\theta}] = \theta$.
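Unbiasedness can be illustrated by simulation: average an estimator over many repeated samples and compare the result with the parameter it targets. The sketch below is only an illustration. The normal population with mean 15 and standard deviation 8, the sample size of 5, the number of replications, and the seed are all assumptions made for the example.

```python
import random
import statistics

random.seed(2018)  # fixed seed so the run is repeatable

MU, SIGMA = 15, 8          # assumed population: normal, mean 15, sd 8 (variance 64)
SAMPLE_SIZE = 5            # assumed size of each sample
REPLICATIONS = 10_000      # assumed number of repeated samples

means, variances_n_minus_1, variances_n = [], [], []
for _ in range(REPLICATIONS):
    sample = [random.gauss(MU, SIGMA) for _ in range(SAMPLE_SIZE)]
    means.append(statistics.fmean(sample))
    variances_n_minus_1.append(statistics.variance(sample))  # divides by n - 1
    variances_n.append(statistics.pvariance(sample))         # divides by n

# Averages over the replications approximate the expected values.
avg_mean = statistics.fmean(means)
avg_var_unbiased = statistics.fmean(variances_n_minus_1)
avg_var_biased = statistics.fmean(variances_n)

print(f"average x-bar            = {avg_mean:.2f}  (target mu = {MU})")
print(f"average s^2 (n - 1)      = {avg_var_unbiased:.2f}  (target sigma^2 = {SIGMA ** 2})")
print(f"average variance (by n)  = {avg_var_biased:.2f}  (systematically too small)")
```

The average of the n-divisor variance settles near $\sigma^2 (n - 1)/n$ rather than $\sigma^2$, which is why the sample variance divides by n - 1.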
Random Variables: Review
- Consider a random variable $X$ from a normally distributed population with $\mu = 15$ and $\sigma = 8$, i.e., $X \sim N(15, 64)$.
That is, $X$ is normally distributed with mean 15 and variance 64.
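The second argument of $N(\cdot, \cdot)$ here is the variance, whereas most software asks for the standard deviation. A small sketch with Python's `statistics.NormalDist` shows the conversion. The one-standard-deviation probability is an added illustration, not part of the slide.

```python
import math
from statistics import NormalDist

mu = 15
variance = 64
sigma = math.sqrt(variance)   # NormalDist expects the standard deviation, not the variance

X = NormalDist(mu=mu, sigma=sigma)

# Illustrative query: probability that X falls within one standard deviation of mu.
p_within_one_sd = X.cdf(mu + sigma) - X.cdf(mu - sigma)

print(X.mean, X.stdev, X.variance)
print(f"P({mu - sigma:.0f} < X < {mu + sigma:.0f}) = {p_within_one_sd:.4f}")
```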
Sampling Error
→ It will help as
- Standard deviation: $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$.
(A larger $n$ gets us closer to the truth!)
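The formula for the standard deviation of the sample mean follows from the rules for variances. The short derivation below is a sketch under the assumption, not stated on the slide, that $X_1, \dots, X_n$ are independent draws from a population with common variance $\sigma^2$.

```latex
\begin{align*}
\operatorname{Var}(\bar{X})
  &= \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
   = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i)
   = \frac{n\sigma^2}{n^2}
   = \frac{\sigma^2}{n}, \\
\sigma_{\bar{X}}
  &= \sqrt{\operatorname{Var}(\bar{X})}
   = \frac{\sigma}{\sqrt{n}}.
\end{align*}
```

Because $n$ sits in the denominator, the spread of $\bar{X}$ around $\mu$ shrinks as the sample size grows.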
Normally Distributed Random Variables
- Therefore, if $X \sim N(\mu, \sigma^2)$ then $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$, i.e., $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$.
- Suppose $X \sim N(70, 100)$. Consider the histograms on the next slide.
P1 The distribution of 10000 realizations of $X$.
P2 The distribution of $\bar{x}$ when $n = 2$ ($\hat{\mu}_{\bar{X}} = 69.89$, $\hat{\sigma}_{\bar{X}} = 7.16$).
P3 The distribution of $\bar{x}$ when $n = 5$ ($\hat{\mu}_{\bar{X}} = 69.90$, $\hat{\sigma}_{\bar{X}} = 4.47$).
P4 The distribution of $\bar{x}$ when $n = 10$ ($\hat{\mu}_{\bar{X}} = 69.91$, $\hat{\sigma}_{\bar{X}} = 3.18$).
*Note: for all figures, $\mu$ is represented by the solid black line and $\hat{\mu}_{\bar{X}}$ is represented by the dashed red line.
[Figure: four histograms (P1 to P4) of relative frequency, each with a horizontal axis running from 40 to 100.]
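The four panels can be approximated with a short simulation. This is a sketch rather than the code behind the original figures: the seed and the helper name `simulate_sample_means` are assumptions, so the simulated values will be close to, but not identical to, those quoted for P2 to P4.

```python
import math
import random
import statistics

random.seed(70)  # arbitrary seed, chosen only for repeatability

MU, SIGMA = 70, 10       # X ~ N(70, 100): the variance is 100, so sigma = 10
REALIZATIONS = 10_000

def simulate_sample_means(n, realizations=REALIZATIONS):
    """Return `realizations` sample means, each computed from n draws of X."""
    return [
        statistics.fmean([random.gauss(MU, SIGMA) for _ in range(n)])
        for _ in range(realizations)
    ]

results = {}
for n in (1, 2, 5, 10):   # n = 1 corresponds to panel P1 (X itself)
    sample_means = simulate_sample_means(n)
    estimated_mean = statistics.fmean(sample_means)
    estimated_sd = statistics.stdev(sample_means)
    theoretical_sd = SIGMA / math.sqrt(n)
    results[n] = (estimated_mean, estimated_sd, theoretical_sd)
    print(f"n = {n:2d}: mean = {estimated_mean:.2f}, "
          f"sd = {estimated_sd:.2f}, sigma/sqrt(n) = {theoretical_sd:.2f}")
```

The estimated means stay near 70 for every n, while the estimated standard deviations track $\sigma/\sqrt{n}$ and shrink as n grows.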
Making a Statement about the Sampling Error
- According to an article in the Journal of the American Geriatrics Society, the standard deviation of the lengths of hospital stays on the intervention ward is 8.3 days. There is evidence that the variable, length of stay, is normally distributed.
- For all samples of size 80, what is the probability the sampling error made in estimating the population mean length of stay is at most 2 days? This is the probability of $\bar{X}$ being within $\pm 2$ days of the true mean:
$P(\mu - 2 < \bar{X} < \mu + 2) = P\left(\frac{-2}{8.3/\sqrt{80}} < Z < \frac{2}{8.3/\sqrt{80}}\right) = P(-2.16 < Z < 2.16) = 0.9692$
Recall: $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$.
- Interpretation in terms of the Sampling Error: There is approximately a 96.92% chance that the sampling error made in estimating the average length of stay using samples of size 80 will be at most 2 days.
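The probability statement above can be checked numerically. The sketch below uses `statistics.NormalDist` for the standard normal distribution. The variable names are illustrative, and rounding z to two decimals mimics reading a printed Z table.

```python
import math
from statistics import NormalDist

sigma = 8.3    # population standard deviation of length of stay, in days
n = 80         # sample size
margin = 2     # allowed sampling error, in days

standard_error = sigma / math.sqrt(n)    # standard deviation of the sample mean
z = margin / standard_error              # standardized margin

Z = NormalDist()                         # standard normal: mean 0, sd 1
p_exact = Z.cdf(z) - Z.cdf(-z)           # using the unrounded z

z_rounded = round(z, 2)                  # what a printed Z table would use
p_table = Z.cdf(z_rounded) - Z.cdf(-z_rounded)

print(f"standard error = {standard_error:.4f}, z = {z:.4f}")
print(f"P(|error| <= {margin}) = {p_exact:.4f} (unrounded z), "
      f"{p_table:.4f} (z rounded to {z_rounded})")
```

The two probabilities differ only in the fourth decimal place, because rounding z to two decimals before the lookup nudges the result slightly.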
Discussion on Normally Distributed Random Variables
Non-Normally Distributed Populations
- There are many examples of populations that are not normally distributed.
- For example, consider the following histogram constructed using data from the Australian Bureau of Statistics.
[Figure: histogram with a horizontal axis running from 0 to 100.]
Non-Normally Distributed Populations: Example
[Figure: four histograms of relative frequency against value; horizontal axes run from −30 to 10 and the vertical axis from 0.00 to 0.30.]
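How the sample mean behaves for a non-normal population can be explored by simulation. The sketch below does not reproduce the slide's data: it assumes a hypothetical left-skewed population (10 minus an exponential variable with mean 8) and tracks the skewness of the simulated sample means as n grows. The seed, sample sizes, and helper names are also assumptions.

```python
import random
import statistics

random.seed(24)  # arbitrary seed, chosen only for repeatability

def draw_left_skewed():
    """One draw from a hypothetical left-skewed population: 10 minus Exp(mean 8)."""
    return 10 - random.expovariate(1 / 8)

def skewness(values):
    """Sample skewness: the average cubed z-score (about 0 for a symmetric shape)."""
    centre = statistics.fmean(values)
    spread = statistics.pstdev(values)
    return statistics.fmean([((v - centre) / spread) ** 3 for v in values])

def sample_means(n, replications=5_000):
    """Simulate `replications` sample means, each from n draws of the population."""
    return [
        statistics.fmean([draw_left_skewed() for _ in range(n)])
        for _ in range(replications)
    ]

skew_by_n = {n: skewness(sample_means(n)) for n in (1, 5, 30, 100)}

for n, skew in skew_by_n.items():
    print(f"n = {n:3d}: skewness of simulated sample means = {skew:.2f}")
```

With n = 1 the simulated values inherit the strong left skew of the population. As n increases the skewness moves toward 0, the value for a symmetric, normal-looking distribution.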
Conclusions
- $\bar{X}$ is an unbiased estimator of $\mu$.
Highlights
- Definition of Statistics
Cumulative Exercise
1. Suppose the population of interest has the shape given in the plot on the next slide, $\mu = 0.2$, and $\sigma^2 = 0.04$.
a. Find the parameters of the sampling distribution of $\bar{X}$ if $n = 5$.
b. What shape will the sampling distribution in a. have? Why?
c. What is the sampling distribution of $\bar{X}$ if $n = 50$? Explain.
d. For all samples of size 50, what is the probability the sampling error made in estimating $\mu$ is within 0.05?
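Solution:
a. $\mu_{\bar{X}} = \mu = 0.2$ and $\sigma^2_{\bar{X}} = \frac{\sigma^2}{n} = \frac{0.04}{5} = 0.008$, so $\sigma_{\bar{X}} = \sqrt{0.008} = 0.0894$.
b. The expected shape of the sampling distribution is unimodal and right skewed.
c. The sampling distribution of $\bar{X}$ will be approximately normally distributed with mean $\mu_{\bar{X}} = 0.2$ and standard deviation $\sigma_{\bar{X}} = \frac{0.2}{\sqrt{50}} = 0.0283$.
d. $P(\mu - 0.05 < \bar{X} < \mu + 0.05) = P\left(\frac{-0.05}{0.0283} < Z < \frac{0.05}{0.0283}\right) = P(-1.77 < Z < 1.77) = 0.9232$
[Figure: plot of the shape of the population of interest.]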
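Parts a., c., and d. of the cumulative exercise reduce to a few lines of arithmetic. The sketch below assumes the normal approximation for n = 50 asked about in part c.; the helper name `sampling_distribution` is hypothetical.

```python
import math
from statistics import NormalDist

mu = 0.2
sigma = math.sqrt(0.04)    # population standard deviation

def sampling_distribution(n):
    """Mean and standard deviation of the sample mean for samples of size n."""
    return mu, sigma / math.sqrt(n)

mean_5, se_5 = sampling_distribution(5)      # part a.
mean_50, se_50 = sampling_distribution(50)   # part c.

# Part d.: probability that the sample mean lands within 0.05 of mu,
# using the normal approximation for n = 50.
z = 0.05 / se_50
Z = NormalDist()
p_within = Z.cdf(z) - Z.cdf(-z)

print(f"n = 5:  mean = {mean_5}, sd = {se_5:.4f}, variance = {se_5 ** 2:.3f}")
print(f"n = 50: mean = {mean_50}, sd = {se_50:.4f}")
print(f"P(|X-bar - mu| < 0.05) = {p_within:.4f} with z = {z:.4f}")
```

A printed Z table, which rounds z to two decimals before the lookup, gives a probability that can differ from this one in the fourth decimal place.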