Probability
Corresponds to Chapter 2 of
Concepts (Review)
A characteristic measured on each unit of a population or sample is called a variable.
A parameter is a numerical characteristic of a population.
A statistic is a numerical characteristic of a sample.
Statistics are used to infer the values of parameters.
A random sample gives every unit of the population a non-zero chance of
entering the sample.
In probability, we assume that the population and its parameters are known
and compute the probability of drawing a particular sample.
In statistics, we assume that the population and its parameters are unknown
and the sample is used to infer the values of the parameters.
Different samples give different estimates of population parameters; this is
called sampling variability.
Sampling variability leads to “sampling error”.
Probability is deductive (general → particular).
Statistics is inductive (particular → general).
Difference between Statistics and Probability
Probability Concepts
[Figure: Venn diagram of events A and B.]
Axioms of Probability
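The axiom statements themselves are not reproduced on this slide; for reference, the standard (Kolmogorov) form, stated here as an assumption rather than a quotation from the text:

1. P(A) ≥ 0 for every event A.
2. P(S) = 1, where S is the sample space.
3. If A₁, A₂, … are mutually exclusive, then P(A₁ ∪ A₂ ∪ …) = P(A₁) + P(A₂) + …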
Conditional Probability:
P(A|B) = P(A∩B)/P(B)
Multiplication rule: P(A∩B) = P(A|B)P(B)
If A and B are independent: P(A∩B) = P(A|B)P(B) = P(A)P(B)
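These identities are easy to verify exactly on a small sample space. A sketch in Python; the two-dice events A and B below are my own illustration, not from the slides:

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two dice.
omega = list(product(range(1, 7), repeat=2))

def prob(event):
    """Exact probability of an event (a subset of omega)."""
    return Fraction(len(event), len(omega))

# Hypothetical events chosen for illustration:
A = {(d1, d2) for (d1, d2) in omega if d1 == 3}        # first die shows 3
B = {(d1, d2) for (d1, d2) in omega if d1 + d2 == 7}   # dice sum to 7

# Conditional probability: P(A|B) = P(A∩B)/P(B)
p_a_given_b = prob(A & B) / prob(B)

print(p_a_given_b)   # P(A|B)
print(prob(A))       # P(A); equal here, so A and B are independent
```

Because P(A|B) = P(A), these particular events also satisfy the independence condition P(A∩B) = P(A)P(B).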
Summary
Bayes Theorem
P(HIV+ | Test+) = 95/590 ≈ 0.16
This is one reason why we don’t have mass HIV screening.
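The 95/590 figure is consistent with a screening population of prevalence 1%, test sensitivity 95%, and specificity 95%; those three parameters are my inference from the slide's arithmetic, not stated in the text. A Bayes-theorem sketch under that assumption:

```python
# Assumed parameters (they reproduce the slide's 95/590; not stated in the text):
prevalence = 0.01     # P(HIV+)
sensitivity = 0.95    # P(Test+ | HIV+)
specificity = 0.95    # P(Test- | HIV-)

# Law of total probability: P(Test+)
p_test_pos = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)

# Bayes theorem: P(HIV+ | Test+) = P(Test+ | HIV+) P(HIV+) / P(Test+)
p_hiv_given_pos = sensitivity * prevalence / p_test_pos

print(round(p_hiv_given_pos, 2))  # 0.16
```

Per 10,000 people screened: 95 true positives and 495 false positives, hence 95/590.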
Suggestions for Solving Probability Problems
Draw a picture
– Venn diagram
– Tree or event diagram (Probabilistic Risk Assessment)
– Sketch
Counting rules
Source: Casella, George, and Roger L. Berger. Statistical Inference. Belmont, CA: Duxbury Press, 1990, page 16.
Counting rules (from Casella & Berger)
For these examples, see pages 15–16 of: Casella, George, and Roger L. Berger. Statistical Inference. Belmont, CA: Duxbury Press, 1990.
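The cited Casella & Berger table organizes the counting formulas by ordered/unordered × with/without replacement; a Python sketch of that 2 × 2 layout (the layout is assumed from the citation, not reproduced in this extract):

```python
import math

def counting_table(n, r):
    """Number of ways to select r items from n, in the four standard cases."""
    return {
        ("ordered", "with replacement"): n ** r,
        ("ordered", "without replacement"): math.perm(n, r),
        ("unordered", "without replacement"): math.comb(n, r),
        ("unordered", "with replacement"): math.comb(n + r - 1, r),
    }

print(counting_table(5, 2))
```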
Birthday Problem
[Figure: probability that all students have different birthdays versus number of students (0 to 80); probability axis from 0.0 to 1.0; computed in S-PLUS with choose(365, 1:80, order = ....]
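The birthday-problem curve can be recomputed without S-PLUS; a Python sketch:

```python
import math

def p_all_different(k, days=365):
    """P(all k people have different birthdays) = perm(days, k) / days**k."""
    return math.perm(days, k) / days ** k

# The curve drops below 1/2 at k = 23 students.
for k in (10, 22, 23, 40):
    print(k, round(p_all_different(k), 4))
```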
This graph was created using S-PLUS(R) Software. S-PLUS(R) is a registered trademark of Insightful Corporation.
See “Harry Potter and the …” by Rowling.
Another Counting Rule
(1.6 * 10^57)
Random Variables
• See Table 2.1 on p. 21 (p.m.f. and c.d.f. for sum of two dice)
• See Figure 2.5 on p. 22 (p.m.f. and c.d.f. graphs for two dice)
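The p.m.f. and c.d.f. referenced above can be built by direct enumeration; a Python sketch (not the textbook's own computation):

```python
from fractions import Fraction
from itertools import product

# p.m.f. of the sum of two fair dice: f(x) = P(X = x)
pmf = {s: Fraction(0) for s in range(2, 13)}
for d1, d2 in product(range(1, 7), repeat=2):
    pmf[d1 + d2] += Fraction(1, 36)

# c.d.f.: F(x) = P(X <= x), a running sum of the p.m.f.
cdf, total = {}, Fraction(0)
for s in range(2, 13):
    total += pmf[s]
    cdf[s] = total

print(pmf[7], cdf[12])  # 1/6 and 1
```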
Continuous Random Variables
P(0 < X < 1) for the standard normal = the area under the density between 0 and 1.
[Figure: standard normal density dnorm(x) for −4 ≤ x ≤ 4; density axis from 0.0 to 0.4.]
Cumulative Distribution Function
F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(y) dy
f(x) = dF(x)/dx
P(0 < Z < 1) for the standard normal = F(1) − F(0)
[Figure: standard normal c.d.f. pnorm(z) for −4 ≤ z ≤ 4.]
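The F(1) − F(0) computation can be checked without tables or S-PLUS; a sketch using only the Python standard library (math.erf):

```python
import math

def phi(z):
    """Standard normal c.d.f. Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

p = phi(1.0) - phi(0.0)   # P(0 < Z < 1) = F(1) - F(0)
print(round(p, 4))        # 0.3413
```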
Expected Value
E(X) = µ = Σ x f(x) = x₁ f(x₁) + x₂ f(x₂) + …  (discrete case)
E(X) = µ = ∫ x f(x) dx  (continuous case)
Variance and Standard Deviation
P(X ≤ θ_p) = F(θ_p) = p
θ_0.5 is called the median.
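The variance formulas themselves are not in this extract; under the standard definition Var(X) = E[(X − µ)²], a quick exact check for the sum of two fair dice:

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely values of the sum of two fair dice.
outcomes = [d1 + d2 for d1, d2 in product(range(1, 7), repeat=2)]

mu = Fraction(sum(outcomes), 36)                            # E(X)
var = sum((Fraction(x) - mu) ** 2 for x in outcomes) / 36   # E[(X - mu)^2]

print(mu, var)  # 7 and 35/6
```

The standard deviation is the square root of the variance, √(35/6) ≈ 2.415.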
Joint Distributions
f(x,y) = P(X=x,Y=y)
Marginal Distributions
Conditional Distributions
ρ_XY = corr(X, Y) = Cov(X, Y)/√(Var(X) Var(Y)) = σ_XY/(σ_X σ_Y)
• See Examples 2.26 and 2.27 on pp. 37–38 (prob vs. stat grades)
Example 2.25 in text
[Figure: scatterplot of y versus x for Example 2.25; x runs from 0 to 50, y from −40 to 40.]
Two Famous Theorems
Chebyshev’s inequality: P(|X − µ| ≥ c) ≤ σ²/c²
• See Example 2.29 on p. 41 (exact vs. Cheb. for two dice)
Law of Large Numbers: P(|X̄ − µ| ≥ c) → 0 as n → ∞
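In the spirit of the exact-vs-Chebyshev comparison cited above for two dice, a sketch with a hypothetical threshold c = 3 (my choice, not necessarily the book's):

```python
from fractions import Fraction
from itertools import product

# Sum of two fair dice: mu = 7, sigma^2 = 35/6.
outcomes = [d1 + d2 for d1, d2 in product(range(1, 7), repeat=2)]
mu = Fraction(sum(outcomes), 36)
var = sum((Fraction(x) - mu) ** 2 for x in outcomes) / 36

c = 3  # hypothetical threshold, for illustration
exact = Fraction(sum(abs(x - mu) >= c for x in outcomes), 36)
bound = var / c ** 2   # Chebyshev: P(|X - mu| >= c) <= sigma^2 / c^2

print(exact, "<=", bound)  # 1/3 <= 35/54
```

The bound holds but is loose: 1/3 ≈ 0.33 against 35/54 ≈ 0.65.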
Selected Discrete Distributions
Selected Continuous Distributions
Normal Distribution
Photograph courtesy of John L. Telford, John Telford Photography. Used with permission. Currency from 1991.
Karl Pearson (1857–1936)
Normal Distribution (“Bell-curve”, Gaussian)
P(X ≤ x) = P(Z = (X − µ)/σ ≤ (x − µ)/σ = z) = Φ((x − µ)/σ)
• See Examples 2.37 and 2.38 on pp. 54-55 (computations)
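The standardization identity is easy to exercise with made-up numbers (µ = 100, σ = 15, x = 115 below are illustrative, not from Examples 2.37 and 2.38):

```python
import math

def phi(z):
    """Standard normal c.d.f. Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical X ~ N(mu = 100, sigma = 15):
mu, sigma = 100.0, 15.0
x = 115.0

z = (x - mu) / sigma    # standardize
p = phi(z)              # P(X <= x) = Phi((x - mu) / sigma)
print(z, round(p, 4))   # 1.0 and 0.8413
```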
Percentiles of the Normal Distribution
[Figure: percentiles read from the standard normal c.d.f.; vertical (probability) axis from 0.0 to 1.0.]
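A percentile θ_p solves F(θ_p) = p; instead of reading it from the c.d.f. graph or a table, the Python standard library's NormalDist.inv_cdf computes it directly. A sketch:

```python
from statistics import NormalDist

# theta_p = F^{-1}(p): the p-th percentile of the standard normal.
std_normal = NormalDist(mu=0.0, sigma=1.0)
for p in (0.5, 0.95, 0.975):
    print(p, round(std_normal.inv_cdf(p), 4))
```

The familiar landmarks appear: the median is 0, and the 97.5th percentile is about 1.96.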
Linear Combinations of r.v.s
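The slide body is not in this extract; the standard results are E(aX + bY) = aE(X) + bE(Y) and, for independent X and Y, Var(aX + bY) = a²Var(X) + b²Var(Y). A brute-force check for two independent dice (the coefficients a = 2, b = 3 are arbitrary):

```python
from fractions import Fraction
from itertools import product

die = range(1, 7)
e_die = Fraction(sum(die), 6)                               # E(X) = 7/2
var_die = sum((Fraction(x) - e_die) ** 2 for x in die) / 6  # Var(X) = 35/12

a, b = 2, 3  # arbitrary coefficients, for illustration

# Brute-force E and Var of W = a*X + b*Y over all 36 outcomes:
w = [a * x + b * y for x, y in product(die, repeat=2)]
e_w = Fraction(sum(w), 36)
var_w = sum((Fraction(v) - e_w) ** 2 for v in w) / 36

# Both match the linear-combination formulas exactly:
print(e_w == a * e_die + b * e_die)              # True
print(var_w == a**2 * var_die + b**2 * var_die)  # True
```

The variance formula uses independence; with correlated X and Y a covariance term 2ab·Cov(X, Y) is added.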