Learning Objectives
1. Differences between the Two Types of Random Variables
2. Discrete Random Variables
   1. Describe Discrete Random Variables
   2. Compute the Expected Value & Variance of Discrete Random Variables
3. Describe Normal Random Variables
   - Introduce the Normal Distribution
   - Calculate Probabilities for Continuous Random Variables
4. Assessing Normality
Random Variables
A variable defined by the probabilities of each possible value in the population.
Data Types
Data
Numerical
Qualitative
Discrete
Continuous
Discrete variables jump from one value to the next and cannot take any values in between.
Experiment                                 Random Variable    Possible Values
-                                          # Girls            0, 1, 2, ..., 10?
Answer 33 Questions                        # Correct          0, 1, 2, ..., 33
Count Cars at Toll Between 11:00 & 1:00    # Cars Arriving    0, 1, 2, ...
2. Mutually Exclusive (No Overlap)
3. Collectively Exhaustive (Nothing Left Out)
4. $0 \le p(x) \le 1$
5. $\sum p(x) = 1$
Marilyn says: It may sound strange, but more families of 4 children have 3 of one gender and one of the other than any other combination. Explain this.
Construct a sample space and look at the total number of ways each event can occur out of the total number of combinations that can occur, and calculate frequencies. Are all 16 combinations equally likely? Is the sex of each child independent of the other three?
P(girl) = 1/2 and P(boy) = 1/2, so P(BBBB) = 1/2 x 1/2 x 1/2 x 1/2 = 1/16
Sample Space BBBB GBBB BGBB BBGB BBBG GGBB GBGB GBBG BGGB BGBG BBGG BGGG GBGG GGBG GGGB GGGG
If you have a family of four children, what is the probability of each combination?
P(all girls or all boys) = 2/16 = 1/8
P(2 boys, 2 girls) = 6/16 = 3/8 (six different ways to have 2 boys and 2 girls)
P(3 boys, 1 girl or 3 girls, 1 boy) = 8/16 = 4/8 = 1/2 (8 ways to have 3 of one and 1 of the other)
Assume the random variable X represents the number of girls in a family of 4 kids. (Lower case x is a particular value of X, e.g. x = 3 girls in the family.)
Sample Space, grouped by the value of the Random Variable X:
x = 0: BBBB
x = 1: GBBB BGBB BBGB BBBG
x = 2: GGBB GBGB GBBG BGGB BGBG BBGG
x = 3: BGGG GBGG GGBG GGGB
x = 4: GGGG
What is the probability of exactly 3 girls in 4 kids? $P(X = 3) = 4/16$
What is the probability of at least 3 girls in 4 kids? $P(X \ge 3) = 5/16$
Table
Number of Girls, x    Probability, P(x)
0                     1/16
1                     4/16
2                     6/16
3                     4/16
4                     1/16
Total                 16/16 = 1.00
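The distribution of the number of girls can be checked by brute-force enumeration. The sketch below assumes, as the slides do, that P(girl) = P(boy) = 1/2 and that births are independent; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# All 2**4 = 16 birth orders, assumed equally likely (P(girl) = P(boy) = 1/2,
# independent births).
sample_space = ["".join(kids) for kids in product("BG", repeat=4)]

# X = number of girls in each outcome; tally the outcomes into P(x).
distribution = {}
for outcome in sample_space:
    x = outcome.count("G")
    distribution[x] = distribution.get(x, Fraction(0)) + Fraction(1, len(sample_space))

p_exactly_3 = distribution[3]                        # P(X = 3)
p_at_least_3 = distribution[3] + distribution[4]     # P(X >= 3)
p_three_and_one = distribution[1] + distribution[3]  # 3 of one gender, 1 of the other

for x in sorted(distribution):
    print(x, distribution[x])
```

Exact fractions are used so the results can be compared directly with the sixteenths in the table (Python prints them in lowest terms, so 4/16 appears as 1/4).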
Graph
[Figure: probability histogram of Probability, P(x), against Number of Girls, x, with bar heights 1/16, 4/16, 6/16, 4/16, 1/16]
X is random and x is fixed. We can calculate the probability that different values of X will occur and make a probability distribution.
Probability Distributions
[Figure: probability histogram of Probability, P(x), against Number of Girls, x]
Probability distributions can be written as probability histograms. Cumulative probabilities: Adding up probabilities of a range of values.
x        P(x)
0        0.03500
1        0.70553
2        0.21769
3        0.02966
4        0.00775
5        0.00332
6        0.00088
7        0.00002
8        0.00000
9        0.00015
Total    1.00000
What is the probability that a household will have no telephone?
What is the probability that a household will have 2 or more telephone lines?
What is the probability that a household will have 2 to 4 phone lines?
What is the probability a household will have no phone lines or more than 4 phone lines?
Who do you think is in that 3.5% of the population? What are the implications of this for the quality of the survey?
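Cumulative probabilities like these are sums over a range of values. A minimal sketch, assuming the P(x) values from the telephone table correspond to x = 0 through 9 phone lines (the helper name `prob_between` is illustrative):

```python
# P(x) for x = 0..9 phone lines, copied from the telephone table; the x labels
# are an assumption, since only 5..9 survive in the extracted table.
p = [0.03500, 0.70553, 0.21769, 0.02966, 0.00775,
     0.00332, 0.00088, 0.00002, 0.00000, 0.00015]

def prob_between(lo, hi):
    """Return P(lo <= X <= hi) by adding the individual probabilities."""
    return sum(p[lo:hi + 1])

p_none = p[0]                                          # no telephone
p_two_or_more = 1 - prob_between(0, 1)                 # complement of 0 or 1 lines
p_two_to_four = prob_between(2, 4)                     # 2, 3 or 4 lines
p_none_or_more_than_four = p[0] + prob_between(5, 9)   # 0 lines, or 5 and up

print(round(p_two_or_more, 5), round(p_two_to_four, 5),
      round(p_none_or_more_than_four, 5))
```

Using the complement for "2 or more" avoids adding eight small terms and is the same trick used later with the normal table.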
Summary Measures
1. Expected Value ($\mu$, "mu")
   Mean of Probability Distribution
   Weighted Average of All Possible Values
   $\mu = E(X) = \sum x \, p(x)$
2. Variance ($\sigma^2$, "sigma-squared")
   Weighted Average Squared Deviation about Mean
   $\sigma^2 = V(X) = E[(x - \mu)^2] = \sum (x - \mu)^2 \, p(x)$
   $\sigma^2 = V(X) = E(X^2) - [E(X)]^2$
3. Standard Deviation
   $\sigma = \sqrt{\sigma^2} = SD(X)$
What is the average number of telephones in Washington Households and how much does size vary from the average?
Approach 1: Variance
# of Phones, x   Frequency    P(x)   xP(x)       (x - mu)   (x - mu)^2   (x - mu)^2 P(x)
0                198,286      0.04   0.00        -1.3       1.65         0.06
1                4,142,030    0.71   0.71        -0.3       0.08         0.06
2                1,278,026    0.22   0.44        0.7        0.51         0.11
3                174,110      0.03   0.09        1.7        2.94         0.09
4                45,499       0.01   0.03        2.7        7.38         0.06
5                19,473       0.00   0.02        3.7        13.81        0.05
6                5,170        0.00   0.01        4.7        22.24        0.02
7                118          0.00   0.00        5.7        32.67        0.00
8                0            0.00   0.00        6.7        45.10        0.00
9                897          0.00   0.00        7.7        59.53        0.01
Sum              5,863,609    1.00   mu = 1.28   32.16                   sigma^2 = 0.45

Approach 2: Variance
x      x^2    x^2 P(x)
0      0      0.00
1      1      0.71
2      4      0.87
3      9      0.27
4      16     0.12
5      25     0.08
6      36     0.03
7      49     0.00
8      64     0.00
9      81     0.01
Sum           2.10
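Both approaches to the variance can be reproduced in a few lines. This is a sketch that assumes the five-decimal P(x) values from the earlier telephone table apply to x = 0 through 9; the rounded columns above appear to be derived from them.

```python
from math import sqrt

# P(x) for x = 0..9 phones (five-decimal values from the telephone table;
# assigning them to x = 0..9 is an assumption).
p = [0.03500, 0.70553, 0.21769, 0.02966, 0.00775,
     0.00332, 0.00088, 0.00002, 0.00000, 0.00015]
xs = range(len(p))

# Expected value: weighted average of all possible values.
mu = sum(x * px for x, px in zip(xs, p))

# Approach 1: weighted average squared deviation about the mean.
var_definition = sum((x - mu) ** 2 * px for x, px in zip(xs, p))

# Approach 2: E(X^2) - [E(X)]^2.
e_x2 = sum(x ** 2 * px for x, px in zip(xs, p))
var_shortcut = e_x2 - mu ** 2

sd = sqrt(var_definition)
print(round(mu, 2), round(e_x2, 2), round(var_definition, 2), round(sd, 2))
```

The two variance formulas agree up to floating-point error; Approach 2 needs only one extra column (x^2 P(x)), which is why it is usually quicker by hand.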
Chebyshev's Theorem
Helpful in understanding or interpreting a value of a standard deviation
Empirical rule applies only to data sets with a bell-shaped distribution
Chebyshev's theorem applies to ANY data set, but its results are very approximate
The proportion (or fraction) of any set of data lying within K standard deviations of the mean is always at least $1 - 1/K^2$ (where K > 1). For K = 2 and K = 3, the results are as follows:
At least 3/4 (or 75%) of all values lie within 2 standard deviations of the mean
At least 8/9 (or 89%) of all values lie within 3 standard deviations of the mean
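The bound is simple enough to evaluate for any K. A minimal sketch (the function name is illustrative):

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of any data set within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires K > 1")
    return 1 - 1 / k ** 2

for k in (2, 3):
    print(k, chebyshev_lower_bound(k))
```

For K = 2 the bound is 0.75, well below the 95% the empirical rule gives for bell-shaped data; that gap is the price of applying to any data set.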
Data Types
Data
Numerical
Qualitative
Discrete
Continuous
Random Variable    Possible Values
Weight             45.1, 78, ...
Hours              900, 875.9, ...
Spending           54.12, 42, ...
3. Properties
Area under curve sums to 1
Can add up areas of function to get probability less than a specific value
[Figure: density curve f(x) against Value, with the point a marked]
$P(c \le x \le d)$
[Figure: density curve f(x) with the point c marked]
Normal
Exponential
Example: Binomial
Normal Distribution
1. Bell-Shaped & Symmetrical
2. Mean, Median, Mode Are Equal
3. Middle Spread Is $1.33\sigma$
4. Random Variable Has Infinite Range
[Figure: normal curve f(X) with Mean, Median, Mode at the center; horizontal axis marked from $\mu - 3\sigma$ to $\mu + 3\sigma$]
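$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}$
$x$ = Value of Random Variable ($-\infty < x < \infty$)
$\sigma$ = Population Standard Deviation
$\pi$ = 3.14159
$e$ = 2.71828
$\mu$ = Mean of Random Variable $x$
Don't memorize this!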
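Rather than memorizing the density, it can be written once as a function. A minimal sketch of the standard normal density formula (the function name is illustrative):

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    """Height of the normal curve with mean mu and standard deviation sigma at x."""
    z = (x - mu) / sigma
    return exp(-0.5 * z * z) / (sigma * sqrt(2 * pi))

# The curve peaks at the mean and is symmetrical about it.
print(normal_pdf(0, 0, 1))
print(normal_pdf(4, 5, 10), normal_pdf(6, 5, 10))
```

Note that the function returns a height, not a probability; probabilities for a continuous variable come from areas under this curve.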
Notation
X is N($\mu$, $\sigma$): The random variable X has a normal distribution (N) with mean $\mu$ and standard deviation $\sigma$.
Examples: X is N(40,1); X is N(10,5); X is N(50,3)
$P(c \le x \le d) = \int_c^d f(x) \, dx$
[Figure: density curve f(x) with the point c marked]
$Z = \frac{X - \mu}{\sigma}$
Z is N(0,1)
Standardized Normal Distribution: $\mu = 0$, $\sigma = 1$
One table!
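Standardizing Example
$Z = \frac{X - \mu}{\sigma} = \frac{6.2 - 5}{10} = .12$
Normal Distribution: $\mu = 5$, $\sigma = 10$, $X = 6.2$
Standardized Normal Distribution: $\mu = 0$, $\sigma = 1$, $Z = .12$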
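Probabilities
Standardized Normal Distribution: $\mu = 0$, $\sigma = 1$. For $Z = .12$, row 0.1 and column .02 of the table give .0478.
Z      .00      .01      .02
0.0    .0000    .0040    .0080
0.1    .0398    .0438    .0478
0.2    .0793    .0832    .0871
0.3    .1179    .1217    .1255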
Example $P(3.8 \le X \le 5)$
$Z = \frac{X - \mu}{\sigma} = \frac{3.8 - 5}{10} = -.12$
Normal Distribution: $\mu = 5$, $\sigma = 10$, $X = 3.8$
Standardized Normal Distribution: $\mu = 0$, $\sigma = 1$, $Z = -.12$
$P(3.8 \le X \le 5) = P(-.12 \le Z \le 0) = .0478$
Example $P(2.9 \le X \le 7.1)$
Normal Distribution: $\mu = 5$, $\sigma = 10$, X from 2.9 to 7.1
Standardized Normal Distribution: $\mu = 0$, $\sigma = 1$, Z from -.21 to .21
$.0832 + .0832 = .1664$
Example $P(X \ge 8)$
$Z = \frac{X - \mu}{\sigma} = \frac{8 - 5}{10} = .30$
Normal Distribution: $\mu = 5$, $\sigma = 10$
Standardized Normal Distribution: $\mu = 0$, $\sigma = 1$, $Z = .30$
$.5000 - .1179 = .3821$
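Example $P(7.1 \le X \le 8)$
$Z = \frac{X - \mu}{\sigma} = \frac{7.1 - 5}{10} = .21$
$Z = \frac{X - \mu}{\sigma} = \frac{8 - 5}{10} = .30$
Normal Distribution: $\mu = 5$, $\sigma = 10$
Standardized Normal Distribution: $\mu = 0$, $\sigma = 1$, Z from .21 to .30
$.1179 - .0832 = .0347$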
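The table lookups in these examples can be cross-checked numerically. The sketch below swaps the printed table for the standard normal cumulative distribution, computed from the error function in Python's `math` module; the helper names are illustrative, and small differences from the slides come from the table's four-decimal rounding.

```python
from math import erf, sqrt

def phi(z):
    """Cumulative probability P(Z <= z) for the standardized normal distribution."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def normal_prob(lo, hi, mu, sigma):
    """P(lo <= X <= hi) for X that is N(mu, sigma), via standardizing."""
    z_lo = (lo - mu) / sigma
    z_hi = (hi - mu) / sigma
    return phi(z_hi) - phi(z_lo)

MU, SIGMA = 5, 10  # the normal distribution used in the examples

p_38_to_5 = normal_prob(3.8, 5, MU, SIGMA)     # slides: .0478
p_29_to_71 = normal_prob(2.9, 7.1, MU, SIGMA)  # slides: .1664
p_above_8 = 1 - phi((8 - MU) / SIGMA)          # slides: .3821
p_71_to_8 = normal_prob(7.1, 8, MU, SIGMA)     # slides: .0347

for value in (p_38_to_5, p_29_to_71, p_above_8, p_71_to_8):
    print(round(value, 4))
```

Every case reduces to the same two steps as the hand method: convert the X limits to Z scores, then take a difference of cumulative areas.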
Process
1. Draw a picture and write down the probability you need.
2. Convert probability to standard scores.
3. Find cumulative probability in the table.
$= 1 - P(Z > .1) - P(Z > .4) = 1 - (.5 - .0398) - (.5 - .1554) = 1 - (.4602) - (.3446) = .1952$
[Figure: standardized normal distribution, $\mu = 0$, $\sigma = 1$, with $Z = .31$ marked. Shaded area exaggerated.]
Normal Distribution: $\mu = 5$, $\sigma = 10$
Standardized Normal Distribution: $\mu = 0$, $\sigma = 1$; area .1217 corresponds to $Z = .31$
$X = \mu + Z\sigma = 5 + (.31)(10) = 8.1$
Shaded areas exaggerated
3. Convert Z value (SD units) to variable (X) by using mean and SD.
$X = \mu + Z\sigma$, so $X = 11 + (2.325)(10) = 34.25$
So, the trip can be made 99% of the time in 34.25 minutes.
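The conversion from a Z value back to X can be sketched as follows. The mean of 11 and standard deviation of 10 are read from the equation on the slide; the exact Z value comes from `statistics.NormalDist` in the Python standard library, which is a substitute for the printed table rather than the method used on the slide.

```python
from statistics import NormalDist

MU, SIGMA = 11, 10   # values read from the slide's equation X = 11 + Z(10)

# Table-based answer, using the slide's Z value for a cumulative area of .99.
z_table = 2.325
x_table = MU + z_table * SIGMA

# Cross-check: exact 99th percentile of the standardized normal distribution.
z_exact = NormalDist().inv_cdf(0.99)
x_exact = MU + z_exact * SIGMA

print(round(x_table, 2), round(z_exact, 4), round(x_exact, 2))
```

The table value 2.325 is an interpolation between neighbouring table entries, so the exact result differs slightly in the second decimal place.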
Assessing Normality
1. A histogram of the data is mound shaped and symmetrical about the mean.
2. Determine the percentage of measurements falling in each of the intervals $\bar{x} \pm s$, $\bar{x} \pm 2s$, and $\bar{x} \pm 3s$. If the data are approximately normal, the percentages will be approximately equal to 68%, 95%, and 100% respectively.
3. Find the interquartile range, IQR, and standard deviation, s, for the sample, then calculate the ratio IQR/s. If the data are approximately normal, then IQR/s $\approx$ 1.3.
4. Construct a normal probability plot for the data. If the data are approximately normal, the points will fall (approximately) on a straight line.
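Checks 2 and 3 are easy to automate. The sketch below is illustrative only: the function name is made up, and the data are hypothetical (1000 evenly spaced quantiles of a normal distribution with mean 66.5 and standard deviation 3.1), chosen so the output shows what an approximately normal sample looks like.

```python
from statistics import NormalDist, mean, quantiles, stdev

def normality_checks(data):
    """Return the fractions within 1, 2 and 3 s of the mean, and the ratio IQR/s."""
    m, s = mean(data), stdev(data)
    within = [sum(abs(v - m) <= k * s for v in data) / len(data) for k in (1, 2, 3)]
    q1, _, q3 = quantiles(data, n=4)   # quartiles; IQR = Q3 - Q1
    return within, (q3 - q1) / s

# Hypothetical, deliberately well-behaved data (not from the slides).
data = [NormalDist(66.5, 3.1).inv_cdf((i + 0.5) / 1000) for i in range(1000)]

within, iqr_ratio = normality_checks(data)
print([round(w, 3) for w in within], round(iqr_ratio, 2))
```

For real data, fractions far from 68%, 95% and 100%, or an IQR/s ratio far from 1.3, are signs that the normal model may not fit.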
[Figure: histogram with Frequency on the vertical axis]
Interval                              Anticipated Percent    Actual Percent
$\bar{x} \pm s$    [63.40, 69.64]     68%                    43%
$\bar{x} \pm 2s$   [60.29, 72.75]     95%                    96%
$\bar{x} \pm 3s$   [57.17, 75.87]     100%                   100%
SPSS: ANALYZE, DESCRIPTIVE STATISTICS, FREQUENCIES then click on STATISTICS and choose the ones you want.
[Figure: normal probability plot of Expected Normal Value against Observed Value, both axes running from 60 to about 74]
Exercise 1
Identify the given random variable as being discrete or continuous
a) The weight of the cola in a randomly selected can.
b) The cost of a randomly selected can of Coke.
c) The time it takes to fill a can of Pepsi.
Exercise 2
Below is a case where a probability distribution is described. Find its mean and standard deviation.
In a study of the MicroSort gender-selection method, couples in a control group are not given a treatment, and they each have three children. The probability distribution for the number of girls is given.
x 0 1 2 3
Exercise 3
Below is a case where a probability distribution is described. Find its mean and standard deviation.
To settle a paternity suit, two different people are given blood tests. If x is the number having group A blood, then x can be 0, 1 or 2, and the corresponding probabilities are 0.36, 0.48 and 0.16, respectively.
P(x)
Exercise 4
Let the random variable x represent the number of girls in a family of four children. Construct a table describing the probability distribution, then find the mean and standard deviation.
x P(x)
Exercise 5
Assume that the readings on the thermometers are normally distributed with a mean of 0°C and a standard deviation of 1.00°C. A thermometer is randomly selected and tested. In each case, draw a sketch, and find the probability of each reading in degrees.
a) Between 0 and 1.50
b) Between -1.96 and 0
c) Less than -1.79
d) Greater than 2.05
e) Between 0.50 and 1.50
f) P(-1.96 < z < 1.96)
g) P(z > -2.575)
Exercise 6
Assume that a test is designed to measure a person's sense of humour and that scores on this test are normally distributed with a mean of 10 and a standard deviation of 2. Draw a graph, find the relevant z score, then find the indicated value.
a) Find the score separating the top 10% from the bottom 90%.
b) Find the score separating the top 25% from the bottom 75%.
c) Find the score separating the bottom 20% from the top 80%.