
Student Lecture Notes

Chapter 2 Probability, Random Variables and Probability Distributions

Learning Objectives
1. Differences between the Two Types of Random Variables
2. Discrete Random Variables
   1. Describe Discrete Random Variables
   2. Compute the Expected Value & Variance of Discrete Random Variables
3. Continuous Random Variables
   1. Describe Normal Random Variables
   2. Introduce the Normal Distribution
   3. Calculate Probabilities for Continuous Random Variables
4. Assessing Normality


Random Variables
A variable defined by the probabilities of each possible value in the population.

Data Types
Data
  Numerical
    Discrete
    Continuous
  Qualitative

Types of Random Variables


Discrete Random Variable
  Whole Number (0, 1, 2, 3 etc.)
  Countable, Finite Number of Values
    Jump from one value to the next and cannot take any values in between.

Continuous Random Variables
  Whole or Fractional Number
  Obtained by Measuring
  Infinite Number of Values in Interval
    Too Many to List Like Discrete Variable

Discrete Random Variable Examples


Experiment                                Random Variable    Possible Values
Children of One Gender in Family          # Girls            0, 1, 2, ..., 10?
Answer 33 Questions                       # Correct          0, 1, 2, ..., 33
Count Cars at Toll Between 11:00 & 1:00   # Cars Arriving    0, 1, 2, ...

Discrete Probability Distribution


1. List of All Possible [x, p(x)] Pairs
   x = Value of Random Variable (Outcome)
   p(x) = Probability Associated with Value
2. Mutually Exclusive (No Overlap)
3. Collectively Exhaustive (Nothing Left Out)
4. $0 \le p(x) \le 1$
5. $\sum p(x) = 1$

Marilyn says: It may sound strange, but more families of 4 children have 3 of one gender and one of the other than any other combination. Explain this.
Construct a sample space and look at the total number of ways each event can occur out of the total number of combinations that can occur, and calculate frequencies. Are all 16 combinations equally likely? Is the sex of each child independent of the other three?
P(girl) = 1/2 and P(boy) = 1/2, so P(BBBB) = 1/2 x 1/2 x 1/2 x 1/2 = 1/16
Sample Space: BBBB GBBB BGBB BBGB BBBG GGBB GBGB GBBG BGGB BGBG BBGG BGGG GBGG GGBG GGGB GGGG

If you have a family of four children, what is the probability of each combination?
P(all girls or all boys) = 2/16 = 1/8
P(2 boys, 2 girls) = 6/16 = 3/8 (six different ways to have 2 boys and 2 girls)
P(3 boys, 1 girl or 3 girls, 1 boy) = 8/16 = 4/8 = 1/2 (eight ways to have 3 of one and 1 of the other)
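The counting argument can be checked by brute force. The following is a minimal sketch, assuming (as the slide does) that each child is a boy or a girl with probability 1/2, independently of the others; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2**4 = 16 equally likely birth orders, e.g. ('B', 'G', 'G', 'B').
sample_space = list(product("BG", repeat=4))
total = len(sample_space)

# Tally the outcomes by the number of girls they contain.
girls = Counter(outcome.count("G") for outcome in sample_space)

p_all_same = Fraction(girls[0] + girls[4], total)   # BBBB or GGGG
p_two_two = Fraction(girls[2], total)               # 2 boys, 2 girls
p_three_one = Fraction(girls[1] + girls[3], total)  # 3 of one, 1 of the other

print(p_all_same, p_two_two, p_three_one)
```

The three-and-one split wins because it pools two events (one girl, three girls) with four arrangements each.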

Assume the random variable X represents the number of girls in a family of 4 kids. (Lower case x is a particular value of X, i.e., x = 3 girls in the family.)

Random Variable X    Sample Space
x = 0                BBBB
x = 1                GBBB BGBB BBGB BBBG
x = 2                GGBB GBGB GBBG BGGB BGBG BBGG
x = 3                BGGG GBGG GGBG GGGB
x = 4                GGGG

Number of Girls, x    Probability, P(x)
0                     1/16
1                     4/16
2                     6/16
3                     4/16
4                     1/16
Total                 16/16 = 1.00

What is the probability of exactly 3 girls in 4 kids? P(X = 3) = 4/16
What is the probability of at least 3 girls in 4 kids? $P(X \ge 3) = 5/16$

Visualizing Discrete Probability Distributions


Listing
{(0, 1/16), (1, .25), (2, 3/8), (3, .25), (4, 1/16)}

Table
Number of Girls, x    Probability, P(x)
0                     1/16
1                     4/16
2                     6/16
3                     4/16
4                     1/16
Total                 16/16 = 1.00

Graph
[Graph: bar chart of Probability, P(x), against Number of Girls, x, with bar heights 1/16, 4/16, 6/16, 4/16, 1/16]

X is random and x is fixed. We can calculate the probability that different values of X will occur and make a probability distribution.

Probability Distributions
[Graph: bar chart of Probability, P(x), against Number of Girls, x, with bar heights 1/16, 4/16, 6/16, 4/16, 1/16]

Probability distributions can be written as probability histograms. Cumulative probabilities: Adding up probabilities of a range of values.

Washington State Population Survey and Random Variables


A telephone survey of households throughout Washington State. But some households don't have phones.

Number of telephones, x    P(x)
0                          0.03500
1                          0.70553
2                          0.21769
3                          0.02966
4                          0.00775
5                          0.00332
6                          0.00088
7                          0.00002
8                          0.00000
9                          0.00015
Total                      1.00000

[Graph: probability histogram of P(x) against Number of Telephone Lines (x)]

Probabilities about Telephone in Washington State


What is the probability that a household will have no telephone?
What is the probability that a household will have 2 or more telephone lines?
What is the probability that a household will have 2 to 4 phone lines?
What is the probability a household will have no phone lines or more than 4 phone lines?
Who do you think is in that 3.5% of the population?
What are the implications of this for the quality of the survey?
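The first four questions reduce to adding up entries of the probability distribution. Here is a small sketch, assuming the P(x) values listed for the Washington State survey (x = 0 to 9 telephone lines); the variable names are illustrative only.

```python
# P(x) for x = 0..9 telephone lines, copied from the survey distribution.
p = [0.03500, 0.70553, 0.21769, 0.02966, 0.00775,
     0.00332, 0.00088, 0.00002, 0.00000, 0.00015]

p_none = p[0]                                # P(X = 0)
p_two_or_more = sum(p[2:])                   # P(X >= 2)
p_two_to_four = sum(p[2:5])                  # P(2 <= X <= 4)
p_none_or_more_than_four = p[0] + sum(p[5:]) # P(X = 0 or X > 4)

for label, value in [("none", p_none),
                     ("2 or more", p_two_or_more),
                     ("2 to 4", p_two_to_four),
                     ("none or more than 4", p_none_or_more_than_four)]:
    print(f"{label}: {value:.5f}")
```

Because the outcomes are mutually exclusive, each cumulative probability is just a sum over the relevant values of x.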

Probability Histogram of Telephone Lines, 1998


[Graph: probability histogram; vertical axis P(x) from 0.00 to 0.70, horizontal axis Number of Telephone Lines (x) from 0 to 9; labeled bars 0.04, 0.71, 0.22, 0.03, 0.01, 0.00]

Summary Measures
1. Expected Value ($\mu$, "mu")
   Mean of Probability Distribution
   Weighted Average of All Possible Values
   $\mu = E(X) = \sum x \, p(x)$
2. Variance ($\sigma^2$, "sigma-squared")
   Weighted Average Squared Deviation about Mean
   $\sigma^2 = V(X) = E[(x - \mu)^2] = \sum (x - \mu)^2 \, p(x)$
   $\sigma^2 = V(X) = E(X^2) - [E(X)]^2$
3. Standard Deviation
   $\sigma = \sqrt{\sigma^2} = SD(X)$

What is the average number of telephones in Washington Households and how much does size vary from the average?
Approach 1: Variance

# of Phones, x   Frequency    P(x)   xP(x)        $(x-\mu)$   $(x-\mu)^2$   $(x-\mu)^2 P(x)$
0                198,286      0.04   0.00         -1.3        1.65          0.06
1                4,142,030    0.71   0.71         -0.3        0.08          0.06
2                1,278,026    0.22   0.44         0.7         0.51          0.11
3                174,110      0.03   0.09         1.7         2.94          0.09
4                45,499       0.01   0.03         2.7         7.38          0.06
5                19,473       0.00   0.02         3.7         13.81         0.05
6                5,170        0.00   0.01         4.7         22.24         0.02
7                118          0.00   0.00         5.7         32.67         0.00
8                -            0.00   0.00         6.7         45.10         0.00
9                897          0.00   0.00         7.7         59.53         0.01
Sum              5,863,609    1.00   $\mu$=1.28   32.16                     $\sigma^2$=0.45

Approach 2: Variance

x      $x^2$   $x^2 P(x)$
0      0       0.00
1      1       0.71
2      4       0.87
3      9       0.27
4      16      0.12
5      25      0.08
6      36      0.03
7      49      0.00
8      64      0.00
9      81      0.01
Sum            2.10
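Both approaches to the variance can be reproduced in a few lines. This is a sketch under the assumption that the unrounded P(x) values from the survey distribution are used rather than the two-decimal values shown in the worked table, so intermediate figures may differ slightly in the last digit.

```python
from math import sqrt

# P(x) for x = 0..9 telephone lines (unrounded values from the survey slide).
p = [0.03500, 0.70553, 0.21769, 0.02966, 0.00775,
     0.00332, 0.00088, 0.00002, 0.00000, 0.00015]
xs = range(len(p))

# Expected value: weighted average of all possible values.
mu = sum(x * px for x, px in zip(xs, p))

# Approach 1: weighted average squared deviation about the mean.
var_1 = sum((x - mu) ** 2 * px for x, px in zip(xs, p))

# Approach 2: E(X^2) - [E(X)]^2.
e_x2 = sum(x * x * px for x, px in zip(xs, p))
var_2 = e_x2 - mu ** 2

sd = sqrt(var_1)
print(f"mu={mu:.2f}  var_1={var_1:.2f}  var_2={var_2:.2f}  sd={sd:.2f}")
```

The two variance formulas are algebraically identical, so any difference between `var_1` and `var_2` is floating-point noise.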

Chebyshev's Theorem
Helpful in understanding or interpreting the value of a standard deviation.
The Empirical Rule applies only to data sets with a bell-shaped distribution.
Chebyshev's theorem applies to ANY data set, but its results are very approximate.

Chebyshev's Theorem
The proportion (or fraction) of any set of data lying within K standard deviations of the mean is always at least $1 - 1/K^2$ (where K > 1). For K = 2 and K = 3, the results are as follows:

At least 3/4 (or 75%) of all values lie within 2 standard deviations of the mean.
At least 8/9 (or 89%) of all values lie within 3 standard deviations of the mean.

Chebyshev's Rule and Empirical Rule for a Discrete Random Variable

Let x be a discrete random variable with a probability distribution p(x), mean $\mu$, and standard deviation $\sigma$. Then, depending on the shape of p(x), the following probability statements can be made:

Probability                                 Chebyshev's Rule    Empirical Rule
$P(\mu - \sigma < x < \mu + \sigma)$        $\ge 0$             $\approx .68$
$P(\mu - 2\sigma < x < \mu + 2\sigma)$      $\ge 3/4$           $\approx .95$
$P(\mu - 3\sigma < x < \mu + 3\sigma)$      $\ge 8/9$           $\approx 1.00$

Chebyshev's Rule applies to any probability distribution (e.g., telephones in Washington State).
The Empirical Rule applies to probability distributions that are mound-shaped and symmetric (e.g., girls born of 4 children).
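To see how loose the Chebyshev bound is in practice, the probabilities can be computed directly for the two example distributions mentioned here. This is an illustrative sketch: the helper name `within_k_sd` is hypothetical, and the strict inequalities follow the probability statements as written.

```python
from math import sqrt

def within_k_sd(dist, k):
    """P(mu - k*sigma < X < mu + k*sigma) for a distribution given as {x: p(x)}."""
    mu = sum(x * p for x, p in dist.items())
    sigma = sqrt(sum((x - mu) ** 2 * p for x, p in dist.items()))
    return sum(p for x, p in dist.items() if mu - k * sigma < x < mu + k * sigma)

# Number of girls among 4 children, and telephone lines per household.
girls = {0: 1/16, 1: 4/16, 2: 6/16, 3: 4/16, 4: 1/16}
phones = dict(enumerate([0.03500, 0.70553, 0.21769, 0.02966, 0.00775,
                         0.00332, 0.00088, 0.00002, 0.00000, 0.00015]))

for k in (1, 2, 3):
    bound = 1 - 1 / k ** 2  # Chebyshev lower bound (0 when k = 1)
    print(k, round(bound, 3),
          round(within_k_sd(girls, k), 3),
          round(within_k_sd(phones, k), 3))
```

In both cases the actual probability sits at or above the Chebyshev bound, which is all the rule guarantees.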

Data Types
Data
  Numerical
    Discrete
    Continuous
  Qualitative

Continuous Random Variable


A variable with many possible values at all intervals

Continuous Random Variable Examples

Experiment                       Random Variable       Possible Values
Weigh 100 People                 Weight                45.1, 78, ...
Measure Part Life                Hours                 900, 875.9, ...
Ask Food Spending                Spending              54.12, 42, ...
Measure Time Between Arrivals    Inter-Arrival Time    0, 1.3, 2.78, ...

Continuous Probability Density Function


1. Mathematical Formula
2. Shows All Values, x, & Frequencies, f(x)
   f(x) Is Not Probability
3. Properties
   Area under curve sums to 1
   Can add up areas of function to get probability less than a specific value

[Figure: curve of Frequency f(x) against Value, with points (Value, Frequency) on the curve and a value a marked on the horizontal axis]

Continuous Random Variable Probability


Probability Is Area Under Curve!

$P(c \le x \le d)$

[Figure: density curve f(x) with the area between c and d shaded]

Continuous Probability Distribution Models


Continuous Probability Distribution
  Uniform
  Normal
  Exponential

Importance of Normal Distribution


1. Describes Many Random Processes or Continuous Phenomena
2. Can Be Used to Approximate Discrete Probability Distributions
   Example: Binomial
3. Basis for Classical Statistical Inference

Normal Distribution
1. Bell-Shaped & Symmetrical
2. Mean, Median, Mode Are Equal
3. Middle Spread Is $1.33\sigma$
4. Random Variable Has Infinite Range

[Figure: bell curve f(X) with Mean, Median, Mode marked at the center]

Normal Distribution Useful Properties


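About half of the probability weight lies below the mean (because the curve is symmetrical)
About 68% of probability within 1 standard deviation of mean (at change in curve)
About 95% of probability within 2 standard deviations
More than 99% of probability within 3 standard deviations

[Figure: bell curve f(X); horizontal axis marked $\mu - 3\sigma$, $\mu - 2\sigma$, $\mu - \sigma$, $\mu$, $\mu + \sigma$, $\mu + 2\sigma$, $\mu + 3\sigma$; Mean, Median, Mode at the center]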
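These percentages can be checked numerically. The sketch below relies on the standard identity that, for any normal distribution, the probability of falling within k standard deviations of the mean equals erf(k/√2); the function name is illustrative.

```python
from math import erf, sqrt

def prob_within_k_sd(k):
    """P(mu - k*sigma < X < mu + k*sigma) for any normal random variable X."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {prob_within_k_sd(k):.4f}")
```

The result does not depend on the particular mean or standard deviation, which is why the same three percentages apply to every normal curve.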

Probability Density Function

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}$$

x = Value of Random Variable ($-\infty < x < \infty$)
$\sigma$ = Population Standard Deviation
$\pi$ = 3.14159
e = 2.71828
$\mu$ = Mean of Random Variable x
Don't memorize this!
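As an illustration, the density can be coded directly from the formula and the "area under the curve is 1" property checked with a crude numerical sum. The function name `normal_pdf`, the example parameters (mean 5, standard deviation 10), and the integration range of 8 standard deviations are assumptions made for this sketch.

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x (a density, not a probability)."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))

# Midpoint-rule approximation of the area under the curve over mu +/- 8 sigma.
mu, sigma, steps = 5.0, 10.0, 10_000
low, high = mu - 8 * sigma, mu + 8 * sigma
width = (high - low) / steps
area = sum(normal_pdf(low + (i + 0.5) * width, mu, sigma) for i in range(steps)) * width

print(round(normal_pdf(mu, mu, sigma), 5), round(area, 6))
```

Note that the peak height depends on the standard deviation, while the total area stays at 1.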

Notation
X is $N(\mu, \sigma)$
The random variable X has a normal distribution (N) with mean $\mu$ and standard deviation $\sigma$.
X is N(40,1)    X is N(10,5)    X is N(50,3)

Effect of Varying Parameters ($\mu$ & $\sigma$)

[Figure: three normal curves labeled A, B, and C, plotted as f(X) against X]

Normal Distribution Probability


Probability is area under curve!

$$P(c \le x \le d) = \int_c^d f(x)\,dx$$

[Figure: normal curve f(x) with the area between c and d shaded]

Infinite Number of Tables


Normal distributions differ by mean & standard deviation.

Each distribution would require its own table.

That's an infinite number!

Standardize the Normal Distribution


$$Z = \frac{X - \mu}{\sigma}$$

Z is N(0,1)

Normal Distribution: mean $\mu$, standard deviation $\sigma$
Standardized Normal Distribution: $\mu = 0$, $\sigma = 1$
One table!

Standardizing Example
$$Z = \frac{X - \mu}{\sigma} = \frac{6.2 - 5}{10} = .12$$

Normal Distribution: $\mu = 5$, $\sigma = 10$, X = 6.2
Standardized Normal Distribution: $\mu = 0$, $\sigma = 1$, Z = .12

Obtaining the Probability


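Standardized Normal Probability Table (Portion)

Z      .00     .01     .02
0.0    .0000   .0040   .0080
0.1    .0398   .0438   .0478
0.2    .0793   .0832   .0871
0.3    .1179   .1217   .1255

Row 0.1 and column .02 give the probability between 0 and Z = .12: .0478.

[Figure: standardized normal curve ($\mu = 0$, $\sigma = 1$) with the area between 0 and .12 shaded; shaded area exaggerated]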
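The table entries are areas under the standard normal curve between 0 and Z, and they can be regenerated from the error function. A minimal sketch, with an illustrative function name:

```python
from math import erf, sqrt

def table_area(z):
    """Area under the standard normal curve between 0 and z."""
    return 0.5 * erf(z / sqrt(2))

# Rebuild the portion of the table shown: rows 0.0-0.3, columns .00-.02.
print("Z    .00     .01     .02")
for row in (0.0, 0.1, 0.2, 0.3):
    cells = "  ".join(f"{table_area(row + col):.4f}" for col in (0.00, 0.01, 0.02))
    print(f"{row:.1f}  {cells}")
```

Looking up Z = .12 then corresponds to `table_area(0.12)`, which rounds to the tabled .0478.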

Example $P(3.8 \le X \le 5)$

$$Z = \frac{X - \mu}{\sigma} = \frac{3.8 - 5}{10} = -.12$$

Normal Distribution: $\mu = 5$, $\sigma = 10$, area between X = 3.8 and X = 5
Standardized Normal Distribution: $\mu = 0$, $\sigma = 1$, area between Z = -.12 and Z = 0 is .0478

[Figure: normal and standardized normal curves with the corresponding areas shaded; shaded area exaggerated]

Example $P(2.9 \le X \le 7.1)$

$$Z = \frac{X - \mu}{\sigma} = \frac{2.9 - 5}{10} = -.21 \qquad Z = \frac{X - \mu}{\sigma} = \frac{7.1 - 5}{10} = .21$$

Normal Distribution: $\mu = 5$, $\sigma = 10$, area between X = 2.9 and X = 7.1
Standardized Normal Distribution: $\mu = 0$, $\sigma = 1$, area between Z = -.21 and Z = .21 is .0832 + .0832 = .1664

[Figure: normal and standardized normal curves with the corresponding areas shaded; shaded area exaggerated]

Example $P(X \ge 8)$

$$Z = \frac{X - \mu}{\sigma} = \frac{8 - 5}{10} = .30$$

Normal Distribution: $\mu = 5$, $\sigma = 10$, area to the right of X = 8
Standardized Normal Distribution: $\mu = 0$, $\sigma = 1$, area to the right of Z = .30 is .5000 - .1179 = .3821

[Figure: normal and standardized normal curves with the corresponding areas shaded; shaded area exaggerated]

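Example $P(7.1 \le X \le 8)$

$$Z = \frac{X - \mu}{\sigma} = \frac{7.1 - 5}{10} = .21 \qquad Z = \frac{X - \mu}{\sigma} = \frac{8 - 5}{10} = .30$$

Normal Distribution: $\mu = 5$, $\sigma = 10$, area between X = 7.1 and X = 8
Standardized Normal Distribution: $\mu = 0$, $\sigma = 1$, area between Z = .21 and Z = .30 is .1179 - .0832 = .0347

[Figure: normal and standardized normal curves with the corresponding areas shaded; shaded area exaggerated]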
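The four worked examples can be cross-checked without a table. This sketch uses `statistics.NormalDist` from the Python standard library (3.8+), with the mean of 5 and standard deviation of 10 from the examples; the helper `prob_between` is a hypothetical name. Answers may differ from the table-based ones in the fourth decimal place because the table entries are rounded.

```python
from statistics import NormalDist

X = NormalDist(mu=5, sigma=10)

def prob_between(dist, low, high):
    """P(low <= X <= high) as a difference of cumulative probabilities."""
    return dist.cdf(high) - dist.cdf(low)

p_a = prob_between(X, 3.8, 5)    # table answer: .0478
p_b = prob_between(X, 2.9, 7.1)  # table answer: .1664
p_c = 1 - X.cdf(8)               # table answer: .3821
p_d = prob_between(X, 7.1, 8)    # table answer: .0347

for name, value in [("P(3.8<=X<=5)", p_a), ("P(2.9<=X<=7.1)", p_b),
                    ("P(X>=8)", p_c), ("P(7.1<=X<=8)", p_d)]:
    print(f"{name} = {value:.4f}")
```

Each case follows the same pattern as the slides: express the event as a difference (or complement) of cumulative areas.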

Travel Time and the Normal Distribution

To help people plan their travel, WSDOT estimates that the average trip from Seattle to Bellevue at 5:40 pm (at peak) takes 11 minutes, with a standard deviation of 10 minutes. They also believe this travel time approximates a normal distribution. What proportion of trips take less than 27 minutes?

Process
1. Draw a picture and write down the probability you need.
2. Convert probability to standard scores.
3. Find cumulative probability in the table.

More Travel Time


Suppose we have only 10-15 minutes to travel to Seattle from Bellevue. What proportion of trips will make it in that time?

$$P(10 < X < 15) = P\left(\frac{10 - 11}{10} < Z < \frac{15 - 11}{10}\right) = P(-0.1 < Z < .4)$$

$$= 1 - P(Z < -0.1) - P(Z > .4)$$

Since normal curves are symmetrical:

$$= 1 - P(Z > .1) - P(Z > .4) = 1 - (.5 - .0398) - (.5 - .1554) = 1 - .4602 - .3446 = .1952$$

19.5% of trips will make it in between 10 and 15 minutes.
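Both travel-time questions follow the same standardize-then-look-up process, which can be sketched with `statistics.NormalDist` (Python 3.8+). The mean of 11 and standard deviation of 10 come from the WSDOT example; the variable names are illustrative.

```python
from statistics import NormalDist

travel = NormalDist(mu=11, sigma=10)  # trip time in minutes
Z = NormalDist()                      # standard normal, N(0, 1)

# Less than 27 minutes: standardize, then take the cumulative probability.
z_27 = (27 - travel.mean) / travel.stdev
p_under_27 = Z.cdf(z_27)

# Between 10 and 15 minutes: difference of two cumulative probabilities.
z_10 = (10 - travel.mean) / travel.stdev
z_15 = (15 - travel.mean) / travel.stdev
p_10_to_15 = Z.cdf(z_15) - Z.cdf(z_10)

print(f"z = {z_27:.2f}, P(X < 27) = {p_under_27:.4f}")
print(f"P(10 < X < 15) = {p_10_to_15:.4f}")
```

The second result matches the hand calculation of about 19.5%.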

Finding Z Values for Known Probabilities


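What is Z given P(Z) = .1217?

Standardized Normal Probability Table (Portion)

Z      .00     .01     .02
0.0    .0000   .0040   .0080
0.1    .0398   .0438   .0478
0.2    .0793   .0832   .0871
0.3    .1179   .1217   .1255

The probability .1217 sits in row 0.3, column .01, so Z = .31.

[Figure: standardized normal curve ($\mu = 0$, $\sigma = 1$) with the area .1217 between 0 and Z = .31 shaded; shaded area exaggerated]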

Finding Z Values for Known Probabilities

Normal Distribution: $\mu = 5$, $\sigma = 10$, shaded area .1217
Standardized Normal Distribution: $\mu = 0$, $\sigma = 1$, shaded area .1217 between 0 and Z = .31

$$X = \mu + Z\sigma = 5 + (.31)(10) = 8.1$$

[Figure: normal and standardized normal curves with the corresponding areas shaded; shaded areas exaggerated]
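The reverse lookup can also be done with an inverse cumulative distribution function. In this sketch, the area .1217 between 0 and Z is converted to a cumulative probability by adding the .5 that lies below the mean; `inv_cdf` is the standard-library method on `statistics.NormalDist` (Python 3.8+).

```python
from statistics import NormalDist

area_from_zero = 0.1217  # area between 0 and Z, as read from the table
z = NormalDist().inv_cdf(0.5 + area_from_zero)

mu, sigma = 5, 10
x = mu + z * sigma       # convert back from SD units to the original scale

print(f"Z = {z:.2f}, X = {x:.1f}")
```

This reproduces Z = .31 and X = 8.1 to the precision of the table.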

Travel Times Take 3


How much time will the trip take 99% of the time?


Finding Z Values for Known Probabilities


1. Write down the probability statement and draw a picture.
   P(Z < ____) = .99
2. Look up the Z value in the table.
   P(Z < 2.325) = .99
3. Convert the Z value (SD units) to the variable (X) by using the mean and SD.
   $X = \mu + Z\sigma$, so X = 11 + (2.325)(10) = 34.25
So, the trip can be made 99% of the time in 34.25 minutes.
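The same three steps can be sketched in code. This uses `statistics.NormalDist.inv_cdf` (Python 3.8+) in place of the table lookup, so the Z value is slightly more precise than the interpolated 2.325 and the answer differs by about a hundredth of a minute.

```python
from statistics import NormalDist

mu, sigma = 11, 10                 # travel time mean and SD, in minutes

z = NormalDist().inv_cdf(0.99)     # step 2: Z with P(Z < z) = .99
x = mu + z * sigma                 # step 3: convert SD units to minutes

# Equivalent one-step form on the original scale.
x_direct = NormalDist(mu, sigma).inv_cdf(0.99)

print(f"Z = {z:.3f}, X = {x:.2f} minutes")
```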

Assessing Normality
1. A histogram of the data is mound shaped and symmetrical about the mean.
2. Determine the percentage of measurements falling in each of the intervals $\bar{x} \pm s$, $\bar{x} \pm 2s$, and $\bar{x} \pm 3s$. If the data are approximately normal, the percentages will be approximately equal to 68%, 95%, and 100% respectively.
3. Find the interquartile range, IQR, and standard deviation, s, for the sample, then calculate the ratio IQR/s. If the data are approximately normal, then $IQR/s \approx 1.3$.
4. Construct a normal probability plot for the data. If the data are approximately normal, the points will fall (approximately) on a straight line.

Assessing Normality: Is Class Height Normally Distributed?


1. How does the histogram look? SPSS can produce the line of the normal curve for you. In SPSS select GRAPH, HISTOGRAM. After you choose the variable you want, click on the box "Display Normal Curve" and you'll get something that looks like this.

[Figure: histogram of Height 527 2005 with a normal curve overlaid; vertical axis Frequency, horizontal axis Height from 60 to 72. Mean = 66.52, Std. Dev. = 3.117, N = 23]

Assessing Normality: Is Class Height Normally Distributed?


2. Compute the intervals:

Height 527 2005
Height    Frequency    Percent    Valid Percent    Cumulative Percent
60        1            4.3        4.3              4.3
62        1            4.3        4.3              8.7
63        3            13.0       13.0             21.7
64        2            8.7        8.7              30.4
65        1            4.3        4.3              34.8
66        3            13.0       13.0             47.8
67        2            8.7        8.7              56.5
68        2            8.7        8.7              65.2
69        5            21.7       21.7             87.0
70        1            4.3        4.3              91.3
71        1            4.3        4.3              95.7
72        1            4.3        4.3              100.0
Total     23           100.0      100.0

Interval             Range             Anticipated Percent    Actual Percent
$\bar{x} \pm s$      [63.40, 69.64]    68%                    65%
$\bar{x} \pm 2s$     [60.29, 72.75]    95%                    96%
$\bar{x} \pm 3s$     [57.17, 75.87]    100%                   100%

SPSS: ANALYZE, DESCRIPTIVE STATISTICS, FREQUENCIES
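Outside SPSS, the interval check can be reproduced from the frequency table alone. This is a sketch that assumes the 23 heights listed in the frequency table and uses the sample standard deviation (`statistics.stdev`); the variable names are illustrative.

```python
from statistics import mean, stdev

# Height -> frequency, from the SPSS frequency table (N = 23).
freq = {60: 1, 62: 1, 63: 3, 64: 2, 65: 1, 66: 3,
        67: 2, 68: 2, 69: 5, 70: 1, 71: 1, 72: 1}
heights = [h for h, n in freq.items() for _ in range(n)]

x_bar, s = mean(heights), stdev(heights)
print(f"mean = {x_bar:.2f}, s = {s:.3f}, n = {len(heights)}")

counts = {}
for k in (1, 2, 3):
    low, high = x_bar - k * s, x_bar + k * s
    counts[k] = sum(low <= h <= high for h in heights)
    pct = 100 * counts[k] / len(heights)
    print(f"x_bar +/- {k}s = [{low:.2f}, {high:.2f}]: {pct:.0f}%")
```

The percentages are then compared with the anticipated 68%, 95%, and 100%.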

Assessing Normality: Is Class Height Normally Distributed?

3. Does $IQR/s \approx 1.3$?
IQR = 69 - 64 = 5
IQR/s = 5/3.117 = 1.6

Statistics: Height 527 2005
N Valid            23
N Missing          0
Std. Deviation     3.117
Percentiles 25     64.00
            50     67.00
            75     69.00

SPSS: ANALYZE, DESCRIPTIVE STATISTICS, FREQUENCIES then click on STATISTICS and choose the ones you want.
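The IQR/s ratio can be sketched the same way. Quartile conventions differ between packages; for this particular data set, the default ("exclusive") method of `statistics.quantiles` (Python 3.8+) happens to give the same percentiles as the SPSS output, but that agreement should not be assumed in general.

```python
from statistics import quantiles, stdev

# Height -> frequency, from the SPSS frequency table (N = 23).
freq = {60: 1, 62: 1, 63: 3, 64: 2, 65: 1, 66: 3,
        67: 2, 68: 2, 69: 5, 70: 1, 71: 1, 72: 1}
heights = sorted(h for h, n in freq.items() for _ in range(n))

q1, q2, q3 = quantiles(heights, n=4)  # 25th, 50th, 75th percentiles
iqr = q3 - q1
ratio = iqr / stdev(heights)

print(f"Q1 = {q1}, median = {q2}, Q3 = {q3}, IQR = {iqr}, IQR/s = {ratio:.1f}")
```

A ratio near 1.3 is consistent with normality; here it comes out at about 1.6.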

Assessing Normality: Is Class Height Normally Distributed?

4. What does the normal probability plot look like?
SPSS: Graphs > Q-Q. Set the test distribution to Normal and click "Estimate distribution parameters from data."

[Figure: Normal Q-Q Plot of Height 527 2005; horizontal axis Observed Value, vertical axis Expected Normal Value, both running from 60 to 74]

Exercise 1
Identify the given random variable as being discrete or continuous.
a) The weight of the cola in a randomly selected can.
b) The cost of a randomly selected can of Coke.
c) The time it takes to fill a can of Pepsi.

Exercise 2
Below is a case where a probability distribution is described. Find its mean and standard deviation.
In a study of the MicroSort gender-selection method, couples in a control group are not given a treatment, and they each have three children. The probability distribution for the number of girls is given.

x    P(x)
0    0.125
1    0.375
2    0.375
3    0.125

Exercise 3
Below is a case where a probability distribution is described. Find its mean and standard deviation.
To settle a paternity suit, two different people are given blood tests. If x is the number having group A blood, then x can be 0, 1 or 2, and the corresponding probabilities are 0.36, 0.48 and 0.16, respectively.
x    P(x)

Exercise 4
Let the random variable x represent the number of girls in a family of four children. Construct a table describing the probability distribution, then find the mean and standard deviation.
x P(x)


Exercise 5
Assume that the readings on the thermometers are normally distributed with a mean of 0 °C and a standard deviation of 1.00 °C. A thermometer is randomly selected and tested. In each case, draw a sketch, and find the probability of each reading in degrees.
a) Between 0 and 1.50
b) Between -1.96 and 0
c) Less than -1.79
d) Greater than 2.05
e) Between 0.50 and 1.50
f) P(-1.96 < z < 1.96)
g) P(z > -2.575)

Exercise 6
Assume that a test is designed to measure a person's sense of humour and that scores on this test are normally distributed with a mean of 10 and a standard deviation of 2. Draw a graph, find the relevant z score, then find the indicated value.
a) Find the score separating the top 10% from the bottom 90%.
b) Find the score separating the top 25% from the bottom 75%.
c) Find the score separating the bottom 20% from the top 80%.
