
10/21/2019

6-7. Probability and distributions

[1] 2, 4; [2] 3; [3] 4, 5, 6; [5] 2, 8

Data Analysis 10/21/2019

Outline

• Probability
• Commonly used Distributions

Probability

• Basic ideas
• Counting Methods
• Random Variables
• Linear Functions of Random Variables
• Jointly Distributed Random Variables

Basic ideas

• The set of all possible outcomes of an experiment is called the sample space for the experiment.
  ◦ Ex: For tossing a coin, we can use the set {Heads, Tails} as the sample space. This sample space is finite.
• A subset of a sample space is called an event.
• Combining events: union, intersection, complement (the event does not occur).
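Because events are subsets of the sample space, the three ways of combining them map directly onto set operations. A minimal sketch, assuming a hypothetical experiment (one roll of a die) and two illustrative events A and B that are not taken from the slides:

```python
# Hypothetical experiment: one roll of a six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # illustrative event: the roll is even
B = {4, 5, 6}   # illustrative event: the roll is greater than 3

union = A | B                     # A or B (or both) occurs
intersection = A & B              # both A and B occur
complement_A = sample_space - A   # A does not occur

print(union, intersection, complement_A)
```

Two events are mutually exclusive exactly when their intersection is the empty set.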


Basic ideas
MUTUALLY EXCLUSIVE EVENTS

Probabilities

• Each event in a sample space has a probability of occurring.
• Given any experiment and any event A:
  ◦ The expression P(A) denotes the probability that the event A occurs.
  ◦ P(A) is the proportion of times that event A would occur in the long run, if the experiment were to be repeated over and over again.
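The long-run interpretation can be illustrated by simulation. A minimal sketch, assuming a fair coin and a hypothetical helper named relative_frequency; the seed is fixed only so that the run is repeatable:

```python
import random

def relative_frequency(n_trials, seed=0):
    """Proportion of heads in n_trials simulated tosses of a fair coin."""
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(n_trials))
    return heads / n_trials

# The proportion should settle near P(Heads) = 0.5 as the number of tosses grows.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency(n))
```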


Axioms of Probability
Sample Spaces with Equally Likely Outcomes

The Addition Rule


Counting Methods
Permutations

• A permutation is an ordering of a collection of objects.
  ◦ Ex: There are six permutations of the letters A, B, C: ABC, ACB, BAC, BCA, CAB, and CBA.
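The six orderings can be enumerated mechanically. A minimal sketch using Python's standard itertools module; the comparison with 3! relies on the standard fact that n distinct objects have n! permutations:

```python
from itertools import permutations
from math import factorial

letters = "ABC"
orderings = ["".join(p) for p in permutations(letters)]

print(orderings)
print(len(orderings) == factorial(len(letters)))
```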

Combinations

Conditional Probability and Independence


Independent Events

The Multiplication Rule


The Law of Total Probability


Bayes' Rule
Random Variables

• A random variable assigns a numerical value to each outcome in a sample space.

Figure: The function X, which assigns a numerical value to each outcome in the sample space, is a random variable.

Discrete Random Variables

• A random variable is discrete if its possible values form a discrete set.
• The probability mass function of a discrete random variable X is the function p(x) = P(X = x). The probability mass function is sometimes called the probability distribution.
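To make the definition p(x) = P(X = x) concrete, the sketch below builds a probability mass function by counting outcomes. It assumes a hypothetical experiment (two tosses of a fair coin, with all four outcomes equally likely) and a random variable X that counts heads; neither comes from the slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Hypothetical experiment: two tosses of a fair coin, 4 equally likely outcomes.
sample_space = list(product("HT", repeat=2))

def X(outcome):
    """Random variable: the number of heads in the outcome."""
    return outcome.count("H")

# p(x) = (number of outcomes with X = x) / (total number of outcomes)
counts = Counter(X(outcome) for outcome in sample_space)
pmf = {x: Fraction(c, len(sample_space)) for x, c in sorted(counts.items())}

print(pmf)
```

The values of a probability mass function always sum to 1, which is a quick sanity check on any table of this kind.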

The Cumulative Distribution Function of a Discrete Random Variable

Mean and Variance for Discrete Random Variables

• Example 1: A certain industrial process is brought down for recalibration whenever the quality of the items produced falls below specifications. Let X represent the number of times the process is recalibrated during a week, and assume that X has the following probability mass function.

• Find the mean of X.
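The mean of a discrete random variable is the probability-weighted sum of its values, μX = Σ x p(x), and the variance is σ²X = Σ (x − μX)² p(x), which also equals Σ x² p(x) − μX². A minimal sketch of these standard formulas follows. The pmf in it is hypothetical, chosen only for illustration so that its mean is 1.30; it is not claimed to be the table from the slide.

```python
from math import isclose, sqrt

# Hypothetical pmf for illustration (x -> p(x)); NOT the table from the slide.
pmf = {0: 0.35, 1: 0.25, 2: 0.20, 3: 0.15, 4: 0.05}

def mean(pmf):
    """mu = sum of x * p(x)."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """sigma^2 = sum of (x - mu)^2 * p(x)."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

def variance_alt(pmf):
    """Equivalent form: sum of x^2 * p(x), minus mu^2."""
    return sum(x * x * p for x, p in pmf.items()) - mean(pmf) ** 2

print(round(mean(pmf), 4), round(variance(pmf), 4), round(sqrt(variance(pmf)), 4))
```

Both variance functions return the same value up to floating-point rounding; the second form is usually quicker to evaluate by hand.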


Population standard deviation


• Example 2: Find the variance and standard deviation for the random variable X described below, representing the number of times a process is recalibrated.

• Solution: In Example 1 we computed the mean of X to be μX = 1.30. We compute the variance by using Equation (2.30):

• Example 3: Use the alternate formula, Equation (2.31), to compute the variance of X, the number of times a process is recalibrated.

• Solution: In Example 1 the mean was computed to be μX = 1.30. The variance is therefore:

Continuous Random Variables

• A random variable is continuous if its probabilities are given by areas under a curve. The curve is called a probability density function for the random variable.
• The probability density function is sometimes called the probability distribution.
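"Area under a curve" can be made concrete with a numerical integral. A minimal sketch, assuming a hypothetical density f(x) = 2x on [0, 1] (chosen for illustration, not taken from the slides) and a simple midpoint rule:

```python
def pdf(x):
    """Hypothetical density: f(x) = 2x on [0, 1], and 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def probability(a, b, n=10_000):
    """Approximate P(a < X < b) as the area under pdf, using the midpoint rule."""
    width = (b - a) / n
    return sum(pdf(a + (i + 0.5) * width) for i in range(n)) * width

print(round(probability(0.0, 1.0), 6))   # total area under the density
print(round(probability(0.5, 1.0), 6))   # P(0.5 < X < 1)
```

For this density the exact answers are 1 and 0.75, since the area under 2x from a to b is b² − a².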

Computing Probabilities with the Probability Density Function



Example

The Cumulative Distribution Function of a Continuous Random Variable

Mean and Variance for Continuous Random Variables


The Population Median and Percentiles

Chebyshev's Inequality

The probability that a random variable differs from its mean by k standard deviations or more is never greater than 1/k².
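Chebyshev's inequality, P(|X − μ| ≥ kσ) ≤ 1/k², can be checked exactly for any discrete distribution. A minimal sketch, assuming a hypothetical pmf chosen for illustration; the bound holds for every distribution, although it is often far from tight:

```python
from math import sqrt

# Hypothetical pmf for illustration (x -> p(x)).
pmf = {0: 0.35, 1: 0.25, 2: 0.20, 3: 0.15, 4: 0.05}
mu = sum(x * p for x, p in pmf.items())
sigma = sqrt(sum((x - mu) ** 2 * p for x, p in pmf.items()))

def tail_probability(k):
    """Exact P(|X - mu| >= k * sigma) for the pmf above."""
    return sum(p for x, p in pmf.items() if abs(x - mu) >= k * sigma)

# Each exact tail probability should not exceed the Chebyshev bound 1/k^2.
for k in (1.5, 2, 3):
    print(k, round(tail_probability(k), 4), "<=", round(1 / k**2, 4))
```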

Exercises

Q1. Computer chips often contain surface imperfections. For a certain type of computer chip, the probability mass function of the number of defects X is presented in the following table.

a. Find P(X ≤ 2).
b. Find P(X > 1).
c. Find μX.
d. Find σ²X.

Q2. A chemical supply company ships a certain solvent in 10-gallon drums. Let X represent the number of drums ordered by a randomly chosen customer. Assume X has the following probability mass function:

a. Find the mean number of drums ordered.
b. Find the variance of the number of drums ordered.
c. Find the standard deviation of the number of drums ordered.
d. Let Y be the number of gallons ordered. Find the probability mass function of Y.
e. Find the mean number of gallons ordered.
f. Find the variance of the number of gallons ordered.
g. Find the standard deviation of the number of gallons ordered.

Linear Functions of Random Variables

• Means of Linear Combinations of Random Variables

Means of Linear Combinations of Random Variables


Independent Random Variables


Variances of Linear Combinations of Independent Random Variables

The Mean and Variance of a Sample Mean

Exercises

Q3. If X and Y are independent random variables with means μX = 9.5 and μY = 6.8, and standard deviations σX = 0.4 and σY = 0.1, find the means and standard deviations of the following:
a. 3X
b. Y − X
c. X + 4Y

Jointly Distributed Random Variables

• When two or more random variables are associated with each item in a population, the random variables are said to be jointly distributed. If all the random variables are discrete, they are said to be jointly discrete. If all the random variables are continuous, they are said to be jointly continuous.

Jointly Discrete Random Variables

Joint probability mass function

Both X and Y are discrete, so X and Y are jointly discrete.

Probability?

Example

• Find the probability that a CD cover has a length of 129 mm.

Jointly Continuous Random Variables

More than Two Random Variables

Means of Functions of Random Variables

Conditional Distributions

Conditional probability density function

Conditional Expectation

• Expectation is another term for mean. A conditional expectation is an expectation, or mean, calculated using a conditional probability mass function or conditional probability density function. The conditional expectation of Y given X = x is denoted E(Y | X = x) or μY|X=x.
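For jointly discrete variables, the conditional expectation can be computed in two steps: form the conditional probability mass function pY|X(y | x) = p(x, y) / pX(x), then take its mean. A minimal sketch, assuming this standard definition and a hypothetical joint pmf; the table values and function names below are illustrative only.

```python
# Hypothetical joint pmf p(x, y) for illustration; the probabilities sum to 1.
joint = {
    (0, 0): 0.10, (0, 1): 0.20,
    (1, 0): 0.30, (1, 1): 0.40,
}

def conditional_pmf_y_given_x(joint, x):
    """p(y | x) = p(x, y) / pX(x), where pX is the marginal pmf of X."""
    p_x = sum(p for (xi, _), p in joint.items() if xi == x)
    return {y: p / p_x for (xi, y), p in joint.items() if xi == x}

def conditional_expectation(joint, x):
    """E(Y | X = x) = sum over y of y * p(y | x)."""
    return sum(y * p for y, p in conditional_pmf_y_given_x(joint, x).items())

print(conditional_expectation(joint, 0))
print(conditional_expectation(joint, 1))
```

With this table, E(Y | X = 0) = 0.20/0.30 and E(Y | X = 1) = 0.40/0.70, so the conditional mean of Y changes with the observed value of X.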

Independent Random Variables

Covariance

When two random variables are not independent, it is useful to have a measure of the strength of the relationship between them. The population covariance is a measure of a certain type of relationship known as a linear relationship. We will usually drop the term “population,” and refer simply to the covariance.

(a) A random sample of points from a population with positive covariance.
(b) A random sample of points from a population with negative covariance.
(c) A random sample of points from a population with covariance near 0.

Correlation

• The population correlation is a unitless measure of the strength of a linear relationship.
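Covariance and correlation can both be computed from a joint probability mass function. A minimal sketch, assuming the standard definitions Cov(X, Y) = E[(X − μX)(Y − μY)] and ρ = Cov(X, Y) / (σX σY), and a hypothetical joint pmf chosen for illustration:

```python
from math import sqrt

# Hypothetical joint pmf p(x, y) for illustration; the probabilities sum to 1.
joint = {
    (0, 0): 0.4, (0, 1): 0.1,
    (1, 0): 0.1, (1, 1): 0.4,
}

def expect(g):
    """E[g(X, Y)] under the joint pmf."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

mu_x, mu_y = expect(lambda x, y: x), expect(lambda x, y: y)
cov = expect(lambda x, y: (x - mu_x) * (y - mu_y))
sd_x = sqrt(expect(lambda x, y: (x - mu_x) ** 2))
sd_y = sqrt(expect(lambda x, y: (y - mu_y) ** 2))
rho = cov / (sd_x * sd_y)   # unitless, always between -1 and 1

print(round(cov, 4), round(rho, 4))
```

Rescaling X (for example, changing its units) rescales the covariance but leaves the correlation unchanged, which is what "unitless" means here.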

Covariance, Correlation, and Independence

Linear Combinations of Random Variables


The Mean and Variance of a Sample Mean

Exercises

Q4. In a certain community, levels of air pollution may exceed federal standards for ozone or for particulate matter on some days. For a particular summer season, let X be the number of days on which the ozone standard is exceeded and let Y be the number of days on which the particulate matter standard is exceeded. Assume that the joint probability mass function of X and Y is given in the following table:

a. Find P(X = 1 and Y = 2).
b. Find P(X > 0 and Y ≤ 1).
c. Find P(X ≤ 1).
d. Find P(Y > 0).
e. Find the probability that the standard for ozone is exceeded at least once.
f. Find the probability that the standard for particulate matter is never exceeded.
g. Find the probability that neither standard is ever exceeded.

Q5. Refer to Q4.

a. Find the conditional probability mass function pY|X(y | 1).
b. Find the conditional probability mass function pX|Y(x | 1).
c. Find the conditional expectation E(Y | X = 1).
d. Find the conditional expectation E(X | Y = 1).

Commonly used distributions

Bernoulli Distribution

• The probability of success is denoted by p. The probability of failure is therefore 1 − p. Such a trial is called a Bernoulli trial with success probability p.
• For any Bernoulli trial, we define a random variable X as follows: If the experiment results in success, then X = 1. Otherwise X = 0. It follows that X is a discrete random variable, with probability mass function p(x) defined by
  p(0) = P(X = 0) = 1 − p
  p(1) = P(X = 1) = p
  p(x) = 0 for any value of x other than 0 or 1
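The three cases of the Bernoulli probability mass function translate directly into code. A minimal sketch; the function name bernoulli_pmf and the range check on p are illustrative choices, not part of the slides:

```python
def bernoulli_pmf(x, p):
    """p(x) for X ~ Bernoulli(p): p at x = 1, 1 - p at x = 0, and 0 elsewhere."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if x == 1:
        return p
    if x == 0:
        return 1.0 - p
    return 0.0

print(bernoulli_pmf(1, 0.8), bernoulli_pmf(0, 0.8), bernoulli_pmf(2, 0.8))
```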

• The random variable X is said to have the Bernoulli distribution with parameter p. The notation is X ∼ Bernoulli(p). Figure 4.1 [1] presents probability histograms for the Bernoulli(0.5) and Bernoulli(0.8) probability mass functions.

Examples

• [1] 4.1. A coin has probability 0.5 of landing heads when tossed. Let X = 1 if the coin comes up heads, and X = 0 if the coin comes up tails. What is the distribution of X?
  Sol: Since X = 1 when heads comes up, heads is the success outcome. The success probability, P(X = 1), is equal to 0.5. Therefore X ∼ Bernoulli(0.5).
• [1] 4.2. A die has probability 1/6 of coming up 6 when rolled. Let X = 1 if the die comes up 6, and X = 0 otherwise. What is the distribution of X?
  Sol: The success probability is p = P(X = 1) = 1/6. Therefore X ∼ Bernoulli(1/6).

Exercises

Q6. When a certain glaze is applied to a ceramic surface, the probability is 5% that there will be discoloration, 20% that there will be a crack, and 23% that there will be either discoloration or a crack, or both. Let X = 1 if there is discoloration, and let X = 0 otherwise. Let Y = 1 if there is a crack, and let Y = 0 otherwise. Let Z = 1 if there is either discoloration or a crack, or both, and let Z = 0 otherwise.
a. Let pX denote the success probability for X. Find pX.
b. Let pY denote the success probability for Y. Find pY.
c. Let pZ denote the success probability for Z. Find pZ.
d. Is it possible for both X and Y to equal 1?
e. Does pZ = pX + pY?
f. Does Z = X + Y? Explain.

The Binomial Distribution

Examples

• [1] 4.5. A fair coin is tossed 10 times. Let X be the number of heads that appear. What is the distribution of X?
  Sol: There are 10 independent Bernoulli trials, each with success probability p = 0.5. The random variable X is equal to the number of successes in the 10 trials. Therefore X ∼ Bin(10, 0.5).
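Once the distribution is identified as Bin(10, 0.5), individual probabilities can be evaluated numerically. A minimal sketch, assuming the standard binomial probability mass function p(x) = C(n, x) p^x (1 − p)^(n − x) for x = 0, 1, ..., n; the function name binomial_pmf is illustrative:

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p); 0 unless x is an integer from 0 to n."""
    if not isinstance(x, int) or not 0 <= x <= n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Distribution of the number of heads in 10 tosses of a fair coin.
for x in range(11):
    print(x, round(binomial_pmf(x, 10, 0.5), 4))
```

The eleven probabilities sum to 1, and the most likely count is 5 heads, with probability 252/1024.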

Probability Mass Function of a Binomial Random Variable

Examples

• [1] 4.7. Find the probability mass function of the random variable X if X ∼ Bin(10, 0.4). Find P(X = 5).
  Sol: We use Equation (4.4) with n = 10 and p = 0.4. The probability mass function is p(x) = C(10, x) (0.4)^x (0.6)^(10 − x) for x = 0, 1, ..., 10, and p(x) = 0 otherwise, where C(10, x) = 10!/[x!(10 − x)!]. Therefore P(X = 5) = C(10, 5) (0.4)^5 (0.6)^5 = 0.2007.

• A Binomial Random Variable Is a Sum of Bernoulli Random Variables
• The Mean and Variance of a Binomial Random Variable

Exercises

Q7. Find the following probabilities:

a. P(X = 3) when X ∼ Bin(5, 0.2)
b. P(X ≤ 2) when X ∼ Bin(10, 0.6)
c. P(X ≥ 5) when X ∼ Bin(9, 0.5)
d. P(3 ≤ X ≤ 4) when X ∼ Bin(8, 0.8)


Poisson Distribution
Examples

• [1] 4.15. If X ∼ Poisson(3), compute P(X = 2), P(X = 10), P(X = 0), P(X = −1), and P(X = 0.5).
  Sol: Using the probability mass function (4.9), with λ = 3, we obtain
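With λ = 3, the standard Poisson probability mass function is p(x) = e^(−3) 3^x / x! when x is a non-negative integer, and p(x) = 0 for any other x, so the last two probabilities are 0. A minimal sketch that evaluates all five values, assuming this standard form; the function name poisson_pmf is illustrative:

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam); 0 unless x is a non-negative integer."""
    if not isinstance(x, int) or x < 0:
        return 0.0
    return exp(-lam) * lam**x / factorial(x)

for x in (2, 10, 0, -1, 0.5):
    print(x, round(poisson_pmf(x, 3), 4))
```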

Exercises

Q8. Let X ∼ Poisson(4). Find

a. P(X = 1)
b. P(X = 0)
c. P(X < 2)
d. P(X > 1)
e. μX
f. σX
Reading

• [1] 6
• [2] 5
• [3] 9, 10; [4] 9, 10; [5] 10
