McGraw-Hill/Irwin Copyright © 2013 by The McGraw-Hill Companies, Inc. All rights reserved.
Chapter 1 Learning Objectives (LOs)
1-2
Tween Survey
Survey questions asked to 20 tweens:
Q1. Which radio station was playing on
on a scale of 1 to 4.
Q3. What time should the main dining area
close?
Q4. How much of your own money did you
1-3
Tween Survey
Here are the survey responses from the 20 tweens.
1-4
Tween Survey
1-5
1.1 The Relevance of Statistics
LO 1.1 Describe the importance of statistics.
1-6
LO 1.1 1.1 The Relevance of Statistics
Example 1. Headline of newspaper states
'What global warming?' after record
amounts of snow in 2010.
1-7
LO 1.1 1.1 The Relevance of Statistics
1-8
LO 1.1 1.1 The Relevance of Statistics
Example 3. A Boston Globe poll reported a
15-point lead for Martha Coakley in the
election for U.S. senator for Massachusetts,
implying an easy win for Coakley. Nine days
later, Scott Brown wins.
1-9
LO 1.1 1.1 The Relevance of Statistics
Example 4. The CFO of Starbucks Corp. claims
that business is picking up since sales at stores
open at least a year climbed 4% in the quarter
ended December 27, 2009.
1-10
LO 1.1 1.1 The Relevance of Statistics
1-11
1.2 What Is Statistics?
LO 1.2 Differentiate between descriptive statistics and inferential statistics.
1-12
LO 1.2 1.2 What Is Statistics?
Two branches of statistics
Descriptive Statistics
collecting, organizing, and presenting the
data.
Inferential Statistics
drawing conclusions about a population
based on sample data from that population.
1-13
LO 1.2 1.2 What Is Statistics?
Population
Consists of all items of interest.
Sample
A subset of the population.
A sample statistic is calculated from the
sample data and is used to make inferences
about the population parameter.
1-14
The Need for Sampling
LO 1.3 Explain the need for sampling and discuss various data types.
1-15
LO 1.3 Types of Data
Cross-sectional data
Data collected by recording a characteristic of
many subjects at the same point in time, or without
regard to differences in time.
Subjects might include individuals, households,
firms, industries, regions, and countries.
The survey data from the Introductory Case is an
example of cross-sectional data.
1-16
LO 1.3 Types of Data
Time series data
Data collected by recording a characteristic of a
subject over several time periods.
Data can include daily, weekly, monthly, quarterly,
or annual observations.
This graph plots the U.S.
GDP growth rate from
1980 to 2010 - it is an
example of time series
data.
1-17
LO 1.3 Getting Started on the Web
There is an abundance of data on the
Internet. Here are a few websites for data.
1-18
1.3 Variables and Scales of Measurement
LO 1.4 Describe variables and various types of measurement scales.
Continuous
1-19
LO 1.4 1.3 Variables and Scales of Measurement
Types of Quantitative Variables
Discrete
A discrete variable assumes a
countable number of distinct values.
Examples: Number of children in a
family, number of points scored in a
basketball game.
1-20
LO 1.4 1.3 Variables and Scales of Measurement
Types of Quantitative Variables
Continuous
A continuous variable can assume an
infinite number of values within some
interval.
Examples: Weight, height, investment
return.
1-21
LO 1.4 1.3 Variables and Scales of Measurement
Scales of Measurement
Qualitative Variables: Nominal, Ordinal
Quantitative Variables: Interval, Ratio
1-22
LO 1.4 1.3 Variables and Scales of Measurement
The Nominal Scale
The least sophisticated level of measurement.
Data are simply categories for grouping the data.
1-23
LO 1.4 1.3 Variables and Scales of Measurement
1-24
LO 1.4 1.3 Variables and Scales of Measurement
1-25
LO 1.4 1.3 Variables and Scales of Measurement
Example: Tweens Survey
How are the data based on the ratings of the food quality
similar to or different from the radio station data?
1-26
LO 1.4 1.3 Variables and Scales of Measurement
The Interval Scale
Data may be categorized and ranked with respect
to some characteristic or trait.
Differences between interval values are equal and
meaningful. Thus the arithmetic operations of
addition and subtraction are meaningful.
No "absolute 0" or starting point defined.
Meaningful ratios may not be obtained.
1-27
LO 1.4 1.3 Variables and Scales of Measurement
The Interval Scale
For example, consider the Fahrenheit
scale of temperature.
This scale is interval because the data
are ranked and differences (+ or −)
may be obtained.
But there is no "absolute 0" (What
does 0°F mean?)
What does the ratio 80°F / 40°F mean?
1-28
LO 1.4 1.3 Variables and Scales of Measurement
The Ratio Scale
The strongest level of measurement.
Ratio data may be categorized and ranked with
respect to some characteristic or trait.
Differences between interval values are equal and
meaningful.
There is an "absolute 0" or defined starting point.
"0" does mean "the absence of …" Thus,
meaningful ratios may be obtained.
1-29
LO 1.4 1.3 Variables and Scales of Measurement
The Ratio Scale
The following variables are measured on a ratio
scale:
General Examples: Weight, Time, and Distance
Business Examples: Sales, Profits, and Inventory
Levels
1-30
LO 1.4 1.3 Variables and Scales of Measurement
Example: Tweens Survey
How are the time data classified? In what ways do the time
data differ from ordinal data? What is a potential weakness
of this measurement scale?
1-31
LO 1.4 1.3 Variables and Scales of Measurement
Example: Tweens Survey
What is the measurement scale of the money data? Why is
it considered the most sophisticated form of data?
1-32
Synopsis of Tween Survey
60% of the tweens listened to KISS108. The resort
may want to direct its advertising dollars to this
station.
55% of the tweens felt that the food was, at best,
fair.
95% of the tweens would like the dining area to
remain open later.
85% of the tweens spent their own money at the
lodge.
1-33
Business Statistics: Communicating with Numbers
By Sanjiv Jaggia and Alison Kelly
Chapter 2 Learning Objectives (LOs)
LO 2.1: Summarize qualitative data by forming
frequency distributions.
LO 2.2: Construct and interpret pie charts and bar
charts.
LO 2.3: Summarize quantitative data by forming
frequency distributions.
LO 2.4: Construct and interpret histograms, polygons,
and ogives.
LO 2.5: Construct and interpret a stem-and-leaf
diagram.
LO 2.6: Construct and interpret a scatterplot.
1-35
House Prices in Southern California
A relocation specialist for a real estate firm in
Mission Viejo, CA gathers recent house sales
data for a client from Seattle, WA.
The table below shows the sale price (in
$1,000s) for 36 single-family houses.
1-36
House Prices in Southern California
Use the sample information to:
1-37
2.1 Summarizing Qualitative Data
LO 2.1 Summarize qualitative data by forming frequency distributions.
A frequency distribution for qualitative data
groups data into categories and records how many
observations fall into each category.
Weather conditions in Seattle, WA during
February 2010.
1-38
LO 2.1 2.1 Summarizing Qualitative Data
Categories: Rainy, Sunny, or Cloudy.
For each category's frequency, count the days
that fall in that category.
Calculate relative frequency by dividing each
category's frequency by the sample size.
Weather   Frequency   Relative Frequency
Cloudy        1       1/28 = 0.036
Rainy        20       20/28 = 0.714
Sunny         7       7/28 = 0.250
Total        28       28/28 = 1.000
1-39
LO 2.1 2.1 Summarizing Qualitative Data
To express relative frequencies in terms of
percentages, multiply each proportion by 100%.
Weather   Frequency   Relative Frequency   Percentage
Cloudy        1       1/28 = 0.036         0.036 x 100 = 3.6%
Rainy        20       20/28 = 0.714        0.714 x 100 = 71.4%
Sunny         7       7/28 = 0.250         0.250 x 100 = 25.0%
Total        28       28/28 = 1.000        1.000 x 100 = 100%
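The frequency table can be reproduced with a few lines of code. The following is a minimal sketch, not part of the original slides; the list `days` is rebuilt from the counts in the table (1 cloudy, 20 rainy, 7 sunny), and the function name `relative_frequencies` is a hypothetical helper.

```python
from collections import Counter

# Observations rebuilt from the table: 1 cloudy, 20 rainy, 7 sunny days.
days = ["Cloudy"] * 1 + ["Rainy"] * 20 + ["Sunny"] * 7


def relative_frequencies(observations):
    """Map each category to (frequency, relative frequency)."""
    counts = Counter(observations)
    n = len(observations)
    return {category: (count, count / n) for category, count in counts.items()}


weather_table = relative_frequencies(days)
for category, (count, rel) in sorted(weather_table.items()):
    # Columns: category, frequency, relative frequency, percentage.
    print(f"{category:<7}{count:>4}{rel:>8.3f}{rel * 100:>7.1f}%")
```

The relative frequencies always sum to 1 because every observation falls into exactly one category.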
1-40
LO 2.2 2.1 Summarizing Qualitative Data
A pie chart is a segmented circle whose segments
portray the relative frequencies of the categories of
some qualitative variable.
In this example,
the variable
Region is
proportionally
divided into
4 parts.
1-41
LO 2.2 2.1 Summarizing Qualitative Data
1-42
2.2 Summarizing Quantitative Data
LO 2.3 Summarize quantitative data by forming frequency distributions.
1-43
LO 2.3 2.2 Summarizing Quantitative Data
The number of classes usually ranges from 5
to 20.
1-44
LO 2.3 2.2 Summarizing Quantitative Data
The raw data from the Introductory Case has been
converted into a frequency distribution in the
following table.
Class (in $1000s) Frequency
300 up to 400 4
400 up to 500 11
500 up to 600 14
600 up to 700 5
700 up to 800 2
Total 36
1-45
LO 2.3 2.2 Summarizing Quantitative Data
Class (in $1000s) Frequency
300 up to 400 4
400 up to 500 11
500 up to 600 14
600 up to 700 5
700 up to 800 2
Total 36
1-46
LO 2.3 2.2 Summarizing Quantitative Data
A cumulative frequency distribution specifies how
many observations fall below the upper limit of a
particular class.
1-47
LO 2.3 2.2 Summarizing Quantitative Data
A relative frequency distribution identifies the
proportion or fraction of values that fall into each
class.
Class relative frequency = Class frequency / Total number of observations
1-48
LO 2.3 2.2 Summarizing Quantitative Data
Here are the relative frequency and the cumulative
relative frequency distributions for the house-price
data.
Class (in $1000s)   Frequency   Relative Frequency   Cumulative Relative Frequency
300 up to 400        4          4/36 = 0.11          0.11
400 up to 500       11          11/36 = 0.31         0.11 + 0.31 = 0.42
500 up to 600       14          14/36 = 0.39         0.11 + 0.31 + 0.39 = 0.81
600 up to 700        5          5/36 = 0.14          0.11 + 0.31 + 0.39 + 0.14 = 0.95
700 up to 800        2          2/36 = 0.06          0.11 + 0.31 + 0.39 + 0.14 + 0.06 ≈ 1.0
Total               36          1.0
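As an illustration, the relative and cumulative relative frequencies for the house-price classes can be computed directly from the class counts. This is a sketch under one stated design choice: it cumulates the raw counts before dividing, so the fourth class comes out as 34/36 (about 0.94) rather than the 0.95 obtained by adding already-rounded proportions. Variable names are hypothetical.

```python
from itertools import accumulate

classes = ["300 up to 400", "400 up to 500", "500 up to 600",
           "600 up to 700", "700 up to 800"]
frequencies = [4, 11, 14, 5, 2]  # class counts from the slide

total = sum(frequencies)
relative = [f / total for f in frequencies]
# Cumulate counts first, then divide, to avoid accumulating rounding error.
cumulative = [c / total for c in accumulate(frequencies)]

for label, f, rel, cum in zip(classes, frequencies, relative, cumulative):
    print(f"{label:<15}{f:>4}{rel:>8.2f}{cum:>8.2f}")
```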
1-49
LO 2.3 2.2 Summarizing Quantitative Data
Use the data on the previous slide to answer the
following two questions.
1-50
2.2 Summarizing Quantitative Data
LO 2.4 Construct and interpret histograms, polygons, and ogives.
Histograms
Polygons
Ogives
1-51
LO 2.4 2.2 Summarizing Quantitative Data
A histogram is a visual representation of a
frequency or a relative frequency distribution.
1-52
LO 2.4 2.2 Summarizing Quantitative Data
1-53
LO 2.4 2.2 Summarizing Quantitative Data
Shape of Distribution: typically symmetric or
skewed
Symmetric—mirror image on both sides of its
center.
Symmetric Distribution
1-54
LO 2.4 2.2 Summarizing Quantitative Data
Skewed distribution
Positively skewed - data
form a long, narrow tail
to the right.
Negatively skewed -
data form a long,
narrow tail to the left.
1-55
LO 2.4 2.2 Summarizing Quantitative Data
A polygon is a visual representation of a
frequency or a relative frequency distribution.
1-57
LO 2.4 2.2 Summarizing Quantitative Data
An ogive is a visual representation of a
cumulative frequency or a cumulative
relative frequency distribution.
1-58
LO 2.4 2.2 Summarizing Quantitative Data
Here is an ogive for the house-price data.
1-59
2.3 Stem-and-Leaf Diagrams
LO 2.5 Construct and interpret a stem-and-leaf diagram.
1-60
LO 2.5 2.3 Stem-and-Leaf Diagrams
The following data set shows the wealthiest
people in the world and their associated ages.
The leftmost digit is the stem while the last digit is
the leaf as shown here.
Age = 36
1-61
2.4 Scatterplots
LO 2.6 Construct and interpret a scatterplot.
1-62
LO 2.6 2.4 Scatterplots
Linear relationship: upward or downward-
sloping trend of the data.
Positive linear
relationship (shown
here): as x increases, so
does y.
Negative linear
relationship: as x
increases, y decreases.
1-63
LO 2.6 2.4 Scatterplots
Curvilinear relationship
As x increases,
y increases at an
increasing (or
decreasing) rate.
As x increases y
decreases, at an
increasing (or
decreasing) rate.
1-64
LO 2.6 2.4 Scatterplots
No relationship: data are randomly scattered
with no discernible pattern.
1-65
LOs 2.1, 2.2, and 2.4 Some Excel Commands
Pie chart or Bar chart: select the relevant
categorical names with respective data, then
choose Insert > Pie > 2-D Pie or Insert > Bar > 2-D
Bar.
1-66
Chapter 3 Learning Objectives (LOs)
1-68
Chapter 3 Learning Objectives (LOs)
LO 3.5: Explain mean-variance analysis and
the Sharpe ratio.
LO 3.6: Apply Chebyshev's Theorem and the
empirical rule.
LO 3.7: Calculate the mean and the variance
for grouped data.
LO 3.8: Calculate and interpret the covariance
and the correlation coefficient.
1-69
Investment Decision
As an investment counselor at a large bank,
Rebecca Johnson was asked by an
inexperienced investor to explain the differences
between two top-performing mutual funds:
Vanguard's Precious Metals and Mining fund
(Metals)
Fidelity's Strategic Income Fund (Income)
The investor has collected sample returns for
these two funds for years 2000 through 2009.
These data are presented in the next slide.
1-70
Investment Decision
1-71
3.1 Measures of Central Location
LO 3.1 Calculate and interpret the arithmetic mean, the median, and the mode.
Sample Mean: $\bar{x} = \frac{\sum x_i}{n}$
Population Mean: $\mu = \frac{\sum x_i}{N}$
1-72
LO 3.1 3.1 Measures of Central Location
Example: Investment Decision
Use the data in the introductory case to calculate and
interpret the mean return of the Metals fund and the
mean return of the Income fund.
1-73
LO 3.1 3.1 Measures of Central Location
The mean is sensitive to outliers.
Consider the salaries of employees at Acetech.
1-74
LO 3.1 3.1 Measures of Central Location
The median is another measure of central location
that is not affected by outliers.
1-75
LO 3.1 3.1 Measures of Central Location
Consider the sorted salaries of employees at
Acetech (odd number).
3 values below 3 values above
Median = 90,000
1-76
LO 3.1 3.1 Measures of Central Location
The mode is another measure of central location.
The most frequently occurring value in a data set
1-77
3.2 Percentiles and Box Plots
LO 3.2 Calculate and interpret percentiles and a box plot.
In general, the pth percentile divides a data set into
two parts:
Approximately p percent of the observations have
values less than the pth percentile;
Approximately (100 − p) percent of the observations have values greater than the pth percentile.
1-78
LO 3.2 3.2 Percentiles and Box Plots
Calculating the pth percentile:
First arrange the data in ascending order.
1-79
LO 3.2 3.2 Percentiles and Box Plots
Consider the sorted data from the introductory case.
1-80
LO 3.2 3.2 Percentiles and Box Plots
Calculating the pth percentile
Once you find Lp, observe whether or not it is an
integer.
If Lp is an integer, then the Lpth observation in the
sorted data set is the pth percentile.
If Lp is not an integer, then interpolate between
two corresponding observations to approximate
the pth percentile.
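The interpolation rule can be sketched in code. The slides do not show the formula for the location Lp; the sketch below assumes Lp = (n + 1)p/100, which is consistent with L25 = 2.75 and L75 = 8.25 for a sample of 10 observations. The data values and the function name `percentile` are hypothetical.

```python
import math


def percentile(sorted_data, p):
    """Return the pth percentile of data already sorted in ascending order.

    Assumes the location formula L_p = (n + 1) * p / 100 (an assumption,
    not shown in the slides) and interpolates when L_p is not an integer.
    """
    n = len(sorted_data)
    location = (n + 1) * p / 100
    if location < 1 or location > n:
        raise ValueError("percentile location falls outside the data")
    lower = math.floor(location)
    fraction = location - lower
    if fraction == 0:
        # L_p is an integer: take the L_p-th observation (1-based).
        return sorted_data[lower - 1]
    below = sorted_data[lower - 1]
    above = sorted_data[lower]
    return below + fraction * (above - below)


# Hypothetical sorted sample of 10 observations.
sample = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
q1 = percentile(sample, 25)  # location 2.75
q2 = percentile(sample, 50)  # location 5.5
q3 = percentile(sample, 75)  # location 8.25
print(q1, q2, q3)
```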
1-81
LO 3.2 3.2 Percentiles and Box Plots
Both L25 = 2.75 and L75 = 8.25 are not integers, thus
1-82
LO 3.2 3.2 Percentiles and Box Plots
A box plot allows you to:
Graphically display the distribution of a data set.
[Figure: box plot showing the box, the whiskers, and outliers marked with asterisks]
1-83
LO 3.2 3.2 Percentiles and Box Plots
The box plot displays 5 summary values:
S = smallest value
L = largest value
Q1 = first quartile = 25th percentile
Q2 = median = second quartile = 50th percentile
Q3 = third quartile = 75th percentile
1-84
LO 3.2 3.2 Percentiles and Box Plots
Using the results obtained from the Metals fund
data, we can label the box plot with the 5
summary values:
[Figure: box plot labeled, from left to right, with the smallest value, first quartile, second quartile, third quartile, and largest value]
1-85
3.2 Percentiles and Box Plots
Detecting outliers
Calculate IQR = 43.48
Calculate 1.5 × IQR, or 1.5 × 43.48 = 65.22
L – Q3 > 65.22
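A small sketch of the outlier check follows. The slide shows only the upper-end comparison (L − Q3 against 1.5 × IQR); the lower-end comparison (Q1 − S) is included here as the assumed symmetric counterpart. The five-number summary passed in is hypothetical, as is the function name `outlier_flags`.

```python
def outlier_flags(smallest, q1, q3, largest):
    """Flag possible outliers using the 1.5 x IQR rule."""
    iqr = q3 - q1
    threshold = 1.5 * iqr
    return {
        "iqr": iqr,
        "threshold": threshold,
        # Lower-end check is an assumption mirroring the slide's upper check.
        "low_outliers": (q1 - smallest) > threshold,
        "high_outliers": (largest - q3) > threshold,
    }


# Hypothetical summary values, for illustration only.
result = outlier_flags(smallest=-50.0, q1=0.0, q3=40.0, largest=120.0)
print(result)

# The threshold quoted on the slide: 1.5 x 43.48.
slide_threshold = 1.5 * 43.48
print(round(slide_threshold, 2))
```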
1-86
3.3 The Geometric Mean
LO 3.3 Calculate and interpret a geometric mean return and an
average growth rate.
1-87
LO 3.3 3.3 The Geometric Mean
1-88
LO 3.3 3.3 The Geometric Mean
1-89
LO 3.3 3.3 The Geometric Mean
Computing an average growth rate
For growth rates g1, g2, . . . , gn, the average growth rate
Gg is calculated as
1-90
LO 3.3 3.3 The Geometric Mean
For example, consider the sales for Adidas (in
millions of €) for the years 2005 through 2009:
1-91
3.4 Measures of Dispersion
LO 3.4 Calculate and interpret the range, the mean absolute deviation,
the variance, the standard deviation, and the coefficient of variation.
1-92
LO 3.4 3.4 Measures of Dispersion
Range
Range = Maximum Value − Minimum Value
It is the simplest measure.
It focuses on extreme values.
1-93
LO 3.4 3.4 Measures of Dispersion
Mean Absolute Deviation (MAD)
MAD is an average of the absolute difference of
each observation from the mean.
Sample MAD = $\frac{\sum |x_i - \bar{x}|}{n}$
Population MAD = $\frac{\sum |x_i - \mu|}{N}$
1-94
3.4 Measures of Dispersion
Calculate MAD using the data from the Metals
fund.
1-95
LO 3.4 3.4 Measures of Dispersion
Variance and standard deviation
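Sample variance: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$
Sample standard deviation: $s = \sqrt{s^2}$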
1-96
3.4 Measures of Dispersion
Calculate the variance and the standard
deviation using the data from the Metals fund.
1-97
LO 3.4 3.4 Measures of Dispersion
Coefficient of variation (CV)
CV adjusts for differences in the magnitudes of
the means.
CV is unitless, allowing easy comparisons of
mean-adjusted dispersion across different data
sets.
Sample CV = $\frac{s}{\bar{x}}$
Population CV = $\frac{\sigma}{\mu}$
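The dispersion measures in this section can be computed together. The following is a minimal sketch for sample data, assuming the usual n − 1 divisor for the sample variance; the return values in `returns` are hypothetical and are not the Metals or Income fund data.

```python
import math


def dispersion_summary(sample):
    """Compute mean, MAD, sample variance, standard deviation, and CV."""
    n = len(sample)
    mean = sum(sample) / n
    mad = sum(abs(x - mean) for x in sample) / n
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
    std_dev = math.sqrt(variance)
    return {
        "mean": mean,
        "mad": mad,
        "variance": variance,
        "std_dev": std_dev,
        "cv": std_dev / mean,  # unitless; undefined when the mean is 0
    }


# Hypothetical returns (in percent), for illustration only.
returns = [2.0, 4.0, 6.0, 8.0]
summary = dispersion_summary(returns)
print(summary)
```

Because the CV divides by the mean, it is only meaningful when the mean is nonzero.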
1-98
LO 3.4 3.4 Measures of Dispersion
Calculate the coefficient of variation (CV)
using the data from the Metals fund and the
Income fund.
Metals fund: CV
Income fund: CV
1-99
Synopsis of Investment Decision
Mean and median returns for the Metals fund are
24.65% and 33.83%, respectively.
1-100
3.5 Mean-Variance Analysis and the Sharpe Ratio
LO 3.5 Explain mean-variance analysis and the Sharpe Ratio.
Mean-variance analysis:
The performance of an asset is measured by its rate of
return.
The rate of return may be evaluated in terms of its reward
(mean) and risk (variance).
Higher average returns are often associated with higher
risk.
The Sharpe ratio uses the mean and variance to
evaluate risk.
1-101
LO 3.5 3.5 Mean-Variance Analysis and the
Sharpe Ratio
Sharpe Ratio
Measures the extra reward per unit of risk.
For an investment I, the Sharpe ratio is computed as:
$\text{Sharpe Ratio} = \frac{\bar{x}_I - \bar{R}_f}{s_I}$
where $\bar{x}_I$ is the mean return for the investment,
$\bar{R}_f$ is the mean return for a risk-free asset, and
$s_I$ is the standard deviation for the investment.
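As a sketch of the calculation, the Sharpe ratio is the excess mean return divided by the standard deviation. The inputs below are hypothetical and are not the fund figures from the case; the function name `sharpe_ratio` is also an assumption.

```python
def sharpe_ratio(mean_return, risk_free_return, std_dev):
    """Extra reward (mean return above the risk-free rate) per unit of risk."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (mean_return - risk_free_return) / std_dev


# Hypothetical inputs in percent: 10% mean return, 4% risk-free, 12% std. dev.
example_ratio = sharpe_ratio(mean_return=10.0, risk_free_return=4.0, std_dev=12.0)
print(example_ratio)
```

A higher ratio indicates more reward per unit of risk, which is how the two funds are compared in the example that follows.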
1-102
LO 3.5 3.5 Mean-Variance Analysis and the
Sharpe Ratio
Sharpe Ratio Example
Compute the Sharpe ratios for the Metals and Income
funds given the risk free return of 4%.
Since 0.56 > 0.41, the Metals fund offers more reward per
unit of risk as compared to the Income fund.
1-103
3.6 Chebyshev’s Theorem and the Empirical Rule
LO 3.6 Apply Chebyshev’s Theorem and the empirical rule.
Chebyshev’s Theorem
For any data set, the proportion of observations that lie
within k standard deviations from the mean is at least
$1 - 1/k^2$, where k is any number greater than 1.
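The bound is easy to tabulate. A minimal sketch (the function name `chebyshev_lower_bound` is hypothetical):

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2


for k in (1.5, 2, 3):
    print(k, round(chebyshev_lower_bound(k), 4))
```

For k = 2 the bound is 0.75, so at least 75% of the observations in any data set lie within two standard deviations of the mean.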
1-104
LO 3.6 3.6 Chebyshev’s Theorem and the Empirical Rule
1-105
LO 3.6 3.6 Chebyshev’s Theorem and the Empirical Rule
1-106
3.7 Summarizing Grouped Data
LO 3.7 Calculate the mean and the variance for grouped data.
When data are grouped or aggregated, we use
these formulas:
Mean: $\bar{x} = \frac{\sum m_i f_i}{n}$
Variance: $s^2 = \frac{\sum (m_i - \bar{x})^2 f_i}{n - 1}$
Standard Deviation: $s = \sqrt{s^2}$
where $m_i$ and $f_i$ are the midpoint and frequency of the ith
class, respectively.
1-107
LO 3.7 3.7 Summarizing Grouped Data
Consider the frequency distribution of house prices.
Calculate the average house price.
1-108
LO 3.7 3.7 Summarizing Grouped Data
Calculate the sample variance and the standard
deviation.
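The grouped-data formulas can be applied to the house-price frequency distribution shown earlier (classes of width 100 from 300 to 800, in $1,000s, with frequencies 4, 11, 14, 5, 2). The sketch below assumes each class midpoint is the average of its limits; variable names are hypothetical.

```python
import math

# (lower limit, upper limit, frequency) for each class, in $1,000s.
house_classes = [(300, 400, 4), (400, 500, 11), (500, 600, 14),
                 (600, 700, 5), (700, 800, 2)]

midpoints = [(lo + hi) / 2 for lo, hi, _ in house_classes]
freqs = [f for _, _, f in house_classes]
n = sum(freqs)

grouped_mean = sum(m * f for m, f in zip(midpoints, freqs)) / n
grouped_variance = sum((m - grouped_mean) ** 2 * f
                       for m, f in zip(midpoints, freqs)) / (n - 1)
grouped_std_dev = math.sqrt(grouped_variance)

print(round(grouped_mean, 2), round(grouped_variance, 2), round(grouped_std_dev, 2))
```

Under these assumptions the average house price comes out to roughly 522.2 (in $1,000s) with a standard deviation of about 103.1. These are approximations, since every observation is treated as if it sat at its class midpoint.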
1-109
LO 3.7 3.7 Summarizing Grouped Data
Weighted Mean
1-110
LO 3.7 3.7 Summarizing Grouped Data
A student scores 60 on Exam 1, 70 on Exam 2, and
80 on Exam 3. What is the student's average score
for the course if Exams 1, 2, and 3 are worth 25%,
25%, and 50% of the grade, respectively?
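The weighted mean for this example can be checked with a short sketch; the function name `weighted_mean` is hypothetical, and the weights are assumed to sum to 1.

```python
def weighted_mean(values, weights):
    """Weighted mean of values; weights are expected to sum to 1."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * x for x, w in zip(values, weights))


scores = [60, 70, 80]
weights = [0.25, 0.25, 0.50]
course_average = weighted_mean(scores, weights)
print(course_average)
```

The result, 0.25(60) + 0.25(70) + 0.50(80) = 72.5, is higher than the unweighted average of 70 because the best score carries the largest weight.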
1-111
3.8 Covariance and Correlation
LO 3.8 Calculate and interpret the covariance and the correlation coefficient.
1-112
LO 3.8 3.8 Covariance and Correlation
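The sample covariance $s_{xy}$ is computed as
$s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$
The population covariance $\sigma_{xy}$ is computed as
$\sigma_{xy} = \frac{\sum (x_i - \mu_x)(y_i - \mu_y)}{N}$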
1-113
LO 3.8 3.8 Covariance and Correlation
The sample correlation $r_{xy}$ is computed as
$r_{xy} = \frac{s_{xy}}{s_x s_y}$
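The sample covariance and correlation can be sketched as follows. The paired data are hypothetical (a perfectly linear relationship chosen so the result is easy to verify), not the Metals and Income fund returns; function names are assumptions.

```python
import math


def sample_covariance(xs, ys):
    """Sample covariance with the n - 1 divisor."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)


def sample_correlation(xs, ys):
    """Sample correlation: covariance divided by the product of standard deviations."""
    s_x = math.sqrt(sample_covariance(xs, xs))  # covariance of x with itself is its variance
    s_y = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (s_x * s_y)


# Hypothetical paired observations with y = 2x.
x_values = [1.0, 2.0, 3.0, 4.0]
y_values = [2.0, 4.0, 6.0, 8.0]
cov_xy = sample_covariance(x_values, y_values)
corr_xy = sample_correlation(x_values, y_values)
print(round(cov_xy, 4), round(corr_xy, 4))
```

The covariance depends on the units of x and y, while the correlation is unitless and always falls between −1 and 1.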
1-114
LO 3.8 3.8 Covariance and Correlation
Let's calculate the covariance and the correlation
coefficient for the Metals (x) and Income (y) funds.
Positive Relationship
1-115
LO 3.8 3.8 Covariance and Correlation
We use the following table for the calculations.
Covariance:
Correlation:
1-116
LOs 3.1, 3.4, and 3.8 Some Excel Commands
Measures of central location and dispersion: select
the relevant data, and choose Data > Data Analysis
> Descriptive Statistics.
1-117
Chapter 4 Learning Objectives (LOs)
LO 4.1: Describe fundamental probability
concepts.
LO 4.2: Formulate and explain subjective,
empirical, and a priori probabilities.
LO 4.3: Calculate and interpret the probability of
the complement of an event, the
probability that at least one of two events
will occur, and a joint probability.
LO 4.4: Calculate and interpret a conditional
probability.
1-119
Chapter 4 Learning Objectives (LOs)
1-120
Sportswear Brands
Annabel Gonzalez, chief retail analyst at marketing
firm Longmeadow Consultants, is tracking the sales
of compression gear produced by Under Armour,
Inc., Nike, Inc., and Adidas Group.
After collecting data from 600 recent purchases,
Annabel wants to determine whether age
influences brand choice.
1-121
4.1 Fundamental Probability Concepts
LO 4.1 Describe fundamental probability concepts.
1-122
LO 4.1 4.1 Fundamental Probability Concepts
An experiment is a trial that results in one of
several uncertain outcomes.
Example: Trying to assess the probability of a
snowboarder winning a medal in the ladies' halfpipe event
while competing in the Winter Olympic Games.
Solution: The athlete's attempt to predict her chances of
medaling is an experiment because the outcome is
unknown.
The athlete's competition has four possible outcomes:
gold medal, silver medal, bronze medal, and no medal.
We formally write the sample space as
S = {gold, silver, bronze, no medal}.
1-123
LO 4.1 4.1 Fundamental Probability Concepts
A sample space, denoted S, of an experiment
includes all possible outcomes of the experiment.
For example, a sample space containing letter
grades is:
S = {A, B, C, D, F}
1-124
LO 4.1 4.1 Fundamental Probability Concepts
Events are considered to be
Exhaustive
If all possible outcomes of a random experiment are
included in the events. For example, the events
"earning a medal" and "failing to earn a medal" in a
single Olympic event are exhaustive since these are the
only outcomes.
Mutually exclusive
If they do not share any common outcome of a random
experiment. For example, the events "earning a medal"
and "failing to earn a medal" in a single Olympic event
are mutually exclusive.
1-125
LO 4.1 4.1 Fundamental Probability Concepts
A Venn Diagram represents the sample
space for the event(s).
For example, this Venn Diagram illustrates the
sample space for events A and B.
[Figure: Venn diagram of events A and B]
1-126
LO 4.1 4.1 Fundamental Probability Concepts
The intersection of two events (A ∩ B) consists of all simple
events in both A and B.
The complement of event A (i.e., A^c) is the event consisting of all
simple events in the sample space S that are not in A.
[Figure: Venn diagrams illustrating A ∩ B, A ∪ B, and A^c]
1-127
LO 4.1 4.1 Fundamental Probability Concepts
Example: Recall the snowboarder's sample space
defined as S = {gold, silver, bronze, no medal}.
Given the following, find A ∪ B, A ∩ B, A ∩ C,
and B^c.
A = {gold, silver, bronze}.
B = {silver, bronze, no medal}.
C = {no medal}.
Solution:
A ∪ B = {gold, silver, bronze, no medal}. Note that there is
no double counting.
A ∩ B = {silver, bronze}. A ∩ C = ∅ (null or empty set).
B^c = {gold}.
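The same set operations can be checked with Python's built-in `set` type. This is an illustrative sketch; the variable names are hypothetical.

```python
sample_space = {"gold", "silver", "bronze", "no medal"}
event_a = {"gold", "silver", "bronze"}
event_b = {"silver", "bronze", "no medal"}
event_c = {"no medal"}

union_ab = event_a | event_b              # outcomes in A or B (no double counting)
intersection_ab = event_a & event_b       # outcomes in both A and B
intersection_ac = event_a & event_c       # empty: A and C are mutually exclusive
complement_b = sample_space - event_b     # outcomes in S that are not in B

print(union_ab == sample_space, intersection_ab, intersection_ac, complement_b)
```

An empty intersection, as with A and C, is exactly what it means for two events to be mutually exclusive.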
1-128
4.1 Fundamental Probability Concepts
LO 4.2 Formulate and explain subjective, empirical, and a priori probabilities.
Assigning Probabilities
Subjective probabilities
Draws on personal and subjective judgment.
Objective probabilities
Empirical probability: a relative frequency of
occurrence.
a priori probability: logical analysis.
1-129
LO 4.2 4.1 Fundamental Probability Concepts
Two defining properties of a probability:
The probability of any event A is a value between
0 and 1.
The sum of the probabilities of any list of mutually
exclusive and exhaustive events equals 1.
1-130
LO 4.1 4.1 Fundamental Probability Concepts
Example: Let event A represent earning a
medal:
P(A) = P({gold}) + P({silver}) + P({bronze})
= 0.10 + 0.15 + 0.20 = 0.45.
1-131
LO 4.2 4.1 Fundamental Probability Concepts
Probabilities expressed as odds.
Percentages and odds are alternative
ways of expressing probabilities.
Converting an odds ratio to a probability:
Given odds for event A occurring
of "a to b," the probability of A is:
$P(A) = \frac{a}{a + b}$
Given odds against event A occurring
of "a to b," the probability of A is:
$P(A) = \frac{b}{a + b}$
1-132
LO 4.2 4.1 Fundamental Probability Concepts
Converting a probability to an odds ratio:
The odds for event A occurring are equal to
$\frac{P(A)}{1 - P(A)}$
1-133
LO 4.2 4.1 Fundamental Probability Concepts
Example: Converting an odds ratio to a probability.
Given the odds of 2:1 for beating the Cardinals, what was
the probability of the Steelers' winning just prior to the
2009 Super Bowl?
$\frac{a}{a + b} = \frac{2}{2 + 1} = \frac{2}{3} = 0.67$
Example: Converting a probability to an odds ratio.
Given that the probability of an on-time arrival for New
York's Kennedy Airport is 0.56, what are the odds for a
plane arriving on-time at Kennedy Airport?
$\frac{P(A)}{1 - P(A)} = \frac{0.56}{1 - 0.56} = \frac{0.56}{0.44} = 1.27$ or 1.27:1
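Both conversions can be written as small helper functions. This is a sketch; the function names are hypothetical, and the inputs are the figures from the two examples.

```python
def odds_for_to_probability(a, b):
    """Probability of an event given odds of 'a to b' in favor of it."""
    return a / (a + b)


def odds_against_to_probability(a, b):
    """Probability of an event given odds of 'a to b' against it."""
    return b / (a + b)


def probability_to_odds_for(p):
    """Odds in favor of an event with probability p (expressed as x to 1)."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)


steelers_probability = odds_for_to_probability(2, 1)
on_time_odds = probability_to_odds_for(0.56)
print(round(steelers_probability, 2), round(on_time_odds, 2))
```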
1-134
4.2 Rules of Probability
LO 4.3 Calculate and interpret the probability of the complement of an event, the
probability that at least one of two events will occur, and a joint probability.
The complement rule: $P(A^c) = 1 - P(A)$
1-135
LO 4.3 4.2 Rules of Probability
The Addition Rule
The probability that event A or B occurs, or that
at least one of these events occurs, is:
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
1-136
LO 4.3 4.2 Rules of Probability
Illustrating the Addition Rule with the Venn
Diagram.
A ∩ B: Events A and B both occur.
A ∪ B: A occurs or B occurs or both occur.
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
1-137
LO 4.3 4.2 Rules of Probability
The Addition Rule for Two Mutually Exclusive
Events
A ∩ B = ∅: Events A and B cannot both occur.
$P(A \cup B) = P(A) + P(B)$
1-138
LO 4.3 4.2 Rules of Probability
Example: The addition rule.
Anthony feels that he has a 75% chance of getting an A in
Statistics, a 55% chance of getting an A in Managerial
Economics and a 40% chance of getting an A in both
classes. What is the probability that he gets an A in at least
one of these courses?
$P(A_S \cup A_M) = P(A_S) + P(A_M) - P(A_S \cap A_M) = 0.75 + 0.55 - 0.40 = 0.90$
What is the probability that he does not get an A in either of
these courses? Using the complement rule, we find
$P((A_S \cup A_M)^c) = 1 - P(A_S \cup A_M) = 1 - 0.90 = 0.10$
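The two calculations in this example can be verified with a short sketch; the function names are hypothetical.

```python
def union_probability(p_a, p_b, p_a_and_b):
    """Addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b


def complement_probability(p):
    """Complement rule: P(not A) = 1 - P(A)."""
    return 1 - p


p_at_least_one_a = union_probability(0.75, 0.55, 0.40)
p_no_a = complement_probability(p_at_least_one_a)
print(round(p_at_least_one_a, 2), round(p_no_a, 2))
```

Subtracting the joint probability prevents the outcomes in which he earns an A in both courses from being counted twice.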
1-139
LO 4.3 4.2 Rules of Probability
Example: The addition rule for mutually exclusive
events.
Samantha Greene, a college senior, contemplates her
future immediately after graduation. She thinks there is a
25% chance that she will join the Peace Corps and a 35%
chance that she will enroll in a full-time law school
program in the United States.
$P(A \cup B) = P(A) + P(B) = 0.25 + 0.35 = 0.60$
1-141
LO 4.4 4.2 Rules of Probability
Conditional Probability
The probability of an event given that another
event has already occurred.
In the conditional probability statement, the
symbol " | " means "given."
Whatever follows " | " has already occurred.
1-142
LO 4.4 4.2 Rules of Probability
Conditional Probability
1-143
LO 4.4 4.2 Rules of Probability
Calculating a Conditional Probability
Given two events A and B, each with a positive probability
of occurring, the probability that A occurs given that B has
occurred ( A conditioned on B ) is equal to
$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$
Similarly,
$P(B \mid A) = \frac{P(A \cap B)}{P(A)}$
1-144
LO 4.4 4.2 Rules of Probability
Example: Conditional Probabilities
An economist predicts a 60% chance that country A will
perform poorly economically and a 25% chance that
country B will perform poorly economically. There is also
a 16% chance that both countries will perform poorly.
What is the probability that country A performs poorly
given that country B performs poorly?
Let P(A) = 0.60, P(B) = 0.25, and P(A ∩ B) = 0.16
$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{0.16}{0.25} = 0.64$
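The conditional probability in this example can be computed with a one-line helper. This is a sketch; the function name is hypothetical.

```python
def conditional_probability(p_joint, p_condition):
    """P(A | B) = P(A and B) / P(B), defined only when P(B) > 0."""
    if p_condition <= 0:
        raise ValueError("the conditioning event must have positive probability")
    return p_joint / p_condition


p_a_given_b = conditional_probability(p_joint=0.16, p_condition=0.25)
p_b_given_a = conditional_probability(p_joint=0.16, p_condition=0.60)
print(round(p_a_given_b, 2), round(p_b_given_a, 4))
```

Note that P(A | B) and P(B | A) generally differ, because each divides the same joint probability by a different marginal probability.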
1-145
4.2 Rules of Probability
LO 4.5 Distinguish between independent and dependent events.
Two events, A and B, are independent if $P(A \mid B) = P(A)$ or, equivalently, $P(B \mid A) = P(B)$.
1-146
LO 4.5 4.2 Rules of Probability
The Multiplication Rule: the probability that A and
B both occur is equal to:
$P(A \cap B) = P(A \mid B) P(B) = P(B \mid A) P(A)$
$P(A \cap B) = 0$
1-147
LO 4.5 4.2 Rules of Probability
The Multiplication Rule for Independent
Events
The joint probability of A and B equals the product
of the individual probabilities of A and B.
$P(A \cap B) = P(A) P(B)$
1-148
4.3 Contingency Tables and Probabilities
LO 4.6 Calculate and interpret probabilities from a contingency table.
Contingency Tables
A contingency table generally shows frequencies for two
qualitative or categorical variables, x and y.
Each cell represents a mutually exclusive combination of
the pair of x and y values.
Here, x is "Age Group" with two outcomes
while y is "Brand Name" with three outcomes.
1-149
LO 4.6 4.3 Contingency Tables and Probabilities
Contingency Tables
Note that each cell in the contingency table
represents a frequency.
1-150
LO 4.6 4.3 Contingency Tables and Probabilities
The contingency table may be used to calculate
probabilities using relative frequency.
Note: Abbreviated labels have been used in place
of the class names in the table.
1-151
LO 4.6 4.3 Contingency Tables and Probabilities
Joint Probability Table
The joint probability is determined by dividing
each cell frequency by the grand total.
Joint
Probabilities
Marginal
Probabilities
1-152
4.4 The Total Probability Rule and Bayes’ Theorem
LO 4.7 Apply the total probability rule and Bayes’ theorem.
The Total Probability Rule
P(A) is the sum of its intersections with some mutually
exclusive and exhaustive events corresponding to an
experiment.
Consider event B and its complement B^c. These
two events are mutually exclusive and exhaustive.
The circle, representing event A, consists entirely of
its intersections with B and B^c.
[Figure: Venn diagram of the sample space split into B and B^c, with event A divided into A ∩ B and A ∩ B^c]
1-153
LO 4.7 4.4 The Total Probability Rule and Bayes’
Theorem
The Total Probability Rule conditional on two
outcomes
The total probability rule conditional on two
events, B and B^c, is
$P(A) = P(A \cap B) + P(A \cap B^c)$
or equivalently,
$P(A) = P(A \mid B) P(B) + P(A \mid B^c) P(B^c)$
1-154
LO 4.7 4.4 The Total Probability Rule and Bayes’
Theorem
Bayes’ Theorem
A procedure for updating probabilities based on
new information.
Prior probability is the original (unconditional)
probability (e.g., P(B) ).
Posterior probability is the updated
(conditional) probability (e.g., P(B | A) ).
1-155
LO 4.7 4.4 The Total Probability Rule and Bayes’
Theorem
Bayes' Theorem
Given a set of prior probabilities for an event and
some new information, the rule for updating the
probability of the event is called Bayes' theorem.
$P(B \mid A) = \frac{P(A \cap B)}{P(A \cap B) + P(A \cap B^c)}$
or
$P(B \mid A) = \frac{P(A \mid B) P(B)}{P(A \mid B) P(B) + P(A \mid B^c) P(B^c)}$
1-156
LO 4.7 4.4 The Total Probability Rule and Bayes’
Theorem
Example: Bayes' Theorem
Assume that 99% of the individuals taking a polygraph test
tell the truth. These tests are considered to be 95%
reliable (i.e., a 95% chance of actually detecting a lie). Let
there also be a 0.5% chance that the test erroneously
detects a lie even when the individual is telling the truth.
An individual has just taken a polygraph test and the test
has detected a lie. What is the probability that the
individual was actually telling the truth?
Let D denote the outcome that the polygraph detects a lie
and T represent the outcome that an individual is telling
the truth.
1-157
LO 4.7 4.4 The Total Probability Rule and Bayes’
Theorem
Example: Bayes' Theorem
Given the following probabilities,
We find
$P(T \mid D) = \frac{0.005 \times 0.99}{0.005 \times 0.99 + 0.95 \times 0.01} = \frac{0.00495}{0.01445} = 0.34256$
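The polygraph calculation can be reproduced with a small function implementing Bayes' theorem for an event and its complement. This is a sketch; the function and parameter names are hypothetical, and the inputs are the probabilities stated in the example (P(T) = 0.99, P(D | T) = 0.005, P(D | T^c) = 0.95).

```python
def posterior(prior, likelihood, likelihood_given_complement):
    """Bayes' theorem for two outcomes: P(B | A) from P(B), P(A | B), P(A | B^c)."""
    numerator = likelihood * prior
    # Total probability rule gives the denominator P(A).
    evidence = numerator + likelihood_given_complement * (1 - prior)
    return numerator / evidence


p_truth_given_detect = posterior(
    prior=0.99,                        # P(T): individual tells the truth
    likelihood=0.005,                  # P(D | T): false detection
    likelihood_given_complement=0.95,  # P(D | T^c): lie correctly detected
)
print(round(p_truth_given_detect, 5))
```

Even with a 95% detection rate, about a third of detected "lies" come from truthful individuals, because truthful test-takers vastly outnumber liars.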
1-158
4.5 Counting Rules
LO 4.8 Use a counting rule to solve a particular counting problem.
1-159
LO 4.8 4.5 Counting Rules
The Combination Formula
The number of ways to choose x objects from a
total of n objects, where the order in which the x
objects are listed does not matter, is referred to
as a combination and is calculated as:
$_nC_x = \binom{n}{x} = \frac{n!}{(n - x)! \, x!}$
1-160
LO 4.8 4.5 Counting Rules
The Permutation Formula
The number of ways to choose x objects from a
total of n objects, where the order in which the x
objects are listed does matter, is referred to as a
permutation and is calculated as:
$_nP_x = \frac{n!}{(n-x)!}$
1-161
LO 4.8 4.5 Counting Rules
Example: The Permutation Formula
The little-league coach recruits three more players so that
his team has backups in case of injury. Now his team
totals 12. In how many ways can the coach select nine
players from the 12-player roster?
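The slide does not show the worked answer, so the sketch below simply evaluates both counting rules for the 12-player roster using Python's standard `math.comb` and `math.perm`. Which count applies depends on whether the order of the nine selected players matters; reading "order" as assignment to distinct positions is an assumption.

```python
import math

roster, chosen = 12, 9

# Order does not matter: only which nine players are picked.
n_combinations = math.comb(roster, chosen)

# Order matters: e.g. each picked player is assigned to a distinct position.
n_permutations = math.perm(roster, chosen)

print(n_combinations, n_permutations)
```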
1-162
Business Statistics: Communicating with Numbers
By Sanjiv Jaggia and Alison Kelly
Chapter 5 Learning Objectives (LOs)
LO 5.1: Distinguish between discrete and
continuous random variables.
LO 5.2: Describe the probability distribution of a
discrete random variable.
LO 5.3: Calculate and interpret summary
measures for a discrete random variable.
LO 5.4: Differentiate among risk neutral, risk
averse, and risk loving consumers.
LO 5.5: Compute summary measures to evaluate
portfolio returns.
1-164
Chapter 5 Learning Objectives (LOs)
LO 5.6: Describe the binomial distribution and
compute relevant probabilities.
LO 5.7: Describe the Poisson distribution and
compute relevant probabilities.
LO 5.8: Describe the hypergeometric distribution
and compute relevant probabilities.
1-165
Available Staff for Probable Customers
Anne Jones is a manager of a local Starbucks.
Due to a weak economy and higher gas and food
prices, Starbucks announced plans in 2008 to close
500 more U.S. locations.
While Anne’s store will remain open, she is
concerned about how other nearby closings might
affect her business.
A typical Starbucks customer visits the chain
between 15 and 18 times a month.
Based on all this, Anne believes that customers will
average 18 visits to her store over a 30-day month.
1-166
Available Staff for Probable Customers
Anne needs to decide staffing needs.
Too many employees would be costly to the store.
Not enough employees could result in losing angry
customers who choose not to wait for service.
1-167
5.1 Random Variables and Discrete Probability
Distributions
LO 5.1 Distinguish between discrete and continuous random variables.
Random variable
A function that assigns numerical values to the
outcomes of a random experiment.
Denoted by uppercase letters (e.g., X ).
1-168
LO 5.1 5.1 Random Variables and Discrete Probability Distributions
1-169
LO 5.1 5.1 Random Variables and Discrete Probability Distributions
Consider an experiment in which two shirts are
selected from the production line and each can be
defective (D) or non-defective (N).
Here is the sample space: (D,D), (D,N), (N,D), (N,N).
The random variable X is the number of defective shirts.
The possible number of defective shirts is the set {0, 1, 2}.
Since X can take only these distinct, countable values, it is
a discrete random variable.
1-170
5.1 Random Variables and Discrete Probability
Distributions
LO 5.2 Describe the probability distribution of a discrete random variable.
Every random variable is associated with a
probability distribution that describes the variable
completely.
A probability mass function is used to describe
discrete random variables.
A probability density function is used to describe
continuous random variables.
A cumulative distribution function may be used to
describe both discrete and continuous random
variables.
1-171
LO 5.2 5.1 Random Variables and Discrete Probability Distributions
The probability mass function of a discrete
random variable X is a list of the values of X with
the associated probabilities, that is, the list of all
possible pairs
$(x, P(X = x))$
The cumulative distribution function of X is
defined as
$P(X \le x)$
1-172
LO 5.2 5.1 Random Variables and Discrete Probability Distributions
Two key properties of discrete probability
distributions:
The probability of each value x is a value between
0 and 1, or equivalently
$0 \le P(X = x) \le 1$
The sum of the probabilities equals 1. In other words,
$\sum P(X = x_i) = 1$
1-173
LO 5.2 5.1 Random Variables and Discrete Probability Distributions
A discrete probability distribution may be viewed
as a table, algebraically, or graphically.
For example, consider the experiment of rolling a
six-sided die. A tabular presentation is:
1-174
LO 5.2 5.1 Random Variables and Discrete Probability Distributions
Another tabular view of a probability distribution is
based on the cumulative probability distribution.
For example, consider the experiment of rolling a six-
sided die. The cumulative probability distribution is
1-175
LO 5.2 5.1 Random Variables and Discrete Probability Distributions
A probability distribution may be expressed
algebraically.
For example, for the six-sided die experiment, the
probability distribution of the random variable X is:
$P(X = x) = \begin{cases} 1/6 & \text{if } x = 1, 2, 3, 4, 5, 6 \\ 0 & \text{otherwise} \end{cases}$
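The algebraic form of the die distribution translates directly into code. The following is a minimal sketch; the function names `die_pmf` and `die_cdf` are illustrative assumptions, and exact fractions are used so the probabilities sum to exactly 1.

```python
from fractions import Fraction


def die_pmf(x):
    """P(X = x) for a fair six-sided die: 1/6 on 1..6, 0 otherwise."""
    return Fraction(1, 6) if x in range(1, 7) else Fraction(0)


def die_cdf(x):
    """P(X <= x), obtained by summing the mass function."""
    return sum(die_pmf(k) for k in range(1, 7) if k <= x)


total = sum(die_pmf(k) for k in range(1, 7))
print(total, die_cdf(4))
```

Summing the mass function up to a value gives the cumulative distribution, which is the link between the two tabular views described on the earlier slides.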
1-176
LO 5.2 5.1 Random Variables and Discrete Probability Distributions
A probability distribution may be expressed
graphically.
The values x of X are placed on the horizontal axis and
the associated probabilities on the vertical axis.
A line is drawn such that its height is associated with the
probability of x.
For example, here is the
graph representing the
six-sided die experiment:
This is a uniform distribution
since the bar heights are all
the same.
1-177
LO 5.2 5.1 Random Variables and Discrete Probability Distributions
Example: Consider the probability distribution
which reflects the number of credit cards that
Bankrate.com’s readers carry:
Is this a valid probability
distribution?
What is the probability that a
reader carries no credit cards?
What is the probability that a
reader carries less than two?
What is the probability that a reader carries at least two
credit cards?
1-178
LO 5.2 5.1 Random Variables and Discrete Probability Distributions
Consider the probability distribution which reflects
the number of credit cards that Bankrate.com’s
readers carry:
Yes, because $0 \le P(X = x) \le 1$
and $\sum P(X = x) = 1$.
$P(X = 0) = 0.025$
$P(X < 2) = P(X = 0) + P(X = 1) = 0.025 + 0.098 = 0.123$.
$P(X \ge 2) = P(X = 2) + P(X = 3) + P(X = 4^*) = 0.166 + 0.165 + 0.546 = 0.877$.
Alternatively, $P(X \ge 2) = 1 - P(X < 2) = 1 - 0.123 = 0.877$.
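The validity check and the event probabilities in this solution can be verified with a few lines. This is a sketch under one assumption: the "4*" category is stored under the key 4. The helper name `is_valid` is illustrative.

```python
# Probabilities quoted in the solution; key 4 stands for the "4*" category.
pmf = {0: 0.025, 1: 0.098, 2: 0.166, 3: 0.165, 4: 0.546}


def is_valid(dist, tol=1e-9):
    """Each probability lies in [0, 1] and the probabilities sum to 1."""
    in_range = all(0 <= p <= 1 for p in dist.values())
    return in_range and abs(sum(dist.values()) - 1) < tol


p_less_than_two = sum(p for x, p in pmf.items() if x < 2)
p_at_least_two = 1 - p_less_than_two  # complement rule

print(is_valid(pmf), round(p_less_than_two, 3), round(p_at_least_two, 3))
```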
1-179
5.2 Expected Value, Variance, and Standard
Deviation
LO 5.3 Calculate and interpret summary measures for a
discrete random variable.
Variance
Standard Deviation
1-180
LO 5.3 5.2 Expected Value, Variance, and Standard Deviation
Expected Value (Population Mean)
$E(X) = \mu$
E(X) is the long-run average value of the random
variable over infinitely many independent
repetitions of an experiment.
For a discrete random variable X with values
$x_1, x_2, x_3, \ldots$ that occur with probabilities
$P(X = x_i)$, the expected value of X is
$E(X) = \mu = \sum x_i P(X = x_i)$
1-181
LO 5.3 5.2 Expected Value, Variance, and Standard Deviation
Variance and Standard Deviation
For a discrete random variable X with values
$x_1, x_2, x_3, \ldots$ that occur with probabilities
$P(X = x_i)$,
$Var(X) = \sigma^2 = \sum (x_i - \mu)^2 P(X = x_i) = \sum x_i^2 P(X = x_i) - \mu^2$
The standard deviation is the square root of the
variance.
$SD(X) = \sigma = \sqrt{\sigma^2}$
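Both forms of the variance formula can be checked against each other in code. The sketch below reuses the fair six-sided die from the earlier slides as data; the function names are illustrative assumptions.

```python
import math


def expected_value(dist):
    """E(X): sum of x * P(X = x)."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Var(X): sum of (x - mu)^2 * P(X = x)."""
    mu = expected_value(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


def variance_shortcut(dist):
    """Equivalent form: sum of x^2 * P(X = x), minus mu^2."""
    mu = expected_value(dist)
    return sum(x ** 2 * p for x, p in dist.items()) - mu ** 2


die = {x: 1 / 6 for x in range(1, 7)}
mu = expected_value(die)
var = variance(die)
sd = math.sqrt(var)
print(round(mu, 4), round(var, 4), round(sd, 4))
```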
1-182
LO 5.3 5.2 Expected Value, Variance, and Standard Deviation
Example: Brad Williams, owner of a car dealership
in Chicago, decides to construct an incentive
compensation program based on performance.
1-183
LO 5.3 5.2 Expected Value, Variance, and Standard Deviation
Solution: Let the random variable X denote the
bonus amount (in $1,000s) for an employee.
1-184
5.2 Expected Value, Variance, and Standard
Deviation
LO 5.4 Differentiate among risk neutral, risk averse, and risk
loving consumers.
1-185
LO 5.4 5.2 Expected Value, Variance, and Standard Deviation
Risk Neutrality and Risk Aversion
Risk loving consumers:
May accept a risky prospect even if the expected gain is
negative.
Application of Expected Value to Risk
Suppose you have a choice of receiving $1,000 in cash
or receiving a beautiful painting from your grandmother.
The actual value of the painting is uncertain. Here is a
probability distribution
of the possible worth
of the painting. What
should you do?
1-186
LO 5.4 5.2 Expected Value, Variance, and Standard Deviation
Application of Expected Value to Risk
First calculate the
expected value:
$E(X) = \sum x_i P(X = x_i) = \$2{,}000(0.20) + \$1{,}000(0.50) + \$500(0.30) = \$1{,}050$
1-187
5.3 Portfolio Returns
LO 5.5 Compute summary measures to evaluate portfolio returns.
Investment opportunities often use both:
Expected return as a measure of reward.
Variance or standard deviation of return as a measure of
risk.
Portfolio is defined as a collection of assets such as
stocks and bonds.
Let X and Y represent two random variables of interest,
denoting, say, the returns of two assets.
Since an investor may have invested in both assets, we
would like to evaluate the portfolio return formed by a
linear combination of X and Y .
1-188
LO 5.5 5.3 Portfolio Returns
Properties of random variables useful in evaluating
portfolio returns.
Given two random variables X and Y,
The expected value of X + Y is
$E(X + Y) = E(X) + E(Y)$
The variance of X + Y is
$Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y)$
where Cov(X,Y) is the covariance between X and Y.
For constants a, b, the formulas extend to
$E(aX + bY) = aE(X) + bE(Y)$
$Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) + 2ab\,Cov(X, Y)$
1-189
LO 5.5 5.3 Portfolio Returns
Expected return, variance, and standard
deviation of portfolio returns.
Given a portfolio with two assets, Asset A and
Asset B, the expected return of the portfolio
E(Rp) is computed as:
$E(R_p) = w_A E(R_A) + w_B E(R_B)$
where
$w_A$ and $w_B$ are the portfolio weights
$w_A + w_B = 1$
$E(R_A)$ and $E(R_B)$ are the expected returns on assets
A and B, respectively.
1-190
LO 5.5 5.3 Portfolio Returns
Expected return, variance, and standard deviation
of portfolio returns.
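Using the covariance or the correlation coefficient of the
two returns, the portfolio variance of return is:
$Var(R_p) = w_A^2\sigma_A^2 + w_B^2\sigma_B^2 + 2w_Aw_B\sigma_{AB} = w_A^2\sigma_A^2 + w_B^2\sigma_B^2 + 2w_Aw_B\rho_{AB}\sigma_A\sigma_B$
where $\sigma_A^2$ and $\sigma_B^2$ are the variances of the returns for
Asset A and Asset B, respectively,
$\sigma_{AB}$ is the covariance between the returns for
Assets A and B
$\rho_{AB}$ is the correlation coefficient between the returns
for Asset A and Asset B.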
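A short sketch of the two-asset portfolio calculation follows. The weights come from the $40,000 / $60,000 split used in the example slides; the expected returns, standard deviations, and correlation are hypothetical placeholders because the example's data table is not reproduced in the text. The function name `portfolio_stats` is also an assumption.

```python
import math


def portfolio_stats(w_a, mean_a, mean_b, sd_a, sd_b, corr_ab):
    """Expected return and standard deviation of a two-asset portfolio."""
    w_b = 1 - w_a                    # weights sum to 1
    cov_ab = corr_ab * sd_a * sd_b   # covariance from the correlation
    expected = w_a * mean_a + w_b * mean_b
    var = w_a**2 * sd_a**2 + w_b**2 * sd_b**2 + 2 * w_a * w_b * cov_ab
    return expected, math.sqrt(var)


# Weights from the $40,000 / $60,000 split in the example.
w_a = 40_000 / (40_000 + 60_000)

# Hypothetical inputs (percent): means 10 and 8, SDs 12 and 8, correlation 0.2.
exp_ret, sd_ret = portfolio_stats(w_a, 10.0, 8.0, 12.0, 8.0, 0.2)
print(round(exp_ret, 2), round(sd_ret, 2))
```

With a correlation below 1, the portfolio standard deviation is smaller than the weighted average of the two individual standard deviations, which is the diversification effect the variance formula captures.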
1-191
LO 5.5 5.3 Portfolio Returns
Example: Consider an investment portfolio of
$40,000 in Stock A and $60,000 in Stock B.
Given the following information, calculate the expected
return of this portfolio.
1-192
LO 5.5 5.3 Portfolio Returns
Example: Consider an investment portfolio of
$40,000 in Stock A and $60,000 in Stock B.
Solution:
1-193
LO 5.5 5.3 Portfolio Returns
Example: Consider an investment portfolio of
$40,000 in Stock A and $60,000 in Stock B.
1-194
LO 5.5 5.3 Portfolio Returns
Example: Consider an investment portfolio of
$40,000 in Stock A and $60,000 in Stock B.
1-195
5.4 The Binomial Probability Distribution
LO 5.6 Describe the binomial distribution and compute relevant probabilities.
A binomial random variable is defined as the
number of successes achieved in the n trials of a
Bernoulli process.
A Bernoulli process consists of a series of n
independent and identical trials of an experiment
such that on each trial:
There are only two possible outcomes:
p = probability of a success
1 − p = q = probability of a failure
Each time the trial is repeated, the probabilities of
success and failure remain the same.
1-196
LO 5.6 5.4 The Binomial Probability Distribution
A binomial random variable X is defined as the
number of successes achieved in the n trials of a
Bernoulli process.
A binomial probability distribution shows the
probabilities associated with the possible values of
the binomial random variable (that is, 0, 1, . . . , n).
For a binomial random variable X, the probability of x
successes in n Bernoulli trials is
$P(X = x) = \binom{n}{x} p^x q^{n-x} = \frac{n!}{x!\,(n-x)!} p^x q^{n-x}$
1-197
LO 5.6 5.4 The Binomial Probability Distribution
For a binomial distribution:
The expected value (E(X)) is: $E(X) = \mu = np$
1-198
LO 5.6 5.4 The Binomial Probability Distribution
Example: Approximately 20% of U.S. workers are
afraid that they will never be able to retire. Suppose
10 workers are randomly selected.
What is the probability that none of the workers is
afraid that they will never be able to retire?
Solution: Let n = 10 and p = 0.20; then $P(X = 0) = \binom{10}{0}(0.20)^0(0.80)^{10} \approx 0.1074$.
1-199
LO 5.6 5.4 The Binomial Probability Distribution
Computing binomial probabilities with Excel:
In 2007 approximately 4.7% of the households in the
Detroit metropolitan area were in some stage of
foreclosure. What is the probability that exactly 5 of these
100 mortgage-holding households in Detroit are in some
stage of foreclosure?
Solution: Using the binomial function on Excel, enter the four
arguments shown here:
Excel returns the
formula result as
0.1783; thus,
P(X = 5) = 0.1783.
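The Excel result can be cross-checked by evaluating the binomial formula directly. The sketch below uses only the standard library; `binomial_pmf` is an illustrative name. It also evaluates the earlier retirement example (n = 10, p = 0.20, x = 0).

```python
import math


def binomial_pmf(x, n, p):
    """P(X = x): x successes in n Bernoulli trials with success probability p."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


# Detroit foreclosure example: n = 100, p = 0.047, x = 5.
p_foreclosure = binomial_pmf(5, 100, 0.047)

# Retirement example: n = 10, p = 0.20, x = 0.
p_none_afraid = binomial_pmf(0, 10, 0.20)

print(round(p_foreclosure, 4), round(p_none_afraid, 4))
```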
1-200
5.5 The Poisson Probability Distribution
LO 5.7 Describe the Poisson distribution and compute relevant probabilities.
A binomial random variable counts the number of
successes in a fixed number of Bernoulli trials.
In contrast, a Poisson random variable counts the
number of successes over a given interval of time or
space.
Examples of a Poisson random variable include
With respect to time—the number of cars that cross the
Brooklyn Bridge between 9:00 am and 10:00 am on a
Monday morning.
With respect to space—the number of defects in a
50-yard roll of fabric.
1-201
LO 5.7 5.5 The Poisson Probability Distribution
A random experiment satisfies a Poisson
process if:
The number of successes within a specified time
or space interval equals any integer between zero
and infinity.
The numbers of successes counted in
nonoverlapping intervals are independent.
The probability that success occurs in any interval
is the same for all intervals of equal size and is
proportional to the size of the interval.
1-202
LO 5.7 5.5 The Poisson Probability Distribution
For a Poisson random variable X, the
probability of x successes over a given
interval of time or space is
$P(X = x) = \frac{e^{-\mu}\mu^x}{x!}$ for $x = 0, 1, 2, \ldots$
1-203
LO 5.7 5.5 The Poisson Probability Distribution
For a Poisson distribution:
The expected value (E(X)) is: $E(X) = \mu$
1-204
LO 5.7 5.5 The Poisson Probability Distribution
Example: Returning to the Starbucks example, Anne
believes that the typical Starbucks customer
averages 18 visits over a 30-day month.
How many visits should Anne expect in a 5-day period
from a typical Starbucks customer?
Solution:
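The worked solution is not reproduced in the text. A minimal sketch, assuming the mean scales in proportion to the interval length (the Poisson process property stated earlier), rescales 18 visits per 30 days to a 5-day period and then evaluates the Poisson formula. The name `poisson_pmf` and the choice to compute the probability of zero visits are illustrative.

```python
import math


def poisson_pmf(x, mu):
    """P(X = x) for a Poisson random variable with mean mu."""
    return math.exp(-mu) * mu**x / math.factorial(x)


mu_30_days = 18
mu_5_days = mu_30_days * 5 / 30  # mean is proportional to interval length

# Illustrative use: probability of no visits in a 5-day period.
p_no_visits = poisson_pmf(0, mu_5_days)
print(mu_5_days, round(p_no_visits, 4))
```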
1-205
5.6 The Hypergeometric Probability
Distribution
LO 5.8 Describe the hypergeometric distribution and compute
relevant probabilities.
1-206
LO 5.8 5.6 The Hypergeometric Probability
Distribution
Use the hypergeometric distribution when sampling
without replacement from a population whose size
N is not significantly larger than the sample size n.
For a hypergeometric random variable X, the probability
of x successes in a random selection of n items is
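$P(X = x) = \frac{\binom{S}{x}\binom{N-S}{n-x}}{\binom{N}{n}}$
for $x = 0, 1, 2, \ldots, n$ if $n \le S$ or $x = 0, 1, 2, \ldots, S$ if $n > S$,
where N is the population size and S is the number of successes in the population.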
1-207
LO 5.8 5.6 The Hypergeometric Probability
Distribution
For a hypergeometric distribution:
The expected value (E(X)) is: $E(X) = \mu = n\left(\frac{S}{N}\right)$
1-208
LO 5.8 5.6 The Hypergeometric Probability
Distribution
Example: At a convenience store in Morganville,
New Jersey, the manager randomly inspects five
mangoes from a box containing 20 mangoes for
damages due to transportation. Suppose the
chosen box contains exactly 2 damaged mangoes.
What is the probability that exactly one of the five mangoes used
in the inspection is damaged?
Solution
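The solution is not reproduced in the text. As a sketch, the mango example can be evaluated with the hypergeometric formula using N = 20, S = 2, n = 5, and x = 1 from the problem statement; `hypergeom_pmf` is an illustrative name.

```python
import math


def hypergeom_pmf(x, n, S, N):
    """P(X = x): x successes in n draws without replacement,
    from a population of N items containing S successes."""
    return math.comb(S, x) * math.comb(N - S, n - x) / math.comb(N, n)


# One damaged mango among five inspected, from a box of 20 with 2 damaged.
p_one_damaged = hypergeom_pmf(1, n=5, S=2, N=20)
expected_damaged = 5 * 2 / 20  # E(X) = n * S / N

print(round(p_one_damaged, 4), expected_damaged)
```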
1-209
Chapter 6 Learning Objectives (LOs)
LO 6.1: Describe a continuous random variable.
LO 6.2: Describe a continuous uniform distribution and
calculate associated probabilities.
LO 6.3: Explain the characteristics of the normal distribution.
LO 6.4: Use the standard normal table or the z table.
LO 6.5: Calculate and interpret probabilities for a random
variable that follows the normal distribution.
LO 6.6: Calculate and interpret probabilities for a random
variable that follows the exponential distribution.
LO 6.7: Calculate and interpret probabilities for a random
variable that follows the lognormal distribution.
1-211
Demand for Salmon
Akiko Hamaguchi, manager of a small sushi
restaurant, Little Ginza, in Phoenix, Arizona, has to
estimate the daily amount of salmon needed.
Akiko has estimated the daily consumption of
salmon to be normally distributed with a mean of
12 pounds and a standard deviation of 3.2 pounds.
Buying 20 lbs of salmon every day has resulted in
too much wastage.
Therefore, Akiko will buy salmon that meets the
daily demand of customers on 90% of the days.
1-212
Demand for Salmon
Based on this information, Akiko would like to:
Calculate the proportion of days that demand for
salmon at Little Ginza was above her earlier
purchase of 20 pounds.
Calculate the proportion of days that demand for
salmon at Little Ginza was below 15 pounds.
Determine the amount of salmon that should be
bought daily so that it meets demand on 90% of
the days.
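Akiko's three questions can be answered numerically with Python's standard `statistics.NormalDist`. The sketch below assumes daily demand is exactly normal with mean 12 and standard deviation 3.2, as stated in the case; the variable names are illustrative.

```python
from statistics import NormalDist

# Daily salmon demand in pounds, as estimated in the case.
demand = NormalDist(mu=12, sigma=3.2)

p_above_20 = 1 - demand.cdf(20)  # share of days demand exceeds 20 lb
p_below_15 = demand.cdf(15)      # share of days demand is below 15 lb
q90 = demand.inv_cdf(0.90)       # purchase that covers demand on 90% of days

print(round(p_above_20, 4), round(p_below_15, 4), round(q90, 2))
```

Results from the z table may differ slightly in the last digits because the table rounds z to two decimals.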
1-213
6.1 Continuous Random Variables and the
Uniform Probability Distribution
LO 6.1 Describe a continuous random variable.
1-214
6.1 Continuous Random Variables and
LO 6.1
1-215
6.1 Continuous Random Variables and
LO 6.1
1-216
6.1 Continuous Random Variables and
LO 6.1
1-217
6.1 Continuous Random Variables and the
Uniform Probability Distribution
LO 6.2 Describe a continuous uniform distribution and calculate associated
probabilities.
1-218
6.1 Continuous Random Variables and
LO 6.2
$SD(X) = \sqrt{\frac{(b-a)^2}{12}}$
1-219
6.1 Continuous Random Variables and
LO 6.2
1-220
6.1 Continuous Random Variables and
LO 6.2
1-221
6.1 Continuous Random Variables and
LO 6.2
1-222
6.2 The Normal Distribution
LO 6.3 Explain the characteristics of the normal distribution.
Scores on SAT
1-223
LO 6.3 6.2 The Normal Distribution
Characteristics of the Normal Distribution
Symmetric about its mean
Mean = Median = Mode
1-224
LO 6.3 6.2 The Normal Distribution
Characteristics of the Normal Distribution
The normal distribution is completely described
by two parameters: $\mu$ and $\sigma^2$.
$\mu$ is the population mean which describes the
central location of the distribution.
$\sigma^2$ is the population variance which describes
the dispersion of the distribution.
1-225
LO 6.3 6.2 The Normal Distribution
Probability Density Function of the Normal
Distribution
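For a random variable X with mean $\mu$ and
variance $\sigma^2$
$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$
where $\pi \approx 3.14159$ and $\exp(x) = e^x$
$e \approx 2.718$ is the base of the natural logarithm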
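The density formula can be coded directly and compared with the standard library's implementation. This is a sketch; `normal_pdf` is an illustrative name and the parameter values are arbitrary.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Normal density f(x) with mean mu and variance sigma^2."""
    var = sigma**2
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


# Arbitrary illustration: mu = 0, sigma = 2, evaluated at x = 1.
value = normal_pdf(1.0, mu=0.0, sigma=2.0)
reference = NormalDist(mu=0.0, sigma=2.0).pdf(1.0)
print(round(value, 6), round(reference, 6))
```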
1-226
LO 6.3 6.2 The Normal Distribution
Example: Suppose the ages of employees in
Industries A, B, and C are normally distributed.
Here are the relevant parameters:
1-227
6.2 The Normal Distribution
LO 6.4 Use the standard normal table or the z table.
1-228
LO 6.4 6.2 The Standard Normal Distribution
Standard Normal Table (Z Table).
Gives the cumulative probabilities P(Z < z) for
positive and negative values of z.
Since the random variable Z is symmetric about
its mean of 0,
P(Z < 0) = P(Z > 0) = 0.5.
To obtain the P(Z < z), read down the z column
first, then across the top.
1-229
LO 6.4 6.2 The Standard Normal Distribution
Standard Normal Table (Z Table).
Table for positive z values.
1-230
LO 6.4 6.2 The Standard Normal Distribution
Finding the Probability for a Given z Value.
Transform normally distributed random variables into
standard normal random variables and use the z table
to compute the relevant probabilities.
The z table provides cumulative probabilities
P(Z < z) for a given z.
Portion of right-hand page of z table.
If z = 1.52, then look up
1-231
LO 6.4 6.2 The Standard Normal Distribution
Finding the Probability for a Given z Value.
Remember that the z table provides cumulative
probabilities P(Z < z) for a given z.
Since z is negative, we can look up this
probability from the left-hand page of the z table.
1-232
LO 6.4 6.2 The Standard Normal Distribution
Example: Finding Probabilities for a Standard
Normal Random Variable Z.
Find P(1.52 < Z < 1.96) =
P(Z < 1.96) − P(Z < 1.52),
where P(Z < 1.96) = 0.9750
1-233
LO 6.4 6.2 The Standard Normal Distribution
Example: Finding a z value for a given
probability.
For a standard normal variable Z, find the z
values that satisfy P(Z < z) = 0.6808.
Go to the standard normal table and find 0.6808
in the body of the table.
Find the corresponding
z value from the
row/column of z.
z = 0.47.
1-234
LO 6.4 6.2 The Standard Normal Distribution
Revisiting the Empirical Rule.
$P(-3 \le Z \le 3)$
$P(-2 \le Z \le 2)$
$P(-1 \le Z \le 1)$
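The three empirical-rule probabilities can be computed from the standard normal cumulative distribution. A minimal sketch using `statistics.NormalDist` (the dictionary name `within` is an assumption):

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# P(-k <= Z <= k) = P(Z <= k) - P(Z <= -k) for k = 1, 2, 3.
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, p in within.items():
    print(f"P(-{k} <= Z <= {k}) = {p:.4f}")
```

These are the values usually quoted as roughly 68%, 95%, and almost 100%.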
1-235
LO 6.4 6.2 The Standard Normal Distribution
Example: The Empirical Rule
An investment strategy has an expected return of
4% and a standard deviation of 6%. Assume that
investment returns are normally distributed.
What is the probability of earning a return greater
than 10%?
A return of 10% is one standard deviation
above the mean, or $10 = \mu + 1\sigma = 4 + 6$.
Since about 68% of observations fall within
one standard deviation of the mean, 32%
(100% − 68%) are outside the range.
1-236
LO 6.4 6.2 The Standard Normal Distribution
Example: The Empirical Rule
An investment strategy has an expected return of
4% and a standard deviation of 6%. Assume that
investment returns are normally distributed.
What is the probability of earning a return greater
than 10%?
Using symmetry, we conclude that 16%
(half of 32%) of the observations are
greater than 10%.
[Figure: normal curve with 68% of the area within one standard deviation of $\mu$ and 16% in each tail.]
1-237
6.3 Solving Problems with the Normal
Distribution
LO 6.5 Calculate and interpret probabilities for a random variable that follows
the normal distribution.
1-238
LO 6.5 6.3 Solving Problems with the Normal Distribution
A z value specifies by how many standard
deviations the corresponding x value falls
above (z > 0) or below (z < 0) the mean.
A positive z indicates by how many standard
deviations the corresponding x lies above $\mu$.
A zero z indicates that the corresponding x
equals $\mu$.
A negative z indicates by how many standard
deviations the corresponding x lies below $\mu$.
1-239
LO 6.5 6.3 Solving Problems with the Normal Distribution
Use the Inverse Transformation to compute
x values for given probabilities.
A standard normal variable Z can be transformed
to the normally distributed random variable X with
mean $\mu$ and standard deviation $\sigma$ as
$X = \mu + Z\sigma$
1-240
LO 6.5 6.3 Solving Problems with the Normal Distribution
Example: Scores on a management aptitude exam
are normally distributed with a mean of 72 ($\mu$) and a
standard deviation of 8 ($\sigma$).
What is the probability that a randomly selected
manager will score above 60?
First transform the random variable X to Z using the
transformation formula:
$z = \frac{x - \mu}{\sigma} = \frac{60 - 72}{8} = -1.5$
Using the standard normal table, find
$P(Z > -1.5) = 1 - P(Z < -1.5) = 1 - 0.0668 = 0.9332$
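The same standardize-then-look-up steps can be sketched in code, with `NormalDist().cdf` standing in for the z table. Variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma, x = 72, 8, 60

# Step 1: transform x to a z value.
z = (x - mu) / sigma

# Step 2: P(Z > z) = 1 - P(Z < z), with the cdf replacing the table lookup.
p_above = 1 - NormalDist().cdf(z)

print(z, round(p_above, 4))
```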
1-241
LO 6.5 6.3 Solving Problems with the Normal Distribution
Example:
1-242
LO 6.5 6.3 Solving Problems with the Normal Distribution
Example:
1-243
6.4 Other Continuous Probability
Distributions
LO 6.6 Calculate and interpret probabilities for a random variable that follows
the exponential distribution.
The Exponential Distribution
A random variable X follows the exponential distribution if
its probability density function is:
$f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$
and $E(X) = SD(X) = \frac{1}{\lambda}$
where $\lambda$ is the rate parameter
$e \approx 2.718$
The cumulative distribution function is:
$P(X \le x) = 1 - e^{-\lambda x}$
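A small sketch of the exponential cumulative distribution function follows. The rate value used is hypothetical, chosen only to illustrate the formula; `exponential_cdf` is an illustrative name.

```python
import math


def exponential_cdf(x, lam):
    """P(X <= x) for an exponential random variable with rate lam."""
    if x < 0:
        return 0.0
    return 1 - math.exp(-lam * x)


lam = 0.5       # hypothetical rate parameter
mean = 1 / lam  # E(X) = SD(X) = 1 / lambda

# Probability that X falls at or below its own mean: 1 - e^(-1).
p_within_mean = exponential_cdf(mean, lam)
print(mean, round(p_within_mean, 4))
```

Because the distribution is positively skewed, more than half of the probability lies below the mean regardless of the rate.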
1-244
LO 6.6 6.4 Other Continuous Probability
Distributions
The exponential distribution is based entirely on
one parameter, $\lambda > 0$, as illustrated below.
1-245
LO 6.6 6.4 Other Continuous Probability
Distributions
Example
1-246
6.4 Other Continuous Probability
Distributions
LO 6.7 Calculate and interpret probabilities for a random variable that follows
the lognormal distribution.
1-247
LO 6.7 6.4 Other Continuous Probability
Distributions
Let X be a normally distributed random variable with
mean $\mu$ and standard deviation $\sigma$. The random
variable $Y = e^X$ follows the lognormal distribution
with a probability density function as
$f(y) = \frac{1}{y\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(\ln y - \mu)^2}{2\sigma^2}\right)$ for $y > 0$,
where $\pi$ equals approximately 3.14159
$\exp(x) = e^x$ is the exponential function
$e \approx 2.718$
1-248
LO 6.7 6.4 Other Continuous Probability
Distributions
The graphs below show the shapes of the
lognormal density function based on various values
of $\sigma$.
The lognormal distribution is clearly positively
skewed for $\sigma > 1$.
For $\sigma < 1$, the lognormal distribution somewhat
resembles the normal distribution.
1-249
LO 6.7 6.4 Other Continuous Probability
Distributions
The
1-250
LO 6.7 6.4 Other Continuous Probability
Distributions
Expected values and standard deviations of
the lognormal and normal distributions.
Let X be a normal random variable with mean $\mu$
and standard deviation $\sigma$ and let $Y = e^X$ be the
corresponding lognormal variable. The mean
$\mu_Y$ and standard deviation $\sigma_Y$ of Y are derived as
$\mu_Y = \exp\left(\frac{2\mu + \sigma^2}{2}\right)$
$\sigma_Y = \sqrt{\left(\exp(\sigma^2) - 1\right)\exp(2\mu + \sigma^2)}$
1-251
LO 6.7 6.4 Other Continuous Probability
Distributions
Expected values and standard deviations of
the lognormal and normal distributions.
Equivalently, the mean and standard deviation of
the normal variable X = ln(Y) are derived as
$\mu = \ln\left(\frac{\mu_Y^2}{\sqrt{\mu_Y^2 + \sigma_Y^2}}\right)$
$\sigma = \sqrt{\ln\left(1 + \frac{\sigma_Y^2}{\mu_Y^2}\right)}$
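The two sets of conversion formulas are inverses of each other, which can be confirmed with a round trip. The sketch below uses hypothetical parameter values; the function names are assumptions.

```python
import math


def lognormal_moments(mu, sigma):
    """Mean and SD of Y = e^X when X is normal with mean mu and SD sigma."""
    mean_y = math.exp((2 * mu + sigma**2) / 2)
    sd_y = math.sqrt((math.exp(sigma**2) - 1) * math.exp(2 * mu + sigma**2))
    return mean_y, sd_y


def normal_parameters(mean_y, sd_y):
    """Mean and SD of X = ln(Y), given the mean and SD of lognormal Y."""
    mu = math.log(mean_y**2 / math.sqrt(mean_y**2 + sd_y**2))
    sigma = math.sqrt(math.log(1 + sd_y**2 / mean_y**2))
    return mu, sigma


# Hypothetical normal parameters: mu = 1.0, sigma = 0.5.
mean_y, sd_y = lognormal_moments(1.0, 0.5)
mu_back, sigma_back = normal_parameters(mean_y, sd_y)
print(round(mean_y, 4), round(sd_y, 4), round(mu_back, 4), round(sigma_back, 4))
```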
1-252
Appendix A Table 1. Standard Normal Curve
1-253
Appendix A Table 2. Standard Normal Curve
1-254
Chapter 7 Learning Objectives (LOs)
LO 7.1: Differentiate between a population
parameter and a sample statistic.
LO 7.2: Explain common sample biases.
LO 7.3: Describe simple random sampling.
LO 7.4: Distinguish between stratified random
sampling and cluster sampling.
LO 7.5: Describe the properties of the sampling
distribution of the sample mean.
1-256
Chapter 7 Learning Objectives (LOs)
LO 7.6: Explain the importance of the central
limit theorem.
LO 7.7: Describe the properties of the sampling
distribution of the sample proportion.
LO 7.8: Use a finite population correction
factor.
LO 7.9: Construct and interpret control charts
for quantitative and qualitative data.
1-257
Marketing Iced Coffee
In order to capitalize on the iced coffee trend,
Starbucks offered for a limited time half-priced
Frappuccino beverages between 3 pm and 5 pm.
Anne Jones, manager at a local Starbucks,
determines the following from past historical data:
43% of iced-coffee customers were women.
1-258
Marketing Iced Coffee
One month after the marketing period ends, Anne
surveys 50 of her iced-coffee customers and finds:
46% were women.
34% were teenage girls.
They spent an average of $4.26 on the drink.
Anne wants to use this survey information to
calculate the probability that:
Customers spend an average of $4.26 or more on iced
coffee.
46% or more of iced-coffee customers are women.
34% or more of iced-coffee customers are teenage girls.
1-259
7.1 Sampling
LO 7.1 Differentiate between a population parameter and sample statistic.
1-260
7.1 Sampling
LO 7.2 Explain common sample biases.
1-261
LO 7.2 7.1 Sampling
Selection bias—a systematic exclusion of certain
groups from consideration for the sample.
The Literary Digest committed selection bias by excluding
a large portion of the population (e.g., lower income
voters).
Nonresponse bias—a systematic difference in
preferences between respondents and non-
respondents to a survey or a poll.
The Literary Digest had only a 24% response rate. This
indicates that only those who cared a great deal about the
election took the time to respond to the survey. These
respondents may be atypical of the population as a whole.
1-262
7.1 Sampling
LO 7.3 Describe simple random sampling.
Sampling Methods
Simple random sample is a sample of n
observations which has the same probability of
being selected from the population as any other
sample of n observations.
Most statistical methods presume simple
random samples.
However, in some situations other sampling
methods have an advantage over simple
random samples.
1-263
LO 7.3 7.1 Sampling
Example: In 1961, students invested 24 hours per
week in their academic pursuits, whereas today’s
students study an average of 14 hours per week.
A dean at a large university in California wonders if this
trend is reflective of the students at her university. The
university has 20,000 students and the dean would like a
sample of 100. Use Excel to draw a simple random
sample of 100 students.
In Excel, choose
Formulas > Insert function >
RANDBETWEEN and input
the values shown here.
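For readers without Excel, a comparable draw can be sketched in Python. This assumes students are numbered 1 through 20,000, which the text does not state, and it uses a fixed seed only for reproducibility. Unlike repeated RANDBETWEEN calls, `random.sample` draws without replacement, so no student can be selected twice.

```python
import random

population_size = 20_000  # students at the university
sample_size = 100         # sample the dean would like

# Seeded generator so the illustration is reproducible (seed is arbitrary).
rng = random.Random(42)

# Every set of 100 distinct student numbers is equally likely to be drawn.
student_ids = rng.sample(range(1, population_size + 1), sample_size)

print(len(student_ids), min(student_ids) >= 1, max(student_ids) <= population_size)
```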
1-264
7.1 Sampling
LO 7.4 Distinguish between stratified random sampling and cluster sampling.
1-265
LO 7.4 7.1 Sampling
Cluster Sampling
Divide population into mutually exclusive and
collectively exhaustive groups, called clusters.
Randomly select clusters.
Sample every observation in those randomly
selected clusters.
Advantages and disadvantages:
Less expensive than other sampling methods.
Less precision than simple random sampling or
stratified sampling.
Useful when clusters occur naturally in the population.
1-266
LO 7.4 7.1 Sampling
Stratified versus Cluster Sampling
Stratified Sampling Cluster Sampling
1-267
7.2 The Sampling Distribution of the Sample Mean
LO 7.5 Describe the properties of the sampling distribution of the
sample mean.
Population is described by parameters.
A parameter is a constant, whose value may be unknown.
Only one population.
Sample is described by statistics.
A statistic is a random variable whose value depends on
the chosen random sample.
Statistics are used to make inferences about the
population parameters.
Can draw multiple random samples of size n.
1-268
LO 7.5 7.2 The Sampling Distribution of the Sample Mean
Estimator
A statistic that is used to estimate a population
parameter.
For example, $\bar{X}$, the mean of the sample, is an
estimator of $\mu$, the mean of the population.
Estimate
A particular value of the estimator.
For example, the mean of the sample $\bar{x}$ is an
estimate of $\mu$, the mean of the population.
1-269
LO 7.5 7.2 The Sampling Distribution of the Sample Mean
Sampling Distribution of the Sample Mean $\bar{X}$
Each random sample of size n drawn from the
population provides an estimate of $\mu$, namely the sample
mean $\bar{x}$.
Drawing many samples of size n results in many
different sample means, one for each sample.
The sampling distribution of the mean is the
frequency or probability distribution of these
sample means.
1-270
LO 7.5 7.2 The Sampling Distribution of the Sample Mean
[Table: repeated random draws $X_1$, $X_2$, $X_3$, $X_4$ from the population, with the mean of each draw.]
The means from each random draw from the population form a distribution of means, that is, a sampling distribution.
1-271
LO 7.5 7.2 The Sampling Distribution of the Sample Mean
The Expected Value and Standard Deviation
of the Sample Mean
Expected Value
The expected value of X,
$E(X) = \mu$
The expected value of the mean,
$E(\bar{X}) = E(X) = \mu$
1-272
LO 7.5 7.2 The Sampling Distribution of the Sample Mean
The Expected Value and Standard Deviation
of the Sample Mean
Variance of X: $Var(X) = \sigma^2$
Standard deviation of X: $SD(X) = \sqrt{\sigma^2} = \sigma$
1-273
LO 7.5 7.2 The Sampling Distribution of the Sample Mean
Example: Given that $\mu = 16$ inches and $\sigma = 0.8$
inches, determine the following:
What is the expected value and the standard
deviation of the sample mean derived from a
random sample of
2 pizzas: $E(\bar{X}) = \mu = 16$, $SD(\bar{X}) = \frac{\sigma}{\sqrt{n}} = \frac{0.8}{\sqrt{2}} = 0.57$
4 pizzas: $E(\bar{X}) = \mu = 16$, $SD(\bar{X}) = \frac{\sigma}{\sqrt{n}} = \frac{0.8}{\sqrt{4}} = 0.40$
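The pizza calculation shows the standard deviation of the sample mean shrinking as n grows. A minimal sketch of that calculation (the function name `sd_of_sample_mean` is an assumption):

```python
import math


def sd_of_sample_mean(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


sigma = 0.8  # population standard deviation in inches
se_2 = sd_of_sample_mean(sigma, 2)
se_4 = sd_of_sample_mean(sigma, 4)

print(round(se_2, 2), round(se_4, 2))
```

Quadrupling the sample size halves the standard deviation of the sample mean, since n enters through its square root.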
1-274
LO 7.5 7.2 The Sampling Distribution of the Sample Mean
Sampling from a Normal Distribution
For any sample size n, the sampling distribution
of $\bar{X}$ is normal if the population X from which the
sample is drawn is normally distributed.
If $\bar{X}$ is normal, then we can transform it into the
standard normal random variable as:
For a sampling distribution: $Z = \frac{\bar{X} - E(\bar{X})}{SD(\bar{X})} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
For a distribution of the values of X: $Z = \frac{x - E(X)}{SD(X)} = \frac{x - \mu}{\sigma}$
1-275
7.2 The Sampling Distribution of the
LO 7.5
1-276
LO 7.5 7.2 The Sampling Distribution of the Sample Mean
Example: Given that $\mu = 16$ inches and $\sigma = 0.8$
inches, determine the following:
What is the probability that a randomly selected pizza is
less than 15.5 inches?
$Z = \frac{x - \mu}{\sigma} = \frac{15.5 - 16}{0.8} = -0.63$
$P(X < 15.5) = P(Z < -0.63) = 0.2643$ or 26.43%
LO 7.6 Explain the importance of the central limit theorem.
The Central Limit Theorem
For any population X with expected value $\mu$ and standard deviation $\sigma$, the sampling distribution of $\bar{X}$ will be approximately normal if the sample size n is sufficiently large.
As a general guideline, the normal distribution approximation is justified when $n \ge 30$.
As before, if $\bar{X}$ is approximately normal, then we can transform it to $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$.
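A small simulation can make the theorem concrete. The sketch below assumes an exponential population with mean 1 and standard deviation 1, chosen here only because it is clearly non-normal; the function name, sample size and number of repetitions are illustrative choices, not values from the slides.

```python
import random
import statistics

def simulate_sample_means(n, reps, seed=0):
    """Draw `reps` samples of size n from an exponential population
    (mean 1, SD 1) and return the list of sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]

means = simulate_sample_means(n=30, reps=5000)
print(statistics.fmean(means))   # should be close to mu = 1
print(statistics.stdev(means))   # should be close to sigma / sqrt(n) = 1 / sqrt(30)
```

A histogram of `means` would look roughly bell-shaped even though the individual draws are strongly skewed.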
Example: From the introductory case, Anne wants to determine if the marketing campaign has had a lingering effect on the amount of money customers spend on iced coffee.
Before the campaign, $\mu = \$4.18$ and $\sigma = \$0.84$. Based on 50 customers sampled after the campaign, $\bar{x} = \$4.26$.
Let's find $P(\bar{X} \ge 4.26)$. Since $n \ge 30$, the central limit theorem states that $\bar{X}$ is approximately normal. So,
$P(\bar{X} \ge 4.26) = P\left(Z \ge \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}\right) = P\left(Z \ge \frac{4.26 - 4.18}{0.84/\sqrt{50}}\right) = P(Z \ge 0.67) = 1 - 0.7486 = 0.2514$
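The same probability can be computed without a z table. This is a sketch using `statistics.NormalDist` from the Python standard library; the helper name is hypothetical. Because the code does not round z to two decimals first, its result differs slightly from the table-based 0.2514.

```python
import math
from statistics import NormalDist

def upper_tail_prob_of_mean(x_bar, mu, sigma, n):
    """Return (z, P(X-bar >= x_bar)) using the normal approximation."""
    z = (x_bar - mu) / (sigma / math.sqrt(n))
    return z, 1 - NormalDist().cdf(z)

# Iced coffee example: mu = 4.18, sigma = 0.84, n = 50, x_bar = 4.26
z, p = upper_tail_prob_of_mean(4.26, 4.18, 0.84, 50)
print(f"z = {z:.2f}, probability = {p:.4f}")
```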
7.3 The Sampling Distribution of the Sample Proportion
LO 7.7 Describe the properties of the sampling distribution of the sample proportion.
Estimator
The sample proportion $\bar{P}$ is used to estimate the population parameter p.
Estimate
A particular value of the estimator, denoted $\bar{p}$.
The Expected Value and Standard Deviation of the Sample Proportion
The expected value of $\bar{P}$: $E(\bar{P}) = p$
The standard deviation of $\bar{P}$: $SD(\bar{P}) = \sqrt{\frac{p(1-p)}{n}}$
The Central Limit Theorem for the Sample Proportion
For any population proportion p, the sampling distribution of $\bar{P}$ is approximately normal if the sample size n is sufficiently large.
As a general guideline, the normal distribution approximation is justified when $np \ge 5$ and $n(1-p) \ge 5$.
If $\bar{P}$ is normal, we can transform it into the standard normal random variable as
$Z = \frac{\bar{P} - E(\bar{P})}{SD(\bar{P})} = \frac{\bar{P} - p}{\sqrt{p(1-p)/n}}$
Therefore any value $\bar{p}$ on $\bar{P}$ has a corresponding value z on Z given by
$z = \frac{\bar{p} - p}{\sqrt{p(1-p)/n}}$
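$P(\bar{P} \ge 0.46) = P\left(Z \ge \frac{\bar{p} - p}{\sqrt{p(1-p)/n}}\right) = P\left(Z \ge \frac{0.46 - 0.43}{\sqrt{0.43(1 - 0.43)/50}}\right) = P(Z \ge 0.43) = 1 - 0.6664 = 0.3336$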
7.4 The Finite Population Correction Factor
LO 7.8 Use a finite population correction factor.
The Finite Population Correction Factor for the Sample Proportion
Used to reduce the sampling variation of the sample proportion $\bar{P}$.
The resulting standard deviation is
$SD(\bar{P}) = \sqrt{\frac{p(1-p)}{n}}\sqrt{\frac{N-n}{N-1}}$
Example: A large introductory marketing class with 340 students has been divided up into 10 groups. Connie is in a group of 34 students that averaged 72 on the midterm. The class average was 73 with a standard deviation of 10.
The population parameters are: $\mu = 73$ and $\sigma = 10$.
$E(\bar{X}) = \mu = 73$, but since n = 34 is more than 5% of the population size N = 340, we need to use the finite population correction factor.
$SD(\bar{X}) = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{10}{\sqrt{34}}\sqrt{\frac{340-34}{340-1}} = 1.63$
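The correction can be wrapped in a small helper. This is a sketch only: the function name is invented, and the 5% cutoff is coded to match the wording of the example above ("more than 5%").

```python
import math

def sd_of_sample_mean(sigma, n, N):
    """Standard deviation of the sample mean, applying the finite
    population correction when n is more than 5% of N."""
    sd = sigma / math.sqrt(n)
    if n > 0.05 * N:  # cutoff follows the example's wording (assumption)
        sd *= math.sqrt((N - n) / (N - 1))
    return sd

# Connie's class: sigma = 10, n = 34, N = 340
print(round(sd_of_sample_mean(10, 34, 340), 2))
```

Without the correction the value would be 10 / sqrt(34), about 1.71, so the correction reduces the sampling variation as the slide states.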
7.5 Statistical Quality Control
LO 7.9 Construct and interpret control charts for quantitative and qualitative data.
Two approaches to statistical quality control:
Acceptance Sampling
Detection Approach
Acceptance Sampling
Used at the completion of a production process or
service.
If a particular product does not conform to certain
specifications, then it is either discarded or
repaired.
Disadvantages
It is costly to discard or repair a product.
Detection Approach
Inspection occurs during the production process
in order to detect any nonconformance to
specifications.
Goal is to determine whether the production
process should be continued or adjusted before
producing a large number of defects.
Types of variation:
Chance variation.
Assignable variation.
Types of Variation
Chance variation (common variation) is:
Caused by a number of randomly occurring events that
are part of the production process.
Not controllable by the individual worker or machine.
Expected, so not a source of alarm as long as its
magnitude is tolerable and the end product meets
specifications.
Assignable variation (special cause variation) is:
Caused by specific events or factors that can usually be
identified and eliminated.
Identified and corrected or removed.
Control Charts
Developed by Walter A. Shewhart.
A plot of calculated statistics of the production
process over time.
Production process is "in control" if the calculated
statistics fall in an expected range.
Production process is "out of control" if calculated
statistics reveal an undesirable trend.
For quantitative data: the $\bar{x}$ chart.
Control Charts for Quantitative Data
$\bar{x}$ Control Charts
Centerline: the mean when the process is under control.
Upper control limit: set at $+3\,SD(\bar{X})$ from the mean.
Points falling above the upper control limit are considered to be out of control.
Lower control limit: set at $-3\,SD(\bar{X})$ from the mean.
Points falling below the lower control limit are considered to be out of control.
Upper control limit (UCL): $\mu + 3\frac{\sigma}{\sqrt{n}}$
Lower control limit (LCL): $\mu - 3\frac{\sigma}{\sqrt{n}}$
[Figure: sample means plotted over time around the centerline, between the UCL and the LCL.]
Process is in control: all points fall within the control limits.
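The control limits are simple to compute. The sketch below reuses the pizza parameters ($\mu = 16$, $\sigma = 0.8$) with samples of size 4; the list of sample means is invented purely for illustration, and the function names are hypothetical. It only flags points outside the limits and does not look for trends.

```python
import math

def xbar_control_limits(mu, sigma, n):
    """Return (LCL, centerline, UCL) for an x-bar chart."""
    half_width = 3 * sigma / math.sqrt(n)
    return mu - half_width, mu, mu + half_width

def points_out_of_control(sample_means, lcl, ucl):
    """Indices of sample means that fall outside the control limits."""
    return [i for i, m in enumerate(sample_means) if m < lcl or m > ucl]

lcl, center, ucl = xbar_control_limits(mu=16, sigma=0.8, n=4)
sample_means = [16.1, 15.7, 17.4, 16.0]  # hypothetical data
print(lcl, center, ucl)
print(points_out_of_control(sample_means, lcl, ucl))
```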
Control Charts for Qualitative Data
$\bar{p}$ chart (fraction defective or percent defective chart).
Tracks proportion of defects in a production process.
Relies on central limit theorem for normal approximation for the sampling distribution of the sample proportion.
Centerline: the mean when the process is under control.
Upper control limit: set at $+3\,SD(\bar{P})$ from the centerline.
Points falling above the upper control limit are considered to be out of control.
Lower control limit: set at $-3\,SD(\bar{P})$ from the centerline.
Points falling below the lower control limit are considered to be out of control.
$\bar{p}$ Control Charts
Upper control limit (UCL): $p + 3\sqrt{\frac{p(1-p)}{n}}$
[Figure: sample proportions plotted around the centerline, with the UCL marked.]
Business Statistics: Communicating with Numbers
By Sanjiv Jaggia and Alison Kelly
McGraw-Hill/Irwin Copyright © 2013 by The McGraw-Hill Companies, Inc. All rights reserved.
Chapter 8 Learning Objectives (LOs)
LO 8.1: Discuss point estimators and their desirable properties.
LO 8.2: Explain an interval estimator.
LO 8.3: Calculate a confidence interval for the population mean when
the population standard deviation is known.
LO 8.4: Describe the factors that influence the width of a confidence
interval.
LO 8.5: Discuss features of the t distribution.
LO 8.6: Calculate a confidence interval for the population mean when
the population standard deviation is not known.
LO 8.7: Calculate a confidence interval for the population proportion.
LO 8.8: Select a sample size to estimate the population mean and the
population proportion.
Fuel Usage of “Ultra-Green” Cars
A car manufacturer advertises that its new
“ultra-green” car obtains an average of 100 mpg
and, based on its fuel emissions, has earned an
A+ rating from the Environmental Protection
Agency.
Pinnacle Research, an independent consumer
advocacy firm, obtains a sample of 25 cars for
testing purposes.
Each car is driven the same distance in identical
conditions in order to obtain the car's mpg.
The mpg for each “Ultra-Green” car is given below.
8.1 Point Estimators and Their Properties
LO 8.1 Discuss point estimators and their desirable properties.
Point Estimator
A function of the random sample used to make
inferences about the value of an unknown population
parameter.
For example, $\bar{X}$ is a point estimator for $\mu$ and $\bar{P}$ is a point estimator for p.
Point Estimate
The value of the point estimator derived from a given sample.
For example, $\bar{x} = 96.5$ is a point estimate of the mean mpg for all ultra-green cars.
Properties of Point Estimators
Unbiased
An estimator is unbiased if its expected value equals
the unknown population parameter being estimated.
Efficient
An unbiased estimator is efficient if its standard error is
lower than that of other unbiased estimators.
Consistent
An estimator is consistent if it approaches the unknown
population parameter being estimated as the sample
size grows larger.
Properties of Point Estimators Illustrated:
Unbiased Estimators
The distributions of unbiased (U1) and biased (U2)
estimators.
Properties of Point Estimators Illustrated:
Efficient Estimators
The distributions of efficient (V1) and less efficient
(V2) estimators.
Properties of Point Estimators Illustrated:
Consistent Estimator
The distribution of a consistent estimator $\bar{X}$ for various sample sizes.
8.2 Confidence Interval of the Population Mean When $\sigma$ Is Known
LO 8.2 Explain an interval estimator.
Confidence Interval—provides a range of
values that, with a certain level of confidence,
contains the population parameter of interest.
Also referred to as an interval estimate.
Construct a confidence interval as:
Point estimate ± Margin of error.
Margin of error accounts for the variability of the
estimator and the desired confidence level of the
interval.
LO 8.3 Calculate a confidence interval for the population mean when the
population standard deviation is known.
We get $P\left(-1.96 \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le 1.96\right) = 0.95$.
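Constructing a Confidence Interval for $\mu$ When $\sigma$ Is Known
$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ or $\left[\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right]$
$z_{\alpha/2}$ is the z value associated with the probability of $\alpha/2$ in the upper tail.
Confidence Intervals:
90%: $\alpha = 0.10$, $\alpha/2 = 0.05$, $z_{\alpha/2} = z_{.05} = 1.645$.
95%: $\alpha = 0.05$, $\alpha/2 = 0.025$, $z_{\alpha/2} = z_{.025} = 1.96$.
99%: $\alpha = 0.01$, $\alpha/2 = 0.005$, $z_{\alpha/2} = z_{.005} = 2.575$.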
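The critical values can also be obtained from the inverse of the standard normal distribution function. This is a sketch using `statistics.NormalDist.inv_cdf` from the Python standard library; the function name `z_critical` is an illustrative choice. The exact 99% value is about 2.576, slightly different from the table-based 2.575 used on the slide.

```python
from statistics import NormalDist

def z_critical(confidence):
    """z value with alpha/2 in the upper tail for a given confidence level."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_critical(level):.3f}")
```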
LO 8.4 Describe the factors that influence the width of a confidence interval.
The Width of a Confidence Interval is influenced by:
I. For a given confidence level $100(1-\alpha)\%$ and sample size n, the width of the interval is wider, the greater the population standard deviation $\sigma$.
Example: Let the standard deviation of the population of cereal boxes of Granola Crunch be 0.05 instead of 0.03. Compute a 95% confidence interval based on the same sample information.
$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 1.02 \pm 1.96\frac{0.05}{\sqrt{25}} = 1.02 \pm 0.020$
II. For a given confidence level $100(1-\alpha)\%$ and population standard deviation $\sigma$, the width of the interval is wider, the smaller the sample size n.
Example: Instead of 25 observations, let the sample be based on 16 cereal boxes of Granola Crunch. Compute a 95% confidence interval using a sample mean of 1.02 pounds and a population standard deviation of 0.03.
$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 1.02 \pm 1.96\frac{0.03}{\sqrt{16}} = 1.02 \pm 0.015$
III. For a given sample size n and population standard deviation $\sigma$, the width of the interval is wider, the greater the confidence level $100(1-\alpha)\%$.
Example: Instead of a 95% confidence interval, compute a 99% confidence interval based on the information from the sample of Granola Crunch cereal boxes.
$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 1.02 \pm 2.575\frac{0.03}{\sqrt{25}} = 1.02 \pm 0.015$
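The three factors can be compared side by side against the baseline implied by the examples (sample mean 1.02, $\sigma = 0.03$, n = 25, 95% confidence). This is a rough sketch; the scenario labels and the function name are illustrative.

```python
import math

def margin_of_error(z, sigma, n):
    """Margin of error z * sigma / sqrt(n) for a mean with known sigma."""
    return z * sigma / math.sqrt(n)

x_bar = 1.02
scenarios = {
    "baseline (sigma=0.03, n=25, 95%)": (1.96, 0.03, 25),
    "larger sigma (0.05)": (1.96, 0.05, 25),
    "smaller n (16)": (1.96, 0.03, 16),
    "higher confidence (99%)": (2.575, 0.03, 25),
}
for label, (z, sigma, n) in scenarios.items():
    moe = margin_of_error(z, sigma, n)
    print(f"{label}: [{x_bar - moe:.3f}, {x_bar + moe:.3f}], margin {moe:.4f}")
```

Each change (larger $\sigma$, smaller n, higher confidence) produces a larger margin than the baseline, which is the point of the three slides.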
8.3 Confidence Interval of the Population Mean When $\sigma$ Is Unknown
LO 8.5 Discuss features of the t distribution.
The t Distribution
If repeated samples of size n are taken from a normal population with a finite variance, then the statistic $T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$ follows the t distribution with $(n - 1)$ degrees of freedom, df.
Degrees of freedom determine the extent of the broadness of the tails of the distribution; the fewer the degrees of freedom, the broader the tails.
LO 8.6 Calculate a confidence interval for the population mean when
the population standard deviation is not known.
8.4 Confidence Interval of the Population Proportion
LO 8.7 Calculate a confidence interval for the population proportion.
Thus, a $100(1-\alpha)\%$ confidence interval of the population proportion is
$\bar{p} \pm z_{\alpha/2}\sqrt{\frac{\bar{p}(1-\bar{p})}{n}}$ or $\left[\bar{p} - z_{\alpha/2}\sqrt{\frac{\bar{p}(1-\bar{p})}{n}},\ \bar{p} + z_{\alpha/2}\sqrt{\frac{\bar{p}(1-\bar{p})}{n}}\right]$
Example: Recall that Jared Beane wants to estimate the proportion of all ultra-green cars that obtain over 100 mpg. Use the sample information to construct a 90% confidence interval of the population proportion.
Solution: Note that $\bar{p} = 7/25 = 0.28$. In addition, the normality assumption is met since $n\bar{p} \ge 5$ and $n(1-\bar{p}) \ge 5$. Thus,
$\bar{p} \pm z_{\alpha/2}\sqrt{\frac{\bar{p}(1-\bar{p})}{n}} = 0.28 \pm 1.645\sqrt{\frac{0.28(1 - 0.28)}{25}} = 0.28 \pm 0.148$
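The interval for the ultra-green example can be reproduced as follows. This is a minimal sketch; `proportion_ci` is a hypothetical helper that also checks the normality guideline ($n\bar{p} \ge 5$ and $n(1-\bar{p}) \ge 5$) before computing the interval.

```python
import math

def proportion_ci(successes, n, z):
    """Confidence interval for a population proportion (normal approximation)."""
    p_bar = successes / n
    if n * p_bar < 5 or n * (1 - p_bar) < 5:
        raise ValueError("sample too small for the normal approximation")
    moe = z * math.sqrt(p_bar * (1 - p_bar) / n)
    return p_bar - moe, p_bar + moe

# 7 of 25 cars obtain over 100 mpg; 90% confidence uses z = 1.645
low, high = proportion_ci(7, 25, 1.645)
print(f"[{low:.3f}, {high:.3f}]")
```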
8.5 Selecting a Useful Sample Size
LO 8.8 Select a sample size to estimate the population mean and the population proportion.
Selecting n to Estimate $\mu$
Consider a confidence interval for $\mu$ with a known $\sigma$ and let D denote the desired margin of error.
Since $D = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$, we may rearrange to get $n = \left(\frac{z_{\alpha/2}\,\sigma}{D}\right)^2$.
For a desired margin of error D, the minimum sample size n required to estimate a $100(1-\alpha)\%$ confidence interval of the population mean $\mu$ is
$n = \left(\frac{z_{\alpha/2}\,\hat{\sigma}}{D}\right)^2$
where $\hat{\sigma}$ is a reasonable estimate of $\sigma$ in the planning stage.
Example: Recall that Jared Beane wants to construct a 90% confidence interval of the mean mpg of all ultra-green cars.
Suppose Jared would like to constrain the margin of error to within 2 mpg. Further, the lowest mpg in the population is 76 mpg and the highest is 118 mpg.
How large a sample does Jared need to compute the 90% confidence interval of the population mean?
Using $\hat{\sigma} = \text{range}/4 = (118 - 76)/4 = 10.50$:
$n = \left(\frac{z_{\alpha/2}\,\hat{\sigma}}{D}\right)^2 = \left(\frac{1.645 \times 10.50}{2}\right)^2 = 74.58$, which is rounded up to 75.
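The calculation can be sketched in Python. The range/4 rule for $\hat{\sigma}$ is an assumption here; it is not stated in the recovered slide text, but it reproduces the 10.50 used in the example. The function name is illustrative.

```python
import math

def sample_size_for_mean(z, sigma_hat, D):
    """Minimum n so that the margin of error for the mean is at most D."""
    return math.ceil((z * sigma_hat / D) ** 2)

sigma_hat = (118 - 76) / 4  # range / 4 (assumed planning estimate)
n_required = sample_size_for_mean(1.645, sigma_hat, 2)
print(sigma_hat, n_required)
```

Rounding up with `math.ceil` matters: rounding 74.58 down to 74 would leave the margin of error slightly above the target.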
Selecting n to Estimate p
Consider a confidence interval for p and let D denote the desired margin of error.
Since $D = z_{\alpha/2}\sqrt{\frac{\bar{p}(1-\bar{p})}{n}}$, where $\bar{p}$ is the sample proportion, we may rearrange to get $n = \left(\frac{z_{\alpha/2}}{D}\right)^2 \bar{p}(1-\bar{p})$.
Since $\bar{p}$ comes from a sample, we must use a reasonable estimate of p, that is, $\hat{p}$.
For a desired margin of error D, the minimum sample size n required to estimate a $100(1-\alpha)\%$ confidence interval of the population proportion p is
$n = \left(\frac{z_{\alpha/2}}{D}\right)^2 \hat{p}(1-\hat{p})$
where $\hat{p}$ is a reasonable estimate of p in the planning stage.
Example: Recall that Jared Beane wants to construct a 90% confidence interval of the proportion of all ultra-green cars that obtain over 100 mpg.
Jared does not want the margin of error to be more than 0.10.
How large a sample does Jared need for his analysis of the population proportion?
$n = \left(\frac{z_{\alpha/2}}{D}\right)^2 \hat{p}(1-\hat{p}) = \left(\frac{1.645}{0.10}\right)^2 (0.50)(1 - 0.50) = 67.65$, which is rounded up to 68.
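A short sketch of the proportion case follows; the function name is hypothetical. The example uses $\hat{p} = 0.50$, which gives the largest possible value of $\hat{p}(1-\hat{p})$ and therefore a conservative sample size; the second call, using 0.28 from the earlier sample of 25 cars as the planning estimate, is added here only for comparison.

```python
import math

def sample_size_for_proportion(z, p_hat, D):
    """Minimum n so that the margin of error for a proportion is at most D."""
    return math.ceil((z / D) ** 2 * p_hat * (1 - p_hat))

print(sample_size_for_proportion(1.645, 0.50, 0.10))  # conservative choice
print(sample_size_for_proportion(1.645, 0.28, 0.10))  # smaller n with p_hat = 0.28
```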
Chapter 9 Learning Objectives (LOs)
LO 9.1: Define the null hypothesis and the alternative
hypothesis.
LO 9.2: Distinguish between Type I and Type II errors.
LO 9.3: Explain the steps of a hypothesis test using
the p-value approach.
LO 9.4: Explain the steps of a hypothesis test using
the critical value approach.
LO 9.5: Differentiate between the test statistics for the
population mean.
LO 9.6: Specify the test statistic for the population
proportion.
Undergraduate Study Habits
Are today's college students studying hard or
hardly studying?
A recent study asserts that over the past five
decades the number of hours that the average
college student studies each week has been
steadily dropping (The Boston Globe, July 4, 2010).
In 1961, students invested 24 hours per week in
their academic pursuits, whereas today's students
study an average of 14 hours per week.
As dean of a large university in California, Susan
Knight wonders if the study trend is reflective of
students at her university.
Susan randomly selected 35 students to ask about
their average study time per week. Using these
results, Susan wants to
1. Determine if the mean study time of students at her
university is below the 1961 national average of 24 hours
per week.
2. Determine if the mean study time of students at her
university differs from today's national average of 14
hours per week.
9.1 Introduction to Hypothesis Testing
LO 9.1 Define the null hypothesis and the alternative hypothesis.
In statistics we use sample information to make
inferences regarding the unknown population
parameters of interest.
We conduct hypothesis tests to determine if sample
evidence contradicts H0.
On the basis of sample information, we either
"Reject the null hypothesis"
Sample evidence is inconsistent with H0.
"Do not reject the null hypothesis"
Sample evidence is not inconsistent with H0.
We do not have enough evidence to "accept" H0.
Defining the Null Hypothesis and Alternative
Hypothesis
General guidelines:
Null hypothesis, H0, states the status quo.
Alternative hypothesis, HA, states whatever we
wish to establish (i.e., contests the status quo).
Use the following signs in hypothesis tests
One-Tailed versus Two-Tailed Hypothesis
Tests
Two-Tailed Test
Reject H0 on either side of the hypothesized
value of the population parameter.
For example:
H0: $\mu = \mu_0$ versus HA: $\mu \ne \mu_0$
H0: $p = p_0$ versus HA: $p \ne p_0$
The "≠" symbol in HA indicates that both tail areas
of the distribution will be used to make the
decision regarding the rejection of H0.
One-Tailed Test
Reject H0 only on one side of the hypothesized
value of the population parameter.
For example:
H0: $\mu \le \mu_0$ versus HA: $\mu > \mu_0$ (right-tail test)
H0: $\mu \ge \mu_0$ versus HA: $\mu < \mu_0$ (left-tail test)
Note that the inequality in HA determines which
tail area will be used to make the decision
regarding the rejection of H0.
Three Steps to Formulate Hypotheses
1. Identify the relevant population parameter of interest (e.g., $\mu$ or p).
2. Determine whether it is a one- or a two-tailed test.
3. State the null hypothesis H0 and the alternative hypothesis HA.
H0    HA    Test Type
=     ≠     Two-tail
≥     <     One-tail, Left-tail
≤     >     One-tail, Right-tail
Example: A trade group predicts that back-to-school
spending will average $606.40 per family this year.
A different economic model is needed if the
prediction is wrong.
1. Parameter of interest is $\mu$ since we are interested in the average back-to-school spending.
2. Since we want to determine if the population mean differs from $606.40 (i.e., ≠), it is a two-tail test.
3. H0: $\mu = 606.4$
HA: $\mu \ne 606.4$
Example: A television research analyst wishes to
test a claim that more than 50% of the households
will tune in for a TV episode. Specify the null and
the alternative hypotheses to test the claim.
1. Parameter of interest is p since we are interested
in the proportion of households.
2. Since the analyst wants to determine whether p
> 0.50, it is a one-tail test.
3. H0: $p \le 0.50$
HA: $p > 0.50$
LO 9.2 Distinguish between Type I and Type II errors.
This table illustrates the decisions that may
be made when hypothesis testing:
Correct Decisions:
Reject H0 when H0 is false.
Do not reject H0 when H0 is true.
Incorrect Decisions:
Reject H0 when H0 is true (Type I Error).
Do not reject H0 when H0 is false (Type II Error).
Example: Consider the following competing
hypotheses that relate to the court of law.
H0: An accused person is innocent
HA: An accused person is guilty
Consequences of Type I and Type II errors:
Type I error: Conclude that the accused is
guilty when in reality, she is innocent.
Type II error: Conclude that the accused is
innocent when in reality, she is guilty.
9.2 Hypothesis Test of the Population Mean When $\sigma$ Is Known
LO 9.3 Explain the steps of a hypothesis test using the p-value approach.
Hypothesis testing enables us to determine whether
the sample evidence is inconsistent with what is
hypothesized under the null hypothesis (H0).
Basic principle: First assume that H0 is true and
then determine if sample evidence contradicts this
assumption.
Two approaches to hypothesis testing:
The p-value approach.
The critical value approach.
HA: $\mu > 67$
Thus, $\mu_0 = 67$.
Step 2. Given that the population is normally distributed with a known standard deviation, $\sigma = 9$, we compute the value of the test statistic as
$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{71 - 67}{9/\sqrt{25}} = 2.22$
p-value = $P(Z \ge 2.22)$ = 0.0132 or 1.32%
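The p-value approach can be sketched as a small function that handles left-, right- and two-tailed tests. The function name and the `tail` argument are illustrative choices, not notation from the slides. The example call treats the test above as right-tailed, which is consistent with the reported p-value of 0.0132.

```python
import math
from statistics import NormalDist

def z_test_p_value(x_bar, mu0, sigma, n, tail):
    """Return (z, p-value) for a test of the mean with known sigma.
    tail is 'left', 'right', or 'two'."""
    z = (x_bar - mu0) / (sigma / math.sqrt(n))
    cdf = NormalDist().cdf(z)
    if tail == "left":
        p = cdf
    elif tail == "right":
        p = 1 - cdf
    elif tail == "two":
        p = 2 * min(cdf, 1 - cdf)
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return z, p

z_stat, p_value = z_test_p_value(71, 67, 9, 25, tail="right")
print(f"z = {z_stat:.2f}, p-value = {p_value:.4f}")
```

H0 is rejected at significance level $\alpha$ whenever the p-value is less than $\alpha$.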
LO 9.4 Explain the steps of a hypothesis test using the critical value approach.
Two-tailed test: reject H0 if $z > z_{\alpha/2}$ or $z < -z_{\alpha/2}$.
Left-tailed test: reject H0 if $z < -z_{\alpha}$.
Right-tailed test: reject H0 if $z > z_{\alpha}$.
For a two-tailed test, the $100(1-\alpha)\%$ confidence interval is $\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ or $\left[\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right]$.
Reject H0 if $\mu_0 < \bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ or if $\mu_0 > \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$.
9.3 Hypothesis Test of the Population Mean When $\sigma$ Is Unknown
LO 9.5 Differentiate between the test statistics for the population mean.
HA: $\mu < 24$
Thus, $\mu_0 = 24$.
Step 2. Because n = 35 (i.e., $n \ge 30$), we can assume that the sample mean is normally distributed and thus compute the value of the test statistic as
$t_{34} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{16.37 - 24}{7.22/\sqrt{35}} = -6.25$
9.4 Hypothesis Test of the Population Proportion
LO 9.6 Specify the test statistic for the population proportion.
Example: n = 180, x = 67, $p_0 = 0.4$
Step 1. H0: $p \ge 0.4$, HA: $p < 0.4$
Step 2. Compute the value of the test statistic.
First verify that the sample is large enough:
$np_0 = 180(0.4) = 72 \ge 5$
$n(1 - p_0) = 180(0.6) = 108 \ge 5$
Compute the test statistic using $\bar{p} = 67/180 = 0.3722$:
$z = \frac{\bar{p} - p_0}{\sqrt{p_0(1 - p_0)/n}} = \frac{0.3722 - 0.4}{\sqrt{0.4(1 - 0.4)/180}} = -0.76$
Step 3. Compute the p-value.
Based on HA: $p < 0.4$, this is a left-tailed test.
Compute the p-value as:
$P(Z \le z) = P(Z \le -0.76) = 0.2236$.
Let the significance level $\alpha = 0.10$.
Step 4. State the conclusion and interpret the results.
p-value = 0.2236 > $\alpha$ = 0.10.
Do not reject H0: $p \ge 0.4$; we cannot conclude HA: $p < 0.4$.
Thus, the magazine's claim that fewer than 40% of households in the United States have changed their lifestyles because of escalating gas prices is not justified by the sample data.
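The four steps of this example can be traced in code. This is a sketch under the same assumptions as the example (n = 180, x = 67, $p_0 = 0.4$, $\alpha = 0.10$, left-tailed test); the function name is hypothetical. The computed p-value differs slightly from 0.2236 because z is not rounded to two decimals.

```python
import math
from statistics import NormalDist

def left_tailed_proportion_test(x, n, p0):
    """Return (z, p-value) for H0: p >= p0 versus HA: p < p0."""
    if n * p0 < 5 or n * (1 - p0) < 5:
        raise ValueError("sample too small for the normal approximation")
    p_bar = x / n
    z = (p_bar - p0) / math.sqrt(p0 * (1 - p0) / n)
    return z, NormalDist().cdf(z)

alpha = 0.10
z_prop, p_val = left_tailed_proportion_test(67, 180, 0.4)
reject_h0 = p_val < alpha
print(f"z = {z_prop:.2f}, p-value = {p_val:.4f}, reject H0: {reject_h0}")
```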