
Business Statistics: Communicating with Numbers
By Sanjiv Jaggia and Alison Kelly
McGraw-Hill/Irwin Copyright © 2013 by The McGraw-Hill Companies, Inc. All rights reserved.
Chapter 1 Learning Objectives (LOs)
LO 1.1: Describe the importance of statistics.

LO 1.2: Differentiate between descriptive statistics
and inferential statistics.
LO 1.3: Explain the need for sampling and discuss
various data types.
LO 1.4: Describe variables and various types of
measurement scales.
1-2
Tween Survey
 Survey questions asked to 20 tweens:
 Q1. Which radio station was playing on
your drive to the ski resort?

 Q2. Rate the quality of the food at the resort
on a scale of 1 to 4.
 Q3. What time should the main dining area
close?
 Q4. How much of your own money did you
spend at the lodge today?
1-3
Tween Survey
 Here are the survey responses from the 20 tweens.
1-4
Tween Survey
1. Classify the tweens' responses into the appropriate measurement scale.
2. Extract useful information from each measurement scale.
3. Provide management with suggestions for improvement.
1-5
1.1 The Relevance of Statistics
LO 1.1 Describe the importance of statistics.
 With knowledge of statistics:

- Avoid risk of making uninformed
decisions and costly mistakes
- Differentiate between sound statistical
conclusions and questionable
conclusions.
1-6
LO 1.1 1.1 The Relevance of Statistics
- Example 1. A newspaper headline asks 'What global warming?' after record amounts of snow in 2010.
- Problem with Conclusion: It is incorrect to draw a conclusion based on one data point.
1-7
- Example 2. A gambler predicts that he will roll a 7 on his next roll of the dice since he was unsuccessful in the last three rolls.
- Problem with Conclusion. The rolls are independent, so the probability of rolling a 7 stays constant with each roll of the dice.
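As a small check of this point, the sketch below enumerates every outcome of one roll. It assumes that "the dice" means two fair six-sided dice, which the slide does not state. The probability of a 7 is computed from the outcomes of a single roll only, so earlier rolls never enter the calculation.

```python
from fractions import Fraction
from itertools import product

# Assumption: two fair six-sided dice (not stated on the slide).
faces = range(1, 7)
outcomes = list(product(faces, repeat=2))      # all 36 equally likely rolls
sevens = [roll for roll in outcomes if sum(roll) == 7]

# Probability of a 7 on any single roll, regardless of earlier rolls.
p_seven = Fraction(len(sevens), len(outcomes))
print(p_seven)
```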
1-8
 Example 3. A Boston Globe poll reported a
15-point lead for Martha Coakley in the
election for U.S. senator for Massachusetts,
implying an easy win for Coakley. Nine days
later, Scott Brown wins.
- Problem with Conclusion. The Globe's prediction was based on old information and included people who were unlikely to vote.
1-9
 Example 4. The CFO of Starbucks Corp. claims
that business is picking up since sales at stores
open at least a year climbed 4% in the quarter
ended December 27, 2009.
- Problem with Conclusion. The CFO overstated the company's financial position by failing to mention that Starbucks closed more than 800 stores over the past few years.
1-10
- Example 5. Researchers showed that infants who sleep with a nightlight are much more likely to develop myopia.
- Problem with Conclusion. This is an example of the correlation-to-causation fallacy. Even if two variables are highly correlated, one does not necessarily cause the other.
1-11
1.2 What Is Statistics?
LO 1.2 Differentiate between descriptive statistics and inferential statistics.
- Statistics is the methodology of extracting useful information from a data set.
- To do good statistics, you must
  - Find the right data.
  - Use the appropriate statistical tools.
  - Clearly communicate the numerical information in written language.
1-12
LO 1.2 1.2 What Is Statistics?
 Two branches of statistics
 Descriptive Statistics
 collecting, organizing, and presenting the
data.
 Inferential Statistics
 drawing conclusions about a population
based on sample data from that population.
1-13
LO 1.2 1.2 What Is Statistics?
 Population
 Consists of all items of interest.
 Sample
 A subset of the population.
 A sample statistic is calculated from the
sample data and is used to make inferences
about the population parameter.
1-14
The Need for Sampling
LO 1.3 Explain the need for sampling and discuss various data types.
 Reasons for sampling from the population

 Too expensive to gather information on the
entire population
 Often impossible to gather information on the
entire population
1-15
LO 1.3 Types of Data
 Cross-sectional data
 Data collected by recording a characteristic of
many subjects at the same point in time, or without
regard to differences in time.
 Subjects might include individuals, households,
firms, industries, regions, and countries.
 The survey data from the Introductory Case is an
example of cross-sectional data.
1-16
LO 1.3 Types of Data
 Time series data
 Data collected by recording a characteristic of a
subject over several time periods.
 Data can include daily, weekly, monthly, quarterly,
or annual observations.
- The graph on this slide plots the U.S. GDP growth rate from 1980 to 2010; it is an example of time series data.
1-17
LO 1.3 Getting Started on the Web
 There is an abundance of data on the
Internet. Here are a few websites for data.
1-18
1.3 Variables and Scales of Measurement
LO 1.4 Describe variables and various types of measurement scales.
 A variable is the general characteristic

being observed on an object of interest.
 Types of Variables
 Qualitative – gender, race, political affiliation
 Quantitative – test scores, age, weight
 Discrete
 Continuous
1-19
LO 1.4 1.3 Variables and Scales of Measurement
 Types of Quantitative Variables
 Discrete
 A discrete variable assumes a
countable number of distinct values.
 Examples: Number of children in a
family, number of points scored in a
basketball game.
1-20
 Types of Quantitative Variables
 Continuous
 A continuous variable can assume an
infinite number of values within some
interval.
 Examples: Weight, height, investment
return.
1-21
- Scales of Measurement
  - Qualitative variables: Nominal, Ordinal
  - Quantitative variables: Interval, Ratio
1-22
 The Nominal Scale
 The least sophisticated level of measurement.
- Data are simply categories for grouping the data.
- Qualitative values may be converted to quantitative values for analysis purposes.
1-23
 The Ordinal Scale

 Ordinal data may be categorized and ranked with
respect to some characteristic or trait.
 For example, instructors are often evaluated on an ordinal
scale (excellent, good, fair, poor).
 Differences between categories are meaningless

because the actual numbers used may be arbitrary.
 There is no objective way to interpret the difference
between instructor quality.
1-24
Example: Tweens Survey

 What is the scale of measurement of the radio station data?
Solution: These are nominal data; the values in the data differ merely in name or label.
1-25
 How are the data based on the ratings of the food quality
similar to or different from the radio station data?
Solution: These are ordinal since they can be both

categorized and ranked.
1-26
 The Interval Scale
 Data may be categorized and ranked with respect
to some characteristic or trait.
 Differences between interval values are equal and
meaningful. Thus the arithmetic operations of
addition and subtraction are meaningful.
- No "absolute 0" or starting point is defined. Meaningful ratios may not be obtained.
1-27
 The Interval Scale
- For example, consider the Fahrenheit scale of temperature.
- This scale is interval because the data are ranked and differences (+ or −) may be obtained.
- But there is no "absolute 0" (What does 0°F mean?). What does the ratio 80°F / 40°F mean?
1-28
 The Ratio Scale
 The strongest level of measurement.
 Ratio data may be categorized and ranked with
respect to some characteristic or trait.
- Differences between ratio values are equal and meaningful.
- There is an "absolute 0" or defined starting point. "0" does mean "the absence of …" Thus, meaningful ratios may be obtained.
1-29
 The Ratio Scale
 The following variables are measured on a ratio
scale:
 General Examples: Weight, Time, and Distance
 Business Examples: Sales, Profits, and Inventory
Levels
1-30
 How are the time data classified? In what ways do the time
data differ from ordinal data? What is a potential weakness
of this measurement scale?
- Solution: Clock time responses are on an interval scale. With this type of data we can calculate meaningful differences; however, there is no apparent zero point.
1-31
 What is the measurement scale of the money data? Why is
it considered the most sophisticated form of data?
- Solution: Since the tweens' responses are in dollar amounts, these are ratio-scaled data; ratio-scaled data have a natural zero point, which allows the calculation of ratios.
1-32
Synopsis of Tween Survey
 60% of the tweens listened to KISS108. The resort
may want to direct its advertising dollars to this
station.
 55% of the tweens felt that the food was, at best,
fair.
 95% of the tweens would like the dining area to
remain open later.
 85% of the tweens spent their own money at the
lodge.
1-33
Chapter 2 Learning Objectives (LOs)
LO 2.1: Summarize qualitative data by forming
frequency distributions.
LO 2.2: Construct and interpret pie charts and bar
charts.
LO 2.3: Summarize quantitative data by forming
frequency distributions.
LO 2.4: Construct and interpret histograms, polygons,
and ogives.
LO 2.5: Construct and interpret a stem-and-leaf
diagram.
LO 2.6: Construct and interpret a scatterplot.
1-35
House Prices in Southern California
 A relocation specialist for a real estate firm in
Mission Viejo, CA gathers recent house sales
data for a client from Seattle, WA.
 The table below shows the sale price (in
$1,000s) for 36 single-family houses.
1-36
House Prices in Southern California
Use the sample information to:
1. Summarize the range of house prices.
2. Comment on where house prices tend to cluster.
3. Calculate percentages to compare house prices.
1-37
2.1 Summarizing Qualitative Data
LO 2.1 Summarize qualitative data by forming frequency distributions.
 A frequency distribution for qualitative data
groups data into categories and records how many
observations fall into each category.
 Weather conditions in Seattle, WA during
February 2010.
1-38
LO 2.1 2.1 Summarizing Qualitative Data
 Categories: Rainy, Sunny, or Cloudy.
- For each category's frequency, count the days that fall in that category.
- Calculate relative frequency by dividing each category's frequency by the sample size.
Weather   Frequency   Relative Frequency
Cloudy     1          1/28 = 0.036
Rainy     20          20/28 = 0.714
Sunny      7          7/28 = 0.250
Total     28          28/28 = 1.000
1-39
 To express relative frequencies in terms of
percentages, multiply each proportion by 100%.
Weather   Frequency   Relative Frequency   Percentage
Cloudy     1          1/28 = 0.036         0.036 x 100 = 3.6%
Rainy     20          20/28 = 0.714        0.714 x 100 = 71.4%
Sunny      7          7/28 = 0.250         0.250 x 100 = 25.0%
Total     28          28/28 = 1.000        1.000 x 100 = 100%
- Note that the total of the proportions must add to 1.0 and the total of the percentages must add to 100%.
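The frequency, relative frequency, and percentage columns can be reproduced with a few lines of Python. This is only a sketch: the list of daily observations is rebuilt from the counts on the slide (1 cloudy, 20 rainy, 7 sunny), and the variable names are illustrative.

```python
from collections import Counter

# Daily weather for February 2010, rebuilt from the counts on the slide.
days = ["Cloudy"] * 1 + ["Rainy"] * 20 + ["Sunny"] * 7

counts = Counter(days)                                  # frequency per category
n = sum(counts.values())                                # sample size
relative = {k: v / n for k, v in counts.items()}        # proportions
percent = {k: 100 * p for k, p in relative.items()}     # percentages

for category in sorted(counts):
    print(f"{category:<8}{counts[category]:>3}"
          f"{relative[category]:>8.3f}{percent[category]:>7.1f}%")
```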
1-40
 A pie chart is a segmented circle whose segments
portray the relative frequencies of the categories of
some qualitative variable.
- In this example, the variable Region is proportionally divided into 4 parts.
1-41
 A bar chart depicts the frequency or the

relative frequency for each category of the
qualitative data as a bar rising vertically from
the horizontal axis.
- For example, Adidas' sales may be proportionally compared for each Region over these two periods.
1-42
2.2 Summarizing Quantitative Data
LO 2.3 Summarize quantitative data by forming frequency distributions.
 A frequency distribution for quantitative data

groups data into intervals called classes, and
records the number of observations that fall into
each class.
 Guidelines when constructing frequency
distribution:
 Classes are mutually exclusive.
 Classes are exhaustive.
1-43
LO 2.3 2.2 Summarizing Quantitative Data
 The number of classes usually ranges from 5
to 20.
- Approximating the class width:
\[ \text{Class width} \approx \frac{\text{Largest value} - \text{Smallest value}}{\text{Number of classes}} \]
1-44
 The raw data from the Introductory Case has been
converted into a frequency distribution in the
following table.
Class (in $1,000s)   Frequency
300 up to 400         4
400 up to 500        11
500 up to 600        14
600 up to 700         5
700 up to 800         2
Total                36
1-45
- Question: What is the price range over this time period?
  - $300,000 up to $800,000
- Question: How many of the houses sold in the $500,000 up to $600,000 range?
  - 14 houses
1-46
 A cumulative frequency distribution specifies how
many observations fall below the upper limit of a
particular class.
 Question: How many of the houses sold for less than

$600,000?
 29 houses
1-47
 A relative frequency distribution identifies the
proportion or fraction of values that fall into each
class.
\[ \text{Class relative frequency} = \frac{\text{Class frequency}}{\text{Total number of observations}} \]
 A cumulative relative frequency distribution

gives the proportion or fraction of values that fall
below the upper limit of each class.
1-48
 Here are the relative frequency and the cumulative
relative frequency distributions for the house-price
data.
Class (in $1,000s)   Frequency   Relative Frequency   Cumulative Relative Frequency
300 up to 400         4          4/36 = 0.11          0.11
400 up to 500        11          11/36 = 0.31         0.11 + 0.31 = 0.42
500 up to 600        14          14/36 = 0.39         0.11 + 0.31 + 0.39 = 0.81
600 up to 700         5          5/36 = 0.14          0.11 + 0.31 + 0.39 + 0.14 = 0.95
700 up to 800         2          2/36 = 0.06          0.11 + 0.31 + 0.39 + 0.14 + 0.06 ≈ 1.0
Total                36          1.0
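The same four distributions can be built directly from the class frequencies. The sketch below uses only the counts in the table; the variable names are illustrative. One point worth noticing: computing the cumulative relative frequency from the cumulative counts gives 34/36 ≈ 0.94 for the fourth class, whereas the table's 0.95 comes from adding already-rounded relative frequencies.

```python
from itertools import accumulate

classes = ["300 up to 400", "400 up to 500", "500 up to 600",
           "600 up to 700", "700 up to 800"]
freq = [4, 11, 14, 5, 2]                    # class frequencies from the table

n = sum(freq)                               # total number of observations
cum_freq = list(accumulate(freq))           # cumulative frequency
rel_freq = [f / n for f in freq]            # relative frequency
cum_rel_freq = [c / n for c in cum_freq]    # cumulative relative frequency

for row in zip(classes, freq, cum_freq, rel_freq, cum_rel_freq):
    print("{:<15}{:>4}{:>4}{:>7.2f}{:>7.2f}".format(*row))
```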
1-49
Use the data on the previous slide to answer the
following two questions.
 Question: What percent of the houses sold for at

least $500,000 but not more than $600,000?
 39%
 Question: What percent of the houses sold for

less than $600,000?
 81%
1-50
2.2 Summarizing Quantitative Data
LO 2.4 Construct and interpret histograms, polygons, and ogives.
 Histograms
 Polygons
 Ogives
1-51
 A histogram is a visual representation of a
frequency or a relative frequency distribution.
 Bar height represents the respective class

frequency (or relative frequency).
 Bar width represents the class width.
1-52
 Here are the frequency and relative frequency

histograms for the house-price data.
 Note that the only difference is the y-axis scale.
1-53
 Shape of Distribution: typically symmetric or
skewed
 Symmetric—mirror image on both sides of its
center.
Symmetric Distribution
1-54
 Skewed distribution
 Positively skewed - data
form a long, narrow tail
to the right.
 Negatively skewed -
data form a long,
narrow tail to the left.
1-55
 A polygon is a visual representation of a
frequency or a relative frequency distribution.
 Plot the class midpoints on x-axis and

associated frequency (or relative
frequency) on y-axis.
 Neighboring points are connected with a

straight line.
1-56
 Here is a polygon for the house-price data.
1-57
 An ogive is a visual representation of a
cumulative frequency or a cumulative
relative frequency distribution.
 Plot the cumulative frequency (or cumulative

relative frequency) of each class above the
upper limit of the corresponding class.
 The neighboring points are then connected.
1-58
 Here is an ogive for the house-price data.
 Use the ogive to approximate the percentage of

houses that sold for less than $550,000.
Answer: 60%
1-59
2.3 Stem-and-Leaf Diagrams
LO 2.5 Construct and interpret a stem-and-leaf diagram.
- A stem-and-leaf diagram provides a visual display of quantitative data.
- It gives an overall picture of the data's center and variability.
- Each value of the data set is separated into two parts: the stem consists of the leftmost digits, while the leaf is the last digit.
1-60
LO 2.5 2.3 Stem-and-Leaf Diagrams
 The following data set shows the wealthiest
people in the world and their associated ages.
 The leftmost digit is the stem while the last digit is
the leaf as shown here.
Age = 36
1-61
2.4 Scatterplots
LO 2.6 Construct and interpret a scatterplot.
- A scatterplot is used to determine if two variables are related.
- Each point is a pairing \( (x_i, y_i) \): \( (x_1, y_1), (x_2, y_2) \), etc.
- The scatterplot on this slide shows income against education.
1-62
LO 2.6 2.4 Scatterplots
 Linear relationship: upward or downward-
sloping trend of the data.
 Positive linear
relationship (shown
here): as x increases, so
does y.
 Negative linear
relationship: as x
increases, y decreases.
1-63
 Curvilinear relationship
 As x increases,
y increases at an
increasing (or
decreasing) rate.
 As x increases y
decreases, at an
increasing (or
decreasing) rate.
1-64
 No relationship: data are randomly scattered
with no discernible pattern.
 In this scatterplot, there

is no apparent
relationship between x
and y.
1-65
LOs 2.1, 2.2, and 2.4 Some Excel Commands
 Pie chart or Bar chart: select the relevant
categorical names with respective data, then
choose Insert > Pie > 2-D Pie or Insert > Bar > 2-D
Bar.
 Histogram: select the relevant data, and choose

Data > Data Analysis > Histogram.
- Scatterplot: select the x- and y-coordinates, choose Insert > Scatter, and select the graph at the top left.
1-66
Chapter 3 Learning Objectives (LOs)
LO 3.1: Calculate and interpret the arithmetic mean,

the median, and the mode.
LO 3.2: Calculate and interpret percentiles and a box
plot.
LO 3.3: Calculate and interpret a geometric mean
return and an average growth rate.
LO 3.4: Calculate and interpret the range, the mean
absolute deviation, the variance, the
standard deviation, and the coefficient of
variation.
1-68
LO 3.5: Explain mean-variance analysis and
the Sharpe ratio.
LO 3.6: Apply Chebyshev's Theorem and the
empirical rule.
LO 3.7: Calculate the mean and the variance
for grouped data.
LO 3.8: Calculate and interpret the covariance
and the correlation coefficient.
1-69
Investment Decision
 As an investment counselor at a large bank,
Rebecca Johnson was asked by an
inexperienced investor to explain the differences
between two top-performing mutual funds:
  - Vanguard's Precious Metals and Mining fund (Metals)
  - Fidelity's Strategic Income Fund (Income)
 The investor has collected sample returns for
these two funds for years 2000 through 2009.
These data are presented in the next slide.
1-70
Investment Decision
 Rebecca would like to

1. Determine the typical return of the mutual
funds.
2. Evaluate the investment risk of the mutual
funds.
1-71
3.1 Measures of Central Location
LO 3.1 Calculate and interpret the arithmetic mean, the median, and the mode.
- The arithmetic mean is a primary measure of central location.
- Sample mean:
\[ \bar{x} = \frac{\sum x_i}{n} \]
- Population mean:
\[ \mu = \frac{\sum x_i}{N} \]
1-72
LO 3.1 3.1 Measures of Central Location
Example: Investment Decision
 Use the data in the introductory case to calculate and
interpret the mean return of the Metals fund and the
mean return of the Income fund.
1-73
 The mean is sensitive to outliers.
 Consider the salaries of employees at Acetech.
 This mean does not reflect the typical salary!
1-74
 The median is another measure of central location
that is not affected by outliers.
 When the data are arranged in ascending order,

the median is
 the middle value if the number of observations is

odd, or
 the average of the two middle values if the
number of observations is even.
1-75
- Consider the sorted salaries of employees at Acetech (odd number of observations). With 3 values below and 3 values above, Median = 90,000.
- Consider the sorted data from the Metals fund of the introductory case study (even number of observations).
  - Median = (33.35 + 34.30) / 2 = 33.83%.
1-76
 The mode is another measure of central location.
 The most frequently occurring value in a data set
 Used to summarize qualitative data
 A data set can have no mode, one mode

(unimodal), or many modes (multimodal).
 Consider the salary of employees at Acetech
 The mode is $40,000 since this value appears most

often.
1-77
3.2 Percentiles and Box Plots
LO 3.2 Calculate and interpret percentiles and a box plot.
 In general, the pth percentile divides a data set into
two parts:
 Approximately p percent of the observations have
values less than the pth percentile;
  - Approximately (100 − p) percent of the observations have values greater than the pth percentile.
1-78
LO 3.2 3.2 Percentiles and Box Plots
 Calculating the pth percentile:
 First arrange the data in ascending order.
  - Locate the position, \( L_p \), of the pth percentile by using the formula:
\[ L_p = (n + 1)\,\frac{p}{100} \]
 We use this position to find the percentile as
shown next.
1-79
 Consider the sorted data from the introductory case.
- For the 25th percentile, we locate the position:
\[ L_{25} = (n + 1)\,\frac{p}{100} = (10 + 1)\,\frac{25}{100} = 2.75 \]
- Similarly, for the 75th percentile, we first find:
\[ L_{75} = (n + 1)\,\frac{p}{100} = (10 + 1)\,\frac{75}{100} = 8.25 \]
1-80
Calculating the pth percentile
 Once you find Lp, observe whether or not it is an
integer.
 If Lp is an integer, then the Lpth observation in the
sorted data set is the pth percentile.
 If Lp is not an integer, then interpolate between
two corresponding observations to approximate
the pth percentile.
1-81
- Neither \( L_{25} = 2.75 \) nor \( L_{75} = 8.25 \) is an integer, thus
  - The 25th percentile is located 75% of the distance between the second and third observations, and it is
\[ -7.34 + 0.75\,(8.09 - (-7.34)) = 4.23 \]
  - The 75th percentile is located 25% of the distance between the eighth and ninth observations, and it is
\[ 43.79 + 0.25\,(59.45 - 43.79) = 47.71 \]
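The position-and-interpolate procedure can be sketched as a small Python function. The function name and the clamping at the two ends of the data are assumptions made for illustration. Only the 2nd, 3rd, 5th, 6th, 8th, and 9th sorted values appear in these slides; the other four values in the list below are assumed so that the example has ten observations.

```python
def percentile(sorted_values, p):
    """pth percentile using the position L_p = (n + 1) * p / 100."""
    n = len(sorted_values)
    position = (n + 1) * p / 100       # 1-based position L_p
    lower = int(position)              # integer part of L_p
    fraction = position - lower        # fractional part of L_p
    # Assumption: clamp to the smallest/largest value at the extremes.
    if lower < 1:
        return sorted_values[0]
    if lower >= n:
        return sorted_values[-1]
    below = sorted_values[lower - 1]
    above = sorted_values[lower]
    return below + fraction * (above - below)


# Sorted returns (%); four of these values are assumed, see the note above.
returns = [-56.02, -7.34, 8.09, 18.33, 33.35, 34.30, 36.13, 43.79, 59.45, 76.46]

q1 = percentile(returns, 25)    # 75% of the way from the 2nd to the 3rd value
q3 = percentile(returns, 75)    # 25% of the way from the 8th to the 9th value
print(q1, q3, q3 - q1)
```

When the position is an integer the fractional part is zero, so the function returns that observation itself, which matches the rule stated on the previous slide.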
1-82
 A box plot allows you to:
 Graphically display the distribution of a data set.
 Compare two or more distributions.
 Identify outliers in a data set.
[Figure: box plot with the box, the whiskers, and the outliers (marked *) labeled]
1-83
 The box plot displays 5 summary values:
 S = smallest value
 L = largest value
 Q1 = first quartile = 25th percentile
 Q2 = median = second quartile = 50th percentile
 Q3 = third quartile = 75th percentile
1-84
 Using the results obtained from the Metals fund
data, we can label the box plot with the 5
summary values:
[Figure: box plot labeled with the smallest value, first quartile, second quartile, third quartile, and largest value]
- Note that IQR = Q3 − Q1 = 47.71 − 4.23 = 43.48
1-85
3.2 Percentiles and Box Plots
Detecting outliers
 Calculate IQR = 43.48
 Calculate 1.5 × IQR, or 1.5 × 43.48 = 65.22
There are outliers if

 Q1 – S > 65.22, or if
 L – Q3 > 65.22
 There are no outliers in this data set.
1-86
3.3 The Geometric Mean
LO 3.3 Calculate and interpret a geometric mean return and an
average growth rate.
 Remember that the arithmetic mean is an additive

average measurement.
 Ignores the effects of compounding.
- The geometric mean is a multiplicative average that incorporates compounding. It is used to measure:
 Average investment returns over several years,
 Average growth rates.
1-87
LO 3.3 3.3 The Geometric Mean
- For multiperiod returns \( R_1, R_2, \ldots, R_n \), the geometric mean return \( G_R \) is calculated as:
\[ G_R = \sqrt[n]{(1 + R_1)(1 + R_2)\cdots(1 + R_n)} - 1 \]
where n is the number of multiperiod returns.
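A minimal sketch of the geometric mean return, assuming returns are given as decimals (0.10 for 10%). The function name and the two-year example are hypothetical; they are chosen to show how the arithmetic mean can overstate a compounded return.

```python
import math


def geometric_mean_return(returns):
    """Geometric mean of per-period returns given as decimals."""
    growth = math.prod(1 + r for r in returns)   # compounded growth factor
    return growth ** (1 / len(returns)) - 1


# Hypothetical returns: +100% in year 1, then -50% in year 2.
example = [1.0, -0.5]
arithmetic = sum(example) / len(example)    # 0.25, ignores compounding
geometric = geometric_mean_return(example)  # the investor ends where she started
print(arithmetic, geometric)
```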
1-88
 Using the data from the Metals and Income funds,

we can calculate the geometric mean returns:
1-89
- Computing an average growth rate
  - For growth rates \( g_1, g_2, \ldots, g_n \), the average growth rate \( G_g \) is calculated as
\[ G_g = \sqrt[n]{(1 + g_1)(1 + g_2)\cdots(1 + g_n)} - 1 \]
where n is the number of multiperiod growth rates.
  - For observations \( x_1, x_2, \ldots, x_n \), the average growth rate \( G_g \) is calculated as
\[ G_g = \sqrt[n-1]{\frac{x_n}{x_1}} - 1 \]
where n − 1 is the number of distinct growth rates.
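The simplified formula needs only the first and last observations. The sketch below assumes a hypothetical three-year sales series (not the Adidas figures, which are not reproduced in these slides) that grows by exactly 10% per year.

```python
def average_growth_rate(observations):
    """Average growth rate from the first and last of n observations."""
    first, last = observations[0], observations[-1]
    periods = len(observations) - 1          # n - 1 distinct growth rates
    return (last / first) ** (1 / periods) - 1


# Hypothetical sales series: grows 10% per year for two years.
sales = [100.0, 110.0, 121.0]
rate = average_growth_rate(sales)
print(round(rate, 4))
```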
1-90
 For example, consider the sales for Adidas (in
millions of €) for the years 2005 through 2009:
- Annual growth rates are:
- The average growth rate using the simplified formula is:
1-91
3.4 Measures of Dispersion
LO 3.4 Calculate and interpret the range, the mean absolute deviation,
the variance, the standard deviation, and the coefficient of variation.
 Measures of dispersion gauge the variability of a

data set.
 Measures of dispersion include:

 Range
 Mean Absolute Deviation (MAD)
 Variance and Standard Deviation
 Coefficient of Variation (CV)
1-92
LO 3.4 3.4 Measures of Dispersion
- Range
\[ \text{Range} = \text{Maximum value} - \text{Minimum value} \]
  - It is the simplest measure.
  - It focuses on extreme values.
- Calculate the range using the data from the Metals and Income funds.
1-93
- Mean Absolute Deviation (MAD)
  - MAD is an average of the absolute difference of each observation from the mean.
\[ \text{Sample MAD} = \frac{\sum |x_i - \bar{x}|}{n} \]
\[ \text{Population MAD} = \frac{\sum |x_i - \mu|}{N} \]
1-94
 Calculate MAD using the data from the Metals
fund.
1-95
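- Variance and standard deviation
  - For a given sample,
\[ s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} \quad \text{and} \quad s = \sqrt{s^2} \]
  - For a given population,
\[ \sigma^2 = \frac{\sum (x_i - \mu)^2}{N} \quad \text{and} \quad \sigma = \sqrt{\sigma^2} \]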
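The measures of dispersion defined so far can be computed with Python's standard `statistics` module. The data set below is hypothetical and chosen so the results are easy to verify by hand; note that `variance` and `stdev` divide by n − 1 (sample formulas), while `pvariance` and `pstdev` divide by N (population formulas).

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]          # hypothetical observations

data_range = max(data) - min(data)        # maximum minus minimum
mean = statistics.mean(data)
mad = sum(abs(x - mean) for x in data) / len(data)   # mean absolute deviation

sample_var = statistics.variance(data)    # divides by n - 1
sample_sd = statistics.stdev(data)
pop_var = statistics.pvariance(data)      # divides by N
pop_sd = statistics.pstdev(data)

print(data_range, mad, sample_var, sample_sd, pop_var, pop_sd)
```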
1-96
 Calculate the variance and the standard
deviation using the data from the Metals fund.
1-97
 Coefficient of variation (CV)
 CV adjusts for differences in the magnitudes of
the means.
 CV is unitless, allowing easy comparisons of
mean-adjusted dispersion across different data
sets.
\[ \text{Sample CV} = \frac{s}{\bar{x}} \]
\[ \text{Population CV} = \frac{\sigma}{\mu} \]
1-98
 Calculate the coefficient of variation (CV)
using the data from the Metals fund and the
Income fund.
  - Metals fund: CV = 37.13 / 24.65 = 1.51
  - Income fund: CV = 11.07 / 8.51 = 1.30
1-99
Synopsis of Investment Decision
 Mean and median returns for the Metals fund are
24.65% and 33.83%, respectively.
 Mean and median returns for the Income fund are

8.51% and 7.34%, respectively.
- The standard deviations for the Metals fund and the Income fund are 37.13% and 11.07%, respectively.
- The coefficients of variation for the Metals fund and the Income fund are 1.51 and 1.30, respectively.
1-100
3.5 Mean-Variance Analysis and the Sharpe Ratio
LO 3.5 Explain mean-variance analysis and the Sharpe Ratio.
 Mean-variance analysis:
 The performance of an asset is measured by its rate of
return.
 The rate of return may be evaluated in terms of its reward
(mean) and risk (variance).
 Higher average returns are often associated with higher
risk.
 The Sharpe ratio uses the mean and variance to
evaluate risk.
1-101
LO 3.5 3.5 Mean-Variance Analysis and the
Sharpe Ratio
 Sharpe Ratio
 Measures the extra reward per unit of risk.
  - For an investment I, the Sharpe ratio is computed as:
\[ \text{Sharpe ratio} = \frac{\bar{x}_I - \bar{R}_f}{s_I} \]
where \( \bar{x}_I \) is the mean return for the investment,
\( \bar{R}_f \) is the mean return for a risk-free asset, and
\( s_I \) is the standard deviation for the investment.
1-102
LO 3.5 3.5 Mean-Variance Analysis and the
Sharpe Ratio
 Sharpe Ratio Example
  - Compute the Sharpe ratios for the Metals and Income funds given the risk-free return of 4%.
  - Metals fund: (24.65 − 4) / 37.13 = 0.56. Income fund: (8.51 − 4) / 11.07 = 0.41.
  - Since 0.56 > 0.41, the Metals fund offers more reward per unit of risk as compared to the Income fund.
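The comparison can be checked with a short sketch. The function name is illustrative; the means, standard deviations, and the 4% risk-free return are the figures quoted in these slides, all expressed in percent.

```python
def sharpe_ratio(mean_return, risk_free_return, std_dev):
    """Extra reward (mean return above the risk-free return) per unit of risk."""
    return (mean_return - risk_free_return) / std_dev


risk_free = 4.0                                  # percent
metals = sharpe_ratio(24.65, risk_free, 37.13)   # Metals fund
income = sharpe_ratio(8.51, risk_free, 11.07)    # Income fund

print(round(metals, 2), round(income, 2))
```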
1-103
3.6 Chebyshev’s Theorem and the Empirical Rule
LO 3.6 Apply Chebyshev’s Theorem and the empirical rule.
 Chebyshev’s Theorem
  - For any data set, the proportion of observations that lie within k standard deviations from the mean is at least \( 1 - 1/k^2 \), where k is any number greater than 1.
- Consider a large lecture class with 280 students. The mean score on an exam is 74 with a standard deviation of 8. At least how many students scored between 58 and 90?
  With k = 2, we have \( 1 - 1/2^2 = 0.75 \). At least 75% of 280, or 210 students, scored between 58 and 90.
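The lecture-class calculation can be sketched as follows. The function name is an assumption; rounding the count up with `math.ceil` is also a choice made here, since "at least" a fractional number of students means at least the next whole student.

```python
import math


def chebyshev_lower_bound(k):
    """Minimum proportion of observations within k standard deviations."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k ** 2


mean, std_dev, n_students = 74, 8, 280
k = (90 - mean) / std_dev                  # 90 is 2 standard deviations above 74
bound = chebyshev_lower_bound(k)           # minimum proportion within 58 to 90
min_students = math.ceil(bound * n_students)

print(k, bound, min_students)
```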
1-104
LO 3.6 3.6 Chebyshev’s Theorem and the Empirical Rule
- The Empirical Rule:
  - Approximately 68% of all observations fall in the interval \( \bar{x} \pm s \).
  - Approximately 95% of all observations fall in the interval \( \bar{x} \pm 2s \).
  - Almost all observations fall in the interval \( \bar{x} \pm 3s \).
1-105
LO 3.6 3.6 Chebyshev’s Theorem and the Empirical Rule
 Reconsider the example of the lecture class with

280 students with a mean score of 74 and a
standard deviation of 8. Assume that the distribution
is symmetric and bell-shaped. Approximately how
many students scored within 58 and 90?
 The score 58 is two standard deviations below the
mean while the score 90 is two standard
deviations above the mean.
 Therefore about 95% of 280 students, or
0.95(280) = 266 students, scored within 58 and
90.
1-106
3.7 Summarizing Grouped Data
LO 3.7 Calculate the mean and the variance for grouped data.
 When data are grouped or aggregated, we use
these formulas:
\[ \text{Mean: } \bar{x} = \frac{\sum m_i f_i}{n} \]
\[ \text{Variance: } s^2 = \frac{\sum (m_i - \bar{x})^2 f_i}{n - 1} \]
\[ \text{Standard deviation: } s = \sqrt{s^2} \]
where \( m_i \) and \( f_i \) are the midpoint and frequency of the ith class, respectively.
1-107
LO 3.7 3.7 Summarizing Grouped Data
 Consider the frequency distribution of house prices.
 Calculate the average house price.
- For the mean, first multiply each class's midpoint by its respective frequency.
- Finally, sum the fourth column and divide by the sample size to obtain the mean = 18,800/36 = 522, or $522,000.
1-108
 Calculate the sample variance and the standard
deviation.
- First calculate the sum of the weighted squared differences from the mean.
- Dividing this sum by (n − 1) = 36 − 1 = 35 yields a variance of 10,635 (in ($1,000s)²).
- The square root of the variance yields a standard deviation of 103.13 (in $1,000s), or about $103,130.
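The grouped-data calculation can be reproduced from the house-price frequency distribution. This sketch assumes each class is represented by its midpoint (350 for "300 up to 400", and so on) and works in units of $1,000s; the variable names are illustrative.

```python
import math

# Class midpoints (in $1,000s) and class frequencies for the house prices.
midpoints = [350, 450, 550, 650, 750]
frequencies = [4, 11, 14, 5, 2]

n = sum(frequencies)                                           # sample size
total = sum(m * f for m, f in zip(midpoints, frequencies))     # sum of m_i * f_i
mean = total / n

# Weighted squared deviations from the mean, divided by n - 1.
variance = sum(f * (m - mean) ** 2
               for m, f in zip(midpoints, frequencies)) / (n - 1)
std_dev = math.sqrt(variance)

print(total, round(mean), round(variance), round(std_dev, 2))
```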
1-109
 Weighted Mean
- Let \( w_1, w_2, \ldots, w_n \) denote the weights of the sample observations \( x_1, x_2, \ldots, x_n \) such that \( w_1 + w_2 + \cdots + w_n = 1 \); then
\[ \bar{x} = \sum w_i x_i \]
1-110
- A student scores 60 on Exam 1, 70 on Exam 2, and 80 on Exam 3. What is the student's average score for the course if Exams 1, 2, and 3 are worth 25%, 25%, and 50% of the grade, respectively?
- Define \( w_1 = 0.25 \), \( w_2 = 0.25 \), and \( w_3 = 0.50 \).
\[ \bar{x} = \sum w_i x_i = 0.25(60) + 0.25(70) + 0.50(80) = 72.50 \]
- The unweighted mean is only 70 as it does not incorporate the higher weight given to the score on Exam 3.
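The exam example in code, using the scores and weights from the slide. The variable names are illustrative.

```python
scores = [60, 70, 80]
weights = [0.25, 0.25, 0.50]     # must sum to 1

weighted_mean = sum(w * x for w, x in zip(weights, scores))
unweighted_mean = sum(scores) / len(scores)

print(weighted_mean, unweighted_mean)
```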
1-111
3.8 Covariance and Correlation
LO 3.8 Calculate and interpret the covariance and the correlation coefficient.
- The covariance (\( s_{xy} \) or \( \sigma_{xy} \)) describes the direction of the linear relationship between two variables, x and y.
- The correlation coefficient (\( r_{xy} \) or \( \rho_{xy} \)) describes both the direction and strength of the relationship between x and y.
1-112
LO 3.8 3.8 Covariance and Correlation
- The sample covariance \( s_{xy} \) is computed as
\[ s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1} \]
- The population covariance \( \sigma_{xy} \) is computed as
\[ \sigma_{xy} = \frac{\sum (x_i - \mu_x)(y_i - \mu_y)}{N} \]
1-113
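- The sample correlation \( r_{xy} \) is computed as
\[ r_{xy} = \frac{s_{xy}}{s_x s_y} \]
- The population correlation \( \rho_{xy} \) is computed as
\[ \rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y} \]
- Note: \( -1 \le r_{xy} \le +1 \) or \( -1 \le \rho_{xy} \le +1 \)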
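A minimal sketch of the sample formulas, written out by hand so each term matches the definitions. The function names and the five paired observations are hypothetical; they are not the Metals and Income fund returns.

```python
import math


def sample_covariance(x, y):
    """Sample covariance s_xy with an n - 1 divisor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)


def sample_correlation(x, y):
    """Sample correlation r_xy = s_xy / (s_x * s_y)."""
    s_x = math.sqrt(sample_covariance(x, x))   # covariance of x with itself is s_x^2
    s_y = math.sqrt(sample_covariance(y, y))
    return sample_covariance(x, y) / (s_x * s_y)


# Hypothetical paired observations.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

s_xy = sample_covariance(x, y)
r_xy = sample_correlation(x, y)
print(s_xy, round(r_xy, 4))
```

A positive covariance indicates a positive linear relationship; the correlation rescales it so that it always lies between −1 and +1.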
1-114
- Let's calculate the covariance and the correlation coefficient for the Metals (x) and Income (y) funds.
- Positive Relationship
- Also recall: \( \bar{x} = 24.65,\ s_x = 37.13,\ \bar{y} = 8.51,\ s_y = 11.07 \)
1-115
 We use the following table for the calculations.
 Covariance:
 Correlation:
1-116
LOs 3.1, 3.4, and 3.8 Some Excel Commands
 Measures of central location and dispersion: select
the relevant data, and choose Data > Data Analysis
> Descriptive Statistics.
 Covariance: For sample covariance, choose

Formulas > Insert Function > COVARIANCE.S. For
population covariance, use COVARIANCE.P.
 Correlation: For sample correlation or population

correlation, choose Formulas > Insert Function >
CORREL.
1-117
Chapter 4 Learning Objectives (LOs)
LO 4.1: Describe fundamental probability
concepts.
LO 4.2: Formulate and explain subjective,
empirical, and a priori probabilities.
LO 4.3: Calculate and interpret the probability of
the complement of an event, the
probability that at least one of two events
will occur, and a joint probability.
LO 4.4: Calculate and interpret a conditional
probability.
1-119
LO 4.5: Distinguish between independent and

dependent events.
LO 4.6: Calculate and interpret probabilities from a
contingency table.
LO 4.7: Apply the total probability rule and Bayes'
theorem.
LO 4.8: Use a counting rule to solve a particular
counting problem.
1-120
Sportswear Brands
- Annabel Gonzalez, chief retail analyst at marketing firm Longmeadow Consultants, is tracking the sales of compression gear produced by Under Armour, Inc., Nike, Inc., and Adidas Group.
- After collecting data from 600 recent purchases, Annabel wants to determine whether age influences brand choice.
1-121
4.1 Fundamental Probability Concepts
LO 4.1 Describe fundamental probability concepts.
 A probability is a numerical value that

measures the likelihood that an uncertain
event occurs.
 The value of a probability is between zero (0)
and one (1).
 A probability of zero indicates impossible events.
 A probability of one indicates definite events.
1-122
LO 4.1 4.1 Fundamental Probability Concepts
 An experiment is a trial that results in one of
several uncertain outcomes.
- Example: Trying to assess the probability of a snowboarder winning a medal in the ladies' halfpipe event while competing in the Winter Olympic Games.
- Solution: The athlete's attempt to predict her chances of medaling is an experiment because the outcome is unknown.
- The athlete's competition has four possible outcomes: gold medal, silver medal, bronze medal, and no medal. We formally write the sample space as S = {gold, silver, bronze, no medal}.
1-123
 A sample space, denoted S, of an experiment
includes all possible outcomes of the experiment.
 For example, a sample space containing letter
grades is:
\[ S = \{A, B, C, D, F\} \]
 An event is a subset of the sample space.
 The event “passing grades,” \(\{A, B, C, D\}\), is a subset of S.
 The simple event “failing grades,” \(\{F\}\), is a subset of S.
1-124
 Events are considered to be
 Exhaustive
 If all possible outcomes of a random experiment are
included in the events. For example, the events
“earning a medal” and “failing to earn a medal” in a
single Olympic event are exhaustive since these are the
only outcomes.
 Mutually exclusive
 If they do not share any common outcome of a random
experiment. For example, the events “earning a medal”
and “failing to earn a medal” in a single Olympic event
are mutually exclusive.
1-125
 A Venn Diagram represents the sample
space for the event(s).
 For example, this Venn Diagram illustrates the
sample space for events A and B.
[Figure: Venn diagram of events A and B]
 The union of two events (A ∪ B) is the event

consisting of all simple events in A or B.
1-126
 The intersection of two events (A ∩ B) consists of all simple
events in both A and B.
 The complement of event A (i.e., \(A^c\)) is the event consisting of all
simple events in the sample space S that are not in A.
[Figures: Venn diagrams showing A ∩ B, A ∪ B, and A with its complement \(A^c\)]
1-127
 Example: Recall the snowboarder's sample space
defined as S = {gold, silver, bronze, no medal}.
Given the following, find A ∪ B, A ∩ B, A ∩ C,
and \(B^c\).
 A = {gold, silver, bronze}.
 B = {silver, bronze, no medal}.
 C = {no medal}.
 Solution:
 A ∪ B = {gold, silver, bronze, no medal}. Note that there is
no double counting.
 A ∩ B = {silver, bronze}. A ∩ C = ∅ (null or empty set).
 \(B^c\) = {gold}.
1-128
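The set operations in the snowboarder example can be checked with a short sketch. It uses Python's built-in set type; the variable names are illustrative and not part of the original slides.

```python
# Sample space and events from the snowboarder example.
S = {"gold", "silver", "bronze", "no medal"}
A = {"gold", "silver", "bronze"}
B = {"silver", "bronze", "no medal"}
C = {"no medal"}

union_AB = A | B          # all outcomes in A or B (no double counting)
intersection_AB = A & B   # outcomes in both A and B
intersection_AC = A & C   # empty set: A and C share no outcome
complement_B = S - B      # outcomes in S that are not in B

print(union_AB == S)
print(sorted(intersection_AB))
print(intersection_AC)
print(complement_B)
```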
4.1 Fundamental Probability Concepts
LO 4.2 Formulate and explain subjective, empirical, and a priori probabilities.
 Assigning Probabilities
 Subjective probabilities
 Draws on personal and subjective judgment.
 Objective probabilities
 Empirical probability: a relative frequency of
occurrence.
 a priori probability: logical analysis.
1-129
 Two defining properties of a probability:
 The probability of any event A is a value between
0 and 1.
 The sum of the probabilities of any list of mutually
exclusive and exhaustive events equals 1.
 Calculating an empirical probability
 Use relative frequency:
\[ P(A) = \frac{\text{the number of outcomes in } A}{\text{the number of outcomes in } S} \]
1-130
 Example: Let A be the event of earning a
medal:
P(A) = P({gold}) + P({silver}) + P({bronze})
= 0.10 + 0.15 + 0.20 = 0.45.
 P(B ∪ C) = P({silver}) + P({bronze}) + P({no medal})
= 0.15 + 0.20 + 0.55 = 0.90.
 P(A ∩ C) = 0; recall that there are no common outcomes
in A and C.
 \(P(B^c)\) = P({gold}) = 0.10.
1-131
 Probabilities expressed as odds.
 Percentages and odds are alternative
ways of expressing probabilities.
 Converting an odds ratio to a probability:
 Given odds for event A occurring
of “a to b,” the probability of A is:
\[ \frac{a}{a+b} \]
 Given odds against event A occurring
of “a to b,” the probability of A is:
\[ \frac{b}{a+b} \]
1-132
 Converting a probability to an odds ratio:
 The odds for event A occurring is equal to
\[ \frac{P(A)}{1 - P(A)} \]
 The odds against A occurring is equal to
\[ \frac{1 - P(A)}{P(A)} \]
1-133
 Example: Converting an odds ratio to a probability.
 Given the odds of 2:1 for beating the Cardinals, what was
the probability of the Steelers' winning just prior to the
2009 Super Bowl?
\[ \frac{a}{a+b} = \frac{2}{2+1} = \frac{2}{3} \approx 0.67 \]
 Example: Converting a probability to an odds ratio.
 Given that the probability of an on-time arrival for New
York's Kennedy Airport is 0.56, what are the odds for a
plane arriving on-time at Kennedy Airport?
\[ \frac{P(A)}{1 - P(A)} = \frac{0.56}{1 - 0.56} = \frac{0.56}{0.44} \approx 1.27, \text{ or } 1.27{:}1 \]
1-134
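The two odds conversions can be sketched in a few lines of Python. The function names are hypothetical helpers introduced only for illustration; the inputs are the Super Bowl and Kennedy Airport figures from the examples.

```python
def odds_for_to_probability(a, b):
    """Probability of A given odds of 'a to b' FOR A."""
    return a / (a + b)


def odds_against_to_probability(a, b):
    """Probability of A given odds of 'a to b' AGAINST A."""
    return b / (a + b)


def probability_to_odds_for(p):
    """Odds for A, expressed as a single ratio P(A) / (1 - P(A))."""
    return p / (1 - p)


p_steelers = odds_for_to_probability(2, 1)
odds_on_time = probability_to_odds_for(0.56)

print(round(p_steelers, 2))    # 0.67
print(round(odds_on_time, 2))  # 1.27
```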
4.2 Rules of Probability
LO 4.3 Calculate and interpret the probability of the complement of an event, the
probability that at least one of two events will occur, and a joint probability.
 The Complement Rule
 The probability of the complement of an event,
\(P(A^c)\), is equal to one minus the probability of the
event.
\[ P(A^c) = 1 - P(A) \]
[Figure: sample space S divided into A and \(A^c\)]
1-135
LO 4.3 4.2 Rules of Probability
 The Addition Rule
 The probability that event A or B occurs, or that
at least one of these events occurs, is:
\[ P(A \cup B) = P(A) + P(B) - P(A \cap B) \]
1-136
 Illustrating the Addition Rule with the Venn
Diagram.
[Figure: Venn diagram. A ∩ B: events A and B both occur. A ∪ B: A occurs or B occurs or both occur.]
\[ P(A \cup B) = P(A) + P(B) - P(A \cap B) \]
1-137
 The Addition Rule for Two Mutually Exclusive
Events
[Figure: Venn diagram. A ∩ B = ∅: events A and B cannot both occur. A ∪ B: A occurs or B occurs.]
\[ P(A \cup B) = P(A) + P(B) \]
1-138
 Example: The addition rule.
 Anthony feels that he has a 75% chance of getting an A in
Statistics, a 55% chance of getting an A in Managerial
Economics, and a 40% chance of getting an A in both
classes. What is the probability that he gets an A in at least
one of these courses?
\[ P(A_S \cup A_M) = P(A_S) + P(A_M) - P(A_S \cap A_M) = 0.75 + 0.55 - 0.40 = 0.90 \]
 What is the probability that he does not get an A in either of
these courses? Using the complement rule, we find
\[ P\big((A_S \cup A_M)^c\big) = 1 - P(A_S \cup A_M) = 1 - 0.90 = 0.10 \]
1-139
 Example: The addition rule for mutually exclusive
events.
 Samantha Greene, a college senior, contemplates her
future immediately after graduation. She thinks there is a
25% chance that she will join the Peace Corps and a 35%
chance that she will enroll in a full-time law school
program in the United States.
 Because she cannot do both, the probability that she chooses
one of these options is
\[ P(A \cup B) = P(A) + P(B) = 0.25 + 0.35 = 0.60 \]
 What is the probability that she does not choose either of
these options?
\[ P\big((A \cup B)^c\big) = 1 - P(A \cup B) = 1 - 0.60 = 0.40 \]
1-140
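Both addition-rule examples follow the same arithmetic, which the sketch below reproduces. The helper names are illustrative assumptions; passing no joint probability corresponds to the mutually exclusive case.

```python
def prob_union(p_a, p_b, p_both=0.0):
    """Addition rule: P(A or B) = P(A) + P(B) - P(A and B).

    Leave p_both at 0.0 when A and B are mutually exclusive.
    """
    return p_a + p_b - p_both


def prob_complement(p):
    """Complement rule: P(not A) = 1 - P(A)."""
    return 1 - p


# Anthony: A in Statistics (0.75), A in Managerial Economics (0.55), both (0.40).
p_at_least_one_a = prob_union(0.75, 0.55, 0.40)
p_no_a = prob_complement(p_at_least_one_a)

# Samantha: Peace Corps (0.25) or law school (0.35), mutually exclusive.
p_either_option = prob_union(0.25, 0.35)
p_neither_option = prob_complement(p_either_option)

print(round(p_at_least_one_a, 2), round(p_no_a, 2))
print(round(p_either_option, 2), round(p_neither_option, 2))
```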
LO 4.4 Calculate and interpret a conditional probability.
 Unconditional (Marginal) Probability

 The probability of an event without any
restriction.
 For example, P(A) = probability of finding a job,
and P(B) = probability of prior work experience.
1-141
 Conditional Probability
 The probability of an event given that another
event has already occurred.
 In the conditional probability statement, the
symbol “ | ” means “given.”
 Whatever follows “ | ” has already occurred.
 For example, P(A | B) = probability of finding a
job given prior work experience.
1-142
 Conditional Probability
If B has already occurred, the relevant
portion of the sample space reduces to B.
[Figure: Venn diagram of events A and B highlighting A ∩ B]
1-143
 Calculating a Conditional Probability
 Given two events A and B, each with a positive probability
of occurring, the probability that A occurs given that B has
occurred (A conditioned on B) is equal to
\[ P(A \mid B) = \frac{P(A \cap B)}{P(B)} \]
 Similarly, the probability that B occurs given that A has
occurred (B conditioned on A) is equal to
\[ P(B \mid A) = \frac{P(A \cap B)}{P(A)} \]
1-144
 Example: Conditional Probabilities
 An economist predicts a 60% chance that country A will
perform poorly economically and a 25% chance that
country B will perform poorly economically. There is also
a 16% chance that both countries will perform poorly.
What is the probability that country A performs poorly
given that country B performs poorly?
 Let P(A) = 0.60, P(B) = 0.25, and P(A ∩ B) = 0.16
\[ P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{0.16}{0.25} = 0.64 \]
 Since P(A | B) = 0.64 ≠ P(A) = 0.60, events A and B are not
independent.
1-145
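The economist example can be reproduced with a minimal sketch. The function name is a hypothetical helper, and the independence check simply compares P(A | B) with P(A), as the example does.

```python
import math


def conditional_probability(p_joint, p_condition):
    """P(A | B) = P(A and B) / P(B); requires P(B) > 0."""
    if p_condition <= 0:
        raise ValueError("the conditioning event must have positive probability")
    return p_joint / p_condition


p_a, p_b, p_a_and_b = 0.60, 0.25, 0.16

p_a_given_b = conditional_probability(p_a_and_b, p_b)
# A and B are independent only if P(A | B) equals P(A).
independent = math.isclose(p_a_given_b, p_a)

print(round(p_a_given_b, 2))  # 0.64
print(independent)            # False
```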
LO 4.5 Distinguish between independent and dependent events.
 Independent and Dependent Events

 Two events are independent if the occurrence of one
event does not affect the probability of the occurrence of
the other event.
 Events are considered dependent if the occurrence of
one is related to the probability of the occurrence of the
other.
 Two events are independent if and only if
\[ P(A \mid B) = P(A) \quad \text{or} \quad P(B \mid A) = P(B) \]
1-146
 The Multiplication Rule: the probability that A and
B both occur is equal to:
\[ P(A \cap B) = P(A \mid B)\,P(B) = P(B \mid A)\,P(A) \]
 Note that when two events are mutually exclusive
\[ P(A \cap B) = 0 \]
1-147
 The Multiplication Rule for Independent
Events
 The joint probability of A and B equals the product
of the individual probabilities of A and B.
\[ P(A \cap B) = P(A)\,P(B) \]
 The multiplication rule may also be used to
determine independence. That is, two events are
independent if the above equality holds.
1-148
4.3 Contingency Tables and Probabilities
LO 4.6 Calculate and interpret probabilities from a contingency table.
 Contingency Tables
 A contingency table generally shows frequencies for two
qualitative or categorical variables, x and y.
 Each cell represents a mutually exclusive combination of
the pair of x and y values.
 Here, x is “Age Group” with two outcomes
while y is “Brand Name” with three outcomes.
1-149
LO 4.6 4.3 Contingency Tables and Probabilities
 Contingency Tables
 Note that each cell in the contingency table
represents a frequency.
 In the above table, 174 customers under the age

of 35 purchased an Under Armour product.
 54 customers at least 35 years old purchased an
Under Armour product.
1-150
 The contingency table may be used to calculate
probabilities using relative frequency.
 Note: Abbreviated labels have been used in place
of the class names in the table.
 First obtain the row and column totals.

 Sample size is equal to the total of the row totals
or column totals. In this case, n = 600.
1-151
 Joint Probability Table
 The joint probability is determined by dividing
each cell frequency by the grand total.
[Table: joint probabilities appear in the cells; marginal probabilities appear in the row and column totals]
 For example, the probability that a randomly selected
person is under 35 years of age and makes an Under
Armour purchase is
\[ P(A \cap B_1) = \frac{174}{600} = 0.29 \]
1-152
4.4 The Total Probability Rule and Bayes’ Theorem
LO 4.7 Apply the total probability rule and Bayes’ theorem.
 The Total Probability Rule
 P(A) is the sum of its intersections with some mutually
exclusive and exhaustive events corresponding to an
experiment.
 Consider event B and its complement \(B^c\). These
two events are mutually exclusive and exhaustive.
 The circle, representing event A, consists entirely of
its intersections with B and \(B^c\).
[Figure: sample space split into B and \(B^c\), with circle A divided into \(P(A \cap B)\) and \(P(A \cap B^c)\)]
1-153
LO 4.7 4.4 The Total Probability Rule and Bayes’
Theorem
 The Total Probability Rule conditional on two
outcomes
 The total probability rule conditional on two
events, B and \(B^c\), is
\[ P(A) = P(A \cap B) + P(A \cap B^c) \]
 or equivalently,
\[ P(A) = P(A \mid B)\,P(B) + P(A \mid B^c)\,P(B^c) \]
1-154
 Bayes’ Theorem
 A procedure for updating probabilities based on
new information.
 Prior probability is the original (unconditional)
probability (e.g., P(B) ).
 Posterior probability is the updated
(conditional) probability (e.g., P(B | A) ).
1-155
 Bayes' Theorem
 Given a set of prior probabilities for an event and
some new information, the rule for updating the
probability of the event is called Bayes' theorem.
\[ P(B \mid A) = \frac{P(A \cap B)}{P(A \cap B) + P(A \cap B^c)} \]
or
\[ P(B \mid A) = \frac{P(A \mid B)\,P(B)}{P(A \mid B)\,P(B) + P(A \mid B^c)\,P(B^c)} \]
1-156
 Example: Bayes' Theorem
 Assume that 99% of the individuals taking a polygraph test
tell the truth. These tests are considered to be 95%
reliable (i.e., a 95% chance of actually detecting a lie). Let
there also be a 0.5% chance that the test erroneously
detects a lie even when the individual is telling the truth.
 An individual has just taken a polygraph test and the test
has detected a lie. What is the probability that the
individual was actually telling the truth?
 Let D denote the outcome that the polygraph detects a lie
and T represent the outcome that an individual is telling
the truth.
1-157
 Example: Bayes' Theorem
 Given \(P(T) = 0.99\), \(P(T^c) = 0.01\), \(P(D \mid T) = 0.005\), and
\(P(D \mid T^c) = 0.95\), we find
\[ P(T \mid D) = \frac{(0.005)(0.99)}{(0.005)(0.99) + (0.95)(0.01)} = \frac{0.00495}{0.01445} \approx 0.34256 \]
1-158
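The polygraph calculation combines the total probability rule (the denominator) with Bayes' theorem. The sketch below reproduces it; the function and argument names are assumptions introduced for illustration.

```python
def bayes_posterior(prior_b, p_a_given_b, p_a_given_not_b):
    """Return P(B | A) from P(B), P(A | B), and P(A | B^c)."""
    # Total probability rule: P(A) = P(A|B)P(B) + P(A|B^c)P(B^c)
    p_a = p_a_given_b * prior_b + p_a_given_not_b * (1 - prior_b)
    return p_a_given_b * prior_b / p_a


# B = T (individual tells the truth), A = D (polygraph detects a lie).
p_truth_given_detect = bayes_posterior(
    prior_b=0.99,          # P(T)
    p_a_given_b=0.005,     # P(D | T): false detection
    p_a_given_not_b=0.95,  # P(D | T^c): lie correctly detected
)

print(round(p_truth_given_detect, 5))  # 0.34256
```

Even with a fairly reliable test, roughly a third of detected "lies" come from truthful individuals, because truthful test takers vastly outnumber the others.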
4.5 Counting Rules
LO 4.8 Use a counting rule to solve a particular counting problem.
 The Factorial Formula
 The number of ways to assign every member of a group
of size n to n slots is calculated using the factorial
formula:
\[ n! = n \times (n-1) \times (n-2) \times (n-3) \times \cdots \times 1 \]
 By definition, 0! = 1.
 For example, in how many ways can a little-league coach
assign nine players to each of the nine team positions
(pitcher, catcher, first base, etc.)?
 Solution: \(9! = 9 \times 8 \times 7 \times \cdots \times 1 = 362{,}880\)
1-159
LO 4.8 4.5 Counting Rules
 The Combination Formula
 The number of ways to choose x objects from a
total of n objects, where the order in which the x
objects are listed does not matter, is referred to
as a combination and is calculated as:
\[ {}_{n}C_{x} = \binom{n}{x} = \frac{n!}{(n-x)!\,x!} \]
1-160
 The Permutation Formula
 The number of ways to choose x objects from a
total of n objects, where the order in which the x
objects are listed does matter, is referred to as a
permutation and is calculated as:
\[ {}_{n}P_{x} = \frac{n!}{(n-x)!} \]
1-161
 Example: The Permutation Formula
 The little-league coach recruits three more players so that
his team has backups in case of injury. Now his team
totals 12. In how many ways can the coach select nine
players from the 12-player roster?
 Combination: What if the order in which the players
are selected is not important?
\[ {}_{12}C_{9} = \frac{12!}{(12-9)!\,9!} = 220 \]
 Permutation: What if the order in which the players
are selected is important?
\[ {}_{12}P_{9} = \frac{12!}{(12-9)!} = 79{,}833{,}600 \]
1-162
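The three counting rules map directly onto functions in Python's standard `math` module (`math.comb` and `math.perm` require Python 3.8 or later). This is only a quick check of the little-league numbers above.

```python
import math

# Factorial: ways to assign 9 players to 9 positions.
assignments = math.factorial(9)

# Combination: choose 9 of 12 players, order does not matter.
rosters_unordered = math.comb(12, 9)

# Permutation: choose 9 of 12 players, order matters.
rosters_ordered = math.perm(12, 9)

print(assignments)        # 362880
print(rosters_unordered)  # 220
print(rosters_ordered)    # 79833600
```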
LO 5.1: Distinguish between discrete and
continuous random variables.
LO 5.2: Describe the probability distribution of a
discrete random variable.
LO 5.3: Calculate and interpret summary
measures for a discrete random variable.
LO 5.4: Differentiate among risk neutral, risk
averse, and risk loving consumers.
LO 5.5: Compute summary measures to evaluate
portfolio returns.
1-164
LO 5.6: Describe the binomial distribution and
compute relevant probabilities.
LO 5.7: Describe the Poisson distribution and
compute relevant probabilities.
LO 5.8: Describe the hypergeometric distribution
and compute relevant probabilities.
1-165
Available Staff for Probable Customers
 Anne Jones is a manager of a local Starbucks.
Due to a weak economy and higher gas and food
prices, Starbucks announced plans in 2008 to close
500 more U.S. locations.
 While Anne's store will remain open, she is
concerned about how other nearby closings might
affect her business.
 A typical Starbucks customer visits the chain
between 15 and 18 times a month.
 Based on all this, Anne believes that customers will
average 18 visits to her store over a 30-day month.
1-166
Available Staff for Probable Customers
 Anne needs to decide staffing needs.
 Too many employees would be costly to the store.
 Not enough employees could result in losing angry
customers who choose not to wait for service.
 With an understanding of the probability distribution

of customer arrivals, Anne will be able to:
 Calculate the expected number of visits from a typical
Starbucks customer in a specified time period.
 Calculate the probability that a typical Starbucks customer
visits the chain a certain number of times in a specified
time period.
1-167
5.1 Random Variables and Discrete Probability
Distributions
LO 5.1 Distinguish between discrete and continuous random variables.
 Random variable
 A function that assigns numerical values to the
outcomes of a random experiment.
 Denoted by uppercase letters (e.g., X ).
 Values of the random variable are denoted by

corresponding lowercase letters.
 Corresponding values of the random variable:
\(x_1, x_2, x_3, \ldots\)
1-168
LO 5.1 5.1 Random Variables and Discrete Probability Distributions
 Random variables may be classified as:

 Discrete
 The random variable assumes a countable

number of distinct values.
 Continuous
 The random variable is characterized by
(infinitely) uncountable values within any
interval.
1-169
LO 5.1
 Consider an experiment in which two shirts are
selected from the production line and each can be
defective (D) or non-defective (N).
 Here is the sample space:
{(D,D), (D,N), (N,D), (N,N)}
 The random variable X is
the number of defective shirts.
 The possible number of
defective shirts is the set {0, 1, 2}.
 Since these are the only possible outcomes, this is
a discrete random variable.
1-170
5.1 Random Variables and Discrete Probability
Distributions
LO 5.2 Describe the probability distribution of a discrete random variable.
 Every random variable is associated with a
probability distribution that describes the variable
completely.
 A probability mass function is used to describe
discrete random variables.
 A probability density function is used to describe
continuous random variables.
 A cumulative distribution function may be used to
describe both discrete and continuous random
variables.
1-171
LO 5.2
 The probability mass function of a discrete
random variable X is a list of the values of X with
the associated probabilities, that is, the list of all
possible pairs
\[ \big(x, P(X = x)\big) \]
 The cumulative distribution function of X is
defined as
\[ P(X \le x) \]
1-172
LO 5.2
 Two key properties of discrete probability
distributions:
 The probability of each value x is a value between
0 and 1, or equivalently
\[ 0 \le P(X = x) \le 1 \]
 The sum of the probabilities equals 1. In other words,
\[ \sum P(X = x_i) = 1 \]
where the sum extends over all values x of X.
1-173
LO 5.2
 A discrete probability distribution may be viewed
as a table, algebraically, or graphically.
 For example, consider the experiment of rolling a
six-sided die. A tabular presentation is:
 Each outcome has an associated probability of 1/6.

Thus, the pairs of values and their probabilities form
the probability mass function for X.
1-174
LO 5.2
 Another tabular view of a probability distribution is
based on the cumulative probability distribution.
 For example, consider the experiment of rolling a six-
sided die. The cumulative probability distribution is
 The cumulative probability distribution gives the
probability of X being less than or equal to x.
For example, \(P(X \le 4) = 4/6\)
1-175
LO 5.2
 A probability distribution may be expressed
algebraically.
 For example, for the six-sided die experiment, the
probability distribution of the random variable X is:
\[ P(X = x) = \begin{cases} 1/6 & \text{if } x = 1, 2, 3, 4, 5, 6 \\ 0 & \text{otherwise} \end{cases} \]
 Using this formula we can find
\(P(X = 5) = 1/6\) and \(P(X = 7) = 0\)
1-176
LO 5.2
 A probability distribution may be expressed
graphically.
 The values x of X are placed on the horizontal axis and
the associated probabilities on the vertical axis.
 A line is drawn such that its height is associated with the
probability of x.
 For example, here is the
graph representing the
six-sided die experiment:
 This is a uniform distribution
since the bar heights are all
the same.
1-177
LO 5.2
 Example: Consider the probability distribution
which reflects the number of credit cards that
Bankrate.com‘s readers carry:
 Is this a valid probability
distribution?
 What is the probability that a
reader carries no credit cards?
 What is the probability that a
reader carries less than two?
 What is the probability that a reader carries at least two
credit cards?
1-178
LO 5.2
 Consider the probability distribution which reflects
the number of credit cards that Bankrate.com‘s
readers carry:
 Yes, because 0 ≤ P(X = x) ≤ 1
and ΣP(X = x) = 1.
 P(X = 0) = 0.025
 P(X < 2) = P(X = 0) + P(X = 1)
= 0.025 + 0.098 = 0.123.
 P(X ≥ 2) = P(X = 2) + P(X = 3)
+ P(X = 4*) = 0.166 + 0.165 + 0.546 = 0.877.
Alternatively, P(X ≥ 2) = 1 − P(X < 2) = 1 − 0.123 = 0.877.
1-179
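The credit card answers can be verified with a short sketch. The probabilities are the ones quoted in the solution; the dictionary layout and function name are assumptions, and the key 4 stands in for the "4*" category from the missing table.

```python
import math

# Number of credit cards -> probability (key 4 represents the "4*" category).
pmf = {0: 0.025, 1: 0.098, 2: 0.166, 3: 0.165, 4: 0.546}


def is_valid_pmf(dist):
    """Each probability lies in [0, 1] and the probabilities sum to 1."""
    in_range = all(0 <= p <= 1 for p in dist.values())
    return in_range and math.isclose(sum(dist.values()), 1.0)


valid = is_valid_pmf(pmf)
p_none = pmf[0]
p_less_than_two = sum(p for x, p in pmf.items() if x < 2)
p_at_least_two = 1 - p_less_than_two  # complement rule

print(valid)
print(round(p_less_than_two, 3))  # 0.123
print(round(p_at_least_two, 3))   # 0.877
```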
5.2 Expected Value, Variance, and Standard
Deviation
LO 5.3 Calculate and interpret summary measures for a
discrete random variable.
 Summary measures for a random variable

include the
 Mean (Expected Value)
 Variance
 Standard Deviation
1-180
LO 5.3 5.2 Expected Value, Variance, and Standard Deviation
 Expected Value (Population Mean)
\[ E(X) = \mu \]
 E(X) is the long-run average value of the random
variable over infinitely many independent
repetitions of an experiment.
 For a discrete random variable X with values
\(x_1, x_2, x_3, \ldots\) that occur with probabilities
\(P(X = x_i)\), the expected value of X is
\[ E(X) = \mu = \sum x_i\,P(X = x_i) \]
1-181
LO 5.3
 Variance and Standard Deviation
 For a discrete random variable X with values
\(x_1, x_2, x_3, \ldots\) that occur with probabilities
\(P(X = x_i)\),
\[ Var(X) = \sigma^2 = \sum (x_i - \mu)^2\,P(X = x_i) = \sum x_i^2\,P(X = x_i) - \mu^2 \]
 The standard deviation is the square root of the
variance.
\[ SD(X) = \sigma = \sqrt{\sigma^2} \]
1-182
LO 5.3
 Example: Brad Williams, owner of a car dealership
in Chicago, decides to construct an incentive
compensation program based on performance.
 Calculate the expected value of the annual bonus amount.

 Calculate the variance and standard deviation of the
annual bonus amount.
1-183
LO 5.3
 Solution: Let the random variable X denote the
bonus amount (in $1,000s) for an employee.
 \(E(X) = \mu = \sum x_i\,P(X = x_i) = 4.2\), or $4,200.
 \(Var(X) = \sigma^2 = \sum (x_i - \mu)^2\,P(X = x_i) = 9.97\) (in ($1,000s)²).
 \(SD(X) = \sigma = \sqrt{\sigma^2} = \sqrt{9.97} \approx 3.158\), or $3,158.
1-184
5.2 Expected Value, Variance, and Standard
Deviation
LO 5.4 Differentiate among risk neutral, risk averse, and risk
loving consumers.
 Risk Neutrality and Risk Aversion

 Risk averse consumers:
 Expect a reward for taking a risk.
 May decline a risky prospect even if it offers a positive

expected gain.
 Risk neutral consumers:
 Completely ignore risk.
 Always accept a prospect that offers a positive

expected gain.
1-185
LO 5.4
 Risk Neutrality and Risk Aversion
 Risk loving consumers:
 May accept a risky prospect even if the expected gain is
negative.
 Application of Expected Value to Risk
 Suppose you have a choice of receiving $1,000 in cash
or receiving a beautiful painting from your grandmother.
 The actual value of the painting is uncertain. Here is a
probability distribution
of the possible worth
of the painting. What
should you do?
1-186
LO 5.4
 Application of Expected Value to Risk
 First calculate the
expected value:
\[ E(X) = \sum x_i\,P(X = x_i) = \$2{,}000 \times 0.20 + \$1{,}000 \times 0.50 + \$500 \times 0.30 = \$1{,}050 \]
 Since the expected value is more than $1,000 it may

seem logical to choose the painting over cash.
 However, we have not taken into account risk.
1-187
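The painting example can be extended with a rough sketch that computes the expected value and, as one possible way to quantify the risk mentioned above, the standard deviation from the Section 5.2 formulas. The (value, probability) pairs are read off the expected value calculation; the function names are illustrative.

```python
import math


def expected_value(dist):
    """E(X) = sum of x * P(X = x) over (x, p) pairs."""
    return sum(x * p for x, p in dist)


def variance(dist):
    """Var(X) = sum of (x - mu)^2 * P(X = x)."""
    mu = expected_value(dist)
    return sum((x - mu) ** 2 * p for x, p in dist)


# Possible worth of the painting (dollars) and its probability.
painting = [(2000, 0.20), (1000, 0.50), (500, 0.30)]

mean_worth = expected_value(painting)
sd_worth = math.sqrt(variance(painting))

print(round(mean_worth, 2))  # 1050.0
print(round(sd_worth, 2))
```

A risk-neutral consumer would compare only the $1,050 mean with the $1,000 cash; a risk-averse consumer would also weigh the spread around that mean.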
5.3 Portfolio Returns
LO 5.5 Compute summary measures to evaluate portfolio returns.
 Investment opportunities often use both:
 Expected return as a measure of reward.
 Variance or standard deviation of return as a measure of
risk.
 Portfolio is defined as a collection of assets such as
stocks and bonds.
 Let X and Y represent two random variables of interest,
denoting, say, the returns of two assets.
 Since an investor may have invested in both assets, we
would like to evaluate the portfolio return formed by a
linear combination of X and Y .
1-188
LO 5.5 5.3 Portfolio Returns
 Properties of random variables useful in evaluating
portfolio returns.
 Given two random variables X and Y,
 The expected value of X + Y is
\[ E(X + Y) = E(X) + E(Y) \]
 The variance of X + Y is
\[ Var(X + Y) = Var(X) + Var(Y) + 2\,Cov(X, Y) \]
where Cov(X,Y) is the covariance between X and Y.
 For constants a, b, the formulas extend to
\[ E(aX + bY) = a\,E(X) + b\,E(Y) \]
\[ Var(aX + bY) = a^2\,Var(X) + b^2\,Var(Y) + 2ab\,Cov(X, Y) \]
1-189
 Expected return, variance, and standard
deviation of portfolio returns.
 Given a portfolio with two assets, Asset A and
Asset B, the expected return of the portfolio
\(E(R_p)\) is computed as:
\[ E(R_p) = w_A\,E(R_A) + w_B\,E(R_B) \]
 where
\(w_A\) and \(w_B\) are the portfolio weights,
\(w_A + w_B = 1\), and
\(E(R_A)\) and \(E(R_B)\) are the expected returns on assets
A and B, respectively.
1-190
 Expected return, variance, and standard deviation
of portfolio returns.
 Using the covariance or the correlation coefficient of the
two returns, the portfolio variance of return is:
\[ Var(R_p) = w_A^2\sigma_A^2 + w_B^2\sigma_B^2 + 2 w_A w_B \sigma_{AB} = w_A^2\sigma_A^2 + w_B^2\sigma_B^2 + 2 w_A w_B \rho_{AB}\sigma_A\sigma_B \]
where \(\sigma_A^2\) and \(\sigma_B^2\) are the variances of the returns for
Asset A and Asset B, respectively,
\(\sigma_{AB}\) is the covariance between the returns for
Assets A and B, and
\(\rho_{AB}\) is the correlation coefficient between the returns
for Asset A and Asset B.
1-191
 Example: Consider an investment portfolio of
$40,000 in Stock A and $60,000 in Stock B.
 Given the following information, calculate the expected
return of this portfolio.
 .
1-192
 Calculate the correlation coefficient between

the returns on Stocks A and B.
 Solution:
1-193
 Calculate the portfolio variance.

 Solution:
1-194
 Calculate the portfolio standard deviation.

 Solution:
1-195
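The input table and worked solutions for this example did not survive, so the sketch below only illustrates the portfolio formulas. The weights come from the $40,000 and $60,000 positions; the expected returns, standard deviations, and correlation are hypothetical placeholders, not the example's actual data.

```python
import math


def portfolio_expected_return(w_a, w_b, er_a, er_b):
    """E(Rp) = wA * E(RA) + wB * E(RB)."""
    return w_a * er_a + w_b * er_b


def portfolio_variance(w_a, w_b, sd_a, sd_b, rho_ab):
    """Var(Rp) = wA^2 sA^2 + wB^2 sB^2 + 2 wA wB rho sA sB."""
    cov_ab = rho_ab * sd_a * sd_b  # covariance from the correlation
    return w_a**2 * sd_a**2 + w_b**2 * sd_b**2 + 2 * w_a * w_b * cov_ab


# Weights from the example's dollar amounts.
w_a = 40_000 / (40_000 + 60_000)
w_b = 1 - w_a

# HYPOTHETICAL inputs, chosen only to make the sketch runnable.
er_a, er_b = 0.08, 0.12   # expected returns
sd_a, sd_b = 0.10, 0.20   # standard deviations of returns
rho_ab = 0.25             # correlation coefficient

exp_return = portfolio_expected_return(w_a, w_b, er_a, er_b)
var_return = portfolio_variance(w_a, w_b, sd_a, sd_b, rho_ab)
sd_return = math.sqrt(var_return)

print(round(exp_return, 4), round(var_return, 4), round(sd_return, 4))
```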
5.4 The Binomial Probability Distribution
LO 5.6 Describe the binomial distribution and compute relevant probabilities.
 A binomial random variable is defined as the
number of successes achieved in the n trials of a
Bernoulli process.
 A Bernoulli process consists of a series of n
independent and identical trials of an experiment
such that on each trial:
 There are only two possible outcomes:
p = probability of a success
1 − p = q = probability of a failure
 Each time the trial is repeated, the probabilities of
success and failure remain the same.
1-196
LO 5.6 5.4 The Binomial Probability Distribution
 A binomial random variable X is defined as the
number of successes achieved in the n trials of a
Bernoulli process.
 A binomial probability distribution shows the
probabilities associated with the possible values of
the binomial random variable (that is, 0, 1, . . . , n).
 For a binomial random variable X, the probability of x
successes in n Bernoulli trials is
\[ P(X = x) = \binom{n}{x} p^x q^{n-x} = \frac{n!}{x!\,(n-x)!}\,p^x q^{n-x} \]
for \(x = 0, 1, 2, \ldots, n\). By definition, \(0! = 1\).
1-197
 For a binomial distribution:
 The expected value (E(X)) is: \(E(X) = \mu = np\)
 The variance (Var(X)) is: \(Var(X) = \sigma^2 = npq\)
 The standard deviation (SD(X)) is: \(SD(X) = \sigma = \sqrt{npq}\)
1-198
 Example: Approximately 20% of U.S. workers are
afraid that they will never be able to retire. Suppose
10 workers are randomly selected.
 What is the probability that none of the workers is
afraid that they will never be able to retire?
 Solution: Let n = 10 and p = 0.20; then \(P(X = 0) = (0.80)^{10} \approx 0.1074\).
1-199
 Computing binomial probabilities with Excel:
 In 2007 approximately 4.7% of the households in the
Detroit metropolitan area were in some stage of
foreclosure. What is the probability that exactly 5 of these
100 mortgage-holding households in Detroit are in some
stage of foreclosure?
 Solution: Using the binomial function on Excel, enter the four
arguments shown here:
 Excel returns the
formula result as
0.1783; thus,
P(X = 5) = 0.1783.
1-200
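As an alternative to the Excel function, the binomial formula can be evaluated directly. This is a minimal sketch using only Python's standard library; the function name is an assumption, and the inputs are the retirement and Detroit foreclosure figures from the two examples.

```python
import math


def binomial_pmf(x, n, p):
    """P(X = x) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


# None of 10 workers is afraid of never retiring (p = 0.20).
p_none_afraid = binomial_pmf(0, 10, 0.20)

# Exactly 5 of 100 households in foreclosure (p = 0.047).
p_five_foreclosures = binomial_pmf(5, 100, 0.047)

# Summary measures for the foreclosure example.
mean_foreclosures = 100 * 0.047
sd_foreclosures = math.sqrt(100 * 0.047 * (1 - 0.047))

print(round(p_none_afraid, 4))
print(round(p_five_foreclosures, 4))
```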
5.5 The Poisson Probability Distribution
LO 5.7 Describe the Poisson distribution and compute relevant probabilities.
 A binomial random variable counts the number of
successes in a fixed number of Bernoulli trials.
 In contrast, a Poisson random variable counts the
number of successes over a given interval of time or
space.
 Examples of a Poisson random variable include
 With respect to time—the number of cars that cross the
Brooklyn Bridge between 9:00 am and 10:00 am on a
Monday morning.
 With respect to space—the number of defects in a
50-yard roll of fabric.
1-201
LO 5.7 5.5 The Poisson Probability Distribution
 A random experiment satisfies a Poisson
process if:
 The number of successes within a specified time
or space interval equals any integer between zero
and infinity.
 The numbers of successes counted in
nonoverlapping intervals are independent.
 The probability that success occurs in any interval
is the same for all intervals of equal size and is
proportional to the size of the interval.
1-202
 For a Poisson random variable X, the
probability of x successes over a given
interval of time or space is
\[ P(X = x) = \frac{e^{-\mu}\mu^{x}}{x!} \quad \text{for } x = 0, 1, 2, \ldots \]
where \(\mu\) is the mean number of successes
and \(e \approx 2.718\) is the base of the natural
logarithm.
1-203
 For a Poisson distribution:
 The expected value (E(X)) is: \(E(X) = \mu\)
 The variance (Var(X)) is: \(Var(X) = \sigma^2 = \mu\)
 The standard deviation (SD(X)) is: \(SD(X) = \sigma = \sqrt{\mu}\)
1-204
 Example: Returning to the Starbucks example, Anne
believes that the typical Starbucks customer
averages 18 visits over a 30-day month.
 How many visits should Anne expect in a 5-day period
from a typical Starbucks customer?
 Solution:
 What is the probability that a customer visits the chain five

times in a 5-day period?
 Solution:
1-205
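The worked solutions for this example are missing from the slide text. The sketch below shows one way to carry them out under the Poisson assumption that the mean scales in proportion to the length of the interval; the function name is illustrative.

```python
import math


def poisson_pmf(x, mu):
    """P(X = x) = e^(-mu) * mu^x / x!."""
    return math.exp(-mu) * mu**x / math.factorial(x)


# 18 visits per 30-day month, rescaled to a 5-day period.
mu_five_days = 18 * 5 / 30

# Probability of exactly five visits in a 5-day period.
p_five_visits = poisson_pmf(5, mu_five_days)

print(mu_five_days)
print(round(p_five_visits, 4))
```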
5.6 The Hypergeometric Probability
Distribution
LO 5.8 Describe the hypergeometric distribution and compute
relevant probabilities.
 A binomial random variable X is defined as the number

of successes in the n trials of a Bernoulli process, and
according to a Bernoulli process, those trials are
 Independent and
 The probability of success does not change from trial to
trial.
 In contrast, the hypergeometric probability
distribution is appropriate in applications where we
cannot assume the trials are independent.
1-206
LO 5.8 5.6 The Hypergeometric Probability
Distribution
 Use the hypergeometric distribution when sampling
without replacement from a population whose size
N is not significantly larger than the sample size n.
 For a hypergeometric random variable X, the probability
of x successes in a random selection of n items is
\[ P(X = x) = \frac{\binom{S}{x}\binom{N-S}{n-x}}{\binom{N}{n}} \]
for \(x = 0, 1, 2, \ldots, n\) if \(n \le S\) or \(x = 0, 1, 2, \ldots, S\) if \(n > S\),
where N denotes the number of items in the population of
which S are successes.
1-207
 For a hypergeometric distribution:
 The expected value (E(X)) is:
\[ E(X) = \mu = n\left(\frac{S}{N}\right) \]
 The variance (Var(X)) is:
\[ Var(X) = \sigma^2 = n\left(\frac{S}{N}\right)\left(1 - \frac{S}{N}\right)\left(\frac{N-n}{N-1}\right) \]
 The standard deviation (SD(X)) is:
\[ SD(X) = \sigma = \sqrt{n\left(\frac{S}{N}\right)\left(1 - \frac{S}{N}\right)\left(\frac{N-n}{N-1}\right)} \]
1-208
 Example: At a convenience store in Morganville,
New Jersey, the manager randomly inspects five
mangoes from a box containing 20 mangoes for
damages due to transportation. Suppose the
chosen box contains exactly 2 damaged mangoes.
 What is the probability that exactly one of the five mangoes
used in the inspection is damaged?
 Solution
1-209
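The solution for the mango example is not present in the slide text. A minimal sketch of the calculation, using the hypergeometric formula with N = 20, S = 2, n = 5, and x = 1, might look like this (the function name is an assumption):

```python
import math


def hypergeometric_pmf(x, n, S, N):
    """P(X = x) = C(S, x) * C(N - S, n - x) / C(N, n)."""
    return math.comb(S, x) * math.comb(N - S, n - x) / math.comb(N, n)


N, S, n = 20, 2, 5  # box size, damaged mangoes, mangoes inspected

p_one_damaged = hypergeometric_pmf(1, n, S, N)

# Summary measures from the formulas above.
mean_damaged = n * S / N
var_damaged = n * (S / N) * (1 - S / N) * ((N - n) / (N - 1))

print(round(p_one_damaged, 4))
print(mean_damaged)
```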
LO 6.1: Describe a continuous random variable.
LO 6.2: Describe a continuous uniform distribution and
calculate associated probabilities.
LO 6.3: Explain the characteristics of the normal distribution.
LO 6.4: Use the standard normal table or the z table.
LO 6.5: Calculate and interpret probabilities for a random
variable that follows the normal distribution.
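LO 6.6: Calculate and interpret probabilities for a random
variable that follows the exponential distribution.
LO 6.7: Calculate and interpret probabilities for a random
variable that follows the lognormal distribution.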
1-211
Demand for Salmon
 Akiko Hamaguchi, manager of a small sushi
restaurant, Little Ginza, in Phoenix, Arizona, has to
estimate the daily amount of salmon needed.
 Akiko has estimated the daily consumption of
salmon to be normally distributed with a mean of
12 pounds and a standard deviation of 3.2 pounds.
 Buying 20 lbs of salmon every day has resulted in
too much wastage.
 Therefore, Akiko will buy salmon that meets the
daily demand of customers on 90% of the days.
1-212
Demand for Salmon
 Based on this information, Akiko would like to:
 Calculate the proportion of days that demand for
salmon at Little Ginza was above her earlier
purchase of 20 pounds.
 Calculate the proportion of days that demand for
salmon at Little Ginza was below 15 pounds.
 Determine the amount of salmon that should be
bought daily so that it meets demand on 90% of
the days.
1-213
6.1 Continuous Random Variables and the
Uniform Probability Distribution
LO 6.1 Describe a continuous random variable.
 Remember that random variables may be

classified as
 Discrete
 The random variable assumes a countable
number of distinct values.
 Continuous
 The random variable is characterized by
(infinitely) uncountable values within any
interval.
1-214
LO 6.1 6.1 Continuous Random Variables and the Uniform Probability Distribution

 When computing probabilities for a continuous
random variable, keep in mind that P(X = x) = 0.
 We cannot assign a nonzero probability to each
infinitely uncountable value and still have the
probabilities sum to one.
 Thus, since P(X = a) and P(X = b) both equal
zero, the following holds for continuous random
variables:
$P(a \le X \le b) = P(a < X < b) = P(a \le X < b) = P(a < X \le b)$
1-215
LO 6.1

 Probability Density Function f(x) of a
continuous random variable X
 Describes the relative likelihood that X
assumes a value within a given interval

(e.g., P(a < X < b) ), where
 $f(x) \ge 0$ for all possible values of X.
 The area under f(x) over all values of x

equals one.
1-216
LO 6.1

 Cumulative Distribution Function F(x) of a
continuous random variable X
 For any value x of the random variable X,
the cumulative distribution function F(x) is
computed as
$F(x) = P(X \le x)$
 As a result, $P(a < X < b) = F(b) - F(a)$
1-217
6.1 Continuous Random Variables and the
Uniform Probability Distribution
LO 6.2 Describe a continuous uniform distribution and calculate associated
probabilities.
 The Continuous Uniform Distribution

 Describes a random variable that has an
equally likely chance of assuming a value within a
specified range.
 Probability density function:
$f(x) = \begin{cases} \frac{1}{b - a} & \text{for } a \le x \le b, \\ 0 & \text{for } x < a \text{ or } x > b, \end{cases}$
where a and b are the lower and upper limits, respectively.
1-218
LO 6.2

 The Continuous Uniform Distribution
 The expected value and standard deviation of X
are:
$E(X) = \mu = \frac{a + b}{2}$
$SD(X) = \sigma = \sqrt{\frac{(b - a)^2}{12}}$
1-219
LO 6.2

 Graph of the continuous uniform distribution:
 The values a and b on the horizontal axis
represent the lower and upper limits, respectively.
 The height of the
distribution does not
directly represent a
probability.
 It is the area under
f(x) that corresponds
to probability.
1-220
LO 6.2

 Example: Based on historical data, sales for a
particular cosmetic line follow a continuous uniform
distribution with a lower limit of $2,500 and an upper
limit of $5,000.
 What are the mean and standard deviation of this uniform
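distribution?
 Let the lower limit a = $2,500 and the upper limit
b = $5,000, then
$\mu = \frac{2{,}500 + 5{,}000}{2} = 3{,}750$ and $\sigma = \sqrt{\frac{(5{,}000 - 2{,}500)^2}{12}} \approx 721.69$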
1-221
LO 6.2

 What is the probability that sales exceed $4,000?
 P(X > 4,000) = base × height =
$(5{,}000 - 4{,}000) \times \frac{1}{5{,}000 - 2{,}500} = 1{,}000 \times 0.0004 = 0.4$
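The cosmetic-line example can be checked numerically. The sketch below applies the uniform mean, standard deviation, and base-times-height formulas to the limits given above; the helper name `uniform_prob` is an assumption made for illustration.

```python
import math

a, b = 2500.0, 5000.0   # lower and upper limits from the example

mean = (a + b) / 2
sd = math.sqrt((b - a) ** 2 / 12)

def uniform_prob(lo, hi):
    """P(lo < X < hi) for X uniform on [a, b]: base times height."""
    lo, hi = max(lo, a), min(hi, b)      # clip the interval to [a, b]
    return max(hi - lo, 0.0) / (b - a)

p_over_4000 = uniform_prob(4000, b)

print(f"mean = {mean:.0f}, sd = {sd:.2f}, P(X > 4,000) = {p_over_4000:.1f}")
```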
1-222
6.2 The Normal Distribution
LO 6.3 Explain the characteristics of the normal distribution.
 The Normal Distribution

 Symmetric
 Bell-shaped
 Closely approximates the probability distribution
of a wide range of random variables, such as the
 Heights and weights of newborn babies
 Scores on SAT
 Cumulative debt of college graduates
 Serves as the cornerstone of statistical inference.
1-223
LO 6.3 6.2 The Normal Distribution
 Characteristics of the Normal Distribution
 Symmetric about its mean
 Mean = Median = Mode
 Asymptotic: that is, the tails get closer and
closer to the horizontal axis but never touch it.
[Figure: normal curve centered at $\mu$, with $P(X < \mu) = 0.5$ and $P(X > \mu) = 0.5$.]
1-224
 Characteristics of the Normal Distribution
 The normal distribution is completely described
by two parameters: $\mu$ and $\sigma^2$.
 $\mu$ is the population mean, which describes the
central location of the distribution.
 $\sigma^2$ is the population variance, which describes
the dispersion of the distribution.
1-225
 Probability Density Function of the Normal
Distribution
 For a random variable X with mean $\mu$ and
variance $\sigma^2$
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$
where $\pi \approx 3.14159$ and $\exp(x) = e^x$;
$e \approx 2.718$ is the base of the natural logarithm
1-226
 Example: Suppose the ages of employees in
Industries A, B, and C are normally distributed.
 Here are the relevant parameters:
 Let's compare industries using the normal curves.
[Figure: two panels of normal curves. In one, $\sigma$ is the same and $\mu$ is different; in the other, $\mu$ is the same and $\sigma$ is different.]
1-227
6.2 The Normal Distribution
LO 6.4 Use the standard normal table or the z table.
 The Standard Normal (Z) Distribution.

 A special case of the normal distribution:
 Mean ($\mu$) is equal to zero (E(Z) = 0).
 Standard deviation ($\sigma$) is equal to one
(SD(Z) = 1).
1-228
LO 6.4 6.2 The Standard Normal Distribution
 Standard Normal Table (Z Table).
 Gives the cumulative probabilities P(Z < z) for
positive and negative values of z.
 Since the random variable Z is symmetric about
its mean of 0,
P(Z < 0) = P(Z > 0) = 0.5.
 To obtain the P(Z < z), read down the z column
first, then across the top.
1-229
 Standard Normal Table (Z Table).
Table for positive z values.
Table for negative z values.
1-230
 Finding the Probability for a Given z Value.
 Transform normally distributed random variables into
standard normal random variables and use the z table
to compute the relevant probabilities.
 The z table provides cumulative probabilities
P(Z < z) for a given z.
Portion of right-hand page of z table.
If z = 1.52, then look up the row for 1.5 and the column for 0.02: P(Z < 1.52) = 0.9357.
1-231
 Finding the Probability for a Given z Value.
 Remember that the z table provides cumulative
probabilities P(Z < z) for a given z.
 Since z is negative, we can look up this
probability from the left-hand page of the z table.
Portion of left-hand page of z table.
If z = −1.96, then look up the row for −1.9 and the column for 0.06: P(Z < −1.96) = 0.0250.
1-232
 Example: Finding Probabilities for a Standard
Normal Random Variable Z.
 Find P(−1.52 < Z < 1.96) =
P(Z < 1.96) − P(Z < −1.52)
P(Z < 1.96) = 0.9750
P(Z < −1.52) = 0.0643
0.9750 − 0.0643 = 0.9107
1-233
 Example: Finding a z value for a given
probability.
 For a standard normal variable Z, find the z
values that satisfy P(Z < z) = 0.6808.
 Go to the standard normal table and find 0.6808
in the body of the table.
 Find the corresponding
z value from the
row/column of z.
 z = 0.47.
1-234
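Both table lookups above (a probability for given z values, and a z value for a given probability) can be reproduced in software. The following is a small sketch using Python's `statistics.NormalDist` in place of the printed z table; the variable names are illustrative only.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal: mean 0, standard deviation 1

# Probability for given z values: P(-1.52 < Z < 1.96) = P(Z < 1.96) - P(Z < -1.52)
p_interval = Z.cdf(1.96) - Z.cdf(-1.52)

# z value for a given cumulative probability: solve P(Z < z) = 0.6808
z_value = Z.inv_cdf(0.6808)

print(f"P(-1.52 < Z < 1.96) = {p_interval:.4f}")
print(f"z with P(Z < z) = 0.6808: {z_value:.2f}")
```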
 Revisiting the Empirical Rule.
$P(-3 \le Z \le 3) \approx 0.9973$
$P(-2 \le Z \le 2) \approx 0.9545$
$P(-1 \le Z \le 1) \approx 0.6827$
1-235
 Example: The Empirical Rule
 An investment strategy has an expected return of
4% and a standard deviation of 6%. Assume that
investment returns are normally distributed.
 What is the probability of earning a return greater
than 10%?
 A return of 10% is one standard deviation
above the mean, or $10 = \mu + 1\sigma = 4 + 6$.
 Since about 68% of observations fall within
one standard deviation of the mean, 32%
(100% − 68%) are outside the range.
1-236
 Example: The Empirical Rule
 An investment strategy has an expected return of
4% and a standard deviation of 6%. Assume that
investment returns are normally distributed.
 What is the probability of earning a return greater
than 10%?
 Using symmetry, we conclude that 16%
(half of 32%) of the observations are
greater than 10%.
[Figure: normal curve with 68% of the area within $\mu \pm \sigma$ and 16% in each tail.]
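As a rough check, the empirical-rule answer can be compared with the exact normal tail area. This sketch assumes the example's parameters (expected return 4%, standard deviation 6%); the names are illustrative.

```python
from statistics import NormalDist

returns = NormalDist(mu=4, sigma=6)   # investment returns, in percent

approx = (1 - 0.68) / 2               # empirical rule: half of the 32% outside one sd
exact = 1 - returns.cdf(10)           # exact upper-tail area beyond mu + sigma

print(f"empirical rule: {approx:.2f}, exact: {exact:.4f}")
```

The two values agree to about two decimal places, which is the level of precision the empirical rule is meant to give.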
1-237
6.3 Solving Problems with the Normal
Distribution
LO 6.5 Calculate and interpret probabilities for a random variable that follows
the normal distribution.
 The Normal Transformation

 Any normally distributed random variable X with
mean $\mu$ and standard deviation $\sigma$ can be
transformed into the standard normal random
variable Z as:
$Z = \frac{X - \mu}{\sigma}$ with corresponding values $z = \frac{x - \mu}{\sigma}$
As constructed: E(Z) = 0 and SD(Z) = 1.
1-238
LO 6.5 6.3 Solving Problems with the Normal Distribution
 A z value specifies by how many standard
deviations the corresponding x value falls
above (z > 0) or below (z < 0) the mean.
 A positive z indicates by how many standard
deviations the corresponding x lies above $\mu$.
 A zero z indicates that the corresponding x
equals $\mu$.
 A negative z indicates by how many standard
deviations the corresponding x lies below $\mu$.
1-239
LO 6.5
Normal Distribution
 Use the Inverse Transformation to compute
x values for given probabilities.
 A standard normal variable Z can be transformed
to the normally distributed random variable X with
mean $\mu$ and standard deviation $\sigma$ as
$X = \mu + Z\sigma$ with corresponding values $x = \mu + z\sigma$
1-240
LO 6.5
Normal Distribution
 Example: Scores on a management aptitude exam
are normally distributed with a mean of 72 ($\mu$) and a
standard deviation of 8 ($\sigma$).
 What is the probability that a randomly selected
manager will score above 60?
 First transform the random variable X to Z using the
transformation formula: $z = \frac{x - \mu}{\sigma} = \frac{60 - 72}{8} = -1.5$
 Using the standard normal table, find
 P(Z > −1.5) = 1 − P(Z < −1.5) = 1 − 0.0668 = 0.9332
1-241
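The aptitude-exam calculation can be sketched in two equivalent ways: by standardizing first, as on the slide, or by working with the original distribution directly. The code assumes only the example's mean of 72 and standard deviation of 8.

```python
from statistics import NormalDist

mu, sigma = 72, 8                       # exam score parameters from the example

z = (60 - mu) / sigma                   # normal transformation
p_via_z = 1 - NormalDist().cdf(z)       # P(Z > -1.5) from the standard normal
p_direct = 1 - NormalDist(mu, sigma).cdf(60)   # P(X > 60) without standardizing

print(f"z = {z}, P(X > 60) = {p_via_z:.4f}")
```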
LO 6.5
Normal Distribution
 Example:
1-242
LO 6.5
Normal Distribution
 Example:
1-243
6.4 Other Continuous Probability
Distributions
LO 6.6 Calculate and interpret probabilities for a random variable that follows
the exponential distribution.
 The Exponential Distribution
 A random variable X follows the exponential distribution if
its probability density function is:
$f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$,
where $\lambda$ is the rate parameter and $e \approx 2.718$,
and $E(X) = SD(X) = \frac{1}{\lambda}$
 The cumulative distribution
function is:
$P(X \le x) = 1 - e^{-\lambda x}$
1-244
LO 6.6 6.4 Other Continuous Probability
Distributions
 The exponential distribution is based entirely on
one parameter, $\lambda > 0$, as illustrated below.
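To make the exponential formulas concrete, here is a minimal sketch. The rate of 0.5 is a hypothetical value chosen for illustration and does not come from the slides.

```python
import math

def expon_cdf(x, lam):
    """P(X <= x) = 1 - exp(-lam * x) for x >= 0, and 0 for negative x."""
    return 1 - math.exp(-lam * x) if x >= 0 else 0.0

lam = 0.5                  # hypothetical rate parameter
mean = sd = 1 / lam        # E(X) = SD(X) = 1 / lambda

# Probability that X falls between 1 and 3: F(3) - F(1)
p_between = expon_cdf(3, lam) - expon_cdf(1, lam)

print(f"mean = sd = {mean}, P(1 <= X <= 3) = {p_between:.4f}")
```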
1-245
Distributions
 Example
1-246
6.4 Other Continuous Probability
Distributions
LO 6.7 Calculate and interpret probabilities for a random variable that follows
the lognormal distribution.
 The Lognormal Distribution

 Defined for a positive random variable, the
lognormal distribution is positively skewed.
 Useful for describing variables such as
 Income
 Real estate values
 Asset prices
 Failure rate may increase or decrease over time.
1-247
Distributions
 Let X be a normally distributed random variable with
mean $\mu$ and standard deviation $\sigma$. The random
variable $Y = e^X$ follows the lognormal distribution
with a probability density function as
$f(y) = \frac{1}{y\sigma\sqrt{2\pi}} \exp\left(-\frac{(\ln(y) - \mu)^2}{2\sigma^2}\right)$ for $y > 0$,
where $\pi$ equals approximately 3.14159,
$\exp(x) = e^x$ is the exponential function, and
$e \approx 2.718$
1-248
Distributions
 The graphs below show the shapes of the
lognormal density function based on various values
of $\sigma$.
 The lognormal distribution is clearly positively
skewed for $\sigma > 1$. For $\sigma < 1$, the lognormal
distribution somewhat resembles the normal distribution.
1-249
Distributions
 The
1-250
Distributions
 Expected values and standard deviations of
the lognormal and normal distributions.
 Let X be a normal random variable with mean m
and standard deviation $\sigma$ and let $Y = e^X$ be the
corresponding lognormal variable. The mean
$\mu_Y$ and standard deviation $\sigma_Y$ of Y are derived as
$\mu_Y = \exp\left(\frac{2\mu + \sigma^2}{2}\right)$
$\sigma_Y = \sqrt{\left(\exp(\sigma^2) - 1\right)\exp(2\mu + \sigma^2)}$
1-251
Distributions
 Expected values and standard deviations of
the lognormal and normal distributions.
 Equivalently, the mean and standard deviation of
the normal variable X = ln(Y) are derived as
$\mu = \ln\left(\frac{\mu_Y^2}{\sqrt{\mu_Y^2 + \sigma_Y^2}}\right)$
$\sigma = \sqrt{\ln\left(1 + \frac{\sigma_Y^2}{\mu_Y^2}\right)}$
1-252
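The two sets of lognormal formulas are inverses of each other, which a short round-trip check can illustrate. The function names and the starting values (mean 1.0, standard deviation 0.5 for X) are assumptions made for this sketch.

```python
import math

def lognormal_moments(mu, sigma):
    """Mean and sd of Y = exp(X) when X is normal with mean mu and sd sigma."""
    mean_y = math.exp((2 * mu + sigma ** 2) / 2)
    sd_y = math.sqrt((math.exp(sigma ** 2) - 1) * math.exp(2 * mu + sigma ** 2))
    return mean_y, sd_y

def normal_params(mean_y, sd_y):
    """Mean and sd of X = ln(Y), recovered from the mean and sd of Y."""
    mu = math.log(mean_y ** 2 / math.sqrt(mean_y ** 2 + sd_y ** 2))
    sigma = math.sqrt(math.log(1 + sd_y ** 2 / mean_y ** 2))
    return mu, sigma

mean_y, sd_y = lognormal_moments(1.0, 0.5)    # hypothetical normal parameters
mu_back, sigma_back = normal_params(mean_y, sd_y)

print(f"mean_Y = {mean_y:.4f}, sd_Y = {sd_y:.4f}")
print(f"recovered mu = {mu_back:.4f}, sigma = {sigma_back:.4f}")
```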
Appendix A Table 1. Standard Normal Curve
1-253
Appendix A Table 2. Standard Normal Curve
1-254
LO 7.1: Differentiate between a population
parameter and a sample statistic.
LO 7.2: Explain common sample biases.
LO 7.3: Describe simple random sampling.
LO 7.4: Distinguish between stratified random
sampling and cluster sampling.
LO 7.5: Describe the properties of the sampling
distribution of the sample mean.
1-256
LO 7.6: Explain the importance of the central
limit theorem.
LO 7.7: Describe the properties of the sampling
distribution of the sample proportion.
LO 7.8: Use a finite population correction
factor.
LO 7.9: Construct and interpret control charts
for quantitative and qualitative data.
1-257
Marketing Iced Coffee
 In order to capitalize on the iced coffee trend,
Starbucks offered for a limited time half-priced
Frappuccino beverages between 3 pm and 5 pm.
 Anne Jones, manager at a local Starbucks,
determines the following from past historical data:
 43% of iced-coffee customers were women.
 21% were teenage girls.
 Customers spent an average of $4.18 on iced

coffee with a standard deviation of $0.84.
1-258
Marketing Iced Coffee
 One month after the marketing period ends, Anne
surveys 50 of her iced-coffee customers and finds:
 46% were women.
 34% were teenage girls.
 They spent an average of $4.26 on the drink.
 Anne wants to use this survey information to
calculate the probability that:
 Customers spend an average of $4.26 or more on iced
coffee.
 46% or more of iced-coffee customers are women.
 34% or more of iced-coffee customers are teenage girls.
1-259
7.1 Sampling
LO 7.1 Differentiate between a population parameter and sample statistic.
 Population—consists of all items of interest

in a statistical problem.
 Population Parameter is unknown.
 Sample—a subset of the population.
 Sample Statistic is calculated from sample and
used to make inferences about the population.
 Bias—the tendency of a sample statistic to
systematically over- or underestimate a
population parameter.
1-260
7.1 Sampling
LO 7.2 Explain common sample biases.
 Classic Case of a "Bad" Sample: The Literary
Digest Debacle of 1936
 During the 1936 presidential election, the Literary Digest
predicted a landslide victory for Alf Landon over Franklin
D. Roosevelt (FDR) with only a 1% margin of error.
 They were wrong! FDR won in a landslide election.
 The Literary Digest had committed selection bias by
randomly sampling from their own subscriber/
membership lists, etc.
 In addition, with only a 24% response rate, the Literary
Digest had a great deal of non-response bias.
1-261
LO 7.2 7.1 Sampling
 Selection bias—a systematic exclusion of certain
groups from consideration for the sample.
 The Literary Digest committed selection bias by excluding
a large portion of the population (e.g., lower income
voters).
 Nonresponse bias—a systematic difference in
preferences between respondents and non-
respondents to a survey or a poll.
 The Literary Digest had only a 24% response rate. This
indicates that only those who cared a great deal about the
election took the time to respond to the survey. These
respondents may be atypical of the population as a whole.
1-262
7.1 Sampling
LO 7.3 Describe simple random sampling.
 Sampling Methods
 Simple random sample is a sample of n
observations which has the same probability of
being selected from the population as any other
sample of n observations.
 Most statistical methods presume simple
random samples.
 However, in some situations other sampling
methods have an advantage over simple
random samples.
1-263
LO 7.3 7.1 Sampling
 Example: In 1961, students invested 24 hours per
week in their academic pursuits, whereas today's
students study an average of 14 hours per week.
 A dean at a large university in California wonders if this
trend is reflective of the students at her university. The
university has 20,000 students and the dean would like a
sample of 100. Use Excel to draw a simple random
sample of 100 students.
 In Excel, choose
Formulas > Insert function >
RANDBETWEEN and input
the values shown here.
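The same draw can be sketched outside Excel. The example below assumes students are numbered 1 to 20,000 and uses a fixed seed only so the result is repeatable. One difference worth noting: `random.sample` selects without replacement, whereas repeated RANDBETWEEN calls can return the same student number more than once.

```python
import random

rng = random.Random(42)     # fixed seed, used here only for repeatability

population_size = 20_000    # students at the university
sample_size = 100

# Simple random sample of student numbers 1..20,000, without replacement.
sample_ids = rng.sample(range(1, population_size + 1), k=sample_size)

print(sorted(sample_ids)[:10])   # first few selected student numbers
```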
1-264
7.1 Sampling
LO 7.4 Distinguish between stratified random sampling and cluster sampling.
 Stratified Random Sampling

 Divide the population into mutually exclusive and
collectively exhaustive groups, called strata.
 Randomly select observations from each stratum,
which are proportional to the stratum's size.
 Advantages:
 Guarantees that each population subdivision is
represented in the sample.
 Parameter estimates have greater precision than those
estimated from simple random sampling.
1-265
LO 7.4 7.1 Sampling
 Cluster Sampling
 Divide population into mutually exclusive and
collectively exhaustive groups, called clusters.
 Randomly select clusters.
 Sample every observation in those randomly
selected clusters.
 Advantages and disadvantages:
 Less expensive than other sampling methods.
 Less precision than simple random sampling or
stratified sampling.
 Useful when clusters occur naturally in the population.
1-266
LO 7.4 7.1 Sampling
 Stratified versus Cluster Sampling
 Stratified Sampling
 Sample consists of elements from each group.
 Preferred when the objective is to increase precision.
 Cluster Sampling
 Sample consists of elements from the selected groups.
 Preferred when the objective is to reduce costs.
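The contrast between the two methods can be sketched in a few lines. The population below (1,000 student IDs grouped by class year) is hypothetical, as are the function names; the stratified version allocates the sample proportionally to group size, and the cluster version keeps every member of the randomly chosen groups.

```python
import random

rng = random.Random(0)

# Hypothetical population: student IDs grouped by class year.
groups = {
    "freshman": list(range(0, 400)),
    "sophomore": list(range(400, 700)),
    "junior": list(range(700, 900)),
    "senior": list(range(900, 1000)),
}

def stratified_sample(groups, n, rng):
    """Draw from every group, in proportion to the group's size."""
    total = sum(len(members) for members in groups.values())
    sample = []
    for members in groups.values():
        k = round(n * len(members) / total)
        sample.extend(rng.sample(members, k))
    return sample

def cluster_sample(groups, n_clusters, rng):
    """Randomly select whole groups and keep every member of each."""
    chosen = rng.sample(sorted(groups), n_clusters)
    return chosen, [m for name in chosen for m in groups[name]]

strat = stratified_sample(groups, 50, rng)
chosen, clus = cluster_sample(groups, 2, rng)
print(len(strat), chosen, len(clus))
```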
1-267
7.2 The Sampling Distribution of the Sample Mean
LO 7.5 Describe the properties of the sampling distribution of the
sample mean.
 Population is described by parameters.
 A parameter is a constant, whose value may be unknown.
 Only one population.
 Sample is described by statistics.
 A statistic is a random variable whose value depends on
the chosen random sample.
 Statistics are used to make inferences about the
population parameters.
 Can draw multiple random samples of size n.
1-268
LO 7.5 7.2 The Sampling Distribution of the Sample Mean
 Estimator
 A statistic that is used to estimate a population
parameter.
 For example, $\bar{X}$, the mean of the sample, is an
estimator of $\mu$, the mean of the population.
 Estimate
 A particular value of the estimator.
 For example, the mean of the sample $\bar{x}$ is an
estimate of $\mu$, the mean of the population.
1-269
LO 7.5
Sample Mean
 Sampling Distribution of the Mean $\bar{X}$
 Each random sample of size n drawn from the
population provides an estimate of $\mu$: the sample
mean $\bar{x}$.
 Drawing many samples of size n results in many
different sample means, one for each sample.
 The sampling distribution of the mean is the
frequency or probability distribution of these
sample means.
1-270
LO 7.5
Sample Mean
 Example
[Figure: table of four random draws X1, X2, X3, and X4 from the population. Each column is one simple random sample, a single distribution of values of X. The column means (5.57, 5.71, 6.36, and 4.07; overall mean 5.43) form a distribution of means from each random draw from the population, that is, a sampling distribution.]
1-271
LO 7.5
Sample Mean
 The Expected Value and Standard Deviation
of the Sample Mean
 Expected Value
 The expected value of X,
$E(X) = \mu$
 The expected value of the mean,
$E(\bar{X}) = E(X) = \mu$
1-272
LO 7.5
Sample Mean
 The Expected Value and Standard Deviation
of the Sample Mean
 Variance of X: $Var(X) = \sigma^2$
 Standard Deviation
 of X: $SD(X) = \sqrt{\sigma^2} = \sigma$
 of $\bar{X}$: $SD(\bar{X}) = \frac{\sigma}{\sqrt{n}}$, where n is the sample size.
Also known as the Standard Error of the Mean.
1-273
LO 7.5
Sample Mean
 Example: Given that $\mu = 16$ inches and $\sigma = 0.8$
inches, determine the following:
 What are the expected value and the standard
deviation of the sample mean derived from a
random sample of
 2 pizzas: $E(\bar{X}) = \mu = 16$, $SD(\bar{X}) = \frac{\sigma}{\sqrt{n}} = \frac{0.8}{\sqrt{2}} = 0.57$
 4 pizzas: $E(\bar{X}) = \mu = 16$, $SD(\bar{X}) = \frac{\sigma}{\sqrt{n}} = \frac{0.8}{\sqrt{4}} = 0.40$
1-274
LO 7.5
Sample Mean
 Sampling from a Normal Distribution
 For any sample size n, the sampling distribution
of $\bar{X}$ is normal if the population X from which the
sample is drawn is normally distributed.
 If $\bar{X}$ is normal, then we can transform it into the
standard normal random variable as:
For a sampling distribution: $Z = \frac{\bar{X} - E(\bar{X})}{SD(\bar{X})} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
For a distribution of the values of X: $Z = \frac{X - E(X)}{SD(X)} = \frac{X - \mu}{\sigma}$
1-275
LO 7.5
Sample Mean
Note that each value $\bar{x}$ on $\bar{X}$ has a corresponding value z on Z given by the
transformation formula $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$.

                  Random Variable X-bar    Standard Normal Z
x1                 3                       -2.39
x2                 9                        4.30
                   4                       -1.28
                   2                       -3.51
                  10                        5.42
                   5                       -0.16
                   9                        4.30
                   4                       -1.28
                   9                        4.30
                   2                       -3.51
                   3                       -2.39
                   8                        3.19
                   4                       -1.28
x13                0                       -5.74
Means              5.14                     0.00
Standard Error     0.90                     1.00
1-276
LO 7.5
Sample Mean
 Example: Given that $\mu = 16$ inches and $\sigma = 0.8$
inches, determine the following:
 What is the probability that a randomly selected pizza is
less than 15.5 inches?
$z = \frac{x - \mu}{\sigma} = \frac{15.5 - 16}{0.8} = -0.63$, so $P(X < 15.5) = P(Z < -0.63) = 0.2643$ or 26.43%
 What is the probability that 2 randomly selected pizzas
average less than 15.5 inches?
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{15.5 - 16}{0.8/\sqrt{2}} = -0.88$, so $P(\bar{X} < 15.5) = P(Z < -0.88) = 0.1894$ or 18.94%
1-277
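The pizza calculations can be sketched as follows, assuming only the stated mean of 16 inches and standard deviation of 0.8 inches. The helper names are illustrative. Because the code does not round z to two decimals, its probabilities differ from the z-table answers in the third decimal place.

```python
import math
from statistics import NormalDist

mu, sigma = 16, 0.8     # pizza diameter parameters, in inches

def standard_error(sigma, n):
    """Standard deviation of the sample mean for samples of size n."""
    return sigma / math.sqrt(n)

def prob_mean_below(x, n):
    """P(sample mean < x) when the population is normal."""
    return NormalDist(mu, standard_error(sigma, n)).cdf(x)

se_2 = standard_error(sigma, 2)
se_4 = standard_error(sigma, 4)
p_one_pizza = prob_mean_below(15.5, 1)    # a single pizza: n = 1
p_two_pizzas = prob_mean_below(15.5, 2)   # average of two pizzas

print(f"SE(n=2) = {se_2:.2f}, SE(n=4) = {se_4:.2f}")
print(f"P(X < 15.5) = {p_one_pizza:.4f}, P(X-bar < 15.5) = {p_two_pizzas:.4f}")
```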
Sample Mean
LO 7.6 Explain the importance of the central limit theorem.
 The Central Limit Theorem
 For any population X with expected value $\mu$ and standard
deviation $\sigma$, the sampling distribution of $\bar{X}$ will be
approximately normal if the sample size n is sufficiently
large.
 As a general guideline, the normal distribution
approximation is justified when n > 30.
 As before, if $\bar{X}$ is approximately
normal, then we can transform it to $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
1-278
LO 7.6
Sample Mean
 The Central Limit Theorem
[Figure: two panels. Sampling distribution of $\bar{X}$ when the population has a uniform distribution, and sampling distribution of $\bar{X}$ when the population has an exponential distribution.]
1-279
LO 7.6
Sample Mean
 Example: From the introductory case, Anne wants
to determine if the marketing campaign has had a
lingering effect on the amount of money customers
spend on iced coffee.
 Before the campaign, $\mu = \$4.18$ and $\sigma = \$0.84$. Based on 50
customers sampled after the campaign, $\bar{x} = \$4.26$.
 Let's find $P(\bar{X} \ge 4.26)$. Since n > 30, the central limit
theorem states that $\bar{X}$ is approximately normal. So,
$P(\bar{X} \ge 4.26) = P\left(Z \ge \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}\right) = P\left(Z \ge \frac{4.26 - 4.18}{0.84/\sqrt{50}}\right) = P(Z \ge 0.67) = 1 - 0.7486 = 0.2514$
1-280
Sample Proportion
LO 7.7 Describe the properties of the sampling distribution of the sample
proportion.
 Estimator
 Sample proportion $\bar{P}$ is used to estimate the
population parameter p.
 Estimate
 A particular value $\bar{p}$ of the estimator $\bar{P}$.
1-281
LO 7.7
Sample Proportion
 The Expected Value and Standard
Deviation of the Sample Proportion
 Expected Value
 The expected value of $\bar{P}$,
$E(\bar{P}) = p$
 The standard deviation of $\bar{P}$,
$SD(\bar{P}) = \sqrt{\frac{p(1 - p)}{n}}$
1-282
LO 7.7
Sample Proportion
 The Central Limit Theorem for the Sample
Proportion
 For any population proportion p, the sampling
distribution of $\bar{P}$ is approximately normal if the
sample size n is sufficiently large.
 As a general guideline, the normal distribution
approximation is justified when
np > 5 and n(1 − p) > 5.
1-283
LO 7.7
Sample Proportion
Proportion
 If $\bar{P}$ is normal, we can transform it into the
standard normal random variable as
$Z = \frac{\bar{P} - E(\bar{P})}{SD(\bar{P})} = \frac{\bar{P} - p}{\sqrt{p(1 - p)/n}}$
 Therefore any value $\bar{p}$ on $\bar{P}$
has a corresponding value
z on Z given by $z = \frac{\bar{p} - p}{\sqrt{p(1 - p)/n}}$
1-284
LO 7.7
Sample Proportion
Proportion
[Figure: two panels. Sampling distribution of $\bar{P}$ when the population proportion is p = 0.10, and sampling distribution of $\bar{P}$ when the population proportion is p = 0.30.]
1-285
LO 7.7
Sample Proportion
 Example: From the introductory case, Anne wants

to determine if the marketing campaign has had a
lingering effect on the proportion of customers who
are women and teenage girls.
 Before the campaign, p = 0.43 for women and p = 0.21 for
teenage girls. Based on 50 customers sampled after the
campaign, $\bar{p} = 0.46$ and $\bar{p} = 0.34$, respectively.
 Let's find $P(\bar{P} \ge 0.46)$. Since np = 50(0.43) = 21.5 and n(1 − p) = 50(0.57) = 28.5
both exceed 5, the central limit
theorem states that $\bar{P}$ is approximately normal.
1-286
LO 7.7
Sample Proportion
$P(\bar{P} \ge 0.46) = P\left(Z \ge \frac{\bar{p} - p}{\sqrt{p(1 - p)/n}}\right) = P\left(Z \ge \frac{0.46 - 0.43}{\sqrt{0.43(1 - 0.43)/50}}\right) = P(Z \ge 0.43) = 1 - 0.6664 = 0.3336$
1-287
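Anne's three probabilities from the introductory case can be sketched together. The code assumes the pre-campaign parameters given in the case and a sample of 50; the helper name `prop_upper_tail` is hypothetical. Results differ slightly from z-table answers because z is not rounded to two decimals, and the teenage-girl probability is computed here only as an illustration of the same formula.

```python
import math
from statistics import NormalDist

Z = NormalDist()
n = 50

# Sample mean: P(X-bar >= 4.26) with mu = 4.18 and sigma = 0.84.
z_mean = (4.26 - 4.18) / (0.84 / math.sqrt(n))
p_mean = 1 - Z.cdf(z_mean)

def prop_upper_tail(p_bar, p, n):
    """P(P-bar >= p_bar) using the normal approximation."""
    se = math.sqrt(p * (1 - p) / n)
    return 1 - Z.cdf((p_bar - p) / se)

p_women = prop_upper_tail(0.46, 0.43, n)
p_teen = prop_upper_tail(0.34, 0.21, n)

print(f"mean: {p_mean:.4f}, women: {p_women:.4f}, teenage girls: {p_teen:.4f}")
```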
7.4 The Finite Population Correction
Factor
LO 7.8 Use a finite population correction factor.
 The Finite Population Correction Factor
 Used to reduce the sampling variation of $\bar{X}$.
 The resulting standard deviation is
$SD(\bar{X}) = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$
 The transformation of $\bar{X}$ to Z is made
accordingly.
1-288
LO 7.8
Factor
 The Finite Population Correction Factor for
the Sample Proportion
 Used to reduce the sampling variation of the
sample proportion $\bar{P}$.
 The resulting standard deviation is
$SD(\bar{P}) = \sqrt{\frac{p(1 - p)}{n}} \sqrt{\frac{N - n}{N - 1}}$
 The transformation of $\bar{P}$ to Z is made
accordingly.
1-289
LO 7.8
Factor
 Example: A large introductory marketing class with
340 students has been divided up into 10 groups.
Connie is in a group of 34 students that averaged
72 on the midterm. The class average was 73 with
a standard deviation of 10.
 The population parameters are: $\mu = 73$ and $\sigma = 10$.
 $E(\bar{X}) = \mu = 73$, but since n = 34 is more than 5% of the
population size N = 340, we need to use the finite
population correction factor.
$SD(\bar{X}) = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} = \frac{10}{\sqrt{34}} \sqrt{\frac{340 - 34}{340 - 1}} = 1.63$
1-290
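The effect of the correction factor in the marketing-class example can be seen by computing the standard error with and without it. This is a minimal sketch; the function name is an assumption.

```python
import math

def se_mean_fpc(sigma, n, N):
    """Standard error of the sample mean with the finite population correction."""
    return sigma / math.sqrt(n) * math.sqrt((N - n) / (N - 1))

sigma, n, N = 10, 34, 340            # values from the example

se_uncorrected = sigma / math.sqrt(n)
se_corrected = se_mean_fpc(sigma, n, N)

print(f"without correction: {se_uncorrected:.2f}, with correction: {se_corrected:.2f}")
```

The correction shrinks the standard error because a sample covering 10% of the population leaves less room for sampling variation.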
7.5 Statistical Quality Control
LO 7.9 Construct and interpret control charts for quantitative and
qualitative data.
 Statistical Quality Control

 Involves statistical techniques used to develop
and maintain a firm's ability to produce high-quality
goods and services.
 Two Approaches for Statistical Quality Control
 Acceptance Sampling
 Detection Approach
1-291
LO 7.9 7.5 Statistical Quality Control
 Acceptance Sampling
 Used at the completion of a production process or
service.
 If a particular product does not conform to certain
specifications, then it is either discarded or
repaired.
 Disadvantages
 It is costly to discard or repair a product.
 The detection of all defective products is not

guaranteed.
1-292
 Detection Approach
 Inspection occurs during the production process
in order to detect any nonconformance to
specifications.
 Goal is to determine whether the production
process should be continued or adjusted before
producing a large number of defects.
 Types of variation:
 Chance variation.
 Assignable variation.
1-293
 Types of Variation
 Chance variation (common variation) is:
 Caused by a number of randomly occurring events that
are part of the production process.
 Not controllable by the individual worker or machine.
 Expected, so not a source of alarm as long as its
magnitude is tolerable and the end product meets
specifications.
 Assignable variation (special cause variation) is:
 Caused by specific events or factors that can usually be
identified and eliminated.
 Identified and corrected or removed.
1-294
 Control Charts
 Developed by Walter A. Shewhart.
 A plot of calculated statistics of the production
process over time.
 Production process is "in control" if the calculated
statistics fall in an expected range.
 Production process is "out of control" if calculated
statistics reveal an undesirable trend.
 For quantitative data: $\bar{x}$ chart.
 For qualitative data: $\bar{p}$ chart.
1-295
 Control Charts for Quantitative Data
 x Control Charts
 Centerline—the mean when the process is
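under control.
 Upper control limit: set at three standard deviations of the sample mean ($+3\sigma/\sqrt{n}$) above the mean.
 Points falling above the upper control limit are
considered to be out of control.
 Lower control limit: set at three standard deviations of the sample mean ($-3\sigma/\sqrt{n}$) below the mean.
 Points falling below the lower control limit are
considered to be out of control.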
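1-296
 Control Charts for Quantitative Data
 $\bar{x}$ Control Charts
 Upper control limit (UCL): $\mu + 3\frac{\sigma}{\sqrt{n}}$
 Lower control limit (LCL): $\mu - 3\frac{\sigma}{\sqrt{n}}$
[Figure: $\bar{x}$ chart of sample means plotted around the centerline, between the UCL and the LCL. Process is in control: all points fall within the control limits.]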
1-297
 Control Charts for Qualitative Data
 p chart (fraction defective or percent defective chart).
 Tracks proportion of defects in a production process.
 Relies on central limit theorem for normal approximation
for the sampling distribution of the sample proportion.
 Centerline: the mean when the process is under control.
 Upper control limit: set at three standard deviations of the sample proportion above the centerline.
 Points falling above the upper control limit are considered to be out of control.
 Lower control limit: set at three standard deviations of the sample proportion below the centerline.
 Points falling below the lower control limit are considered to be out of control.
1-298
 Control Charts for Qualitative Data
 $\bar{p}$ Control Charts
 Upper control limit (UCL): $p + 3\sqrt{\frac{p(1 - p)}{n}}$
 Lower control limit (LCL): $p - 3\sqrt{\frac{p(1 - p)}{n}}$
[Figure: $\bar{p}$ chart of sample proportions plotted around the centerline, with the UCL and the LCL. Process is out of control: some
points fall above the UCL.]
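The control limits for both chart types can be sketched as below. All numbers in the usage lines are hypothetical, and the function names are assumptions. Flooring the lower limit of the proportion chart at zero is also an assumption added here, since a proportion cannot be negative.

```python
import math

def xbar_limits(mu, sigma, n):
    """(LCL, UCL) for an x-bar chart: mu -/+ 3 * sigma / sqrt(n)."""
    half_width = 3 * sigma / math.sqrt(n)
    return mu - half_width, mu + half_width

def p_limits(p, n):
    """(LCL, UCL) for a p-bar chart: p -/+ 3 * sqrt(p(1 - p) / n)."""
    half_width = 3 * math.sqrt(p * (1 - p) / n)
    return max(p - half_width, 0.0), p + half_width   # LCL floored at 0 (assumption)

def out_of_control(points, lcl, ucl):
    """Indexes of the plotted statistics that fall outside the limits."""
    return [i for i, value in enumerate(points) if value < lcl or value > ucl]

# Hypothetical process values, for illustration only.
x_lcl, x_ucl = xbar_limits(mu=16, sigma=0.8, n=4)
p_lcl, p_ucl = p_limits(p=0.10, n=100)
flagged = out_of_control([0.05, 0.20, 0.12], p_lcl, p_ucl)

print((x_lcl, x_ucl), (p_lcl, p_ucl), flagged)
```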
1-299
LO 8.1: Discuss point estimators and their desirable properties.
LO 8.2: Explain an interval estimator.
LO 8.3: Calculate a confidence interval for the population mean when
the population standard deviation is known.
LO 8.4: Describe the factors that influence the width of a confidence
interval.
LO 8.5: Discuss features of the t distribution.
LO 8.6: Calculate a confidence interval for the population mean when
the population standard deviation is not known.
LO 8.7: Calculate a confidence interval for the population proportion.
LO 8.8: Select a sample size to estimate the population mean and the
population proportion.
1-301
Fuel Usage of “Ultra-Green” Cars
 A car manufacturer advertises that its new
"ultra-green" car obtains an average of 100 mpg
and, based on its fuel emissions, has earned an
A+ rating from the Environmental Protection
Agency.
 Pinnacle Research, an independent consumer
advocacy firm, obtains a sample of 25 cars for
testing purposes.
 Each car is driven the same distance in identical
conditions in order to obtain the car's mpg.
1-302
Fuel Usage of “Ultra-Green” Cars
 The mpg for each "Ultra-Green" car is given below.
 Jared would like to use the data in this sample to:

 Estimate with 90% confidence
 The mean mpg of all ultra-green cars.
 The proportion of all ultra-green cars that obtain over

100 mpg.
 Determine the sample size needed to achieve a specified
level of precision in the mean and proportion estimates.
1-303
8.1 Point Estimators and Their Properties
LO 8.1 Discuss point estimators and their desirable properties.
 Point Estimator
 A function of the random sample used to make
inferences about the value of an unknown population
parameter.
 For example, $\bar{X}$ is a point estimator for $\mu$ and $\bar{P}$ is a point
estimator for p.
 Point Estimate
 The value of the point estimator derived from a given
sample.
 For example, $\bar{x} = 96.5$ is a point estimate of the mean mpg
for all ultra-green cars.
1-304
LO 8.1 8.1 Point Estimators and Their Properties
 Example:
1-305
 Properties of Point Estimators
 Unbiased
 An estimator is unbiased if its expected value equals
the unknown population parameter being estimated.
 Efficient
 An unbiased estimator is efficient if its standard error is
lower than that of other unbiased estimators.
 Consistent
 An estimator is consistent if it approaches the unknown
population parameter being estimated as the sample
size grows larger.
1-306
 Properties of Point Estimators Illustrated:
Unbiased Estimators
 The distributions of unbiased (U1) and biased (U2)
estimators.
1-307
Efficient Estimators
 The distributions of efficient (V1) and less efficient
(V2) estimators.
1-308
Consistent Estimator
 The distribution of a consistent estimator $\bar{X}$
for various sample sizes.
1-309
8.2 Confidence Interval of the Population
Mean When $\sigma$ Is Known
LO 8.2 Explain an interval estimator.
 Confidence Interval—provides a range of
values that, with a certain level of confidence,
contains the population parameter of interest.
 Also referred to as an interval estimate.
 Construct a confidence interval as:
Point estimate ± Margin of error.
 Margin of error accounts for the variability of the
estimator and the desired confidence level of the
interval.
1-310
LO 8.3 Calculate a confidence interval for the population mean when the
population standard deviation is known.
 Constructing a Confidence Interval for $\mu$
When $\sigma$ is Known
 Consider a standard normal random variable Z.
 $P(-1.96 \le Z \le 1.96) = 0.95$
as illustrated here.
1-311
LO 8.3

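When $\sigma$ is Known
 Since $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
 We get $P\left(-1.96 \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le 1.96\right) = 0.95$
 Which, after algebraically manipulating, is
equal to $P\left(\mu - 1.96\frac{\sigma}{\sqrt{n}} \le \bar{X} \le \mu + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95$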
1-312
LO 8.3

When $\sigma$ is Known
 Note that $P\left(\mu - 1.96\frac{\sigma}{\sqrt{n}} \le \bar{X} \le \mu + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95$
implies there is a 95% probability that the sample
mean $\bar{X}$ will fall within the interval $\mu \pm 1.96\frac{\sigma}{\sqrt{n}}$.
 Thus, if samples of size n are drawn repeatedly
from a given population, 95% of the computed
sample means, $\bar{x}$'s, will fall within the interval
and the remaining 5% will fall outside the interval.
1-313
LO 8.3

When $\sigma$ is Known
 Since we do not know $\mu$, we cannot determine if a
particular $\bar{x}$ falls within the interval or not.
 However, we do know that $\bar{x}$ will fall within the
interval $\mu \pm 1.96\frac{\sigma}{\sqrt{n}}$ if and only if $\mu$ falls within the
interval $\bar{x} \pm 1.96\frac{\sigma}{\sqrt{n}}$.
 This will happen 95% of the time given the
interval construction. Thus, this is a 95%
confidence interval for the population mean.
1-314
LO 8.3

When $\sigma$ is Known
 Level of significance (i.e., probability of error) = $\alpha$.
 Confidence coefficient = $(1 - \alpha)$
$\alpha$ = 1 − confidence coefficient
 A $100(1 - \alpha)\%$ confidence interval of the population
mean $\mu$ when the standard deviation $\sigma$ is known
is computed as $\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
or equivalently, $\left[\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right]$.
1-315
LO 8.3
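When $\sigma$ is Known
 $z_{\alpha/2}$ is the z value associated
with the probability of $\alpha/2$
in the upper tail.
$\left[\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right]$
 Confidence Intervals:
 90%, $\alpha = 0.10$, $\alpha/2 = 0.05$, $z_{\alpha/2} = z_{.05} = 1.645$.
 95%, $\alpha = 0.05$, $\alpha/2 = 0.025$, $z_{\alpha/2} = z_{.025} = 1.96$.
 99%, $\alpha = 0.01$, $\alpha/2 = 0.005$, $z_{\alpha/2} = z_{.005} = 2.575$.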
1-316
LO 8.3
 Example: Constructing a Confidence Interval
for $\mu$ When $\sigma$ is Known
 A sample of 25 cereal boxes of Granola Crunch, a
generic brand of cereal, yields a mean weight of
1.02 pounds of cereal per box.
 Construct a 95% confidence interval of the mean
weight of all cereal boxes.
 Assume that the weight is normally distributed
with a population standard deviation of 0.03
pounds.
1-317
LO 8.3
When $\sigma$ is Known
 This is what we know: $n = 25$, $\bar{x} = 1.02$ pounds,
$\alpha = 1 - 0.95 = 0.05$, $z_{\alpha/2} = 1.96$,
$\sigma = 0.03$.
 Substituting these values, we get
$\bar{x} \pm 1.96\,\dfrac{\sigma}{\sqrt{n}} = 1.02 \pm 1.96\,\dfrac{0.03}{\sqrt{25}} = 1.02 \pm 0.012$
or, with 95% confidence, the mean weight of all
cereal boxes falls between 1.008 and 1.032
pounds.
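The cereal-box interval can be checked numerically. The following is a minimal Python sketch using only the standard library; the helper name `z_interval` is illustrative and not part of the slides. It uses the exact normal quantile instead of the rounded table value 1.96, so later digits may differ slightly from a hand calculation.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, conf=0.95):
    """Confidence interval for the mean when sigma is known (illustrative helper)."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}, about 1.96 for 95%
    margin = z * sigma / sqrt(n)             # margin of error
    return xbar - margin, xbar + margin

# Values from the Granola Crunch example: n = 25, x-bar = 1.02, sigma = 0.03
lower, upper = z_interval(xbar=1.02, sigma=0.03, n=25)
print(f"{lower:.3f} to {upper:.3f} pounds")
```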
1-318
LO 8.3
 Interpreting a Confidence Interval
 Interpreting a confidence interval requires care.
 Incorrect: The probability that $\mu$ falls in the
interval is 0.95.
 Correct: If numerous samples of size n are drawn
from a given population, then 95% of the intervals
formed by the formula $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ will contain $\mu$.
 Since there are many possible samples, we
will be right 95% of the time, thus giving us
95% confidence.
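The repeated-sampling interpretation can be illustrated by simulation. This is a rough sketch under stated assumptions: the population is taken to be normal with a mean of 1.0 and standard deviation of 0.03 (the mean is an arbitrary choice made only so that the "true" value is known), and the function name `coverage` and the fixed seed are illustrative.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, conf=0.95, reps=5000, seed=0):
    """Fraction of simulated intervals x-bar +/- z*sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)                 # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / reps

# The proportion of intervals containing mu should be close to 0.95.
print(coverage(mu=1.0, sigma=0.03, n=25))
```

Any single interval either contains the mean or it does not; the 95% figure describes the long-run share of intervals that do.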
1-319
LO 8.4 Describe the factors that influence the width of a confidence interval.
 The Width of a Confidence Interval
 Margin of error: $z_{\alpha/2}\,\sigma/\sqrt{n}$
 Confidence interval width: $2\,z_{\alpha/2}\,\sigma/\sqrt{n}$
 The width of the confidence interval is
influenced by the:
 Sample size $n$.
 Standard deviation $\sigma$.
 Confidence level $100(1-\alpha)\%$.
1-320
LO 8.3 8.2 Confidence Interval of the Population Mean When $\sigma$ Is Known
 The Width of a Confidence Interval is influenced by:
I. For a given confidence level $100(1-\alpha)\%$ and sample size
n, the width of the interval is wider, the greater the
population standard deviation $\sigma$.
 Example: Let the standard deviation of the population of
cereal boxes of Granola Crunch be 0.05 instead of 0.03.
Compute a 95% confidence interval based on the same
sample information.
$\bar{x} \pm z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}} = 1.02 \pm 1.96\,\dfrac{0.05}{\sqrt{25}} = 1.02 \pm 0.020$
 This confidence interval width has increased from 0.024 to
2(0.020) = 0.040.
1-321
II. For a given confidence level $100(1-\alpha)\%$ and population
standard deviation $\sigma$, the width of the interval is wider, the
smaller the sample size n.
 Example: Instead of 25 observations, let the sample be
based on 16 cereal boxes of Granola Crunch. Compute a
95% confidence interval using a sample mean of 1.02
pounds and a population standard deviation of 0.03.
$\bar{x} \pm z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}} = 1.02 \pm 1.96\,\dfrac{0.03}{\sqrt{16}} = 1.02 \pm 0.015$
 This confidence interval width has increased from 0.024 to
2(0.015) = 0.030.
1-322
III. For a given sample size n and population standard
deviation $\sigma$, the width of the interval is wider, the greater
the confidence level $100(1-\alpha)\%$.
 Example: Instead of a 95% confidence interval, compute a
99% confidence interval based on the information from the
sample of Granola Crunch cereal boxes.
$\bar{x} \pm z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}} = 1.02 \pm 2.575\,\dfrac{0.03}{\sqrt{25}} = 1.02 \pm 0.015$
 This confidence interval width has increased from 0.024 to
2(0.015) = 0.030.
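The three effects on width can be compared side by side. The sketch below assumes the width formula $2\,z_{\alpha/2}\,\sigma/\sqrt{n}$ from the slides; the function name `ci_width` is illustrative. Because it uses exact normal quantiles and does not round the margin of error first, its values can differ in the third decimal from the hand-rounded figures above.

```python
from math import sqrt
from statistics import NormalDist

def ci_width(sigma, n, conf):
    """Width of the z-based interval: 2 * z_{alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return 2 * z * sigma / sqrt(n)

base = ci_width(sigma=0.03, n=25, conf=0.95)         # original cereal example
larger_sigma = ci_width(sigma=0.05, n=25, conf=0.95)  # case I
smaller_n = ci_width(sigma=0.03, n=16, conf=0.95)     # case II
higher_conf = ci_width(sigma=0.03, n=25, conf=0.99)   # case III

for label, width in [("base", base), ("larger sigma", larger_sigma),
                     ("smaller n", smaller_n), ("higher confidence", higher_conf)]:
    print(f"{label}: {width:.4f}")
```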
1-323
Confidence Interval of the Population Mean When $\sigma$ Is Unknown
LO 8.5 Discuss features of the t distribution.
 The t Distribution
 If repeated samples of size n are taken from a
normal population with a finite variance, then the
statistic $T = \dfrac{\bar{X} - \mu}{S/\sqrt{n}}$ follows the t distribution
with $(n - 1)$ degrees of freedom, df.
 Degrees of freedom determine
the extent of the broadness of the tails of the
distribution; the fewer the degrees of freedom,
the broader the tails.
1-325
LO 8.5
 Summary of the $t_{df}$ Distribution
 Bell-shaped and symmetric around 0 with
asymptotic tails (the tails get closer and closer to
the horizontal axis, but never touch it).
 Has slightly broader tails than the z distribution.
 Consists of a family of distributions where the
actual shape of each one depends on the df. As
df increases, the $t_{df}$ distribution becomes similar to
the z distribution; it is identical to the z distribution
when df approaches infinity.
1-326
LO 8.5
[Figure: the $t_{df}$ distribution with various degrees of freedom]
1-327
LO 8.5
 Example: Compute $t_{\alpha,df}$ for $\alpha = 0.025$ using 2, 5,
and 50 degrees of freedom.
 Solution: Turning to the Student’s t Distribution
table in Appendix A, we find that
 For df = 2, $t_{0.025,2} = 4.303$.
 For df = 5, $t_{0.025,5} = 2.571$.
 For df = 50, $t_{0.025,50} = 2.009$.
 Note that the $t_{df}$ values change with the degrees
of freedom. Further, as df increases, the $t_{df}$
distribution begins to resemble the z distribution.
1-328
LO 8.6 Calculate a confidence interval for the population mean when
the population standard deviation is not known.
When $\sigma$ is Unknown
 A $100(1-\alpha)\%$ confidence interval of the
population mean $\mu$ when the population standard
deviation $\sigma$ is not known is computed as
$\bar{x} \pm t_{\alpha/2,df}\,s/\sqrt{n}$ or equivalently, $\left[\bar{x} - t_{\alpha/2,df}\,s/\sqrt{n},\ \bar{x} + t_{\alpha/2,df}\,s/\sqrt{n}\right]$,
where s is the sample standard deviation.
1-329
LO 8.6
 Example: Recall that Jared Beane wants to
estimate the mean mpg of all ultra-green cars. Use
the sample information to construct a 90%
confidence interval of the population mean. Assume
that mpg follows a normal distribution.
 Solution: Since the population standard deviation
is not known, the sample standard deviation has
to be computed from the sample. As a result, the
90% confidence interval is
$\bar{x} \pm t_{\alpha/2,df}\,\dfrac{s}{\sqrt{n}} = 96.52 \pm 1.711\,\dfrac{10.70}{\sqrt{25}} = 96.52 \pm 3.66$
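The same arithmetic can be sketched in Python. The standard library has no t distribution, so this sketch takes the critical value 1.711 (the table value used in the slide for 24 degrees of freedom and a 90% interval) as an input; the helper name `t_interval` is illustrative.

```python
from math import sqrt

def t_interval(xbar, s, n, t_crit):
    """Interval x-bar +/- t_crit * s / sqrt(n); t_crit is read from a t table."""
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin

# Values from the ultra-green car example: n = 25, x-bar = 96.52, s = 10.70
lower, upper = t_interval(xbar=96.52, s=10.70, n=25, t_crit=1.711)
print(f"{lower:.2f} to {upper:.2f} mpg")
```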
1-330
LO 8.6

 Using Excel to construct confidence intervals. The
easiest way to estimate the mean when the population
standard deviation is unknown is as follows:
 Open the MPG data file.
 From the menu choose Data >
Data Analysis > Descriptive
Statistics > OK.
 Specify the values as shown
here and click OK.
 Scroll down through the output
until you see the Confidence
Interval.
1-331
Confidence Interval of the Population Proportion
LO 8.7 Calculate a confidence interval for the population proportion.
 Let the parameter $p$ represent the proportion
of successes in the population, where
success is defined by a particular outcome.
 $\bar{P}$ is the point estimator of the population
proportion $p$.
 By the central limit theorem, $\bar{P}$ can be
approximated by a normal distribution for
large samples (i.e., $np \ge 5$ and $n(1 - p) \ge 5$).
1-332
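LO 8.7
 Thus, a $100(1-\alpha)\%$ confidence interval of the
population proportion is
$\bar{p} \pm z_{\alpha/2}\sqrt{\dfrac{\bar{p}(1-\bar{p})}{n}}$ or $\left[\bar{p} - z_{\alpha/2}\sqrt{\dfrac{\bar{p}(1-\bar{p})}{n}},\ \bar{p} + z_{\alpha/2}\sqrt{\dfrac{\bar{p}(1-\bar{p})}{n}}\right]$,
where $\bar{p}$ is used to estimate the population
parameter $p$.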
1-333
LO 8.7
 Example: Recall that Jared Beane wants to
estimate the proportion of all ultra-green cars that
obtain over 100 mpg. Use the sample information to
construct a 90% confidence interval of the
population proportion.
 Solution: Note that $\bar{p} = 7/25 = 0.28$. In addition,
the normality assumption is met since $n\bar{p} \ge 5$ and
$n(1 - \bar{p}) \ge 5$. Thus,
$\bar{p} \pm z_{\alpha/2}\sqrt{\dfrac{\bar{p}(1-\bar{p})}{n}} = 0.28 \pm 1.645\sqrt{\dfrac{0.28(1-0.28)}{25}} = 0.28 \pm 0.148$
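A short Python sketch of the proportion interval follows, under the assumption that the sample has 7 successes out of 25 as in the example. The helper name `proportion_interval` is illustrative, and the exact normal quantile is used in place of the table value 1.645.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(x, n, conf=0.90):
    """Interval p-bar +/- z_{alpha/2} * sqrt(p-bar * (1 - p-bar) / n)."""
    p_bar = x / n
    # Large-sample check used in the slides
    assert n * p_bar >= 5 and n * (1 - p_bar) >= 5, "normal approximation may be poor"
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    margin = z * sqrt(p_bar * (1 - p_bar) / n)
    return p_bar - margin, p_bar + margin

lower, upper = proportion_interval(x=7, n=25, conf=0.90)
print(f"{lower:.3f} to {upper:.3f}")
```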
1-334
8.5 Selecting a Useful Sample Size
LO 8.8 Select a sample size to estimate the population mean and the
population proportion.
 Precision in interval estimates is implied by a
low margin of error.
 A larger n reduces the margin of error for
the interval estimates.
 How large should the sample size be for a
given margin of error?
1-335
LO 8.8 8.5 Selecting a Useful Sample Size
 Selecting n to Estimate $\mu$
 Consider a confidence interval for $\mu$ with a
known $\sigma$ and let D denote the desired margin of
error.
 Since $D = z_{\alpha/2}\,\sigma/\sqrt{n}$,
we may rearrange to get $n = \left(\dfrac{z_{\alpha/2}\,\sigma}{D}\right)^2$.
 If $\sigma$ is unknown, estimate it with $\hat{\sigma}$.
1-336
 Selecting n to Estimate $\mu$
 For a desired margin of error D, the minimum
sample size n required to estimate a $100(1-\alpha)\%$
confidence interval of the population mean $\mu$ is
$n = \left(\dfrac{z_{\alpha/2}\,\hat{\sigma}}{D}\right)^2$,
where $\hat{\sigma}$ is a reasonable estimate of $\sigma$ in the
planning stage.
1-337
 Example: Recall that Jared Beane wants to
construct a 90% confidence interval of the mean
mpg of all ultra-green cars.
 Suppose Jared would like to constrain the margin of error
to within 2 mpg. Further, the lowest mpg in the population
is 76 mpg and the highest is 118 mpg.
 How large a sample does Jared need to compute the 90%
confidence interval of the population mean?
 Using $\hat{\sigma} = \text{range}/4 = (118 - 76)/4 = 10.50$,
$n = \left(\dfrac{z_{\alpha/2}\,\hat{\sigma}}{D}\right)^2 = \left(\dfrac{1.645 \times 10.50}{2}\right)^2 = 74.58$, which is rounded up to 75.
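The sample-size formula for the mean can be sketched as follows. The planning value 10.50 is taken from the slide (it equals the range divided by 4); the function name `sample_size_mean` is illustrative, and the result is rounded up because n must be a whole number at least as large as the computed value.

```python
from math import ceil
from statistics import NormalDist

def sample_size_mean(sigma_hat, D, conf):
    """Minimum n so that z_{alpha/2} * sigma_hat / sqrt(n) <= D."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return ceil((z * sigma_hat / D) ** 2)

sigma_hat = (118 - 76) / 4   # range / 4 = 10.50, the planning estimate
print(sample_size_mean(sigma_hat=sigma_hat, D=2, conf=0.90))
```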
1-338
 Selecting n to Estimate $p$
 Consider a confidence interval for $p$ and let D
denote the desired margin of error.
 Since $D = z_{\alpha/2}\sqrt{\dfrac{\bar{p}(1-\bar{p})}{n}}$, where $\bar{p}$ is the
sample proportion,
we may rearrange to get $n = \left(\dfrac{z_{\alpha/2}}{D}\right)^2 \bar{p}(1-\bar{p})$.
 Since $\bar{p}$ comes from a sample, we must use a
reasonable estimate of $p$, that is, $\hat{p}$.
1-339
 Selecting n to Estimate $p$
 For a desired margin of error D, the minimum
sample size n required to estimate a $100(1-\alpha)\%$
confidence interval of the population proportion
$p$ is
$n = \left(\dfrac{z_{\alpha/2}}{D}\right)^2 \hat{p}(1-\hat{p})$,
 where $\hat{p}$ is a reasonable estimate of $p$ in the
planning stage.
1-340
 Example: Recall that Jared Beane wants to
construct a 90% confidence interval of the
proportion of all ultra-green cars that obtain over
100 mpg.
 Jared does not want the margin of error to be more than
0.10.
 How large a sample does Jared need for his analysis of
the population proportion?
$n = \left(\dfrac{z_{\alpha/2}}{D}\right)^2 \hat{p}(1-\hat{p}) = \left(\dfrac{1.645}{0.10}\right)^2 0.50(1-0.50) = 67.65$, which is rounded up to 68.
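A matching sketch for the proportion case is shown below. It assumes the planning value $\hat{p} = 0.50$ used in the slide, which is the most conservative choice because $\hat{p}(1-\hat{p})$ is largest at 0.50; the function name `sample_size_proportion` is illustrative.

```python
from math import ceil
from statistics import NormalDist

def sample_size_proportion(p_hat, D, conf):
    """Minimum n so that z_{alpha/2} * sqrt(p_hat * (1 - p_hat) / n) <= D."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return ceil((z / D) ** 2 * p_hat * (1 - p_hat))

print(sample_size_proportion(p_hat=0.50, D=0.10, conf=0.90))
```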
1-341
LO 9.1: Define the null hypothesis and the alternative
hypothesis.
LO 9.2: Distinguish between Type I and Type II errors.
LO 9.3: Explain the steps of a hypothesis test using
the p-value approach.
LO 9.4: Explain the steps of a hypothesis test using
the critical value approach.
LO 9.5: Differentiate between the test statistics for the
population mean.
LO 9.6: Specify the test statistic for the population
proportion.
1-343
Undergraduate Study Habits
 Are today’s college students studying hard or
hardly studying?
 A recent study asserts that over the past five
decades the number of hours that the average
college student studies each week has been
steadily dropping (The Boston Globe, July 4, 2010).
 In 1961, students invested 24 hours per week in
their academic pursuits, whereas today’s students
study an average of 14 hours per week.
1-344
Undergraduate Study Habits
 As dean of a large university in California, Susan
Knight wonders if the study trend is reflective of
students at her university.
 Susan randomly selected 35 students to ask about
their average study time per week. Using these
results, Susan wants to
1. Determine if the mean study time of students at her
university is below the 1961 national average of 24 hours
per week.
2. Determine if the mean study time of students at her
university differs from today’s national average of 14
hours per week.
1-345
9.1 Introduction to Hypothesis Testing
LO 9.1 Define the null hypothesis and the alternative hypothesis.
 Hypothesis tests resolve conflicts between

two competing opinions (hypotheses).
 In a hypothesis test, define
 H0, the null hypothesis, the presumed default
state of nature or status quo.
 HA, the alternative hypothesis, a contradiction
of the default state of nature or status quo.
1-346
LO 9.1 9.1 Introduction to Hypothesis Testing
 In statistics we use sample information to make
inferences regarding the unknown population
parameters of interest.
 We conduct hypothesis tests to determine if sample
evidence contradicts H0.
 On the basis of sample information, we either
 “Reject the null hypothesis”
 Sample evidence is inconsistent with H0.
 “Do not reject the null hypothesis”
 Sample evidence is not inconsistent with H0.
 We do not have enough evidence to “accept” H0.
1-347
 Defining the Null Hypothesis and Alternative
Hypothesis
 General guidelines:
 Null hypothesis, H0, states the status quo.
 Alternative hypothesis, HA, states whatever we
wish to establish (i.e., contests the status quo).
 Use the following signs in hypothesis tests:
H0: =, ≥, or ≤ (specify the status quo);
HA: ≠, <, or > (contradict H0).
 Note that H0 always contains the “equality.”
1-348
 One-Tailed versus Two-Tailed Hypothesis
Tests
 Two-Tailed Test
 Reject H0 on either side of the hypothesized
value of the population parameter.
 For example:
H0: $\mu = \mu_0$ versus HA: $\mu \ne \mu_0$
H0: $p = p_0$ versus HA: $p \ne p_0$
 The “≠” symbol in HA indicates that both tail areas
of the distribution will be used to make the
decision regarding the rejection of H0.
1-349
 One-Tailed versus Two-Tailed Hypothesis
Tests
 One-Tailed Test
 Reject H0 only on one side of the hypothesized
value of the population parameter.
 For example:
H0: $\mu \le \mu_0$ versus HA: $\mu > \mu_0$ (right-tail test)
H0: $\mu \ge \mu_0$ versus HA: $\mu < \mu_0$ (left-tail test)
 Note that the inequality in HA determines which
tail area will be used to make the decision
regarding the rejection of H0.
1-350
 Three Steps to Formulate Hypotheses
1. Identify the relevant population parameter of
interest (e.g., $\mu$ or $p$).
2. Determine whether it is a one- or a two-tailed test:
   H0   HA   Test Type
   =    ≠    Two-tail
   ≥    <    One-tail, Left-tail
   ≤    >    One-tail, Right-tail
3. Include some form of the equality sign in H0
and use HA to establish a claim.
1-351
 Example: A trade group predicts that back-to-school
spending will average $606.40 per family this year.
A different economic model is needed if the
prediction is wrong.
1. Parameter of interest is μ since we are interested
in the average back-to-school spending.
2. Since we want to determine if the population
mean differs from $606.40 (i.e., ≠), it is a two-tail
test.
3. H0: μ = 606.4
HA: μ ≠ 606.4
1-352
 Example: A television research analyst wishes to
test a claim that more than 50% of the households
will tune in for a TV episode. Specify the null and
the alternative hypotheses to test the claim.
1. Parameter of interest is $p$ since we are interested
in the proportion of households.
2. Since the analyst wants to determine whether
$p > 0.50$, it is a one-tail test.
3. H0: $p \le 0.50$
HA: $p > 0.50$
1-353
9.1 Introduction to Hypothesis Testing
LO 9.2 Distinguish between Type I and Type II errors.
 Type I and Type II Errors
 Type I Error: Committed when we reject H0
when H0 is actually true.
 Occurs with probability $\alpha$. $\alpha$ is chosen a priori.
 Type II Error: Committed when we do not
reject H0 and H0 is actually false.
 Occurs with probability $\beta$. Power of the test = $1 - \beta$.
 For a given sample size n, a decrease in $\alpha$ will
increase $\beta$ and vice versa.
 Both $\alpha$ and $\beta$ decrease as n increases.
1-354
 This table illustrates the decisions that may
be made when hypothesis testing:
 Correct Decisions:
 Reject H0 when H0 is false.
 Do not reject H0 when H0 is true.
 Incorrect Decisions:
 Reject H0 when H0 is true (Type I Error).
 Do not reject H0 when H0 is false (Type II Error).
1-355
 Example: Consider the following competing
hypotheses that relate to the court of law.
 H0: An accused person is innocent
HA: An accused person is guilty
 Consequences of Type I and Type II errors:
 Type I error: Conclude that the accused is
guilty when in reality, she is innocent.
 Type II error: Conclude that the accused is
innocent when in reality, she is guilty.
1-356
9.2 Hypothesis Test of the Population
LO 9.3 Explain the steps of a hypothesis test using the p-value approach.
 Hypothesis testing enables us to determine whether
the sample evidence is inconsistent with what is
hypothesized under the null hypothesis (H0).
 Basic principle: First assume that H0 is true and
then determine if sample evidence contradicts this
assumption.
 Two approaches to hypothesis testing:
 The p-value approach.
 The critical value approach.
1-357
LO 9.3
 The p-value Approach
 The value of the test statistic for the hypothesis
test of the population mean $\mu$ when the population
standard deviation $\sigma$ is known is computed as
$z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$,
where $\mu_0$ is the hypothesized mean value.
 p-value: the likelihood of obtaining a sample
mean that is at least as extreme as the one
derived from the given sample, under the
assumption that the null hypothesis is true.
1-358
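LO 9.3
 Under the assumption that $\mu = \mu_0$, the p-value is the
likelihood of observing a sample mean that is at least as
extreme as the one derived from the given sample.
 The calculation of the p-value depends on the
specification of the alternative hypothesis:
 HA: $\mu > \mu_0$ (right-tail test): p-value = $P(Z \ge z)$.
 HA: $\mu < \mu_0$ (left-tail test): p-value = $P(Z \le z)$.
 HA: $\mu \ne \mu_0$ (two-tail test): p-value = $2P(Z \ge z)$ if $z > 0$, or $2P(Z \le z)$ if $z < 0$.
 Decision rule: Reject H0 if p-value < $\alpha$.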
1-359
LO 9.3
[Figure: determining the p-value depending on the specification of the competing hypotheses. Reject H0 if p-value < $\alpha$.]
1-360
LO 9.3

 Four Step Procedure Using The p-value
Approach
 Step 1. Specify the null and the alternative
hypotheses.
 Step 2. Specify the test statistic and compute its
value.
 Step 3. Calculate the p-value.
 Step 4. State the conclusion and interpret the
results.
1-361
LO 9.3
 Example: The p-value Approach
 Consider the following: $n = 25$, $\bar{x} = 71$, $\sigma = 9$.
 Step 1. State the hypotheses: H0: $\mu \le 67$,
HA: $\mu > 67$.
Thus, $\mu_0 = 67$.
 Step 2. Given that the population is normally
distributed with a known standard deviation,
$\sigma = 9$, we compute the value of the test statistic
as $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \dfrac{71 - 67}{9/\sqrt{25}} = 2.22$.
1-362
LO 9.3
 Unstandardized normal distribution: $\bar{x} = 71$, $\mu_0 = 67$.
Standardized normal distribution: $z = 2.22$, $\mu = 0$.
 Step 3. Now compute the p-value:
Note that since HA: $\mu > 67$,
this is a right-tail test.
 Thus, $P(\bar{X} \ge 71) = P(Z \ge 2.22) = 1 - 0.9868 = 0.0132$.
 p-value = 0.0132
or 1.32%
1-363
LO 9.3
 p-value = 0.0132 or 1.32%
 Typically, before implementing a hypothesis test,
we choose a value for $\alpha$ = 0.01, 0.05, or 0.1 and
reject H0 when the p-value < $\alpha$.
 Let’s say, before conducting the study, we chose
$\alpha = 0.05$.
 Step 4. Since p-value = 0.0132 < $\alpha$ = 0.05, we
reject H0 and conclude that the sample data
support the alternative claim that $\mu > 67$.
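The four steps of this right-tail example can be sketched in Python with the standard library. The function name `z_test_right_tail` is illustrative. Because the sketch does not round z to 2.22 before looking up the probability, its p-value differs slightly in the fourth decimal from the table-based 0.0132.

```python
from math import sqrt
from statistics import NormalDist

def z_test_right_tail(xbar, mu0, sigma, n):
    """Test statistic and p-value for H0: mu <= mu0 versus HA: mu > mu0."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    p_value = 1 - NormalDist().cdf(z)   # P(Z >= z)
    return z, p_value

alpha = 0.05                             # chosen before looking at the data
z, p_value = z_test_right_tail(xbar=71, mu0=67, sigma=9, n=25)
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {p_value < alpha}")
```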
1-364
LO 9.4 Explain the steps of a hypothesis test using the critical value approach.
 The Critical Value Approach

 Rejection region: a region of values such that if
the test statistic falls into this region, then we
reject H0.
 The location of this region is determined
by HA.
 Critical value: a point that separates the
rejection region from the nonrejection region.
1-365
LO 9.4
 The critical value approach specifies a region such that if
the value of the test statistic falls into the region, the null
hypothesis is rejected.
 The critical value depends on the alternative hypothesis.
 Decision Rule: Reject H0 if:
$z > z_{\alpha}$ for a right-tailed test
$z < -z_{\alpha}$ for a left-tailed test
$z > z_{\alpha/2}$ or $z < -z_{\alpha/2}$ for a two-tailed test
1-366
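LO 9.4
[Figure: determining the critical value(s) depending on the specification of the competing hypotheses. Two-tailed test: reject H0 if $z > z_{\alpha/2}$ or $z < -z_{\alpha/2}$. Left-tailed test: reject H0 if $z < -z_{\alpha}$. Right-tailed test: reject H0 if $z > z_{\alpha}$.]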
1-367
LO 9.4
 Four Step Procedure Using the Critical Value
Approach
 Step 1. Specify the null and the alternative
hypotheses.
 Step 2. Specify the test statistic and compute its
value.
 Step 3. Find the critical value or values.
 Step 4. State the conclusion and interpret the
results.
1-368
LO 9.4
 Example: The Critical Value Approach
 Step 1. H0: $\mu \le 67$, HA: $\mu > 67$
 Step 2. From the previous example, $z = 2.22$.
 Step 3. Based on HA, this is a
right-tail test and for
$\alpha = 0.05$, the critical value
is $z_{\alpha} = z_{0.05} = 1.645$.
1-369
LO 9.4
Example: The Critical Value Approach
 Step 4. Reject H0 if $z > 1.645$.
 Since $z = 2.22 > z_{\alpha} = 1.645$, the test statistic
falls in the rejection region. Therefore, we reject
H0 and conclude that the sample data support the
alternative claim $\mu > 67$.
 This conclusion is the same as that from the
p-value approach.
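The critical value decision rules can be sketched as a small function. This is an illustration only: the function name `reject_h0` and the tail labels "right", "left", and "two" are choices made for this sketch, and the critical values come from the exact normal quantile instead of a table.

```python
from statistics import NormalDist

def reject_h0(z, alpha, tail):
    """Apply the critical value rule; tail is 'right', 'left', or 'two'."""
    nd = NormalDist()
    if tail == "right":
        return z > nd.inv_cdf(1 - alpha)            # z > z_alpha
    if tail == "left":
        return z < -nd.inv_cdf(1 - alpha)           # z < -z_alpha
    if tail == "two":
        return abs(z) > nd.inv_cdf(1 - alpha / 2)   # z > z_{alpha/2} or z < -z_{alpha/2}
    raise ValueError("tail must be 'right', 'left', or 'two'")

# Right-tail example from the slides: z = 2.22, alpha = 0.05
print(reject_h0(2.22, alpha=0.05, tail="right"))
```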
1-370
LO 9.4
 If z falls in the rejection region, then the p-value
must be less than $\alpha$.
 If z does not fall in the
rejection region, then the
p-value must be greater
than $\alpha$.
1-371
LO 9.4
 Confidence Intervals and Two-Tailed
Hypothesis Tests
 Given the significance level $\alpha$, we can use the
sample data to construct a $100(1-\alpha)\%$
confidence interval for the population mean $\mu$.
 Decision Rule
 Reject H0 if the confidence interval does not
contain the value of the hypothesized mean $\mu_0$.
 Do not reject H0 if the confidence interval does
contain the value of the hypothesized mean $\mu_0$.
1-372
LO 9.4
 Implementing a Two-Tailed Test Using a Confidence
Interval
 The general specification for a $100(1-\alpha)\%$ confidence
interval of the population mean $\mu$ when the population
standard deviation $\sigma$ is known is computed as
$\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ or $\left[\bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n},\ \bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n}\right]$.
 Decision rule: Reject H0 if $\mu_0 < \bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n}$
or if $\mu_0 > \bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n}$.
1-373
LO 9.4
 Example: Recall that a research analyst wishes to
determine if average back-to-school spending
differs from $606.40.
Out of 30 randomly drawn households from a normally
distributed population, the standard deviation is $65 and
the sample mean is $622.85.
 Step 1. H0: μ = 606.4, HA: μ ≠ 606.4
 Step 2. z = 1.39
 Step 3. Based on HA, this
is a two-tail test and for
α = 0.05, the critical values
are ±z_{α/2} = ±z_{0.025} = ±1.96.
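The confidence-interval decision rule can be applied to this example in a few lines. The sketch below treats the stated standard deviation of 65 as the known population value, as the z statistic in Step 2 does; the function name `two_tailed_test_by_interval` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_test_by_interval(xbar, mu0, sigma, n, alpha):
    """Build the 100(1 - alpha)% interval and reject H0 if mu0 lies outside it."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sigma / sqrt(n)
    lower, upper = xbar - margin, xbar + margin
    reject = mu0 < lower or mu0 > upper
    return lower, upper, reject

lower, upper, reject = two_tailed_test_by_interval(
    xbar=622.85, mu0=606.40, sigma=65, n=30, alpha=0.05)
print(f"interval: {lower:.2f} to {upper:.2f}, reject H0: {reject}")
```

Under these inputs the interval contains 606.40, so the rule does not reject H0. That agrees with the test statistic z = 1.39 lying between the critical values of -1.96 and 1.96.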
1-374
LO 9.5 Differentiate between the test statistics for the population mean.
 Test Statistic for $\mu$ When $\sigma$ is Unknown
 When the population standard deviation
$\sigma$ is unknown, the test statistic for
testing the population mean $\mu$ is assumed
to follow the $t_{df}$ distribution with $(n - 1)$
degrees of freedom (df).
 The value of $t_{df}$ is computed as: $t_{df} = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$
1-375
LO 9.5
 Example
 Consider the following: $n = 35$, $\bar{x} = 16.37$, $s = 7.22$.
 Step 1. State the hypotheses: H0: $\mu \ge 24$,
HA: $\mu < 24$.
Thus, $\mu_0 = 24$.
 Step 2. Because n = 35 (i.e., n > 30), we can
assume that the sample mean is normally
distributed and thus compute the value of the test
statistic as $t_{34} = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} = \dfrac{16.37 - 24}{7.22/\sqrt{35}} = -6.25$.
1-376
LO 9.5
$n = 35$, $\bar{x} = 16.37$, $s = 7.22$, $t_{34} = -6.25$
 H0: $\mu \ge 24$, HA: $\mu < 24$
 Step 3. Based on HA,
this is a left-tail test.
For $\alpha = 0.05$ and
$n - 1 = 34$ df, the
critical value is
$t_{\alpha,df} = t_{0.05,34} = 1.691$
($-1.691$ due to symmetry).
1-377
LO 9.5
 Step 4. State the conclusion and interpret the
results.
 Reject H0 if $t_{34} < -t_{0.05,34} = -1.691$.
 Since $t_{34} = -6.25$ is less than $-t_{0.05,34} = -1.691$,
we reject H0 and conclude that the sample data
support the alternative claim that $\mu < 24$.
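The left-tail t test above reduces to a short computation. In this sketch the critical value 1.691 is taken from the slide's table lookup for 34 degrees of freedom, since the Python standard library does not provide t quantiles; the function name `t_statistic` is illustrative.

```python
from math import sqrt

def t_statistic(xbar, mu0, s, n):
    """Test statistic (x-bar - mu0) / (s / sqrt(n)) with n - 1 degrees of freedom."""
    return (xbar - mu0) / (s / sqrt(n))

t34 = t_statistic(xbar=16.37, mu0=24, s=7.22, n=35)
t_crit = 1.691                      # t_{0.05,34}, read from the t table
reject = t34 < -t_crit              # left-tail rule: reject if t < -t_crit
print(f"t = {t34:.2f}, reject H0: {reject}")
```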
1-378
LO 9.5
$n = 35$, $\bar{x} = 16.37$, $s = 7.22$
 Step 1. H0: $\mu = 14$, HA: $\mu \ne 14$
 Step 2. Compute the value of the test statistic as:
$t_{34} = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} = \dfrac{16.37 - 14}{7.22/\sqrt{35}} = 1.94$
 Step 3. Compute the p-value.
 Since $t_{34} = 1.94 > 0$, the p-value for a two-tailed test is
$2P(T_{34} \ge t_{34})$. Referencing the $t_{df}$ table for df = 34, we
find that the exact probability $P(T_{34} \ge 1.94)$ cannot be
determined.
1-379
LO 9.5
 Step 3. Compute the p-value (continued).
 Look up $t_{34} = 1.94$ in the t table to find the p-value.
 Note that $t_{34} = 1.94$ lies between 1.691 and 2.032.
 Thus, $0.025 < P(T_{34} \ge 1.94) < 0.05$.
However, because this is a two-tail test, we multiply by
two to get 0.05 < p-value < 0.10.
1-380
LO 9.5
 0.05 < p-value < 0.10
 $\alpha = 0.05$.
 Step 4. State the conclusion and interpret the
results.
 Since the p-value satisfies 0.05 < p-value < 0.10, it
must be greater than $\alpha = 0.05$.
 Thus, we do not reject H0 and conclude that the mean
study time of students at the university is not statistically
different from today’s national average of 14 hours per
week.
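The table-bracketing step for the p-value can be sketched as follows. The dictionary holds only the two df = 34 table entries quoted in the slides (upper-tail area mapped to critical value); the function name `bracket_upper_tail` is illustrative, and a fuller table row would give the same bounds for this example.

```python
def bracket_upper_tail(t_value, table_row):
    """Bound P(T >= |t|) using a t-table row {upper-tail area: critical value}."""
    t_abs = abs(t_value)
    # Areas whose critical value exceeds |t|: the tail probability is larger than these.
    lower_bounds = [area for area, crit in table_row.items() if crit > t_abs]
    # Areas whose critical value is below |t|: the tail probability is smaller than these.
    upper_bounds = [area for area, crit in table_row.items() if crit < t_abs]
    lower = max(lower_bounds) if lower_bounds else None
    upper = min(upper_bounds) if upper_bounds else None
    return lower, upper

row_df34 = {0.05: 1.691, 0.025: 2.032}     # entries quoted in the slides
low, high = bracket_upper_tail(1.94, row_df34)
two_tailed = (2 * low, 2 * high)           # double both bounds for a two-tail test
alpha = 0.05
print(two_tailed, "do not reject H0" if two_tailed[0] >= alpha else "inconclusive")
```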
1-381
9.4 Hypothesis Test of the Population Proportion
LO 9.6 Specify the test statistic for the population proportion.
 Test Statistic for $p$.
 $\bar{P}$ can be approximated by a normal distribution
if $np \ge 5$ and $n(1 - p) \ge 5$.
 Test statistic for the hypothesis test of the
population proportion $p$ is assumed to follow the
z distribution:
$z = \dfrac{\bar{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$,
where $\bar{p} = x/n$ and $p_0$ is the hypothesized
value of the population
proportion.
1-382
LO 9.6
 Example:
 $n = 180$, $x = 67$, $p_0 = 0.4$
 Step 1. H0: $p \ge 0.4$, HA: $p < 0.4$
 Step 2. Compute the value of the test statistic.
 First verify that the sample is large enough:
$np_0 = 180 \times 0.4 = 72 \ge 5$
$n(1 - p_0) = 180 \times 0.6 = 108 \ge 5$
 Compute the test statistic using $\bar{p} = 67/180 = 0.3722$:
$z = \dfrac{\bar{p} - p_0}{\sqrt{p_0(1 - p_0)/n}} = \dfrac{0.3722 - 0.4}{\sqrt{0.4(1 - 0.4)/180}} = -0.76$
1-383
LO 9.6
 Example:
 Step 3. Compute the p-value.
 Based on HA: $p < 0.4$, this is a left-tailed test.
Compute the p-value as:
$P(Z \le z) = P(Z \le -0.76) = 0.2236$.
 Let the significance
level $\alpha = 0.10$.
1-384
LO 9.6
 Example:
 Step 4. State the conclusion and interpret the
results.
 p-value = 0.2236 > $\alpha$ = 0.10.
 Do not reject H0: $p \ge 0.4$; we cannot conclude
HA: $p < 0.4$.
 Thus, the magazine’s claim that fewer than
40% of households in the United States have
changed their lifestyles because of escalating
gas prices is not justified by the sample data.
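The proportion test can be reproduced with a short sketch using the standard library. The function name `proportion_z_test_left` is illustrative; the large-sample check uses n = 180 together with the hypothesized value 0.4.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test_left(x, n, p0):
    """Test statistic and p-value for H0: p >= p0 versus HA: p < p0."""
    # Large-sample condition, evaluated at the hypothesized proportion
    assert n * p0 >= 5 and n * (1 - p0) >= 5, "normal approximation may be poor"
    p_bar = x / n
    z = (p_bar - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = NormalDist().cdf(z)          # P(Z <= z) for a left-tail test
    return z, p_value

alpha = 0.10
z, p_value = proportion_z_test_left(x=67, n=180, p0=0.4)
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {p_value < alpha}")
```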
1-385
