Regression Chapter 12 Simple Linear Regression

Chapter 12
Simple Regression
True / False Questions

1. A scatter plot is used to visualize the association (or lack of
association) between two quantitative variables.
True
False
2. The correlation coefficient r measures the strength of the linear

relationship between two variables.
True
False
3. Pearson's correlation coefficient (r) requires that both variables be

interval or ratio data.
True
False
4. If r = .55 and n = 16, then the correlation is significant at α = .05 in a
two-tailed test.
True
False
5. A sample correlation r = .40 indicates a stronger linear relationship

than r = -.60.
True
False
6. A common source of spurious correlation between X and Y is when a

third unspecified variable Z affects both X and Y.
True
False
7. The correlation coefficient r always has the same sign as b1 in Y = b0 +

b1X.
True
False
8. The fitted intercept in a regression has little meaning if no data values

near X = 0 have been observed.
True
False
9. The least squares regression line is obtained when the sum of the
squared residuals is minimized.
True
False
10. In a simple regression, if the coefficient for X is positive and
significantly different from zero, then an increase in X is associated
with an increase in the mean (i.e., the expected value) of Y.
True
False
11. In least-squares regression, the residuals e1, e2, . . . , en will always
have a zero mean.
True
False
12. When using the least squares method, the column of residuals always
sums to zero.
True
False
13. In the model Sales = 268 + 7.37 Ads, an additional $1 spent on ads
will increase sales by 7.37 percent.
True
False
14. If R2 = .36 in the model Sales = 268 + 7.37 Ads with n = 50, the
two-tailed test for correlation at α = .05 would say that there is a significant
correlation between Sales and Ads.
True
False
15. If R2 = .36 in the model Sales = 268 + 7.37 Ads, then Ads explains 36
percent of the variation in Sales.
True
False
16. The ordinary least squares regression line always passes through the
point (x̄, ȳ).
True
False
17. The least squares regression line gives unbiased estimates of β0 and
β1.
True
False
18. In a simple regression, the correlation coefficient r is the square root of
R2.
True
False
19. If SSR is 1800 and SSE is 200, then R2 is .90.
True
False
20. The width of a prediction interval for an individual value of Y is less
than the standard error se.
True
False
21. If SSE is near zero in a regression, the statistician will conclude that the
proposed model probably has too poor a fit to be useful.
True
False
22. For a regression with 200 observations, we expect that about 10
residuals will exceed two standard errors.
True
False
23. Confidence intervals for predicted Y are less precise when the residuals
are very small.
True
False
24. Cause-and-effect direction between X and Y may be determined by
running the regression twice and seeing whether Y = β0 + β1X or X = β1
+ β0Y has the larger R2.
True
False
25. The ordinary least squares method of estimation minimizes the
estimated slope and intercept.
True
False
26. Using the ordinary least squares method ensures that the residuals will
be normally distributed.
True
False
27. If you have a strong outlier in the residuals, it may represent a different
causal system.
True
False
28. A negative correlation between two variables X and Y usually yields a
negative p-value for r.
True
False
29. In linear regression between two variables, a significant relationship
exists when the p-value of the t test statistic for the slope is greater
than α.
True
False
30. The larger the absolute value of the t statistic of the slope in a simple
linear regression, the stronger the linear relationship between X
and Y.
True
False
31. In simple linear regression, the coefficient of determination (R2) is
estimated from sums of squares in the ANOVA table.
True
False
32. In simple linear regression, the p-value of the slope will always equal
the p-value of the F statistic.
True
False
33. An observation with high leverage will have a large residual (usually an
outlier).
True
False
34. A prediction interval for Y is narrower than the corresponding
confidence interval for the mean of Y.
True
False
35. When X is farther from its mean, the prediction interval and confidence
interval for Y become wider.
True
False
36. The total sum of squares (SST) will never exceed the regression sum of
squares (SSR).
True
False
37. "High leverage" would refer to a data point that is poorly predicted by
the model (large residual).
True
False
38. The studentized residuals permit us to detect cases where the
regression predicts poorly.
True
False
39. A poor prediction (large residual) indicates an observation with high
leverage.
True
False
40. Ill-conditioned refers to a variable whose units are too large or too
small (e.g., $2,434,567).
True
False
41. A simple decimal transformation (e.g., from 18,291 to 18.291) often
improves data conditioning.
True
False
42. Two-tailed t-tests are often used because any predictor that differs
significantly from zero in a two-tailed test will also be significantly
greater than zero or less than zero in a one-tailed test at the same α.
True
False
43. A predictor that is significant in a one-tailed t-test will also be
significant in a two-tailed test at the same level of significance α.
True
False
44. Omission of a relevant predictor is a common source of model
misspecification.
True
False
45. The regression line must pass through the origin.
True
False
46. Outliers can be detected by examining the standardized residuals.
True
False
47. In a simple regression, there are n - 2 degrees of freedom associated
with the error sum of squares (SSE).
True
False
48. In a simple regression, the F statistic is calculated by taking the ratio of
MSR to MSE.
True
False
49. The coefficient of determination is the percentage of the total variation
in the response variable Y that is explained by the predictor X.
True
False
50. A different confidence interval exists for the mean value of Y for each
different value of X.
True
False
51. A prediction interval for Y is widest when X is near its mean.
True
False
52. In a two-tailed test for correlation at α = .05, a sample correlation
coefficient r = 0.42 with n = 25 is significantly different from zero.
True
False
53. In correlation analysis, neither X nor Y is designated as the
independent variable.
True
False
54. A negative value for the correlation coefficient (r) implies a negative
value for the slope (b1).
True
False
55. High leverage for an observation indicates that X is far from its mean.
True
False
56. Autocorrelated errors are not usually a concern for regression models
using cross-sectional data.
True
False
57. There are usually several possible regression lines that will minimize
the sum of squared errors.
True
False
58. When the errors in a regression model are not independent, the
regression model is said to have autocorrelation.
True
False
59. In a simple bivariate regression, $F_{calc} = t_{calc}^2$.
True
False
60. Correlation analysis primarily measures the degree of the linear
relationship between X and Y.
True
False
Multiple Choice Questions

61. The variable used to predict another variable is called the:
A. response variable.
B. regression variable.
C. independent variable.
D. dependent variable.
62. The standard error of the regression:
A. is based on squared deviations from the regression line.
B. may assume negative values if b1 < 0.
C. is in squared units of the dependent variable.
D. may be cut in half to get an approximate 95 percent prediction
interval.
63. A local trucking company fitted a regression to relate the travel time
(days) of its shipments as a function of the distance traveled (miles).
The fitted regression is Time = -7.126 + 0.0214 Distance, based on a
sample of 20 shipments. The estimated standard error of the slope is
0.0053. Find the value of tcalc to test for zero slope.
A. 2.46
B. 5.02
C. 4.04
D. 3.15
64. A local trucking company fitted a regression to relate the travel time
(days) of its shipments as a function of the distance traveled (miles).
The fitted regression is Time = -7.126 + 0.0214 Distance, based on a
sample of 20 shipments. The estimated standard error of the slope is
0.0053. Find the critical value for a right-tailed test to see if the slope is
positive, using α = .05.
A. 2.101
B. 2.552
C. 1.960
D. 1.734
65. If the attendance at a baseball game is to be predicted by the equation
Attendance = 16,500 - 75 Temperature, what would be the predicted
attendance if Temperature is 90 degrees?
A. 6,750
B. 9,750
C. 12,250
D. 10,020
66. A hypothesis test is conducted at the 5 percent level of significance to
test whether the population correlation is zero. If the sample consists
of 25 observations and the correlation coefficient is 0.60, then the
computed test statistic would be:
A. 2.071.
B. 1.960.
C. 3.597.
D. 1.645.
67. Which of the following is not a characteristic of the F-test in a simple
regression?
A. It is a test for overall fit of the model.
B. The test statistic can never be negative.
C. It requires a table with numerator and denominator degrees of
freedom.
D. The F-test gives a different p-value than the t-test.
68. A researcher's Excel results are shown below using Femlab (labor force
participation rate among females) to try to predict Cancer (death rate
per 100,000 population due to cancer) in the 50 U.S. states.
Which of the following statements is not true?
A. The standard error is too high for this model to be of any predictive
use.
B. The 95 percent confidence interval for the coefficient of Femlab is
-4.29 to -0.28.
C. Significant correlation exists between Femlab and Cancer at α = .05.
D. The two-tailed p-value for Femlab will be less than .05.
69. A researcher's results are shown below using Femlab (labor force
participation rate among females) to try to predict Cancer (death rate
per 100,000 population due to cancer) in the 50 U.S. states.
Which statement is valid regarding the relationship between Femlab
and Cancer?
A. A rise in female labor participation rate will cause the cancer rate to
decrease within a state.
B. This model explains about 10 percent of the variation in state cancer
rates.
C. At the .05 level of significance, there isn't enough evidence to say
the two variables are related.
D. If your sister starts working, the cancer rate in your state will
decline.
70. A researcher's results are shown below using Femlab (labor force
participation rate among females) to try to predict Cancer (death rate
per 100,000 population due to cancer) in the 50 U.S. states.
What is the R2 for this regression?
A. .9018
B. .0982
C. .8395
D. .1605
71. A news network stated that a study had found a positive correlation
between the number of children a worker has and his or her earnings
last year. You may conclude that:
A. people should have more children so they can get better jobs.
B. the data are erroneous because the correlation should be negative.
C. causation is in serious doubt.
D. statisticians have small families.
72. William used a sample of 68 large U.S. cities to estimate the
relationship between Crime (annual property crimes per 100,000
persons) and Income (median annual income per capita, in dollars). His
estimated regression equation was Crime = 428 + 0.050 Income. We
can conclude that:
A. the slope is small so Income has no effect on Crime.
B. crime seems to create additional income in a city.
C. wealthy individuals tend to commit more crimes, on average.
D. the intercept is irrelevant since zero median income is impossible in
a large city.
73. Mary used a sample of 68 large U.S. cities to estimate the relationship
between Crime (annual property crimes per 100,000 persons) and
Income (median annual income per capita, in dollars). Her estimated
regression equation was Crime = 428 + 0.050 Income. If Income
decreases by 1000, we would expect that Crime will:
A. increase by 428.
B. decrease by 50.
C. increase by 500.
D. remain unchanged.
74. Amelia used a random sample of 100 accounts receivable to estimate
the relationship between Days (number of days from billing to receipt
of payment) and Size (size of balance due in dollars). Her estimated
regression equation was Days = 22 + 0.0047 Size with a correlation
coefficient of .300. From this information we can conclude that:
A. 9 percent of the variation in Days is explained by Size.
B. autocorrelation is likely to be a problem.
C. the relationship between Days and Size is significant.
D. larger accounts usually take less time to pay.
75. Prediction intervals for Y are narrowest when:
A. the mean of X is near the mean of Y.
B. the value of X is near the mean of X.
C. the mean of X differs greatly from the mean of Y.
D. the mean of X is small.
76. If n = 15 and r = .4296, the corresponding t-statistic to test for zero
correlation is:
A. 1.715.
B. 7.862.
C. 2.048.
D. impossible to determine without α.
77. Using a two-tailed test at α = .05 for n = 30, we would reject the
hypothesis of zero correlation if the absolute value of r exceeds:
A. .2992.
B. .3609.
C. .0250.
D. .2004.
78. The ordinary least squares (OLS) method of estimation will minimize:
A. neither the slope nor the intercept.
B. only the slope.
C. only the intercept.
D. both the slope and intercept.
79. A standardized residual ei = -2.205 indicates:
A. a rather poor prediction.
B. an extreme outlier in the residuals.
C. an observation with high leverage.
D. a likely data entry error.
80. In a simple regression, which would suggest a significant relationship
between X and Y?
A. Large p-value for the estimated slope
B. Large t statistic for the slope
C. Large p-value for the F statistic
D. Small t-statistic for the slope
81. Which is indicative of an inverse relationship between X and Y?
A. A negative F statistic
B. A negative p-value for the correlation coefficient
C. A negative correlation coefficient
D. Either a negative F statistic or a negative p-value
82. Which is not correct regarding the estimated slope of the OLS
regression line?
A. It is divided by its standard error to obtain its t statistic.
B. It shows the change in Y for a unit change in X.
C. It is chosen so as to minimize the sum of squared errors.
D. It may be regarded as zero if its p-value is less than α.
83. Simple regression analysis means that:
A. the data are presented in a simple and clear way.
B. we have only a few observations.
C. there are only two independent variables.
D. we have only one explanatory variable.
84. The sample coefficient of correlation does not have which property?
A. It can range from -1.00 up to +1.00.
B. It is also sometimes called Pearson's r.
C. It is tested for significance using a t-test.
D. It assumes that Y is the dependent variable.
85. When comparing the 90 percent prediction and confidence intervals for
a given regression analysis:
A. the prediction interval is narrower than the confidence interval.
B. the prediction interval is wider than the confidence interval.
C. there is no difference between the size of the prediction and
confidence intervals.
D. no generalization is possible about their comparative width.
86. Which is not true of the coefficient of determination?
A. It is the square of the coefficient of correlation.
B. It is negative when there is an inverse relationship between X and Y.
C. It reports the percent of the variation in Y explained by X.
D. It is calculated using sums of squares (e.g., SSR, SSE, SST).
87. If the fitted regression is Y = 3.5 + 2.1X (R2 = .25, n = 25), it is
incorrect to conclude that:
A. Y increases 2.1 percent for a 1 percent increase in X.
B. the estimated regression line crosses the Y axis at 3.5.
C. the sample correlation coefficient must be positive.
D. the value of the sample correlation coefficient is 0.50.
88. In a simple regression Y = b0 + b1X where Y = number of robberies in a
city (thousands of robberies), X = size of the police force in a city
(thousands of police), and n = 45 randomly chosen large U.S. cities in
2008, we would be least likely to see which problem?
A. Autocorrelated residuals (because this is time-series data)
B. Heteroscedastic residuals (because we are using totals uncorrected
for city size)
C. Nonnormal residuals (because a few larger cities may skew the
residuals)
D. High leverage for some observations (because some cities may be
huge)
89. When homoscedasticity exists, we expect that a plot of the residuals
versus the fitted Y:
A. will form approximately a straight line.
B. crosses the centerline too many times.
C. will yield a Durbin-Watson statistic near 2.
D. will show no pattern at all.
90. Which statement is not correct?
A. Spurious correlation can often be reduced by expressing X and Y in
per capita terms.
B. Autocorrelation is mainly a concern if we are using time-series data.
C. Heteroscedastic residuals will have roughly the same variance for
any value of X.
D. Standardized residuals make it easy to identify outliers or instances
of poor fit.
91. In a simple bivariate regression with 25 observations, which statement
is most nearly correct?
A. A non-standardized residual whose value is ei = 4.22 would be
considered an outlier.
B. A leverage statistic of 0.16 or more would indicate high leverage.
C. Standardizing the residuals will eliminate any heteroscedasticity.
D. Non-normal residuals imply biased coefficient estimates, a major
problem.
92. A regression was estimated using these variables: Y = annual value of
reported bank robbery losses in all U.S. banks ($millions), X = annual
value of currency held by all U.S. banks ($millions), n = 100 years
(1912 through 2011). We would not anticipate:
A. autocorrelated residuals due to time-series data.
B. heteroscedastic residuals due to the wide variation in data
magnitudes.
C. nonnormal residuals due to skewed data as bank size increases over
time.
D. a negative slope because banks hold less currency when they are
robbed.
93. A fitted regression for an exam in Prof. Hardtack's class showed Score
= 20 + 7 Study, where Score is the student's exam score and Study is
the student's study hours. The regression yielded R2 = 0.50 and SE =
8. Bob studied 9 hours. The quick 95 percent prediction interval for
Bob's grade is approximately:
A. 69 to 97.
B. 75 to 91.
C. 67 to 99.
D. 76 to 90.
94. Which is not an assumption of least squares regression?
A. Normal X values
B. Non-autocorrelated errors
C. Homoscedastic errors
D. Normal errors
95. In a simple bivariate regression with 60 observations there will be _____
residuals.
A. 60
B. 59
C. 58
D. 57
96. Which is correct to find the value of the coefficient of determination
(R2)?
A. SSR/SSE
B. SSR/SST
C. 1 - SST/SSE
97. The critical value for a two-tailed test of H0: β1 = 0 at α = .05 in a
simple regression with 22 observations is:
A. 1.725
B. 2.086
C. 2.528
D. 1.960
98. In a sample of size n = 23, a sample correlation of r = .400 provides
sufficient evidence to conclude that the population correlation
coefficient exceeds zero in a right-tailed test at:
A. α = .01 but not α = .05.
B. α = .05 but not α = .01.
C. both α = .05 and α = .01.
D. neither α = .05 nor α = .01.
99. In a sample of n = 23, the Student's t test statistic for a correlation of r
= .500 would be:
A. 2.559.
B. 2.819.
C. 2.646.
D. can't say without knowing α.
100. In a sample of n = 23, the critical value of the correlation coefficient
for a two-tailed test at α = .05 is:
A. .524
B. .412
C. .500
D. .497
101. In a sample of n = 23, the critical value of Student's t for a two-tailed
test of significance for a simple bivariate regression at α = .05 is:
A. 2.229
B. 2.819
C. 2.646
D. 2.080
102. In a sample of n = 40, a sample correlation of r = .400 provides
sufficient evidence to conclude that the population correlation
A. α = .025 but not α = .05.
B. α = .05 but not α = .025.
C. both α = .025 and α = .05.
D.
103. In a sample of n = 20, the Student's t test statistic for a correlation of
r = .400 would be:
A. 2.110
B. 1.645
C. 1.852
D. can't say without knowing if it's a two-tailed or one-tailed test.
104.
A. .587
B. .412
C. .444
D. .497
105.
A. 2.060
B. 2.052
C. 2.898
D. 2.074
106. In a sample of size n = 36, a sample correlation of r = -.450 provides
sufficient evidence to conclude that the population correlation
coefficient differs significantly from zero in a two-tailed test at:
A. α = .01
B. α = .05
C. both α = .01 and α = .05.
D.
107. In a sample of n = 36, the Student's t test statistic for a correlation of
r = -.450 would be:
A. -2.110.
B. -2.938.
C. -2.030.
D.

108.
A. .329
B. .387
C. .423
D. .497
109.
test of significance of the slope for a simple regression at α = .05 is:
A. 2.938
B. 2.724
C. 2.032
D. 2.074
110. A local trucking company fitted a regression to relate the travel time
(days) of its shipments as a function of the distance traveled (miles).
The fitted regression is Time = -7.126 + 0.0214 Distance. If Distance
increases by 50 miles, the expected Time would increase by:
A. 1.07 days
B. 7.13 days
C. 2.14 days
D. 1.73 days
111. A local trucking company fitted a regression to relate the cost of its
shipments as a function of the distance traveled. The Excel fitted
regression is shown.
Based on this estimated relationship, when distance increases by 50
miles, the expected shipping cost would increase by:
A. $286.
B. $143.
C. $104.
D. $301.
112. If SSR is 2592 and SSE is 608, then:
A. the slope is likely to be insignificant.
B. the coefficient of determination is .81.
C. the SST would be smaller than SSR.
D. the standard error would be large.
113. Find the sample correlation coefficient for the following data.
A. .8911
B. .9124
C. .9822
D. .9556
114. Find the slope of the simple regression ŷ = b0 + b1x.
A. 1.833
B. 3.294
C. 0.762
D. -2.228
115. Find the sample correlation coefficient for the following data.
A. .7291
B. .8736
C. .9118
D. .9563
116. Find the slope of the simple regression ŷ = b0 + b1x.
A. 2.595
B. 1.109
C. -2.221
D. 1.884
117. A researcher's results are shown below using n = 25 observations.
The 95 percent confidence interval for the slope is:
A. [-3.282, -1.284].
B. [-4.349, -0.217].
C. [1.118, 5.026].
D. [-0.998, +0.998].
118. A researcher's regression results are shown below using n = 8
observations.
A. [1.333, 2.284].
B. [1.602, 2.064].
C. [1.268, 2.398].
D. [1.118, 2.449].
119. Bob thinks there is something wrong with Excel's fitted regression.
What do you say?
A. The estimated equation is obviously incorrect.
B. The R2 looks a little high but otherwise it looks OK.
C. Bob needs to increase his sample size to decide.
D. The relationship is linear, so the equation is credible.
Short Answer Questions
120. Pedro became interested in vehicle fuel efficiency, so he performed a
simple regression using 93 cars to estimate the model CityMPG = β0 +
β1 Weight where Weight is the weight of the vehicle in pounds. His
results are shown below. Write a brief analysis of these results, using
what you have learned in this chapter. Is the intercept meaningful in
this regression? Make a prediction of CityMPG when Weight = 3000,
and also when Weight = 4000. Do these predictions seem believable?
If you could make a car 1000 pounds lighter, what change would you
predict in its CityMPG?
121. Mary noticed that old coins are smoother and more worn. She
weighed 31 nickels and recorded their age, and then performed a
simple regression to estimate the model Weight = β0 + β1 Age where
Weight is the weight of the coin in grams and Age is the age of the
coin in years. Her results are shown below. Write a brief analysis of
these results, using what you have learned in this chapter. Make a
prediction of Weight when Age = 10, and also when Age = 20. What
does this tell you? Is the intercept meaningful in this regression?
Chapter 12 Simple Regression Answer Key
True / False Questions

1.
A scatter plot is used to visualize the association (or lack of

association) between two quantitative variables.
TRUE
The scatter plot shows association between two quantitative
variables.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 12-01 Calculate and test a correlation coefficient for significance.
Topic: Visual Displays and Correlation Analysis
2.
The correlation coefficient r measures the strength of the linear

relationship between two variables.
TRUE
A correlation coefficient measures linearity between two variables.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
3.
Pearson's correlation coefficient (r) requires that both variables be

interval or ratio data.
TRUE
Correlation assumes quantitative data with at least interval
measurements.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
4.
If r = .55 and n = 16, then the correlation is significant at α = .05 in
a two-tailed test.
TRUE
$t_{calc} = r\sqrt{(n-2)/(1-r^2)} = (.55)\sqrt{(16-2)/(1-.55^2)} = 2.464 > t_{.025} = 2.145$
for d.f. = 16 - 2 = 14.
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
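The arithmetic in the answer to question 4 can be checked with a short Python sketch. The function name `correlation_t_stat` is made up for illustration, and the critical value 2.145 is simply copied from the answer above rather than computed.

```python
import math

def correlation_t_stat(r: float, n: int) -> float:
    """t statistic for testing zero correlation, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Values from question 4. The critical value is the tabled t.025 for
# 14 degrees of freedom quoted in the answer, not computed here.
t_calc = correlation_t_stat(0.55, 16)
t_crit = 2.145

print(round(t_calc, 3))          # 2.464
print(abs(t_calc) > t_crit)      # True, so the correlation is significant
```

The same function reproduces the value 5.196 used in the answer to question 14 when called with r = .60 and n = 50.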
5.
A sample correlation r = .40 indicates a stronger linear relationship

than r = -.60.
FALSE
The sign only indicates the direction, not the strength, of the linear
relationship.
AACSB: Analytic
Blooms: Understand
Difficulty: 1 Easy
6.
A common source of spurious correlation between X and Y is when a

third unspecified variable Z affects both X and Y.
TRUE
Both X and Y could be influenced by Z.
AACSB: Analytic
Blooms: Understand
Difficulty: 1 Easy
7.
The correlation coefficient r always has the same sign as b1 in Y = b0

+ b1X.
TRUE
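The slope is b1 = r(sy/sx), and the standard deviations sx and sy are positive, so b1
and r share the same sign. The t-test for the slope in simple regression also gives the
same result as the t-test for r.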
AACSB: Analytic
Blooms: Understand
Difficulty: 1 Easy
Learning Objective: 12-04 Fit a simple regression on an Excel scatter plot.
Topic: Regression Terminology
8.
The fitted intercept in a regression has little meaning if no data

values near X = 0 have been observed.
TRUE
Predicting Y for X = 0 makes little sense if the observed data have no
values near X = 0.
AACSB: Analytic
Blooms: Understand
Difficulty: 1 Easy
Learning Objective: 12-02 Interpret the slope and intercept of a regression equation.
Topic: Simple Regression
9.
The least squares regression line is obtained when the sum of the
squared residuals is minimized.
TRUE
The OLS method minimizes the sum of squared residuals.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Topic: Ordinary Least Squares Formulas
10.
In a simple regression, if the coefficient for X is positive and

significantly different from zero, then an increase in X is associated
with an increase in the mean (i.e., the expected value) of Y.
TRUE
The conditional mean of Y depends on X (unless the slope is
effectively zero).
AACSB: Analytic
Blooms: Understand
Difficulty: 1 Easy
11.
In least-squares regression, the residuals e1, e2, . . . , en will always

have a zero mean.
TRUE
The residuals must sum to zero if the OLS method is used, so their
mean is zero.
AACSB: Analytic
Blooms: Remember
12.
When using the least squares method, the column of residuals

always sums to zero.
TRUE
The residuals must sum to zero if the OLS method is used.
AACSB: Analytic
Blooms: Remember
13.
In the model Sales = 268 + 7.37 Ads, an additional $1 spent on ads

will increase sales by 7.37 percent.
FALSE
The slope coefficient is in the same units as Y (dollars, not percent,
in this case).
AACSB: Analytic
Blooms: Apply
14.
If R2 = .36 in the model Sales = 268 + 7.37 Ads with n = 50, the two-tailed
test for correlation at α = .05 would say that there is a
significant correlation between Sales and Ads.
TRUE
$t_{calc} = r\sqrt{(n-2)/(1-r^2)} = (.60)\sqrt{(50-2)/(1-.36)} = 5.196 > t_{.025} = 2.011$
for d.f. = 50 - 2 = 48.
AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
15.
If R2 = .36 in the model Sales = 268 + 7.37 Ads, then Ads explains
36 percent of the variation in Sales.
TRUE
We can interpret R2 as the fraction of variation in Y explained by X
(expressed as a percent).
AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
Learning Objective: 12-08 Interpret the standard error; R2; ANOVA table; and F test.
16.
The ordinary least squares regression line always passes through the
point (x̄, ȳ).
TRUE
The OLS formulas require the line to pass through this point.
AACSB: Analytic
Blooms: Remember
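The OLS properties in questions 11, 12, and 16 (residuals sum to zero, and the fitted line passes through the point of means) can be verified numerically. The sketch below uses the usual textbook formulas for b0 and b1. The data are hypothetical and the helper name `fit_ols` is an assumption for illustration.

```python
def fit_ols(x, y):
    """Return (b0, b1) for the least squares line y = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical data, made up for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.7]

b0, b1 = fit_ols(x, y)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)

# Both checks hold up to floating-point rounding.
print(abs(sum(residuals)) < 1e-9)             # residuals sum to zero
print(abs((b0 + b1 * x_bar) - y_bar) < 1e-9)  # line passes through the means
```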
17.
The least squares regression line gives unbiased estimates of β0 and
β1.
TRUE
The expected values of the OLS estimators b0 and b1 are the true
parameters β0 and β1.
AACSB: Analytic
Blooms: Remember
18.
In a simple regression, the correlation coefficient r is the square root

of R2.
TRUE
In fact, we could use the notation r2 instead of R2 when talking about
simple regression.
AACSB: Analytic
Blooms: Remember
19.
If SSR is 1800 and SSE is 200, then R2 is .90.

TRUE
R2 = SSR/SST = SSR/(SSR + SSE) = 1800/(1800 + 200) = .90.
AACSB: Analytic
Blooms: Apply
Topic: Tests for Significance
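The R2 calculation in question 19 takes only a few lines of Python. This is a minimal sketch, and the function name `r_squared` is made up for illustration.

```python
def r_squared(ssr: float, sse: float) -> float:
    """Coefficient of determination from the ANOVA sums of squares."""
    sst = ssr + sse      # identity: SST = SSR + SSE
    return ssr / sst     # equivalently 1 - sse / sst

r2 = r_squared(1800, 200)
print(round(r2, 2))      # 0.9
```

Calling the same function with SSR = 2592 and SSE = 608 gives .81, the value that appears in question 112.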
20.
The width of a prediction interval for an individual value of Y is less

than standard error se.
FALSE
The formula for the interval width multiplies the standard error by an
expression > 1.
AACSB: Analytic
Blooms: Understand
Learning Objective: 12-09 Distinguish between confidence and prediction intervals for Y.
Topic: Confidence and Prediction Intervals for Y
21.
If SSE is near zero in a regression, the statistician will conclude that

the proposed model probably has too poor a fit to be useful.
FALSE
SSE is the sum of the squared residuals, which would be smaller if the
fit is good.
AACSB: Analytic
Blooms: Apply
22.
For a regression with 200 observations, we expect that about 10

residuals will exceed two standard errors.
TRUE
If the residuals are normal, 95.44 percent (190 of 200) will lie within
2se (so 10 outside).
AACSB: Analytic
Blooms: Apply
Learning Objective: 12-11 Identify unusual residuals and high-leverage observations.
Topic: Unusual Observations
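The "about 10" figure in question 22 can be checked against the standard normal distribution. The sketch below assumes normally distributed residuals, as the answer does, and uses only `math.erf` from the standard library. The helper name `prob_outside` is made up for illustration.

```python
import math

def prob_outside(k: float) -> float:
    """P(|Z| > k) for a standard normal Z."""
    return 1 - math.erf(k / math.sqrt(2))

n = 200
expected_outside = n * prob_outside(2)

print(round(prob_outside(2), 4))     # 0.0455
print(round(expected_outside, 1))    # 9.1, i.e., roughly 10 of 200
```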
23.
Confidence intervals for predicted Y are less precise when the

residuals are very small.
FALSE
Small residuals imply a small standard error and thus a narrower
prediction interval.
AACSB: Analytic
Blooms: Understand
24.
Cause-and-effect direction between X and Y may be determined by

running the regression twice and seeing whether Y = β0 + β1X or X =
β1 + β0Y has the larger R2.
FALSE
Cause and effect cannot be determined in the context of simple
regression models.
AACSB: Analytic
Blooms: Understand
25.
The ordinary least squares method of estimation minimizes the

estimated slope and intercept.
FALSE
OLS minimizes the sum of squared residuals.
AACSB: Analytic
Blooms: Remember
26.
Using the ordinary least squares method ensures that the residuals
will be normally distributed.
FALSE
OLS produces unbiased estimates but cannot ensure normality of the
residuals.
AACSB: Analytic
Blooms: Remember
Learning Objective: 12-10 Test residuals for violations of regression assumptions.
Topic: Residual Tests
27.
If you have a strong outlier in the residuals, it may represent a

different causal system.
TRUE
Outliers might come from a different population or causal system.
AACSB: Analytic
Blooms: Understand
Topic: Other Regression Problems (Optional)
28.
A negative correlation between two variables X and Y usually yields a

negative p-value for r.
FALSE
The p-value cannot be negative.
AACSB: Analytic
Blooms: Understand
Learning Objective: 12-06 Test hypotheses about the slope and intercept by using t tests.
29.
In linear regression between two variables, a significant relationship

exists when the p-value of the t test statistic for the slope is greater
than α.
FALSE
Reject H0: β1 = 0 if the p-value is less than α.
AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
30.
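The larger the absolute value of the t statistic of the slope in a
simple linear regression, the stronger the linear relationship
between X and Y.
TRUE
In simple regression the t statistic for the slope equals the t statistic for r, so for a
given n a larger absolute t corresponds to a larger absolute r. The correlation
coefficient measures linearity, regardless of its sign (+ or -).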
AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
31.
In simple linear regression, the coefficient of determination (R2) is

estimated from sums of squares in the ANOVA table.
TRUE
R2 = SSR/SST or R2 = 1 - SSE/SST.
AACSB: Analytic
Blooms: Remember
32.
In simple linear regression, the p-value of the slope will always equal
the p-value of the F statistic.
TRUE
This is true only if there is one predictor (but is no longer true in
multiple regression).
AACSB: Analytic
Blooms: Remember
Topic: Analysis of Variance: Overall Fit
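One way to see why the two p-values in question 32 agree is that, with a single predictor, the F statistic equals the square of the slope's t statistic. The sketch below checks this identity on hypothetical data using the standard simple regression formulas. It does not compute the p-values themselves.

```python
import math

# Hypothetical data, made up for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [3.2, 4.1, 6.3, 6.8, 9.4, 9.9, 12.5, 12.8]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar

fitted = [b0 + b1 * xi for xi in x]
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # error sum of squares
ssr = sum((fi - y_bar) ** 2 for fi in fitted)            # regression sum of squares

mse = sse / (n - 2)              # error mean square, n - 2 d.f.
msr = ssr / 1                    # regression mean square, 1 d.f.
f_calc = msr / mse

se_b1 = math.sqrt(mse / ss_xx)   # standard error of the slope
t_calc = b1 / se_b1

print(math.isclose(f_calc, t_calc ** 2))   # True
```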
33.
An observation with high leverage will have a large residual (usually

an outlier).
FALSE
The concepts are distinct (a high-leverage point could have a good
fit).
AACSB: Analytic
Blooms: Understand
34.
A prediction interval for Y is narrower than the corresponding

confidence interval for the mean of Y.
FALSE
Predicting an individual case requires a wider confidence interval
than predicting the mean.
AACSB: Analytic
Blooms: Remember
35.
When X is farther from its mean, the prediction interval and

confidence interval for Y become wider.
TRUE
The width increases when X differs from its mean (review the
formula).
AACSB: Analytic
Blooms: Understand
36.
The total sum of squares (SST) will never exceed the regression sum
of squares (SSR).
FALSE
The identity is SSR + SSE = SST.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
37.
"High leverage" would refer to a data point that is poorly predicted

by the model (large residual).
FALSE
A high-leverage observation may have a good fit (only its X value
determines its leverage).
AACSB: Analytic
Blooms: Remember
38.
The studentized residuals permit us to detect cases where the

regression predicts poorly.
TRUE
Studentized residuals resemble a t-distribution. A large studentized
t-value (e.g., t < -2.00 or t > +2.00) would imply a poor fit.
AACSB: Analytic
Blooms: Understand
39.
A poor prediction (large residual) indicates an observation with high

leverage.
FALSE
High leverage indicates an unusually large or small X value (not a
poor prediction). A high-leverage observation may have a good fit or
a poor fit. Only its X value determines its leverage.
AACSB: Analytic
Blooms: Understand
40.
Ill-conditioned refers to a variable whose units are too large or too

small (e.g., $2,434,567).
TRUE
In Excel, a symptom of poor data conditioning is exponential
notation (e.g., 4.3E + 06).
AACSB: Analytic
Blooms: Remember
Learning Objective: 12-07 Perform regression analysis with Excel or other software.
41.
A simple decimal transformation (e.g., from 18,291 to 18.291) often

improves data conditioning.
TRUE
Keeping data magnitudes similar helps avoid exponential notation
(e.g., 4.3E + 06).
AACSB: Analytic
Blooms: Understand
42.
Two-tailed t-tests are often used because any predictor that differs
significantly from zero in a two-tailed test will also be significantly
greater than zero or less than zero in a one-tailed test at the same α.
TRUE
True because the critical t is larger in the two-tailed test (the default
in most software).
AACSB: Analytic
Blooms: Apply
43.
A predictor that is significant in a one-tailed t-test will also be

significant in a two-tailed test at the same level of significance α.
FALSE
False because the critical t would be larger in a two-tailed test.
AACSB: Analytic
Blooms: Remember
44.
Omission of a relevant predictor is a common source of model

misspecification.
TRUE
In a multivariate world, simple regression may be inadequate.
AACSB: Analytic
Blooms: Remember
45.
The regression line must pass through the origin.

FALSE
The OLS intercept estimate does not, in general, equal zero. We
might be unable to reject a zero intercept in a t-test, but the fitted
intercept is rarely zero.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
46.
Outliers can be detected by examining the standardized residuals.

TRUE
A poor fit implies a large t-value (e.g., larger than 3 would be an
outlier).
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
47.
In a simple regression, there are n - 2 degrees of freedom associated

with the error sum of squares (SSE).
TRUE
This is true in simple regression because we estimate two
parameters (β0 and β1).
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
48.
In a simple regression, the F statistic is calculated by taking the ratio

of MSR to the MSE.
TRUE
By definition, Fcalc = MSR/MSE (obtained from the ANOVA table).
AACSB: Analytic
Blooms: Understand
49.
The coefficient of determination is the percentage of the total

variation in the response variable Y that is explained by the predictor
X.
TRUE
R2 = SSR/SST or R2 = 1 - SSE/SST lies between 0 and 1 and often is
expressed as a percent.
AACSB: Analytic
Blooms: Understand
50.
A different confidence interval exists for the mean value of Y for

each different value of X.
TRUE
Both the interval width and also E(Y|X) = β0 + β1X depend on the
value of X.
AACSB: Analytic
Blooms: Remember
51.
A prediction interval for Y is widest when X is near its mean.

FALSE
The prediction interval is narrowest when X is near its mean. Review
the formula, which has a term (xi - x̄)^2 in the numerator. The
minimum would be when xi = x̄.
AACSB: Analytic
Blooms: Remember
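For reference, the interval formulas that items 34, 35, and 51 point to can be written out as follows. This is a sketch in standard textbook notation, which is an assumption here because the test bank itself does not print the formula: s_e is the standard error of the regression and t has n - 2 degrees of freedom. The first line is the confidence interval for the mean of Y and the second is the prediction interval for an individual Y.

```latex
\hat{y}_i \pm t_{\alpha/2}\, s_e \sqrt{\frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum_{j=1}^{n} (x_j - \bar{x})^2}}
\qquad
\hat{y}_i \pm t_{\alpha/2}\, s_e \sqrt{1 + \frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum_{j=1}^{n} (x_j - \bar{x})^2}}
```

Both widths are smallest when x_i equals the mean of X, and the extra 1 under the square root is why the prediction interval is always the wider of the two.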
52.
In a two-tailed test for correlation at α = .05, a sample correlation

coefficient r = 0.42 with n = 25 is significantly different than zero.
TRUE
tcalc = r[(n - 2)/(1 - r^2)]^(1/2) = (.42)[(25 - 2)/(1 - .42^2)]^(1/2) = 2.219 > t.025 =
2.069 for d.f. = 25 - 2 = 23.
AACSB: Analytic
Blooms: Apply
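The calculation in item 52 can be checked with a few lines of Python. This is a minimal sketch: the function name corr_t_stat is illustrative, and the critical value 2.069 is copied from the explanation above rather than computed.

```python
import math

def corr_t_stat(r, n):
    """t statistic for H0: rho = 0, using t = r * sqrt((n - 2) / (1 - r^2))."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

t_calc = corr_t_stat(0.42, 25)
t_crit = 2.069  # t.025 for d.f. = 23, taken from the explanation above

# Reject H0 in a two-tailed test when |t_calc| exceeds the critical value.
print(t_calc, abs(t_calc) > t_crit)
```

The same function applies to the other correlation items in this section by changing r and n.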
53.
In correlation analysis, neither X nor Y is designated as the
independent variable.
TRUE
In correlation analysis, X and Y covary without designating either as
"independent."
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
54.
A negative value for the correlation coefficient (r) implies a negative

value for the slope (b1).
TRUE
The sign of r must be the same as the sign of the slope estimate b1.
AACSB: Analytic
Blooms: Remember
55.
High leverage for an observation indicates that X is far from its

mean.
TRUE
By definition, observations have higher leverage when X is far from
its mean.
AACSB: Analytic
Blooms: Remember
56.
Autocorrelated errors are not usually a concern for regression models

using cross-sectional data.
TRUE
We more often expect autocorrelated residuals in time series data.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
57.
There are usually several possible regression lines that will minimize
the sum of squared errors.
FALSE
The OLS solution for the estimators b0 and b1 is unique.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
58.
When the errors in a regression model are not independent, the

regression model is said to have autocorrelation.
TRUE
For example, in first-order autocorrelation εt depends on εt-1.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
59.
In a simple bivariate regression, Fcalc = tcalc^2.

TRUE
This statement is true only in a simple regression (one predictor).
AACSB: Analytic
Blooms: Remember
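The identity in item 59 can be verified numerically. The sketch below fits a simple regression by hand on a small made-up data set (the x and y values are hypothetical and not from the test bank) and confirms that Fcalc equals tcalc squared and that SSR + SSE = SST.

```python
# Hypothetical data, chosen only for illustration (not from the test bank).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.7]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = sxy / sxx                 # OLS slope
b0 = y_bar - b1 * x_bar        # OLS intercept
fitted = [b0 + b1 * xi for xi in x]

sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # error sum of squares
ssr = sum((fi - y_bar) ** 2 for fi in fitted)            # regression sum of squares
sst = sum((yi - y_bar) ** 2 for yi in y)                 # total sum of squares

mse = sse / (n - 2)                # n - 2 degrees of freedom for error
f_calc = (ssr / 1) / mse           # MSR / MSE with one predictor
t_calc = b1 / (mse / sxx) ** 0.5   # slope divided by its standard error

print(abs(f_calc - t_calc ** 2) < 1e-6 * f_calc)   # F equals t squared
print(abs(ssr + sse - sst) < 1e-9)                 # SSR + SSE = SST
```

With more than one predictor the F statistic tests all slopes jointly, so this equality no longer holds.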
60.
Correlation analysis primarily measures the degree of the linear

relationship between X and Y.
TRUE
The sign of r indicates the direction and its magnitude indicates the
degree of linearity.
AACSB: Analytic
Blooms: Remember
Multiple Choice Questions

61.
The variable used to predict another variable is called the:
A.
B.
C.
D.
response variable.
regression variable.
dependent variable.
We might also call the independent variable a predictor of Y.

AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
62.
The standard error of the regression:
A. is based on squared deviations from the regression line.

B. may assume negative values if b1 < 0.
C. is in squared units of the dependent variable.
D. may be cut in half to get an approximate 95 percent prediction
interval.
In a simple regression, the standard error is the square root of the
sum of the squared residuals divided by (n - 2).
AACSB: Analytic
Blooms: Apply
63.
A local trucking company fitted a regression to relate the travel time

The fitted regression is Time = -7.126 + 0.0214 Distance, based on a
0.0053. Find the value of tcalc to test for zero slope.
A. 2.46
B. 5.02
C. 4.04
D. 3.15
tcalc = (0.0214)/(0.0053) = 4.038.
AACSB: Analytic
Blooms: Apply
64.
A local trucking company fitted a regression to relate the travel time

The fitted regression is Time = -7.126 + .0214 Distance, based on a
0.0053. Find the critical value for a right-tailed test to see if the slope
is positive, using α = .05.
A. 2.101
B. 2.552
C. 1.960
D. 1.734
For d.f. = n - 2 = 20 - 2 = 18, Appendix D gives t.05 = 1.734.

AACSB: Analytic
Blooms: Apply
65.
If the attendance at a baseball game is to be predicted by the

equation Attendance = 16,500 - 75 Temperature, what would be the
predicted attendance if Temperature is 90 degrees?
A. 6,750
B. 9,750
C. 12,250
D. 10,020
The predicted Attendance is 16,500 - 75(90) = 9,750.

AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
66.
A hypothesis test is conducted at the 5 percent level of significance

to test whether the population correlation is zero. If the sample
consists of 25 observations and the correlation coefficient is 0.60,
then the computed test statistic would be:
A. 2.071.
B. 1.960.
C. 3.597.
D. 1.645.
tcalc = r[(n - 2)/(1 - r^2)]^(1/2) = (.60)[(25 - 2)/(1 - .60^2)]^(1/2) = 3.597.

Comment: Requires formula handout or memorizing the formula.
AACSB: Analytic
Blooms: Apply
67.
Which of the following is not a characteristic of the F-test in a simple

regression?
A. It is a test for overall fit of the model.
B. The test statistic can never be negative.
C. It requires a table with numerator and denominator degrees of
freedom.
D. The F-test gives a different p-value than the t-test.
Fcalc is the ratio of two variances (mean squares) that measures
overall fit. The test statistic cannot be negative because the
variances are non-negative. In a simple regression, the F-test always
agrees with the t-test.
AACSB: Analytic
Blooms: Remember
68.
A researcher's Excel results are shown below using Femlab (labor

force participation rate among females) to try to predict Cancer
(death rate per 100,000 population due to cancer) in the 50 U.S.
states.
Which of the following statements is not true?
A. The standard error is too high for this model to be of any

predictive use.
B. The 95 percent confidence interval for the coefficient of Femlab is
-4.29 to -0.28.
C. Significant correlation exists between Femlab and Cancer at α = .05.
D. The two-tailed p-value for Femlab will be less than .05.
The magnitude of se depends on Y (and, in this case, the tcalc
indicates significance).
AACSB: Analytic
Blooms: Apply
69.
A researcher's results are shown below using Femlab (labor force

participation rate among females) to try to predict Cancer (death
rate per 100,000 population due to cancer) in the 50 U.S. states.
Which statement is valid regarding the relationship between Femlab

and Cancer?
A. A rise in female labor participation rate will cause the cancer rate
to decrease within a state.
B. This model explains about 10 percent of the variation in state
cancer rates.
C. At the .05 level of significance, there isn't enough evidence to say
the two variables are related.
D. If your sister starts working, the cancer rate in your state will
decline.
It is customary to express the R2 as a percent (here, the tcalc indicates
significance).
AACSB: Analytic
Blooms: Apply
70.
A researcher's results are shown below using Femlab (labor force

participation rate among females) to try to predict Cancer (death
rate per 100,000 population due to cancer) in the 50 U.S. states.
What is the R2 for this regression?
A. .9018
B. .0982
C. .8395
D. .1605
R2 = SSR/SST = (5,377.836)/(54,745.225) = .0982.

AACSB: Analytic
Blooms: Apply
71.
A news network stated that a study had found a positive correlation

between the number of children a worker has and his or her earnings
last year. You may conclude that:
A. people should have more children so they can get better jobs.
B. the data are erroneous because the correlation should be
negative.
C. causation is in serious doubt.
D. statisticians have small families.
There is no a priori basis for expecting causation.
AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
72.
William used a sample of 68 large U.S. cities to estimate the

relationship between Crime (annual property crimes per 100,000
persons) and Income (median annual income per capita, in dollars).
His estimated regression equation was Crime = 428 + 0.050 Income.
We can conclude that:
A. the slope is small so Income has no effect on Crime.

B. crime seems to create additional income in a city.
C. wealthy individuals tend to commit more crimes, on average.
D. the intercept is irrelevant since zero median income is impossible
in a large city.
Zero median income makes no sense (significance cannot be
assessed from given facts).
AACSB: Analytic
Blooms: Apply
73.
Mary used a sample of 68 large U.S. cities to estimate the

relationship between Crime (annual property crimes per 100,000
persons) and Income (median annual income per capita, in dollars).
Her estimated regression equation was Crime = 428 + 0.050
Income. If Income decreases by 1000, we would expect that Crime
will:
A. increase by 428.
B. decrease by 50.
C. increase by 500.
D. remain unchanged.
The constant has no effect, so ΔCrime = 0.050 ΔIncome = 0.050(-1000) = -50.

AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
74.
Amelia used a random sample of 100 accounts receivable to

estimate the relationship between Days (number of days from billing
to receipt of payment) and Size (size of balance due in dollars). Her
estimated regression equation was Days = 22 + 0.0047 Size with a
correlation coefficient of .300. From this information we can conclude
that:
A. 9 percent of the variation in Days is explained by Size.

B. autocorrelation is likely to be a problem.
C. the relationship between Days and Size is significant.
D. larger accounts usually take less time to pay.
R2 = .30^2 = .09. These are not time-series data, so there is no reason
to expect autocorrelation. We cannot judge significance without
more information.
AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
75.
Prediction intervals for Y are narrowest when:
A. the mean of X is near the mean of Y.
B. the value of X is near the mean of X.
C. the mean of X differs greatly from the mean of Y.
D. the mean of X is small.
Review the formula, which has (xi - x̄)^2 in the numerator. The
minimum would be when xi = x̄.
AACSB: Analytic
Blooms: Remember
76.
If n = 15 and r = .4296, the corresponding t-statistic to test for zero

correlation is:
A. 1.715.
B. 7.862.
C. 2.048.
D. impossible to determine without α.
tcalc = r[(n - 2)/(1 - r^2)]^(1/2) = (.4296)[(15 - 2)/(1 - .4296^2)]^(1/2) = 1.715.

AACSB: Analytic
Blooms: Apply
77.
Using a two-tailed test at α = .05 for n = 30, we would reject the

hypothesis of zero correlation if the absolute value of r exceeds:
A. .2992.
B. .3609.
C. .0250.
D. .2004.
Use rcrit = t.025/(t.025^2 + n - 2)^(1/2) = (2.048)/(2.048^2 + 30 - 2)^(1/2) = .3609
for d.f. = 30 - 2 = 28.
AACSB: Analytic
Blooms: Apply
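The critical-r formula used in item 77 is easy to script. The following is a minimal sketch; the function name critical_r is illustrative, and the t values are copied from the explanations in this section rather than looked up in code.

```python
import math

def critical_r(t_crit, n):
    """Critical |r| from r_crit = t / sqrt(t^2 + n - 2)."""
    return t_crit / math.sqrt(t_crit ** 2 + n - 2)

# t.025 values copied from the explanations in this section
print(round(critical_r(2.048, 30), 4))   # n = 30, d.f. = 28
print(round(critical_r(2.101, 20), 4))   # n = 20, d.f. = 18
```

A sample correlation whose absolute value exceeds the returned number is significant in a two-tailed test at α = .05.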
78.
The ordinary least squares (OLS) method of estimation will

minimize:
A. neither the slope nor the intercept.
B. only the slope.
C. only the intercept.
D. both the slope and intercept.
OLS method minimizes the sum of squared residuals.

AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
79.
A standardized residual ei = -2.205 indicates:
A. a rather poor prediction.
B. an extreme outlier in the residuals.
C. an observation with high leverage.
D. a likely data entry error.
This residual is beyond 2se but is not an outlier (and without xi we
cannot assess leverage).
AACSB: Analytic
Blooms: Apply
80.
In a simple regression, which would suggest a significant relationship

between X and Y?
A. Large p-value for the estimated slope
B. Large t statistic for the slope
C. Large p-value for the F statistic
D. Small t-statistic for the slope
The larger the tcalc the more we feel like rejecting H0: β1 = 0.
AACSB: Analytic
Blooms: Remember
81.
Which is indicative of an inverse relationship between X and Y?
A. A negative F statistic
B. A negative p-value for the correlation coefficient
C. A negative correlation coefficient
D. Either a negative F statistic or a negative p-value
Fcalc and the p-value cannot be negative.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
82.
Which is not correct regarding the estimated slope of the OLS

regression line?
A. It is divided by its standard error to obtain its t statistic.
B. It shows the change in Y for a unit change in X.
C. It is chosen so as to minimize the sum of squared errors.
D. It may be regarded as zero if its p-value is less than α.
We would reject H0: β1 = 0 if its p-value is less than the level of
significance.
AACSB: Analytic
Blooms: Remember
83.
Simple regression analysis means that:
A. the data are presented in a simple and clear way.

B. we have only a few observations.
C. there are only two independent variables.
D. we have only one explanatory variable.
Multiple regression has more than one independent variable
(predictor).
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
84.
The sample coefficient of correlation does not have which property?
A. It can range from -1.00 up to +1.00.
B. It is also sometimes called Pearson's r.
C. It is tested for significance using a t-test.
D. It assumes that Y is the dependent variable.
Correlation analysis makes no assumption of causation or
dependence.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
85.
When comparing the 90 percent prediction and confidence intervals

for a given regression analysis:
A. the prediction interval is narrower than the confidence interval.

B. the prediction interval is wider than the confidence interval.
C. there is no difference between the size of the prediction and
confidence intervals.
D. no generalization is possible about their comparative width.
Individual values of Y vary more than the mean of Y.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
86.
Which is not true of the coefficient of determination?
A. It is the square of the coefficient of correlation.

B. It is negative when there is an inverse relationship between X and
Y.
C. It reports the percent of the variation in Y explained by X.
D. It is calculated using sums of squares (e.g., SSR, SSE, SST).
R2 cannot be negative.
AACSB: Analytic
Blooms: Remember
87.
If the fitted regression is Y = 3.5 + 2.1X (R2 = .25, n = 25), it is

incorrect to conclude that:
A. Y increases 2.1 percent for a 1 percent increase in X.
B. the estimated regression line crosses the Y axis at 3.5.
C. the sample correlation coefficient must be positive.
D. the value of the sample correlation coefficient is 0.50.
Units are not percent unless Y is already a percent.

AACSB: Analytic
Blooms: Apply
88.
In a simple regression Y = b0 + b1X where Y = number of robberies in

a city (thousands of robberies), X = size of the police force in a city
(thousands of police), and n = 45 randomly chosen large U.S. cities
in 2008, we would be least likely to see which problem?
A. Autocorrelated residuals (because this is time-series data)

B. Heteroscedastic residuals (because we are using totals
uncorrected for city size)
C. Nonnormal residuals (because a few larger cities may skew the
residuals)
D. High leverage for some observations (because some cities may be
huge)
It is not a time series, so autocorrelation would not be expected, but
the "size effect" is likely to produce heteroscedasticity, nonnormality,
and unusual leverage.
AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
89.
When homoscedasticity exists, we expect that a plot of the residuals

versus the fitted Y:
A. will form approximately a straight line.
B. crosses the centerline too many times.
C. will yield a Durbin-Watson statistic near 2.
D. will show no pattern at all.
Homoscedastic residuals exhibit no pattern (equal variance for all Y).

AACSB: Analytic
Blooms: Understand
90.
Which statement is not correct?
A. Spurious correlation can often be reduced by expressing X and Y

in per capita terms.
B. Autocorrelation is mainly a concern if we are using time-series
data.
C. Heteroscedastic residuals will have roughly the same variance for
any value of X.
D. Standardized residuals make it easy to identify outliers or
instances of poor fit.
Heteroscedastic residuals exhibit different variance for different X or
Y values.
AACSB: Analytic
Blooms: Understand
91.
In a simple bivariate regression with 25 observations, which

statement is most nearly correct?
A. A non-standardized residual whose value is ei = 4.22 would be

considered an outlier.
B. A leverage statistic of 0.16 or more would indicate high leverage.
C. Standardizing the residuals will eliminate any heteroscedasticity.
D. Non-normal residuals imply biased coefficient estimates, a major
problem.
For simple regression, the "high leverage criterion" is hi > 4/n = 4/25
= .16. We cannot judge a residual's magnitude without knowing the
standard error se. Standardizing is only a scale shift so does not
reduce heteroscedasticity. Non-normal errors do not bias the OLS
estimates.
AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
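The high-leverage criterion in item 91 (hi > 4/n) can be illustrated with a short script. This sketch uses the standard simple-regression leverage formula hi = 1/n + (xi - x̄)^2 / Σ(xj - x̄)^2, which the test bank does not print, so it is stated here as an assumption. The X values are hypothetical.

```python
# Hypothetical X values for illustration; the last one is far from the mean.
x = [10, 12, 11, 13, 9, 12, 10, 11, 13, 40]
n = len(x)
x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

# Leverage in simple regression: h_i = 1/n + (x_i - x_bar)^2 / Sxx
leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
threshold = 4 / n   # "high leverage" cutoff quoted in the explanation above

flagged = [xi for xi, hi in zip(x, leverage) if hi > threshold]
print(flagged)
```

Only the X values enter the calculation, which is why a high-leverage observation can still have a small residual.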
92.
A regression was estimated using these variables: Y = annual value

of reported bank robbery losses in all U.S. banks ($millions), X =
annual value of currency held by all U.S. banks ($millions), n = 100
years (1912 through 2011). We would not anticipate:
A. autocorrelated residuals due to time-series data.

B. heteroscedastic residuals due to the wide variation in data
magnitudes.
C. nonnormal residuals due to skewed data as bank size increases
over time.
D. a negative slope because banks hold less currency when they are
robbed.
It is a time series, so autocorrelation would be expected, and the
"size effect" is likely to produce heteroscedasticity and nonnormality,
but growth in both X and Y would yield a positive slope.
AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
93.
A fitted regression for an exam in Prof. Hardtack's class showed

Score = 20 + 7 Study, where Score is the student's exam score and
Study is the student's study hours. The regression yielded R2 = 0.50
and SE = 8. Bob studied 9 hours. The quick 95 percent prediction
interval for Bob's grade is approximately:
A. 69 to 97.
B. 75 to 91.
C. 67 to 99.
D. 76 to 90.
The quick interval is ypredicted ± 2se or 83 ± (2)(8) or 83 ± 16.

AACSB: Analytic
Blooms: Apply
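The quick interval in item 93 amounts to two lines of arithmetic. A minimal sketch follows; the function name quick_interval is illustrative, and the inputs are the values given in the item.

```python
def quick_interval(b0, b1, x, se):
    """Rough 95 percent prediction interval: y_hat +/- 2 * se."""
    y_hat = b0 + b1 * x
    return y_hat - 2 * se, y_hat + 2 * se

# Score = 20 + 7 Study, with se = 8 and Study = 9 hours
low, high = quick_interval(b0=20, b1=7, x=9, se=8)
print(low, high)
```

This shortcut ignores the distance of X from its mean, so it is only an approximation to the full prediction interval.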
94.
Which is not an assumption of least squares regression?
A. Normal X values
B. Non-autocorrelated errors
C. Homoscedastic errors
D. Normal errors
The predictor X is not assumed to be a random variable at all.

AACSB: Analytic
Blooms: Apply
95.
In a simple bivariate regression with 60 observations there will be

_____ residuals.
A. 60
B. 59
C. 58
D. 57
There is one residual for every observation.

AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
Learning Objective: 12-03 Make a prediction for a given x value using a regression equation.
96.
Which is correct to find the value of the coefficient of determination

(R2)?
A. SSR/SSE
B. SSR/SST
C. 1 - SST/SSE
We use the ANOVA sums of squares to calculate R2.

AACSB: Analytic
Blooms: Remember
97.
The critical value for a two-tailed test of H0: β1 = 0 at α = .05 in a

simple regression with 22 observations is:
A. 1.725
B. 2.086
C. 2.528
D. 1.960
From Appendix D, tcrit = 2.086 for d.f. = n - 2 = 22 - 2 = 20.

AACSB: Analytic
Blooms: Apply
98.
In a sample of size n = 23, a sample correlation of r = .400 provides

A.
B.
C.
D.
α = .01 but not α = .05.

α = .05 but not α = .01.
both α = .05 and α = .01.
tcalc = r[(n - 2)/(1 - r^2)]^(1/2) = (.40)[(23 - 2)/(1 - .40^2)]^(1/2) = 2.000 > t.05 =
1.721 for d.f. = 23 - 2 = 21. However, the test would not be
significant for t.01 = 2.518.
AACSB: Analytic
Blooms: Apply
99.
In a sample of n = 23, the Student's t test statistic for a correlation

of r = .500 would be:
A.
B.
C.
D.
2.559.
2.819.
2.646.
tcalc = r[(n - 2)/(1 - r^2)]^(1/2) = (.50)[(23 - 2)/(1 - .50^2)]^(1/2) = 2.646.

AACSB: Analytic
Blooms: Apply
100. In a sample of n = 23, the critical value of the correlation coefficient

A. .524
B. .412
C. .500
D. .497
Use rcrit = t.025/(t.025^2 + n - 2)^(1/2) = (2.080)/(2.080^2 + 23 - 2)^(1/2) = .4133
for d.f. = 23 - 2 = 21, which is closest to .412.
AACSB: Analytic
Blooms: Apply
101. In a sample of n = 23, the critical value of Student's t for a two-tailed

A. 2.229
B. 2.819
C. 2.646
D. 2.080
From Appendix D, t.025 = 2.080 for d.f. = n - 2 = 23 - 2 = 21.

AACSB: Analytic
Blooms: Apply
102. In a sample of n = 40, a sample correlation of r = .400 provides

A.
B.
C.
D.
α = .025 but not α = .05.

α = .05 but not α = .025.
both α = .025 and α = .05.
tcalc = r[(n - 2)/(1 - r^2)]^(1/2) = (.40)[(40 - 2)/(1 - .40^2)]^(1/2) = 2.690 > t.025 =
2.024 for d.f. = 40 - 2 = 38. The test would also be significant a
fortiori if we used t.05 = 1.686.
AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
103. In a sample of n = 20, the Student's t test statistic for a correlation

of r = .400 would be:
A. 2.110
B. 1.645
C. 1.852
D. can't say without knowing if it's a two-tailed or one-tailed test.
tcalc = r[(n - 2)/(1 - r^2)]^(1/2) = (.40)[(20 - 2)/(1 - .40^2)]^(1/2) = 1.852.
AACSB: Analytic
Blooms: Apply

A. .587
B. .412
C. .444
D. .497
Use rcrit = t.025/(t.025^2 + n - 2)^(1/2) = (2.101)/(2.101^2 + 20 - 2)^(1/2) = .4437
for d.f. = 20 - 2 = 18.
AACSB: Analytic
Blooms: Apply

A. 2.060
B. 2.052
C. 2.898
D. 2.074

AACSB: Analytic
Blooms: Apply
106. In a sample of size n = 36, a sample correlation of r = -.450 provides

coefficient differs significantly from zero in a two-tailed test at:
A.
B.
C.
D.
α = .01
α = .05
both α = .01 and α = .05.
tcalc = r[(n - 2)/(1 - r^2)]^(1/2) = (-.45)[(36 - 2)/(1 - (-.45)^2)]^(1/2) = -2.938 <
t.005 = -2.728 for d.f. = 34. The test would also be significant a fortiori
if we used t.025 = -2.032.
AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
107. In a sample of n = 36, the Student's t test statistic for a correlation

of r = -.450 would be:
A.
B.
C.
D.
-2.110.
-2.938.
-2.030.
tcalc = r[(n - 2)/(1 - r^2)]^(1/2) = (-.45)[(36 - 2)/(1 - (-.45)^2)]^(1/2) = -2.938.

AACSB: Analytic
Blooms: Apply

A. .329
B. .387
C. .423
D. .497
Use rcrit = t.025/(t.025^2 + n - 2)^(1/2) = (2.032)/(2.032^2 + 36 - 2)^(1/2) = .3291
for d.f. = 36 - 2 = 34.
AACSB: Analytic
Blooms: Apply

test of significance of the slope for a simple regression at α = .05 is:
A. 2.938
B. 2.724
C. 2.032
D. 2.074

AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
110. A local trucking company fitted a regression to relate the travel time
The fitted regression is Time = -7.126 + 0.0214 Distance. If Distance
increases by 50 miles, the expected Time would increase by:
A. 1.07 days
B. 7.13 days
C. 2.14 days
D. 1.73 days
50(0.0214) = 1.07.
AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
111. A local trucking company fitted a regression to relate the cost of its
shipments as a function of the distance traveled. The Excel fitted
regression is shown.
Based on this estimated relationship, when distance increases by 50

miles, the expected shipping cost would increase by:
A. $286.
B. $143.
C. $104.
D. $301.
2.8666(50) = $143.33.
AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
112. If SSR is 2592 and SSE is 608, then:
A. the slope is likely to be insignificant.
B. the coefficient of determination is .81.
C. the SST would be smaller than SSR.
D. the standard error would be large.
R2 = SSR/SST = SSR/(SSR + SSE) = 2592/(2592 + 608) = .81. SST
cannot be smaller than SSR because SST = SSR + SSE. The
significance and standard error cannot be judged without more
information.
AACSB: Analytic
Blooms: Apply
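The arithmetic in item 112 can be reproduced directly from the two sums of squares. A minimal sketch, using only the values given in the item:

```python
ssr = 2592   # regression sum of squares (from item 112)
sse = 608    # error sum of squares (from item 112)

sst = ssr + sse          # SST = SSR + SSE
r_squared = ssr / sst    # equivalently 1 - sse / sst
print(sst, r_squared)
```

Because SSE cannot be negative, SST is never smaller than SSR, and R2 always lies between 0 and 1.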
113. Find the sample correlation coefficient for the following data.
A. .8911
B. .9124
C. .9822
D. .9556
Use Excel =CORREL(XData, YData) to verify your calculation using
the formula for r.
AACSB: Analytic
Blooms: Apply
114. Find the slope of the simple regression ŷ = b0 + b1x.
A. 1.833
B. 3.294
C. 0.762
D. -2.228
Use Excel to verify your calculations using the formulas for b0 and b1.
AACSB: Analytic
Blooms: Apply
115. Find the sample correlation coefficient for the following data.
A. .7291
B. .8736
C. .9118
D. .9563
Use Excel =CORREL(XData, YData) to verify your calculation using
the formula for r.
AACSB: Analytic
Blooms: Apply
116. Find the slope of the simple regression ŷ = b0 + b1x.
A. 2.595
B. 1.109
C. -2.221
D. 1.884
Use Excel to verify your calculations using the formulas for b0 and b1.
AACSB: Analytic
Blooms: Apply
117. A researcher's results are shown below using n = 25 observations.
A. [-3.282, -1.284].
B. [-4.349, -0.217].
C. [1.118, 5.026].
D. [-0.998, +0.998].
For d.f. = n - 2 = 25 - 2 = 23, t.025 = 2.069, so -2.2834 ± (2.069)(0.99855).
AACSB: Analytic
Blooms: Apply
Learning Objective: 12-05 Calculate and interpret confidence intervals for regression coefficients.
118. A researcher's regression results are shown below using n = 8

observations.
A. [1.333, 2.284].
B. [1.602, 2.064].
C. [1.268, 2.398].
D. [1.118, 2.449].
For d.f. = n - 2 = 8 - 2 = 6, t.025 = 2.447, so 1.8333 ± (2.447)(0.2307).
AACSB: Analytic
Blooms: Apply
Learning Objective: 12-05 Calculate and interpret confidence intervals for regression coefficients.
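The confidence intervals in items 117 and 118 follow the same pattern, b1 ± t(se of b1). The sketch below reproduces both. The function name slope_interval is illustrative, and the slope, standard error, and t values are copied from the explanations above.

```python
def slope_interval(b1, se_b1, t_crit):
    """Confidence interval b1 +/- t_crit * se_b1 for a regression slope."""
    half_width = t_crit * se_b1
    return b1 - half_width, b1 + half_width

# Item 117: n = 25, d.f. = 23, t.025 = 2.069
print(slope_interval(-2.2834, 0.99855, 2.069))
# Item 118: n = 8, d.f. = 6, t.025 = 2.447
print(slope_interval(1.8333, 0.2307, 2.447))
```

If the resulting interval excludes zero, the slope differs significantly from zero in a two-tailed test at the same α.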
119. Bob thinks there is something wrong with Excel's fitted regression.
What do you say?
A. The estimated equation is obviously incorrect.
B. The R2 looks a little high but otherwise it looks OK.
C. Bob needs to increase his sample size to decide.
D. The relationship is linear, so the equation is credible.
A visual estimate of the slope is Δy/Δx = (625 - 100)/(200 - 0) =
2.625, so the indicated slope less than 1 must be wrong, plus the
visual intercept is 100 (not 154.61) and the fit seems better than R2
= .2284.
AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
Short Answer Questions
120. Pedro became interested in vehicle fuel efficiency, so he performed a

simple regression using 93 cars to estimate the model CityMPG = β0
+ β1 Weight where Weight is the weight of the vehicle in pounds. His
results are shown below. Write a brief analysis of these results, using
what you have learned in this chapter. Is the intercept meaningful in
this regression? Make a prediction of CityMPG when Weight = 3000,
and also when Weight = 4000. Do these predictions seem
believable? If you could make a car 1000 pounds lighter, what
change would you predict in its CityMPG?
It is reasonable that a causal relationship might exist between a

vehicle's weight and its MPG. We expect a negative slope (heavier
vehicles would get lower MPG). The coefficient of Weight differs from
zero at any common value of α (the p-value is less than .0001) and
the F statistic is huge. The confidence interval for the coefficient of
the predictor Weight does not include zero. The highly significant
predictor Weight is consistent with the high coefficient of
determination (R2 = .711), which says that well over half the
variation in MPG is explained by Weight. If Weight = 3000, we
predict MPG = 47.0484 - .0080 Weight = 47.0484 - .0080(3000) =
23.05 mpg. If Weight = 4000, we predict MPG = 47.0484 - .0080
Weight = 47.0484 - .0080(4000) = 15.05 mpg. A car 1000 pounds
lighter would be predicted to gain .0080(1000) = 8 mpg. The intercept
is not meaningful since no vehicle has zero weight or a weight close
to zero.
Feedback: It is reasonable to postulate that a causal relationship

might exist between a vehicle's weight and its MPG. Our a priori
expectation would be that the slope should be negative since we
would expect that heavier vehicles would get lower MPG. The
coefficient of Weight differs from zero at any common value of α (the
p-value is less than .0001) and the F statistic is huge. The confidence
interval for the coefficient of the predictor Weight does not include
zero. The slope's sign is negative, as anticipated a priori. The highly
significant predictor Weight is consistent with the high coefficient of
determination (R2 = .711), which says that well over half the
variation in MPG is explained by Weight. If Weight = 3000, we
predict MPG = 47.0484 - .0080 Weight = 47.0484 - .0080(3000) =
23.05 mpg. When Weight = 4000, we would predict MPG = 47.0484 -
.0080 Weight = 47.0484 - .0080(4000) = 15.05 mpg. The intercept is
not meaningful since no vehicle has zero weight or any weight close
to zero.
AACSB: Reflective Thinking
Blooms: Evaluate
Difficulty: 3 Hard
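The predictions in item 120 can be reproduced from the fitted coefficients quoted in the answer. A minimal sketch follows; the function name predict_city_mpg is illustrative.

```python
def predict_city_mpg(weight):
    """Fitted line from item 120: CityMPG = 47.0484 - .0080 Weight."""
    return 47.0484 - 0.0080 * weight

for w in (3000, 4000):
    print(w, round(predict_city_mpg(w), 2))

# Change in predicted CityMPG for a car 1000 pounds lighter
print(round(predict_city_mpg(3000) - predict_city_mpg(4000), 2))
```

The difference between the two predictions depends only on the slope, so the intercept plays no role in the 1000-pound comparison.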
121. Mary noticed that old coins are smoother and more worn. She
weighed 31 nickels and recorded their age, and then performed a
simple regression to estimate the model Weight = β0 + β1 Age where
weight is the weight of the coin in grams and Age is the age of the
coin in years. Her results are shown below. Write a brief analysis of
these results, using what you have learned in this chapter. Make a
prediction of Weight when Age = 10, and also when Age = 20. What
does this tell you? Is the intercept meaningful in this regression?
It is reasonable to postulate a causal relationship between a coin's

age and its weight (negative slope, since we would expect that coins
will wear down with usage). The coefficient of Age differs from zero
at any common α (the p-value is less than .0001) and the F test
statistic is large. The confidence interval for the coefficient of Age
does not include zero, and its sign is negative, as anticipated a priori.
Despite the significant predictor Age, the coefficient of determination
(R2 = .442) shows that less than half the variation in nickel weights is
explained by Age. If Age = 10, we predict Weight = 5.0210 - .0040
Age = 5.0210 - .0040(10) = 4.981 gm. If Age = 20, we predict
Weight = 5.0210 - .0040 Age = 5.0210 - .0040(20) = 4.941 gm. The
intercept is meaningful if Age = 0 was in the sample data set (or at
least some Age value near zero). The intercept is logically
meaningful because Age = 0 is something we might observe (i.e., a
newly minted nickel).
Feedback: It is reasonable to postulate that a causal relationship
might exist between a coin's age and its weight. Our a priori
expectation would be that the slope should be negative since we
would expect that coins will wear down with usage. The coefficient of
Age differs from zero at any common value of α (the p-value is less
than .0001) and the F test statistic is quite large. The confidence
interval for the coefficient of Age does not include zero, and its sign
is negative, as anticipated a priori. Despite the highly significant
predictor Age, the coefficient of determination (R2 = .442) shows that
less than half the variation in nickel weights is explained by Age. Our
predictions: If Age = 10, we would predict Weight = 5.0210 - .0040
Age = 5.0210 - .0040(10) = 4.981 gm. If Age = 20, we would predict
Weight = 5.0210 - .0040 Age = 5.0210 - .0040(20) = 4.941 gm. The
intercept is meaningful, assuming that Age = 0 years was included
in the sample data set (or at least some Age value near zero). The
intercept is logically meaningful a priori because Age = 0 is
something we might easily observe (i.e., a newly minted nickel).
AACSB: Reflective Thinking
Blooms: Evaluate
Difficulty: 3 Hard
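The nickel predictions, and the consistency between R2 = .442 and the reported significance, can be checked with a short script. This is a sketch only: it uses the rounded coefficients quoted in the answer (5.0210 and -.0040) and n = 31 from the question. It recovers r from R2 using the sign of the slope, then applies the usual t statistic for a correlation, t = r sqrt(n - 2) / sqrt(1 - r2). The helper name predict_weight is hypothetical, and the regression output itself is not reproduced in this excerpt.

```python
import math

# Values quoted in the question and answer text (coefficients are rounded).
B0 = 5.0210    # intercept, in grams
B1 = -0.0040   # slope, in grams per year of Age
R2 = 0.442     # coefficient of determination
N = 31         # number of nickels weighed


def predict_weight(age):
    """Return the fitted coin weight in grams for a given age in years."""
    return B0 + B1 * age


for age in (10, 20):
    print(f"Age = {age}: predicted Weight = {predict_weight(age):.3f} gm")

# In simple regression, r has the same sign as the slope and r squared equals R2.
r = math.copysign(math.sqrt(R2), B1)

# t statistic for testing zero correlation, with n - 2 degrees of freedom.
t_calc = r * math.sqrt(N - 2) / math.sqrt(1 - R2)

print(f"r = {r:.3f}, t = {t_calc:.2f} with {N - 2} degrees of freedom")
```

A t statistic of this size with 29 degrees of freedom is in line with the very small p-value cited in the answer, even though Age explains less than half of the variation in Weight.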
