
4/21/2008

20 Answers
Mix and Match
1. d 2. c 3. h 4. j 5. a 6. i 7. e 8. g 9. b 10. f The formula for the optimal price does not make sense unless the elasticity is less than -1.

True/False
11. False.
12. False. Correlation only measures the amount of linear association. Other patterns may be present as well.
13. True.
14. True. Residuals carry the same units as the response, whatever it might be.
15. True. If the scale of the variable is measured in something per something else, a reciprocal just flips this over.
16. False. If, for example, both y and x are on a log scale, then the slope is the elasticity. Elasticity has a very important interpretation.
17. False. The point of the transformation is to capture the pattern in the data, not to make outliers look more natural. Outliers often do seem more reasonable after a transformation, but this is usually an accident. Transformations also often reveal other points as outliers.
18. False. The elasticity requires that both x and y be on a log scale.
19. True.
20. True.


20 Curves

Think About It
21. If the relationship is linear, these would cost the same amount (assuming fixed costs are near zero). For example, if diamonds cost $2,500 per carat with no fixed costs, the fitted line is ŷ = 2500 Weight. A one-carat diamond then goes for $2,500 and each half-carat diamond for $1,250.
22. The reciprocal equation is better, and it implies that reducing the weight by 50 pounds has a larger effect on the mileage of a smaller car than on the mileage of a larger car. Take the junk out of the trunk.
23. Heavier cars also tend to have more powerful engines. As with the Ferrari, larger engines burn more fuel regardless of how they are driven.
24. One can think of many possible lurking factors, such as simultaneous changes in the level of promotion (that came with the changes in the prices) or promotions of other products in the stores. In general, anything that brought more customers to the store would potentially increase the sales. If these actions coincide with changes in the price of this item, the equation in the text would mix those effects with the effects of price. It would be a good idea to look at a timeplot of the residuals.
25. These would be unrelated. Changes in the price would have no effect on the quantity sold. Think simple variation in sales.
26. A positive elasticity (higher prices drawing higher sales) is a bit odd. More likely, the explanation for this apparent elasticity is a lurking factor that has happened to boost sales in spite of rising prices.
27. It's still the predicted value when the predictor is zero, only in this case the predictor is 1/x rather than x itself. To make the predictor go away, make x get really big. In other words, b0 is the fitted value when x gets arbitrarily large, sometimes called the asymptote.
28. In this case, the predictor is log x. To make log x = 0, set x = 1. The intercept is the predicted value for y when x = 1 (not when x is zero).
29. Rounding has many effects on data. In this case, the stripes come from the reporting of gasoline mileage in whole numbers. You can also see the stripes (as horizontal bands) in Figure 20.3. When you subtract out the line, those horizontal stripes become tilted.
30. Not at all. Changing the units by multiplying them by a constant will not affect the percentage changes.


You Do It
31. Wal-Mart
a) The scatterplot follows. The pattern is not linear; income grows more rapidly in the later years.
b) The scatterplot shows the fit of the linear equation,
Estimated Operating Income = -570031.1 + 286.35664 Date
The intercept is a gross extrapolation (to the year zero). The slope indicates that, on average, operating income grew by about $286 million annually at Wal-Mart over these 16 years.
c) The residuals show the up-down-up pattern common when a nonlinear pattern has been estimated using a linear equation. The residuals also show more variation in the later years, with wider swings around the trend.
d) The scatterplot on a log scale is more linear, with more consistent variation over these years. The plot does, however, seem to bend back the other way from the initial plot. The log scale emphasizes the gyration in sales in the late 1990s that is otherwise less apparent.
e) The fitted equation is
Estimated Log Op Income = -306.7946 + 0.1572509 Date
The slope (times 100) in this equation estimates the annual percentage change in operating income, here about 16%. This interpretation comes from algebra similar to that shown in the text. The estimated change in the log income from year t to year t+1 is
0.157 = log y(t+1) - log y(t) = log(y(t+1)/y(t)) = log(1 + (y(t+1) - y(t))/y(t)) ≈ (y(t+1) - y(t))/y(t)
f) The residuals have an odd wavy pattern, tracking over time. We can also see that one quarter has exceptionally large values over these years, namely the 4th quarter, associated with the holiday shopping season.
g) The log transformation reveals some details that are otherwise less noticeable (seasonal pattern, dip in sales in the late 90s). The slope also has a nice interpretation as the rate of growth. We cannot compare R² or s_e since the response in these two equations differs.
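The approximation in part (e) can be checked numerically. The sketch below assumes the fitted equation uses natural logs (the scale of the log plot suggests this) and compares the slope read directly as a growth rate with the exact rate exp(b1) - 1. The function name is illustrative only.

```python
import math

# Slope of the fitted equation for log operating income on Date (per year).
B1_WALMART = 0.1572509

def annual_growth(slope):
    """Return (approximate, exact) annual growth implied by a natural-log slope.

    approximate: the slope itself, using log(1 + g) ~ g for small g
    exact:       exp(slope) - 1, from log(y(t+1)/y(t)) = slope
    """
    return slope, math.exp(slope) - 1

approx, exact = annual_growth(B1_WALMART)
print(f"approximate growth: {100 * approx:.1f}% per year")
print(f"exact growth:       {100 * exact:.1f}% per year")
```

The two figures differ by roughly a percentage point here, which is why the answer quotes the growth rate only as "about 16%".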
[Figure: Operating Income versus Date, with Residual versus Date]
[Figure: Log Op Income versus Date, with Residual versus Date]

32. Target
a) The scatterplot follows. The pattern is not linear; income grows more rapidly in the later years.
b) The scatterplot shows the fit of the linear equation,
Estimated Operating Income = -113814.6 + 57.226374 Date
The intercept is a gross extrapolation (to the year zero). The slope indicates that, on average, operating income grew by about $57 million annually at Target over these 16 years.
c) The residuals trend down, then up. The residuals also show more variation in the later years, with wider swings around the trend. The seasonal pattern is also evident, with 4th quarter income much larger than in other quarters. The spread among quarters is expanding as well.
d) The scatterplot on a log scale is more linear, with more consistent variation over these years. The log scale shows more of the variation in income in quarters 1-3 that is otherwise obscured by the large seasonal variation.
e) The fitted equation is
Estimated Log Op Income = -236.6474 + 0.1214651 Date
The slope (times 100) in this equation estimates the annual percentage change in operating income, here about 12%. This interpretation follows from algebra similar to that shown in the text that produces the interpretation of an elasticity. The estimated change in the log income from year t to year t+1 is
0.121 = log y(t+1) - log y(t) = log(y(t+1)/y(t)) = log(1 + (y(t+1) - y(t))/y(t)) ≈ (y(t+1) - y(t))/y(t)
f) The residuals from quarters 1-3 track the fit relatively well, with those from the 4th quarter, associated with the holiday shopping season, lying well above the fitted trend (but with a closing gap in more recent years).
g) The log transformation reveals some details that are otherwise less noticeable (variation in quarters 1-3). On a log scale, Target seems to have been relatively steady until 1995, and then began to grow rapidly. The slope also has a nice interpretation as the rate of growth. We cannot compare R² or s_e since the response in these two equations differs.
[Figure: Operating Income versus Date, with Residual versus Date]
[Figure: Log Operating Income versus Date, with Residual versus Date]

33. Wine
a) The scatterplot follows. The relationship seems fairly linear over this range of ratings, though the floor of prices at zero complicates the analysis.
b) The linear regression is shown in red. The equation of the fit is
Estimated Price = -1058.588 + 12.184283 Rating
with R² = 0.55. The plot shows that the linear model underpredicts poorly rated wines (they're not giving them away) as well as those that are very highly rated.
c) The plot on the log scale appears to have a more linear trend than the original scatterplot.
d) The fit of the model on the log scale is
Estimated log(Price) = -19.67249 + 0.2566828 Rating
As shown in the plot below, the log model (curve) does not predict negative prices for lower rated wines.
e) We cannot use R² and s_e to compare these models because the response in the two models is different. The scale of s_e is not the same in the two models. Similarly, we should not directly compare explaining variation in the log of price with explaining variation in the price itself. We'll stick to the plot and the substantive common sense of preferring a model that does not predict negative prices for common situations.
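The contrast between the two fitted equations at low ratings can be sketched numerically. This is only an illustration that assumes the log equation uses natural logs (consistent with the scale of the log-price plot); the function names are hypothetical.

```python
import math

def linear_price(rating):
    """Estimated price from the linear equation."""
    return -1058.588 + 12.184283 * rating

def log_model_price(rating):
    """Estimated price implied by the log equation, back on the price scale."""
    return math.exp(-19.67249 + 0.2566828 * rating)

# Ratings in the plotted range run from 85 to 96.
for rating in (85, 90, 96):
    print(rating, round(linear_price(rating), 2), round(log_model_price(rating), 2))
```

Because the exponential is always positive, the log model cannot produce a negative price, whereas the linear equation does so at the low end of the ratings.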

[Figure: Price versus Rating, and Log Price versus Rating]

34. Display space
a) It's not linear, and we should not expect it to be linear. Sales grow more rapidly with the initial promotional effort, then rise more slowly. Basically, once an interested customer can find the product, we don't need to have much excess space.
b) The linear equation is
Estimated Sales = 93.032311 + 39.75648 Display Feet
This equation implies that sales continue to rise steadily on average as more space is allocated to this product. That's not reasonable.
c) The plot with the predictor on the log scale looks more linear, but pushes the data to the right edge of the range.
d) The fit on a log scale is
Estimated Sales = 83.560256 + 138.62089 Log (Display Feet)
The intercept is the estimated level of sales with 1 foot on display, $83.56. The slope gives the effect of a percentage increase in the amount on display. Using the methods illustrated in the text, each 1% increase in the display space comes with an increase of about 0.01 b1 = $1.39 in sales per week. This shows the diminishing marginal returns common in advertising.
e) R² and s_e can be used here because the response variable and cases are the same in both fits. The equation using log of the display feet fits a little better, but not dramatically.
Model     R²      s_e
Linear    0.712   $51.59
Log       0.815   $41.31
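The diminishing returns described in part (d) can be illustrated with the fitted log equation. The sketch below assumes natural logs; the function and variable names are illustrative, not taken from the text.

```python
import math

B0 = 83.560256   # intercept: estimated sales at 1 display foot, since log(1) = 0
B1 = 138.62089   # slope on log(display feet)

def estimated_sales(display_feet):
    """Estimated weekly sales from the equation with log(display feet)."""
    return B0 + B1 * math.log(display_feet)

# The same one-foot increase is worth much less when more space is already used.
gain_first_foot = estimated_sales(2) - estimated_sales(1)
gain_later_foot = estimated_sales(7) - estimated_sales(6)

# Effect of a 1% increase in display space, using the 0.01 * b1 approximation.
one_percent_effect = 0.01 * B1

print(round(gain_first_foot, 2), round(gain_later_foot, 2), round(one_percent_effect, 2))
```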

[Figure: Sales versus Display Feet, and Sales versus Log display feet]

35. Used Accords
a) It would be odd if the price fell off at a constant rate. The values cannot go negative, after all. One should expect larger drops in the first few years.
b) The fit of the linear equation (shown in the figure below as the red line) is
Estimated Asking price = 15.462647 - 0.9464141 Age
The intercept suggests that a new used car (just driven off the dealer's lot) sells for about $15,463. The slope indicates these cars drop in resale value by about $946 per year.
c) The plot of the residuals (below left) shows that this equation misses the pattern, underestimating the price at the two ends and overestimating the price in the middle of the age range.
d) The residuals from the equation that uses the log of age seem simpler when compared to the pattern that is evident in the residuals from the linear equation.
e) The log predictor implies that price changes rapidly among relatively newer used cars, then drops off more slowly as the cars age.
f) We can compare these summaries in this example because the response variable is the same. The log equation fits better, with larger R² and smaller s_e. By bending, the log equation captures more of the variation in the data. This table shows the results.
Model     R²      s_e
Linear    0.795   $2,190
Log       0.928   $1,300
g) The intercept is the predicted asking price for a car that is one year old, $22,993. Using the same approximations as in the text, we see that a 1% increase in age comes with a change in the resale price of about 0.01 b1 = -0.078 thousand dollars, a drop of about $78. This implies a much larger change in the first years of aging.
h) The estimated asking price drops about $5,400, from $22,993 at age 1 to $17,573 at age 2. For older cars, the drop is smaller. The estimated price drops from $4,243 at year 11 to $3,562 at year 12.
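The predictions in part (h) can be reproduced approximately from the numbers quoted above. The slope of the log equation is not printed in this answer, so the sketch below backs it out from the two stated predictions at ages 1 and 2; that inferred slope, the assumption of natural logs, and the function name are all assumptions made for illustration.

```python
import math

B0 = 22.993          # predicted price ($000) at age 1, as stated in part (g)
PRICE_AGE_2 = 17.573 # predicted price ($000) at age 2, as stated in part (h)

# Assumed form: price = B0 + B1 * log(age). Solve for B1 from the age-2 value.
B1 = (PRICE_AGE_2 - B0) / math.log(2)

def estimated_price(age_years):
    """Estimated asking price ($000) from the inferred log equation."""
    return B0 + B1 * math.log(age_years)

drop_early = estimated_price(1) - estimated_price(2)    # first year of aging
drop_late = estimated_price(11) - estimated_price(12)   # a year much later

print(round(B1, 2), round(drop_early, 3), round(drop_late, 3))
```

The inferred slope is close to -7.8, which agrees with the 0.01 b1 figure of about 0.078 thousand dollars in magnitude quoted in part (g).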
[Figure: Asking price versus Age]
[Figure: Residual versus Age (two panels)]

36. Used Camrys
In general, this example closely follows the previous exercise for Honda Accords. The lack of linearity is less apparent, however.
a) It would be odd if the price fell off at a constant rate. The values cannot go negative, after all. One should expect larger drops in the first few years.
b) The fit of the linear equation (shown in the figure below as the red line) is
Est Asking Price ($000) = 15.726113 - 1.0416708 Age
The intercept suggests that a new used car (just driven off the dealer's lot) sells for about $15,726. The slope indicates these cars drop in resale value by about $1,042 per year.
c) The plot of the residuals (below left) shows that this equation misses the pattern, underestimating the price at the two ends and overestimating the price in the middle of the age range.
d) The residuals from the equation that uses the log of age seem simpler when compared to the pattern that is evident in the residuals from the linear equation, but the difference between them is more subtle than in the prior exercise.
e) The log predictor implies that price changes rapidly among relatively newer used cars, then drops off more slowly as the cars age.
f) We can compare these summaries in this example because the response variable is the same. The log equation fits better, with larger R² and smaller s_e. By bending, the log equation captures more of the variation in the data. This table shows the results.
Model     R²      s_e
Linear    0.858   $1,799
Log       0.890   $1,585
g) The intercept is the predicted asking price for a car that is one year old, $22,286. Using the same approximations as in the text, we see that a 1% increase in age comes with a change in the resale price of about 0.01 b1 = -0.0773 thousand dollars, a drop of about $77. This implies a much larger change in the first years of aging.
h) The estimated asking price drops about $5,400, from $22,286 at age 1 to $16,930 at age 2. For older cars, the drop is smaller. The estimated price drops from $3,757 at year 11 to $3,085 at year 12.

[Figure: Asking Price ($000) versus Age (years)]
[Figure: Residual versus Age (years) (two panels)]

37. Cellular phones in the US
a) The rapid expansion of this market suggests that we ought to expect something nonlinear, such as exponential growth.
b) It's certainly growing rapidly, perhaps more smoothly than we might have expected.
c) The linear equation is not a good summary. The fitted equation
Estimated Subscribers = -1.96e+10 + 9831282.4 Date
suggests that the market is growing at about 10 million per year. That's in the right ballpark overall, but is much too fast for the early years and much too slow for the later years. The negative intercept is a reminder that we cannot extrapolate this equation back to the year zero.
d) On the log scale, the trend in the data bends the other way. An equation that's linear in the logs misses the trend as well. Whereas the linear equation underpredicts the later years, this equation overpredicts the number of subscribers in later years.
e) With the percentage changes plotted on the years since 1984, we can see the gradual slowing of the rate of growth. This rate of growth would need to be roughly constant for the log model to describe the curve. Since the rate slows, the log model misses the curvature.
f) The estimated equation for the curve is
Estimated Pct Growth = 1.1793435 + 148.0128 Recip(Date-1984)
As the following scatterplot shows, this curve captures the slowing growth.
g) For many, many years past 1984 (basically, infinitely many years), the rate of growth will eventually slow to 1.18 percent.
h) Based on this fitted curve, we'd estimate the percentage growth in the next period (Date - 1984 = 23) to be
Estimated Pct Growth = 1.1793435 + 148.0128/23 ≈ 7.6%
Applying this to the last observation predicts the number of subscribers to be
219,420,457 * 1.076 ≈ 236,096,411
or about 236 million.
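The arithmetic in parts (g) and (h) can be sketched as follows. The coefficients and the last observed count come from the answer above; the function and variable names are illustrative, and the growth rate is rounded to one decimal place to match the 1.076 multiplier used in the text.

```python
B0 = 1.1793435                   # limiting growth rate (percent) as the years grow large
B1 = 148.0128                    # coefficient on 1/(Date - 1984)
LAST_SUBSCRIBERS = 219_420_457   # last observed number of subscribers

def estimated_pct_growth(years_since_1984):
    """Estimated percentage growth from the reciprocal equation."""
    return B0 + B1 / years_since_1984

growth = estimated_pct_growth(23)
rounded_growth = round(growth, 1)   # the text rounds to 7.6%
predicted = LAST_SUBSCRIBERS * (1 + rounded_growth / 100)

print(f"estimated growth: {growth:.2f}%")
print(f"predicted subscribers: {predicted:,.0f}")
```

As the number of years since 1984 grows, the second term shrinks toward zero, which is why the intercept plays the role of the long-run growth rate in part (g).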

38. Cellular phones in Africa
a) The sequence plot of both series together suggests a linear trend in the number of landlines. Mobile use appears to grow faster than linear, with some decrease in the rate of growth in the later years.

b) The linear trend (shown in red below) has a high R² (0.98), but misses the trend at the extremes and middle of the data.

c) The linear equation is
Estimated Fixed Line Subscribers (Sub-Sahara, 000) = -991136.1 + 497.94545 Year
The slope implies annual growth of about 500,000 subscribers, and the negative intercept represents a huge, unrealistic extrapolation.
d) The residuals highlight the poor fit, magnifying the deviations from the linear trend. The linear equation underpredicts at the edges of the plot, and overpredicts in the middle.

e) The use of the log captures the bending pattern, which is subtle but important in the original plot (see the green curve above). The residuals from this curve also seem more random (though it's hard to tell with only 11 cases).

f) The equation for the log is
Estimated Log(Fixed Line Subscribers (Sub-Sahara, 000)) = -205.9254 + 0.1071677 Year
which implies growth at a rate of about 11% per year.
g) The fit is
Estimated Log (Mobile Subscribers (Sub-Sahara, 000)) = -1263.01 + 0.635364 Year
which implies a very high rate of growth, on the order of more than 63% annually. The fit is not so good and overstates the rate of growth (as shown in the scatterplot below), but it's clear that the mobile business is growing much, much faster than landlines.
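One way to compare the two growth rates in parts (f) and (g) is through the doubling times they imply. This is a minimal sketch that assumes both equations use natural logs and that the fitted exponential trend holds; the function name is hypothetical.

```python
import math

SLOPE_FIXED = 0.1071677   # log fixed-line subscribers on Year
SLOPE_MOBILE = 0.635364   # log mobile subscribers on Year

def doubling_time_years(log_slope):
    """Years needed to double under growth exp(log_slope * t): log(2) / slope."""
    return math.log(2) / log_slope

print(f"fixed lines double in about {doubling_time_years(SLOPE_FIXED):.1f} years")
print(f"mobile doubles in about {doubling_time_years(SLOPE_MOBILE):.1f} years")
```

Since part (g) notes that the mobile fit overstates the rate of growth, the mobile doubling time computed this way is likely too short.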

39. Pet foods, revisited
a) The content of the plots is the same; only the labels on the axes have changed. Natural logs (on the left) are larger than base 10 logs by a constant factor (see part d).
[Figure: Log Sales Volume versus Log Avg Price (left), and Log 10 Volume versus Log 10 Price (right)]

b) The relationship for natural logs, as found in the text of this chapter, is
Estimated Log Sales Volume = 11.050556 - 2.4420491 Log Avg Price
The estimated equation for base 10 logs has the same slope, but the intercept is smaller, in keeping with the scale of the plots:
Estimated Log 10 Volume = 4.7991954 - 2.4420491 Log 10 Price
c) The R² of both fits is 0.9546, and the s_e of the log 10 equation is smaller (0.026254 versus 0.060453) by the same factor that distinguishes the intercepts, 0.060453/0.026254 = 2.30262.
d) Log 10 and log e differ by a constant factor,
log_e x = 2.30262 log_10 x
The slope, this constant factor, is log_e 10. Hence, we can substitute one for another in equations, as long as we keep the factor 2.30262 in the right place.
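The conversion factor in parts (c) and (d) can be verified directly from the reported estimates. The sketch below uses only the numbers quoted above; the variable names are illustrative.

```python
import math

LN_10 = math.log(10)       # the constant factor, about 2.30262

B0_NATURAL = 11.050556     # intercept with natural logs
B0_BASE10 = 4.7991954      # intercept with base 10 logs
SE_NATURAL = 0.060453      # s_e with natural logs
SE_BASE10 = 0.026254       # s_e with base 10 logs

# Dividing the natural-log intercept by log_e(10) gives the base 10 intercept.
converted_intercept = B0_NATURAL / LN_10

# The ratio of the two s_e values is the same factor (up to rounding).
se_ratio = SE_NATURAL / SE_BASE10

# For any positive x, log_e(x) = log_e(10) * log_10(x).
x = 37.5
identity_gap = abs(math.log(x) - LN_10 * math.log10(x))

print(round(LN_10, 5), round(converted_intercept, 5), round(se_ratio, 5))
```

The slope needs no conversion because the factor appears on both sides of the equation and cancels.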
[Figure: Log Sales Volume versus Log 10 Volume]

e) Use any base that you'd like, so long as both x and y use the same base. The elasticity works out the same either way.
40. Movies
a) The scatterplot in the actual units is not well-suited to a linear equation. The plot fans out and bends as the movies become big sellers. There are low-lying outliers on the log scale, but the pattern seems more linear (though it too might bend a bit).
[Figure: Subsequent Sales ($MM) versus Box Office Gross ($MM), and Log 10 Subsequent Purchase versus Log 10 Gross]

b) The least-squares equation for the log10 of these variables is
Estimated Log 10 Subsequent Purchase = -1.39107 + 0.90055 Log 10 Gross
The elasticity is the slope, 0.90055.
c) The elasticity implies that each 1% increase in box office sales translates into a 0.9% increase in subsequent selling, on average and ignoring the effects of other factors.
d) The fitted equation using natural logs is similar, with only the intercept becoming larger in magnitude by the conversion factor 2.30262:
Estimated Log Subsequent Purchase = -3.203059 + 0.90055 Log Gross
The slope, or elasticity, is the same. Because base e logs are 2.30262 times base 10 logs, we can substitute one for the other. The constant factor cancels in the slope, since it appears in both x and y. The fit of the equation is the same in the sense that R² = 0.6847 in either case. The s_e is larger for natural logs by the factor 2.30262, but this is just a change in the scale of the y axis and does not imply a different quality of fit.
e) The outlying movies, with log10 gross larger than 1 (more than $10 million), are labeled in this plot on the log-log scale. These seem to be movies intended for young audiences. They do not generate as much selling on pay-per-view.
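The claim that either base gives the same answer can be checked by predicting from both fitted equations. The sketch below uses the coefficients quoted above; the $100 million gross is a hypothetical input chosen for illustration, as are the function names.

```python
import math

ELASTICITY = 0.90055

def predict_base10(gross_mm):
    """Predicted subsequent purchases ($MM) from the base 10 equation."""
    return 10 ** (-1.39107 + ELASTICITY * math.log10(gross_mm))

def predict_natural(gross_mm):
    """Predicted subsequent purchases ($MM) from the natural-log equation."""
    return math.exp(-3.203059 + ELASTICITY * math.log(gross_mm))

gross = 100.0  # hypothetical box office gross, $MM
from_base10 = predict_base10(gross)
from_natural = predict_natural(gross)

# Percentage change in the prediction when gross rises by 1%.
pct_change = 100 * (predict_base10(1.01 * gross) / from_base10 - 1)

print(round(from_base10, 3), round(from_natural, 3), round(pct_change, 3))
```

The two predictions agree up to rounding in the reported intercepts, and a 1% increase in gross raises the prediction by about 0.9%, matching the interpretation in part (c).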
[Figure: Log 10 Subsequent Purchase versus Log 10 Gross, with labeled points: A Bug's Life, Tarzan, Fantasia 2000, Jack Frost, Thomas and the Magic Railroad]

4M Cars in 1989

a) The equation will tell us how many pounds of weight reduction are needed. If the equation is linear, then the weight can be trimmed from either model with comparable benefit. If the equation is not linear (as in the example in the text), it will be more advantageous to take the weight from the smaller car (assuming that weight reductions are equally costly in the two situations).
b) The text examples strongly motivate a nonlinear, or curved, relationship. Namely, we expect to find that 1/mileage is linearly related to weight. The reciprocal of the response is likely needed.
c) We will have to rely on the interpretation of the model (the science of the problem) and the fit to the data. We won't be able to use summary statistics like R² or s_e because the response is likely to differ, making these comparisons inappropriate.
d) The scatterplot shows a strong negative association (r = -0.86) between mileage and weight. The pattern clearly bends in a fashion similar to that in the text of this chapter. The bending is more apparent near the ends of the data, for cars with relatively small or large weights.
[Figure: MPG City versus Weight (000 lbs)]

e) The linear equation (shown above) misses the pattern and conveys the impression that changes in weight have equal effects on mileage regardless of the weight of the car. The nonlinear equation (using Gallons per 100 miles as the response, green curve) captures the tendency for changes in weight to have more benefit at smaller weights. The equation of the fit is
Estimated Gallons/100 Miles = 0.9432339 + 1.3615948 Weight (000 lbs)
As the weight gets larger and larger, this equation will not predict the mileage to drop below zero.
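The fitted equation for gallons per 100 miles can be converted back to miles per gallon to see how the benefit of trimming weight depends on the size of the car. This is a small sketch using the coefficients above; the function names and the 250-pound reduction are chosen for illustration.

```python
B0 = 0.9432339   # gallons per 100 miles at zero weight (an extrapolation)
B1 = 1.3615948   # added gallons per 100 miles per 1,000 lbs of weight

def gallons_per_100_miles(weight_klbs):
    """Estimated fuel consumption for a car weighing weight_klbs thousand pounds."""
    return B0 + B1 * weight_klbs

def miles_per_gallon(weight_klbs):
    """Convert estimated consumption back to mileage."""
    return 100 / gallons_per_100_miles(weight_klbs)

# Mileage gained by removing 250 lbs from a light car and from a heavy car.
gain_small_car = miles_per_gallon(2.25) - miles_per_gallon(2.5)
gain_large_car = miles_per_gallon(3.75) - miles_per_gallon(4.0)

print(round(miles_per_gallon(2.5), 1), round(miles_per_gallon(4.0), 1))
print(round(gain_small_car, 2), round(gain_large_car, 2))
```

The equation is linear in gallons per 100 miles, so the same weight reduction saves the same amount of fuel for either car, but the gain in miles per gallon is larger for the lighter car.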

[Figure: Gallons/100 Miles versus Weight (000 lbs)]

f) The residuals from this equation track along the horizontal line representing the fit. The residual plot follows. There is some tendency toward larger variation in consumption among heavier cars, but the effect is small. The diagonal stripes visible in this plot come from rounding the mileage of cars to whole numbers. None of the residuals is exceptionally large. (The largest positive residual is the Mazda RX-7, a rotary-engine sports car.)
[Figure: Residual versus Weight (000 lbs)]

g) The equation in the text is
2004: Estimated Gallons/100 Miles = 1.11 + 1.21 Weight
compared to
1989: Estimated Gallons/100 Miles = 0.94 + 1.36 Weight
The intercept has gotten larger (more gas while not driving, perhaps from larger engines) and the estimated effects of changes in weight have gotten smaller. Weight appears to matter less; perhaps larger vehicles perform better with the modern designs. The R² for 2004 is 41% with s_e = 1.04 gallons per 100 miles. The equation fit to the 1989 data explains 76.5% of the variation in consumption with s_e = 0.42 gallons per 100 miles. The worse R² for the more recent data is probably due to the outlying sports cars present in the data for 2004.
h) Heavier cars use more gasoline for routine driving than smaller cars. On average, cars that weigh 3,500 pounds (a value in our data) use about 1.4 more gallons to drive 100 miles than cars that weigh 2,500 pounds. The effect of weight on mileage is not linear; reductions in weight have more benefit for smaller cars than for larger cars.
i) Based on my equation, I suggest taking the weight from the smaller car. A car that weighs 2,500 pounds (in 1989) gets about 23 miles per gallon. A reduction in the weight by 250 pounds, from 2,500 pounds to 2,250, would boost its mileage to 25 mpg, based on the fit of this equation. In comparison, a 4,000 pound car gets about 15.7 miles per gallon. To boost its mileage to 17.7 mpg requires a reduction in weight of more than twice this magnitude (a reduction of about 550 pounds, down to 3,450 pounds).

4M Crime and Housing in Philadelphia
a) Leaders could use such an equation to estimate the economic payback to the community (in terms of owning more sellable homes) of paying for greater police protection. Of course, such an analysis would also need to show that more police did in fact lower crime rates.
b) We can only describe the association between crime rates and housing prices using these data, not causation. These data allow us to contrast housing prices in communities that have different crime rates. There could be other, lurking factors that affect the housing prices that are not captured by the association between crime rates and housing prices for these communities. We did not manipulate the crime rate in any of these communities; we can only compare housing prices in different communities with different crime rates.
c) If we think of adding police to lower the crime rate, then the leader can think of these actions as manipulating the crime rate (albeit indirectly) to improve home values. Housing prices are the response affected by changes induced in the crime rate.
d) Not linearly related. A linear relationship implies that differences in the crime rate from 0 to 1 come with the same differences in housing values as between 20 and 21 crimes. At some point, incremental changes in the crime rate are lost in the multitude of crimes. You can almost imagine the splash in the local news of one crime in a very safe neighborhood, compared to the yawn in a crime-ridden area.
e) The direction is negative, with moderate strength (r = -0.43).
[Figure: House Price ($) versus Crime Rate]

f) The linear equation (shown in red) is
Estimated House Price ($) = 225233.55 - 2288.6894 Crime Rate
The intercept estimates that the average selling price of homes in a very safe neighborhood with no crimes is $225,233. That seems a bit low considering some of the outliers. The slope implies that communities that differ in the crime rate by one differ in average housing prices by $2,289 (lower in those with more crime). It's tempting to say that lowering the rate of crime by 1 produces a $2,289 increase in home prices, but that's sounding way too causal. The value of R² implies that the linear equation describes only 18.4% of the variation in prices, and s_e = $78,862 implies that prices vary considerably from this fit. The SD of the residuals (other factors aside from the crime rate that affect housing prices) is almost $80,000.
g) The fitted equation using the reciprocal of the crime rate is
Estimated House Price ($) = 98120.08 + 1,298,242.7 × 1/(Crime Rate)
The reciprocal of the crime rate counts the multiples of 100,000 people per crime rather than the number of crimes per 100,000 people.
h) The plots of the two sets of residuals are very similar (linear is on the left, and reciprocal on the right). The linear equation has a slightly larger R² and smaller s_e than the reciprocal model (R² = 17%, s_e = $79,565). We prefer the reciprocal because of the way that it captures the effect of different crime rates. (See i below.)
[Figure: Residual versus Crime Rate (linear fit on the left, reciprocal fit on the right)]

i) Going with the reciprocal, we find that communities with different crime rates also have different housing prices. The connection is weak, however, and describes less than 20% of the variation in housing prices among these communities. Differences among small crime rates, however, do have much larger effects on average than differences at higher crime rates. (See part j.)
j) No. The estimated difference in housing prices between communities with 1 and 2 crimes per 100,000 is huge; the estimated price goes from $1,396,362.7 at 1 crime per 100,000 to $747,241.35 at 2. A difference between crime rates of 11 and 12 is associated with an average difference of $216,142 - $206,307 = $9,835. A linear equation fixes the effect at the slope of that equation, $2,289.
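The comparison in part (j) can be reproduced from the reciprocal equation. The sketch below uses the coefficients from part (g); the function names are illustrative.

```python
B0 = 98120.08       # limiting price as the crime rate grows large
B1 = 1298242.7      # coefficient on 1/(crime rate)
LINEAR_SLOPE = 2288.6894   # magnitude of the slope of the linear equation

def estimated_price(crime_rate):
    """Estimated house price ($) from the reciprocal equation."""
    return B0 + B1 / crime_rate

def price_difference(low_rate, high_rate):
    """Estimated price gap between communities with two crime rates."""
    return estimated_price(low_rate) - estimated_price(high_rate)

diff_low_crime = price_difference(1, 2)      # very safe communities
diff_high_crime = price_difference(11, 12)   # more typical crime rates

print(f"{diff_low_crime:,.0f}  {diff_high_crime:,.0f}  {LINEAR_SLOPE:,.0f}")
```

Under the reciprocal equation, the same one-unit difference in the crime rate matters far more at low rates, whereas the linear equation assigns it the same value everywhere.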
