
Statistics Chapter 4 Project
Autumn & Lindsey
For our data set, we chose the senior class's
cumulative GPAs and ACT scores. We collected the
data by asking our classmates for their ACT score and
cumulative GPA. Our prediction was that the
students' GPAs and ACT scores would not correlate.
Our data is a population: the senior class at Fowler
High School.
Explanatory & Response Variable
The explanatory variable: cumulative GPA
The response variable: ACT score
The explanatory variable is the independent variable.
It is the variable we use to explain or predict the
other one. Therefore, cumulative GPA is the
explanatory variable because we use it to predict the
ACT score.
The response variable is the dependent variable. A
dependent variable's value depends on other
variables. Therefore, ACT score is the response
variable because its value is predicted from another
variable, which in our case is the GPA.

Scatter Diagram

Our scatter plot shows moderate positive correlation.
If the correlation were strong, the data points would
have been closer to the line. If the correlation were
weak, the data points would not have been as close to
the line as they are.
Correlation Coefficient
The correlation coefficient is a number between -1 and
+1 that measures the strength and direction of the
linear relationship between two variables. To find our
correlation coefficient, we typed our data points into
the L1 and L2 lists in the calculator and then ran a
linear regression. The r value measures how strongly
the two variables are correlated.
Our correlation coefficient is 0.659.
This correlation coefficient, 0.659, shows that our
data has a moderate positive correlation. Since the
number is not very close to 1, the correlation is not
strong.
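The calculator's r value can also be computed by hand. Below is a minimal Python sketch of the standard Pearson formula. The function name pearson_r and the six sample pairs are made up for illustration; they are not the 49 students' actual data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical sample (NOT the real class data)
gpa = [2.0, 2.5, 3.0, 3.2, 3.6, 4.0]
act = [17, 22, 19, 24, 21, 27]

r = pearson_r(gpa, act)
print(round(r, 3))
```

The result is always between -1 and +1, and swapping the two lists gives the same r.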
Xbar & Ybar
The Xbar is 3.2
The Xbar tells us the average GPA in our data set. We
found the Xbar by adding all the GPAs together and
dividing by the number of students in our data set,
which is 49.
The Ybar is 21.9
The Ybar tells us the average ACT score in our data
set. We found the Ybar by adding all the ACT scores
together and dividing by 49, which is the number of
students in our data set.
Least Squares Regression
The least squares regression line is the straight line
that best fits our data: it makes the sum of the
squared vertical distances from the data points to the
line as small as possible. We can use it to predict
someone's ACT score or GPA depending on what
information we are missing. For example, if we know
someone's GPA but are missing their ACT score, we
can follow the line in order to predict their ACT score.
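Our least squares regression line is y = 0.11x + 0.75
These coefficients fit our data with the ACT score as x
and the GPA as y: plugging in the average ACT score
gives 0.11(21.9) + 0.75 ≈ 3.2, the average GPA.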
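The slope and intercept that the calculator reports can be reproduced with the usual least squares formulas. Below is a minimal Python sketch. The function name least_squares and the sample values are made up for illustration and are not the class's actual data.

```python
def least_squares(xs, ys):
    """Return (slope, intercept) of the least squares line y = slope*x + intercept."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    # The line always passes through the point (x_bar, y_bar)
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical sample (NOT the real class data)
gpa = [2.0, 2.5, 3.0, 3.2, 3.6, 4.0]
act = [17, 22, 19, 24, 21, 27]

slope, intercept = least_squares(gpa, act)
print(round(slope, 3), round(intercept, 3))
```

Because the intercept is built from the two averages, plugging the average x into the line always gives back the average y.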

Marginal Change
Marginal change is the amount the predicted y-value
changes when x increases by one unit. In other words,
marginal change is the slope of the regression line. To
find the marginal change, we used the calculator.
a = 0.11
Our marginal change is 0.11, which means that each
one-unit increase in x raises the predicted y by about
0.11. The size of the slope depends on the units of the
variables, so it does not tell us how well correlated our
data is; the correlation coefficient does that.
Influential Points
The points that might influence our data are (23, 2.2),
(18, 1.6), and (21, 4), written as (ACT score, GPA).
However, since our data is moderately correlated and
we have a big data set, removing these influential
points will not affect our correlation coefficient very
much and most likely will not make our data fit the
trend line more closely.
We found our influential points by hovering over the
points on our scatter diagram.
Coefficient of Determination
The coefficient of determination, r², is the fraction of
the variation in the response variable that is explained
by the regression line. To find the coefficient of
determination, we put our data into the L1 and L2
lists, ran a linear regression, and found the r² value.
The coefficient of determination is 0.434.

From this number, we can see that the regression line
explains only about 43% of the variation in our data.
The linear relationship is moderate because this
number is not close to 1.
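As a quick check, the coefficient of determination is the square of the correlation coefficient, so the two reported values should agree. A short Python sketch using the r value reported above:

```python
r = 0.659            # correlation coefficient reported in this project
r_squared = r ** 2   # coefficient of determination
print(round(r_squared, 3))  # 0.434
```

The squared value matches the 0.434 from the calculator.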
Explained & Unexplained Variation
The explained variation is the sum of the squares of
the differences between each predicted y-value and
the mean of y.
The unexplained variation is the sum of the squares of
the differences between the y-value of each ordered
pair and the corresponding predicted y-value.
The explained variation for our data set is 43% of the
total variation. It may be low because several data
points are far away from our line.
The unexplained variation for our data set is 57%. It
may be high because of factors other than GPA. For
example, many students may have had the flu or
another sickness the week of the ACT.
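The two definitions above can be turned into a short calculation. Below is a minimal Python sketch that splits the total variation into its explained and unexplained parts. The function names fit_line and variation and the sample values are made up for illustration and are not the class's actual data.

```python
def fit_line(xs, ys):
    """Least squares slope and intercept."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return slope, y_bar - slope * x_bar

def variation(xs, ys):
    """Return (explained, unexplained, total) variation for the fitted line."""
    slope, intercept = fit_line(xs, ys)
    y_bar = sum(ys) / len(ys)
    predicted = [slope * x + intercept for x in xs]
    explained = sum((p - y_bar) ** 2 for p in predicted)            # predicted y vs mean of y
    unexplained = sum((y - p) ** 2 for y, p in zip(ys, predicted))  # actual y vs predicted y
    total = sum((y - y_bar) ** 2 for y in ys)                       # actual y vs mean of y
    return explained, unexplained, total

# Hypothetical sample (NOT the real class data)
gpa = [2.0, 2.5, 3.0, 3.2, 3.6, 4.0]
act = [17, 22, 19, 24, 21, 27]

explained, unexplained, total = variation(gpa, act)
print(round(explained / total, 3))    # fraction explained (equals r squared)
print(round(unexplained / total, 3))  # fraction unexplained
```

The two fractions always add up to 1, which is why 43% explained goes with 57% unexplained.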
Lurking Variables
Some lurking variables for our data set include:
1. Some people may not have been honest about their
GPA and ACT scores
2. Some people may have given their quarter GPA and
not their cumulative GPA
3. The ACT can be taken more than once, so someone
who took it more than once may not have given
their best ACT score
4. Some people didn't know their GPA and ACT, so
they probably guessed
5. Some people may have had a sickness the week
of the ACT, which would affect their scores
6. Some people's GPAs may have been incorrect
due to a grading mistake
Interpolation & Extrapolation
Interpolation is estimating a value inside the range of
the values we observed. Extrapolation is estimating a
value beyond the original observed range.
Interpolation: 1.146
Extrapolation: 3
To get the interpolation, we picked a GPA between
our lowest GPA (1.6) and highest GPA (4.0). We picked
3.5 and plugged that number into our regression
equation as x to see what we got for y.
For extrapolation, we picked a number that wasn't
between 1.6 and 4.0. We picked 1 and then extended
our graph to see where the trend line reached a GPA
of 1 in order to find the predicted ACT score of
someone with a GPA of 1.
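The difference between the two kinds of estimate comes down to whether the chosen value falls inside the observed range. Below is a minimal Python sketch using the GPA range and the two picks described above. The function name prediction_type is made up for illustration.

```python
def prediction_type(x, observed_min, observed_max):
    """Say whether predicting at x is interpolation or extrapolation."""
    if observed_min <= x <= observed_max:
        return "interpolation"
    return "extrapolation"

# Lowest and highest GPA in the data set, as reported above
LOWEST_GPA = 1.6
HIGHEST_GPA = 4.0

print(prediction_type(3.5, LOWEST_GPA, HIGHEST_GPA))  # interpolation
print(prediction_type(1.0, LOWEST_GPA, HIGHEST_GPA))  # extrapolation
```

Extrapolated predictions are less trustworthy because there is no data near that value to show that the trend still holds.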
Conclusion
After our project, we determined that the seniors'
cumulative GPAs and ACT scores have a moderate
positive correlation.
