[Figure: plot with horizontal axis "% Obama 2008", ticks from 40 to 70]
+
Simple Linear Regression (SLR)
Models
When data are collected in pairs, the standard notation is:
$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$,
where $x_1$ denotes the first value of the X-variable and $y_1$ denotes the first value of the Y-variable. The X-variable is called the explanatory variable, while the Y-variable is called the response variable.
+
Simple Linear Regression (SLR)
Models
For example, the first few entries in the election data set above were
(37.7, 40.8), (38.8, 38.4), (38.8, 36.9),
meaning 37.7% of Alabama voters voted for President Obama in 2008, while 40.8% did so in 2012, and so on.
+
Simple Linear Regression (SLR)
Models
We could use a matched pairs t-test to test
whether the same percentage of voters
voted for President Obama in 2008 and
2012 in the 50 states, on average. If the
percentages were indeed equal, what would
that mean for our regression model?
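# R: read the election data (percent voting for Obama, by state)
election <- read.csv("/Users/Elizabeth//stat608/sp13/pct_obama.csv")
# Regress the 2012 percentage on the 2008 percentage
my.lm <- lm(election$Obama_12 ~ election$Obama_08)
my.lm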
/* SAS: fit the regression, then plot the data */
proc reg data=election;
  model Obama12=Obama08;
run;
proc gplot data=election;
  plot y*x;
run;
+
Example: Election Data
Explanatory variable: % voting for Obama in 08
Response variable: % voting for Obama in 12
Coefficients:
      (Intercept)  election$Obama_08
           -4.461              1.042
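To show how these two coefficients are read, here is a minimal sketch that plugs the reported intercept and slope into the fitted line. The helper name `predict_obama12` and the input value of 50 are hypothetical, chosen only for illustration; they are not part of the data set.

```r
# Coefficients as reported in the output above
b0 <- -4.461   # intercept
b1 <- 1.042    # slope on % Obama 2008

# Hypothetical helper: fitted % Obama 2012 for a given % Obama 2008
predict_obama12 <- function(pct08) {
  b0 + b1 * pct08
}

# Illustrative input (not a state from the data set)
fitted_50 <- predict_obama12(50)
print(fitted_50)   # about 47.6
```

Under this fit, each additional percentage point in 2008 corresponds to an estimated 1.042 additional points in 2012.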
http://en.wikipedia.org/wiki/Matrix_calculus#Derivatives_with_matrices
George Box
Inference
Objective: Develop hypothesis tests and
confidence intervals for the slope and
intercept of a least squares regression model
1. Assumptions: Next chapter
2. Bias of Estimates
3. Variability of Estimates
In English: Is the percentage of voters in each state who voted for Obama in 2008 linearly related to the percentage in 2012?
+
Review: Expectation and
Variance of Random Variables
Recall the definition of variance:
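For reference, the standard definition for a random variable $Y$ is given below, together with the usual rule for a linear transformation. The symbols $Y$, $a$, and $b$ are notation assumed here, not necessarily the slide's own.

```latex
\operatorname{Var}(Y) = E\big[(Y - E[Y])^2\big] = E[Y^2] - \big(E[Y]\big)^2,
\qquad
\operatorname{Var}(aY + b) = a^2 \operatorname{Var}(Y)
\quad \text{for constants } a, b.
```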
When most values are in the bottom left and top right of the plot, the correlation is positive.
r = 0.98: positive relationship between 2008 and 2012.
[Figure: scatterplot, horizontal axis "% Obama 2008"]
+
Expectation and Variance of
Matrices
Assume x is a vector of random variables. Then:
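The standard results for a random vector can be sketched as follows. Here $A$ is a constant matrix and $\mathbf{b}$ a constant vector; both names are assumptions introduced for illustration.

```latex
E[\mathbf{x}] = \begin{pmatrix} E[x_1] \\ \vdots \\ E[x_n] \end{pmatrix},
\qquad
\operatorname{Var}(\mathbf{x}) = E\big[(\mathbf{x} - E[\mathbf{x}])(\mathbf{x} - E[\mathbf{x}])^{\top}\big],

E[A\mathbf{x} + \mathbf{b}] = A\,E[\mathbf{x}] + \mathbf{b},
\qquad
\operatorname{Var}(A\mathbf{x} + \mathbf{b}) = A\,\operatorname{Var}(\mathbf{x})\,A^{\top}.
```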
+
Variance / Covariance Matrix
+
Correlation Matrix
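As a rough illustration of how the two matrices relate, the sketch below computes the sample versions of the variance/covariance and correlation matrices in R. The data values are made up, and the sample matrices are only the empirical analogues of the population quantities.

```r
# Small made-up data set (hypothetical values, for illustration only)
x1 <- c(1, 2, 3, 4, 5)
x2 <- c(2, 1, 4, 3, 5)
dat <- cbind(x1, x2)

S <- cov(dat)      # sample variance/covariance matrix
Rmat <- cor(dat)   # sample correlation matrix

# The correlation matrix is S rescaled by the standard deviations:
# divide entry (i, j) by sd_i * sd_j
D_inv <- diag(1 / sqrt(diag(S)))
R_from_S <- D_inv %*% S %*% D_inv

print(S)
print(Rmat)
```

The diagonal of `S` holds the variances, the off-diagonal entries hold the covariances, and the diagonal of `Rmat` is always 1.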
Usual Assumptions
In matrix notation:
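One common way to write the usual assumptions in matrix form is sketched below. The symbols $\boldsymbol{\beta}$, $\boldsymbol{\varepsilon}$, and $\sigma^2$ are assumed notation; the normality statement is typically added when exact t- and F-based inference is wanted.

```latex
\mathbf{y} = X\boldsymbol{\beta} + \boldsymbol{\varepsilon},
\qquad
E[\boldsymbol{\varepsilon} \mid X] = \mathbf{0},
\qquad
\operatorname{Var}(\boldsymbol{\varepsilon} \mid X) = \sigma^2 I_n,
\qquad
\boldsymbol{\varepsilon} \mid X \sim N(\mathbf{0}, \sigma^2 I_n).
```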
+
Bias
Intercept
Variance of the estimates
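Under the usual assumptions (linear mean function, errors with mean zero and constant variance $\sigma^2$, uncorrelated), the standard derivation of the bias and variance of the least squares estimates runs as sketched below. The notation $SXX = \sum_i (x_i - \bar{x})^2$ is an assumption introduced here.

```latex
\hat{\boldsymbol{\beta}} = (X^{\top}X)^{-1}X^{\top}\mathbf{y}

E[\hat{\boldsymbol{\beta}} \mid X]
  = (X^{\top}X)^{-1}X^{\top}E[\mathbf{y} \mid X]
  = (X^{\top}X)^{-1}X^{\top}X\boldsymbol{\beta}
  = \boldsymbol{\beta}
  \quad \text{(unbiased)}

\operatorname{Var}(\hat{\boldsymbol{\beta}} \mid X)
  = (X^{\top}X)^{-1}X^{\top}\,(\sigma^2 I_n)\,X(X^{\top}X)^{-1}
  = \sigma^2 (X^{\top}X)^{-1}

\text{In simple linear regression:}\quad
\operatorname{Var}(\hat{\beta}_1 \mid X) = \frac{\sigma^2}{SXX},
\qquad
\operatorname{Var}(\hat{\beta}_0 \mid X) = \sigma^2\left(\frac{1}{n} + \frac{\bar{x}^2}{SXX}\right).
```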
+
Inferences About the Slope and
Intercept
What does it mean to take the expectation of the
sample slope?
+
Inferences About the Slope
Distribution of slope:
The standard error is the standard deviation of a statistic (in practice, usually its estimate computed from the sample).
Don't mix up the critical value with the test statistic calculated from the data! Look up the critical value in a table, or use R.
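To make the distinction concrete, here is a minimal sketch in R that computes both quantities for the slope. The data values are hypothetical, and the two-sided 5% level is an assumed choice.

```r
# Hypothetical data, for illustration only
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2)
n <- length(x)

fit <- lm(y ~ x)
est <- coef(summary(fit))["x", "Estimate"]     # sample slope
se  <- coef(summary(fit))["x", "Std. Error"]   # its standard error

# Test statistic for H0: slope = 0, calculated from the data
t_stat <- est / se

# Critical value for a two-sided 5% test: a t quantile with n - 2
# degrees of freedom, which does not depend on the observed y values
t_crit <- qt(0.975, df = n - 2)

# 95% confidence interval for the slope
ci <- est + c(-1, 1) * t_crit * se

print(c(t_stat = t_stat, t_crit = t_crit))
print(ci)
```

The test statistic changes with every new sample; the critical value depends only on the chosen level and the degrees of freedom.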
Variance of mean
+
21 total men; 10 took the calcium supplement, and 11 took the placebo.
+
my.lm<-lm(y~x)
summary(my.lm)
Output:
Geometric Interpretation of
Let V be the vector space spanned by the columns of our design matrix X.
Then $\hat{y} = X\hat{\beta}$ is the projection of the vector y onto the vector space V.
Is a in V?
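The projection idea can be checked numerically. The sketch below builds the projection matrix onto the column space of a design matrix and verifies two of its properties; the data values and the names `H`, `y_hat`, and `e_hat` are assumptions introduced for illustration.

```r
# Hypothetical data, for illustration only
x <- c(1, 2, 3, 4, 5)
y <- c(1.2, 1.9, 3.2, 3.8, 5.1)

X <- cbind(1, x)                         # design matrix: intercept column and x
H <- X %*% solve(t(X) %*% X) %*% t(X)    # projection matrix onto V
y_hat <- H %*% y                         # projection of y onto V
e_hat <- y - y_hat                       # residual vector

# Residuals are orthogonal to every column of X (zero up to rounding)
print(t(X) %*% e_hat)

# Projecting twice changes nothing: H %*% H equals H (up to rounding)
print(max(abs(H %*% H - H)))
```

Because `y_hat` is a linear combination of the columns of `X`, it lies in V, while the residual vector is perpendicular to V.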
+
Geometric Interpretation of
Is it true that
What is
+
ANalysis Of VAriance (ANOVA)
$y_i = (\beta_0 + \beta_1 x_i) + (\varepsilon_i)$
ANOVA Table
Source               Degrees of Freedom   Sums of Squares   Mean Squares   F-Statistic   P-value
Regression (Model)   1
Residual (Error)     n - 2
Total                n - 1
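The remaining cells of the table can be computed from a fitted model. The sketch below does this by hand and compares the result with R's built-in `anova()`; the data values and the names `ss_reg`, `rss`, and `sst` are hypothetical.

```r
# Hypothetical data, for illustration only
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2)
n <- length(y)

fit <- lm(y ~ x)
ss_reg <- sum((fitted(fit) - mean(y))^2)   # regression (model) SS, 1 df
rss    <- sum(resid(fit)^2)                # residual (error) SS, n - 2 df
sst    <- sum((y - mean(y))^2)             # total SS, n - 1 df

# Mean squares are sums of squares divided by their degrees of freedom;
# the F-statistic is the ratio of the two mean squares
f_stat  <- (ss_reg / 1) / (rss / (n - 2))
p_value <- pf(f_stat, df1 = 1, df2 = n - 2, lower.tail = FALSE)

print(c(SSreg = ss_reg, RSS = rss, SST = sst, F = f_stat, p = p_value))
print(anova(fit))   # R's built-in table, for comparison
```

The regression and residual sums of squares add up to the total, just as their degrees of freedom do: 1 + (n - 2) = n - 1.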