
Rubric Table 1.1: Please consult the given article, "Statistics review 7: Correlation and regression" by Viv Bewick, Liz Cheek and Jonathan Ball, to solve the questions.
Reference Focus questions +Marks Solution
1st preference: consult the given article.
2nd preference: consult Video Lectures # 32 & 33 and the recommended book(s).
1. How would you see the role of product moment correlation and regression analysis?
(Marks 02)









2. The product moment correlation is also known as the Pearson correlation coefficient. It has a few other names; please write at least three.
(Marks 03)



3. In which scenario is product moment correlation (r) best used, as compared to regression analysis and ANOVA?
(Marks 03)


Correlation and linear regression are the most commonly used techniques for investigating the relationship between two quantitative variables. The role of a correlation analysis is to see whether two measurement variables covary and to quantify the strength of the relationship between them, whereas regression expresses the relationship in the form of an equation.

The following are three other names used for the Pearson correlation coefficient:
1. Pearson product-moment correlation coefficient (PPMCC).
2. Pearson's r, also abbreviated PCC.
3. Simple correlation coefficient, or bivariate correlation.
Spearman's rho is a separate, rank-based coefficient and is not another name for Pearson's r.

Product moment correlation is the best choice, compared with regression or ANOVA, when there is a linear relationship between two quantitative variables and the aim is to quantify the strength of that relationship.

4. Misuse of correlation is described in the given article; please highlight three (3) of them.
(Marks 09)

5. In which situation would the regression mean square be approximately the same as the residual mean square?
(Marks 03)

The following are common situations in which the correlation coefficient can be misinterpreted:
1. Failure to consider that a third variable, related to both of the variables being investigated, may be responsible for the apparent correlation. Correlation does not imply causation.
2. A non-linear relationship may exist between two variables that would be inadequately described, or possibly even undetected, by the correlation coefficient.
3. When two methods of measurement are compared, a high correlation can be incorrectly taken to mean that the two methods agree. An analysis that investigates the differences between pairs of observations is more appropriate.
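The second and third misuses can be illustrated with a small sketch. The data below are invented for illustration and do not come from the article; `pearson_r` is a hypothetical helper that implements the usual product moment formula by hand.

```python
import math


def pearson_r(x, y):
    """Product moment correlation coefficient of paired samples x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Misuse 2: y depends exactly on x, but the relationship is quadratic.
x = [-3, -2, -1, 0, 1, 2, 3]
y_curved = [v ** 2 for v in x]
r_curved = pearson_r(x, y_curved)  # near 0 despite the exact dependence

# Misuse 3: two hypothetical methods that always differ by 5 units.
method_a = [10, 12, 15, 18, 20]
method_b = [a + 5 for a in method_a]
r_methods = pearson_r(method_a, method_b)  # near 1, yet the methods disagree
differences = [b - a for a, b in zip(method_a, method_b)]
mean_difference = sum(differences) / len(differences)

print(f"r for y = x^2: {r_curved:.3f}")
print(f"r between methods: {r_methods:.3f}, mean difference: {mean_difference:.1f}")
```

In the first case r misses a relationship that a scatter plot would show at once. In the second, r is high even though every pair of readings differs, which is why the differences between pairs are the better thing to examine.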



If there were no linear relationship between the variables, the regression mean square would be approximately the same as the residual mean square. We can test the null hypothesis that there is no linear relationship using an F test. The test statistic is calculated as the regression mean square divided by the residual mean square, and a P value may be obtained by comparing the test statistic with the F distribution with 1 and n - 2 degrees of freedom.

6. In what ways can regression analysis be used?
(Marks 10)



Regression analysis is a statistical process for estimating the relationships among variables. It includes many techniques for modeling and analyzing several variables when the focus is on the relationship between a dependent variable and one or more independent variables. More specifically, regression analysis helps one understand how the typical value of the dependent variable changes when any one of the independent variables is varied while the other independent variables are held fixed. Regression analysis is widely used for prediction and forecasting, where its use has substantial overlap with the field of machine learning. Many techniques for carrying out regression analysis have been developed. Familiar methods such as linear regression and ordinary least squares regression are parametric, in that the regression function is defined in terms of a finite number of unknown parameters that are estimated from the data. Regression results can also be presented as an analysis of variance, which leads to a useful quantity, the coefficient of determination.
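The analysis of variance for a simple linear regression, the F statistic with 1 and n - 2 degrees of freedom, and the coefficient of determination can be sketched as follows. This is a minimal illustration under assumptions: the data are invented, and `fit_line` and `regression_anova` are hypothetical helper names, not functions from the article or from any library.

```python
def fit_line(x, y):
    """Least squares intercept and slope for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def regression_anova(x, y):
    """Split the total sum of squares into regression and residual parts."""
    n = len(x)
    intercept, slope = fit_line(x, y)
    mean_y = sum(y) / n
    fitted = [intercept + slope * a for a in x]
    ss_reg = sum((f - mean_y) ** 2 for f in fitted)        # 1 degree of freedom
    ss_res = sum((b - f) ** 2 for b, f in zip(y, fitted))  # n - 2 degrees of freedom
    ms_reg = ss_reg / 1
    ms_res = ss_res / (n - 2)
    return {
        "ss_reg": ss_reg,
        "ss_res": ss_res,
        "f_stat": ms_reg / ms_res,  # compare with F(1, n - 2) for a P value
        "r_squared": ss_reg / (ss_reg + ss_res),
    }


# Invented example data with a strong linear trend.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
result = regression_anova(x, y)
print(result)
```

When there is no linear relationship, the two mean squares are of similar size and the F statistic is close to 1. A large F statistic, as with these data, is evidence against the null hypothesis. The P value itself requires the F distribution, for example `scipy.stats.f.sf(f_stat, 1, n - 2)` if SciPy is available.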






7. What are the assumptions made by the regression model in estimating the parameters and in significance testing?
(Marks 05)










8. In which case should the response variable have a Normal distribution and the variability of y be the same for each value of the predictor variable?
(Marks 05)





When using a regression equation for prediction, errors in prediction may not be purely random but may be due to inadequacies in the model. In particular, extrapolating beyond the range of the data is very risky. A phenomenon to be aware of that may arise with repeated measurements on individuals is regression to the mean: an individual with a low first reading tends to have a higher second reading, and conversely, which can generate misleading results.
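Regression to the mean can be seen in a small simulation. The sketch below is an illustration under assumptions, not an analysis from the article: each reading is modeled as a stable true value plus independent measurement noise, and all of the numbers (mean 100, standard deviation 10, cut-off 90) are arbitrary.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumed model: reading = stable true value + independent measurement noise.
true_values = [rng.gauss(100, 10) for _ in range(2000)]
first = [t + rng.gauss(0, 10) for t in true_values]
second = [t + rng.gauss(0, 10) for t in true_values]

# Select individuals whose FIRST reading was low, then compare both readings.
low_group = [i for i, reading in enumerate(first) if reading < 90]
mean_first = sum(first[i] for i in low_group) / len(low_group)
mean_second = sum(second[i] for i in low_group) / len(low_group)

print(f"low group size: {len(low_group)}")
print(f"mean first reading:  {mean_first:.1f}")
print(f"mean second reading: {mean_second:.1f}")
```

Nothing changed between the two readings in this model, yet the group selected for low first readings has a higher average the second time. Reading that rise as a real improvement would be the kind of misleading result described above.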



For regression, only the response variable y must be random. When carrying out hypothesis tests or calculating confidence intervals for the regression parameters, the response variable should have a Normal distribution and the variability of y should be the same for each value of the predictor variable.
