
Correlation analysis tells us whether an association between two variables 'x' and 'y' is present or absent. It represents the linear relationship between the two variables, and the correlation coefficient indicates the extent to which they move together. Pearson's correlation coefficient is the test statistic that measures the statistical relationship, or association, between two continuous variables. It is regarded as the best method of measuring the association between variables of interest because it is based on the method of covariance. It gives information about the magnitude of the association as well as the direction
of the relationship.
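
As a minimal numerical sketch (with made-up paired values, written in Python with NumPy), Pearson's r can be computed from its covariance-based definition and cross-checked against the library routine:

```python
import numpy as np

# Hypothetical paired measurements of two continuous variables x and y.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.3])

# Pearson's r from its covariance-based definition:
#   r = cov(x, y) / (sd(x) * sd(y))
cov_xy = np.cov(x, y, ddof=1)[0, 1]
r_manual = cov_xy / (np.std(x, ddof=1) * np.std(y, ddof=1))

# The same value from the built-in correlation matrix.
r_builtin = np.corrcoef(x, y)[0, 1]

print(f"r from covariance:  {r_manual:.3f}")
print(f"r from np.corrcoef: {r_builtin:.3f}")
```

The sign of r gives the direction of the association and its absolute value gives the magnitude.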

Correlation coefficient
The degree of association is measured by a correlation coefficient, denoted by r. It is sometimes
called Pearson's correlation coefficient after its originator and is a measure of linear association.
If a curved line is needed to express the relationship, other, more complicated measures of
correlation must be used.

The correlation coefficient is measured on a scale that varies from +1 through 0 to -1.


Complete correlation between two variables is expressed by either +1 or -1. When one variable
increases as the other increases, the correlation is positive; when one decreases as the other
increases, it is negative. Complete absence of correlation is represented by 0. Figure 11.1 gives
some graphical representations of correlation.

Figure 11.1: Correlation illustrated.

• Positive correlation – as one variable increases, the other variable also tends to increase;
• Negative correlation – as one variable increases, the other variable tends to decrease;
• No correlation – as one variable increases, the other tends neither to increase nor to decrease (see the numerical sketch below).
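
In place of the figure, a short sketch with simulated data (hypothetical values generated in Python) illustrates the three situations; r comes out close to +1, close to -1, and near 0 respectively:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200)
noise = rng.normal(size=200)

# Three illustrative cases, mirroring the figure described above.
y_pos = 2 * x + noise            # positive: y tends to rise with x
y_neg = -2 * x + noise           # negative: y tends to fall as x rises
y_none = rng.normal(size=200)    # none: y is unrelated to x

for label, y in [("positive", y_pos), ("negative", y_neg), ("none", y_none)]:
    r = np.corrcoef(x, y)[0, 1]
    print(f"{label:>8} correlation: r = {r:+.2f}")
```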

Basis for comparison | Correlation | Regression
Meaning | A statistical measure that determines the co-relationship or association of two variables. | Describes how an independent variable is numerically related to the dependent variable.
Usage | To represent the linear relationship between two variables. | To fit a best line and estimate one variable on the basis of another variable.
Dependent and independent variables | No distinction between the two variables. | The dependent and independent variables play different roles.
Indicates | The correlation coefficient indicates the extent to which two variables move together. | Regression indicates the impact of a unit change in the known variable (x) on the estimated variable (y).
Objective | To find a numerical value expressing the relationship between the variables. | To estimate the values of a random variable on the basis of the values of a fixed variable.
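
As a rough illustration of the distinction in the table (hypothetical data; scipy.stats.linregress is used for the fits), the correlation coefficient is a single symmetric, unitless number, whereas the regression slope states the change in y per unit change in x and depends on which variable is treated as dependent:

```python
import numpy as np
from scipy import stats

# Hypothetical data: x is the known (independent) variable, y the estimated one.
x = np.array([10, 12, 15, 18, 20, 23, 27, 30], dtype=float)
y = np.array([4.1, 4.8, 5.9, 7.2, 7.9, 9.1, 10.6, 11.9])

# Correlation: one unitless number, identical whichever variable is listed first.
print("r(x, y) =", round(np.corrcoef(x, y)[0, 1], 3))
print("r(y, x) =", round(np.corrcoef(y, x)[0, 1], 3))

# Regression: the slope is the change in y per unit change in x, and the
# regression of y on x is not the same line as the regression of x on y.
fit_y_on_x = stats.linregress(x, y)
fit_x_on_y = stats.linregress(y, x)
print("slope of y on x:", round(fit_y_on_x.slope, 3))
print("slope of x on y:", round(fit_x_on_y.slope, 3))
```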

Regression analysis
Regression analysis is a collection of statistical techniques that serve as a basis for drawing
inferences about relationships among interrelated variables. Since these techniques are
applicable in almost every field of study, including the social, physical and biological sciences,
business and engineering, regression analysis is now perhaps the most widely used of all data
analysis methods.
Uses of regression analysis
Regression analysis serves three main purposes: (1) prediction, (2) model specification and
(3) parameter estimation. Regression equations are designed primarily to make predictions, but
good predictions are not possible if the model is incorrectly specified or the accuracy of the
parameter estimates is not ensured. Accurate prediction and model specification require that all
relevant variables be accounted for in the data and that the prediction equation be defined in the
correct functional form for all predictor variables. Parameter estimation is the most demanding
of the three, because it requires not only that the model be correctly specified and the predictions
accurate, but also that the data allow good estimation; multicollinearity, for example, creates
problems and may rule out some estimators. Thus limitations of the data, and the inability to
measure all predictor variables relevant to a study, restrict the use of prediction equations.
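
As one possible, simplified screen for the multicollinearity problem mentioned above (simulated predictors; in practice variance inflation factors are also commonly checked), near-duplicate predictors show pairwise correlations close to +1 or -1:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical predictors: x2 is nearly a copy of x1, so the two columns
# carry almost the same information (multicollinearity); x3 is independent.
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.05, size=100)
x3 = rng.normal(size=100)
X = np.column_stack([x1, x2, x3])

# Pairwise correlations among predictors close to +/-1 warn that the
# corresponding coefficient estimates will be unstable.
print(np.round(np.corrcoef(X, rowvar=False), 2))
```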
A. Linear regression: Linear regression is the most basic and commonly used regression
technique and comes in two forms: simple and multiple regression. Simple linear regression
is used when there is a single dependent and a single independent variable. Both variables
must be continuous (quantitative), and the line describing the relationship is a straight
line (linear).
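
A minimal sketch of simple linear regression, assuming hypothetical data for a single continuous predictor and a single continuous outcome (scipy.stats.linregress returns the slope, intercept and correlation of the fitted straight line):

```python
import numpy as np
from scipy import stats

# Hypothetical single continuous predictor (x) and continuous outcome (y).
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([2.3, 2.9, 4.1, 4.8, 6.2, 6.8, 8.1, 8.7])

# Fit the straight line y = intercept + slope * x.
fit = stats.linregress(x, y)
print(f"y = {fit.intercept:.2f} + {fit.slope:.2f} * x")
print(f"r = {fit.rvalue:.3f}, p = {fit.pvalue:.4f}")

# Use the fitted line to estimate y for a new value of x.
x_new = 9.0
print("predicted y at x = 9:", round(fit.intercept + fit.slope * x_new, 2))
```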
Multiple linear regression, on the other hand, can be used when we have one continuous
dependent variable and two or more independent variables, for example when we want to
answer the question, "Given that a patient has both diabetes and hypertension, what is his risk
of developing coronary artery disease (CAD)?" Importantly, the independent variables can
be quantitative or qualitative. Both independent variables here could be expressed either
as continuous data (blood pressure or HbA1c values) or as qualitative data (presence or absence
of diabetes as defined by the ADA 2016 criteria, or of hypertension as defined by JNC VIII). A linear
relationship should exist between the dependent and independent variables.
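
A possible sketch of the CAD example, with entirely hypothetical patient values: two 0/1 indicators (diabetes, hypertension) and one continuous predictor (HbA1c) are used jointly to model a continuous risk score with scikit-learn's LinearRegression:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Entirely hypothetical patient records: two 0/1 indicators and one
# continuous predictor, with a continuous CAD risk score as the outcome.
diabetes     = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 0])
hypertension = np.array([1, 1, 0, 1, 0, 1, 0, 0, 1, 0])
hba1c        = np.array([8.1, 5.4, 7.9, 9.2, 5.1, 6.0, 7.5, 5.3, 8.8, 5.6])
cad_risk     = np.array([0.42, 0.18, 0.35, 0.55, 0.08, 0.22, 0.30, 0.10, 0.50, 0.12])

X = np.column_stack([diabetes, hypertension, hba1c])
model = LinearRegression().fit(X, cad_risk)

print("coefficients (diabetes, hypertension, HbA1c):", np.round(model.coef_, 3))
print("intercept:", round(model.intercept_, 3))

# Predicted risk for a new patient with both conditions and HbA1c = 8.5.
print("predicted risk:", round(model.predict([[1, 1, 8.5]])[0], 3))
```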
B. Logistic regression: This type of regression analysis is used when the dependent variable
is binary in nature. For example, if the outcome of interest is death in a cancer study, any
patient in the study can have only one of two possible outcomes: dead or alive. The impact of
one or more predictor variables on this binary outcome is assessed. The predictor variables
can be either quantitative or qualitative. Unlike linear regression, this type of regression does
not require a linear relationship between the predictor and dependent variables.
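
A minimal sketch of logistic regression for a binary outcome, assuming hypothetical cancer-study data (tumour size as a single quantitative predictor, outcome coded 1 = dead, 0 = alive), using scikit-learn's LogisticRegression:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical cancer-study data: tumour size (cm) as a single quantitative
# predictor and a binary outcome coded 1 = dead, 0 = alive.
tumour_size = np.array([[1.2], [2.5], [3.1], [0.8], [4.0], [2.2], [3.7], [1.0]])
outcome = np.array([0, 0, 1, 0, 1, 0, 1, 0])

model = LogisticRegression().fit(tumour_size, outcome)

# The model yields a probability of the binary outcome rather than a straight
# line, so no linear predictor-outcome relationship is required.
new_patient = np.array([[3.0]])
print("P(death | tumour size 3.0 cm):", round(model.predict_proba(new_patient)[0, 1], 2))
```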
