
Jul 30, 2014


RESEARCH METHODS


RESEARCH METHODOLOGY

Essentially, when we run a regression we are estimating the parameters on the basis of a sample of observations. Therefore $\hat{y} = a + bx$, for example, is a sample regression line, in much the same way that $\bar{x}$ is a sample estimate of the population parameter $\mu$. Likewise, the population regression line, or the true relationship in the data, is $Y = A + Bx$. This equation, however, is unknown, and we have to use sample data to estimate it. The true form of the unknown equation for the k-variable case is:

$$Y = A + B_1 x_1 + B_2 x_2 + B_3 x_3 + \dots + B_k x_k$$

Even in the case of the population regression plane, not all data points will lie on it. Why? Consider our IRS problem. Not all payments to informants will be equally effective, and some of the computer hours may be used for organizing data rather than analyzing accounts. For these and other reasons some of the data points will lie above the regression plane and some below it. Therefore, instead of satisfying the above equation, the individual data points will satisfy:

$$Y = A + B_1 x_1 + B_2 x_2 + B_3 x_3 + \dots + B_k x_k + e$$

This is the population regression plane plus a random disturbance term.
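The role of the disturbance term can be illustrated with a small simulation. The sketch below assumes a hypothetical two-variable population plane; the values of `A`, `B1`, `B2` and `SIGMA_E` are invented for illustration and are not taken from the IRS example. It only shows that points fall on both sides of the plane while the disturbances average out close to zero.

```python
import random

# Hypothetical population parameters (illustrative only, not from the IRS data).
A, B1, B2 = 2.0, 0.5, 1.5
SIGMA_E = 1.0  # assumed standard deviation of the disturbance term e


def simulate(n, seed=0):
    """Draw n points from Y = A + B1*x1 + B2*x2 + e and return the disturbances."""
    rng = random.Random(seed)
    disturbances = []
    for _ in range(n):
        x1 = rng.uniform(0, 10)
        x2 = rng.uniform(0, 10)
        e = rng.gauss(0, SIGMA_E)        # random disturbance with mean zero
        plane = A + B1 * x1 + B2 * x2    # value on the population plane
        y = plane + e                    # observed value
        disturbances.append(y - plane)   # vertical distance from the plane
    return disturbances


errors = simulate(10_000)
above = sum(1 for e in errors if e > 0)   # points above the plane
below = sum(1 for e in errors if e < 0)   # points below the plane
mean_error = sum(errors) / len(errors)
print(above, below, round(mean_error, 3))
```

In practice the population plane is unknown, so these disturbances cannot be observed directly; the residuals from the fitted sample plane stand in for them.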

The term e is a random disturbance term which equals zero on average. The standard deviation of this term is $\sigma_e$. The standard error of the regression, $s_e$, which we discussed in the earlier section, is an estimate of $\sigma_e$.

Our sample regression equation is:

$$\hat{y} = a + b_1 x_1 + b_2 x_2 + b_3 x_3 + \dots + b_k x_k$$

This equation estimates the unknown population regression plane:

$$Y = A + B_1 x_1 + B_2 x_2 + B_3 x_3 + \dots + B_k x_k$$

As we can see, the estimation of a regression plane can also be thought of as a problem of statistical inference, where we make inferences regarding an unknown population relationship on the basis of an estimated relationship based on sample data.

Much in the same way as for hypothesis testing for a mean, we can also set up confidence intervals for the parameters of the estimated equation. We can also make inferences about the slope parameters of the true regression equation ($B_1, B_2, \dots, B_k$) on the basis of the slope coefficients of the estimated equation ($b_1, b_2, b_3, \dots, b_k$).

Tests of Inference of an Individual Slope Parameter $B_i$

As explained earlier, we can use the value of an individual $b_i$, the estimated slope for the ith variable, to test a hypothesis about the value of $B_i$, the true population slope for the ith variable.

The process of hypothesis testing is the same as that delineated for testing the mean.

When we perform a regression we are frequently interested in whether y actually depends on x. In our example, we may ask whether the volume of unpaid tax recovery actually depends on the number of computer hours of research the field researcher puts in. Essentially we are asking: is x a significant explanatory variable for y?

We can say that there is some relationship between y and x if $B_i \neq 0$. There is no relationship between x and y if $B_i = 0$.

Thus we can formulate our hypotheses regarding the test of significance of the $x_i$ coefficient as follows:

Ho: $B_i = 0$. Null hypothesis: $x_i$ is not a significant explanatory variable for y.

Ha: $B_i \neq 0$. Alternative hypothesis: $x_i$ is a significant explanatory variable for y.

We can test this hypothesis using the t ratio:

$$t = \frac{b_i - B_{i0}}{s_{b_i}}$$

where

$b_i$: slope of the fitted regression

$B_{i0}$: slope hypothesized for the population

$s_{b_i}$: standard error of the regression coefficient
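As a quick check on the formula, the t ratio can be computed by hand from any regression printout. The sketch below uses the coefficient and standard error values for three predictors from the Minitab output shown in the Exercises at the end of this lesson; the function name `t_ratio` is a hypothetical helper, not part of any package.

```python
def t_ratio(b_i, s_bi, B_i0=0.0):
    """Standardized distance between the fitted slope and the hypothesized slope."""
    return (b_i - B_i0) / s_bi


# (coefficient, standard error) pairs copied from the exercise printout.
printout = {
    "Hours": (1.06931, 0.98163),
    "Iq": (1.36460, 0.37627),
    "Books": (2.03982, 1.50799),
}

for name, (coef, stdev) in printout.items():
    print(name, round(t_ratio(coef, stdev), 2))
```

With the hypothesized slope left at zero, the results match the T-ratio column of that printout (1.09, 3.63 and 1.35).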

Why is the t Statistic Used?

In multiple regression we use n data points to estimate k+1 coefficients, i.e., the intercept a and the slope coefficients $b_1, \dots, b_k$. These coefficients were used to calculate $s_e$, which estimates $\sigma_e$, the standard deviation of the disturbance in the data. We use $s_e$ to estimate $s_{b_i}$. Therefore, since $s_e$ has n-k-1 degrees of freedom, $s_{b_i}$ will also have n-k-1 degrees of freedom.

The value of $s_{b_i}$ is given in the output in the column headed Stdev.

Because our hypothesized value of $B_i$ is 0, the standardized value of the regression coefficient, $t_o$, becomes:

$$t_o = \frac{b_i}{s_{b_i}}$$

The value of $t_o$ is called the observed or computed t value. This is the number that appears in the column headed t ratio in the computer output.

We test for the significance of the t ratio by checking against the column headed p-value. This column gives the prob values for the two-tailed test of hypotheses:

Ho: $B_i = 0$

Ha: $B_i \neq 0$

The prob values are the probabilities that each $b_i$ would be as far (or farther) away from zero (the hypothesized value of the $B_i$ coefficient) if Ho is true. This is shown in Figure 2. We need only compare the p-values with $\alpha$, the level of significance, to determine whether $x_i$ is a significant explanatory variable of y.

LESSON 30:

MAKING INFERENCES ABOUT POPULATION PARAMETERS

Copyright: Rai University


Figure 2

If p > $\alpha$, $x_i$ is not a significant explanatory variable.

If p < $\alpha$, $x_i$ is a significant explanatory variable.

This test of significance of the explanatory variable is always a two-tailed test. The independent variable $x_i$ is a significant explanatory variable if $b_i$ is significantly different from zero. This requires that our t ratio be a large positive or a large negative number.

In our IRS example, p is less than .01 for each of the three explanatory variables. Therefore we conclude that each one is a significant explanatory variable.
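The two-tailed prob value and the decision rule can be sketched in a few lines. Statistical packages report the p-value directly; the version below is only an illustration that integrates the Student t density numerically with Simpson's rule, using the standard library alone. The function names and the example t ratio of 3.0 are hypothetical; the 6 degrees of freedom correspond to n-k-1 in the IRS example.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t_obs, df, steps=2000):
    """P(|T| >= |t_obs|) when Ho is true, via Simpson's rule on [0, |t_obs|]."""
    upper = abs(t_obs)
    h = upper / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    central_area = total * h / 3           # area between 0 and |t_obs|
    return max(0.0, 1.0 - 2.0 * central_area)


def is_significant(t_obs, df, alpha=0.05):
    """Decision rule from the text: x_i is significant when p < alpha."""
    return two_tailed_p(t_obs, df) < alpha


# Hypothetical t ratio of 3.0 with n - k - 1 = 6 degrees of freedom.
print(round(two_tailed_p(3.0, 6), 3), is_significant(3.0, 6, alpha=0.05))
```

Because the test is two-tailed, the sign of the t ratio does not matter: a t ratio of -3.0 yields the same p-value as +3.0.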

Test of Significance of the Regression as a Whole

It is quite possible to get a high value of $R^2$ by pure chance. After all, if we threw darts at a board to generate a scatter plot, we could fit a regression that might conceivably have a high $R^2$. Therefore we need to ask: does a high value of $R^2$ necessarily mean that the independent variables explain a large proportion of the variation in Y, or could this be a freak chance?
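The dart-board argument can be tried out directly. The sketch below is an illustration under stated assumptions, not part of the original lesson: it draws x and y as independent random numbers (so there is no true relationship), fits a simple one-variable regression, and counts how often $R^2$ exceeds an arbitrary threshold of 0.5. The sample sizes, threshold and seed are all hypothetical choices.

```python
import random


def r_squared(xs, ys):
    """R^2 of the least-squares line of ys on xs (simple regression)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


def share_of_high_r2(n_points, trials=2000, threshold=0.5, seed=1):
    """Fraction of purely random samples whose R^2 exceeds the threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        xs = [rng.gauss(0, 1) for _ in range(n_points)]
        ys = [rng.gauss(0, 1) for _ in range(n_points)]  # unrelated to xs
        if r_squared(xs, ys) > threshold:
            hits += 1
    return hits / trials


for n_points in (3, 10, 30):
    print(n_points, share_of_high_r2(n_points))
```

With very few observations a high $R^2$ turns up often even though y does not depend on x at all, and it becomes rare as the sample grows. This is the reason a formal test of the regression as a whole is needed.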

In statistical terms we ask the following question:

Is the regression as a whole significant? In the last section we looked at whether the individual $x_i$ were significant. Now we ask whether all the $x_i$ ($i = 1, \dots, k$) together significantly explain the variability in y.

Our hypotheses are:

Ho: $B_1 = B_2 = \dots = B_k = 0$. Null hypothesis: y does not depend on the $x_i$.

Ha: at least one $B_i \neq 0$. Alternative hypothesis: at least one $B_i$ is not zero.

To explain this concept we have to go back to our initial diagram, which shows the two-variable case.

The total variation in y is $\sum (y - \bar{y})^2$.

The variation explained by the regression is $\sum (\hat{y} - \bar{y})^2$.

The unexplained variation is $\sum (y - \hat{y})^2$.

This is shown in Figure 3 for the one-variable case, for simplicity. For the multiple-variable case the same thing applies conceptually.

Figure 3

Thus when we look at the variation in y we look at three different terms, each of which is a sum of squares. These are denoted as follows:

SST = Total sum of squares = $\sum (y - \bar{y})^2$

SSR = Regression sum of squares = $\sum (\hat{y} - \bar{y})^2$

SSE = Error sum of squares = $\sum (y - \hat{y})^2$

Total variation in y can be broken into two parts, the explained and the unexplained:

SST = SSR + SSE

Each of these has an associated number of degrees of freedom. SST has n-1 degrees of freedom. SSR has k degrees of freedom because there are k independent variables. SSE has n-k-1 degrees of freedom because we used n observations to estimate k+1 parameters: a, $b_1, b_2, \dots, b_k$.
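The decomposition is easy to verify numerically. The sketch below fits a one-variable least-squares line to a small data set that is invented purely for illustration, then computes the three sums of squares and checks that the totals and the degrees of freedom add up.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def sums_of_squares(xs, ys):
    """Return (SST, SSR, SSE) for the fitted line."""
    a, b = fit_line(xs, ys)
    y_bar = sum(ys) / len(ys)
    fitted = [a + b * x for x in xs]
    sst = sum((y - y_bar) ** 2 for y in ys)               # total variation
    ssr = sum((f - y_bar) ** 2 for f in fitted)           # explained variation
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))   # unexplained variation
    return sst, ssr, sse


# Hypothetical data, invented for illustration only.
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 11.7]

sst, ssr, sse = sums_of_squares(xs, ys)
n, k = len(xs), 1
print(round(sst, 4), round(ssr + sse, 4))    # the two totals agree
print("df:", n - 1, "=", k, "+", n - k - 1)  # degrees of freedom also add up
```

The identity holds only when the line is fitted by least squares with an intercept; for an arbitrary line the cross-product term does not vanish.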

If the null hypothesis is true, we get the following F ratio:

$$F = \frac{SSR / k}{SSE / (n - k - 1)}$$

which has an F distribution with k degrees of freedom in the numerator and n-k-1 degrees of freedom in the denominator.
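The F ratio can also be tied back to $R^2$. Assuming the usual definition $R^2 = SSR/SST$ (taken here from the standard treatment, not from this section), dividing the numerator and denominator by SST and using SST = SSR + SSE gives:

$$F = \frac{SSR/k}{SSE/(n-k-1)} = \frac{(SSR/SST)/k}{(SSE/SST)/(n-k-1)} = \frac{R^2/k}{(1-R^2)/(n-k-1)}$$

For a given $R^2$, the F ratio shrinks as n-k-1 gets small, which is why a high $R^2$ obtained from very few observations need not be significant.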

If the null hypothesis is false, i.e., the explanatory variables have a significant effect on y, then the F ratio tends to be higher than if the null hypothesis is true. So if the F ratio is large, we reject the null hypothesis that the explanatory variables have no effect on the variation of y and conclude that the regression is significant.

Going back to our IRS example, we now look at the computer output. A typical regression output also includes the computed F ratio for the regression. This is at times called the ANOVA for the regression, because we break up the analysis of variation in Y into explained variance, or variance explained by the regression (between-column variance), and unexplained variance (within-column variance). This is shown in Table 3.


Table 3

Analysis of Variance

Source        DF    SS         MS        F         P
Regression    3     29.1088    9.7029    118.52    0.00
Error         6     0.4912     0.0819
Total         9     29.600

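The sample output for the IRS problem is given above.

SSR = 29.109, with k = 3 degrees of freedom.

SSE = 0.491, with n-k-1 = 6 degrees of freedom.

$$F = \frac{29.11 / 3}{0.491 / 6} = \frac{9.70}{0.082} \approx 118.3$$

This agrees with the F of 118.52 in the output up to rounding of the intermediate values.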

The MS column is the sum of squares divided by the number of degrees of freedom. The output also gives us the p-value, which is 0.00. Because p < $\alpha$ = 0.01, we can conclude that the regression as a whole is highly significant.
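The MS and F columns of Table 3 can be reproduced from the SS and DF columns. The sketch below is a minimal check using the unrounded sums of squares from the table; `anova_f` is a hypothetical helper name, and n = 10 is inferred from the total of 9 degrees of freedom.

```python
def anova_f(ssr, sse, n, k):
    """Return (MSR, MSE, F) for a regression with k explanatory variables."""
    msr = ssr / k              # regression mean square, k degrees of freedom
    mse = sse / (n - k - 1)    # error mean square, n - k - 1 degrees of freedom
    return msr, mse, msr / mse


# Values from Table 3: total df of 9 implies n = 10, and k = 3.
msr, mse, f_ratio = anova_f(ssr=29.1088, sse=0.4912, n=10, k=3)
print(round(msr, 4), round(mse, 4), round(f_ratio, 2))
```

The printed values match the table entries 9.7029, 0.0819 and 118.52.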

Exercises

Q1. Bill Buxton, a statistics professor in a leading business school, has a keen interest in factors affecting students' performance on exams. The midterm exam for the past semester had a wide distribution of grades, but Bill feels certain that several factors explain the distribution: he allowed his students to study from as many different books as they liked, their IQs vary, they are of different ages, and they study varying amounts of time for exams. To develop a predicting formula for exam grades, Bill asked each student to answer, at the end of the exam, questions regarding study time and number of books used. Bill's teaching record already contained the IQs and ages for the students, so he compiled the data for the class and ran a multiple regression with Minitab. The output from Bill's computer run was as follows:

Predictor    Coef        Stdev      T-ratio    P
Constant     -49.948     41.55      -1.20      0.268
Hours        1.06931     0.98163    1.09       0.312
Iq           1.36460     0.37627    3.63       0.008
Books        2.03982     1.50799    1.35       0.218
Age          -1.78990    0.67332    -2.67      0.319

S = 11.657    R-sq = 76.7%

a. What is the best-fitting regression equation for these data?

b. What percentage of the variation in grades is explained by this equation?

c. What grade would you expect for a 21-year-old student with an IQ of 113 who studied 5 hours and used three different books?

Notes
