
Dr. Shailesh Kaushal
Department of Business Administration
University of Lucknow
Lucknow
Mobile: 9415323233
Email: kaushal.sk1971@gmail.com
Assumptions of Parametric tests

1. Normally distributed data

2. Homogeneity of variance

3. Interval or ratio (continuous) data

4. Independent scores
Assumptions of Parametric tests

• Understand the assumption of Normality

1. Graphical displays

2. Distribution: Skew & Kurtosis

3. Normality tests: Kolmogorov-Smirnov (K-S) & Shapiro-Wilk tests


• Understand Homogeneity of Variance

– Levene’s Test
Assessing Normality
1. Graphical displays

A. Histogram

B. P-P Plot (or Q-Q plot)


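Histogram

[Figure: three example histograms. Panel titles: Normal; Not Normal: Positive Skew; Not Normal: Positive Skew]


The P-P Plot

[Figure: three example P-P plots. Panel titles: Normal; Not Normal; Not Normal]


The Q-Q Plot

[Figure: three example Q-Q plots. Panel titles: Normal; Not Normal; Not Normal]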


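2. Skew/Kurtosis

Values of Skew/Kurtosis

• Both skew and (excess) kurtosis are 0 in a normal distribution.

• Values between -1 and +1 are commonly treated as acceptably close to normal.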
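To make the skew and kurtosis rule concrete, here is a minimal Python sketch using only the standard library. It uses simple moment-based formulas, so the values will differ slightly from the bias-adjusted figures SPSS reports. The two small samples are hypothetical and are not taken from the lecture data.

```python
from statistics import fmean

def central_moment(xs, k):
    """k-th central moment: the average of (x - mean)^k."""
    m = fmean(xs)
    return fmean([(x - m) ** k for x in xs])

def skewness(xs):
    """Moment-based skewness; 0 for a perfectly symmetric sample."""
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5

def excess_kurtosis(xs):
    """Moment-based kurtosis minus 3, so a normal distribution scores 0."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2 - 3.0

# Hypothetical samples, for illustration only
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]

for name, sample in [("symmetric", symmetric), ("right_skewed", right_skewed)]:
    s, k = skewness(sample), excess_kurtosis(sample)
    within_range = -1 <= s <= 1 and -1 <= k <= 1
    print(f"{name}: skew={s:.2f}, kurtosis={k:.2f}, within -1..+1: {within_range}")
```

The right-skewed sample has a skew above +1, which under the rule above would flag it as a candidate for transformation.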
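3. Kolmogorov-Smirnov Test

The data can be treated as normally distributed if the test is non-significant (p > .05).

A significant result (p < .05) indicates a departure from normality.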


Assessing Homogeneity of Variance
• Graphs
• Levene’s Tests
– Tests if variances in different groups are the same.
– Significant = Variances not equal

– Non-Significant = Variances are equal

• Variance Ratio
– With 2 or more groups
– VR = Largest variance/Smallest variance
– If VR < 2, homogeneity can be assumed.
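The variance ratio rule can be sketched in a few lines of Python. The two groups below are the Sunday depression scores for the ecstasy and alcohol groups from the Beck Depression Inventory tables later in this deck. The function name `variance_ratio` is an illustrative choice, not part of any library.

```python
from statistics import variance

def variance_ratio(*groups):
    """Largest sample variance divided by smallest sample variance."""
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)

# Sunday BDI scores from the tables later in this deck
ecstasy_sunday = [15, 35, 16, 18, 19, 17, 27, 16, 13, 20]
alcohol_sunday = [16, 15, 20, 15, 16, 13, 14, 19, 18, 18]

vr = variance_ratio(ecstasy_sunday, alcohol_sunday)
homogeneous = vr < 2  # rule of thumb from the slide
print(f"VR = {vr:.2f}, homogeneity can be assumed: {homogeneous}")
```

For these two groups the ratio is well above 2, so homogeneity of variance would not be assumed under this rule of thumb.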
Homogeneity of Variance
Correcting Data Problems

• Log Transformation (log(Xi)): Ex. log(day2+1)

– Reduces positive skew.

• Square Root Transformation (√Xi):

– Also reduces positive skew. Can also be useful for stabilizing variance.

• Reciprocal Transformation (1/Xi): Ex. 1/(day2+1)

– Dividing 1 by each score also reduces the impact of large scores. This
transformation reverses the order of the scores; you can avoid this by reversing
the scores before the transformation, 1/(XHighest – Xi).
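The three transformations above can be sketched in Python as follows. The `day2` values are hypothetical hygiene scores invented for illustration. The +1 offset follows the slide's own examples, log(day2+1) and 1/(day2+1).

```python
import math

def log_transform(xs):
    """log10(x + 1), as in the slide's example log(day2+1)."""
    return [math.log10(x + 1) for x in xs]

def sqrt_transform(xs):
    """Square root of each score."""
    return [math.sqrt(x) for x in xs]

def reciprocal_transform(xs):
    """1 / (x + 1), as in the slide's example 1/(day2+1)."""
    return [1 / (x + 1) for x in xs]

# Hypothetical, positively skewed scores in ascending order
day2 = [0.0, 0.5, 1.0, 3.0, 9.0]

print(log_transform(day2))
print(sqrt_transform(day2))
print(reciprocal_transform(day2))  # order is reversed: largest score -> smallest value
```

The reciprocal output runs from largest to smallest even though the input is ascending, which is the reversal of scores the slide warns about.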
Log transformation in SPSS
Step-1: Select Transform > Compute Variable...
Step-2: Type Log2 into the Target Variable box, click Type & Label,
type Log2 into Label and then click Continue.
Step-3: Select All from the Function group, select Lg10 and transfer it into the Numeric
Expression box. Drag Hygiene (Day2..) into the expression so that it reads LG10(day2+1) and click OK.
Square root transformation in SPSS
Step-1: Select Transform > Compute Variable...
Step-2: Type Sqrt2 into the Target Variable box, click Type & Label, type Sqrt2 into Label and
then click Continue.
Step-3: Select All from the Function group, select SQRT and transfer it into the Numeric
Expression box. Drag Hygiene (Day2..) into the expression so that it reads SQRT(day2) and click OK.
Reciprocal transformation in SPSS
Step-1: Select Transform > Compute Variable...
Step-2: Type Ins2 into the Target Variable box, click Type & Label, type Ins2 into Label and then
click Continue.
Step-3: Type 1/( ) into the Numeric Expression box. Drag Hygiene (Day2..) into the brackets
so that the expression reads 1/(day2+1) and click OK.
Log Transformation: Log(day2+1)
[Figure: histograms before and after the log transformation]
Square Root Transformation
[Figure: histograms before and after the square root transformation]
Parametric Tests and their Nonparametric counterparts

Design                              Parametric Test            Nonparametric Test
Two samples, independent            Independent t-test         Mann-Whitney test
Two samples, related                Paired t-test              Wilcoxon signed rank test
More than two samples, independent  ANOVA                      Kruskal-Wallis test
More than two samples, related      Repeated Measures ANOVA    Friedman's ANOVA
Parametric Tests

1. One sample t-test

2. Independent t- test

3. Paired t-test

4. ANOVA (Analysis of Variance)

5. Repeated Measures ANOVA


One sample t-test

The one-sample t-test is used to test whether a population mean is significantly
different from some hypothesized value.
Variables should have
1. Continuous (not discrete) data.

2. Data that follow the normal probability distribution.

3. A simple random sample from the population.


One sample t-test

Y (D V)
Metric Data

Q.2 How many kilometres did the tyres run? (Ratio)

Km. --------
Case Study- Tyres
Company X has launched a new brand of tyres for cars
and claims that under normal circumstances the average life of the
tyres is 40,000 km. A retailer wants to test this claim and has
taken a random sample of 10 tyres. He tests the life of the tyres
under normal circumstances. The results obtained are presented
in the table.

Life of the sample tyres

Tyre    Life (km)
1       36,000
2       37,000
3       41,000
4       41,500
5       39,500
6       41,000
7       42,000
8       39,500
9       41,000
10      42,500
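The tyre case study can be worked through by hand in Python. This is a minimal sketch using only the standard library. The critical value 2.262 is the standard two-tailed t value for alpha = .05 with 9 degrees of freedom, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Life of the 10 sample tyres in km (from the table above)
km = [36000, 37000, 41000, 41500, 39500, 41000, 42000, 39500, 41000, 42500]
claimed_mean = 40000  # the company's claim

n = len(km)
sample_mean = mean(km)
standard_error = stdev(km) / sqrt(n)  # sample SD divided by sqrt(n)
t_stat = (sample_mean - claimed_mean) / standard_error

T_CRITICAL = 2.262  # two-tailed, alpha = .05, df = n - 1 = 9
reject_claim = abs(t_stat) > T_CRITICAL

print(f"mean = {sample_mean}, t({n - 1}) = {t_stat:.3f}, reject claim: {reject_claim}")
```

The sample mean is 40,100 km and the t statistic is far below the critical value, so these data give no grounds to reject the claimed average life of 40,000 km. In SPSS the same analysis is run through the one-sample t-test dialog with 40000 as the test value.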
Independent t-test

This test is used when there are two mutually exclusive groups or two
experimental conditions and different participants were assigned to each condition.
Variables should have
1. One independent variable with two levels (Nominal)

2. Metric or Continuous dependent variable.

3. Observations should be independent.


Independent t-test
Y (D V)      =  X (I V)
Metric Data     Nonmetric Data

Q.1 What is your gender? (Nominal)


Male -1
Female -2
Q.2 How much salary are you drawing? (Ratio)
Rs. --------
Case Study- Salary

To find out whether the salary of male faculty members in private
institutions is equal to the salary of female faculty members, a
researcher collected data from 20 male and 20 female faculty
members in private institutions.
Case Study- Salary

Y (D V)      =  X (I V)
Metric Data     Nonmetric Data (groups: Male, Female)

Y (D V) = Salary
X (I V) = Male (10) & Female (10)
Paired t-test
Before-After design: Pre-Measure → Manipulation → Post-Measure
Paired t-test
The Paired Samples t-test compares two means that
are from the same individual, object, or related units. The two
means can represent things like:

A measurement taken at two different times (e.g.,
pre-test and post-test with an intervention administered
between the two time points)
Variables should have
1. One metric or continuous dependent variable.

2. Pre-test and post-test observations of the same objects.


Case study-Training

ABC University arranged a special Faculty Development Programme (FDP)
for one segment of its teachers. The University wants to measure the
change in the attitude of its teachers after the FDP. For this purpose,
it has used a well-designed questionnaire, which consists of 10 questions
on a 1 to 5 rating scale (1 is strongly disagree and 5 is strongly
agree). The University selected a random sample of 20 teachers.
Case study-Training

Y (D V)
Metric Data

Dependent variable, or test variable (continuous), measured at two
different times or for two related conditions or units
Case study-Training

Participant   Before Training   After Training
1             25                38
2             35                45
3             24                42
4             16                32
5             12                41
6             27                42
7             22                34
8             13                29
9             17                36
10            26                35
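The training data above can be analysed with a paired t-test computed by hand. The sketch below uses only the Python standard library and works on the within-participant differences (after minus before). The critical value 2.262 is the standard two-tailed value for alpha = .05 with 9 degrees of freedom, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Attitude scores of the 10 participants in the table above
before = [25, 35, 24, 16, 12, 27, 22, 13, 17, 26]
after = [38, 45, 42, 32, 41, 42, 34, 29, 36, 35]

# A paired t-test is a one-sample t-test on the differences
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)
mean_diff = mean(diffs)
standard_error = stdev(diffs) / sqrt(n)
t_stat = mean_diff / standard_error

T_CRITICAL = 2.262  # two-tailed, alpha = .05, df = n - 1 = 9
significant = abs(t_stat) > T_CRITICAL

print(f"mean difference = {mean_diff}, t({n - 1}) = {t_stat:.2f}, significant: {significant}")
```

The mean improvement is 15.7 points and the t statistic is roughly 8.7, well beyond the critical value, so for these ten participants the change in attitude after the programme is statistically significant.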
Analysis of Variance (ANOVA)
This test is useful for analyzing situations
in which we want to compare more than two
conditions.
Variables should have
1. One independent variable with more than two levels (Nominal)

2. Metric or Continuous dependent variable.

3. Observations should be independent.


Case Study- Store Promotion
A major department store chain ran an experiment to examine the effect
of the level of in-store promotion on sales. In-store promotion was
varied at three levels:

1. No promotion.

2. Medium promotion.

3. High promotion.
Case Study- Store Promotion

Y (D V)      =  X (I V)
Metric Data     Nominal Data

Levels of X: 1 No promotion (5), 2 Medium promotion (5), 3 High promotion (5)

Y (D V) = Sales
X (I V) = No promotion, Medium promotion, & High promotion
Repeated Measures ANOVA

Repeated measures ANOVA is used to compare three or more group means
where the participants are the same in each group.
Variables should have
1. One metric or continuous dependent variable.

2. Repeated observations (more than two) of the same objects.


Case Study-Diet

A dietician introduced a new diet plan to reduce weight. The
dietician took 15 women who considered themselves to be in need
of losing weight and put them on this diet for two months. Their
weight was measured in kilograms at the start of the diet and then
after one month, two months and three months.
Case Study-Diet

Y (D V)
Metric Data

Dependent variable, or test variable (continuous), measured at more
than two different times or for more than two related conditions or units
Nonparametric Tests

Nonparametric tests are those tests which do not rely on data
belonging to any particular parametric family of probability
distributions. In this sense, these tests are distribution-free.
When to use Nonparametric Tests

Nonparametric tests are used when the assumptions of parametric
tests are not met.
Nonparametric Tests
Which nonparametric test do we use, and when?

1. Wilcoxon signed-rank test (One sample)

2. Mann-Whitney test (Two independent samples)

3. Wilcoxon signed-rank test (Two dependent samples)

4. Kruskal–Wallis test (More than two independent samples)

5. Friedman’s ANOVA (More than two dependent samples)


One-Sample Wilcoxon Signed-Rank Test

The one-sample Wilcoxon signed-rank test, like the simpler one-sample
sign test, is a nonparametric significance test of a hypothesized median
value for a single data set. It is used to determine whether a
statistically significant difference exists between the median of a
non-normally distributed continuous data set and a standard.
Case Study-State Bank of India

State Bank of India’s manager indicates that the median number of
savings account customers per day is 53. A clerk from the same branch
claims that it is more than 53. The clerk collected the number of
savings account customers per day for 10 random days. Can we reject the
branch manager’s claim at the 0.05 significance level?

Day    Customers
1      50
2      55
3      54
4      59
5      57
6      62
7      37
8      66
9      65
10     64
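The State Bank of India example can be worked through as a one-sample Wilcoxon signed-rank test. The sketch below is one way to do this by hand in standard-library Python: it ranks the absolute differences from the hypothesized median of 53, sums the ranks of the positive and negative differences, and obtains an exact one-sided p-value by enumerating all sign assignments. The helper `average_ranks` is an illustrative name, not a library function.

```python
from itertools import product

def average_ranks(values):
    """Rank from 1 (smallest); tied values share the average of their ranks."""
    ordered = sorted(values)
    rank_of = {v: ordered.index(v) + 1 + (ordered.count(v) - 1) / 2
               for v in set(ordered)}
    return [rank_of[v] for v in values]

customers = [50, 55, 54, 59, 57, 62, 37, 66, 65, 64]
hypothesized_median = 53

# Differences from the hypothesized median (zero differences are dropped)
diffs = [x - hypothesized_median for x in customers if x != hypothesized_median]
ranks = average_ranks([abs(d) for d in diffs])

w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)

# Under H0 every rank is equally likely to carry a + or - sign, so the
# exact one-sided p-value is the share of sign patterns whose negative
# rank sum is as small as (or smaller than) the one observed.
negative_sums = [sum(r for r, neg in zip(ranks, signs) if neg)
                 for signs in product([0, 1], repeat=len(ranks))]
p_one_sided = sum(s <= w_minus for s in negative_sums) / len(negative_sums)

print(f"W+ = {w_plus}, W- = {w_minus}, one-sided p = {p_one_sided:.3f}")
```

The positive rank sum is 42 and the negative rank sum is 13. The one-sided p-value is above 0.05, so on these ten days the manager's claim of a median of 53 cannot be rejected at the 0.05 level.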
The Mann–Whitney U test

The Mann-Whitney U test is a nonparametric test that can be used in
place of an unpaired t-test. It is used to test the null hypothesis
that two samples come from the same population (i.e. have the same
median) or, alternatively, whether observations in one sample tend to
be larger than observations in the other.
Variables should have

1. One independent variable with two levels

2. Ordinal or Continuous scale dependent variable.

3. Observations should be independent.


The Mann–Whitney test
Y (D V)                     =  X (I V)
Ordinal or Continuous Data     Nominal Data
Ranking Data
• The tests in this lecture work on the principle of ranking the data for
each group:
– Lowest score = a rank of 1,
– Next highest score = a rank of 2, and so on.
– Tied scores are given the same rank: the average of the ranks they
would otherwise occupy.
• Add up the ranks for the two groups and take the lower of these sums
to be our test statistic.
• The analysis is carried out on the ranks rather than the actual data.
Beck Depression Inventory
• A neurologist investigated the depressant effects of certain recreational drugs.
– Tested 20 clubbers
– 10 were given an ecstasy tablet to take on a Saturday night
– 10 were allowed to drink only alcohol.
– Levels of depression were measured using the Beck Depression Inventory (BDI) the day
after and midweek.
• Rank the data ignoring the group to which a person belonged
– A similar number of high and low ranks in each group suggests depression levels do not
differ between the groups.
– A greater number of high ranks in the ecstasy group than the alcohol group suggests the
ecstasy group is more depressed than the alcohol group.
Beck Depression Inventory

Y (D V)       =  X (I V)
Ordinal Data     Nominal Data (groups: Ecstasy tablet, Drink alcohol)

Y (D V) = Depression Score
X (I V) = Ecstasy tablet (10) & Drink alcohol (10)
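The ranking procedure behind the Mann–Whitney test can be sketched in standard-library Python. The example uses the Sunday BDI scores of the two groups from the tables later in this deck. The helper name `average_ranks` is illustrative, and the conversion from rank sum to U uses the standard formula U = R - n(n+1)/2.

```python
def average_ranks(values):
    """Rank from 1 (smallest); tied values share the average of their ranks."""
    ordered = sorted(values)
    rank_of = {v: ordered.index(v) + 1 + (ordered.count(v) - 1) / 2
               for v in set(ordered)}
    return [rank_of[v] for v in values]

# Sunday BDI scores (day after) for each group
ecstasy = [15, 35, 16, 18, 19, 17, 27, 16, 13, 20]
alcohol = [16, 15, 20, 15, 16, 13, 14, 19, 18, 18]

# Rank all 20 scores together, ignoring group membership
ranks = average_ranks(ecstasy + alcohol)
rank_sum_ecstasy = sum(ranks[:len(ecstasy)])
rank_sum_alcohol = sum(ranks[len(ecstasy):])

# Test statistic: the lower of the two rank sums
w_s = min(rank_sum_ecstasy, rank_sum_alcohol)

# U for the group with the lower rank sum (here both groups have n = 10)
n = len(alcohol)
u = w_s - n * (n + 1) / 2

print(f"rank sums: ecstasy = {rank_sum_ecstasy}, alcohol = {rank_sum_alcohol}")
print(f"Ws = {w_s}, U = {u}")
```

The rank sums come out at 119.5 for the ecstasy group and 90.5 for the alcohol group, giving U = 35.5. Deciding significance still requires a table of critical values or software such as SPSS.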
The Wilcoxon signed-rank test
• Uses:
– To compare two sets of scores, when these scores come from the
same participants.
• Imagine the experimenter in the previous example was interested in
the change in depression levels for each of the two drugs.
– We still have to use a non-parametric test because the
distributions of scores for both drugs were non-normal on one of
the two days.
Beck Depression Inventory
• A neurologist investigated the depressant effects of certain recreational
drugs (Ecstasy Tablet & Drink Alcohol).

A. Ecstasy Tablet
– Tested 10 clubbers
– 10 were given an ecstasy tablet to take on a Saturday, Monday & Tuesday night.

B. Drink Alcohol
– Tested 10 clubbers
– 10 were given alcohol to drink on a Saturday, Monday & Tuesday night.
Beck Depression Inventory
Ecstasy tablet to take on a Saturday, Monday & Tuesday night.

Participant   Sunday   Wednesday
1             15       28
2             35       35
3             16       35
4             18       24
5             19       39
6             17       32
7             27       27
8             16       29
9             13       36
10            20       35
Beck Depression Inventory
Drink Alcohol to take on a Saturday, Monday & Tuesday night.

Participant   Sunday   Wednesday
1             16       5
2             15       6
3             20       30
4             15       8
5             16       9
6             13       7
7             14       6
8             19       17
9             18       3
10            18       10
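The two tables above can be analysed with the Wilcoxon signed-rank procedure. The following standard-library Python sketch computes the sums of positive and negative ranks for each drug, dropping participants whose score did not change. The function names are illustrative, not part of any library.

```python
def average_ranks(values):
    """Rank from 1 (smallest); tied values share the average of their ranks."""
    ordered = sorted(values)
    rank_of = {v: ordered.index(v) + 1 + (ordered.count(v) - 1) / 2
               for v in set(ordered)}
    return [rank_of[v] for v in values]

def signed_rank_sums(first, second):
    """Return (sum of positive ranks, sum of negative ranks) for second - first."""
    diffs = [b - a for a, b in zip(first, second) if b != a]  # drop zero changes
    ranks = average_ranks([abs(d) for d in diffs])
    t_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    t_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return t_plus, t_minus

ecstasy_sunday = [15, 35, 16, 18, 19, 17, 27, 16, 13, 20]
ecstasy_wednesday = [28, 35, 35, 24, 39, 32, 27, 29, 36, 35]
alcohol_sunday = [16, 15, 20, 15, 16, 13, 14, 19, 18, 18]
alcohol_wednesday = [5, 6, 30, 8, 9, 7, 6, 17, 3, 10]

ecstasy_sums = signed_rank_sums(ecstasy_sunday, ecstasy_wednesday)
alcohol_sums = signed_rank_sums(alcohol_sunday, alcohol_wednesday)

print("ecstasy (T+, T-):", ecstasy_sums)
print("alcohol (T+, T-):", alcohol_sums)
```

For ecstasy, two participants show no change and all eight remaining differences are positive, so depression scores rose from Sunday to Wednesday. For alcohol, nine of the ten differences are negative, so scores mostly fell. The smaller of the two rank sums in each case (0 and 8) is the usual test statistic T.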
Kruskal–Wallis test
• The Kruskal–Wallis test (Kruskal & Wallis, 1952) is the non-parametric
counterpart of the one-way independent ANOVA.
– If you have data that have violated an assumption then this test can be
a useful way around the problem.
• The theory for the Kruskal–Wallis test is very similar to that of the
Mann–Whitney (and Wilcoxon rank-sum) test.
– Like the Mann–Whitney test, the Kruskal–Wallis test is based on
ranked data.
– The sum of ranks for each group is denoted by Ri (where i is used to
denote the particular group).
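For reference, the rank sums Ri enter the standard Kruskal–Wallis test statistic H as shown below. The symbols N (total sample size), k (number of groups) and n_i (size of group i) are standard notation and are not defined on the slide itself.

```latex
H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^{2}}{n_i} - 3(N+1)
```

With reasonably sized groups, H is compared against a chi-square distribution with k - 1 degrees of freedom.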
Coffee Consumption
• Does coffee consumption affect your heart rate?
• Variables
– Outcome: Heart Rate Score
– IV: Number of cups of coffee per day
• No coffee per day
• 1 cup of coffee per day
• 2 cups of coffee per day
• 4 cups of coffee per day
• Participants
– 40 participants (10 in each group)
Coffee Consumption

Y (D V)       =  X (I V)
Ordinal Data     Nominal Data

Levels of X: No Cup Coffee (10), 1 Cup Coffee (10), 2 Cup Coffee (10), 4 Cup Coffee (10)

Y (D V) = Heart Rate Score
X (I V) = No Coffee, 1 Cup Coffee, 2 Cup Coffee & 4 Cup Coffee
Post hoc tests: Kruskal–Wallis test
• One way to do a non-parametric post hoc procedure is to use Mann–Whitney tests.
• Comparisons:
– Test 1: One cup of coffee compared to no coffee.
– Test 2: Two cups of coffee compared to no coffee.
– Test 3: Four cups of coffee compared to no coffee.
• Bonferroni correction:
– Rather than use .05 as our critical level of significance, we’d use .05/3 = .0167.
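The Bonferroni arithmetic above generalizes to any number of comparisons. A minimal Python sketch follows; the function name `bonferroni_alpha` is an illustrative choice.

```python
def bonferroni_alpha(alpha, n_comparisons):
    """Critical significance level after a Bonferroni correction."""
    return alpha / n_comparisons

# The three planned comparisons against the no-coffee group
comparisons = [
    "1 cup vs no coffee",
    "2 cups vs no coffee",
    "4 cups vs no coffee",
]

adjusted = bonferroni_alpha(0.05, len(comparisons))
print(f"Each of the {len(comparisons)} Mann-Whitney tests is judged at p < {adjusted:.4f}")
```

Each follow-up Mann–Whitney test is then declared significant only if its p-value falls below this adjusted level.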
Friedman’s ANOVA

It is used for testing differences between conditions when there are
more than two conditions and the same participants have been used in
all conditions (each case contributes several scores to the data). If
you have violated some assumption of parametric tests then this test
can be a useful way around the problem.


Marketing campaign

To test the marketing campaign, the researcher took 15 retail outlets
that were considered to be in need of increasing sales and put them on
this campaign for two months. Their sales were measured in millions at
the start of the campaign and then after one month and two months.
Case Study – Lion Dance
– Can lions be trained to line-dance with different rewards?
– Participants: 150 lions
– Training: The lion was trained using either food or affection, not both.
– Dance: The lion either learnt to line-dance or it did not.
– Outcome: The number of lions (frequency) that could dance or not in each
reward condition.
Source: Discovering Statistics Using SPSS by Andy P. Field, SAGE Publications
