
Research Methodology Framing the Testable Hypothesis

(Part I to IV)

SEC - 4-B(2) Ivy Das Gupta

1
What is a Hypothesis
❖ It is a principal instrument in research
❖ It means a mere assumption or supposition to be proved or disproved
❖ It is a formal question that is intended to be resolved
❖ It is a proposition or set of propositions set forth as an explanation for the occurrence of some specified group of
phenomena.
For example:
“Students who receive counselling will show a greater increase in creativity than students not receiving counselling”
Or

“The automobile A is performing as well as automobile B.”


2
Characteristics of Hypothesis
❖ It should be clear and precise
❖ It should be capable of being tested
❖ It should state relationship between variables
❖ It should be limited in scope and must be specific
❖ It should be stated as far as possible in most simple terms
❖ It should be consistent with most known facts
❖ It should be amenable to testing within a reasonable time
❖ It must explain the facts that gave rise to the need for explanation

3
Basic Concepts: Null Hypothesis & Alternative Hypothesis
❖ If we are to compare method A with method B about its superiority and we proceed on the
assumption that both methods are equally good, then this assumption is termed the null
hypothesis. It is generally denoted by H0
❖ Alternatively, if we think that method A is superior or that method B is inferior, this is known as the
alternative hypothesis. It is generally denoted by H1
❖ Suppose we want to test the hypothesis that the population mean μ is equal to the
hypothesised mean μ_H0 = 100

❖ Symbolically, we can express this as:

H0: μ = μ_H0 = 100

4
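We may consider three
possible alternative hypotheses

H1: μ ≠ μ_H0 (the population mean is not equal to 100; it may be more or less than 100)
H1: μ > μ_H0 (the population mean is greater than 100)
H1: μ < μ_H0 (the population mean is less than 100)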

5
❖ Null hypothesis and alternative hypothesis are chosen before the sample is
drawn
❖ In the choice of null hypothesis the following considerations are kept in mind:
1. Alternative hypothesis is usually the one which one wishes to prove, and

2. Null hypothesis is the one which one wishes to disprove

3. Null hypothesis should always be a specific hypothesis

6
Level of Significance

❖ It is some percentage, normally 5%, which should be chosen with great care

❖ When the 5% level of significance is chosen, it implies that H0 is rejected when the
sampling result has less than 0.05 probability of occurring if H0 is true
❖ In other words, the 5% level of significance means that the researcher is
willing to take as much as a 5% risk of rejecting H0 when H0 happens to be true

7
Decision Rule of Test of Hypothesis

❖ Given null hypothesis H0 and alternative hypothesis H1


❖ A rule is decided which is known as Decision Rule
❖ On the basis of that rule researcher either accepts H0 (i.e., rejects H1) or
accepts H1 (i.e., rejects H0)

8
Type I error and Type II error
Type I error means rejecting H0 when H0 is true; Type II error means accepting H0 when H0 is not true

The probability of Type I error is determined in
advance and is understood as the level of
significance of testing of hypothesis

With a fixed sample size n, when we try to reduce
Type I error, the probability of Type II error increases.

Both types of errors cannot be reduced simultaneously

9
Two-tailed test
❖ A two-tailed test rejects the null hypothesis, if say the sample mean is significantly higher or lower than
the hypothesised value of the population mean
❖ In a two-tailed test there are two rejection regions
❖ Symbolically, the two-tailed test is appropriate when we have:

H0: μ = μ_H0
As against

H1: μ ≠ μ_H0
Which may mean μ > μ_H0 or μ < μ_H0

10
Acceptance and Rejection Regions in a Two-tailed Test

Mathematically, at the 5% level of significance, it can be said:

Acceptance Region A: |Z| ≤ 1.96

Rejection Region R: |Z| > 1.96
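
The 1.96 cut-off comes from the standard normal curve with 2.5% of the area in each tail. As a minimal sketch (the function names `two_tailed_critical_value` and `in_rejection_region` are illustrative, not part of the slides), the value can be recovered with Python's standard library instead of a normal curve area table:

```python
from statistics import NormalDist


def two_tailed_critical_value(alpha):
    """Return the z value that leaves alpha/2 of the area in each tail."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def in_rejection_region(z, alpha=0.05):
    """True when |z| exceeds the two-tailed critical value."""
    return abs(z) > two_tailed_critical_value(alpha)


print(round(two_tailed_critical_value(0.05), 2))  # 1.96
print(in_rejection_region(2.3))                   # True
print(in_rejection_region(-0.67))                 # False
```

Changing `alpha` to 0.01 gives the wider acceptance region used at the 1% level.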

11
One-tailed Test
A one-tailed test is used when we are to test
whether the population mean is either lower than or
higher than the hypothesised population mean

For instance, left-tailed test:

H0: μ = μ_H0
H1: μ < μ_H0

Acceptance Region A: Z > −1.645

Rejection Region R: Z ≤ −1.645

12
One-tailed Test contd…..

Right tailed test:


H0: μ = μ_H0
H1: μ > μ_H0

Acceptance Region A: Z ≤ 1.645

Rejection Region R: Z > 1.645
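
The two one-tailed rules can be written as a single decision function. This is only a sketch: the name `one_tailed_decision` and its `tail` argument are assumptions made here, and the boundary handling simply mirrors the regions stated on the slides (1.645 being the one-tailed value at the 5% level).

```python
from statistics import NormalDist


def one_tailed_decision(z, alpha=0.05, tail="right"):
    """Apply the one-tailed acceptance and rejection regions to an observed z."""
    critical = NormalDist().inv_cdf(1 - alpha)  # about 1.645 when alpha = 0.05
    if tail == "right":
        reject = z > critical        # R: Z > 1.645
    elif tail == "left":
        reject = z <= -critical      # R: Z <= -1.645
    else:
        raise ValueError("tail must be 'left' or 'right'")
    return "reject H0" if reject else "accept H0"


print(one_tailed_decision(2.0, tail="right"))   # reject H0
print(one_tailed_decision(-1.0, tail="left"))   # accept H0
```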

13
Part II
(27.04.2020)

14
Procedure for Hypothesis Testing
❖ Making a formal statement: The hypotheses - Null hypothesis (H0) and Alternative
Hypothesis (H1) should be clearly stated considering the research problem.
❖ For instance, suppose the average score in a national-level aptitude test is 80. To
evaluate a state’s education system, 100 students were selected randomly and their
average score was found to be 75. The state wants to know if there is any significant
difference between the local and national scores. In such a situation the hypotheses
may be stated as below:

Null hypothesis, H0: 𝜇 = 80

Alternative hypothesis, H1: 𝜇 ≠ 80

15
❖ Selecting a significance level: The hypotheses are tested on a pre-determined level of
significance.
❖ Normally, either 5% level or 1% level is chosen.
❖ The level of significance must be adequate in the context of the purpose and nature of enquiry.
❖ Level of significance depends on:
1. The size of the sample
2. The magnitude of the difference between sample means
3. The variability of measurements within the sample
4. Whether the hypothesis is directional or non-directional (a directional hypothesis predicts the
direction of the difference between, say, means)

16
❖ Deciding the distribution to use: The next step in testing of hypothesis is the
selection of appropriate sampling distribution. The choice generally remains
between Normal distribution and ’t’ distribution.

❖ Selecting a random sample and computing an appropriate value: The next step
is to draw a random sample to furnish empirical data.

17
❖ Calculation of Probability: Next is to calculate the probability that the
sample result would diverge as widely as it has from expectations, if the null
hypothesis were in fact true.

❖ Comparing the Probability: The final step consists of comparing the probability
thus calculated with the specified value for 𝛼, the significance level. If the
calculated probability ≤ 𝛼 in case of a one-tailed test (𝛼/2 in case of a two-tailed
test), then reject the null hypothesis. But if the calculated probability is greater,
then accept the null hypothesis.

18
State H0 as well as H1

Specify the level of significance or 𝛼 value

Decide the correct sampling distribution

Select random sample(s) and work out an appropriate value from the sample data

Calculate the probability that the sample result would diverge as widely as it has from expectations,
if H0 were true

Is this probability equal to or smaller than 𝛼 in case of a one-tailed test, or 𝛼/2 in case of a two-tailed test?

If yes, reject H0 (some risk of Type I error)

If no, accept H0 (some risk of Type II error)
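
The flow above can be sketched in code for the simplest setting, a z-test on a mean with known population standard deviation. The function name `z_test`, its parameters, and the value σ = 20 used in the demonstration are assumptions made for illustration; the aptitude-test example on the slides does not state a standard deviation.

```python
from statistics import NormalDist


def z_test(sample_mean, mu0, sigma, n, alpha=0.05, tail="two"):
    """Walk through the steps: statistic, tail probability, comparison with alpha."""
    std_normal = NormalDist()
    z = (sample_mean - mu0) / (sigma / n ** 0.5)
    if tail == "two":
        tail_prob = 1 - std_normal.cdf(abs(z))  # area beyond |z| in one tail
        threshold = alpha / 2                   # compared with alpha/2, as in the flow above
    elif tail == "right":
        tail_prob = 1 - std_normal.cdf(z)
        threshold = alpha
    elif tail == "left":
        tail_prob = std_normal.cdf(z)
        threshold = alpha
    else:
        raise ValueError("tail must be 'two', 'left' or 'right'")
    decision = "reject H0" if tail_prob <= threshold else "accept H0"
    return z, tail_prob, decision


# H0: mu = 80 against H1: mu != 80, with n = 100 and a sample mean of 75.
# sigma = 20 is a hypothetical value, not given in the slides.
z, tail_prob, decision = z_test(sample_mean=75, mu0=80, sigma=20, n=100)
print(z, round(tail_prob, 4), decision)
```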
19
Measuring the Power of a Hypothesis Test
❖ The probability of Type I error is denoted as 𝛼 (the level of significance of the test)

❖ The probability of Type II error is referred to as 𝛽 (the probability of accepting H0, when H0 is
not true).
❖ It is desirable that 𝛽 be as small as possible.

❖ Therefore, 1 - 𝛽 (the probability of rejecting H0, when H0 is not true) should be as large as possible.

❖ If 1 - 𝛽 is very close to unity, then the test is working well.

❖ If 1 - 𝛽 is very close to zero, then the test is working poorly.

❖ Thus 1 - 𝛽 technically measures the ‘power of the test’.

20
Power Curve

❖ If we plot the values of 1 - 𝛽 for each value of the population parameter (say 𝜇, the
population mean) for which H0 is not true or, alternatively, H1 is true, the
resulting curve is known as the ‘power curve’ associated with the given test.
❖ Thus a power curve of a hypothesis test is the curve that shows the
conditional probability of rejecting H0 as a function of the population
parameter and size of the sample.
❖ The function defining this power curve is known as power function.

21
Operating Characteristic Function
❖ This function, closely related to the power function, shows the conditional
probability of accepting H0 for all values of the population parameter for a
given sample size, whether or not the decision happens to be a correct one.
❖ If Power function is represented as H and Operating characteristic function
as L, then we have

L=1-H
❖ However, one needs only one of these two functions for any decision rule in
the context of hypothesis testing.
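
To make the relation L = 1 - H concrete, here is a minimal sketch for a right-tailed z-test of H0: μ ≤ μ0. The function names and the default values (μ0 = 15, σ = 5, n = 100, α = 0.10) are illustrative assumptions. Evaluating `power_function` over a range of μ gives points on the power curve, and `oc_function` gives the matching operating characteristic values.

```python
from statistics import NormalDist


def power_function(mu, mu0=15.0, sigma=5.0, n=100, alpha=0.10):
    """H(mu): probability of rejecting H0 when the true population mean is mu."""
    std_normal = NormalDist()
    se = sigma / n ** 0.5                               # standard error of the mean
    limit = mu0 + std_normal.inv_cdf(1 - alpha) * se    # accept H0 if sample mean <= limit
    return 1 - std_normal.cdf((limit - mu) / se)


def oc_function(mu, **kwargs):
    """L(mu) = 1 - H(mu): probability of accepting H0 when the true mean is mu."""
    return 1 - power_function(mu, **kwargs)


for mu in (15.0, 15.5, 16.0, 16.5):
    print(mu, round(power_function(mu), 3), round(oc_function(mu), 3))
```

At μ = μ0 the power equals α, and it rises towards 1 as μ moves further into the region where H1 is true.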

22
Computation of the Power of a Test
Example:

A certain chemical process is said to have produced 15 or less pounds of waste
material for every 60 lbs. batch with a corresponding standard deviation of 5 lbs.
A random sample of 100 batches gives an average of 16 lbs. waste per batch.
Test at 10% level whether the average quantity of waste per batch has increased.
Compute the power of the test for 𝝁 = 16 lbs. If we raise the level of significance
to 20%, then how would the power of the test for 𝝁 = 16 lbs be affected?

23
Computation (cont….)
Solution:
H0: 𝜇 ⋜ 15 lbs

H1: 𝜇 > 15 lbs

As H1 is one-sided, a right-tailed test at 10% level of significance is applicable. The value of the standard deviate z corresponding to .4000 area
of the normal curve comes to 1.28 as per the normal curve area table.
The limit of X̄ for accepting H0:

Accept H0 if X̄ ≤ 15 + 1.28 (σ_p/√n)

Or, X̄ ≤ 15 + 1.28 (5/√100)

or, X̄ ≤ 15.64

at 10% level of significance; otherwise accept H1.

But, the sample average is 16 lbs.


Therefore, H0 is rejected and it can be concluded that average waste per batch has increased.
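
The acceptance limit can be checked numerically. This sketch uses n = 100, the value that appears in the solution's 5/√100, and takes the critical value from the standard library rather than from the table; the variable names are chosen here for illustration.

```python
from statistics import NormalDist

mu0, sigma, n, alpha = 15.0, 5.0, 100, 0.10
sample_mean = 16.0

z_crit = NormalDist().inv_cdf(1 - alpha)      # right-tailed critical value, about 1.28
limit = mu0 + z_crit * sigma / n ** 0.5       # upper limit of the acceptance region
decision = "reject H0" if sample_mean > limit else "accept H0"

print(round(z_crit, 2), round(limit, 2), decision)  # 1.28 15.64 reject H0
```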

24
1-β
❖ β is the conditional probability which depends on the value of μ, which is 16
lbs in the present context.
❖ β = p (Accept H0: μ ≤ 15 | μ = 16)

❖ H0 is accepted at 10% level of significance if X̄ ≤ 15.64

❖ Therefore, β = p (X̄ ≤ 15.64 | μ = 16), which is the probability of accepting H0
when H0 is not true.

25
1-β
❖ β = p (X̄ ≤ 15.64 | μ = 16) is depicted in the adjacent
diagram
❖ Find the probability of the area which lies between 15.64
and 16

❖ Z = (X̄ − μ)/(σ/√n) = (15.64 − 16)/(5/√100) = −0.72

❖ Corresponding to which the area is 0.2642 (Normal Curve
Area Table)
❖ Hence, β = 0.5000 − 0.2642 = 0.2358

❖ 1 − β = power of the test = 1 − 0.2358 = 0.7642

26
1 - β at 20% level of significance

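❖ At 20% level of significance z = 0.84 (corresponding to .3000 area of the normal curve), so now X̄ ≤ 15 + 0.84 (5/√100), i.e., X̄ ≤ 15.42

❖ Accept H0 if X̄ ≤ 15.42, otherwise accept H1

❖ β = p (X̄ ≤ 15.42 | μ = 16) = 0.1230 (from normal curve area table)

❖ Hence, 1 − β = 1 − 0.1230 = 0.8770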
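
The effect of raising the level of significance can be reproduced with a short sketch. The helper `power_at` is a name introduced here, and it uses exact normal quantiles, so its results differ from the table-based figures (0.7642 and 0.8770) in the third or fourth decimal place.

```python
from statistics import NormalDist


def power_at(mu_true, alpha, mu0=15.0, sigma=5.0, n=100):
    """Return (acceptance limit, beta, power) for the right-tailed test of H0: mu <= mu0."""
    std_normal = NormalDist()
    se = sigma / n ** 0.5
    limit = mu0 + std_normal.inv_cdf(1 - alpha) * se
    beta = std_normal.cdf((limit - mu_true) / se)   # P(accept H0 | mu = mu_true)
    return limit, beta, 1 - beta


for alpha in (0.10, 0.20):
    limit, beta, power = power_at(16.0, alpha)
    print(f"alpha={alpha:.2f}  limit={limit:.2f}  beta={beta:.4f}  power={power:.4f}")
```

A larger α moves the acceptance limit closer to 15, which lowers β and raises the power for μ = 16.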

27
Part III
(04.05.2020)

28
Tests of Hypotheses
❖ Hypothesis testing determines the validity of the assumption, more specifically, of the null
hypothesis
❖ It is a choice between two conflicting hypotheses about the value of the population parameter
❖ It helps to decide on the basis of sample data, whether a hypothesis about the population is
likely to be true or false.
❖ Statisticians developed several tests of hypotheses, which can be classified as

(i) Parametric tests or Standard tests of hypotheses

(ii) Non-parametric tests or distribution-free test of hypotheses

29
Parametric Tests of Hypotheses

❖ These tests assume certain properties of the parent population from
which we draw samples.
❖ Assumptions such as that the observations come from a normal population, that the
sample size is large, and assumptions about population parameters like the mean and
variance must hold good before parametric tests can be used.

30
Important Parametric Tests
The important parametric tests are:

1. z - test

2. t - test
3. χ² - test

4. F - test

All these tests are based on the assumption of normality, i.e., source data is considered to be normally distributed.

In some cases the population may not be normally distributed, yet tests will be applicable on account of the fact,
that, sample and sampling distributions closely approach normal distribution.

31
z- Test
❖ It is based on normal probability distribution
❖ It is used for judging the significance of several statistical measures, particularly the mean
❖ This is a most frequently used test
❖ This test is used even when binomial distribution or t-distribution is applicable on the presumption that as ’n’
becomes larger, such distribution tends to approximate normal distribution
❖ It is generally used to compare the mean of a sample to some hypothesised mean of a population in case of
large sample
❖ It is also used to know the significance of difference of means of two independent samples in case of large
samples when population variance is known
❖ This test may be used for judging the significance of median, mode, coefficient of correlation and several
other measures

32
t-Test
❖ It is based on t - distribution
❖ It is considered an appropriate test for judging the significance of sample mean or for judging
the significance of the difference of the means of two independent samples in case of small
samples, when population variance is not known and sample variance is used as the estimate
of population variance
❖ For two related samples, the paired t-test is used for judging the significance of the
difference of means
❖ It can also be used for judging the significance of the coefficients of simple and partial
correlation
❖ It is applicable only in case of small samples and when population variance is not known

33
Other Parametric Tests

❖ χ² - test is based on the χ² distribution and is used for comparing a sample variance to a
theoretical population variance

❖ F-test is based on F distribution and is used to compare the variance of the two
independent samples. This test is also used in the context of analysis of variance
(ANOVA). It is also used for judging the significance of multiple correlation
coefficients.

34
Hypothesis Testing of Means

❖ Case 1: Population normal, population infinite, sample size may be large or
small but the variance of the population is known. H1 may be one-tailed or
two-tailed

z-test is used for testing hypothesis of mean and the test statistic is given by:

$$z = \frac{\bar{X} - \mu_{H_0}}{\sigma_p / \sqrt{n}}$$

35
Hypothesis Testing of Means
❖ Case 2: Population normal, population finite, sample size may be large or small,
but population variance is known. H1 may be two-tailed or one-tailed
hypothesis
z-test is used for testing hypothesis of mean and the test statistic is given by:

$$Z = \frac{\bar{X} - \mu_{H_0}}{(\sigma_p / \sqrt{n}) \times \sqrt{(N - n)/(N - 1)}}$$

36
Hypothesis Testing of Means
❖ Case 3: Population normal, population infinite, sample size small and variance
of the population is unknown. H1 may be two-tailed or one-tailed hypothesis
t-test is used for testing hypothesis of mean and the test statistic is given by:

$$t = \frac{\bar{X} - \mu_{H_0}}{\sigma_s / \sqrt{n}}$$ with degrees of freedom (d.f.) = n - 1

and

$$\sigma_s = \sqrt{\frac{\sum_i (X_i - \bar{X})^2}{n - 1}}$$
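
The Case 3 statistic can be sketched as follows. The function name `one_sample_t` and the five data values are hypothetical, chosen only so the arithmetic is easy to follow. Python's standard library has no t distribution, so the resulting t is still compared with a tabulated value for n - 1 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """Return (t, degrees of freedom) for H0: mu = mu0, population variance unknown."""
    n = len(sample)
    x_bar = mean(sample)
    sigma_s = stdev(sample)          # divides by n - 1, matching the sigma_s formula
    t = (x_bar - mu0) / (sigma_s / sqrt(n))
    return t, n - 1


# Hypothetical sample: mean 15, sum of squared deviations 20, so sigma_s = sqrt(5).
sample = [12, 14, 15, 16, 18]
t, df = one_sample_t(sample, mu0=13)
print(round(t, 3), df)  # 2.0 4
```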

37
Hypothesis Testing of Means
❖ Case 4: Population normal, population finite, sample size small and variance of the
population is unknown. H1 may be two-tailed or one-tailed hypothesis

t-test is used for testing hypothesis of mean and the test statistic is given by:

$$t = \frac{\bar{X} - \mu_{H_0}}{(\sigma_s / \sqrt{n}) \times \sqrt{(N - n)/(N - 1)}}$$ for d.f. = (n - 1)

and

$$\sigma_s = \sqrt{\frac{\sum_i (X_i - \bar{X})^2}{n - 1}}$$

38
Hypothesis Testing of Means
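❖ Case 5: Population may not be normal but sample size is large, variance of the population may be
known or unknown and H1 may be one-tailed or two-tailed

z-test is used and the test statistic is as under:

$$z = \frac{\bar{X} - \mu_{H_0}}{\sigma_p / \sqrt{n}}$$

When the population variance is not known, σs is used in place of σp; for a finite population the denominator is multiplied by √[(N − n)/(N − 1)], as in Case 2.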

39
Illustration

❖ A sample of 400 male students is found to have a mean height of 67.47 inches.
Can it be reasonably regarded as a sample from a large population with mean
height 67.39 inches and standard deviation 1.30 inches? Test at 5% level of
significance.

40
Solution
❖ Taking the null hypothesis that the mean height of the population is 67.39”, we
can write:
H0: μ = 67.39 inches
H1: μ ≠ 67.39 inches

41
Solution
❖ As H1 is two-tailed, we will be applying a two-tailed test for determining the
rejection regions at 5% level of significance, which comes to as under using
normal curve area table:
R: |z| > 1.96

The observed value of z = (67.47 − 67.39)/(1.30/√400) = 1.231, which is in the
acceptance region since |z| < 1.96, and thus H0 is accepted. We may conclude that
the given sample can be regarded to have been taken from a population with mean
height 67.39 inches and standard deviation 1.30 inches at 5% level of significance.
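
The arithmetic behind the observed z can be checked in a few lines. This is a sketch with variable names chosen here; the population is treated as infinite, which matches the "large population" in the illustration.

```python
from math import sqrt

x_bar, mu0, sigma, n = 67.47, 67.39, 1.30, 400

z = (x_bar - mu0) / (sigma / sqrt(n))     # standard error is 1.30 / 20 = 0.065
decision = "reject H0" if abs(z) > 1.96 else "accept H0"

print(round(z, 3), decision)  # 1.231 accept H0
```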

42
Part IV
(10.05.2020)

43
Illustration

❖ Suppose there is a population of 20 industrial units of the same size, all of which
are experiencing excessive labour turnover problems. The past records show
that the mean of the distribution of annual turnover is 320 employees with a
standard deviation of 75 employees. A sample of 5 of these industrial units is
taken at random which gives a mean of annual turnover of 300 employees. Is
the sample mean consistent with the population mean? Test at 5% level.

44
Solution

❖ Taking the null hypothesis that the population mean is 320 employees, we
can write:

H0: μ = 320 employees

H1: μ ≠ 320 employees
And the given information as under:
X̄ = 300 employees, σp = 75 employees, n = 5 and N = 20

45
Solution
❖ Here we will apply Case 2: Population normal, population finite, sample size
may be large or small, but population variance is known. H1 may be two-tailed
or one-tailed hypothesis
z-test is used for testing hypothesis of mean and the test statistic is given by:

$$Z = \frac{\bar{X} - \mu_{H_0}}{(\sigma_p / \sqrt{n}) \times \sqrt{(N - n)/(N - 1)}} = \frac{300 - 320}{(75 / \sqrt{5}) \times \sqrt{(20 - 5)/(20 - 1)}} = -0.67$$

46
Solution
❖ As H1 is two-sided we shall apply a two-tailed test for determining the rejection region at 5%
level of significance
❖ Using normal curve area table:

R: |Z|> 1.96

The observed value of Z is −0.67, which is in the acceptance region since |Z| < 1.96

Thus H0 is accepted at 5% level of significance

Therefore the sample mean is consistent with the population mean, i.e., population mean 320 is
supported by sample results.
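
The finite population multiplier is the only difference from the infinite-population z statistic. A minimal sketch follows; the function `z_statistic` and its optional `population_size` argument are illustrative names, not part of the slides.

```python
from math import sqrt


def z_statistic(sample_mean, mu0, sigma, n, population_size=None):
    """z for a mean; applies sqrt((N - n)/(N - 1)) when a finite N is given."""
    se = sigma / sqrt(n)
    if population_size is not None:
        se *= sqrt((population_size - n) / (population_size - 1))
    return (sample_mean - mu0) / se


z = z_statistic(sample_mean=300, mu0=320, sigma=75, n=5, population_size=20)
decision = "reject H0" if abs(z) > 1.96 else "accept H0"
print(round(z, 2), decision)  # -0.67 accept H0
```

Without the multiplier the same data give a z nearer to zero (about −0.60), so the correction matters most when the sample is a sizeable fraction of the population, as with 5 units out of 20 here.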

47
Assignment

The mean of a certain production process is known to be 50 with a standard
deviation of 2.5. The production manager may welcome any change in mean
value towards the higher side but would like to safeguard against decreasing
values of the mean. He takes a sample of 12 items that gives a mean value of
48.5. What inference should the manager draw for the production process on
the basis of the sample results? Use 5 per cent level of significance for the
purpose.

48
