
TRAINING IN PARAMETRIC TESTS

Pangasinan State University


Lingayen Campus
October 19-20, 2019

Statistics is the most important science in the whole world; for upon it depends the
practical applications of every science and of every art; the one science essential to all
political and social administration, all education, all organization based on experience,
for it only gives results of our experience. – Florence Nightingale

Statistics is the branch of science that deals with the collection, presentation, organization,
analysis and interpretation of data.

Population is the collection of all elements under consideration in a statistical inquiry.


Example:
1) The PSU Office of Admission is studying the relationship between the score in the college
entrance examination during application and the general point average (GPA) upon
graduation among graduates of the university from 2010 – 2018.
Population: collection of all graduates of the university from 2010 - 2018.
Variable of interest: score in the college entrance examination and GPA

2) The Department of Health is interested in determining the percentage of children below
12 years old infected by the Hepatitis B/Polio virus in Dagupan City in 2017.
Population: set of all children below 12 years old in Dagupan City in 2017.
Variable of interest: whether or not the child has ever been infected by the Hepatitis
B/Polio virus.

3) The research division of a certain pharmaceutical company is investigating the
effectiveness of a new diet pill in reducing weight on female adults.
Population: set of all female adults who will use the diet pill
Variable of interest: Weight before taking the diet pill, weight after taking the diet pill.

Sample is a subset of the population.

Variable is a characteristic or attribute of the elements in a collection that can assume different
values for the different elements.

An observation is a realized value of the variable.

Data is the collection of observations.

Descriptive Statistics includes all the techniques used in organizing, summarizing and presenting
the data in hand.

Example:

1. The Philippine Atmospheric, Geophysical and Astronomical Services Administration
(PAGASA) measures the daily amount of rainfall in millimeters. They can use descriptive
statistics to compute the average daily amount of rainfall every month for the past year.
They can use the results to describe the amount of rainfall for the past year.

2. Given the daily sales performance for a product for the previous year we can draw a line
chart or a column chart (bar) to emphasize the upward/downward movement of the
series. Likewise, we can use descriptive statistics to calculate a quantity index per quarter
to compare sales by quarter for the previous year.
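
The rainfall example can be sketched in a few lines of Python. This is only an illustration: the readings below are made-up values, not PAGASA data, and the function name `monthly_average` is an assumption introduced for this sketch.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical daily rainfall readings as (month, millimeters) pairs.
daily_rainfall = [
    ("Jan", 0.0), ("Jan", 2.5), ("Jan", 1.0),
    ("Feb", 0.0), ("Feb", 0.5),
    ("Jul", 12.0), ("Jul", 30.5), ("Jul", 8.0),
]

def monthly_average(readings):
    """Group the daily readings by month and average each group."""
    by_month = defaultdict(list)
    for month, mm in readings:
        by_month[month].append(mm)
    return {month: mean(values) for month, values in by_month.items()}

averages = monthly_average(daily_rainfall)
for month, avg in averages.items():
    print(f"{month}: {avg:.2f} mm per day")
```

The summary describes only the data in hand; no conclusion is drawn about any other year, which is what separates it from inferential statistics.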

Inferential Statistics includes all the techniques used in analyzing the sample data that will lead
to generalizations about the population from which the sample came.

Example:

1. To examine the performance of the country’s financial system, we can use inferential
statistics to arrive at conclusions that apply to the entire economy using the data gathered
from a sample of companies or businesses in the country.

2. To determine if reforestation is effective, we can take a representative portion of
denuded forests and use inferential statistics to draw conclusions about the effect of
reforestation on all denuded forests.

3. The research division of a certain pharmaceutical company is investigating the
effectiveness of a new diet pill in reducing weight on female adults.

Statistical inquiry is a designed research that provides information needed to solve a research
problem.

STEP 1
Identify the Problem

STEP 2
Plan the Study

STEP 3
Collect the Data

STEP 4
Explore the Data

STEP 5
Analyze data and interpret the results

STEP 6
Present the results

Figure 1
Steps in statistical inquiry

1) Identify the Problem


The researchers need to define and state the problem in a clear manner so that they can
arrive at appropriate solutions and recommendations later on. This is an important step in
statistical inquiry because it is at this point that the researchers establish the HEART of the whole
research.

Example:

1) The researchers want to determine if there is an association between the price and
production of lumber. They would have to answer the following questions first to
obtain a precise statement of the problem:
a) What kind of lumber will be included in the study? Will all types of lumber be
included or just one specific type?
b) Will the study include the whole production of lumber or only lumber produced
for sale?
c) What price for lumber will be used, the market price or the factory price?
d) What is the scope of the study? Are all the regions of the country included or just
a specific region or province only?
e) What period is covered in the study?

The statement of the research problem is usually in the form of a question, but the researcher can
also state it in the form of a statement or general aim. One way of further refining the
statement of the problem is by formulating a hypothesis.

Example:
a) In the form of a question
“What are the factors affecting the job performance of an employee?”
“What is the relationship between wheat yield and amount of fertilizer?”
“What are the short and long term effects of the expanded value added tax on the economy
of the country?”

b) In the form of a statement or general aim:
“This study proposes to describe the relationship among job satisfaction, salary, quality of
relationship with the supervisor, and job performance.”
“This study aims to predict next month’s supply/demand of rice.”

c) In the form of a hypothesis:
“The average grade of students in the Mathematics examination is higher using the new
teaching materials instead of the old teaching materials.”
“The average expenditure of households in district I is higher than that of district II.”
“Sodium content in cereals produced by a certain company exceeds the required daily limit.”
“Children whose parents smoke are more likely to smoke in their adult life compared to
children whose parents do not smoke.”

2) Plan the Study


Some statistical inquiries do not reach completion or do not succeed in arriving at
useful information for sound decision making because of the researchers’ failure to
plan the study carefully. In coming up with a plan, the researchers need to take into
consideration all of the outputs in Step 1, namely the stated research problem and
the specific objectives.
The concrete output in Step 2 is the investigators’ research design. The research
design is a detailed discussion of the methods and strategies for data collection and
analysis that the investigators plan to use in order to meet all of the specific objectives
stated in Step 1. An effective research design is as simple as possible and at the same
time, cost efficient. With a simple research design, the investigators get to avoid
complications and errors in its implementation. With a cost-efficient research design
the investigators are confident in completing the study within the allotted budget and
time without sacrificing the quality of the information they get.

The Basic Elements of a Research Design are the Following:

• A list of variables in the study and, whenever necessary, the design of the instrument that will be used
to measure them.
A commonly used measurement instrument in a statistical inquiry is the QUESTIONNAIRE.

• The data collection method that the researchers will use.
The investigators may use documented data in published or unpublished studies. However, if the
investigators cannot find any available relevant and reliable data, then they would have to collect
their own data.
The most common data collection methods are surveys, experiments and observation.
In the survey method, the investigator collects data by asking the selected respondents a set of
questions.
In an experiment, the investigator collects data under a controlled environment. There is direct
human intervention on the variables that may affect the observed values of the variable of
interest.
In the observation method, the investigator collects data by observing and recording the
phenomenon of interest during the actual time of occurrence.

• Sampling design, if data will be collected from a sample.
The sampling design describes the method the investigators will use in selecting the elements
included in the sample and specifies the number of elements in the sample.

• Experimental design, if data will be collected through an experiment.
The experimental design describes how the investigators will determine the treatment that each
one of the participants of the study will receive.

• Methods for data analysis.
In addition to being able to meet the research objectives, the choice of the data analysis method
must take into consideration the data collection method, type of data obtained, assumptions
about the data in the formulation of the statistical technique, and the availability of the tools or
statistical software needed in performing the analysis.
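
As a rough illustration of the sampling design and experimental design elements above, the sketch below draws a simple random sample and then assigns treatments at random. Simple random sampling and balanced random assignment are only one possible choice of design; the population labels, treatment names, sample size, and function names are all hypothetical.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Select n distinct elements, each subset of size n being equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def randomize_treatments(participants, treatments, seed=None):
    """Shuffle the participants, then deal them to the treatments in turn
    so that the group sizes stay balanced."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    return {p: treatments[i % len(treatments)] for i, p in enumerate(shuffled)}

# Hypothetical sampling frame of 200 elements.
population = [f"graduate_{i:03d}" for i in range(1, 201)]

sample = simple_random_sample(population, 20, seed=1)
assignment = randomize_treatments(sample, ["new materials", "old materials"], seed=2)

for participant, treatment in sorted(assignment.items()):
    print(participant, "->", treatment)
```

Fixing the seed makes the selection reproducible, which helps when the sampling procedure has to be documented in the research design.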

3) Collect the Data


The investigators carry out the plans specified in the research design in the data
collection. In addition, the investigators take extra measures to ensure the quality of
the data collected. If the collected data were incomplete, outdated, inaccurate, or
worse yet, fabricated, then it will be useless to proceed with data analysis. Thus, the
investigators have to make sure that everyone involved in data collection has a
genuine appreciation of the quality of data.

4) Explore the Data


Prior to data analysis the investigators need to explore and understand the essential
features of their data. This process allows them to determine if their data satisfy the
assumptions made in the derivation of the statistical technique that they will use for
analysis. This process will also reveal to them if their data exhibit any peculiarities that
will create problems in the analysis.

5) Analyze and Interpret the Results


The investigators once more carry out the plans specified in the research design but
this time on the data analysis. They then examine all of the results on tables, charts,
estimated summary measures and tests of hypotheses. They need to check that they
were able to meet all of the specific objectives stated in Step 1. Based on the analysis
carried out, the investigators must be able to answer the research problem and give
recommendations on how this can be useful in decision making.

The investigators must double-check the results that contradict existing theory or the
earlier hypothesis made. They may have committed errors in data collection or
analysis. If not, they would have to propose possible explanations for these results or
suggest future statistical inquiries that could help explain the inconsistency.

6) Present the Results

After analyzing and interpreting the results, the investigators must present these
results in a clear and concise manner to the users of the research. All the time and
effort spent in conducting the inquiry will be in vain if the investigators do not
articulate what all the figures convey and how the presented information can be
useful for decision making. The presentation must also include a discussion of the
whole research process, from Step 1 to Step 5. This will help users evaluate for
themselves the reliability and credibility of the presented information.

The three ways of presenting results are textual, tabular, and graphical.
Textual presentation involves stating the results in paragraph form.
Tabular presentation involves showing the figures in rows and columns so that the
reader can easily comprehend the points made.
Graphical presentation involves placing data in graphic form to help the reader
visualize other important features of the data without having to look at too many
figures.

It is a capital mistake to theorize before one has data – Sir Arthur Conan Doyle

Measurement is the process of determining the value or label of the variable based
on what has been observed.

The Ratio level of measurement has all the following properties:

a) The numbers in the system are used to classify a person/object into
distinct, non-overlapping, and exhaustive categories;
b) The system arranges the categories according to magnitude;
c) The system has a fixed unit of measurement representing a set size throughout the
scale; and
d) The system has an absolute zero.

Example:
1) Allowance of the students (in pesos)
2) Distance travelled (in km)
3) Speed of the car (in km/hr)
4) Weight of a newborn baby (in kg)

Interval level of measurement satisfies only the first three properties of the ratio scale.

The only difference between the interval and ratio levels is the interpretation of the value
0 (zero) in their scales. The zero point in the interval scale is NOT AN ABSOLUTE ZERO. Unlike in
the ratio scale, the zero value in the interval scale has an ARBITRARY INTERPRETATION and does
not mean the absence of the property we are measuring.
Example: temperature in degrees Celsius, where 0 °C does not mean that there is no temperature.
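
A small worked example may help show why the zero point matters. The sketch below uses temperature in degrees Celsius as an interval-scale variable (an illustrative choice, not taken from the examples above): differences are meaningful, but ratios change once the arbitrary zero is moved.

```python
def celsius_to_kelvin(c):
    """Shift the arbitrary Celsius zero to the absolute zero of the Kelvin scale."""
    return c + 273.15

c_low, c_high = 10.0, 20.0

# On the interval scale, the ratio looks like "twice as hot", but it is not meaningful.
naive_ratio = c_high / c_low

# The same two temperatures on a scale with an absolute zero give a different ratio.
kelvin_ratio = celsius_to_kelvin(c_high) / celsius_to_kelvin(c_low)

# Differences, however, are the same on both scales.
diff_celsius = c_high - c_low
diff_kelvin = celsius_to_kelvin(c_high) - celsius_to_kelvin(c_low)

print(f"Ratio in Celsius: {naive_ratio:.3f}")
print(f"Ratio in Kelvin:  {kelvin_ratio:.3f}")
print(f"Difference in Celsius: {diff_celsius:.2f}, in Kelvin: {diff_kelvin:.2f}")
```

For a ratio-scale variable such as weight in kg, the ratio of two measurements stays the same whatever unit is used, because zero always means the absence of weight.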

Ordinal level of measurement satisfies only the first two properties of the ratio scale.
Example:
1) Performance rating of a salesperson measured as follows: 1 for “excellent”, 2 for
“very good”, 3 for “good”, 4 for “satisfactory”, and 5 for “poor”.
2) Faculty rank of a teacher measured as follows: 1 for Professor, 2 for Associate
Professor, 3 for Assistant Professor, and 4 for Instructor.
3) Ranking of a student in class according to academic performance as 1st, 2nd, 3rd, and
so on.

Nominal level of measurement satisfies only the first property of the ratio level.

A parametric statistical test makes assumptions about the population parameters and the
distribution that the data came from. Tests of this type include Student’s t-tests and ANOVA,
which assume that the data come from a normal distribution.

In contrast, a nonparametric test does not assume that the data follow a specific
distribution. Nonparametric tests include the chi-square test, Fisher’s exact test and the
Mann-Whitney test.

Do not put your faith in what statistics say until you have carefully considered what
they do not say.
William W. Watt

Hypothesis Testing

A statistical hypothesis is a conjecture concerning one or more populations whose veracity can be
established using sample data. The null hypothesis, denoted as H0, is a statistical hypothesis which the
researcher doubts to be true. The alternative hypothesis, denoted as Ha, is the operational statement of
the theory that the researcher believes to be true and wishes to prove; it contradicts the null
hypothesis.

In hypothesis testing, the level of significance is the criterion used to decide whether to reject
the null hypothesis. Because the decision is based on sample data, 100% certainty is not possible, so we
select a level of significance in advance, usually 1% or 5%. The level of significance is the maximum
probability of committing a Type I error. This probability is symbolized by α (Greek letter alpha);
that is, P(Type I error) = α.

After the significance level is chosen, a critical value is selected from a table for the
appropriate test statistic. The critical value separates the critical region from the noncritical
region. The critical or rejection region is the range of values of the test statistic that indicates
that there is a significant difference and that the null hypothesis (H0) should be rejected. In contrast,
the noncritical or nonrejection region is the range of values of the test statistic that indicates that
the difference was probably due to chance and that the null hypothesis (H0) should not be rejected.

A Type I error occurs if one rejects the null hypothesis when it is true. The probability of a Type I
error is denoted by alpha (α). On the curve of the sampling distribution under H0, the area over the
critical region is called the alpha region.

A Type II error occurs if one does not reject the null hypothesis when it is false. The probability of a
Type II error is denoted by beta (β). On the curve of the sampling distribution under the alternative
hypothesis, the area over the nonrejection region is called the beta region.

Table 2: Possible Outcomes of a Hypothesis Test

Statistical Decision    H0: True                  H0: False
Do not reject H0        Correct decision          Type II error
                        Confidence = 1 - α        P(Type II error) = β
Reject H0               Type I error              Correct decision
                        P(Type I error) = α       Power = 1 - β
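
The four outcomes in the table can be checked by simulation. The sketch below is a minimal illustration under stated assumptions: a two-tailed z-test with a known population standard deviation, and hypothetical values for the hypothesized mean (100), the standard deviation (15), the sample size (25), and the true mean used when H0 is false (109). The function name `rejection_rate` is introduced only for this sketch.

```python
import random
from statistics import NormalDist

def rejection_rate(true_mean, mu0=100.0, sigma=15.0, n=25,
                   alpha=0.05, trials=4000, seed=0):
    """Share of simulated samples in which a two-tailed z-test rejects H0: mu = mu0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for the chosen alpha
    se = sigma / n ** 0.5                          # standard error of the sample mean
    rejections = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(true_mean, sigma) for _ in range(n)) / n
        z = (sample_mean - mu0) / se
        if abs(z) >= z_crit:
            rejections += 1
    return rejections / trials

# H0 is true: every rejection is a Type I error, so the rate estimates alpha.
type_i_rate = rejection_rate(true_mean=100.0)

# H0 is false: every rejection is a correct decision, so the rate estimates power.
power = rejection_rate(true_mean=109.0)
type_ii_rate = 1 - power                           # estimate of beta

print(f"Estimated P(Type I error): {type_i_rate:.3f}")
print(f"Estimated power:           {power:.3f}")
print(f"Estimated P(Type II error): {type_ii_rate:.3f}")
```

With α fixed at 0.05, the estimated Type I error rate should stay near 0.05, while β depends on how far the true mean is from the hypothesized one and on the sample size.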

The hypothesis testing situation can be compared to a court trial. In a court trial, there are four
possible outcomes. The defendant is either guilty or innocent, and will be convicted or acquitted.
The hypotheses are H0: the defendant is innocent, and Ha: the defendant is not innocent.

Statistical Decision          H0 true (innocent)      H0 false (not innocent)
Do not reject H0 (acquit)     Correct decision        Type II error
Reject H0 (convict)           Type I error            Correct decision

Next, the prosecutor will present the evidence and based on this evidence, the judge decides
the verdict, innocent or guilty. If the defendant is acquitted and did not commit the crime, a
correct decision has been made by the judge. On the other hand, if the defendant is acquitted
and has committed the crime, then a Type II error has been made.

If the defendant is convicted but did not commit the crime, then a Type I error has been
committed. On the contrary, if the defendant is convicted and has committed the crime, then a
correct decision has been made.

Statistics are like bikinis. What they reveal is suggestive, but what they
conceal is vital.
- Aaron Levenstein, as quoted in Nature Genetics

Assumptions for Statistical Tests

Most of the statistical tests we will perform are based on a set of assumptions. When these assumptions
are violated the results of the analysis can be misleading or completely erroneous.

Typical assumptions are:

• Normality: Data have a normal distribution (or at least are symmetric)
• Homogeneity of variances (Homoscedasticity): Data from multiple groups have the same variance
• Linearity: Data have a linear relationship
• Independence: Data are independent

We explore in detail what it means for data to be normally distributed in Normal Distribution, but in
general it means that the graph of the data has the shape of a bell curve. Such data are symmetric around
the mean and have excess kurtosis equal to zero.

Testing for Normality and Symmetry

Since a number of the most common statistical tests rely on the normality of a sample or population, it is
often useful to test whether the underlying distribution is normal, or at least symmetric. This can be done
via the following approaches:

• Review the distribution graphically (via histograms, boxplots, QQ plots)
• Analyze the skewness and kurtosis
• Employ statistical tests (esp. Chi-square, Kolmogorov-Smirnov, Shapiro-Wilk, Jarque-Bera,
D’Agostino-Pearson)

If data is not symmetric, sometimes it is useful to make a transformation whereby the transformed data
is symmetric and so can be analyzed more easily.
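
The skewness and kurtosis check listed above can be sketched as follows. This version uses the simple moment-based formulas; statistical packages often report bias-corrected variants, so their output may differ slightly for small samples. The two data sets are made up for illustration.

```python
from statistics import mean

def skewness_and_excess_kurtosis(data):
    """Moment-based skewness and excess kurtosis (kurtosis minus 3)."""
    n = len(data)
    m = mean(data)
    m2 = sum((x - m) ** 2 for x in data) / n   # second central moment
    m3 = sum((x - m) ** 3 for x in data) / n   # third central moment
    m4 = sum((x - m) ** 4 for x in data) / n   # fourth central moment
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skew, excess_kurtosis

symmetric = [1, 2, 3, 4, 5, 6, 7]        # hypothetical symmetric data
right_skewed = [1, 1, 1, 2, 2, 3, 10]    # hypothetical data with a long right tail

sym_skew, sym_kurt = skewness_and_excess_kurtosis(symmetric)
rs_skew, rs_kurt = skewness_and_excess_kurtosis(right_skewed)

print(f"Symmetric data:    skewness = {sym_skew:.3f}, excess kurtosis = {sym_kurt:.3f}")
print(f"Right-skewed data: skewness = {rs_skew:.3f}, excess kurtosis = {rs_kurt:.3f}")
```

Skewness near zero points to symmetry, and excess kurtosis near zero is what a normal distribution would give; values far from zero suggest looking at a transformation or a nonparametric test.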

Some tests (e.g. ANOVA) require that the groups of data being studied have the same variance.
In Homogeneity of Variances we provide some tests for determining whether groups of data have the
same variance.

Some tests (e.g. regression) require that there be a linear relationship between the dependent and
independent variables. Generally linearity can be tested graphically using scatter diagrams or via other
techniques explored in Correlation, Regression and Multiple Regression.

We touch on the notion of independence in Definition 3 of Basic Probability Concepts. In general, data are
independent when there is no correlation between them (see Correlation). Many tests require that data
be randomly sampled with each data element selected independently of data previously selected. E.g. if
we measure the monthly weight of 10 people over the course of 5 months, these 50 observations are not
independent since repeated measurements from the same people are not independent. Also the IQ of 20
married couples doesn’t constitute 40 independent observations.

Another approach for addressing problems with assumptions is by transforming the data
(see Transformations).

There are different approaches to the decision rule.

Using the p-value method:

• If p-value ≤ α, reject H0, and if p-value > α, do not reject H0.

Using the critical value method:

• If the computed test statistic falls in the critical region (for a two-tailed test, if its absolute
value is ≥ the critical value), reject H0; otherwise, do not reject H0.

Using the confidence interval method:

• When the confidence interval contains the hypothesized mean, do not reject H0.
• When the confidence interval does not contain the hypothesized mean, reject H0.
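
The three decision rules lead to the same conclusion when they are applied to the same test. The sketch below shows this for a two-tailed one-sample z-test with a known population standard deviation, which is an assumption made to keep the example within the Python standard library; the sample mean, hypothesized mean, standard deviation, and sample size are hypothetical.

```python
from statistics import NormalDist

def z_test_decisions(sample_mean, mu0, sigma, n, alpha=0.05):
    """Apply the p-value, critical value, and confidence interval rules
    to a two-tailed one-sample z-test of H0: mu = mu0."""
    std_normal = NormalDist()
    se = sigma / n ** 0.5
    z = (sample_mean - mu0) / se
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    ci = (sample_mean - z_crit * se, sample_mean + z_crit * se)
    return {
        "z": z,
        "p_value": p_value,
        "z_crit": z_crit,
        "ci": ci,
        "reject_by_p_value": p_value <= alpha,
        "reject_by_critical_value": abs(z) >= z_crit,
        "reject_by_confidence_interval": not (ci[0] <= mu0 <= ci[1]),
    }

result = z_test_decisions(sample_mean=104.0, mu0=100.0, sigma=15.0, n=64)

print(f"z = {result['z']:.3f}, critical value = {result['z_crit']:.3f}")
print(f"p-value = {result['p_value']:.4f}")
print(f"95% CI = ({result['ci'][0]:.3f}, {result['ci'][1]:.3f})")
print("Reject H0 by each rule:",
      result["reject_by_p_value"],
      result["reject_by_critical_value"],
      result["reject_by_confidence_interval"])
```

For a t-test the logic is the same, but the critical value and p-value come from the t distribution with the appropriate degrees of freedom instead of the standard normal distribution.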

D.E.A.R Method

Define

Example

Apply

Report

There are three types of t-tests:

Independent samples t-test – compares the means of two independent populations

• Is there a mean difference in SAT scores between freshman and sophomore students?

Repeated measures t-test, a.k.a. paired samples t-test – one group of people tested more than once

• If a group of people are given a medication for high cholesterol, does their average cholesterol
level decrease after one month?

One sample t-test – used to compare a sample mean with a known population mean or some other
meaningful, fixed value.

• Does the class of 2015 have higher or lower SAT scores than the SAT scores of all students?
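
A paired samples t-test can be computed as a one sample t-test on the differences, which the sketch below illustrates. The cholesterol readings are hypothetical values invented for the example, and the function names are assumptions of this sketch.

```python
from statistics import mean, stdev

def one_sample_t(values, mu0=0.0):
    """t statistic and degrees of freedom for H0: population mean = mu0."""
    n = len(values)
    standard_error = stdev(values) / n ** 0.5
    return (mean(values) - mu0) / standard_error, n - 1

def paired_t(before, after):
    """Paired samples t-test: a one sample t-test on the after-minus-before differences."""
    differences = [a - b for b, a in zip(before, after)]
    return one_sample_t(differences, mu0=0.0)

# Hypothetical cholesterol levels of six people before and after one month of medication.
before = [240, 255, 230, 262, 248, 251]
after = [228, 250, 231, 247, 240, 239]

t_paired, df_paired = paired_t(before, after)
print(f"t({df_paired}) = {t_paired:.3f}")
```

A negative t here means the average level after the medication is lower than before; the value would then be compared with the critical t for 5 degrees of freedom at the chosen level of significance.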

Problem: Stephen Schmidt (1994) conducted a series of experiments examining the effects of humor on
memory. He collected a set of humorous sentences and then modified each one to produce a
nonhumorous version of the same sentence. The humorous sentences were then presented to one group
of participants and the nonhumorous sentences were presented to another group. Each group was given
a test to determine how many sentences they could recall. Data similar to those obtained by Schmidt are
shown in the following table.

Number of sentences recalled

Humorous sentences:     4 5 2 4 6 7 6 6 2 5 4 3 3 3 5 3
Nonhumorous sentences:  5 2 4 2 2 3 1 5 3 2 3 3 4 1 5 3

a. Do the data provide enough evidence to conclude that humor has a significant effect on memory? Use
a two-tailed test at the 0.05 level of significance.
b. Calculate Cohen’s d to evaluate the size of the effect.
c. Calculate the percentage of variance explained by the treatment, r², to measure the effect size.
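
The three parts of the problem can be worked out with a short sketch. It assumes that each digit in the table is one participant's recall score (16 scores per group) and that the two groups have equal population variances, so a pooled-variance independent samples t-test applies. The critical value 2.042 is the tabled two-tailed value for df = 30 at α = 0.05.

```python
from statistics import mean, variance

# Recall scores read from the table, one digit per participant (an assumption).
humorous = [4, 5, 2, 4, 6, 7, 6, 6, 2, 5, 4, 3, 3, 3, 5, 3]
nonhumorous = [5, 2, 4, 2, 2, 3, 1, 5, 3, 2, 3, 3, 4, 1, 5, 3]

def pooled_t_test(a, b):
    """Independent samples t-test with pooled variance, plus Cohen's d and r squared."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    standard_error = (pooled_var * (1 / n1 + 1 / n2)) ** 0.5
    mean_diff = mean(a) - mean(b)
    t = mean_diff / standard_error
    d = mean_diff / pooled_var ** 0.5      # Cohen's d uses the pooled standard deviation
    r2 = t ** 2 / (t ** 2 + df)            # proportion of variance explained
    return t, df, d, r2

t, df, d, r2 = pooled_t_test(humorous, nonhumorous)
T_CRIT = 2.042  # two-tailed critical value, df = 30, alpha = 0.05 (from a t table)

print(f"t({df}) = {t:.2f}, d = {d:.2f}, r2 = {r2:.2f}")
print("Reject H0" if abs(t) >= T_CRIT else "Do not reject H0")
```

Run as written, the sketch gives t(30) ≈ 2.48, d ≈ 0.88 and r² ≈ 0.17, so the computed t exceeds the tabled critical value.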

State statistical result: t(df) = test statistic, p-value, effect size

t(30) = 2.48, p = 0.02, d = 0.88

State the results in sentence form

Participants who were presented with humorous sentences (M = 4.25, SD = 1.53, n = 16) recalled
significantly more sentences than those who were presented with nonhumorous sentences (M = 3.00,
SD = 1.32, n = 16). The effect size was large. A power of 90% or 0.90 would indicate that if the study
were conducted 10 times, it would likely produce similar results (i.e. statistically significant) 9 times.

State decision about the null hypothesis

There is sufficient evidence to reject the null hypothesis.

EFFECT SIZE

• Statistical significance tells us that the effect is unlikely to be due to chance alone (“real”)
• Effect size is a measure of how large the effect was (“real big”)
• A difference can be statistically significant and still not be practically impressive

REPORT

A one-way analysis of variance was conducted to evaluate the null hypothesis that there is no
difference in high school students’ level of satisfaction with school based on their family’s socioeconomic
status (N = 435). The independent variable, socioeconomic status, included three groups: Low (M = 21.36,
SD = 4.55, n = 147), Moderate (M = 22.10, SD = , n = 153), and High (M = 26.73, SD = 5.85, n = 135).

The assumption of normality was evaluated using histograms (see Figure No.) and found tenable
for all groups. The assumption of homogeneity of variances was tested and found tenable using Levene’s
test, F(2, 432) = 0.75, p = 0.48. The ANOVA was significant, F(2, 432) = 4.64, p = 0.01, η² = 0.02. Thus, there
is significant evidence to reject the null hypothesis and conclude that there is a significant difference in
high school students’ level of satisfaction with school based on their family’s socioeconomic status.
However, the actual difference in the mean scores between groups was quite small based on Cohen’s
(1988) conventions for interpreting effect size.

Post hoc comparisons to evaluate pairwise differences among group means were conducted
with the use of the Tukey HSD test since equal variances were tenable. Tests revealed significant pairwise
differences between the mean scores of students who come from families with low socioeconomic status
and students who come from families with high socioeconomic status, p < 0.05. Students who come from
families with moderate socioeconomic status do not significantly differ from the other two groups, p > 0.05.

GUIDE

A one-way analysis of variance was conducted to evaluate the null hypothesis <state the null hypothesis>
(N = total number of observations). The independent variable, <identify the independent variable>,
included <number of groups> groups: <first category> (M = ______, SD = _____, n = ____), <2nd category>
(M = _____, SD =____ , n = _____), and <3rd category> (M = _____, SD = ______, n = _____).

The assumption of normality was evaluated using histograms (see Figure No.) and found tenable
(justifiable, supportable, arguable, reasonable, viable, workable, credible, acceptable) for all
groups. The assumption of homogeneity of variances was tested and found tenable using Levene’s test,
F(df1, df2) = <Levene’s statistic>, p = <sig value>. The ANOVA was significant, F(df1, df2) = <from ANOVA
table>, p = <sig value from the ANOVA table>, η² = <effect size (eta squared), computed as the ratio
of the between-groups sum of squares to the total sum of squares>. Thus, there is significant evidence to reject
the null hypothesis and conclude that there is a significant difference between <dependent variable and
independent variable>. However, the actual difference in the mean scores between groups was quite
<small, medium or large> based on Cohen’s (1988) conventions for interpreting effect size.

Post hoc comparisons to evaluate pairwise differences among group means were conducted
with the use of the Tukey HSD test since equal variances were tenable. Tests revealed significant pairwise
differences between the mean scores of students who come from families with <one group versus
the other group based on the multiple comparison table of the post hoc test>, p < 0.05. Students who come from
<other group/s> do not significantly differ from the <other group/s>, p > 0.05.

*tenable – able to be maintained or defended against attack or objection
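
The F statistic and η² that go into the guide can be computed from the sums of squares, as in the sketch below. The three groups of satisfaction scores are hypothetical and much smaller than the groups in the report; the p-value and the Tukey HSD comparisons would still come from statistical software or tables.

```python
from statistics import mean

def one_way_anova(groups):
    """Return F, its degrees of freedom, and eta squared for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ss_total = ss_between + ss_within
    df1 = len(groups) - 1
    df2 = len(all_values) - len(groups)
    f_stat = (ss_between / df1) / (ss_within / df2)
    eta_squared = ss_between / ss_total
    return f_stat, df1, df2, eta_squared

# Hypothetical satisfaction scores for three socioeconomic groups.
low = [20, 22, 19, 21, 23]
moderate = [22, 24, 21, 23, 25]
high = [26, 28, 25, 27, 29]

f_stat, df1, df2, eta_squared = one_way_anova([low, moderate, high])
print(f"F({df1}, {df2}) = {f_stat:.2f}, eta squared = {eta_squared:.2f}")
```

The degrees of freedom follow the pattern in the guide: df1 is the number of groups minus one, and df2 is the total number of observations minus the number of groups.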

LUI STAT TAO


louis_tattao@dmmmsu.edu.ph
