Lesson 5: Finding Answers Through Data Collection
Being deeply knowledgeable on one subject narrows one’s focus and
increases confidence, but it also blurs dissenting views until they are no
longer visible, thereby transforming data collection into biased confirmation
and morphing self-deception into self-assurance. – Michael Shermer
• How do you gather and
analyze data using
suitable techniques?

Guide Question:
I can…
• Collect data using appropriate instruments
• Present and interpret data in tabular and
graphical forms
• Use statistical techniques to analyze data:
the study of differences and relationships,
limited to bivariate analysis

Objectives:
• In conducting quantitative
research, how important is the
data collection instrument
in gathering data
appropriate for the study?

Starter:
• In a quantitative study where you
are to identify the relationship between
adolescents’ social media usage
patterns and their level of school
performance, what data collection
instrument is suitable to get the data?
Explain your answer.

Reinforcer:
• Following the research
problem in the Reinforcer
section, what statistical test
will you use to analyze the
data? Explain your answer.

Challenger:
• The class will be divided into groups and a
leader will be chosen to facilitate the tasks
for this group activity.

• Following the last activity, each group will
be asked to answer the following
questions:

Activity 1: Conducting a Class Survey


• Data Collection Technique and Procedure
  • What research techniques will be appropriate to
    get the data?
  • How will you gather your data?
• Data Collection Instrument
  • What data collection instrument will be suitable
    for you to gather your data?
• Data Analysis
  • How will you analyze your data?
  • What statistical test will you apply?

• You shall be given 15 minutes to
accomplish the task. Use the manila
paper to present your output.
• Each group will be given 3-5 minutes
to present the group output.

• The instrument must be suitable to the
research design and must be based on
the theoretical framework of the study.

Important reminder when choosing the instrument to use:
QUESTIONNAIRE
• Composed of a series of questions used
to gather information, answered and
filled out by all the participants in the sample
• Can be in oral or written form
• Most common and widely used
• Best to use when collecting data from a large
sample and when crossing geographical
limitations

Instruments used in Data Collection:


INTERVIEW GUIDE
• Helps you direct the conversation towards the
topic and issues you want to learn about
• May vary from highly scripted to relatively
loose
• Helps you to know what to ask about, in what
sequence, how to pose your questions and
how to pose follow-ups
• Provides guidance about what to do or say next
after your interviewee has answered the last
question

TYPES OF INTERVIEW
• Unstructured Interview – more
conversational, may sometimes take
long to finish, and is conducted in an
everyday situation
• Structured Interview – always operates
within a formal written instrument

OBSERVATION – may also be used in quantitative
research with the use of an observation guide.
• Unstructured Observation – the observer monitors all
aspects of the phenomenon that seem relevant to the
problem at hand (e.g. observing children playing with a
new toy)
• Structured Observation – specifies in detail what is to
be observed and how the measurements are to be
recorded (e.g. an auditor performing inventory analysis
in a store)
• Structured observation requires preparation of
record-keeping forms such as category systems,
checklists, and rating scales. The researcher usually has
some prior knowledge about the behavior or event of interest.

RECORDS
• Refers to all the numbers and statistics that
the institutions, organizations, and people
keep as a record of their activities such as
census data, educational records,
hospital/clinic records, and the like

EXPERIMENTAL APPROACH
• Used for testing hypotheses about causal
relationships among variables
• The researcher controls the independent
variable and observes the effect on the
dependent variable

• Data processing most often refers to processes that
convert data into information or knowledge.
• INFORMATION – refers to either a
meaningful answer to a query or a meaningful
stimulus that can cascade into further
queries.
• The purpose of the data analysis and data
interpretation is to transform the data
collected into credible and understandable
evidence about the development of the study.

DATA PROCESSING
• TEXTUAL PRESENTATION – the
highlights of the data are written in
paragraph form. Textual presentation
emphasizes the important numbers and
makes meaningful comparisons.

Ways of Presenting the Data Collected


• TABULAR PRESENTATION – shows a
summary of the data collected, presented in a
manner that is more understandable and
more appealing.

Total Sample Size: 24

Gender
  Male: 11 (46%)
  Female: 13 (54%)
Course
  Fine Arts: 9 (37%)
  Architecture: 6 (25%)
  Journalism: 4 (17%)
  Communication Arts: 5 (20%)
School
  UP: 3 (12%)
  ADMU: 4 (17%)
  DLSU: 3 (12%)
  UST: 5 (20%)
  PNU: 4 (17%)
  UE: 5 (20%)
Attended the 2016 Summer Arts Seminar-Workshop
  Yes: 18 (75%)
  No: 6 (25%)
Role in the 2016 Seminar-Workshop on Arts
  Speaker: 4 (17%)
  Organizer: 3 (12%)
  Demonstrator: 5 (20%)
  Participant: 12 (50%)
Satisfaction with the demonstration and practice exercises
  Strongly Agree: 11 (46%)
  Agree: 5 (20%)
  Neutral: 2 (8%)
  Disagree: 4 (17%)
  Strongly Disagree: 2 (8%)

Number of Hours of Using
Social Media in a Week      Tally       Frequency
Less than 6 hours           //          2
6-10 hours                  ////        4
11-15 hours                 //// ///    8
16-20 hours                 /           1
More than 20 hours          ///         3

• FREQUENCY DISTRIBUTION TABLE –
constructed for a tabular presentation and includes a
summary of the data showing the number of items or
frequency in each of the non-overlapping groupings
or classes.

Question: Do you use social media as the primary means to communicate
with your friends?

Measurement Scale     Code    Frequency Distribution    Percent Distribution
Strongly Agree        1       14                        58%
Agree                 2       3                         12%
Neutral               3       2                         8%
Disagree              4       1                         4%
Strongly Disagree     5       4                         17%


Question: How often do you use social media in a week?

Measurement Scale     Code    Frequency Distribution    Percent Distribution
Less than 6 hours     1       14                        58%
6-10 hours            2       3                         12%
11-15 hours           3       2                         8%
16-20 hours           4       1                         4%
More than 20 hours    5       4                         17%

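
A frequency and percent distribution like the ones above can be tallied from raw coded responses. The following is a minimal sketch, not a prescribed procedure: the `responses` list is hypothetical and is arranged only so that it reproduces the frequencies shown in the table (14, 3, 2, 1, 4), and the function name `frequency_table` is an illustrative choice.

```python
from collections import Counter

# Scale labels keyed by code, following the first table above.
SCALE = {1: "Strongly Agree", 2: "Agree", 3: "Neutral",
         4: "Disagree", 5: "Strongly Disagree"}

# Hypothetical raw responses (coded 1-5), arranged to match the table's frequencies.
responses = [1] * 14 + [2] * 3 + [3] * 2 + [4] * 1 + [5] * 4


def frequency_table(codes):
    """Return (code, frequency, percent) rows for every code in SCALE."""
    counts = Counter(codes)
    total = len(codes)
    return [(code, counts[code], 100 * counts[code] / total) for code in SCALE]


table = frequency_table(responses)
for code, freq, pct in table:
    print(f"{SCALE[code]:<18} {code:>4} {freq:>9} {pct:>7.1f}%")
```

Each percent is simply the class frequency divided by the total number of respondents (24 here), multiplied by 100.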
• GRAPHICAL PRESENTATION – used to
show numerical values or relationships.

• PIE CHARTS –
graphs that are used
for presenting relative
frequency
distributions for a
given data set.

• HISTOGRAM –
a graphical method
for displaying the shape
of a distribution. It is
particularly useful when
there are a large number
of observations.

• LINE GRAPH – best used
in showing specific values
of time series data. They
use line segments to
connect the data points
and show how one variable
is affected by the other as
it increases or decreases.

• PICTOGRAPH – a
way of presenting
data using symbols
or pictures to
represent the frequency
for each category.

• STEM PLOT –
also called
stem and leaf
display. It is
usually used
when you want
to preserve all
or the first few
digits of the
data values.
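
The idea of keeping the digits of each value can be sketched in a few lines. This is only an illustration under stated assumptions: the `scores` list is hypothetical, the values are assumed to be non-negative integers, the tens digit(s) serve as the stem and the ones digit as the leaf, and stems with no leaves are simply omitted.

```python
from collections import defaultdict


def stem_plot(values):
    """Group non-negative integers by stem (tens) and leaf (ones digit)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return dict(sorted(stems.items()))


# Hypothetical test scores, used only for illustration.
scores = [12, 15, 21, 23, 23, 28, 34, 35, 41]
plot = stem_plot(scores)
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Because every leaf is printed, the original values can be read back from the display, which is what distinguishes a stem plot from a histogram.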

Quiz #3
• Parametric Assumptions – assumptions
about a parameter, that is, a quantity that
describes a population and is estimated from data
• Independent, unbiased samples - Two
samples are independent if the sample
values selected from one population are
not related or somehow paired or matched
with the sample values selected from the
other population.
• Data normally distributed - A normal
distribution, sometimes called the bell
curve, is a distribution that occurs naturally
in many situations. For example, the bell
curve is seen in tests like the SAT and
GRE. The bulk of students will score
the average (C), while smaller numbers of
students will score a B or D. An even
smaller percentage of students score an F
or an A. This creates a distribution that
resembles a bell (hence the nickname).
The bell curve is symmetrical. Half of the
data will fall to the left of the mean; half
will fall to the right.
• Equal variances - Variance refers
to the data spread or scatter.
Statistical tests, such as analysis of
variance (ANOVA), assume that
although different samples can
come from populations with
different means, they have the
same variance. Equal variances
(homoscedasticity) is when the
variances are approximately the
same across the samples.
• Continuous Data - Continuous
Data can take any value (within a
range).
• Discrete/Categorical Data -
Discrete Data can only take
certain values.
• Chi-square test - used to
determine if there is a significant
relationship between two nominal
(categorical) variables.
• True independent variable -
occurs when subjects arrive for
the study and the experimenter
randomly assigns them to groups.
• Regression Analysis - a set of
statistical processes for estimating
the relationships among variables
• Correlational Analysis - a method
of statistical evaluation used to
study the strength of a relationship
between two, numerically
measured, continuous variables
• Parametric - Parametric statistical
procedures rely on assumptions
about the shape of the distribution
(i.e., assume a normal
distribution) in the underlying
population and about the form or
parameters (i.e., means and
standard deviations) of the
assumed distribution.
• Non-parametric - Nonparametric
statistical procedures rely on no
or few assumptions about the
shape or parameters of the
population distribution from
which the sample was drawn.
• Pearson’s r/Pearson's Product
Moment Correlation - The
Pearson correlation evaluates the
linear relationship between two
continuous variables. A
relationship is linear when a
change in one variable is
associated with a proportional
change in the other variable. For
example, you might use a Pearson
correlation to evaluate whether
increases in temperature at your
production facility are associated
with decreasing thickness of your
chocolate coating.
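
As a rough illustration of the definition above, Pearson's r can be computed directly from the deviations of each variable from its mean. The data below are hypothetical values invented for the temperature and coating example, and `pearson_r` is an illustrative helper name rather than a library function.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation for two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: facility temperature vs coating thickness.
temperature = [20, 22, 24, 26, 28]
thickness = [5.0, 4.8, 4.5, 4.1, 3.9]
r = pearson_r(temperature, thickness)
print(f"r = {r:.3f}")
```

A value of r close to -1 here would indicate that thickness tends to fall as temperature rises, along a nearly straight line.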
• Spearman’s Rank-Order
Correlation - The Spearman
correlation evaluates the
monotonic relationship between
two continuous or ordinal
variables. In a monotonic
relationship, the variables tend to
change together, but not
necessarily at a constant rate. The
Spearman correlation coefficient
is based on the ranked values for
each variable rather than the raw
data.
• Spearman correlation is often
used to evaluate relationships
involving ordinal variables. For
example, you might use a
Spearman correlation to evaluate
whether the order in which
employees complete a test
exercise is related to the number
of months they have been
employed.
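
Since the Spearman coefficient is based on ranks, one way to sketch it is to rank each variable and then apply the Pearson formula to the ranks. The data are hypothetical values for the employee example, tied values receive the average of their ranks, and both function names are illustrative.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson's r computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: order of finishing a test exercise vs months employed.
finish_order = [1, 2, 3, 4, 5, 6]
months_employed = [30, 24, 26, 12, 9, 3]
rho = spearman_rho(finish_order, months_employed)
print(f"rho = {rho:.3f}")
```

Because only the ordering matters, any monotonic relationship (for example y = x squared for positive x) gives a coefficient of exactly 1 or -1 even when the relationship is not linear.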
• Equal Variance - when the
variances are approximately the
same across the samples.
• Fmax test - also called Hartley’s
Fmax, is a test for homogeneity
of variance. In other words, the
spread (variance) of your data
should be similar across groups or
levels.
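
The Fmax statistic itself is simply the largest group variance divided by the smallest. The sketch below uses hypothetical groups of equal size (Hartley's test is usually described for equal group sizes); `hartley_fmax` is an illustrative name, and the result would still need to be compared against a critical value from an Fmax table.

```python
from statistics import variance


def hartley_fmax(*groups):
    """Ratio of the largest to the smallest sample variance across groups."""
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)


# Hypothetical scores for three groups of equal size.
group_a = [10, 12, 14, 16, 18]
group_b = [11, 12, 13, 14, 15]
group_c = [8, 12, 14, 16, 20]
fmax = hartley_fmax(group_a, group_b, group_c)
print(f"Fmax = {fmax:.2f}")
```

A ratio near 1 suggests similar spread across groups; the larger the ratio, the more doubtful the equal-variance assumption becomes.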
• Brown and Forsythe’s test - The
Brown-Forsythe (B-F) Test is for
testing the assumption of equal
variances in ANOVA.
• Bartlett’s test - used to test
that variances are equal for all
samples. It checks that
the assumption of equal
variances is true before running
certain statistical tests like
the One-Way ANOVA. It’s used
when you’re fairly certain your
data comes from a normal
distribution.
• Means - average of a set of
data
• Paired t-test - Paired means
that you will look at the
differences between the two
groups. A paired test first
calculates the difference from
one group to the other, and then
runs a one-sample t test on
those differences.
• Unpaired t-test - Unpaired
means that you simply
compare the two groups. So,
you will build a model for each
group (calculate the mean and
variance), and see whether
there is a difference.
• Paired t-test - You should use
a paired t test if you use a within-
subject design. What a paired t
test does is to take the differences
between the paired data in the two
groups and test whether the mean
of those differences is significantly
different from zero, using the t
distribution. Because it
uses the differences between the
groups, a paired t test does not
assume the variances of the
population of the two groups are
equal. But it still assumes
normality (of the differences). The
null hypothesis is that there is no
difference in the means between
the two groups. If the p value is less
than 0.05, you reject the null
hypothesis, and say that you find
a significant difference.
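
The difference between the two t tests can be seen in how the t statistic is built. The sketch below is illustrative only: the scores are hypothetical, the unpaired version uses the pooled-variance (equal variances) form, and no p value is computed, so the result would be compared with a critical value from a t table at the stated degrees of freedom.

```python
import math
from statistics import mean, stdev, variance


def paired_t(before, after):
    """t statistic and degrees of freedom for a paired t test."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    # One-sample t test on the differences against a mean of zero.
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


def unpaired_t(x, y):
    """Student's t statistic using the pooled variance of two independent groups."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    t = (mean(x) - mean(y)) / math.sqrt(pooled * (1 / nx + 1 / ny))
    return t, nx + ny - 2


# Hypothetical within-subject data: the same five students tested twice.
before = [70, 65, 80, 75, 60]
after = [74, 68, 85, 77, 66]
t_paired, df_paired = paired_t(before, after)

# Hypothetical independent groups.
section_a = [1, 2, 3, 4, 5]
section_b = [3, 4, 5, 6, 7]
t_unpaired, df_unpaired = unpaired_t(section_a, section_b)

print(f"paired:   t = {t_paired:.3f}, df = {df_paired}")
print(f"unpaired: t = {t_unpaired:.3f}, df = {df_unpaired}")
```

The paired version only ever looks at the per-subject differences, which is why it does not need the two groups to have equal variances.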
• Data transformation -
replacement of a variable by a
function of that variable: for
example, replacing a variable x
by the square root of x or the
logarithm of x. In a stronger
sense, a transformation is a
replacement that changes the
shape of a distribution or
relationship.
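
To see how a transformation changes the shape of a distribution, the sketch below applies the logarithm and the square root to a small right-skewed data set and compares a simple skewness measure before and after. The `raw` values are hypothetical, and `skewness` here is the plain moment-based coefficient written out by hand for illustration.

```python
import math
from statistics import mean, pstdev


def skewness(values):
    """Moment-based skewness: mean of the cubed standardized deviations."""
    m, s = mean(values), pstdev(values)
    return mean(((v - m) / s) ** 3 for v in values)


# Hypothetical right-skewed data, for illustration only.
raw = [1, 2, 2, 3, 4, 5, 8, 20, 60]

log_values = [math.log10(x) for x in raw]   # requires x > 0
sqrt_values = [math.sqrt(x) for x in raw]   # requires x >= 0

print(f"skewness raw:  {skewness(raw):.2f}")
print(f"skewness sqrt: {skewness(sqrt_values):.2f}")
print(f"skewness log:  {skewness(log_values):.2f}")
```

For data like these, the long right tail is pulled in by both transformations, with the logarithm having the stronger effect.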
• Mann-Whitney U - used to
compare differences between two
independent groups when the
dependent variable is either
ordinal or continuous, but not
normally distributed.
• Wilcoxon Rank sums test - a
nonparametric alternative to the
two sample t-test which is based
solely on the order in which the
observations from the two
samples fall.
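
Both the Mann-Whitney U and the Wilcoxon rank sum rest on the same step: pool the two samples, rank them, and add up the ranks of one sample. A minimal sketch with hypothetical data follows; it returns the statistics only and does not compute a p value, and `rank_sum_and_u` is an illustrative name.

```python
def rank_sum_and_u(x, y):
    """Wilcoxon rank sum of x and the Mann-Whitney U statistic for x versus y."""
    combined = sorted(list(x) + list(y))
    # Average 1-based rank for each distinct value (handles ties).
    ranks = {}
    i = 0
    while i < len(combined):
        j = i
        while j + 1 < len(combined) and combined[j + 1] == combined[i]:
            j += 1
        ranks[combined[i]] = (i + j) / 2 + 1
        i = j + 1
    w = sum(ranks[v] for v in x)
    u = w - len(x) * (len(x) + 1) / 2
    return w, u


# Hypothetical ordinal scores for two independent groups.
group_1 = [3, 5, 8]
group_2 = [4, 6, 9, 10]
w, u = rank_sum_and_u(group_1, group_2)
print(f"rank sum W = {w}, U = {u}")
```

Here U equals the number of (group 1, group 2) pairs in which the group 1 value is the larger one, which is why only the ordering of the observations matters.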
• ANOVA - Analysis of Variance
(ANOVA) is a statistical method
used to test differences between
two or more means.
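
The F statistic in a one-way ANOVA compares the variation between group means with the variation within groups. The following sketch uses three small hypothetical groups; `one_way_anova_f` is an illustrative name, and the resulting F would be compared against an F table at the returned degrees of freedom.

```python
from statistics import mean


def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical scores for three groups.
f_stat, df_b, df_w = one_way_anova_f([1, 2, 3], [2, 3, 4], [5, 6, 7])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F means the group means differ by more than the spread inside the groups would explain, but it does not say which groups differ; that is the job of a post hoc test.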
• Post hoc test - Post hoc (Latin,
meaning “after this”) tests are run
after the main analysis to examine
the results of your experimental
data more closely. They are often
based on a familywise error rate:
the probability of at least one
Type I error in a set (family) of
comparisons.
• Tukey’s - The purpose of Tukey’s
test is to figure out which groups
in your sample differ. It uses the
“Honest Significant Difference,”
a number that represents the
distance between groups, to
compare every mean with every
other mean.
• Bonferroni’s - This multiple-
comparison post-hoc
correction is used when you
are performing many
independent or dependent
statistical tests at the same
time.
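
In its simplest form, the Bonferroni correction divides the significance level by the number of tests and judges each p value against that stricter threshold. The p values below are hypothetical and `bonferroni` is an illustrative helper name.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the adjusted threshold and which p values remain significant."""
    threshold = alpha / len(p_values)
    return threshold, [p < threshold for p in p_values]


# Hypothetical p values from four comparisons.
p_values = [0.01, 0.04, 0.03, 0.005]
threshold, significant = bonferroni(p_values)
print(f"adjusted threshold = {threshold}")
print(significant)
```

With four tests, a result must reach p < 0.0125 rather than p < 0.05, so two of the comparisons that looked significant on their own no longer count.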
• Kruskal-Wallis - used when the
assumptions of one-way
ANOVA are not met.
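
The Kruskal-Wallis H statistic replaces the raw values with ranks across all groups combined. The sketch below is a simplified illustration with hypothetical data: it omits the correction for ties, and H is usually compared with a chi-square critical value at (number of groups - 1) degrees of freedom, an approximation that works better with larger groups than the ones shown.

```python
def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic (no tie correction) and degrees of freedom."""
    combined = sorted(v for g in groups for v in g)
    # Average 1-based rank for each distinct value.
    ranks = {}
    i = 0
    while i < len(combined):
        j = i
        while j + 1 < len(combined) and combined[j + 1] == combined[i]:
            j += 1
        ranks[combined[i]] = (i + j) / 2 + 1
        i = j + 1
    n = len(combined)
    rank_term = sum(sum(ranks[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)
    return h, len(groups) - 1


# Hypothetical ordinal scores for three groups.
h_stat, df = kruskal_wallis_h([1, 2, 3], [4, 5, 6], [7, 8, 9])
print(f"H = {h_stat:.2f}, df = {df}")
```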
• Dunn’s test - a post hoc (i.e.
it’s run after an overall test such as
Kruskal-Wallis or ANOVA) nonparametric
test (a “distribution free” test that
doesn’t assume your data comes
from a particular distribution).
Choosing the Appropriate Statistical Test/s

Type of Data                # of Groups     Test Hypothesis For    Statistical Test
Ratio/Interval              2               Correlation            Kendall’s Tau / Pearson’s r
Ratio/Interval              2               Variances              Fmax test
Ratio/Interval              2               Means                  T-test
Ratio/Interval              2+              Means                  Analysis of Variance (ANOVA)
Ordinal                     2               Correlation            Spearman’s rho
Ordinal                     2+              Differences            Kruskal-Wallis ANOVA
Nominal (frequency data)    2 categories    Association            Chi-Square

Output Variable

                  Nominal                     Ordinal                         Interval–Ratio
Nominal           Chi-square                  Mann Whitney, Kruskal-Wallis    Unpaired t-test or Mann Whitney
Ordinal           Chi-Square, Mann Whitney    Spearman Rank                   Linear regression or Spearman
Interval–Ratio    Logistic Regression         Poisson Regression              Pearson’s r, Linear Regression, or Spearman

Statistical Test Alternatives: Parametric – Non Parametric


• Null Hypothesis: There is no association
between gender and soft drink preference.
• Type of Data: Gender and soft drink brand
are both nominal variables
• Statistical Test: Chi-Square
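
For an example like this one, the chi-square statistic compares the observed counts in a gender by brand table with the counts expected if the two variables were unrelated. The counts below are hypothetical, `chi_square` is an illustrative helper name, and the statistic would be compared with a chi-square critical value (3.841 at the 0.05 level for 1 degree of freedom).

```python
def chi_square(observed):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (obs - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, df


# Hypothetical counts: rows are Male, Female; columns are Brand A, Brand B.
observed = [[10, 20],
            [20, 10]]
chi2, df = chi_square(observed)
print(f"chi-square = {chi2:.3f}, df = {df}")
```

With these made-up counts the statistic exceeds 3.841, so the null hypothesis of no association would be rejected at the 0.05 level.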

Examples to illustrate a statistical test


• Null Hypothesis: There is no correlation
between Mathematics score and the
number of hours spent in studying the
Mathematics subject.
• Type of Data: Math score and number of
hours are both ratio variables
• Statistical Test: Pearson’s r or Kendall’s
Tau

• Null Hypothesis: There is no difference
between the Mathematics scores of
Sections A and B
• Type of Data: Math scores of both sections
A and B are ratio variables.
• Statistical Test: T-test

Encode your data properly and correctly after choosing a
specific statistical test to analyze your data.
• Review data entries.
• Verify the manner of data
collection.
• Avoid biased results.

How to ensure data accuracy and integrity?
Before the performance task, the
students will be asked to bring out a
yellow pad. Then, they will be asked to
write down the null hypothesis of their
group. When everyone has a copy of it,
the next slide shall be shown.

Performance Task 4 (Individual Activity)


A. Based on your group’s topic, provide answers to the
following: (2 points)
• Type of Data:
• Statistical Test:
B. Write the type of data and the appropriate statistical test for
the following null hypotheses: (6 points)
1. There is no difference between the reading comprehension
scores of Sections Loving and Family-Oriented.
2. There is no relationship between the social media usage
pattern and the number of sleeping hours of Grade 12
students.
3. There is no association between religion and the mall
preference of individuals.
