STATISTICS IN RESEARCH

Elena M. Manaig

Approaches to Research

Most research is done in one of two ways:

1. Two or more groups are compared, as in varietal tests, where the
characteristics of two or more varieties are compared, and in fertilizer
experiments using different forms or types of fertilizer, different levels, etc.

2. Variables within one group are related.

Examples: Relating length of panicles and weight of individual grains
          Determining the relationship between feed consumption and gain in
          weight

Comparing Groups: Quantitative Data


When two or more groups are compared, the comparison can be made in a
variety of ways: through frequency polygons, calculation of one or more measures
of central tendency (averages) and calculation of one or more measures of
variability (spreads).

Once descriptive statistics have been calculated, they must be interpreted. At
this point, the task is to describe in words what the polygons, averages and
spreads tell about the question/problem or hypothesis being investigated. Key
questions arise: How large does a difference in means between two groups have to
be in order to be important? When will this difference make a difference? How
does one decide?

a. Use information about known groups.

b. Calculate the effect size:

   Effect size = (mean of experimental group – mean of comparison group)
                 / (standard deviation of comparison group)

c. Use inferential statistics. Use tests of significance only to judge the
generalizability of results and not to evaluate the magnitude of difference
between sample means.
d. In making conclusions, sometimes there is a need to reiterate the condition
under which the characteristics being compared were tested.
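
The effect size in item b can be computed directly from two sets of measurements. The sketch below is only an illustration: the yields are hypothetical values invented for the example, and it assumes the standard deviation of the comparison group is computed with the n – 1 divisor.

```python
from statistics import mean, stdev


def effect_size(experimental, comparison):
    """(mean of experimental - mean of comparison) / SD of comparison group."""
    # Assumption: stdev() uses the n - 1 divisor (sample standard deviation).
    return (mean(experimental) - mean(comparison)) / stdev(comparison)


# Hypothetical yields in cav/ha, invented only for this illustration.
fertilized = [92, 95, 91, 96, 94, 93]
unfertilized = [88, 90, 86, 91, 89, 87]

es = effect_size(fertilized, unfertilized)
print(round(es, 2))
```

The result expresses the difference between the two means in units of the comparison group's standard deviation, which is what makes it comparable across studies that use different measurement scales.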

Relating Variables within a Group: Quantitative Data

Whenever quantitative variables within a single group are examined, the
appropriate techniques are the scatterplot and the correlation coefficient. In
interpreting scatterplots and correlation coefficients, the question is similar
to that in interpreting differences between means: How large must a coefficient
of correlation be to suggest an important or significant relationship? What does
an important relationship look like in a scatterplot?

Inferential statistics must be calculated only if the researcher can give a
convincing argument that the relationship found in the sample is important. The
test of significance must be used to judge generalizability and not to evaluate
the magnitude of the relationship.

Interpretation of Correlation Coefficients When Testing Research Hypotheses

Magnitude of r    Interpretation
.00 to .40        Of little practical importance except in unusual
                  circumstances; perhaps of theoretical value
.41 to .60        Large enough to be of practical as well as theoretical use
.61 to .80        Very important
.81 to 1.00       If not an error in calculation, a very sizable relationship
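
The interpretation table can be written as a small lookup. This is a sketch only: the function name is hypothetical, and it assumes that the sign of r is ignored so that only its magnitude is judged.

```python
def interpret_r(r):
    """Return the table's interpretation for a correlation coefficient r."""
    size = abs(r)  # assumption: only the magnitude matters, not the sign
    if size > 1:
        raise ValueError("r must lie between -1 and 1")
    if size <= 0.40:
        return "of little practical importance"
    if size <= 0.60:
        return "of practical as well as theoretical use"
    if size <= 0.80:
        return "very important"
    return "very sizable relationship (check for calculation error)"


print(interpret_r(0.55))
```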

Comparing Groups: Categorical Data

Groups may be compared when the data involved are categorical or qualitative by
reporting either percentages or proportions and frequencies in crossbreak or
contingency tables. The summary statistics must be interpreted carefully, even
percentages. Percentages may be misleading unless the number of cases is also
given.

Relating Variables within a Group: Categorical Data

The procedures available are the same as those for comparing groups: percentages
or proportions and frequencies in crossbreak tables. An example is determining
whether the gender of farmers is related to their extension exposure.

Table 1. Distribution of plants by fertilizer application and degree of damage
by pest

                        Damage
                        Severe    Mild
Without fertilizer      12        20
With fertilizer         18        14

Table 2. Distribution of farmers by educational attainment and adoption of
technology

                        Adopter   Non-adopter
Elementary
High School
College
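
As an illustration, the counts of plants by fertilizer application and degree of damage (12, 20, 18 and 14) can be summarized as row percentages and tested with chi square. The sketch uses only those counts; the variable names are chosen for the example.

```python
# Counts from Table 1: rows = fertilizer application, columns = degree of damage.
observed = {
    "Without fertilizer": {"Severe": 12, "Mild": 20},
    "With fertilizer": {"Severe": 18, "Mild": 14},
}
columns = ["Severe", "Mild"]

row_totals = {r: sum(cells.values()) for r, cells in observed.items()}
col_totals = {c: sum(observed[r][c] for r in observed) for c in columns}
n = sum(row_totals.values())

# Row percentages, to be reported together with the number of cases.
row_percent = {
    r: {c: 100 * observed[r][c] / row_totals[r] for c in columns}
    for r in observed
}

# Chi-square statistic: sum of (observed - expected)^2 / expected,
# where expected = row total * column total / n.
chi_square = 0.0
for r in observed:
    for c in columns:
        expected = row_totals[r] * col_totals[c] / n
        chi_square += (observed[r][c] - expected) ** 2 / expected

for r in observed:
    print(r, row_percent[r], "n =", row_totals[r])
print("chi-square =", round(chi_square, 3))
```

With one degree of freedom the tabulated chi-square value at the 5% level is 3.841. The statistic computed here is smaller, so on these counts alone the difference in damage between the two groups would not be judged significant at that level.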

Summary of commonly used statistical techniques:

                           Quantitative Data        Categorical Data

Two or more groups are compared
Descriptive statistics     Frequency polygons       Percentages
                           Averages                 Bar graphs
                           Spreads                  Pie charts
                           Effect size              Crossbreak (contingency)
                                                    tables
Inferential statistics     t-test for means         Chi square
                           ANOVA
                           ANACOVA
                           Confidence interval
                           Mann-Whitney U test
                           Kruskal-Wallis ANOVA
                           Sign test
                           Friedman ANOVA

Relationships among variables are studied within one group
Descriptive statistics     Scatterplot              Crossbreak (contingency)
                                                    tables
                           Correlation              Contingency coefficient
Inferential statistics     t-test for r             Chi square
                           Confidence interval

REVIEW OF BASIC STATISTICS

Types of Data:
1. Qualitative or Categorical data differ in kind but not in degree or amount.
They are collected on qualitative variables.
2. Quantitative data vary in degree, amount or magnitude. These are further
classified into discrete and continuous. They are collected on quantitative
variables.

Classification of variables on which data can be collected:

People
  Qualitative/Categorical: Sex, civil status, ethnicity, organizational
    affiliation, occupation, attitude
  Quantitative, discrete: Household size, no. of dependents, frequency of farm
    visit
  Quantitative, continuous: Age, height, income, expenses on food, amount of
    time spent on the farm

Plants
  Qualitative/Categorical: Species, variety, color of flowers, shape of fruits,
    growth habit
  Quantitative, discrete: Population, no. of fruits, no. of tillers, no. of
    cavans, no. of croppings
  Quantitative, continuous: Height, weight of fruits, biomass, length of
    panicles, area occupied per plant

Animals
  Qualitative/Categorical: Breed, type (meat or dairy), color of hair, body
    condition, feeding habit
  Quantitative, discrete: Population, no. of eggs, litter size, no. of
    parasites, frequency of feeding
  Quantitative, continuous: Body weight, body length, feed consumption, feed
    efficiency

Barangay
  Qualitative/Categorical: Sources of income, types of existing organizations,
    resources, dominant religion
  Quantitative, discrete: Population, no. of voters, no. of project
    beneficiaries, crime incidence
  Quantitative, continuous: Income, land area, amount of taxes collected,
    population density

Levels of Data Measurement

Nominal
  Nature: Uses categories or classifications; no measure; order is
    meaningless; lowest level of data measurement since data cannot be
    transformed.
  Examples:
    1. Types of organization: Professional, Scientific, Honor Societies
    2. Animals raised: Chicken, Cattle, Swine, Goat
  Kind of analysis: Non-parametric
    -Chi square (χ²)
    -Percentages
    -etc.

Ordinal
  Nature: No measure; data are arranged in meaningful order, rank or classes,
    but the distances between each order or class are not equal; scales are
    mutually exclusive.
  Examples:
    1. Size expressed as small, medium and large
    2. Performance rating expressed as O, VS, S, F and P
    3. Class of municipality as A, B, C, D, E, F
  Kind of analysis: Non-parametric
    -Spearman’s rank correlation
    -Sums of Rank test
    -Kruskal-Wallis test

Interval
  Nature:
    -Categories are ordered or ranked using equality of distance
    -Classes are mutually exclusive
    -Higher level of data measurement than nominal and ordinal
    -Zero point has no true value
  Examples:
    1. Average monthly income: 5,000 – 10,000; 10,001 – 15,000;
       15,001 – 20,000
    2. Scales in measuring temperature
    3. Likert rating scale: 1 (Poor), 2 (Fair), 3 (Ave), 4 (VS), 5 (O)
  Kind of analysis: Parametric and non-parametric
    -Pearson’s correlation coefficient
    -ANOVA, t tests, z tests
    -Regression analysis

Ratio
  Nature:
    -Possesses the characteristics of nominal, ordinal and interval
    -Possesses true zero point
    -Highest level of data measurement
  Examples: Income, profits, height, weight, population density, income per
    month, daily expenses
  Kind of analysis: Parametric and non-parametric
    -Pearson’s correlation coefficient
    -ANOVA, t tests, z tests
    -Regression analysis

Divisions of the Field of Statistics

A. Descriptive Statistics- concerned with describing and summarizing data

Statistic- a sample observation. It is obtained from sampling or survey data.
Parameter- a population observation. It is obtained from total enumeration or a
census.

Techniques for summarizing quantitative data:

1. Use of frequency polygons and histograms. Frequency polygons can be
approximately normal (symmetrical), positively skewed (tail trails off to the
right) or negatively skewed (tail trails off to the left).
2. Use of averages or measures of central tendency. An average is a single
figure used in some definite way to represent the whole set of data. The
averages commonly used are the mean, median and mode.
▪ Mean is obtained by dividing the sum of all the scores by the number of
scores.
▪ Median is the middle item. It divides the set of data into two equal parts,
the upper 50% and the lower 50%.
▪ Mode is the most frequently occurring item.
3. Use of spreads or measures of variability.
▪ Range- represents the distance between the highest and lowest scores in a
set of data.
▪ Mean Absolute Deviation (MAD)- is the mean of the absolute deviations from a
central value, usually the mean.
▪ Standard deviation- is the square root of the mean of the squared deviations
from the mean (the sum of the squared deviations divided by n, or by n – 1
for a sample). Its square is the variance.

4. Use of correlation. This is used to determine the relationship between two
or more variables such as amount of rainfall and yield of rice, height and length
of panicles, etc.

5. Use of scatterplots. A scatterplot is a graphical or pictorial presentation
of the relationship between two quantitative variables.
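
The averages and spreads listed above can be computed with Python's standard `statistics` module. The sketch below uses the eight rice yields (cav/ha) from this handout's rainfall and yield table; the variable names are chosen for the example.

```python
from statistics import mean, median, multimode, pstdev, stdev

rice_yields = [87.9, 82.2, 86.3, 93.6, 83.8, 85.3, 90.9, 88.7]  # cav/ha

avg = mean(rice_yields)
mid = median(rice_yields)            # mean of the two middle items (n is even)
modes = multimode(rice_yields)       # every value occurs once, so all are returned
value_range = max(rice_yields) - min(rice_yields)
mad = mean(abs(x - avg) for x in rice_yields)  # mean absolute deviation
sd_population = pstdev(rice_yields)  # sum of squared deviations divided by n
sd_sample = stdev(rice_yields)       # sum of squared deviations divided by n - 1

print(round(avg, 2), round(mid, 2), round(value_range, 2), round(mad, 2))
print(round(sd_population, 2), round(sd_sample, 2))
```

Because no yield is repeated in this small data set, the mode is not informative here; it becomes useful with grouped or categorical data.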

Table 1. Average amount of rainfall per year and yield of rice

Rainfall    Yield of rice
(cm)        (cav/ha)
32.77       87.9
18.29       82.2
28.70       86.3
47.24       93.6
22.35       83.8
26.16       85.3
40.39       90.9
33.27       88.7
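
The strength of the relationship in the rainfall and yield data can be summarized with Pearson's correlation coefficient. The sketch below computes r from its definition using only the tabulated values; the function name is chosen for the example.

```python
from math import sqrt

rainfall = [32.77, 18.29, 28.70, 47.24, 22.35, 26.16, 40.39, 33.27]  # cm
rice_yield = [87.9, 82.2, 86.3, 93.6, 83.8, 85.3, 90.9, 88.7]        # cav/ha


def pearson_r(x, y):
    """Pearson's r: sum of cross-products over the root of the sums of squares."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


r = pearson_r(rainfall, rice_yield)
print(round(r, 3))
```

The coefficient is close to 1, which agrees with the nearly straight upward trend that a scatterplot of these eight points would show.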

B. Inferential Statistics

Inferential statistics refers to certain types of procedures that allow
researchers to make inferences about a population based on findings from a
sample. As with descriptive statistics, the techniques differ depending on the
type or nature of the data and the purpose of the analysis. These include
estimation, hypothesis testing and prediction. The basic difficulty in
inferential statistics is sampling error: samples are virtually never identical
to their parent population. However, there are techniques to reduce sampling
error. These are a) randomization, b) increasing the sample size, and c)
stratification or blocking. It is also important that the researcher provide an
estimate of the standard error of the mean,

   s_x̄ = s / √(n – 1)

where s is the standard deviation of the sample and n is the sample size.

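
A short numerical check of the standard error of the mean follows. It assumes that the formula with n – 1 under the square root takes s as the standard deviation computed with divisor n; under that assumption it gives the same value as the sample standard deviation (divisor n – 1) divided by √n. The data are the rice yields from this handout's rainfall and yield table.

```python
from math import isclose, sqrt
from statistics import pstdev, stdev

rice_yields = [87.9, 82.2, 86.3, 93.6, 83.8, 85.3, 90.9, 88.7]  # cav/ha
n = len(rice_yields)

# s computed with divisor n, then divided by sqrt(n - 1)
se_from_sd_n = pstdev(rice_yields) / sqrt(n - 1)

# s computed with divisor n - 1, then divided by sqrt(n)
se_from_sd_n_minus_1 = stdev(rice_yields) / sqrt(n)

print(round(se_from_sd_n, 4), round(se_from_sd_n_minus_1, 4))
print(isclose(se_from_sd_n, se_from_sd_n_minus_1))
```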
Hypothesis Testing

1. Inferences about the population: The significance of the difference between
a population observation (parameter) and a sample observation (statistic) is
tested using the z test if the sample is large, that is, n ≥ 30, and the t test
if the sample is small, that is, n < 30.
Null form:
Ho: x̄ = μ (The sample mean is equal to the population mean), therefore,
Ho: x̄ – μ = 0
Ho: μ = 90 kg (For example, the mean weight of finished hogs is estimated to
be 90 kg)
Alternative form:
Ha: x̄ ≠ μ (The sample mean is not equal to the population mean), therefore,
Ha: x̄ – μ ≠ 0
Ha: μ ≠ 90 kg (For example, the mean weight of finished hogs is not 90 kg)

2. Inferences about two population means. Two groups are compared and the
hypothesis is tested using samples from each group or population.
Examples: - Comparing the average yields of two varieties of rice
          - Comparing the achievement of students under two teaching methods
          - Comparing the mean life spans of smokers and non-smokers
For studies like those mentioned above, the test used is the z test if the
samples from the populations being studied are both large. If one or both
samples are small, the appropriate test to use is the t test.
The hypothesis to be tested, in null form, is Ho: μ1 = μ2 or Ho: μ1 – μ2 = 0,
where the sample means x̄1 and x̄2 will be used. In alternative form, the
hypothesis is Ha: μ1 ≠ μ2 or Ha: μ1 – μ2 ≠ 0.

3. Inferences about three or more population means
Examples: - Comparing mean yields of four varieties of rice
            Ho: μ1 = μ2 = μ3 = μ4
            Ha: at least one mean differs from the others
          - Comparing the achievements of students under three teaching methods
            Ho: μ1 = μ2 = μ3
            Ha: at least one mean differs from the others
          - Comparing the yield of corn applied with five levels of fertilizer
            Ho: μ1 = μ2 = μ3 = μ4 = μ5
            Ha: at least one mean differs from the others
The test statistics to use for these hypotheses are:
One-way analysis of variance (1-W ANOVA) if the design of the experiment is a
completely randomized design (CRD).
Two-way analysis of variance (2-W ANOVA) if the design of the experiment is a
randomized complete block design (RCBD).
Three-way analysis of variance (3-W ANOVA) if the design of the experiment is a
Latin square design (LSD).

For experimental units where there is heterogeneity that cannot be handled by
blocking, and if the initial or contributing characteristic is measurable,
analysis of covariance (ANACOVA) is more effective than ANOVA.

Note: Varieties of ANOVA are also used for experimental designs other than the
three basic ones mentioned, e.g., ANOVA for a factorial experiment in split plot
design, in strip plot design, etc.

4. For inferences about a sample proportion, p, and a population proportion, P,
or between two population proportions, P1 and P2, the appropriate test
statistics are the z test if the samples used are large and the t test if one or
both samples are small.
Hypotheses can be stated mathematically as:
Ho: p = P for comparing sample and population proportions
Ho: P1 = P2 for comparing two population proportions

5. For inferences about three or more population proportions, the chi square
test is used.
Ho: P1 = P2 = P3 = . . . = Pn

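
The test statistics for inferences about means (cases 1 to 3 above) can be sketched as follows. All data here are hypothetical and invented for illustration. The two-sample version assumes equal population variances (the pooled t test), and the F ratio is for a one-way ANOVA in a completely randomized design.

```python
from math import sqrt
from statistics import mean, stdev, variance


def one_sample_t(sample, mu):
    """Test statistic for Ho: the population mean equals mu."""
    standard_error = stdev(sample) / sqrt(len(sample))
    return (mean(sample) - mu) / standard_error


def pooled_two_sample_t(sample1, sample2):
    """Test statistic for Ho: mu1 = mu2, assuming equal population variances."""
    n1, n2 = len(sample1), len(sample2)
    pooled_var = ((n1 - 1) * variance(sample1)
                  + (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(sample1) - mean(sample2)) / standard_error


def one_way_anova_f(groups):
    """F ratio (between-groups mean square over within-groups mean square)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical data, invented only for this illustration.
hog_weights = [92, 95, 91, 89, 94, 96, 90, 93, 97, 93]  # kg, Ho: mu = 90 kg
variety_1 = [85, 87, 83, 85]                            # yield per plot
variety_2 = [80, 82, 78, 80]
variety_3 = [90, 92, 88, 90]

t_one = one_sample_t(hog_weights, 90)
t_two = pooled_two_sample_t(variety_1, variety_2)
f_ratio = one_way_anova_f([variety_1, variety_2, variety_3])
print(round(t_one, 3), round(t_two, 3), round(f_ratio, 1))
```

Each statistic is then compared with the tabulated critical value for its degrees of freedom. At the 5% level these are 2.262 for t with 9 degrees of freedom (two-tailed), 2.447 for t with 6 degrees of freedom (two-tailed), and 4.26 for F with 2 and 9 degrees of freedom. With these invented data all three null hypotheses would be rejected.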
Methods of Data Presentation

After data are collected, grouped or categorized, they must be presented in
some way that is interesting and easy to understand for the reader or data
consumer.

Data can be presented in:

1. Textual Form. This makes use of statements or paragraphs and is appropriate
when only a few numbers or quantities are being presented. Language that is
simple, easy to understand, precise and accurate is recommended.
2. Tabular Form. This is used when many quantities are being presented. Data
are presented in rows, described by stubs, and columns, described by captions.
The table heading contains the table number and title. The title usually
indicates what is being presented, where the data refer and when the data apply.
3. Graphical Form. This is a pictorial presentation of data which effectively
shows quantities, percentages or proportions, and the relationship between two
variables. Several types are available: line, bar, pie, pictograph and
statistical maps.

Table 1. Average amount of rainfall and yield of rice

Rainfall    Yield of rice
(cm)        (cav/ha)
32.77       87.9
18.29       82.2
28.70       86.3
47.24       93.6
22.35       83.8
26.16       85.3
40.39       90.9
33.27       88.7

Table 2. Comparative yield of four varieties of rice

Variety     Yield
1           85
2           80
3           90
4           110

Table 3. No. of technology adopters by town

Town          No. of Adopters
Pangil        25
Siniloan      33
Mabitac       20
Famy          35
Sta. Maria    30

Table 4. Distribution of farmers according to yield of their rice crop

Yield, cav/ha    Classmark    No. of farmers
80-82            81           3
83-85            84           10
86-88            87           14
89-91            90           20
92-94            93           25
95-97            96           17
98-100           99           11
101-103          102          9
104-106          105          2

Research Designs for Survey

Three major characteristics of survey research:


1. Information is collected from a group of people in order to describe some
aspects or characteristics (awareness, beliefs, attitude) of the population of
which that group is a part.
2. The main way in which the information is collected is through asking
questions; the answers to these questions by the members of the group constitute
the data of the study.
3. Information is collected from a sample rather than from every member of the
population.

Types of Survey Research:


1. Cross-sectional surveys. In a cross-sectional survey, information is
collected at just one point in time, although the time it takes to collect all the
data desired may range from a day to a few months or more. When an entire
population is surveyed, it is called a census.

2. Longitudinal surveys. A longitudinal study collects information at
different points in time in order to study changes over time.

Three longitudinal designs are employed:

a) In a trend study, different samples from the same population are surveyed at
different points in time. The members of the population may change over
time.
b) In a cohort study, a specific population is followed over a period of time.
The members of the population are the same over the course of the study.
c) In a panel study, a specific population is followed over a period of time.
The researcher selects a sample right at the beginning of the study. The same
individuals are surveyed at different times during the course of the study.

Steps in Survey Research


1. Problem definition
2. Identification of the target population
3. Selection of the Sample
4. Preparation of the Instrument
5. Data Collection. This can be done by:
a) direct administration to a group
b) mail
c) telephone interview
d) interview by internet, e-mail
e) personal interview
6. Analysis and Interpretation of Data

Basic Experimental Research Designs


1. Completely Randomized Design (CRD). CRD is appropriate for homogeneous
experimental units. The treatments are assigned to plots or units at random.

2. Randomized Complete Block Design (RCBD). This is used when there is a
one-way pattern of heterogeneity. The blocks are oriented across the direction
of heterogeneity such that the heterogeneity within a block is minimized and the
heterogeneity between blocks is maximized. The treatments are assigned at random
within a block. All the treatments must be present in a block.

3. Latin Square Design (LSD). The experimental units are first arranged in
rows and columns and the treatments are assigned in such a way that no treatment
is repeated in a row as well as in a column.
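
The random assignment of treatments in the three basic designs can be sketched with Python's `random` module. The treatment labels, seed and function names are hypothetical. The Latin square is built by shuffling the rows and columns of a cyclic square, which gives a valid square but does not sample uniformly from all possible Latin squares.

```python
import random


def crd_layout(treatments, replications, rng):
    """CRD: every plot is equally likely to receive any treatment."""
    plots = [t for t in treatments for _ in range(replications)]
    rng.shuffle(plots)
    return plots


def rcbd_layout(treatments, n_blocks, rng):
    """RCBD: each block holds every treatment once, in a fresh random order."""
    return [rng.sample(treatments, len(treatments)) for _ in range(n_blocks)]


def latin_square(treatments, rng):
    """LSD: no treatment is repeated within any row or any column."""
    n = len(treatments)
    # Start from a cyclic square, then shuffle its rows and its columns.
    square = [[treatments[(i + j) % n] for j in range(n)] for i in range(n)]
    rng.shuffle(square)
    col_order = rng.sample(range(n), n)
    return [[row[c] for c in col_order] for row in square]


rng = random.Random(2024)          # fixed seed so the layout can be reproduced
treatments = ["A", "B", "C", "D"]  # hypothetical treatment labels

crd = crd_layout(treatments, 3, rng)
rcbd = rcbd_layout(treatments, 3, rng)
lsd = latin_square(treatments, rng)

print(crd)
for block in rcbd:
    print(block)
for row in lsd:
    print(row)
```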

Other Designs:
1. Split Plot Design. This involves assigning the levels of the first and less
important factor to the main plots using any of the basic designs, followed by
assigning the levels of the second and more important factor to the subplots
within the main plots. For illustration, consider a factorial experiment
involving three cropping systems and five pest control methods (3x5) following
an RCBD with four replications.

The steps are:

a) The area is first divided into four blocks (Block 1, Block 2, Block 3 and
Block 4).

b) Each of the four blocks is then further divided into three main plots.
Therefore, there are 12 main plots in all.
c) Each of the 12 main plots is then further divided into five subplots,
resulting in 60 subplots.

2. Strip Plot Design


a) One factor is randomly allocated to the horizontal strips.

b) The second factor is randomly allocated to the vertical strips
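
The split plot illustration (three cropping systems as main plots and five pest control methods as subplots, in four blocks) can be laid out as sketched below. The level labels and the seed are hypothetical; the randomization is done separately within each block and within each main plot.

```python
import random


def split_plot_layout(main_levels, sub_levels, n_blocks, rng):
    """Return (block, main-plot level, subplot level) for every subplot."""
    layout = []
    for block in range(1, n_blocks + 1):
        # Main-plot factor randomized within each block.
        for main in rng.sample(main_levels, len(main_levels)):
            # Subplot factor randomized within each main plot.
            for sub in rng.sample(sub_levels, len(sub_levels)):
                layout.append((block, main, sub))
    return layout


cropping_systems = ["C1", "C2", "C3"]           # hypothetical labels
pest_controls = ["P1", "P2", "P3", "P4", "P5"]  # hypothetical labels

layout = split_plot_layout(cropping_systems, pest_controls, 4, random.Random(7))
main_plots = {(block, main) for block, main, _ in layout}

print(len(main_plots), "main plots,", len(layout), "subplots")
```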

Uniqueness of Experimental Research


1. It is the only type of research that directly attempts to influence a
particular variable.
2. It is the only type that can really test hypotheses about cause-and-effect
relationships.

The independent variable is referred to as the experimental or treatment
variable, while the dependent variable, also called the criterion or outcome
variable, refers to the outcomes or results of the study.

The major characteristic of experimental research, which distinguishes it from
all other types of research, is that the researchers manipulate the independent
variable. They decide the nature of the treatment.
Examples:
1. In fertilizer experiments, the treatments can be the amount, method of
application, type or source.
2. In varietal tests, the researcher chooses the varieties to be subject to
testing and comparison
3. In experiments with animals, the treatments can be kind of ration, level of
energy, level of protein, frequency of feeding, method of feeding, lighting
regime, etc.
4. In educational research, the treatments can be method of teaching, schedule
of class, types of learning materials, rewards given to students, etc.

Requirements for a Valid Experimental Design

1. Comparison of Groups. An experiment usually involves two groups of
subjects, an experimental group and a control or comparison group, although it
is also possible to conduct an experiment with only one group or with three or
more groups. The experimental group receives a treatment of some sort while the
control group receives no treatment, or the comparison group receives a
different treatment. The control or comparison group is crucially important in
all experimental research because it serves the purpose of determining whether
the treatment has had an effect or whether one treatment is more effective than
another.
2. Manipulation of the Independent Variable. This is the second essential
characteristic of experimental research. Here the researcher deliberately and
directly determines what form, extent or level the independent variable will take
and then which group will receive which form or level. The independent variable
in an experimental study can be established in several ways:
a) one form of the variable versus another, e.g., comparing dry and wet feeding
methods in pigs; direct seeding and transplanting methods or organic and inorganic
fertilizer in crops; lecture and case study methods of instruction in economics
b) presence versus absence of a particular form, e.g., with and without
fertilizer (crop), with and without feed supplement (animal), with and without
visual aids (teaching), with and without MSG (food).
c) varying degrees of same form, e.g., effect of different levels of fertilizer
on the yield of crop, different amounts of hormones on the growth of animals,
different degrees of counseling on attitude of students, different amounts of
extension services on technology adoption by farmers.
3. Randomization. This refers to random assignment of subjects to groups or
treatments. Random assignment means that every individual participating in the
experiment (subject) has an equal chance of being assigned to any of the
experimental or control conditions being compared. Random assignment is different
from random selection. The latter means that every member of a population has an
equal chance to be included in the sample. Three things should be noted about
random assignment of subjects to groups:
a) it is done before the experiment begins
b) it is a process of assigning or distributing subjects, not a result of such
distribution
c) it allows the researcher to form groups that, right at the beginning of the
study, are equivalent, that is they differ only by chance.
4. Control of extraneous variables. Extraneous variables are factors other
than the treatments used that can be the cause of a result. For example, if the
treatment used is fertilizer levels, then the difference in soil fertility at
different spots in the experimental area is an extraneous variable. Control of
extraneous variables refers to controlling for threats to internal validity to
eliminate or minimize their possible effect on the dependent variable.

Ways to eliminate or minimize threats to internal validity due to extraneous
variables:
a) Random assignment
b) Holding certain variables constant
c) Building the variable into the design
d) Using subjects as their own controls
e) Analysis of covariance

5. Blocking or local control. This refers to grouping or stratifying the area
or experimental units in such a way that the heterogeneity within a block is
minimized and the heterogeneity between blocks is maximized.
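
The difference between random selection and random assignment described under Randomization can be sketched as follows. The population of 200 farmers, the sample of 20 and the seed are hypothetical values chosen for the example.

```python
import random

rng = random.Random(11)  # fixed seed so the example can be reproduced

# Hypothetical population of 200 farmers.
population = [f"farmer_{i:03d}" for i in range(1, 201)]

# Random selection: every member of the population has an equal chance
# of being included in the sample.
sample = rng.sample(population, 20)

# Random assignment: every subject in the sample has an equal chance of
# being placed in either group; done before the experiment begins.
shuffled = sample[:]
rng.shuffle(shuffled)
experimental_group = shuffled[:10]
control_group = shuffled[10:]

print(experimental_group)
print(control_group)
```

The two groups formed this way differ only by chance at the start of the study, which is the condition the requirement on randomization is meant to secure.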
