
STATISTICS

- Deals with the collection, organization, presentation, analysis, and interpretation of numerical data.
- As a branch of mathematics, it examines and investigates ways to process and analyze the data gathered.
- Is a basic tool of measurement, evaluation, and research.
- Is sometimes used to refer to any measure computed on the basis of data obtained from a characteristic of a population under study.

DIVISION OF STATISTICS
MAJOR AREAS
1. DESCRIPTIVE STATISTICS – is the totality of methods and treatments employed in the
collection, description, and analysis of numerical data.
2. INFERENTIAL STATISTICS – is the logical process from sample analysis to a generalization or
conclusion about a population. It is also called STATISTICAL INFERENCE OR INDUCTIVE
STATISTICS.

BASIC TERMS:
1. POPULATION – consists of all the members of the group about which you want to draw a
conclusion.
2. SAMPLE – Is a portion, or part, of the population of interest selected for analysis.
3. PARAMETER – is a numerical index describing a characteristic of a population.
4. STATISTIC – is a numerical index describing a characteristic of a sample

TWO MAJOR CHARACTERISTICS OF OBJECTS, PEOPLE, OR EVENTS:
5. CONSTANT – is a characteristic of objects, people or events that does not vary. (Example: the
temperature at which water boils is a constant)
6. VARIABLE – is a characteristic of objects, people, or events that can take on different values. It
can vary in quantity (example: weight of people) or in quality (example: hair color of people).

TWO TYPES OF RANDOM VARIABLES:


1. QUALITATIVE VARIABLE - a variable that is conceptualized and analyzed as distinct categories,
with no continuum implied. (CATEGORICAL VARIABLE) Example: eye color, gender, occupation,
religious preference, etc.
2. QUANTITATIVE VARIABLE – a variable that is conceptualized and analyzed along a continuum;
it differs in amount or degree. (NUMERICAL VARIABLE) Example: height, weight, salary,
age, etc.

CLASSIFICATION OF VARIABLES ACCORDING TO PURPOSE

A. EXPERIMENTAL CLASSIFICATION: a researcher may classify variables according to the
function they serve in the experiment.
1. INDEPENDENT VARIABLES – are variables controlled by the experimenter/researcher, and
expected to have an effect on the behavior of the subjects. (EXPLANATORY VARIABLE)
2. DEPENDENT VARIABLE – is some measure of the behavior of subjects and expected to be
influenced by the independent variable. (OUTCOME VARIABLE)
EXAMPLE: To predict the effect of fertilizer on the growth of plants, the dependent variable is the
growth of the plants, while the independent variable is the amount of fertilizer used.

B. MATHEMATICAL CLASSIFICATION: variables may also be classified in terms of the
mathematical values they may take on within a given interval.
1. CONTINUOUS VARIABLE – is a variable which can assume any of an infinite number of values,
and can be associated with points on a continuous line interval.
Example: height, weight, volume, etc
2. DISCRETE VARIABLE – is a variable which consists of either a finite number of values or
countable number of values.
Example: gender, courses, Olympic games, etc.

DATA – refers to the kinds of information researchers obtain on subjects of their research.

TWO MAIN SOURCES OF DATA:


1. PRIMARY DATA – are data that come from an original source and are intended to answer specific
research questions. They can be collected by interview, questionnaire, survey, or experimentation.
2. SECONDARY DATA – are data that are taken from previously recorded data, such as information
in research conducted, industry financial statements, business periodicals, and government reports.
It can also be taken electronically.

TYPES OF DATA ACCORDING TO FORM


1. QUANTITATIVE DATA – These are the data that are measured on a scale.
2. QUALITATIVE DATA – these are observations that can be classified into a single category or
a set of categories.
DATA COLLECTION METHODS:
1. OBSERVATION
2. INTERVIEW
3. QUESTIONNAIRE

LEVELS OF MEASUREMENT/MEASUREMENT OF SCALE


1. NOMINAL SCALE – is the simplest, and the most limited form of measurement researchers can
use. It is used to differentiate categories in order to show differences.
Example: gender: Male or Female, yes or no
2. ORDINAL SCALE – is one in which data are not only classified but also ordered in some way
– high to low or least to most.
Example: student class designation: freshmen, sophomore, junior, senior
Movie classification: G, PG, PG-13, R-18, X
3. INTERVAL SCALE – has the attributes of an ordinal scale plus another feature: the distances
between the points on the scale are equal. Interval data are either discrete or continuous.
Example: scores (70 and 75; 90 and 95)
4. RATIO SCALE – is similar to the interval scale, only it has an actual, or true, zero point which
indicates a total absence of the property being measured.
Example: weight (in pounds or kilograms); age (in years or days); salary (in pesos)

SAMPLING: refers to the process of selecting the subjects who will participate in a research study.

A. RANDOM SAMPLING: is a process in which every member of the population has an equal
chance of being selected; it is also called probability sampling.
1. SIMPLE RANDOM SAMPLING – is a process of selecting a sample of size n from the population
via random numbers or through a lottery.
2. SYSTEMATIC SAMPLING – is a process of selecting every kth element in the population until the
desired number of subjects or respondents is attained.
3. STRATIFIED SAMPLING is a process of subdividing the population into subgroups or strata and
drawing members at random from each subgroup or stratum.
4. CLUSTER SAMPLING – is a process of selecting clusters from a population which is very large
or widely spread out over a wide geographical area.
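The first three random sampling schemes above can be sketched with Python's standard `random` module (the population and sample sizes here are hypothetical; cluster sampling would instead select whole subgroups at once):

```python
import random

random.seed(1)  # for a reproducible illustration
population = list(range(1, 101))  # hypothetical population of 100 member IDs

# 1. Simple random sampling: every member has an equal chance of selection.
simple = random.sample(population, 10)

# 2. Systematic sampling: pick every kth element after a random start.
k = len(population) // 10          # k = 10
start = random.randrange(k)
systematic = population[start::k]  # 10 members, evenly spaced

# 3. Stratified sampling: subdivide into strata, then draw at random from each.
strata = {"first_half": population[:50], "second_half": population[50:]}
stratified = [random.choice(stratum) for stratum in strata.values()]

print(len(simple), len(systematic), len(stratified))
```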

B. NON-RANDOM SAMPLING – is a sampling procedure in which samples are selected in a
deliberate manner, with little or no attention to randomization. It is also called non-probability
sampling.

1. CONVENIENCE SAMPLING – is a process of selecting a group of individuals who (conveniently)
are available for study.

2. PURPOSIVE SAMPLING – is a process of selecting a sample which the researcher believes,
based on prior information, will provide the data needed. It is also called judgment sampling.

3. QUOTA SAMPLING – is applied when an investigator collects information from an assigned
number, or quota, of individuals from one of several sample units fulfilling certain prescribed
criteria belonging to one stratum.

4. SNOWBALL SAMPLING – is a technique in which one or more members of a population are
located and used to lead the researchers to other members of the population.
5. VOLUNTARY SAMPLING – is a technique in which the sample is composed of respondents who
self-select into the study or survey.

6. JUDGMENT SAMPLING – is a technique in which the researcher relies on his/her personal,
sound judgment in choosing participants for the study, or the sample selected is based on the
opinion of an expert.

METHODS OF COLLECTING DATA


1. DIRECT OR INTERVIEW METHOD – is a face-to-face encounter between the interviewer and the
interviewee.

2. INDIRECT OR QUESTIONNAIRE METHOD – utilizes a questionnaire to obtain information.

3. REGISTRATION METHOD – governed by laws (examples: birth certificates and licenses).

4. OBSERVATION METHOD – is used to gather data pertaining to the behavior of an individual or
group.

5. EXPERIMENT METHOD – is used to determine the cause-and-effect relationship of certain
phenomena under controlled conditions.

METHODS OF PRESENTING DATA


1. TEXTUAL METHOD - this method presents the collected data in narrative and paragraph forms.

2. TABULAR METHOD – this method presents data in tables, with entries arranged in an orderly
way in rows and columns.

3. GRAPHICAL METHOD – Presents the collected data in visual or pictorial form to get a clear view
of data
POPULATION – refers to the entire group or a set of individuals or items to whom the researchers
would like to generalize the results of the study.
EXAMPLE:
RESEARCH PROBLEM: The Effects of Multimedia Instruction on the Mathematical Achievement
of Grade 7 Junior High School Students in the Division of City Schools, Cabuyao.
TARGET POPULATION: All grade 7 junior high school students in the Division of City Schools,
Cabuyao.
ACCESSIBLE POPULATION: All grade 7 junior high school students in the pilot high schools,
Division of City Schools, Cabuyao
SAMPLE: Ten percent of the grade 7 junior high school students in the pilot high schools, Division
of City Schools, Cabuyao
SAMPLE – is a group of individuals in a research study on which information or generalization
about the population is drawn.

1. SLOVIN’S FORMULA
- is used to calculate the sample size (n) given the population size (N) and a margin of error (e).

- it is a formula used in random sampling to estimate the required sample size.

- It is computed as n = N / (1 + Ne²),

where:
n = sample size
N = total population
e = margin of error
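As a minimal sketch, Slovin's formula can be computed directly, rounding the result up to the next whole subject:

```python
import math

def slovin(N, e):
    """Slovin's formula: n = N / (1 + N * e**2), rounded up to a whole subject."""
    return math.ceil(N / (1 + N * e ** 2))

print(slovin(1524, 0.05))  # 317
print(slovin(1000, 0.05))  # 286
```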

2. LYNCH FORMULA – is more complicated than the Slovin formula. It is commonly stated as

n = NZ²p(1 − p) / (Nd² + Z²p(1 − p))

where:
Z = the value of the normal variable (1.96) for a reliability level of 0.95
p = the largest possible proportion (0.50)
d = sampling error (0.05)
N = population size
n = sample size

- the sample size derived from the Lynch formula is about 10 less than the sample size derived
from the Slovin formula.
Guidelines suggested by Fraenkel (1994) with regard to the number of subjects needed.
1. DESCRIPTIVE STUDIES – a sample minimum number of 100 is essential.
2. CORRELATIONAL STUDIES – a sample of at least 50 is deemed necessary to establish the
existence of relationship
3. EXPERIMENTAL RESEARCH – a minimum of 30 subjects per group, although 15 subjects per
group may be acceptable.
4. EX POST FACTO OR CAUSAL-COMPARATIVE STUDIES – a minimum of 15 subjects per group.
Example:
Compute a sufficient sample size for a target population consisting of 1,524 sixth-graders in a
given school district using the Slovin and Lynch formulas.
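A sketch of this worked example, assuming the statement of the Lynch formula given above, n = NZ²p(1 − p) / (Nd² + Z²p(1 − p)):

```python
import math

N, e = 1524, 0.05           # population of sixth-graders; margin of error
Z, p, d = 1.96, 0.50, 0.05  # values given for the Lynch formula

# Slovin: n = N / (1 + N * e**2)
n_slovin = math.ceil(N / (1 + N * e ** 2))

# Lynch: n = N * Z**2 * p * (1 - p) / (N * d**2 + Z**2 * p * (1 - p))
n_lynch = math.ceil(N * Z ** 2 * p * (1 - p) / (N * d ** 2 + Z ** 2 * p * (1 - p)))

print(n_slovin, n_lynch)  # 317 307
```

Note that the Lynch result is 10 less than the Slovin result here, consistent with the remark above.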

LESSON 3: FREQUENCY DISTRIBUTION


FREQUENCY DISTRIBUTION – is a grouping of data into categories showing the number of
observations in each of the non-overlapping classes.
- is a tabular summary of a set of data that shows the frequency or number of data items that fall
in each of several distinct classes.
DEFINING SOME TERMS:
1. RAW DATA – is the data collected in original form
2. RANGE – is the difference of the highest and the lowest value in a distribution
3. CLASS LIMITS (or APPARENT LIMITS) – is the highest and lowest values describing a class.
4. CLASS BOUNDARIES (or REAL LIMITS) – are the upper and lower values of a class in a grouped
frequency distribution, whose values have one additional decimal place more than the class limits
and end with the digit 5.
5. INTERVAL (or WIDTH) – is the distance between the class lower boundary and the class upper
boundary and it is denoted by the symbol i.
6. FREQUENCY (f) – is the number of values in a specific class of a frequency distribution.
7. RELATIVE FREQUENCY (rf) – is the value obtained when the frequency of each class of the
frequency distribution is divided by the total number of values.
8. PERCENTAGE – is obtained by multiplying the relative frequency by 100%.
9. CUMULATIVE FREQUENCY (cf) – is the sum of the frequencies accumulated up to the upper
boundary of a class in a frequency distribution.
10. MIDPOINT – is the point halfway between the class limits of each class and is representative
of the data within the class.
DETERMINING CLASS INTERVAL:
1. RULE 1:
Suggested Class Interval = Range / Number of Classes = (HV – LV) / k
Where: HV = highest value in a data set
LV = lowest value in a data set
k = number of classes
i = suggested class interval
2. RULE 2:
Suggested Class Interval = Range / (1 + 3.322 × logarithm of the total frequencies)
3. RULE 3:
Suggested Class Interval = (HV – LV) / Number of Classes
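The rules above can be tried on a small hypothetical data set, rounding the suggested interval up to a whole number:

```python
import math

scores = [62, 75, 88, 54, 91, 70, 83, 67, 95, 78, 60, 85]  # hypothetical scores

HV, LV = max(scores), min(scores)
data_range = HV - LV  # 95 - 54 = 41

# Rule 1 / Rule 3: range divided by a chosen number of classes k
k = 7
i_rule1 = math.ceil(data_range / k)

# Rule 2: range divided by 1 + 3.322 * log10(n), where n = total frequencies
k_rule2 = 1 + 3.322 * math.log10(len(scores))
i_rule2 = math.ceil(data_range / k_rule2)

print(i_rule1, i_rule2)  # 6 9
```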

MEASURE OF CENTRAL TENDENCY


CENTRAL TENDENCY – refers to a central reference value which is usually close to the point of
greatest concentration of the measurements and may in some sense be brought to typify the whole
set.
- Commonly referred to as an average, is a single value that represents a data set.
- Its purpose is to locate the center of the data set.
ARITHMETIC MEAN, often called the mean, is the most frequently used measure of central
tendency. It is commonly understood as the arithmetic average.
- is appropriate to determine the central tendency of an interval or ratio data.
PROPERTIES OF MEAN
1. A set of data has only one mean.
2. Mean can be applied for interval and ratio data.
3. All values in the data set are included in computing the mean.
4. The mean is very useful in comparing two or more data sets.
5. Mean is affected by the extreme small or large values on a data set.
6. The mean cannot be computed for the data in a frequency distribution with an open-ended
class
7. Mean is most appropriate in symmetrical data.
WEIGHTED MEAN – is particularly useful when various classes or groups contribute differently to
the total. It is found by multiplying each value by its corresponding weight and dividing by the sum
of weights.
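The weighted mean can be sketched with hypothetical course grades weighted by their unit loads:

```python
grades = [1.25, 1.75, 2.00]  # hypothetical course grades
units  = [3, 5, 2]           # corresponding unit weights

# Weighted mean = sum of (value * weight) divided by the sum of weights
weighted_mean = sum(g * u for g, u in zip(grades, units)) / sum(units)
print(weighted_mean)  # 1.65
```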
MEDIAN – is the midpoint of the data array.
- is the central value of an ordered distribution.
PROPERTIES OF MEDIAN
1. The median is unique, there is only one median for a set of data.
2. The median is found by arranging the set of data from lowest to highest (or highest to lowest)
and getting the value of the middle observation.
3. Median is not affected by the extreme small or large values.
4. Median can be computed for an open-ended frequency distribution
5. Median can be applied for ordinal, interval and ratio data
6. Median is most appropriate in a skewed data.
MODE – is the value in a data set that appears most frequently. Like the median and unlike the
mean, extreme values in a data set do not affect the mode.
UNIMODAL – a data set that has only one value occurring with the greatest frequency
BIMODAL – a data set that has two values with the same greatest frequency
MULTIMODAL – a data set that has more than two modes
NO MODE – a data set in which all values occur with the same frequency
PROPERTIES OF MODE
1. The mode is found by locating the most frequently occurring value
2. The mode is the easiest average to compute
3. There can be more than one mode or even no mode in any given data set.
4. Mode is not affected by the extreme small or large values.
5. Mode can be applied for nominal, ordinal, interval and ratio data.
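The three averages can be computed with Python's standard `statistics` module on a hypothetical data set (`multimode` returns all most-frequent values, so it also handles bimodal and multimodal sets):

```python
from statistics import mean, median, multimode

data = [70, 75, 80, 80, 85, 90, 95]  # hypothetical interval data

print(mean(data))       # arithmetic mean: 575 / 7
print(median(data))     # middle value of the ordered data: 80
print(multimode(data))  # all most-frequent values: [80]
```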

DATA ARRAY – a data set ordered in either ascending or descending order.
- the median is an appropriate measure of central tendency for data that are ordinal or above, but
is most valuable for an ordinal type of data.
MIDRANGE – is the average of the lowest and highest value in a data set.
PROPERTIES OF THE MIDRANGE
1. The midrange is easy to compute
2. The midrange gives the midpoint
3. The midrange is unique
4. Midrange is affected by the extreme small or large values
5. Midrange can be applied for interval and ratio data.
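A minimal sketch of the midrange on a hypothetical data set:

```python
data = [54, 60, 67, 75, 88, 95]  # hypothetical data set

# Midrange = average of the lowest and highest values
midrange = (min(data) + max(data)) / 2
print(midrange)  # 74.5
```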

MEASURES OF DISPERSION AND LOCATION


STANDARD DEVIATION – is a statistical term that provides a good indication of volatility. It
measures how widely values are dispersed from the average.
- Is calculated as the square root of variance.
DISPERSION – is the difference between the actual value and the average value
RANGE – is the simplest and easiest way to determine measure of dispersion. It is the difference
of the highest value and the lowest value in the data set.
AVERAGE DEVIATION – is the mean of the absolute differences between each element and a given
point (usually the mean). It is a summary statistic of statistical dispersion or variability, and is also
called the mean absolute deviation.
VARIANCE – is the mathematical expectation of the squared deviations from the mean.
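These measures of dispersion can be computed on a hypothetical data set using Python's standard `statistics` module (the `p` prefix in `pvariance`/`pstdev` means the population versions, which divide by n):

```python
from statistics import mean, pvariance, pstdev

data = [4, 8, 6, 5, 3, 7]  # hypothetical data set

data_range = max(data) - min(data)                   # range: 8 - 3 = 5
m = mean(data)                                       # 5.5
avg_dev = sum(abs(x - m) for x in data) / len(data)  # average deviation: 1.5
variance = pvariance(data)                           # population variance
std_dev = pstdev(data)                               # square root of the variance

print(data_range, avg_dev, variance, std_dev)
```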
