
Statistics and Its Application for Undergraduate Research
CEU Makati
MIAD 7th floor
LV Campus
What is Statistics?
Statistics may simply refer to the numbers
recorded or collected as data.
Examples:
• Scores from sports
• Students’ information, attendance and grades
• Births and deaths
• Business sales and profits
What is Statistics?
Statistics also refers to the branch of
Mathematics that deals with the
Collection,
Organization,
Presentation or Summarization,
Analysis,
and Interpretation of data.
Divisions of Statistics
• Descriptive Statistics
• Inferential Statistics
Descriptive Statistics
• Numbers are used to describe the sample
• No hypotheses are tested
• Descriptive Statistics deals with:
– Collecting data
– Organizing the data collected
– Presenting the data in summarized form
Inferential Statistics
• Inferential Statistics deals with the analysis
and interpretation of the summarized data.
• Inferential Statistics can be divided into
– Parametric Techniques and
– Non-Parametric Techniques.
Basic Statistical Terms
• Population
• Sample
Basic Statistical Terms
• Population
Entire number or the totality of the people,
objects, events or things being studied
Basic Statistical Terms
• Sample
Small group taken from the population
Basic Statistical Terms
• Parameter
• Statistic
Basic Statistical Terms
• Parameter
A characteristic, property, or description of a
population
Basic Statistical Terms
• Statistic
A characteristic, property or description of a
sample
Statistical Variables
• Quantitative and Qualitative Variables
• Discrete and Continuous Variables
• Independent and Dependent Variables
Statistical Variables
• Qualitative Variable
Uses words to describe different types or
kinds, qualities, properties or characteristics
Statistical Variables
• Quantitative Variable
Uses numbers to describe amounts or quantities
that can be counted or measured
Statistical Variables
• Discrete Variables
Can be obtained by counting
Refers to frequency
Cannot have fractional or decimal parts
Statistical Variables
• Continuous Variables
Can be obtained by weighing or measuring
Can have fractional or decimal parts
Statistical Variables
• Independent Variable
Not affected by other variables
Statistical Variables
• Dependent Variable
Affected by the changes that happen to other
variables
Levels of Measurement
• Nominal/Categorical Level
• Ordinal Level
• Interval Level
• Ratio Level
Levels of Measurement
• Nominal/Categorical Level
– Classification
– Labels
– Categories
Levels of Measurement
• Ordinal Level
– Order
– Sequence
Levels of Measurement
• Interval Level
– Characterized by the use of units which may be
divided into smaller units
– No absolute zero starting point
– May be arranged in order or sequence
– Differences between values may be computed
Levels of Measurement
• Ratio Level
– Modification of the Interval Level of Measurement
– Has an absolute zero starting point
Collection of Data
• In collecting data, 3 questions need to be
answered, namely:
– How many respondents do you need?
– Who will be chosen as respondents?
– How will the data be collected from these
respondents?
How many respondents are needed?
• Slovin’s Formula

$$n = \frac{N}{1 + N e^{2}}$$

Where
n = sample size
N = population size
e = margin of error

Note: Used when nothing is known about the population.
Example
Suppose that you have a group of
1,000 city government employees
and you want to survey them to find
out which tools are best suited to
their jobs. How many of these city
government employees will you need
as respondents?
Example
• Population size, N, is 1000 city government
employees
• Margin of error, e, is based on the confidence
level.
• Confidence level refers to the percentage of
all possible samples that can be expected to
include the true population parameter.
Think of it as the level of accuracy.
Example
• If a 95% level of confidence is used, then the
margin of error would be
100% - 95% = 5% = 0.05
e = 0.05
Substituting the values into Slovin’s Formula

$$n = \frac{N}{1 + N e^{2}}$$
Example
$$n = \frac{1000}{1 + (1000)(0.05^{2})}$$

$$n = 285.7142857$$

$$n \approx 286$$
Therefore, 286 out of the 1,000 city government
employees will be taken as respondents.
Example
A group of students will conduct a study
on the members of an organization who
are into collecting vinyl records. The
organization has 150,250 registered
members. How many of these members
are needed as respondents if a .05
margin of error is to be used?
Example
• Total population, N = 150,250
Margin of error, e = 0.05
Substituting the values into Slovin’s Formula

$$n = \frac{N}{1 + N e^{2}}$$
Example
$$n = \frac{150{,}250}{1 + (150{,}250)(0.05^{2})}$$

$$n = 398.9379356$$

$$n \approx 399$$
Therefore, 399 out of the 150,250
members of the organization will be
taken as respondents.
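The two worked examples can be checked with a short Python sketch. The function name `slovin_sample_size` is a hypothetical helper, not something from the slides, and it assumes the result is always rounded up to the next whole respondent, which agrees with both examples.

```python
import math


def slovin_sample_size(population_size: int, margin_of_error: float) -> int:
    """Sample size from Slovin's formula, n = N / (1 + N * e**2).

    Assumption: the raw value is rounded up to a whole respondent.
    """
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    if not 0 < margin_of_error < 1:
        raise ValueError("margin_of_error must be between 0 and 1")
    raw = population_size / (1 + population_size * margin_of_error ** 2)
    return math.ceil(raw)


# The two examples from the slides
print(slovin_sample_size(1000, 0.05))      # city government employees
print(slovin_sample_size(150_250, 0.05))   # vinyl record collectors
```

Because the population size appears in both the numerator and the denominator, the required sample grows very slowly once N is large: at e = 0.05 it can never exceed 1 / 0.05² = 400.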
Who will be chosen as respondents?
• The sampling techniques are used to select
the respondents.
• There are two classifications of sampling
techniques, namely:
– Probability Sampling
– Non-Probability Sampling
Who will be chosen as respondents?
• Probability sampling includes:
– Simple random sampling
– Stratified random sampling
– Systematic sampling
– Cluster sampling
– Multi-stage sampling
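Three of the probability sampling techniques listed above can be sketched in Python with the standard `random` module. This is a minimal illustration under stated assumptions, not a prescribed procedure: the function names, the `seed` parameter, and the stratum names in the usage lines are all hypothetical, and the stratified version assumes the same sampling fraction is taken from every stratum.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n units without replacement; every unit is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def systematic_sample(population, n, seed=None):
    """Pick a random start, then take every k-th unit, k = len(population) // n."""
    if not 0 < n <= len(population):
        raise ValueError("n must be between 1 and the population size")
    rng = random.Random(seed)
    k = len(population) // n
    start = rng.randrange(k)
    return [population[start + i * k] for i in range(n)]


def stratified_sample(strata, fraction, seed=None):
    """Simple random sample of the same fraction within each stratum.

    `strata` maps a stratum name to the list of its members.
    """
    rng = random.Random(seed)
    return {
        name: rng.sample(members, round(len(members) * fraction))
        for name, members in strata.items()
    }


# Hypothetical population: employee ID numbers 1..1000
employees = list(range(1, 1001))
print(len(simple_random_sample(employees, 286, seed=1)))
print(systematic_sample(employees, 286, seed=1)[:5])

# Hypothetical strata: 600 office staff and 400 field staff
strata = {"office": employees[:600], "field": employees[600:]}
print({name: len(chosen) for name, chosen in stratified_sample(strata, 0.1, seed=1).items()})
```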
Who will be chosen as respondents?
• Non-Probability sampling includes:
– Convenience sampling
– Quota sampling
– Dimensional sampling
– Purposive or Judgmental sampling
– Snowball or Referral sampling
How will the data be collected?
Data collection methods will be used to collect
the data. These methods are
• Survey
– Interview/Direct method
– Questionnaire/Indirect method
• Registration
• Experiments
• Observation
Organization of the Data
• After collecting data, it has to be organized.
Organization of data can be done by the
following methods:
Sorting
Array
Stem-and-leaf Display
Frequency Distribution Table, FDT
Organization of the Data
• The easiest method for organizing data
is by Sorting.
• Sorting is done by arranging the data
alphabetically or chronologically.
• Sorting can be in ascending order
(lowest to highest, a to z) or descending
order (highest to lowest, z to a).
Organization of the Data
• An improvement over sorting is the
Array.
• An array is constructed by first sorting
the data then by dividing them into
rows or columns of equal length.
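Sorting and building an array can be sketched in a few lines of Python. The helper name `make_array` and the twelve scores are illustrative assumptions; the last row will be shorter whenever the number of values is not a multiple of the row length.

```python
def make_array(values, row_length, descending=False):
    """Sort the values, then split them into rows of row_length items."""
    ordered = sorted(values, reverse=descending)
    return [ordered[i:i + row_length] for i in range(0, len(ordered), row_length)]


# Illustrative scores (an assumption, chosen only for the demonstration)
scores = [74, 71, 70, 69, 84, 90, 72, 78, 75, 75, 86, 89]

for row in make_array(scores, row_length=4):
    print(row)
```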
Organization of the Data
• An improvement over the array is the
Stem-and-Leaf Display.
• The stem-and-leaf display is constructed
by using the ten’s digits as the stem and
the one’s digits as the leaves.
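A stem-and-leaf display can be built the same way in Python. This sketch assumes non-negative whole numbers, so that integer division by 10 gives the stem and the remainder gives the leaf; the function names and the sample scores are hypothetical.

```python
def stem_and_leaf(values):
    """Map each stem (tens digit) to its sorted leaves (ones digits)."""
    ordered = sorted(values)
    # Include every stem between the smallest and largest, even if empty
    stems = {stem: [] for stem in range(ordered[0] // 10, ordered[-1] // 10 + 1)}
    for value in ordered:
        stems[value // 10].append(value % 10)
    return stems


def format_stem_and_leaf(values):
    """Render the display as text, one stem per line."""
    lines = []
    for stem, leaves in stem_and_leaf(values).items():
        lines.append(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
    return "\n".join(lines)


# Illustrative scores (an assumption, chosen only for the demonstration)
scores = [74, 71, 70, 69, 84, 90, 72, 78, 75, 75, 86, 89]
print(format_stem_and_leaf(scores))
```

Unlike a plain array, the display keeps every original value while also showing how the data are distributed across the tens.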
Organization of the Data
• The main method for organizing data is
the use of the Frequency Distribution
Table, or simply, FDT.
Frequency Distribution Table, FDT
Categories/Class Intervals    Frequency or Relative Frequency
Category 1                    Count or percentage
Category 2                    Count or percentage
Category 3                    Count or percentage
Category 4                    Count or percentage
Total                         N
Relative Frequency
Relative frequency is another name for percent.

$$\% = \frac{f}{N} \times 100$$

Where
% = relative frequency
f = frequency
N = total frequency
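The relative frequency formula can be applied to categorical data with a short Python sketch. The function name and the survey answers below are hypothetical, used only to show the computation.

```python
from collections import Counter


def relative_frequencies(observations):
    """Return {category: (frequency, percent)} using % = f / N * 100."""
    counts = Counter(observations)
    total = sum(counts.values())
    return {category: (f, 100 * f / total) for category, f in counts.items()}


# Hypothetical survey answers from 8 respondents
answers = ["laptop", "tablet", "laptop", "phone",
           "laptop", "tablet", "laptop", "tablet"]

for category, (f, percent) in relative_frequencies(answers).items():
    print(f"{category:<8} {f:>3} {percent:>6.1f}%")
```

The percentages of all categories add up to 100, which is a quick check on a finished table.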
Construction of the FDT
1. Find the range.
Range = Highest value – lowest value
2. Determine the desired number of class
intervals, k, by using Herbert Sturges’
Formula.
k = 1 + 3.322 log N
where log is the base-10 logarithm and N is
the total frequency or the total population
Construction of the FDT
3. Compute the class size, i, also
known as class width.

$$i = \frac{\text{range}}{k}$$
4. Set the lower limits.
5. Set the upper limits.
6. Tally the data.
7. Summarize the FDT.
Construction of the FDT
74 71 70 69 84 90
72 78 75 75 86 89
83 81 82 64 91 92
85 88 68 87 83 94
88 73 82 72 86 77
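The construction steps can be traced on these 30 values with a Python sketch. The slides do not say how the class limits are chosen, so several choices here are assumptions: k is rounded to the nearest whole number, the class width is rounded up, the first lower limit is the lowest value, and the limits are inclusive whole numbers. The function name `build_fdt` is hypothetical.

```python
import math

scores = [74, 71, 70, 69, 84, 90,
          72, 78, 75, 75, 86, 89,
          83, 81, 82, 64, 91, 92,
          85, 88, 68, 87, 83, 94,
          88, 73, 82, 72, 86, 77]


def build_fdt(data):
    """Return rows of (lower limit, upper limit, frequency, percent)."""
    n = len(data)
    low, high = min(data), max(data)
    value_range = high - low                    # Step 1: range
    k = round(1 + 3.322 * math.log10(n))        # Step 2: Sturges' formula
    width = math.ceil(value_range / k)          # Step 3: class size i
    rows = []
    lower = low                                 # Step 4: first lower limit
    while lower <= high:
        upper = lower + width - 1               # Step 5: upper limit
        freq = sum(lower <= x <= upper for x in data)   # Step 6: tally
        rows.append((lower, upper, freq, 100 * freq / n))
        lower = upper + 1
    return rows


# Step 7: summarize
for lower, upper, freq, percent in build_fdt(scores):
    print(f"{lower}-{upper}  {freq:>2}  {percent:5.1f}%")
```

For this data the range is 30, Sturges' formula gives k ≈ 5.9 (rounded to 6), and the class size is 5. Under the assumptions above, the highest value, 94, falls just past the sixth class, so the sketch produces a seventh class with a single entry. Widening the classes or starting below the lowest value are common ways to stay at six.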
Presentation/Summary of Data
• The most frequently used statistical
computations for summarizing or describing
collected data are:
– Relative Frequency or Percent
– Measures of the Center
• Mean
• Median
• Mode
– Measures of Spread
• Standard Deviation
• Variance
• Range
Presentation/Summary of Data
• Other Measures which can be computed:
– Measures of Position/Quantiles
• Quartiles
• Deciles
• Percentiles
– Measures of Shape
• Kurtosis
• Skewness
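Several of the listed measures are available in Python's standard `statistics` module; a minimal sketch follows. The data are hypothetical. Quartiles can be computed by more than one convention, and `statistics.quantiles` uses its default "exclusive" method here. The skewness and kurtosis helpers use the simple moment-based definitions, which is an assumption, since the slides do not give formulas for them.

```python
import statistics

# Hypothetical quiz scores
data = [2, 3, 3, 3, 4, 4, 4, 4, 5, 5]


def central_moment(values, order):
    """Average of (x - mean) ** order over all values."""
    m = sum(values) / len(values)
    return sum((x - m) ** order for x in values) / len(values)


def skewness(values):
    """Moment coefficient of skewness: m3 / m2 ** 1.5."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def kurtosis(values):
    """Moment coefficient of kurtosis: m4 / m2 ** 2 (about 3 for a normal curve)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2


print("median   :", statistics.median(data))
print("mode     :", statistics.mode(data))
print("range    :", max(data) - min(data))
print("variance :", statistics.variance(data))        # sample variance, n - 1
print("quartiles:", statistics.quantiles(data, n=4))  # Q1, Q2, Q3
print("skewness :", skewness(data))
print("kurtosis :", kurtosis(data))
```

The second quartile is the median, and a negative skewness indicates a longer tail toward the low values.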
Relative Frequency/Percent
Relative frequency is another name for percent.

$$\% = \frac{f}{N} \times 100$$

Where
% = relative frequency
f = frequency
N = total frequency
Mean
The mean is the average of the numerical data
collected. The formula is given as:

$$\bar{X} = \frac{\sum x}{n}$$

Where
$\bar{X}$ = mean
x = data
n = total number of data
Mean
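For grouped data, the formula is:

$$\bar{X} = \frac{\sum f x}{n}$$

Where
$\bar{X}$ = mean
f = frequency
x = data (the class midpoint when class intervals are used)
n = total number of data, equal to the sum of the frequencies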
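Both mean formulas can be compared with a small Python sketch. The values and frequencies are hypothetical; the point is only that the grouped formula gives the same result as listing every observation one by one.

```python
def mean(values):
    """Ungrouped mean: sum of the data divided by the number of data."""
    return sum(values) / len(values)


def grouped_mean(values, frequencies):
    """Grouped mean: sum of f * x divided by n, where n is the sum of f."""
    n = sum(frequencies)
    return sum(f * x for x, f in zip(values, frequencies)) / n


# Hypothetical quiz scores and how often each occurred
values = [2, 3, 4, 5]
frequencies = [1, 3, 4, 2]

# The same data written out one observation at a time
raw = [x for x, f in zip(values, frequencies) for _ in range(f)]

print(mean(raw))
print(grouped_mean(values, frequencies))
```

When x is a class midpoint rather than an exact value, the grouped mean is only an approximation of the ungrouped mean.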
Standard Deviation
The standard deviation measures how far, on
average, the original data deviate from the mean. Its
formula is given as:

$$s = \sqrt{\frac{\sum (x - \bar{X})^{2}}{n - 1}}$$

Where
s = standard deviation
$\bar{X}$ = mean
x = data
n = total number of data
Standard Deviation
For grouped data, the formula is:

$$s = \sqrt{\frac{\sum f (x - \bar{X})^{2}}{n - 1}}$$

Where
s = standard deviation
f = frequency
$\bar{X}$ = mean
x = data
n = total number of data
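The two standard deviation formulas can be sketched in Python and checked against `statistics.stdev` from the standard library, which also divides by n - 1. The function names and the data are hypothetical.

```python
import math
import statistics


def sample_sd(values):
    """s = sqrt(sum((x - mean)**2) / (n - 1)) for ungrouped data."""
    n = len(values)
    m = sum(values) / n
    return math.sqrt(sum((x - m) ** 2 for x in values) / (n - 1))


def grouped_sample_sd(values, frequencies):
    """s = sqrt(sum(f * (x - mean)**2) / (n - 1)) for grouped data."""
    n = sum(frequencies)
    m = sum(f * x for x, f in zip(values, frequencies)) / n
    squared = sum(f * (x - m) ** 2 for x, f in zip(values, frequencies))
    return math.sqrt(squared / (n - 1))


# Hypothetical quiz scores and how often each occurred
values = [2, 3, 4, 5]
frequencies = [1, 3, 4, 2]
raw = [x for x, f in zip(values, frequencies) for _ in range(f)]

print(sample_sd(raw))
print(grouped_sample_sd(values, frequencies))
print(statistics.stdev(raw))
```

All three calls agree at about 0.949 for this data. Squaring the result gives the variance.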
Analysis and Interpretation of Data
Frequently Used Statistical Treatments/Computations
For Comparison:
Parametric Techniques:
• Z-test
• T-test
• F-test (Analysis of Variance/ANOVA)
Non-Parametric Technique
• Chi-square test
Frequently Used Statistical Treatments/Computations
For Determining Degree of Relationship:
Parametric Technique:
• Pearson Product-Moment Correlation
Non-Parametric Techniques:
• Spearman Rank-Order Correlation
• Chi-square
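The two correlation coefficients can be sketched in plain Python. The slides do not give the formulas, so this uses the standard textbook definitions as an assumption: Pearson's r is the sum of cross-products of deviations divided by the square root of the product of the sums of squared deviations, and Spearman's coefficient is Pearson's r computed on ranks, with tied values given their average rank. The function names and data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank from 1 (smallest); tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for position in range(i, j + 1):
            ranks[order[position]] = shared
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman rank-order correlation: Pearson's r on the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Hypothetical data: hours studied and quiz score for five students
hours = [1, 2, 3, 4, 5]
quiz = [2, 4, 5, 4, 5]
print(pearson_r(hours, quiz))
print(spearman_rho(hours, quiz))
```

Both coefficients range from -1 to +1. Spearman's coefficient depends only on the ordering of the values, which is why it suits ordinal data.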
Frequently Used Statistical Treatments/Computations
For Prediction:
Regression Analysis
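As an illustration of prediction, a simple linear regression with one predictor can be sketched in Python. The slides name only the technique, so the least-squares formulas, the function names, and the data are assumptions for this example: the slope is the sum of cross-products of deviations divided by the sum of squared deviations of x, and the intercept is the mean of y minus the slope times the mean of x.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


def predict(x, slope, intercept):
    """Predicted value of the dependent variable for a given x."""
    return intercept + slope * x


# Hypothetical data: x is the independent variable, y the dependent variable
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)
print(predict(5, slope, intercept))
```

Predictions are most trustworthy for x values inside the range of the data used to fit the line.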
