
HOW TO ORGANIZE DATA

One way of organizing raw data or observations is to use a frequency
distribution table. One example is the profile of cooperatives in a province
given below:
Initial Capital, in Pesos    Number of Cooperatives
Below 25,000                 43
25,000 - 49,999              28
50,000 - 74,999              17
75,000 and above             12

STEPS IN CONSTRUCTING A FREQUENCY DISTRIBUTION TABLE:
1. Obtain the number of class intervals to be used.
Usually, the number of class intervals should be anywhere from 5 to 20. Too
many intervals would result in a loss of organization. Too few intervals, on the
other hand, would result in a loss of detail. To obtain a more specific guide, we
can use Sturges' Rule, which states that:
$$K = 1 + 3.322 \log_{10} n$$
where
K = the number of class intervals, rounded upwards
n = the number of observations
2. Obtain the size of the intervals.
To obtain an initial estimate of the size of the intervals we can use the formula:
$$i_i = \frac{\left(LO + \frac{I_c}{2}\right) - \left(SO - \frac{I_c}{2}\right)}{K_r}$$
where
$i_i$ = the initial estimate of the interval size
$LO$ = the largest observed value
$I_c$ = the smallest increment of change in the data
$SO$ = the smallest observed value
$K_r$ = the K value obtained from Sturges' Rule, rounded upwards
The interval size, I, is obtained by rounding the initial estimate to the same
number of decimal places as the raw data. If the data are in whole numbers,
round to the nearest integer.
3. Compute for the excess space.
$$\text{EXCESS} = \text{Space available} - \text{Required space}$$
where
$$\text{Space available} = K_r \cdot I$$
$$\text{Required space} = \left[LO + \frac{I_c}{2}\right] - \left[SO - \frac{I_c}{2}\right]$$
4. Construct the table.
With the interval size and the number of intervals known, we can construct the
frequency distribution by first dividing the excess between the lowest and the
highest ends of the data.
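
Steps 1 to 4 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the sample figures (50 observations ranging from 15 to 95, recorded in whole numbers), and the choice to give the lower end the smaller share of an odd excess are not from the text above.

```python
import math

def sturges_k(n):
    """Step 1: number of class intervals by Sturges' Rule, rounded upwards."""
    return math.ceil(1 + 3.322 * math.log10(n))

def plan_intervals(n, smallest, largest, increment=1):
    """Steps 2 and 3: return (K_r, I, excess)."""
    k = sturges_k(n)
    required = (largest + increment / 2) - (smallest - increment / 2)
    initial = required / k  # i_i, the initial estimate
    # Round to the precision of the data. Note that Python's round()
    # sends exact halves to the nearest even number.
    size = round(initial / increment) * increment
    excess = k * size - required
    return k, size, excess

def class_limits(n, smallest, largest, increment=1):
    """Step 4: list of (lower class limit, upper class limit) pairs."""
    k, size, excess = plan_intervals(n, smallest, largest, increment)
    if excess < 0:
        raise ValueError("rounded interval size is too small to cover the data")
    # Assumption: split the excess in half, the lower end taking the
    # smaller share when it cannot be divided evenly.
    low_share = (excess / increment) // 2 * increment
    first_lower = smallest - low_share
    return [(first_lower + j * size, first_lower + (j + 1) * size - increment)
            for j in range(k)]

print(plan_intervals(50, 15, 95))   # (7, 12, 3.0)
print(class_limits(50, 15, 95)[0])  # (14.0, 25.0)
```

Because the interval size is rounded to the nearest value rather than always upwards, the excess can come out negative; the sketch raises an error in that case instead of guessing at a remedy.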
GRAPHICAL METHODS FOR DESCRIBING QUANTITATIVE DATA:
(1) Frequency Histogram: a bar graph representation of a frequency distribution
table. The class boundaries (CB) are marked along the horizontal axis and the
frequencies along the vertical axis. Each interval is drawn as a bar
bounded by its class boundaries, with a height equal to its frequency.
(2) Frequency Polygon: uses class midpoints (CM) to represent the intervals. The
class midpoint is computed as the average of the lower class limit (LCL) and the
upper class limit (UCL). Class limits are the visible limits of the intervals in
the frequency distribution table.
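
The quantities plotted in both graphs can be computed as in the short sketch below. The class limits are hypothetical, and the rule that each boundary lies half an increment outside the class limits is an assumption inferred from the required-space formula, not a definition given in the text.

```python
def class_midpoint(lcl, ucl):
    """Average of the lower and upper class limits."""
    return (lcl + ucl) / 2

def class_boundaries(lcl, ucl, increment=1):
    """Assumed rule: boundaries sit half an increment outside the limits."""
    return lcl - increment / 2, ucl + increment / 2

# Hypothetical class limits for whole-number data
limits = [(14, 25), (26, 37), (38, 49)]

midpoints = [class_midpoint(l, u) for l, u in limits]
boundaries = [class_boundaries(l, u) for l, u in limits]

print(midpoints)   # [19.5, 31.5, 43.5]
print(boundaries)  # [(13.5, 25.5), (25.5, 37.5), (37.5, 49.5)]
```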
NUMERICAL DESCRIPTIVE MEASURES
Numerical descriptive measures are numbers used to create a mental image of a
data set.
(1) Measures of Central Tendency or Location:
The measure of central tendency is the point about which scores tend to cluster; a
sort of average in a series. It is the center of concentration of scores in any set of
data. It is a single number which represents the general level of performance of
the group.
The three measures of central tendency in common use are: Mean, Median and
Mode.
MEAN is defined as the sum of the values in the data group divided by the
number of values. The formula is:
$$\bar{X} = \frac{\sum X}{n}$$
where
X = the raw data or observations
n = the number of observations or values
For grouped data which is in the form of a frequency table, the formula is:
$$\bar{X} = \frac{\sum f x}{n}$$
where
f = frequency of each class interval
x = class midpoints
n = total number of observations
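
Both mean formulas translate directly into code. The sketch below uses made-up values and a made-up frequency table (seven classes of size 12, 50 observations); none of these figures come from the text.

```python
def mean(values):
    """Ungrouped mean: sum of the values divided by their number."""
    return sum(values) / len(values)

def grouped_mean(midpoints, freqs):
    """Grouped mean: sum of f*x divided by the total frequency n."""
    n = sum(freqs)
    return sum(f * x for f, x in zip(freqs, midpoints)) / n

# Hypothetical raw data
raw = [2, 4, 4, 4, 5, 5, 7, 9]
print(mean(raw))  # 5.0

# Hypothetical frequency table: class midpoints and frequencies
midpoints = [19.5, 31.5, 43.5, 55.5, 67.5, 79.5, 91.5]
freqs = [3, 6, 10, 14, 9, 5, 3]
print(grouped_mean(midpoints, freqs))  # 54.78
```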
MEDIAN: the middle value of arrayed data (data that have been arranged in
ascending order).
For grouped data, the formula is:
$$\tilde{X} = LCB_{med} + \left[\frac{\frac{n}{2} - F}{f_{med}}\right] I$$
where:
$LCB_{med}$ = lower class boundary of the median class
$n$ = number of observations
$F$ = cumulative frequency of the class before the median class
$f_{med}$ = frequency of the median class
$I$ = class interval size
The median class is identified as the class whose cumulative frequency reaches
n/2 first.
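
A minimal sketch of the grouped median follows. It finds the median class by accumulating frequencies until n/2 is reached, then applies the formula. The function name and the frequency table are hypothetical.

```python
def grouped_median(lower_boundaries, freqs, size):
    """Median of grouped data; size is the class interval size I."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("the table has no observations")
    cumulative = 0  # F: cumulative frequency before the current class
    for lcb, f in zip(lower_boundaries, freqs):
        # The median class is the first whose cumulative frequency reaches n/2
        if cumulative + f >= n / 2:
            return lcb + (n / 2 - cumulative) / f * size
        cumulative += f

# Hypothetical table: lower class boundaries, frequencies, interval size 12
lower_boundaries = [13.5, 25.5, 37.5, 49.5, 61.5, 73.5, 85.5]
freqs = [3, 6, 10, 14, 9, 5, 3]

median = grouped_median(lower_boundaries, freqs, 12)
print(round(median, 2))
```

Here n/2 = 25 and the cumulative frequencies run 3, 9, 19, 33, so the fourth class is the median class, with F = 19 and a median-class frequency of 14.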
MODE: the value that occurs most often in a data set.
For grouped data, the formula is:
$$M = LCB_{mod} + \left[\frac{f_1}{f_1 + f_2}\right] I$$
where:
$LCB_{mod}$ = lower class boundary of the modal class
$f_1$ = the difference between the frequency of the modal class and
the class immediately before it
$f_2$ = the difference between the frequency of the modal class and
the class immediately after it
$I$ = size of the class interval
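
The grouped mode can be sketched the same way. The function name and table are hypothetical, and two choices are assumptions of this sketch rather than rules from the text: a missing neighbouring class is treated as having frequency 0, and ties are resolved by taking the first class with the highest frequency.

```python
def grouped_mode(lower_boundaries, freqs, size):
    """Mode of grouped data; size is the class interval size I."""
    # Index of the modal class (first class with the highest frequency)
    m = max(range(len(freqs)), key=lambda j: freqs[j])
    before = freqs[m - 1] if m > 0 else 0              # assumed 0 if no class before
    after = freqs[m + 1] if m < len(freqs) - 1 else 0  # assumed 0 if no class after
    f1 = freqs[m] - before
    f2 = freqs[m] - after
    if f1 + f2 == 0:
        raise ValueError("no distinct modal class; the formula is undefined")
    return lower_boundaries[m] + f1 / (f1 + f2) * size

# Hypothetical table: lower class boundaries, frequencies, interval size 12
lower_boundaries = [13.5, 25.5, 37.5, 49.5, 61.5, 73.5, 85.5]
freqs = [3, 6, 10, 14, 9, 5, 3]

mode = grouped_mode(lower_boundaries, freqs, 12)
print(round(mode, 2))
```

In this table the modal class has frequency 14, with 10 before it and 9 after it, so the two differences are 4 and 5.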
(2) Measures of Dispersion or Variability:
The measures of dispersion indicate the nature or degree of clustering. The more
concentrated the values are about the mean, the more meaningful the mean is
as a measure of location.
RANGE is the simplest measure of spread or variability. It is the difference
between the highest and the lowest score in a given set of data or distribution.
For data grouped into intervals, the range is the difference between the upper
boundary of the highest class and the lower boundary of the lowest class.
The range is not considered a stable measure of variability because it
considers only the extreme values; its value can fluctuate greatly with a change
in a single score, either the highest or the lowest.
STANDARD DEVIATION is the most useful measure of variability. It is a special
form of average deviation from the mean, and it is affected by every individual
value in a given distribution.
For ungrouped population, the standard deviation is given by:
$$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$$
where
$X$ = the individual values of all the items
$\mu$ = the population mean
$N$ = the population size
For grouped population, the standard deviation is given by:
$$\sigma = \sqrt{\frac{\sum \left[ f (X - \mu)^2 \right]}{N}}$$
where
$X$ = the class midpoints
$f$ = frequency of each class interval
$\mu$ = the population mean
$N$ = the population size
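
The two population formulas can be checked against each other with a small sketch. The data are made up for illustration: the same eight values are given once as a raw list and once as a table of distinct values with their frequencies, so both functions should agree.

```python
import math

def population_sd(values):
    """Ungrouped population standard deviation (divisor N)."""
    n = len(values)
    mu = sum(values) / n
    return math.sqrt(sum((x - mu) ** 2 for x in values) / n)

def grouped_population_sd(midpoints, freqs):
    """Grouped population standard deviation (divisor N)."""
    n = sum(freqs)
    mu = sum(f * x for f, x in zip(freqs, midpoints)) / n
    return math.sqrt(sum(f * (x - mu) ** 2 for f, x in zip(freqs, midpoints)) / n)

# Hypothetical population of eight values
raw = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_sd(raw))  # 2.0

# The same values written as (value, frequency) pairs
print(grouped_population_sd([2, 4, 5, 7, 9], [1, 3, 2, 1, 1]))  # 2.0
```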
For ungrouped samples, the standard deviation is given by:
$$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$$
where
$X$ = the individual values of all the items
$\bar{X}$ = the sample mean
$n$ = sample size
For grouped samples, the standard deviation is given by:
$$s = \sqrt{\frac{\sum \left[ f (X - \bar{X})^2 \right]}{n - 1}}$$
where
$X$ = the class midpoints
$\bar{X}$ = the sample mean
$n$ = sample size
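
The sample formulas differ from the population ones only in the divisor, n - 1 instead of N. A minimal sketch with made-up data (function names are hypothetical):

```python
import math

def sample_sd(values):
    """Ungrouped sample standard deviation (divisor n - 1)."""
    n = len(values)
    xbar = sum(values) / n
    return math.sqrt(sum((x - xbar) ** 2 for x in values) / (n - 1))

def grouped_sample_sd(midpoints, freqs):
    """Grouped sample standard deviation (divisor n - 1)."""
    n = sum(freqs)
    xbar = sum(f * x for f, x in zip(freqs, midpoints)) / n
    total = sum(f * (x - xbar) ** 2 for f, x in zip(freqs, midpoints))
    return math.sqrt(total / (n - 1))

# Hypothetical sample of eight values; the squared deviations sum to 32
raw = [2, 4, 4, 4, 5, 5, 7, 9]
print(round(sample_sd(raw), 3))

# The same values written as (value, frequency) pairs give the same result
print(round(grouped_sample_sd([2, 4, 5, 7, 9], [1, 3, 2, 1, 1]), 3))
```

Both functions need at least two observations, since n - 1 would otherwise be zero.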
