
Frequency distribution

Classification of data
Process of arranging data in groups and classes according to resemblance and similarities. Data units having common characteristics are placed in one class, and the whole data set is thus divided into a number of classes.

Classification is of two kinds:
Classification according to attributes
Classification according to variables

Classification according to attributes


Under this method the data are classified on the basis of qualitative characteristics known as attributes, e.g. literacy, unemployment. Also called descriptive characteristics. Not measurable.

Classification according to variables


Under this method the data are classified on the basis of quantitative characteristics, e.g. age, height, weight. Capable of direct measurement.

Frequency distribution or frequency table


Orderly arrangement of data classified according to the magnitude of the observations. The components of a frequency distribution are:
Classes: a large number of observations varying over a wide range are classified into several groups according to size. Each of these groups is defined by an interval called the class interval.

Class limits: the smallest and largest possible measurements in each class are known as the class limits.
Class mark: the value exactly at the middle of a class interval is called the class mark.

Magnitude of class interval: the difference between the upper and lower class boundaries is called the magnitude of the class interval.
Class frequency: the number of observations falling within a particular class interval is called the class frequency.
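
As a rough illustration of these components, the sketch below builds a frequency table from raw values. The function name frequency_table, the sample data and the choice of equal-width classes with an exclusive upper limit are assumptions made for the example; they do not come from the slides.

```python
def frequency_table(values, lower, width, n_classes):
    """Group raw values into n_classes equal-width classes starting at `lower`."""
    table = []
    for i in range(n_classes):
        lo = lower + i * width          # lower class limit
        hi = lo + width                 # upper class limit
        # Exclusive method (assumed): a value equal to the upper limit
        # is counted in the next class.
        freq = sum(1 for v in values if lo <= v < hi)
        table.append({
            "class": (lo, hi),
            "mark": (lo + hi) / 2,      # class mark = midpoint of the interval
            "frequency": freq,          # class frequency
        })
    return table


marks = [12, 25, 37, 8, 19, 31, 22, 15, 28, 5, 33, 17]  # made-up data
for row in frequency_table(marks, lower=0, width=10, n_classes=4):
    print(row)
```

Here the magnitude of each class interval is the width argument (10), and the class frequencies add up to the number of observations.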

A statistical table is the logical listing of related quantitative data in vertical columns and horizontal rows of numbers, with sufficient explanatory and qualifying words, phrases and statements in the form of titles, headings and footnotes to make clear the full meaning of the data and their origin.

OBJECTIVES OF TABULATION
1. To simplify the complex data
2. To economize space
3. To facilitate comparison
4. To facilitate statistical analysis
5. To save time
6. To depict trend
7. To help reference

Graphical and diagrammatic representation of data


The transformation of data through visual methods like graphs, diagrams, maps and charts is called graphical and diagrammatic representation of data.

The need of representing data graphically


Graphics, such as maps, graphs and diagrams, are used to represent large volumes of data. Graphical form makes it possible to easily draw visual impressions of data. The graphic method of representing data enhances our understanding. It makes comparisons easy.

It is a time-consuming task to draw inferences about whatever is presented in non-graphical form. Graphical form presents characteristics in a simplified way. This makes it easy to understand the patterns of population growth, distribution and density, sex ratio, age-sex composition, occupational structure, etc.

Types of Diagrams
Diagrams and maps are of the following types:
(i) One-dimensional diagrams such as line graph, polygraph, bar diagram, histogram, age-sex pyramid, etc.;
(ii) Two-dimensional diagrams such as pie diagram and rectangular diagram;
(iii) Three-dimensional diagrams such as cube and spherical diagrams.

The most commonly drawn diagrams and maps are:
Line graphs
Bar diagrams
Pie diagrams

Line Graph
Line graphs are usually drawn to represent time series data such as temperature, rainfall, population growth, birth rates and death rates.

Polygraph
A polygraph is a line graph in which two or more variables are shown on the same diagram by different lines. It helps in comparing the data. Examples that can be shown as a polygraph are:
The growth rates of different crops like rice, wheat and pulses in one diagram.
The birth rates, death rates and life expectancy in one diagram.

Bar Diagram
It is also called a columnar diagram. Bar diagrams are drawn using columns of equal width.

Pie Diagram
The pie diagram is another graphical method of representing data. It depicts the total value of a given attribute using a circle. The circle is divided into sectors whose angles are proportional to the subsets of the data. Hence it is also called a divided circle diagram.
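
The division of the circle can be sketched as follows: each subset gets an angle equal to its share of the total multiplied by 360 degrees. The function name pie_angles and the land-use figures are made up for illustration.

```python
def pie_angles(values):
    """Return the sector angle in degrees for each category."""
    total = sum(values.values())
    # angle = (value / total) * 360, written so integer inputs divide last
    return {label: value * 360 / total for label, value in values.items()}


land_use = {"Forest": 20, "Cropland": 50, "Pasture": 10, "Other": 20}  # made-up data
for label, angle in pie_angles(land_use).items():
    print(label, angle)
```

The angles always add up to 360 degrees, which is a quick check on the arithmetic before drawing.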

Measures of central tendency


A single value that attempts to describe a set of data by identifying the central position within that set of data. Sometimes called a measure of central location.

Types of measures of central tendency


Arithmetic mean
Median
Mode
Geometric mean
Harmonic mean

Arithmetic mean
Mathematical average. Method of representing the whole data by one figure. Simple measure and most widely used.

Mean in individual series
$$\bar{X} = \frac{\sum x}{n}$$
where n is the number of items.

Mean in the case of discrete frequency distribution


$$\bar{X} = \frac{\sum fx}{n}$$
where f is the frequency of the value x and n is the total frequency.
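
A minimal sketch of the two mean calculations, assuming the individual-series mean is the plain sum divided by the number of items. The function names and the sample figures are hypothetical.

```python
def mean_individual(values):
    """Arithmetic mean of an individual series: sum of items / number of items."""
    return sum(values) / len(values)


def mean_discrete(values, freqs):
    """Arithmetic mean of a discrete frequency distribution: sum(f*x) / n."""
    n = sum(freqs)  # n is the total frequency
    return sum(f * x for x, f in zip(values, freqs)) / n


x = [1, 2, 3, 4]   # made-up values
f = [2, 3, 4, 1]   # made-up frequencies
print(mean_individual([4, 8, 6, 2]))
print(mean_discrete(x, f))
```

Expanding the frequency table into a raw list (1, 1, 2, 2, 2, ...) and taking its plain mean gives the same figure as mean_discrete.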

Median
Value of the item which occupies the central position when the items are arranged in ascending or descending order of magnitude.

Median in the individual series


Arrange the values in ascending or descending order of magnitude and find the value of the middle item.
Median = value of the $\left(\frac{n+1}{2}\right)$th item, where n is the number of items.
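
A short sketch of this rule. When n is even, the (n+1)/2 position falls between two items; averaging them is the usual convention and is assumed here, since the slide does not cover that case. The function name is hypothetical.

```python
def median_individual(values):
    """Median of an individual series via the (n+1)/2-th ordered item."""
    ordered = sorted(values)            # ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]             # the (n+1)/2-th item (0-based index n//2)
    # Even n (assumed convention): average the two central items.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_individual([7, 3, 9, 5, 1]))
print(median_individual([4, 8, 6, 2]))
```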

Median in discrete frequency distribution


Median = size of the $\left(\frac{n+1}{2}\right)$th item, where n = total frequency.
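
In practice the item is located with cumulative frequencies. The sketch below assumes the common textbook convention of taking the first value whose cumulative frequency reaches or exceeds (n+1)/2; the function name and data are made up.

```python
def median_discrete(values, freqs):
    """Median of a discrete frequency distribution using cumulative frequency."""
    n = sum(freqs)                      # total frequency
    position = (n + 1) / 2              # position of the median item
    cumulative = 0
    for x, f in sorted(zip(values, freqs)):
        cumulative += f
        # Assumed convention: first value whose cumulative frequency
        # equals or exceeds the median position.
        if cumulative >= position:
            return x


sizes = [10, 20, 30, 40]   # made-up values
counts = [3, 5, 8, 4]      # made-up frequencies
print(median_discrete(sizes, counts))
```

With these figures n = 20, the position is 10.5, and the cumulative frequencies run 3, 8, 16, 20, so the median is the value of the third class of items.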

Median in continuous frequency distribution


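$$\text{Median} = l_1 + \frac{l_2 - l_1}{f_m}\,(m - c)$$
$l_1$: lower limit of the median class
$l_2$: upper limit of the median class
$f_m$: frequency of the median class
$m$: $n/2$
$n$: total frequency
$c$: cumulative frequency of the class preceding the median class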
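
The interpolation can be sketched as below, using the slide's symbols l1, l2, fm, m and c, with c taken as the cumulative frequency before the median class. The function name, the class limits and the frequencies are assumptions for illustration.

```python
def median_continuous(classes, freqs):
    """Median of a continuous frequency distribution.

    classes: list of (lower, upper) class limits in ascending order.
    freqs:   class frequencies in the same order.
    """
    n = sum(freqs)
    m = n / 2
    c = 0  # cumulative frequency of the classes before the current one
    for (l1, l2), fm in zip(classes, freqs):
        if c + fm >= m:
            # Median class found: interpolate inside it.
            return l1 + (l2 - l1) / fm * (m - c)
        c += fm


classes = [(0, 10), (10, 20), (20, 30), (30, 40)]  # made-up classes
freqs = [4, 6, 10, 5]                              # made-up frequencies
print(median_continuous(classes, freqs))
```

With this data n = 25 and m = 12.5, so the median class is 20 to 30 with c = 10, and the formula gives 20 + (10/10) x 2.5 = 22.5.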

Mode
Value of the variable which occurs most frequently in a distribution.

Mode in individual series


In an individual series the mode is the value which occurs the greatest number of times. When no item appears more often than the others, the mode is said to be ill-defined. In that case the mode is estimated by the empirical formula mode = 3 median - 2 mean.
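
A minimal sketch of this rule. It treats any tie for the highest count as an ill-defined mode, which is an assumption slightly broader than the slide's wording, and then falls back to the empirical formula. The function name is hypothetical.

```python
from collections import Counter
from statistics import mean, median


def mode_individual(values):
    """Mode of an individual series, with the empirical fallback."""
    counts = Counter(values)
    top = max(counts.values())
    candidates = [v for v, c in counts.items() if c == top]
    if len(candidates) == 1:
        return candidates[0]            # a single most frequent value
    # Ill-defined mode (assumed: any tie for the top count):
    # mode = 3 * median - 2 * mean
    return 3 * median(values) - 2 * mean(values)


print(mode_individual([2, 3, 3, 4, 5]))
print(mode_individual([1, 2, 3, 4, 10]))
```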

Mode in discrete frequency distribution


In a discrete frequency distribution the value having the highest frequency is taken as the mode.

Mode in continuous frequency distribution


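$$\text{Mode} = l_1 + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times c$$
$l_1$: lower limit of the modal class
$f_1$: frequency of the modal class
$f_0$, $f_2$: frequencies of the classes just preceding and succeeding the modal class, respectively
$c$: width of the modal class interval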
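
The formula can be sketched as below. The slide does not define c; it is taken here to be the width of the modal class, and a missing neighbouring class is given frequency 0. Both are assumptions, as are the function name and the data.

```python
def mode_continuous(classes, freqs):
    """Mode of a continuous frequency distribution.

    classes: list of (lower, upper) class limits in ascending order.
    freqs:   class frequencies in the same order.
    """
    i = max(range(len(freqs)), key=lambda k: freqs[k])   # index of the modal class
    l1, l2 = classes[i]
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0                    # preceding class (assumed 0 if none)
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0       # succeeding class (assumed 0 if none)
    c = l2 - l1                                          # assumed: width of the modal class
    # Note: the denominator is zero when f1 == f0 == f2; that case is not handled.
    return l1 + (f1 - f0) / (2 * f1 - f0 - f2) * c


classes = [(0, 10), (10, 20), (20, 30), (30, 40)]  # made-up classes
freqs = [5, 8, 12, 6]                              # made-up frequencies
print(mode_continuous(classes, freqs))
```

With this data the modal class is 20 to 30, so the mode is 20 + (12 - 8) / (24 - 8 - 6) x 10 = 24.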

Measures of dispersion
Refers to the variability in the size of items. Speaks about the spread or scatter of the values in a series. Tells the extent to which the values of a series differ from each other or from their average.

Measures of dispersion are classified into two:
Absolute measures
Relative measures

Absolute measures of dispersion


Expressed in the same units in which the data are collected. Measure the variability in a series. The various absolute measures of dispersion are:
Range
Quartile deviation
Mean deviation
Standard deviation

Relative measures of dispersion


Ratio of a measure of dispersion to the appropriate average from which the deviations are measured. Also called coefficient of dispersion. Useful for comparing two series for their variability.

Important relative measures of dispersion: 1. Coefficient of range 2. Coefficient of quartile deviation 3. Coefficient of mean deviation 4. Coefficient of variation
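
As one example of a relative measure, the coefficient of variation is commonly defined as the standard deviation divided by the mean, multiplied by 100. The slides list it without a formula, so that definition is an assumption here, as are the function name and both series.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Coefficient of variation in percent (assumed definition: SD / mean * 100)."""
    # pstdev is the population standard deviation (divides by n).
    return pstdev(values) / mean(values) * 100


series_a = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up data
series_b = [48, 50, 50, 52]           # made-up data
print(coefficient_of_variation(series_a))
print(coefficient_of_variation(series_b))
```

Because the result is a pure ratio, the two series can be compared for variability even though their averages are very different.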

Range
Simplest possible measure of dispersion.
Range = H - L
H: highest value
L: lowest value
$$\text{Coefficient of range} = \frac{H - L}{H + L}$$
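
Both quantities need only the highest and lowest values, as this small sketch shows. The function name and the data are made up.

```python
def range_and_coefficient(values):
    """Return (range, coefficient of range) for a series."""
    h = max(values)   # H: highest value
    l = min(values)   # L: lowest value
    return h - l, (h - l) / (h + l)


data = [12, 7, 25, 18, 5]   # made-up data
rng, coeff = range_and_coefficient(data)
print(rng, coeff)
```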

Standard deviation
Square root of the mean of the squares of the deviation of all values of a series from their arithmetic mean.

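SD in individual series
$$SD = \sqrt{\frac{\sum (x - \bar{X})^2}{n}}$$
where n is the number of items.

SD in discrete series
$$SD = \sqrt{\frac{\sum f\,(x - \bar{X})^2}{n}}$$
where f is the frequency of the value x and n is the total frequency.

SD in continuous series
$$SD = \sqrt{\frac{\sum f\,(x - \bar{X})^2}{n}}$$
where x is the class mark of each class, f is the class frequency and n is the total frequency.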
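
The definition above can be sketched directly: take the deviations from the arithmetic mean, square them, average them and take the square root. For a continuous series the class marks would be passed as the values, which is an assumption about how the grouped case is handled. Function names and data are hypothetical.

```python
from math import sqrt


def sd_individual(values):
    """Standard deviation of an individual series (divides by n)."""
    n = len(values)
    mean = sum(values) / n
    return sqrt(sum((x - mean) ** 2 for x in values) / n)


def sd_frequency(values, freqs):
    """Standard deviation of a frequency distribution.

    For a continuous series, pass the class marks as `values` (assumed).
    """
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    return sqrt(sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n)


print(sd_individual([2, 4, 4, 4, 5, 5, 7, 9]))          # made-up data
print(sd_frequency([2, 4, 5, 7, 9], [1, 3, 2, 1, 1]))   # the same data as a frequency table
```

Both calls describe the same eight observations, so they return the same standard deviation.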

Variance
The variance of a data set is calculated by taking the arithmetic mean of the squared differences between each value and the mean value.

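$$V = \frac{\sum (x - \bar{X})^2}{n}$$
where n is the number of values. The variance is the square of the standard deviation.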
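
A brief sketch of the calculation, checked against pvariance from Python's standard statistics module, which also divides by n. The function name variance and the data are made up.

```python
from statistics import pvariance


def variance(values):
    """Mean of the squared differences between each value and the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


data = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up data
print(variance(data))
print(pvariance(data))            # library equivalent (population variance)
```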

THANK YOU
