Statistics is a numerical statement of facts in any department of enquiry placed in relation to each other. -Bowley
Statistics are the classified facts representing the conditions of the people in a State, especially those facts which can be stated in numbers or in any tabular or classified arrangement. -Webster
Statistics can be defined as the aggregate of facts affected to a marked extent by multiplicity of causes, numerically expressed, enumerated or estimated according to a reasonable standard of accuracy, collected in a systematic manner, for a pre-determined purpose and placed in relation to each other. -Secrist
Statistics is the science of collecting, organizing, analyzing, interpreting and presenting data.
SCOPE OF STATISTICS
1.Social Sciences
-Man Power Planning
-Crime Rates
-Income & Wealth Analysis of Society
-In studying Pricing, Production, Consumption, Investments & Profits etc.
2.Planning
-Agriculture
-Industry
-Textiles
-Education etc.
For ex. Five Year Plans in India.
4. Economics
- Family Budgeting
-Applied in solving economic problems related to production, consumption, distribution of products as per income & wealth related patterns, wages, prices, profits & individual savings, investments, unemployment & poverty etc.
i) Marketing
Marketing Policy Decisions depend on forecasting, demand analysis, time & motion studies, inventory control, investments & analysis of consumer data for production & sales.
Distribution
iv) Sales
-Demand Analysis -Sales Forecasts
v) Personnel
- Wage plans, Incentive plans, Cost of living, Labor turnover ratio, Employment trends, Accident Rates, Performance Appraisals etc.
LIMITATIONS OF STATISTICS
- Does not study individual items; deals with aggregates.
- Statistical laws are not exact.
- Not suitable for the study of qualitative phenomena.
- Statistical methods are only a means, not an end, for solving problems.
Definitions Continued
Observations: Numerical quantities that measure specific characteristics. Examples include height, weight, gross sales, net profit, etc.
Classes / class intervals: Subgroups within a set of collected data. Ex. 10-20, 20-30 etc.
Width of class-interval = Upper limit - Lower limit
Mid Value = (U.L + L.L)/2
Frequency: The number of times a certain value or class of values occurs.
Frequency Distribution Table: The organization of raw data into table form using classes and frequencies.
More Definitions
Cumulative Frequency of a class is the sum of the frequency of that class and the frequencies of all the preceding classes (less than type) or all the succeeding classes (more than type), the classes being listed in some sensible order (numerical order, alphabetical order, etc.)
Marks of ten students of a class in Statistics 15, 35, 55, 67, 78, 84, 79, 90, 89, 94
No. of Students 12 18 10 6 4
60 62 64 66 68
Frequency
8
2
30-35 35-40
40-45
40 23
9
C.I (inclusive)   Frequency
1-10              2
11-20             6
21-30             10
31-40             15
41-50             12
1. Calculate the adjustment factor using the given inclusive type class intervals:
A.F = (Lower Limit of Next C.I - Upper Limit of Previous C.I) / 2
2. Obtain new class intervals as follows:
New Lower Limit = Old Lower Limit - A.F
New Upper Limit = Old Upper Limit + A.F
C.I (exclusive)   Frequency
0.5-10.5          2
10.5-20.5         6
20.5-30.5         10
30.5-40.5         15
40.5-50.5         12
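The conversion from inclusive to exclusive class intervals can be sketched in a few lines. This is a minimal illustration, assuming equally spaced classes; the function name `to_exclusive` is ours, and only the three class intervals that are legible in the table (21-30, 31-40, 41-50) are used.

```python
def to_exclusive(classes):
    """Convert inclusive class intervals [(lower, upper), ...] to exclusive ones."""
    # Adjustment factor: half the gap between one class's upper limit
    # and the next class's lower limit.
    af = (classes[1][0] - classes[0][1]) / 2
    # Subtract A.F from every lower limit and add it to every upper limit.
    return [(lo - af, hi + af) for lo, hi in classes]


inclusive = [(21, 30), (31, 40), (41, 50)]
exclusive = to_exclusive(inclusive)
print(exclusive)
```

Here A.F = (31 - 30)/2 = 0.5, so 21-30 becomes 20.5-30.5, and so on.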
Freq.   Less than cum. freq.   More than cum. freq.   Class
15      15                     60 + 15 = 75           20-25
34      15 + 34 = 49           26 + 34 = 60
6       49 + 6 = 55            20 + 6 = 26
10      55 + 10 = 65           10 + 10 = 20
8       65 + 8 = 73            2 + 8 = 10
2       73 + 2 = 75            2
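Both cumulative frequency columns are running totals, one taken from the top of the table and one from the bottom. A short sketch using the frequencies 15, 34, 6, 10, 8, 2 from the table (the variable names are ours):

```python
from itertools import accumulate

freqs = [15, 34, 6, 10, 8, 2]

# "Less than" type: running total starting from the first class.
less_than = list(accumulate(freqs))

# "More than" type: running total starting from the last class,
# then flipped back into the original class order.
more_than = list(accumulate(reversed(freqs)))[::-1]

print(less_than)
print(more_than)
```

The last "less than" value and the first "more than" value both equal the total frequency, 75.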
Measures of central tendency are also known as averages. Values in a data set show a distinct tendency to cluster or group around a value. This behavior is the central tendency of the data. The value around which the data clusters is the measure of central tendency, which represents the whole set of data.
Objectives of Averages
- To find out one value that represents the whole mass of data.
- To enable comparison.
- To establish relationship.
- To derive inferences about the universe to which the sample belongs.
- To aid decision making.
Requisites of a Good Average
- Should be rigidly defined.
- Should be mathematically expressed.
- Should be readily comprehensible & easy to calculate.
- Should be calculated on the basis of all the observations.
- Should be least affected by extreme values and sampling fluctuations.
- Should be suitable for further mathematical treatment.
- Arithmetic Mean
- Geometric Mean
- Harmonic Mean
- Median
- Mode
- Partition Values like Deciles, Quartiles & Percentiles.
Averages
- Mathematical Averages: A.M, G.M, H.M
- Positional Averages: Median, Mode
Arithmetic Mean
Discrete Frequency Distribution
$\bar{x} = \dfrac{f_1x_1 + f_2x_2 + \dots + f_nx_n}{N} = \dfrac{\sum fx}{N}$
where $N = f_1 + f_2 + \dots + f_n$ and n = no. of observations.

x       Freq.   fx
x1      f1      f1x1
x2      f2      f2x2
x3      f3      f3x3
x4      f4      f4x4
Total   N       Σfx

x       Freq.
60      12
62      18
64      10
66      6
68      4
Total   50 = N
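As a quick check of the discrete formula, the table with x = 60, 62, 64, 66, 68 and frequencies 12, 18, 10, 6, 4 can be worked through in code. This is only a sketch; the variable names are ours.

```python
xs = [60, 62, 64, 66, 68]
fs = [12, 18, 10, 6, 4]

N = sum(fs)                                   # total frequency
sum_fx = sum(f * x for f, x in zip(fs, xs))   # sum of f * x over all rows
mean = sum_fx / N

print(N, sum_fx, mean)
```

The fx column is 720, 1116, 640, 396, 272, giving Σfx = 3144 and a mean of 3144/50 = 62.88.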
Continuous Frequency Distribution
- Direct Method
- Assumed Mean Method
- Step Deviation Method
Direct Method
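$\bar{x} = \dfrac{f_1x_1 + f_2x_2 + \dots + f_nx_n}{N} = \dfrac{\sum fx}{\sum f}$
where $N = f_1 + f_2 + \dots + f_n$ and x = mid value of a C.I = (U.L + L.L)/2.
Step Deviation Method
$\bar{x} = A + \dfrac{\sum fd}{N} \times i$
where A = assumed mean, $N = \sum f$, $d = \dfrac{x - A}{i}$, x = mid value, i = width of C.I = U.L - L.L.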
C.I     Mid value (x)   Freq. (f)
4-6     5               6
6-8     7               12
8-10    9               17
10-12   11              10
12-14   13              5
Total                   50 = Σf
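The direct method and the step deviation method should always give the same mean. The sketch below checks this on the mid-values 5, 7, 9, 11, 13 with frequencies 6, 12, 17, 10, 5. The assumed mean A = 9 is our own choice for illustration (any mid-value would do), and i = 2 is the class width.

```python
mid_values = [5, 7, 9, 11, 13]
freqs = [6, 12, 17, 10, 5]
N = sum(freqs)

# Direct method: mean = sum(f * x) / N
direct_mean = sum(f * x for f, x in zip(freqs, mid_values)) / N

# Step deviation method: mean = A + (sum(f * d) / N) * i, with d = (x - A) / i
A = 9   # assumed mean (illustrative choice, not taken from the source)
i = 2   # width of each class interval
d = [(x - A) / i for x in mid_values]
sum_fd = sum(f * dv for f, dv in zip(freqs, d))
step_mean = A + sum_fd / N * i

print(direct_mean, step_mean)
```

With these numbers Σfx = 442 and Σfd = -4, so both methods give 8.84.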
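C.I     f     Mid value (x)   d = x - A   fd
10-15   2     12.5            -10         -20
15-20   7     17.5            -5          -35
20-25   9     22.5 = A        0           0
25-30   8     27.5            5           40
30-35   6     32.5            10          60
35-40   4     37.5            15          60
Total   Σf = 36                           Σfd = 105
(The 10-15 row is missing from the extracted table; it is the row implied by the stated totals Σf = 36 and Σfd = 105.)
$\bar{x} = A + \dfrac{\sum fd}{\sum f} = 22.5 + \dfrac{105}{36} = 22.5 + 2.917 = 25.417$ Ans.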
Illustration
Marks (X or more)   Cum. Freq.   C.I      Freq.
10                  140          10-20    140 - 133 = 7
20                  133          20-30    133 - 118 = 15
30                  118          30-40    118 - 100 = 18
40                  100          40-50    100 - 75 = 25
50                  75           50-60    75 - 45 = 30
60                  45           60-70    45 - 25 = 20
70                  25           70-80    25 - 9 = 16
80                  9            80-90    9 - 2 = 7
90                  2            90-100   2 - 0 = 2
100                 0
Proceed as usual.
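Recovering ordinary class frequencies from a "more than" type cumulative distribution is a matter of taking successive differences. A short sketch with the cumulative frequencies of this illustration (variable names are ours):

```python
# "X or more" cumulative frequencies for X = 10, 20, ..., 100
cum_more_than = [140, 133, 118, 100, 75, 45, 25, 9, 2, 0]

# Frequency of each class = its cumulative frequency minus the next one.
freqs = [a - b for a, b in zip(cum_more_than, cum_more_than[1:])]

print(freqs)
print(sum(freqs))
```

The frequencies add back up to 140, the number of students scoring 10 or more.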
What if the class intervals are of inclusive type?
C.I     Frequency
50-59   1
40-49   3
30-39   8
20-29   10
10-19   15
0-9     3
Total   N = 40
A.F = (L.L of 1st C.I - U.L of 2nd C.I)/2 = (50 - 49)/2 = 0.5
New C.I:
L.L of new C.I = L.L of original C.I - A.F
U.L of new C.I = U.L of original C.I + A.F
For ex., for the 1st C.I, new L.L = 50 - 0.5 = 49.5 and new U.L = 59 + 0.5 = 59.5, and so on. Now continue as usual.
Determining missing frequency when A.M is known
Illustration: Mean = 16.82
Marks   Freq. (f)     M.V (x)     d = (x - A)/i   fd
0-5     10            2.5         -3              -30
5-10    12            7.5         -2              -24
10-15   16            12.5        -1              -16
15-20   ? = f4        17.5 = A    0               0
20-25   14            22.5        1               14
25-30   10            27.5        2               20
30-35   8             32.5        3               24
Total   N = 70 + f4                               Σfd = -12
Soln.
$A + \dfrac{\sum fd}{\sum f} \times i = 16.82$ (given), i = 5
Hence $16.82 = 17.5 + \dfrac{-12}{70 + f_4} \times 5$
$-0.68 = \dfrac{-60}{70 + f_4}$
$70 + f_4 = \dfrac{60}{0.68} \approx 88.24$, so $f_4 \approx 18$.
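The last step of this solution is a one-line rearrangement of the step deviation formula. The sketch below solves for the missing frequency; the helper name `missing_frequency` is ours. It relies on the missing class being the assumed-mean class (d = 0), so that Σfd = -12 does not depend on the unknown frequency.

```python
def missing_frequency(mean, A, i, sum_fd, known_total):
    """Solve mean = A + (sum_fd / (known_total + f)) * i for the unknown f."""
    return sum_fd * i / (mean - A) - known_total


# Values from the illustration: mean 16.82, A = 17.5, i = 5,
# sum of fd = -12, and known frequencies adding up to 70.
f4 = missing_frequency(mean=16.82, A=17.5, i=5, sum_fd=-12, known_total=70)

print(f4, round(f4))
```

The exact solution is about 18.24; since a frequency must be a whole number, it is taken as 18.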
Combined A.M
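Suppose for k different series with $n_1, n_2, \dots, n_k$ observations each, the respective A.Ms are $\bar{x}_1, \bar{x}_2, \dots, \bar{x}_k$. Then the A.M of the new series obtained on combining all the $n_1 + n_2 + \dots + n_k$ observations is obtained using the formula:
$\bar{x} = \dfrac{n_1\bar{x}_1 + n_2\bar{x}_2 + \dots + n_k\bar{x}_k}{n_1 + n_2 + \dots + n_k}$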
Illustration - Combined A.M
There are two branches of a Co. employing 100 and 80 employees respectively. If the A.Ms of the monthly salaries paid by the two branches are Rs. 4570 and Rs. 6750 respectively, find the A.M of the salaries of the employees of the Co. as a whole.
Soln. Given
No. of employees in 1st branch, $n_1 = 100$
Avg. salary of employees in 1st branch, $\bar{x}_1$ = Rs. 4570
No. of employees in 2nd branch, $n_2 = 80$
Avg. salary of employees in 2nd branch, $\bar{x}_2$ = Rs. 6750
Avg. salary of the employees of the Co. as a whole
$= \dfrac{100 \times 4570 + 80 \times 6750}{100 + 80} = \dfrac{997000}{180}$ = Rs. 5538.89
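The combined mean is a weighted average of the group means, with the group sizes as weights. A minimal sketch follows; the function name `combined_mean` is ours, and Rs. 4570 is used for the first branch because that is the figure consistent with the stated total of 997000.

```python
def combined_mean(sizes, means):
    """Combined A.M of several series from their sizes and individual A.Ms."""
    total = sum(n * m for n, m in zip(sizes, means))   # n1*x1 + n2*x2 + ...
    return total / sum(sizes)


combined = combined_mean(sizes=[100, 80], means=[4570, 6750])
print(round(combined, 2))
```

Note that the result lies closer to 4570 than a simple average of the two means would, because the first branch has more employees.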
Q1.
No.of workers
200
700
900
800
600
400
45-49 18
Q2
No.of Students
Q3
125-175
175-225
225-275
275-325
325-375
375-425
425-475
10
25
35
12
10
Q4
Less than 300 Less than 400 Less than 500 Less than 600
194
265 324 374 392 400
Merits of A.M
- Is rigidly defined and has a definite value.
- Is based on all the observations.
- Is capable of algebraic treatment for further data analysis & interpretation.
- Easy to calculate & simple to understand.
- For a large no. of observations, A.M provides a good basis of comparison.
Drawbacks of A.M
- Being based on all the observations, it is considerably affected by abnormal observations. For ex., the A.M of 1000, 25, 35 & 40 is (1000+25+35+40)/4 = 275, which is not at all a representative figure.
- Cannot be calculated even if a single observation is missing.
- Cannot be obtained just by inspection as in case of median & mode.
- May give absurd results. For ex., if the avg. no. of children per family is calculated and the result is 3.4 children per family, how would you interpret it?
Lower Staff
100
150 350 = w
15000 106000 = wX
The value of the middle term of a series arranged in ascending or descending order of magnitude. Its value is the value of the middle item irrespective of all other values.
Calculation of Median
Individual Series
N = no. of observations or items in the series
- Arrange all the items in ascending or descending order of magnitude.
Case I: N = Odd
Median = Value at the $\left(\dfrac{N+1}{2}\right)$th position in the arranged series.
Case II: N = Even
Median = A.M of the values at the $\left(\dfrac{N}{2}\right)$th and $\left(\dfrac{N}{2}+1\right)$th positions.
Calculation of Median Illustration (Individual Series)
Ex 1. Find the median of 5, 7, 9, 12, 10, 8, 7, 15, 21
Solution: Arranging in ascending order we get 5, 7, 7, 8, 9, 10, 12, 15, 21
Here N = 9, i.e. odd.
Hence Md = $\left(\dfrac{N+1}{2}\right)$th item in the arranged order = $\left(\dfrac{9+1}{2}\right)$th item = 5th item = 9 Ans.
Ex 2. Find the median of 10, 18, 9, 17, 15, 24, 30, 11
Solution: Arranging in ascending order we get 9, 10, 11, 15, 17, 18, 24, 30
Here N = 8, i.e. even.
Hence Md = A.M of the $\left(\dfrac{N}{2}\right)$th and $\left(\dfrac{N}{2}+1\right)$th items in the arranged order = A.M of the 4th and 5th items = $\dfrac{15 + 17}{2}$ = 16 Ans.
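The two cases for an individual series translate directly into code. This is a small sketch of the positional rule; the function name `median` is ours, and Python's 0-based indexing shifts each position by one.

```python
def median(values):
    """Median of an individual series using the odd/even positional rule."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd N: the ((N + 1) / 2)-th item, which is index n // 2 when counting from 0.
        return ordered[mid]
    # Even N: A.M of the (N/2)-th and (N/2 + 1)-th items.
    return (ordered[mid - 1] + ordered[mid]) / 2


ex1 = median([5, 7, 9, 12, 10, 8, 7, 15, 21])
ex2 = median([10, 18, 9, 17, 15, 24, 30, 11])
print(ex1, ex2)
```

Both results agree with the worked examples: 9 for Ex 1 and 16 for Ex 2.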
Calculation of Median
Discrete Frequency Distribution
(i) Find the less than type cum. frequency.
(ii) Find N/2 ($N = \sum f$).
(iii) Find the cum. freq. just greater than N/2. Suppose it is C.
(iv) Find the corresponding value of X (the item). This is the median.
x       No. of students   Cum. Freq.
60      12                12
62      18                30
64      10                40
66      6                 46
68      4                 50
Total   N = 50
Here N = 50
(i) N/2 = 25
(ii) Cum. frequency just greater than N/2 = 30
(iii) Corresponding value of item is 62.
Median = 62 Ans.
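The four steps for a discrete frequency distribution can be sketched as a single pass over the cumulative frequencies. The function name `median_discrete` is ours; the x-values 60 to 68 are taken from the earlier table that carries the same frequencies 12, 18, 10, 6, 4.

```python
from itertools import accumulate


def median_discrete(xs, fs):
    """Median of a discrete frequency distribution (xs in ascending order)."""
    half = sum(fs) / 2
    # Walk down the "less than" cumulative frequencies and stop at the
    # first one that is just greater than N/2.
    for x, cum_freq in zip(xs, accumulate(fs)):
        if cum_freq > half:
            return x


md = median_discrete([60, 62, 64, 66, 68], [12, 18, 10, 6, 4])
print(md)
```

N/2 is 25, the first cumulative frequency above it is 30, and the corresponding item is 62.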
Calculation of Median
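Continuous Frequency Distribution
$Md = L_1 + \dfrac{\frac{N}{2} - C}{f} \times (L_2 - L_1)$
where
L1 = L.L of median class
L2 = U.L of median class
C = cum. freq. of the class preceding the median class
f = frequency of the median class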
N/2 = 3600/2 = 1800
Cum. freq. just greater than 1800 is 2600. Hence the median class is 25-30.
Hence L1 = 25, L2 = 30, C = 1800, f = 800
$Md = 25 + \dfrac{1800 - 1800}{800} \times (30 - 25) = 25$ Ans.
30-35
35-40
600
400 f= 3600
3200
3600
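The median calculation for a continuous distribution can be sketched as below. Only the classes 25-30, 30-35 and 35-40, their frequencies 800, 600, 400, and the total of 3600 are legible in this example. The first three class labels (10-15, 15-20, 20-25) are an assumption made for illustration, and the frequencies 200, 700, 900 are assumed to be those listed in the workers exercise. The function name `grouped_median` is ours.

```python
from itertools import accumulate


def grouped_median(classes, freqs):
    """Median of a continuous frequency distribution by interpolation."""
    half = sum(freqs) / 2
    prev_cum = 0  # C: cum. freq. of the class preceding the current one
    for (l1, l2), f, cum in zip(classes, freqs, accumulate(freqs)):
        if cum > half:
            # First cum. freq. just greater than N/2 marks the median class.
            return l1 + (half - prev_cum) / f * (l2 - l1)
        prev_cum = cum


# First three class labels are assumed; see the note above.
classes = [(10, 15), (15, 20), (20, 25), (25, 30), (30, 35), (35, 40)]
freqs = [200, 700, 900, 800, 600, 400]
md = grouped_median(classes, freqs)
print(md)
```

Because the cumulative frequency reaches exactly 1800 at the end of the third class, the interpolation term is zero and the median falls on the lower limit of the median class, 25.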
C.I      Freq.     Cumulative Freq.
0-20     14        14
20-40    ? = f1    14 + f1
40-60    27        41 + f1
60-80    ? = f2    41 + f1 + f2
80-100   15        56 + f1 + f2
Total    N = 100
L1 = 40, L2 = 60, f = 27, C = 14 + f1
Q1.
Age     No. of Persons
20-25   14
25-30   28
30-35   33
35-40   30
40-45   20
45-50   15
50-55   13
55-60   7
Q2. Value
Less than 10 Less than 20 Less than 30 Less than 40 Less than 50 Less than 60 Less than 70 Less than 80
70-80
18
229 = N
Merits - Median
- Is rigidly defined.
- Can be easily calculated.
- Not affected by extreme values.
- Can be located merely by inspection.
Demerits - Median
- May not represent the entire series in many cases.
- Not suitable for further algebraic treatment.
- More likely to be affected by sampling fluctuations.
Mode
The value occurring the largest no. of times in a series, that is, the value having the maximum frequency. Is calculated for discrete and continuous frequency distributions only. For ex., how would you obtain the mode for 1, 2, 3, 4, 5, where each observation has frequency 1 and so no single value has the maximum frequency?
No.of students 1
130
132 135 140 141 Total
3
2 2 1 1 10
1. Look for the class-interval with maximum frequency. This is the modal class.
2. Note down the following:
L1 = lower limit of the modal class.
i = width of class-interval.
f0 = frequency of class preceding the modal class.
f1 = frequency of modal class.
f2 = frequency of class succeeding the modal class.
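The quantities noted in step 2 are the inputs of the usual textbook interpolation formula for the mode of a continuous distribution, Mode = L1 + (f1 - f0) / (2 f1 - f0 - f2) × i. The formula itself is not present in the text above, so treating it as the intended one is an assumption. The function name `grouped_mode` and the numbers in the example are hypothetical and chosen only for illustration.

```python
def grouped_mode(L1, i, f0, f1, f2):
    """Mode of a continuous distribution from the modal class and its neighbours."""
    return L1 + (f1 - f0) / (2 * f1 - f0 - f2) * i


# Hypothetical modal class 30-35 (width 5) with frequency 33,
# preceded by a class of frequency 28 and followed by one of frequency 30.
mode = grouped_mode(L1=30, i=5, f0=28, f1=33, f2=30)
print(mode)
```

The mode is pulled toward the neighbouring class with the higher frequency; here f2 exceeds f0, so the mode lies in the upper half of the modal class.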
Geometric Mean
Individual Series
$G = (x_1 \cdot x_2 \cdot x_3 \cdots x_n)^{1/n}$
$\log G = \dfrac{1}{n}(\log x_1 + \log x_2 + \dots + \log x_n)$
$G = \operatorname{antilog}\left(\dfrac{1}{n}\sum \log x\right)$
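The log form of the geometric mean is the one normally used for calculation, since it avoids multiplying many numbers together. A minimal sketch using base-10 logs, where the antilog is simply a power of 10. The function name `geometric_mean` and the sample values 2, 4, 8 are ours.

```python
import math


def geometric_mean(xs):
    """G = antilog((1/n) * sum(log x)) for positive observations."""
    mean_log = sum(math.log10(x) for x in xs) / len(xs)
    return 10 ** mean_log  # antilog to base 10


g = geometric_mean([2, 4, 8])
print(round(g, 6))
```

For 2, 4, 8 the product is 64 and its cube root is 4, which the log route reproduces up to floating-point rounding.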
Geometric Mean
Continuous Frequency Distribution - Formula same as in case of discrete frequency distribution with x (as observations) replaced by x (as mid-values)
Harmonic Mean
$H = \dfrac{n}{\sum \left(\dfrac{1}{x}\right)}$
Harmonic Mean
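-Discrete Frequency Distribution
$H = \dfrac{1}{\dfrac{1}{N}\left(\dfrac{f_1}{x_1} + \dfrac{f_2}{x_2} + \dots + \dfrac{f_n}{x_n}\right)}$
$H = \dfrac{N}{\sum \left(\dfrac{f_i}{x_i}\right)}$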
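Both harmonic mean formulas can be covered by one small function, since an individual series is just a frequency distribution in which every frequency is 1. The function name `harmonic_mean` and the sample values are ours, chosen only for illustration.

```python
def harmonic_mean(xs, fs=None):
    """H = N / sum(f / x); with no frequencies given, every f is taken as 1."""
    if fs is None:
        fs = [1] * len(xs)
    return sum(fs) / sum(f / x for f, x in zip(fs, xs))


h_individual = harmonic_mean([1, 2, 4])         # 3 / (1 + 0.5 + 0.25)
h_discrete = harmonic_mean([2, 4], fs=[1, 2])   # 3 / (1/2 + 2/4)
print(h_individual, h_discrete)
```

For a continuous frequency distribution the same call applies with the class mid-values passed as `xs`.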
Harmonic Mean
Continuous Frequency Distribution - Formula same as that of Discrete Frequency Distribution with x (as observations) replaced by x (as mid values).