Central tendency: the numerical value of an observation around which most numerical values of the other observations in the data set tend to cluster or group.
Variation: the extent to which values are dispersed around the central value.
Skewness: the extent to which numerical values depart from a symmetrical distribution around the central value.
A good measure of central tendency:
Should be rigidly defined.
Should be based on all the observations.
Should be easy to understand and calculate.
Should have sampling stability.
Should not be unduly affected by extreme observations.
Mode
A measure of central tendency.
The value that occurs most often.
Not affected by extreme values.
Used for either numerical or categorical data.
There may be no mode or several modes.
[Figure: two number-line plots, one with Mode = 9 and one with no mode.]
Mode: a measure of location recognized by the location of the most frequently occurring value in a set of data.
Sales during a 20-day period (data in ascending order): 53, 56, 57, 58, 58, 60, 61, 63, 63, 64, 64, 65, 65, 67, 68, 71, 71, 71, 71, 74.
The value 71 occurs most often (four times), so the mode is 71.
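As a minimal sketch, the mode of the sales figures can be found by counting how often each value occurs. The helper name `modes` is an assumption made for this illustration, as is the convention that a data set has no mode when every value occurs exactly once.

```python
from collections import Counter

# Sales during the 20-day period, as listed in the text.
sales = [53, 56, 57, 58, 58, 60, 61, 63, 63, 64,
         64, 65, 65, 67, 68, 71, 71, 71, 71, 74]


def modes(values):
    """Return all most frequent values, or [] if every value occurs once."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        # Assumed convention: no value repeats, so there is no mode.
        return []
    return sorted(v for v, c in counts.items() if c == top)


sales_modes = modes(sales)
print(sales_modes)
```

Returning a list rather than a single value covers the "no mode or several modes" cases mentioned above.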
Mode
The mode is always a category or score.
The mode is not necessarily the category with the majority (more than 50% of the cases).
The mode is the only measure of central tendency for nominal variables.
Some distributions are bimodal.
The median is a measure of central tendency for variables which are at least ordinal. The median represents the exact middle of a distribution.
It is the score that divides the distribution into two equal parts
Finding the Median in sorted data
How satisfied are you with your health insurance? Responses of 7 individuals, Total (N) = 7.
Response categories: very dissatisfied, somewhat dissatisfied, somewhat satisfied, very satisfied.
Sorted responses from the middle case upward: somewhat satisfied (the middle case = median), somewhat satisfied, very satisfied, very satisfied.
The median is the response associated with the middle case. You find the middle case by (N + 1) / 2. Since N = 7, the middle case is the (7 + 1) / 2 = 4th case. The response associated with the 4th case is "somewhat satisfied". Therefore the median is: somewhat satisfied.
With an even number of cases, the median is located halfway between the two middle cases. When the variable is interval, we can average the two middle cases: Median = (12.61 + 13.38) / 2 = 12.995.
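The two rules above (take the middle case when N is odd, average the two middle cases when N is even) can be sketched as follows. The function names are assumptions made for this illustration.

```python
def middle_position(n):
    """1-based position of the middle case, (N + 1) / 2."""
    return (n + 1) / 2


def median(values):
    """Median of numeric values: middle case, or mean of the two middle cases."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# N = 7 responses: the middle case is the 4th.
print(middle_position(7))

# Even N: average the two middle cases from the text.
print(round(median([12.61, 13.38]), 3))
```

For an even N, `middle_position` returns a value ending in .5, which signals that the median lies between two cases.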
Median
Robust measure of central tendency.
Not affected by extreme values.
[Figure: two number-line plots; Median = 5 in both.]
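Median for grouped data:
$\text{Med} = L + \dfrac{n/2 - cf}{f} \times h$
where L = lower limit of the median class, n = total number of observations, cf = cumulative frequency of the class preceding the median class, f = frequency of the median class, h = width of the median class.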
Partition Values: Quartiles, Deciles, and Percentiles
Quartiles divide an ordered data set into 4 equal parts; the 2nd quartile is the median.
Deciles divide an ordered data set into 10 equal parts; the 5th decile is the median.
Percentiles divide an ordered data set into 100 equal parts; the 50th percentile is the median.
For quartiles, i = 1, 2, 3.
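A short check of the statement that the 2nd quartile, the 5th decile and the 50th percentile all coincide with the median. This sketch reuses the 20-day sales figures from the mode example and Python's standard `statistics.quantiles`, whose default interpolation method is only one of several conventions for locating partition values.

```python
from statistics import median, quantiles

# Sales during the 20-day period, as listed earlier in the text.
sales = [53, 56, 57, 58, 58, 60, 61, 63, 63, 64,
         64, 65, 65, 67, 68, 71, 71, 71, 71, 74]

# quantiles(data, n=k) returns the k - 1 cut points that split the data
# into k equal parts, so the middle cut point has index k // 2 - 1.
q2 = quantiles(sales, n=4)[1]      # 2nd quartile
d5 = quantiles(sales, n=10)[4]     # 5th decile
p50 = quantiles(sales, n=100)[49]  # 50th percentile

print(q2, d5, p50, median(sales))
```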
Objectives of an Average
Determine a single value that may be used to describe the characteristics of the entire series.
Facilitate comparison at a particular point of time.
Facilitate statistical inference.
Help in the decision-making process.
The Mean
Mean: the arithmetic average obtained by adding up all the scores and dividing by the total number of scores.
$\bar{Y} = \dfrac{\sum Y}{N}$
where
$\bar{Y}$ = the mean of Y
$\sum Y$ = the sum of the raw scores of the variable Y
$N$ = the total number of scores
Crime rate per 1,000: 29.3, 28.9, 32.9, 36.5, 25, 14.7, 58.4, 48.8, 12.8, 21.8, 3.4, 6.6, 40.6, 12.9, 19.8
Total = 392.4 (N = 15)
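As a minimal sketch, the mean of the crime rates is their sum divided by the number of scores. The variable names are assumptions made for this illustration.

```python
# Crime rates per 1,000, as listed in the text.
crime_rates = [29.3, 28.9, 32.9, 36.5, 25, 14.7, 58.4, 48.8,
               12.8, 21.8, 3.4, 6.6, 40.6, 12.9, 19.8]

total = sum(crime_rates)   # sum of all scores
n = len(crime_rates)       # total number of scores
mean_rate = total / n      # arithmetic mean

print(f"Sum = {total:.1f}, N = {n}, mean = {mean_rate:.2f}")
```

With a total of 392.4 over 15 scores, the mean works out to 26.16 per 1,000.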
Sample statistic: a numerical value used as a summary measure, computed from the data of the sample, for estimation or hypothesis testing.
Population parameter: a numerical value used as a summary measure, computed from the data of the population.
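Sample mean (sample size n):
$\bar{X} = \dfrac{\sum_{i=1}^{n} X_i}{n} = \dfrac{X_1 + X_2 + \cdots + X_n}{n}$
Population mean (population size N):
$\mu = \dfrac{\sum_{i=1}^{N} X_i}{N} = \dfrac{X_1 + X_2 + \cdots + X_N}{N}$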
Mean of a frequency distribution:
$\bar{Y} = \dfrac{\sum fY}{N}$
where
$\bar{Y}$ = the mean
$fY$ = a score multiplied by its frequency
$\sum fY$ = the sum of all the fYs
$N$ = the total number of cases in the distribution
Weighted Mean
$\bar{X}_w = \dfrac{\sum_i w_i x_i}{\sum_i w_i}$
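A minimal sketch of the weighted mean, the sum of each value times its weight divided by the sum of the weights. The scores and weights below are hypothetical numbers chosen only for illustration; they do not come from the text.

```python
def weighted_mean(values, weights):
    """Sum of w_i * x_i divided by the sum of w_i."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Hypothetical data: three scores with weights 1, 2 and 3.
scores = [70, 80, 90]
weights = [1, 2, 3]

result = weighted_mean(scores, weights)
print(round(result, 2))
```

When all weights are equal, the weighted mean reduces to the ordinary arithmetic mean.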
Indirect method
The human resource manager at a city hospital began a study of the overtime hours of the registered nurses. Fifteen nurses were selected at random and the following overtime hours were recorded during a month: 13 13 12 15 17 15 5 12 6 7 12 10 9 13 12 5 9 6 10 5 6 9 6 9 12
Arithmetic mean of grouped (classified) data (direct and step-deviation methods)
The following distribution gives the pattern of overtime work done by 100 employees of a company. Calculate the average overtime work done per employee.
Mid value (m)   No. of employees (f)   d = (m - A)/5   fd
12.5            11                     -2              -22
17.5            20                     -1              -20
22.5            35                      0                0
27.5            20                      1               20
32.5             8                      2               16
37.5             6                      3               18
Total           100                                     12
(A = 22.5, the mid value for which d = 0.)
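The table can be worked through in code as a sketch of both methods. The step-deviation form used here, mean = A + (Σfd / N) × h, is the standard one and is assumed to be the formula the slide intends; A = 22.5 and h = 5 are read off the d column.

```python
# Mid values and frequencies from the overtime table.
mids = [12.5, 17.5, 22.5, 27.5, 32.5, 37.5]
freqs = [11, 20, 35, 20, 8, 6]

A = 22.5  # assumed mean (the mid value where d = 0)
h = 5     # class width used in d = (m - A) / h

n = sum(freqs)

# Direct method: sum of f * m divided by N.
direct_mean = sum(f * m for f, m in zip(freqs, mids)) / n

# Step-deviation method: A + (sum of f * d / N) * h.
d = [(m - A) / h for m in mids]
sum_fd = sum(f * di for f, di in zip(freqs, d))
step_mean = A + (sum_fd / n) * h

print(n, sum_fd, round(direct_mean, 2), round(step_mean, 2))
```

Both methods give the same average, 23.1; the step-deviation method only makes the hand arithmetic smaller.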
Geometric Mean
The geometric mean of a set of n numbers is defined as the nth root of the product of the n numbers and is used to average percents, indexes, and relatives. The formula is:
$\bar{X}_G = \sqrt[n]{X_1 X_2 \cdots X_n}, \quad X_i > 0$
The geometric mean more directly measures the change over more than one period.
Geometric Mean ≤ Arithmetic Mean
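A minimal sketch of the geometric mean as the nth root of the product, compared with the arithmetic mean. The growth factors below are hypothetical (a 10% rise, a 20% rise and a 10% fall over three periods) and are not taken from the text.

```python
import math


def geometric_mean(values):
    """nth root of the product of n positive numbers."""
    if any(v <= 0 for v in values):
        raise ValueError("all values must be positive")
    return math.prod(values) ** (1 / len(values))


# Hypothetical per-period growth factors.
factors = [1.10, 1.20, 0.90]

gm = geometric_mean(factors)
am = sum(factors) / len(factors)

print(round(gm, 4), round(am, 4))
```

Applying the geometric mean factor once per period reproduces the overall change, which the arithmetic mean does not. For long series, summing logarithms is a more numerically stable alternative to a direct product.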
Distributions can be either symmetrical or skewed, depending on whether there are more frequencies at one end of the distribution than the other.
Symmetrical Distributions
A distribution is symmetrical if the frequencies at the right and left tails of the distribution are identical, so that if it is divided into two halves, each will be the mirror image of the other.
In a unimodal symmetrical distribution the mean, median, and mode are identical.
Negatively skewed (longer tail extends to left): Mean < Median < Mode
Symmetrical: Mean = Median = Mode
Positively skewed (longer tail extends to right): Mode < Median < Mean
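The ordering of mean and median can serve as a rough indicator of the direction of skew. The sketch below applies that rule of thumb to the crime-rate figures from the mean example earlier in this document; the function name and return labels are assumptions, and equal mean and median do not by themselves prove that a distribution is symmetrical.

```python
from statistics import mean, median


def skew_direction(values):
    """Rule of thumb: compare the mean with the median."""
    m, med = mean(values), median(values)
    if m > med:
        return "right"  # mean pulled toward a longer right tail
    if m < med:
        return "left"   # mean pulled toward a longer left tail
    return "no skew indicated"


# Crime rates per 1,000, as listed earlier in the text.
crime_rates = [29.3, 28.9, 32.9, 36.5, 25, 14.7, 58.4, 48.8,
               12.8, 21.8, 3.4, 6.6, 40.6, 12.9, 19.8]

direction = skew_direction(crime_rates)
print(direction)
```

For these data the mean (26.16) exceeds the median (25), which points to a longer right tail.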
If the variable is nominal: Mode.
If the variable is ordinal: Mode or Median.
Calculate the mean, median and mode for the following data pertaining to marks in statistics. There are 80 students in the class and the test is out of 140 marks.
Marks more than   No. of students
0                 80
20                76
40                50
60                28
80                18
100                9
120                3
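A sketch of the first steps of this exercise: turning the "more than" cumulative counts into class frequencies, then computing the grouped mean and the grouped median with Med = L + ((n/2 - cf) / f) × h. It assumes classes 20 marks wide running from 0-20 up to 120-140 (the test maximum), and all names are illustrative. The mode is left out because the section gives no formula for the mode of grouped data.

```python
# "More than" cumulative counts from the exercise.
lower_bounds = [0, 20, 40, 60, 80, 100, 120]
more_than = [80, 76, 50, 28, 18, 9, 3]
width = 20  # assumed class width; top class assumed to be 120-140

# Class frequency = students above this bound minus students above the next.
freqs = [a - b for a, b in zip(more_than, more_than[1:] + [0])]
n = sum(freqs)

# Grouped mean from class mid values.
mids = [lb + width / 2 for lb in lower_bounds]
grouped_mean = sum(f * m for f, m in zip(freqs, mids)) / n


def grouped_median(lower_bounds, freqs, width):
    """Med = L + ((n/2 - cf) / f) * h, using the first class reaching n/2."""
    total = sum(freqs)
    cf = 0  # cumulative frequency before the current class
    for lb, f in zip(lower_bounds, freqs):
        if cf + f >= total / 2:
            return lb + (total / 2 - cf) / f * width
        cf += f


med = grouped_median(lower_bounds, freqs, width)
print(freqs, grouped_mean, round(med, 2))
```

Under these assumptions the median class is 40-60, since the cumulative frequency first reaches n/2 = 40 there.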