
CENTRAL TENDENCY: 

One of the important objectives of statistical analysis is to obtain a single value that
represents the entire data set. Such a value is called a measure of central tendency, or an
average. An average in statistics is therefore defined as a single value that is
representative of the entire data set. Since an average represents the entire data set, its
value lies between the two extremes, i.e., the largest and smallest items. Hence an average
is called a measure of central tendency. The three most common measures are:
 
1. Mean  
2. Median  
3. Mode 
MODE 
 
The mode is a statistical term for the most frequently occurring value in a set of
numbers. It is found by collecting and organizing the data so that the frequency of each
result can be counted. The result with the highest number of occurrences is the mode of
the set.
Equivalently, the mode is the element that appears the greatest number of times in a given
set of elements, that is, the element with the largest frequency in the data set.
The mode of a discrete probability distribution is the value x at which its probability
mass function takes its maximum value. In other words, it is the value that is most likely
to be sampled. The mode of a continuous probability distribution is the value x at which
its probability density function has its maximum value, so, informally speaking, the mode
is at the peak.
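As an illustration of the discrete case, the following minimal Python sketch finds the mode of a distribution by taking the value at which the probability mass function is largest. The binomial distribution and the parameters n = 10 and p = 0.3 are assumptions chosen only for this example; they do not come from the text above.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical parameters, chosen only for illustration.
n, p = 10, 0.3

# Tabulate the probability mass function over every possible outcome.
pmf = {k: binomial_pmf(k, n, p) for k in range(n + 1)}

# The mode is the outcome with the largest probability mass.
mode = max(pmf, key=pmf.get)
print(mode)  # 3
```

For a continuous distribution the same idea applies to the density function, although the maximum then usually has to be found analytically or numerically rather than by enumerating outcomes.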
Like the statistical mean and median, the mode is a way of expressing, in a single
number, important information about a random variable or a population. The numerical
value of the mode is the same as that of the mean and median in a normal distribution, but
it may be very different in highly skewed distributions.
The mode is not necessarily unique, since the same maximum frequency may be attained
at different values. The most extreme case occurs in uniform distributions, where all
values occur equally frequently. 
It is important to note that a given data set can have more than one mode. As long as
several elements all have the same frequency and that frequency is the highest, they are
all modal elements of the data set. By a common convention, if no number occurs more than
once in the set, the set is said to have no mode.
As noted above, the mode is not necessarily unique, since the probability mass function
or probability density function may take the same maximum value at several points x_1, x_2,
etc.
The above definition tells us that only global maxima are modes. Slightly confusingly,
when a probability density function has multiple local maxima it is common to refer to all
of the local maxima as modes of the distribution. Such a continuous distribution is
called multimodal (as opposed to unimodal).
In symmetric unimodal distributions, such as the normal (or Gaussian) distribution (the
distribution whose density function, when graphed, gives the famous "bell curve"), the
mean (if defined), median and mode all coincide. For samples, if it is known that they are
drawn from a symmetric distribution, the sample mean can be used as an estimate of the
population mode. 
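A small Python sketch can make the contrast between symmetric and skewed data concrete. It uses the standard library `statistics` module on two made-up samples (both are assumptions for illustration, and finite samples only approximate the behaviour of a distribution).

```python
import statistics

# Hypothetical samples: one symmetric about 3, one with a long right tail.
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
skewed = [1, 1, 1, 2, 2, 3, 10]


def summarize(values):
    """Return (mean, median, mode) of a sample."""
    return (
        statistics.mean(values),
        statistics.median(values),
        statistics.mode(values),
    )


sym_summary = summarize(symmetric)
skew_summary = summarize(skewed)

print(sym_summary)   # (3, 3, 3): all three measures coincide
print(skew_summary)  # the mean exceeds the median, which exceeds the mode
```

In the skewed sample the single large value pulls the mean upward, while the median and the mode are unaffected by it.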

 
 
 
Mode of a sample
The mode of a sample is the element that occurs most often in the collection.
Examples:
To find the mode of:
9, 3, 3, 44, 17, 17, 44, 15, 15, 15, 27, 40, 8
Put the numbers in order for ease:
3, 3, 8, 9, 15, 15, 15, 17, 17, 27, 40, 44, 44
The mode is 15 (15 occurs most often, 3 times).
Similarly, for the sample [1, 3, 6, 6, 6, 6, 7, 7, 12, 12, 17], the mode is 6.

Example 2
3, 12, 15, 3, 15, 8, 20, 19, 3, 15, 12, 19, 9
Solution
Mode = 3 and 15 (each occurs 3 times)
Here the mode is not unique, so the data set is said to be bimodal.

A set with more than two modes may be described as multimodal.
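The counting procedure used in these examples can be sketched in Python with `collections.Counter`. The helper name `sample_modes` is hypothetical; it returns every value that attains the highest frequency, so it handles unimodal and bimodal samples alike.

```python
from collections import Counter


def sample_modes(values):
    """Return every value that attains the highest frequency, sorted."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)


example_1 = [9, 3, 3, 44, 17, 17, 44, 15, 15, 15, 27, 40, 8]
example_2 = [3, 12, 15, 3, 15, 8, 20, 19, 3, 15, 12, 19, 9]

print(sample_modes(example_1))  # [15]
print(sample_modes(example_2))  # [3, 15]
```

Note that when every value occurs exactly once, this sketch returns all of the values. A caller who follows the convention that such a set has no mode would need to check for that case separately. The standard library function `statistics.multimode` (Python 3.8 and later) offers similar behaviour.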
Mode for Grouped Data 
As we saw in the section on data, grouped data is divided into classes. We have defined
the mode as the element with the highest frequency in a given data set. In grouped data,
we can find two kinds of mode: the modal class, which is the class with the highest
frequency, and the mode itself, which we calculate from the modal class using the formula
below.

$$\text{Mode} = L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times h$$

Where,
● L is the lower class limit of the modal class
● f_1 is the frequency of the modal class
● f_0 is the frequency of the class before the modal class in the frequency table
● f_2 is the frequency of the class after the modal class in the frequency table
● h is the class interval of the modal class
To find the modal class and the actual mode of the data set below:
Number     Frequency
1 - 3      7
4 - 6      6
7 - 9      4
10 - 12    9
13 - 15    2
16 - 18    8
19 - 21    1
22 - 24    2
25 - 27    3
28 - 30    2
 
Solution 
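Modal class = 10 - 12

$$\text{Mode} = L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times h$$

Where,
● L = 10
● f_1 = 9
● f_2 = 2
● f_0 = 4
● h = 3
Therefore,

$$\text{Mode} = 10 + \frac{9 - 4}{2(9) - 4 - 2} \times 3$$

Solving the above using the order of operations:

$$\text{Mode} = 10 + \frac{5}{12} \times 3 = 10 + 1.25 = 11.25$$
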
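The grouped-data calculation can also be sketched in Python, using the formula Mode = L + (f_1 - f_0) / (2 f_1 - f_0 - f_2) × h. The function name `grouped_mode` and the tuple layout are hypothetical. The frequency table is the one from the example, with a frequency of 9 for the class 10 - 12 as used in the solution. Treating a missing neighbouring class as frequency 0 is an assumption of this sketch.

```python
def grouped_mode(classes):
    """Estimate the mode of grouped data.

    classes: list of (lower_limit, width, frequency) tuples in table order.
    Returns (index_of_modal_class, estimated_mode).
    """
    freqs = [f for _, _, f in classes]
    i = freqs.index(max(freqs))  # first class with the highest frequency
    lower, width, f1 = classes[i]
    # Assumption: a missing neighbour counts as frequency 0.
    f0 = freqs[i - 1] if i > 0 else 0
    f2 = freqs[i + 1] if i + 1 < len(freqs) else 0
    denom = 2 * f1 - f0 - f2
    if denom == 0:
        raise ValueError("modal class is not distinct from its neighbours")
    return i, lower + (f1 - f0) * width / denom


# (lower class limit, class interval, frequency)
table = [
    (1, 3, 7), (4, 3, 6), (7, 3, 4), (10, 3, 9), (13, 3, 2),
    (16, 3, 8), (19, 3, 1), (22, 3, 2), (25, 3, 3), (28, 3, 2),
]

index, mode = grouped_mode(table)
print(table[index][0], mode)  # 10 11.25
```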
For a sample from a continuous distribution, such as [0.935..., 1.211..., 2.430...,
3.668..., 3.874...], the concept is unusable in its raw form, since no two values will be
exactly the same, so each value will occur precisely once. In order to estimate the mode,
the usual practice is to discretize the data by assigning frequency values to intervals of
equal width, as for making a histogram, effectively replacing the values by the
midpoints of the intervals they are assigned to. The mode is then the value where the
histogram reaches its peak. For small or middle-sized samples the outcome of this
procedure is sensitive to the choice of interval width: it becomes unreliable if the width
is chosen too narrow or too wide. Typically one should have a sizable fraction of the data
concentrated in a relatively small number of intervals (5 to 10), while the fraction of the
data falling outside these intervals is also sizable. An alternative approach is kernel
density estimation, which essentially blurs point samples to produce a continuous estimate
of the probability density function, which in turn can provide an estimate of the mode.
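The histogram approach can be sketched as follows. The function name `histogram_mode`, the `origin` parameter, and the sample values are all assumptions made for illustration (the sample is made up, not the truncated one quoted above). Ties between equally full intervals are broken in favour of the lowest interval, which is an arbitrary choice.

```python
import math
from collections import Counter


def histogram_mode(values, bin_width, origin=0.0):
    """Estimate the mode as the midpoint of the fullest equal-width interval."""
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    # Assign each value to the index of the interval that contains it.
    counts = Counter(math.floor((v - origin) / bin_width) for v in values)
    # Highest count wins; ties go to the lowest interval index.
    best_bin = max(counts, key=lambda b: (counts[b], -b))
    return origin + (best_bin + 0.5) * bin_width


# Hypothetical continuous sample: no value repeats exactly.
sample = [0.94, 1.21, 2.43, 2.51, 2.78, 3.67, 3.87]

estimate = histogram_mode(sample, bin_width=1.0)
print(estimate)  # 2.5
```

Re-running the sketch with a different `bin_width` can change the estimate or produce ties, which is the sensitivity to interval width described above.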
Use 
Unlike mean and median, the concept of mode also makes sense for "nominal data" (i.e.,
data that do not consist of numerical values, as the mean requires, or even of ordered
values, as the median requires). For example, taking a sample of Korean family names, one
might find that "Kim" occurs more often than any other name. Then "Kim" would be the mode
of the sample. In any voting system where a plurality determines victory, a single modal
value determines the victor, while a multi-modal outcome would require some
tie-breaking procedure to take place.
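Because the mode only requires counting, it can be computed for names or ballots just as for numbers. The sketch below uses a hypothetical helper, `plurality_result`, and made-up data; it returns the single modal value, or `None` when several values tie and a tie-breaking procedure would be needed.

```python
from collections import Counter


def plurality_result(votes):
    """Return the unique most common item, or None if there is a tie."""
    if not votes:
        return None
    ranked = Counter(votes).most_common()  # sorted by descending count
    top_count = ranked[0][1]
    leaders = [item for item, count in ranked if count == top_count]
    return leaders[0] if len(leaders) == 1 else None


# Hypothetical sample of family names.
names = ["Kim", "Lee", "Park", "Kim", "Choi", "Kim", "Lee"]
print(plurality_result(names))  # Kim

# Hypothetical ballot with two equally popular options.
ballot = ["A", "B", "A", "B", "C"]
print(plurality_result(ballot))  # None
```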
Unlike the median, the concept of mean makes sense for any random variable assuming
values from a vector space, including the real numbers (a one-dimensional vector space)
and the integers (which can be considered embedded in the reals). For example, a
distribution of points in the plane will typically have a mean and a mode, but the concept
of median does not apply. The median makes sense when there is a linear order on the
possible values. Generalizations of the concept of median to higher-dimensional spaces are
the geometric median and the center point.
Uniqueness  
For some probability distributions, the expected value may be infinite or undefined, but if
defined, it is unique. The mean of a (finite) sample is always defined. The median is the
value such that the fractions not exceeding it and not falling below it are both at least 1/2.
It is not necessarily unique, but never infinite or totally undefined. For a data sample it is
the "halfway" value when the list of values is ordered in increasing value, where usually
for a list of even length the numerical average is taken of the two values closest to
"halfway". Finally, as said before, the mode is not necessarily unique.
Certain pathological distributions (for example, the Cantor distribution) have no defined
mode at all. For a finite data sample, the mode is one (or more) of the values in the
sample. 
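The statement that the median is not necessarily unique can be checked directly from its definition. The sketch below uses a hypothetical helper, `is_median`, that tests whether a candidate value satisfies the "at least 1/2 on each side" condition for a finite sample; the sample itself is made up.

```python
import statistics


def is_median(values, m):
    """True if at least half the values are <= m and at least half are >= m."""
    n = len(values)
    not_above = sum(1 for v in values if v <= m)
    not_below = sum(1 for v in values if v >= m)
    return 2 * not_above >= n and 2 * not_below >= n


data = [1, 2, 3, 4]  # hypothetical sample of even length

# Every value between the two middle items satisfies the definition.
print([m for m in (1.5, 2, 2.5, 3, 3.5) if is_median(data, m)])  # [2, 2.5, 3]

# The usual convention picks the average of the two middle values.
print(statistics.median(data))  # 2.5
```

For a sample of odd length, such as [1, 2, 3], only the middle value passes the check.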
 
Advantages and Disadvantages of Mode 
Advantages:
● It is easy to understand and simple to calculate.
● It is not affected by extremely large or small values.
● It can be located by inspection alone in ungrouped data and discrete frequency
distributions.
● It can be useful for qualitative data.
● It can be computed from an open-ended frequency table.
● It can be located graphically.
Disadvantages:
● It is not always well defined.
● It is not based on all the values.
● It is stable only for a large number of values; it may not be well defined if the data
consist of a small number of values.
● It is not capable of further mathematical treatment.
Sometimes the data have one mode or more than one mode, and sometimes the data have no
mode at all.
 
