
Frequency Distributions

Ungrouped and grouped
Percentiles and percentile ranks
Graphs

Frequency Distributions

Tables and graphs
Any scale of measurement
How many times does each score occur?

How to make a frequency distribution

1. Find the highest score in the raw data.
2. Find the lowest score in the raw data.
3. Write down the score values from highest to lowest in a column headed X.
4. Count how many times each score occurs, and write the frequency in a second column headed f.
5. Add up the f column to check: Σf = N.

An example
The DIT-2 test of moral reasoning was taken by 20 Houghton seniors. On the new N-scale, which measures the rejection of self-interest in moral reasoning, their scores were:

20 18 16 13 19 19 20 18 18 17 18 19 17 19 18 17 19 18 15 19

The highest score is 20; the lowest is 13.

X     f
20    2
19    6
18    6
17    3
16    1
15    1
14    0
13    1
Σf = 20

Scores are listed from highest to lowest under X. The count of each score is placed under f. The sum of f equals N (20). Note that most of the 20 seniors scored very high on the scale: 85% scored 17 or higher.
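The steps above can be sketched in Python. This is a minimal illustration, assuming whole-number scores; the function name `frequency_distribution` is made up for this sketch and is not part of the course material.

```python
from collections import Counter

# Senior N-scale scores from the example above.
scores = [20, 18, 16, 13, 19, 19, 20, 18, 18, 17,
          18, 19, 17, 19, 18, 17, 19, 18, 15, 19]


def frequency_distribution(data):
    """Return (X, f) rows from the highest score to the lowest.

    Score values that never occur are kept with f = 0,
    which assumes the scores are whole numbers.
    """
    counts = Counter(data)
    return [(x, counts[x]) for x in range(max(data), min(data) - 1, -1)]


table = frequency_distribution(scores)

print("X   f")
for x, f in table:
    print(f"{x:<3} {f}")

# Check from the last step: the f column must add up to N.
assert sum(f for _, f in table) == len(scores)
```

Listing every value between the highest and lowest score, rather than only the values that occur, keeps gaps such as X = 14 (f = 0) visible in the table.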

Another example
The DIT-2 test of moral reasoning was taken by 20 Houghton sophomores. On the new N-scale, which measures the rejection of self-interest in moral reasoning, their scores were:

13 12 15 10 11 10 9 11 13 10 10 11 10 12 11 10 11 10 8 12

The highest score is 15, and the lowest is 8.

X     f
15    1
14    0
13    2
12    3
11    5
10    7
9     1
8     1
Σf = 20

Scores are listed from highest to lowest under X. The count of each score is placed under f. The sum of f equals N (20). Note that most of the 20 sophomores scored in the middle of the scale: 85% scored between 10 and 13.

Compare the seniors with the sophomores.


Seniors        Sophomores
X     f        X     f
20    2        15    1
19    6        14    0
18    6        13    2
17    3        12    3
16    1        11    5
15    1        10    7
14    0        9     1
13    1        8     1

Although the distributions overlap, most seniors score well above most sophomores. Seniors are more likely than sophomores to reject self-interest when they make moral decisions.

Frequency distribution

Organizes data
Scores in order
Frequency of each score

10 12 11 21 19 16 16 15 9 17 15 18
9 7 20 17 8 20
17 16 14 13 12 14
6 13 18 21 10 12

X    Frequency (f)   p = f/N   Cum f   Cum %
21   2               0.07      30      100.00
20   2               0.07      28      93.33
19   1               0.03      26      86.67
18   2               0.07      25      83.33
17   3               0.10      23      76.67
16   3               0.10      20      66.67
15   2               0.07      17      56.67
14   2               0.07      15      50.00
13   2               0.07      13      43.33
12   3               0.10      11      36.67
11   1               0.03      8       26.67
10   2               0.07      7       23.33
9    2               0.07      5       16.67
8    1               0.03      3       10.00
7    1               0.03      2       6.67
6    1               0.03      1       3.33
Σf = N = 30
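As a rough sketch of how the p, Cum f, and Cum % columns follow from the f column, the table can be rebuilt from the 30 raw scores. The function name `cumulative_table` is hypothetical, and whole-number scores are assumed.

```python
from collections import Counter

# The 30 raw scores listed above.
scores = [10, 12, 11, 21, 19, 16, 16, 15, 9, 17, 15, 18,
          9, 7, 20, 17, 8, 20,
          17, 16, 14, 13, 12, 14,
          6, 13, 18, 21, 10, 12]


def cumulative_table(data):
    """Return rows of (X, f, p, cum_f, cum_pct), highest score first."""
    counts = Counter(data)
    n = len(data)
    rows = []
    cum_f = 0
    # Accumulate from the lowest score upward, then flip the order.
    for x in range(min(data), max(data) + 1):
        f = counts[x]
        cum_f += f
        rows.append((x, f, f / n, cum_f, 100 * cum_f / n))
    rows.reverse()
    return rows


rows = cumulative_table(scores)

print(f"{'X':<4}{'f':<4}{'p':<7}{'cum f':<7}{'cum %'}")
for x, f, p, cum_f, cum_pct in rows:
    print(f"{x:<4}{f:<4}{p:<7.2f}{cum_f:<7}{cum_pct:.2f}")
```

Computing Cum % as Cum f divided by N, rather than by adding up the rounded p values, avoids accumulating rounding error down the column.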

Grouped frequency distributions

1. Find the range of the scores: highest score minus lowest score.
2. Select an interval size that will give about 10 groups.
3. Use multiples of the interval size as the nominal lower limits of each interval.
4. List and tally the scores, and fill in the frequency (f) and cum f columns.
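The grouping steps can be sketched as follows, reusing the 30 raw scores from the ungrouped table earlier in these notes. The interval size of 2 is an assumption chosen for illustration (it gives 8 groups for a range of 15), and `grouped_frequency` is a hypothetical helper name. Whole-number scores are assumed.

```python
from collections import Counter

# The 30 raw scores from the ungrouped frequency distribution.
scores = [10, 12, 11, 21, 19, 16, 16, 15, 9, 17, 15, 18,
          9, 7, 20, 17, 8, 20,
          17, 16, 14, 13, 12, 14,
          6, 13, 18, 21, 10, 12]


def grouped_frequency(data, interval):
    """Return rows of (lower, upper, f, cum_f), highest interval first.

    Nominal lower limits are multiples of the interval size.
    """
    lowest = (min(data) // interval) * interval
    highest = (max(data) // interval) * interval
    # Tally each score under the lower limit of its interval.
    counts = Counter((x // interval) * interval for x in data)
    rows = []
    cum_f = 0
    for lower in range(lowest, highest + 1, interval):
        f = counts[lower]
        cum_f += f
        rows.append((lower, lower + interval - 1, f, cum_f))
    rows.reverse()
    return rows


grouped = grouped_frequency(scores, interval=2)  # interval size is an assumption

print("Interval   f   cum f")
for lower, upper, f, cum_f in grouped:
    print(f"{lower:>2}-{upper:<2}      {f:<3} {cum_f}")
```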

Percentiles or percentile points

The percentile is the score (value of X) that has a given percentage of the number of scores (N) below it. For example, if a distribution contains 60 scores (N = 60), then the 50th percentile is the score that has 30 scores (50% of 60 scores) below it. The 20th percentile is the score that has 12 scores (20% of 60) below it.

Percentile Ranks

The percentile rank is the percentage of the scores in a distribution that fall below a given score. For example, if an SAT verbal score of 660 is higher than 80% of the scores in the distribution, then the percentile rank of a score of 660 is 80. If an SAT math score of 550 is higher than 55% of the scores in the distribution, then the percentile rank of a score of 550 is 55.
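Taken literally, this definition is a count of the scores below a given score, expressed as a percentage of N. The sketch below applies it to the 30 raw scores from the ungrouped table earlier in these notes; `percentile_rank_below` is a hypothetical name. The grouped-data formula treats each score as an interval, so it can give slightly different values.

```python
# The 30 raw scores from the ungrouped frequency distribution.
scores = [10, 12, 11, 21, 19, 16, 16, 15, 9, 17, 15, 18,
          9, 7, 20, 17, 8, 20,
          17, 16, 14, 13, 12, 14,
          6, 13, 18, 21, 10, 12]


def percentile_rank_below(data, score):
    """Percentage of the scores in data that fall strictly below score."""
    below = sum(1 for x in data if x < score)
    return 100 * below / len(data)


# 20 of the 30 scores are below 17.
print(round(percentile_rank_below(scores, 17), 2))
```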

Percentile or percentile rank?

A percentile is a score.
A percentile rank is a percent.

Calculating percentiles
1. Multiply the percentage in the question by N to find the number of scores below the desired percentile (cum f_p).
2. Find the lower limit of the interval containing cum f_p.
3. Determine the proportion of the scores in the interval needed to reach cum f_p: (cum f_p - cum f_L) / f_i.
4. Find the corresponding score.

The formula approach

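Using the logical method, we can derive a formula, viz.:

\[ P_{50} = \mathrm{LRL} + \left( \frac{\mathrm{cum}f_p - \mathrm{cum}f_b}{f_w} \right)(i) \]

from which a parallel formula is derived, viz.:

\[ P_{50} = \mathrm{LRL} + i \left( \frac{\mathrm{cum}f_p - \mathrm{cum}f_b}{f_w} \right) \]

Here LRL is the lower real limit of the interval containing the percentile, cum f_b is the cumulative frequency below that interval (cum f_L in the steps), f_w is the frequency within the interval (f_i in the steps), and i is the interval size. The formula is written for the 50th percentile; any other percentile uses the same form with its own cum f_p.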

Percentile examples: Verbal SAT


Class Interval   f     cum f
760-800          4     60
710-750          4     56
660-700          6     52
610-650          12    46
560-600          16    34
510-550          11    18
460-500          5     7
410-450          2     2
Σf = 60
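The four calculation steps can be applied to this table in a short Python sketch. Two things here are assumptions rather than part of the slides: scores are taken to be reported in steps of 10, so each lower real limit sits 5 points below the nominal lower limit, and the interval size i is taken as 50. The function name `percentile` is also made up for the sketch.

```python
# (nominal lower limit, f), lowest interval first, from the Verbal SAT table.
intervals = [(410, 2), (460, 5), (510, 11), (560, 16),
             (610, 12), (660, 6), (710, 4), (760, 4)]

INTERVAL_SIZE = 50  # i: assumed, since nominal lower limits are 50 apart
HALF_UNIT = 5       # assumed: scores are reported in steps of 10


def percentile(p, table, i=INTERVAL_SIZE, half_unit=HALF_UNIT):
    """Score that has p percent of the scores below it (0 < p <= 100)."""
    n = sum(f for _, f in table)
    cum_fp = p * n / 100              # step 1: number of scores below P_p
    cum_fb = 0                        # cumulative frequency below the interval
    for lower, fw in table:
        if cum_fb + fw >= cum_fp:     # step 2: this interval contains cum_fp
            lrl = lower - half_unit   # lower real limit of the interval
            # steps 3 and 4: move the needed proportion of the way into it
            return lrl + i * (cum_fp - cum_fb) / fw
        cum_fb += fw
    raise ValueError("p must be between 0 and 100")


print(percentile(50, intervals))  # 555 + 50 * (30 - 18) / 16 = 592.5
```

For the 50th percentile, cum f_p is 30, which falls in the 560-600 interval (18 scores lie below it and 16 lie within it), so the result is 12/16 of the way through that interval.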

Percentile rank

\[ \mathrm{PRank} = \frac{\mathrm{cum}f_L + (f_i / i)(X - X_L)}{N} \times 100 \]
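A minimal sketch of this formula, using the frequencies from the Verbal SAT table. As assumptions not stated in the slides, scores are taken to be reported in steps of 10 (so the lower real limit X_L is 5 below the nominal lower limit) and the interval size i is 50. The name `percentile_rank` is hypothetical.

```python
# (nominal lower limit, f), lowest interval first, from the Verbal SAT table.
intervals = [(410, 2), (460, 5), (510, 11), (560, 16),
             (610, 12), (660, 6), (710, 4), (760, 4)]


def percentile_rank(x, table, i=50, half_unit=5):
    """Percentile rank of score x in a grouped frequency distribution.

    i (interval size) and half_unit (half the score unit) are assumptions.
    """
    n = sum(f for _, f in table)
    cum_fl = 0                          # cumulative frequency below the interval
    for lower, fi in table:
        xl = lower - half_unit          # lower real limit X_L
        if xl <= x < xl + i:            # x falls in this interval
            return (cum_fl + fi * (x - xl) / i) / n * 100
        cum_fl += fi
    raise ValueError("score lies outside the table")


print(percentile_rank(592.5, intervals))          # the 50th percentile maps back to 50.0
print(round(percentile_rank(660, intervals), 2))  # (46 + 6 * 5 / 50) / 60 * 100
```

Feeding a percentile back into the percentile-rank formula should return the original percentage, which makes a convenient check on both calculations.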

Later in the course, we will learn a much simpler method of finding percentiles and percentile ranks.

Graph types

Bar graph: For frequency distributions of discrete variables, often nominal or ordinal data. Bars represent separate groups, so they should be separated.
Histogram: For frequency distributions of continuous variables, usually interval or ratio data. Bars represent segments of a range, so they should touch.

Bar graphs
[Figure: bar graph of the number of students in statistics class from each of four majors (POL, BAD, PSY, SOC), fall 2005; frequency axis from 0 to 16.]

Histogram
[Figure: histogram of the scores of statistics students on the first exam; X axis from 60 to 100.]

Frequency polygons
[Figure: frequency polygon; X axis from 60 to 100.]

These are formed by drawing lines to connect the midpoints of the tops of the bars in a histogram. Don't forget to close the tails to the X axis at each end.

More graphs

Cumulative frequency or cumulative percentage curve
Stem-and-leaf diagram
Excel charts

Characteristic shapes of distributions

One-mode symmetrical
The normal distribution

Bimodal, multimodal, and no-mode distributions
Skewed distributions
Negative skew and positive skew
The sign of mean - median indicates the direction of skew

Rectangular distributions
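As a rough illustration of the mean - median rule for skew, the sketch below applies it to the seniors' N-scale scores from the first example. It is only a quick indicator of direction, not a formal skewness statistic, and `skew_direction` is a hypothetical helper name.

```python
from statistics import mean, median

# Senior N-scale scores from the first example.
seniors = [20, 18, 16, 13, 19, 19, 20, 18, 18, 17,
           18, 19, 17, 19, 18, 17, 19, 18, 15, 19]


def skew_direction(data):
    """Classify the direction of skew from the sign of mean - median."""
    diff = mean(data) - median(data)
    if diff < 0:
        return "negative"
    if diff > 0:
        return "positive"
    return "none"


# The mean (17.85) is below the median (18), so the tail points toward low scores.
print(skew_direction(seniors))
```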
