
Levels of Measurement

Not every statistical operation can be used with every variable. The type of statistical operations we employ will depend on how our variables are measured.

Nominal Ordinal Cardinal

Nominal Level of Measurement


Numbers or other symbols are assigned to a set of categories for the purpose of naming, labeling, or classifying the observations. Examples:
Political Party (UDF, LDF)
Religion (Hindu, Muslim, Christian)
Gender (Male, Female)
Nominal data are also called qualitative or categorical data.

Ordinal Level of Measurement


Ordinal variables have categories that can be ranked from low to high. Example: Social Class
Upper Class
Middle Class
Working Class

Cardinal Level of Measurement


Variables whose measurements for all cases are expressed in the same units. The data are real numbers. A further distinction is made between ratio and interval data.
Examples: Age, Income
This type of data is also referred to as quantitative or numerical.

Levels of Measurement-contd.
The critical difference between ordinal and interval data is that the intervals or differences between values of interval data are consistent and meaningful.

Summary-types of data
Cardinal: Values are real numbers; all calculations are valid; data may be treated as ordinal or nominal.
Ordinal: Values must represent the ranked order of the data; calculations based on an ordering process are valid; data may be treated as nominal but not as interval.
Nominal: Values are arbitrary numbers that represent categories; only calculations based on the frequencies of occurrence are valid; data may not be treated as ordinal or interval.

Discrete and Continuous Variables


Discrete variables: variables that have a minimum-sized unit of measurement, which cannot be sub-divided
Example: the number of children per family

Continuous variables: variables that, in theory, can take on all possible numerical values in a given interval
Example: length, income

Analyzing Data:
Descriptive and Inferential Statistics
Population: The total set of individuals, objects, groups, or events in which the researcher is interested. A descriptive measure of a population is called a parameter.
Sample: A relatively small subset selected from a population. A descriptive measure of a sample is called a statistic.
Descriptive statistics: Procedures that help us organize, summarize and present the data collected from either a sample or a population, in a convenient and informative way.
Inferential statistics: The logic and procedures concerned with making conclusions or inferences about characteristics of populations based on sample data.

What is statistics?
Statistics is a science consisting of a collection of methods for planning experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions based on the data.

The Organization of Information: Frequency Distributions


Frequency Distributions
Proportions and Percentages
Percentage Distributions
Frequency Distributions for Nominal Variables
Frequency Distributions for Ordinal Variables
Frequency Distributions for Interval-Ratio Variables

Cumulative Distributions
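
The topics listed above can be illustrated with a small sketch. It tabulates frequencies, proportions, percentages, and cumulative percentages for an ordinal variable. The responses are made up for illustration only, and the function name `frequency_table` is a hypothetical helper, not something taken from the slides.

```python
from collections import Counter

# Hypothetical responses for an ordinal variable (social class); illustrative only.
responses = ["Working", "Middle", "Middle", "Upper", "Working",
             "Middle", "Working", "Middle", "Upper", "Middle"]
order = ["Working", "Middle", "Upper"]  # categories ranked from low to high


def frequency_table(values, categories):
    """Return rows of (category, frequency, proportion, percentage, cumulative %)."""
    counts = Counter(values)
    n = len(values)
    rows = []
    cumulative = 0
    for category in categories:
        f = counts.get(category, 0)
        cumulative += f  # running total gives the cumulative distribution
        rows.append((category, f, f / n, 100 * f / n, 100 * cumulative / n))
    return rows


table = frequency_table(responses, order)
for category, f, p, pct, cum_pct in table:
    print(f"{category:<8} {f:>3} {p:>6.2f} {pct:>6.1f}% {cum_pct:>6.1f}%")
```

Cumulative percentages are only meaningful when the categories have an order, so for a nominal variable the last column would normally be dropped.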

Graphical Presentation of data


The Pie Chart
The Bar Graph
The Histogram
Time Series Charts
Stem and Leaf Plots
It is important to choose the appropriate graph to make statistical information coherent.

Pie chart
Pie chart: a graph showing the differences in frequencies or percentages among categories of a nominal or an ordinal variable. The categories are displayed as segments of a circle whose pieces add up to 100 percent of the total frequencies.

The Bar Graph


Bar graph: a graph showing the differences in frequencies or percentages among categories of a nominal or an ordinal variable. The categories are displayed as rectangles of equal width with their height proportional to the frequency or percentage of the category.

The Histogram
Histogram: a graph showing the differences in frequencies or percentages among categories of an interval-ratio variable. The categories are displayed as contiguous bars, with width proportional to the width of the category and height proportional to the frequency or percentage of that category.

[Figure: histogram of AGE OF RESPONDENT. Mean = 44.5, Std. Dev = 17.03, N = 1422]

Shapes of histograms
There are four typical shape characteristics

Negatively skewed Positively skewed

Modal classes
A modal class is the class with the largest number of observations. A unimodal histogram has a single modal class.

Time Series Charts


Time series chart: a graph displaying changes in a variable at different points in time. It shows time (measured in units such as years or months) on the horizontal axis and the frequencies (percentages or rates) of another variable on the vertical axis.

Weight Data Males: 140 145 160 190 155 165 150 190 195 138 160 155 153 145 170 175 175 170 180 135 170 157 130 185 190 155 170 155 215 150 145 155 155 150 155 150 180 160 135 160 130 155 150 148 155 150 140 180 190 145 150 164 140 142 136 123 155

Females: 140 120 130 138 121 125 116 145 150 112 125 130 120 130 131 120 118 125 135 125 118 122 115 102 115 150 110 116 108 95 125 133 110 150 108
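
The female weights listed above lend themselves to a stem-and-leaf display, one of the graph types named earlier. The slides do not say which display the data were meant for, so treating them this way is an assumption. The sketch below uses the tens as stems and the units as leaves; `stem_and_leaf` is a hypothetical helper name.

```python
from collections import defaultdict

# Female weights copied from the data listing above.
females = [140, 120, 130, 138, 121, 125, 116, 145, 150, 112, 125, 130,
           120, 130, 131, 120, 118, 125, 135, 125, 118, 122, 115, 102,
           115, 150, 110, 116, 108, 95, 125, 133, 110, 150, 108]


def stem_and_leaf(values):
    """Map each stem (value // 10) to the sorted list of its leaves (value % 10)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return dict(stems)


plot = stem_and_leaf(females)
# Print every stem in range, including any that happen to be empty.
for stem in range(min(plot), max(plot) + 1):
    leaves = "".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem:>2} | {leaves}")
```

Read sideways, the display has the same outline as a histogram with classes of width 10, while still showing every individual value.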

Uses of Graphical Display


Graphical display may enable researchers to seek transformations, to locate special effects and/or outliers, to look for patterns and groups, and generally to hunt for novel and unexpected phenomena or to suggest hypotheses worth investigating. Tukey claims that the greatest value of a picture is when it forces us to notice what we never expected to see.

The four data sets of Anscombe's plots


Data set:   1-3     1       2       3       4       4
Variable:   X       Y       Y       Y       X       Y
            10      8.04    9.14    7.46    8       6.58
            8       6.95    8.14    6.77    8       5.76
            13      7.58    8.74    12.74   8       7.71
            9       8.81    8.77    7.11    8       8.84
            11      8.33    9.26    7.81    8       8.47
            14      9.96    8.10    8.84    8       7.04
            6       7.24    6.13    6.08    8       5.25
            4       4.26    3.10    5.39    19      12.50
            12      10.84   9.13    8.15    8       5.56
            7       4.82    7.26    6.42    8       7.91
            5       5.68    4.74    5.73    8       6.89

Regression results (the same for all four data sets)
Number of observations = 11
Mean of Xs = 9.0
Mean of Ys = 7.5
Regression coefficient b of Y on X = 0.5
Equation of regression line: Y = 3 + 0.5X
Estimated standard error of b = 0.118
Multiple R square = 0.667
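
The regression results can be checked directly from the tabulated values. The following is a minimal sketch using the usual least-squares formulas for a simple regression of Y on X; the function name `simple_regression` and the dictionary layout are illustrative choices, not part of the original slides.

```python
# X values shared by data sets 1-3, and the separate X values of data set 4.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]

datasets = {
    1: (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    2: (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    3: (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    4: (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def simple_regression(x, y):
    """Least-squares fit of y = a + b*x; returns means, a, b and R squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                 # slope
    a = mean_y - b * mean_x       # intercept
    r2 = sxy ** 2 / (sxx * syy)   # squared correlation
    return {"mean_x": mean_x, "mean_y": mean_y, "a": a, "b": b, "r2": r2}


results = {k: simple_regression(x, y) for k, (x, y) in datasets.items()}
for k, r in results.items():
    print(f"set {k}: mean_x={r['mean_x']:.1f} mean_y={r['mean_y']:.2f} "
          f"a={r['a']:.2f} b={r['b']:.3f} R2={r['r2']:.3f}")
```

All four data sets give essentially the same summary numbers, which is why the plots, and not the regression output, are needed to tell them apart.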

ANSCOMBE PLOTS

Anscombe's plots illustrate that pictures are particularly valuable in an exploratory setting: not only can they confirm or contradict what we thought in advance about the data, but they can also reveal, in a dramatic way, things that we did not even suspect.

Summarizing Distribution: Measures of Center


Possibilities

What is the most likely outcome?
What outcome do we expect?
What is the outcome in the middle?

The Mode
The category or score with the largest frequency (or percentage) in the distribution. The mode can be calculated for variables at any level of measurement: nominal, ordinal, or interval-ratio.
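
As a minimal sketch, the mode can be found by counting how often each category occurs and keeping the most frequent one. The party labels below are hypothetical responses used only for illustration, and `modes` is an assumed helper name. It returns a list because a distribution can have more than one mode.

```python
from collections import Counter


def modes(values):
    """Return every category that attains the largest frequency."""
    counts = Counter(values)
    top = max(counts.values())
    return [value for value, count in counts.items() if count == top]


# Hypothetical nominal data (illustrative only).
parties = ["UDF", "LDF", "UDF", "UDF", "LDF"]
print(modes(parties))

# A bimodal case: two scores tie for the largest frequency.
print(modes([1, 1, 2, 2, 3]))
```

Only counting is involved, which is why the mode remains valid even when the values are mere labels.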

The Median
The score that divides the distribution into two equal parts, so that half the cases are above it and half below it. The median is the middle score, or average of middle scores in a distribution.

Median Exercise
Calculate the median:
Given the ordered list of cases, the median is the value of the case in position (n+1)/2. When n is even, this position falls between two cases, and the median is the average of those two.
Example: A sample of 10 adults was asked to report the number of hours they spent on the internet in the previous month. The results are listed below:
0, 7, 12, 5, 33, 14, 8, 0, 9, 22
Place them in ascending order as follows:
0, 0, 5, 7, 8, 9, 12, 14, 22, 33
Here (n+1)/2 = 5.5, so the median is the average of the 5th and 6th observations (the middle two), which are 8 and 9. The median is 8.5.
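
The exercise can be reproduced with a short sketch. The function below sorts the cases and takes the middle one, or the average of the middle two when the number of cases is even. The name `median` is chosen for illustration (Python's standard library also offers `statistics.median`).

```python
def median(values):
    """Middle value of the sorted cases; average of the middle two if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hours spent on the internet by the 10 adults in the example.
hours = [0, 7, 12, 5, 33, 14, 8, 0, 9, 22]
print(sorted(hours))   # [0, 0, 5, 7, 8, 9, 12, 14, 22, 33]
print(median(hours))   # 8.5
```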

Other ordered statistics


Order statistics are concerned with positions within an ordered list of cases.
First quartile (Q1): divides the lowest one fourth of the data from the upper three fourths.
Second quartile (Q2): the same as the median. It divides the lower half of the data from the upper half.
Third quartile (Q3): separates the lower three fourths of the data from the upper one fourth.

Depth means a value's position relative to the nearest extreme.
Median depth = (n+1)/2.
Quartile depth = (tmd+1)/2, where tmd, the truncated median depth, is the integer part of the median depth.
Compute the first and third quartiles for the above example.
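
One way to work the exercise is sketched below, assuming the depth rule exactly as stated: count the quartile depth in from the low end for Q1 and from the high end for Q3, averaging two neighbouring values when the depth ends in .5. The helper names are hypothetical, and other textbooks and software use slightly different quartile conventions, so results may differ from other tools.

```python
import math


def value_at_depth(ordered, depth, from_top=False):
    """Value at a 1-based depth counted from the low end (or from the high end)."""
    data = ordered[::-1] if from_top else ordered
    lo = math.floor(depth)
    hi = math.ceil(depth)
    # For a whole-number depth lo == hi, so this is just the value itself.
    return (data[lo - 1] + data[hi - 1]) / 2


def quartiles_by_depth(values):
    """Return (quartile depth, Q1, Q3) using the depth rule from the text."""
    ordered = sorted(values)
    n = len(ordered)
    median_depth = (n + 1) / 2
    tmd = math.floor(median_depth)      # truncated median depth
    quartile_depth = (tmd + 1) / 2
    q1 = value_at_depth(ordered, quartile_depth)
    q3 = value_at_depth(ordered, quartile_depth, from_top=True)
    return quartile_depth, q1, q3


hours = [0, 7, 12, 5, 33, 14, 8, 0, 9, 22]
depth, q1, q3 = quartiles_by_depth(hours)
print(depth, q1, q3)   # 3.0 5.0 14.0
```

For the internet-hours example, n = 10 gives a median depth of 5.5, a truncated median depth of 5, and a quartile depth of 3, so Q1 is the third value from the bottom (5) and Q3 is the third value from the top (14).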

The Mean
The arithmetic average obtained by adding up all the scores and dividing by the total number of scores.

Formula for the Mean

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

x bar equals the sum of all the scores, x, divided by the number of scores, n.
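
As a small worked sketch, the formula can be applied to the internet-hours data from the median exercise. The function name `mean` is illustrative.

```python
def mean(scores):
    """Sum of all the scores divided by the number of scores."""
    return sum(scores) / len(scores)


# Hours spent on the internet by the 10 adults in the median exercise.
hours = [0, 7, 12, 5, 33, 14, 8, 0, 9, 22]
x_bar = mean(hours)
print(sum(hours), len(hours), x_bar)   # 110 10 11.0
```

The mean of 11 is larger than the median of 8.5 for the same data, because the few large values (22 and 33) pull the mean upward.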

Shape of the Distribution


Symmetrical (mean is about equal to median)
Skewed
Negatively skewed: mean < median
Positively skewed: mean > median
Bimodal (two distinct modes)
Multi-modal (more than 2 distinct modes)

Considerations for Choosing a Measure of Central Tendency


For a nominal variable, the mode is the only measure that can be used. For ordinal variables, the mode and the median may be used. For interval-ratio variables, the mode, median, and mean may all be calculated. The mean provides the most information about the distribution, but the median is preferred if the distribution is skewed.
