Mari Sudha
Outline
Glossary Levels of Measurement Sampling Organizing Data Statistics
Glossary
Population
Group of individuals under study
Sample
A finite subset of statistical individuals in a population
Parameter
A value, usually unknown (and which therefore has to be estimated), used to represent a certain population characteristic. For example, the population mean is a parameter that is often used to indicate the average value of a quantity. Denoted by Greek letters, e.g., μ, σ
Statistic
A quantity that is calculated from a sample of data. It is possible to draw more than one sample from the same population, and the value of a statistic will in general vary from sample to sample. Often assigned Roman letters (e.g., m and s)
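The contrast between a parameter and a statistic can be sketched in Python. This is a minimal illustration under stated assumptions: the population is made-up data generated with a fixed seed, and the names `population`, `sample_statistics`, and `results` are hypothetical, not part of the original slides.

```python
import random
import statistics

# Hypothetical population: 1,000 made-up values (assumption for illustration).
rng = random.Random(42)
population = [rng.gauss(50, 10) for _ in range(1000)]

# Parameters: computed from the whole population (usually unknown in practice).
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

def sample_statistics(pop, n, rng):
    """Draw one sample of size n without replacement; return (m, s)."""
    sample = rng.sample(pop, n)
    return statistics.fmean(sample), statistics.stdev(sample)

# Statistics: computed from samples, so they vary from sample to sample.
results = [sample_statistics(population, 30, rng) for _ in range(3)]
for m, s in results:
    print(f"m = {m:.2f}, s = {s:.2f}   (mu = {mu:.2f}, sigma = {sigma:.2f})")
```

Each pass through the loop prints a different m and s, while mu and sigma stay fixed because they describe the whole population.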
Glossary (cont.)
Sample Size
No. of individuals in a sample
Population Frame
List of sampling units from which the sample is selected (directories, maps, registered voters, list(s), etc.)
Statistical Inference
Makes use of information from a sample to draw conclusions (inferences) about the population from which the sample was taken
Experiment
Any process or study which results in the collection of data, the outcome of which is unknown
Glossary (cont.)
Random Process
An experiment, trial, or observation that can be repeated numerous times under the same conditions, the outcomes of which are independent and identically distributed. Each outcome is in no way affected by any previous outcome and cannot be predicted with certainty
Random Variable
A variable whose value results from a measurement on some type of random process, e.g., the tossing of a coin. Can be classified as either discrete (a random variable that may assume either a finite number of values or an infinite sequence of values) or continuous (a variable that may assume any numerical value in an interval or collection of intervals)
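A small sketch of the two kinds of random variable, assuming Python's standard `random` module as the random process. The function names `toss_coin` and `waiting_time`, and the interval [0, 10], are hypothetical choices made only for illustration.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

def toss_coin(rng):
    """Discrete random variable: 1 for heads, 0 for tails (finite set of values)."""
    return rng.choice([0, 1])

def waiting_time(rng, upper=10.0):
    """Continuous random variable: any numerical value in the interval [0, upper]."""
    return rng.uniform(0.0, upper)

tosses = [toss_coin(rng) for _ in range(10)]
times = [waiting_time(rng) for _ in range(5)]

print("coin tosses  :", tosses)
print("waiting times:", [round(t, 3) for t in times])
```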
Independent Variables
Variables that are manipulated and whose effects are measured and compared; also known as treatments; may include price levels, advertising themes, etc.
Glossary (cont.)
Experimental/Test Unit
Individuals, organizations, or other entities whose response to the independent variables or treatments is examined; may include consumers, stores, or geographic areas
Dependent Variables
Variables that measure the effect of the independent variables on the test units; may include sales, profits, and market share
Extraneous Variables
Variables other than the independent variables that affect the response of the test units; they can confound the dependent variable measures in ways that weaken or invalidate the results of the experiment. Examples include store size, store location, and competitive effort
Raw Data
Data collected in original form
Glossary (cont.)
Frequency
The number of times a particular value or class occurs in the data
Frequency Distribution
The organization of raw data in table form with classes and frequencies
Measurement Scales
Variables differ in how well they can be measured, i.e., in how much measurable information their measurement scale can provide.
There is some measurement error involved in every measurement, which determines the amount of information that we can obtain.
Another factor that determines the amount of information a variable can provide is its type/level of measurement scale.
Levels of Measurement
Data obtained from measurement are classified using numbers (in order to determine the way we are going to measure the variables).
Classification can be done with different levels of precision, or levels of measurement.
It is important to know the level of measurement (LOM) we are working with, because it partly determines the arithmetic and statistical operations that can be carried out on the data.
Nominal
Simple and widely used when the relationship between two variables is to be studied.
On a nominal scale, numbers are no more than labels; they are used only to identify different categories of responses.
E.g.,
What is your gender? [ ] Male [ ] Female
Nominal (cont.)
E.g., a survey of retail stores on two dimensions: the way of maintaining stocks and daily turnover.
How do you stock items at present?
[ ] By product category
[ ] At a centralized store
[ ] Department wise
[ ] Single warehouse
What is the daily consumer turnover?
[ ] Between 100 and 200
[ ] Between 200 and 300
[ ] Above 300
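Because nominal values are only labels, the natural operations are counting and cross-tabulating. The sketch below assumes a handful of made-up survey responses (the `responses` list is hypothetical) and counts them with `collections.Counter`.

```python
from collections import Counter

# Hypothetical responses: (stocking method, daily turnover band).
responses = [
    ("By product category", "100-200"),
    ("Department wise", "200-300"),
    ("By product category", "Above 300"),
    ("Single warehouse", "100-200"),
    ("By product category", "100-200"),
    ("At a centralized store", "200-300"),
]

# One-way counts: how often each category label occurs.
method_counts = Counter(method for method, _ in responses)

# Two-way cross-tabulation: counts for each pair of labels.
crosstab = Counter(responses)

# The mode is the only meaningful "average" for nominal data.
most_common_method = method_counts.most_common(1)[0][0]
print("Most common stocking method:", most_common_method)

for (method, turnover), n in sorted(crosstab.items()):
    print(f"{method:25s} {turnover:10s} {n}")
```

Note that nothing here adds, subtracts, or orders the labels; only equality between labels is used.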
Ordinal
E.g., the results of a horse race, which say only which horses arrived first, second, third, etc., but include no information about times.
Textual labels can be used instead of numbers to represent the category responses.
Ordinal (cont.)
E.g.1, Rank the following attributes (1-5) on their importance in a microwave oven:
1. Company Name
2. Functions
3. Price
4. Comfort
5. Design
The most important attribute is ranked 1 by the respondents and the least important is ranked 5.
Instead of numbers, letters or symbols can also be used to rate on an ordinal scale.
Such a scale makes no attempt to measure the degree of favorability of different rankings.
Ordinal (cont.)
If there are 4 different types of fertilizer and they are ordered on the basis of quality as Grade A, Grade B, Grade C, and Grade D, the result is again an ordinal scale.
If there are 5 different brands of talcum powder and a respondent ranks them on, say, freshness, with Rank 1 having maximum freshness, Rank 2 the second-highest freshness, and so on, an ordinal scale results.
Interval
[Figure: scale of equally spaced numbered positions]
An interval scale tells us that position 5 on the scale is above position 4, and also that the distance from 5 to 4 is the same as the distance from 4 to 3.
It does not permit the conclusion that position 4 is twice as strong as position 2, because no zero position has been established.
Interval (cont.)
E.g.2, Calendar years are an interval scale. The arbitrary 0 (or 1, depending on your viewpoint) was assigned when Christ was born, and time before this is labeled BC.
E.g.3, The difference between the following values is measured by a fixed scale:
- Money
- People
- Education (in years)
Ratio
Data on certain demographic or descriptive attributes, if obtained through open-ended questions, will have ratio-scale properties. E.g.,
What is your annual income before taxes? $ ______
How far is the theater from your home? ______ miles
Answers to these questions have a natural, unambiguous starting point, namely zero. Since the starting point is not chosen arbitrarily, computing and interpreting ratios makes sense. For example, we can say that a respondent with an annual income of $40,000 earns twice as much as one with an annual income of $20,000.
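The difference between a natural zero and an arbitrary one can be checked with a little arithmetic. In this sketch the income figures come from the example above; the calendar years and the alternative origin `shift = 500` are hypothetical values chosen only to show what happens when an arbitrary zero is moved.

```python
def ratio(a, b):
    """Plain quotient a / b."""
    return a / b

# Ratio scale: income has a true zero, so the ratio survives a change of unit.
income_a, income_b = 40_000, 20_000
usd_ratio = ratio(income_a, income_b)                 # dollars
cents_ratio = ratio(income_a * 100, income_b * 100)   # same incomes in cents

# Interval scale: calendar years have an arbitrary origin.
year_a, year_b = 2000, 1000
shift = 500  # hypothetical alternative origin, for illustration only
ratio_original = ratio(year_a, year_b)
ratio_shifted = ratio(year_a - shift, year_b - shift)

# Differences, unlike ratios, do not depend on where zero was placed.
diff_original = year_a - year_b
diff_shifted = (year_a - shift) - (year_b - shift)

print("income ratio (USD, cents):", usd_ratio, cents_ratio)
print("year ratio (original, shifted origin):", ratio_original, ratio_shifted)
print("year difference (original, shifted origin):", diff_original, diff_shifted)
```

The income ratio is the same in both units, whereas the "ratio" of two years changes as soon as the origin moves; only the difference between the years is stable.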
Ordinal: The central tendency can be represented by its mode or its median, but the mean cannot be defined
Interval: Can be represented by its mode, its median, or its arithmetic mean. Statistical dispersion can be measured by range, inter-quartile range, and standard deviation.
Ratio
Field
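The rules above about which summaries each level supports can be sketched with Python's `statistics` module. The grade responses and the list of years are made-up data, and the variable names are hypothetical; the point is only which operations are applied to which kind of data.

```python
import statistics

# Ordinal data: categories have an order but no defined distances.
grade_order = ["Grade D", "Grade C", "Grade B", "Grade A"]  # worst to best
grades = ["Grade B", "Grade A", "Grade C", "Grade B", "Grade D"]  # hypothetical

ordinal_mode = statistics.mode(grades)
# The median needs only the ordering: take the middle rank, map it back to a label.
ranks = [grade_order.index(g) for g in grades]
ordinal_median = grade_order[statistics.median_low(ranks)]
# No mean is computed: "Grade B plus Grade C" has no defined meaning.

# Interval data: differences are meaningful, so the mean and spread are defined.
years = [1990, 2000, 2000, 2010, 2020]  # hypothetical calendar years

interval_mode = statistics.mode(years)
interval_median = statistics.median(years)
interval_mean = statistics.fmean(years)
interval_range = max(years) - min(years)
q1, _, q3 = statistics.quantiles(years, n=4)
interval_iqr = q3 - q1
interval_sd = statistics.stdev(years)

print("ordinal :", ordinal_mode, ordinal_median)
print("interval:", interval_mode, interval_median, interval_mean,
      interval_range, interval_iqr, round(interval_sd, 2))
```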
Sampling
The sampling method depends upon the nature of the data and the type of enquiry.
Procedure for selecting a sample:
- Decide on the target population/audience
- Identify the population frame
- Select the sampling procedure/technique
- Decide the sample size
- Execute the sampling process (select the sample individuals)
The nature of selecting a sample can be broadly classified under three heads:
- Non-Probability Sampling
- Probability Sampling
- Mixed Sampling
Sampling (cont.)
Non-Probability Sampling
- Individuals in the population do not all have an equal chance of being selected
- Suffers from the drawbacks of favoritism and nepotism, depending upon the beliefs and prejudices of the investigator
- Statistically valid statements cannot be made about the precision of the estimates (i.e., predictive value is weak)
- Methods of non-probability sampling:
1. Convenience Sampling
2. Judgment Sampling
3. Quota Sampling
4. Snowball Sampling
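As one illustration of a non-probability method, here is a minimal sketch of quota sampling: respondents are taken in the order they happen to arrive until each group's quota is filled, so no chance mechanism is involved. The function `quota_sample`, the respondent records, and the quotas are all hypothetical.

```python
def quota_sample(respondents, quotas, key):
    """Select respondents in arrival order until each group's quota is filled."""
    remaining = dict(quotas)  # copy so the caller's quotas are not modified
    selected = []
    for person in respondents:
        group = key(person)
        if remaining.get(group, 0) > 0:
            selected.append(person)
            remaining[group] -= 1
    return selected

# Hypothetical respondents, listed in the order the interviewer met them.
respondents = [
    {"id": 1, "gender": "Female"},
    {"id": 2, "gender": "Male"},
    {"id": 3, "gender": "Male"},
    {"id": 4, "gender": "Male"},
    {"id": 5, "gender": "Female"},
    {"id": 6, "gender": "Female"},
]

chosen = quota_sample(respondents, {"Male": 2, "Female": 2},
                      key=lambda p: p["gender"])
chosen_ids = [p["id"] for p in chosen]
print(chosen_ids)
```

Who ends up in the sample depends entirely on arrival order, which is why the precision of estimates from such a sample cannot be stated statistically.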
Sampling (cont.)
Mixed Sampling
- Samples selected partly according to some laws of chance and partly
Sampling (cont.)
Probability Sampling
- Every individual in the population has a known, non-zero chance of being selected (an equal chance under simple random sampling)
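Simple random sampling, the most basic probability method, can be sketched with `random.sample`, which draws without replacement so that every unit in the frame is equally likely to be chosen. The population frame of 200 voter IDs and the sample size of 20 are hypothetical.

```python
import random

# Hypothetical population frame: a list of 200 registered-voter IDs.
frame = [f"voter-{i:03d}" for i in range(1, 201)]

rng = random.Random(2024)  # fixed seed so the draw can be reproduced
sample_size = 20

# Draw without replacement; each unit has the same chance of inclusion.
sample = rng.sample(frame, sample_size)

inclusion_probability = sample_size / len(frame)
print(f"Each unit's chance of selection: {inclusion_probability:.2f}")
print(sample[:5])
```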
Organizing Data
The first step in the analysis of the data is organizing the collected numbers.
A frequency distribution is a tool for organizing data.
The first step in drawing a frequency distribution is to construct a frequency table.
A frequency table is a way of organizing the data by listing every possible score (including those not actually obtained in the sample) as a column of numbers and the frequency of occurrence of each score as another.
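A frequency table of this kind can be sketched with `collections.Counter`. The raw scores and the 1-to-5 score range are hypothetical; note that the score 1 is listed with a frequency of 0 even though nobody obtained it.

```python
from collections import Counter

# Hypothetical raw data: scores on a 1-to-5 scale.
scores = [3, 5, 2, 5, 4, 5, 2, 3]
possible_scores = range(1, 6)  # every possible score, obtained or not

counts = Counter(scores)  # a Counter returns 0 for scores that never occurred

# Two columns: the score and its frequency of occurrence.
frequency_table = [(score, counts[score]) for score in possible_scores]

print("Score  Frequency")
for score, freq in frequency_table:
    print(f"{score:5d}  {freq:9d}")
```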
Information contained in the frequency table may be transformed to a graphical or pictorial form, like:
I. Histograms
II. Absolute Frequency Polygons
III. Relative Frequency Polygons
IV. Absolute Cumulative Frequency Polygons
V. Relative Cumulative Frequency Polygons
VI. Box Plots
VII. Pie Charts, etc.
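The absolute, relative, and cumulative variants listed above all plot numbers derived from the same frequency table. The sketch below computes those series without drawing anything; the table itself is hypothetical.

```python
from itertools import accumulate

# Hypothetical frequency table: (score, absolute frequency).
frequency_table = [(1, 0), (2, 2), (3, 2), (4, 1), (5, 3)]

absolute = [freq for _, freq in frequency_table]
total = sum(absolute)

# Relative frequency: share of all observations falling on each score.
relative = [freq / total for freq in absolute]

# Cumulative frequency: running total up to and including each score.
cumulative = list(accumulate(absolute))
relative_cumulative = [c / total for c in cumulative]

print("Score  Abs   Rel    Cum  RelCum")
for (score, _), a, r, c, rc in zip(frequency_table, absolute, relative,
                                   cumulative, relative_cumulative):
    print(f"{score:5d}  {a:3d}  {r:5.3f}  {c:3d}  {rc:6.3f}")
```

Plotting `absolute` against the scores gives the absolute frequency polygon, `relative` the relative one, and so on for the cumulative series.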
Data Analysis
The steps in the analysis of the data include:
- Data must be accurately scored and systematically organized to facilitate data analysis
I. Scoring: assigning a total to each participant's instrument
II. Tabulating: the mechanics of organizing the data
III. Coding: assigning numerals (e.g., ID) to data
IV. Performing both the initial and more detailed analysis
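The scoring, coding, and tabulating steps can be sketched on a tiny made-up data set. The participant names, item responses, and the choice to assign IDs in alphabetical order are all hypothetical assumptions for illustration.

```python
import statistics

# Hypothetical raw instruments: participant name -> item responses.
raw = {
    "Asha": [4, 5, 3],
    "Ben": [2, 3, 3],
    "Chen": [5, 5, 4],
}

# Coding: assign a numeric ID to each participant (alphabetical order here).
codes = {name: i for i, name in enumerate(sorted(raw), start=1)}

# Scoring: assign a total to each participant's instrument.
totals = {codes[name]: sum(items) for name, items in raw.items()}

# Tabulating: organize the scored data as rows of (ID, total), ordered by ID.
table = sorted(totals.items())

# Initial analysis: a first summary of the totals.
mean_total = statistics.fmean(total for _, total in table)

print("ID  Total")
for participant_id, total in table:
    print(f"{participant_id:2d}  {total:5d}")
print(f"Mean total: {mean_total:.2f}")
```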
Statistics
Descriptive Statistics
Gives numerical and graphic procedures to summarize a collection of data in a clear and understandable way
Inferential Statistics
Provides procedures to draw inferences about a population from a sample
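The two branches can be contrasted on one small sample. The data are made up, and the interval uses a normal approximation (an assumption; with a sample this small a t-based interval would normally be preferred), so treat this as a sketch rather than a recommended procedure.

```python
import math
import statistics

# Hypothetical sample of 10 measurements.
sample = [12.1, 9.8, 11.4, 10.6, 10.9, 12.7, 9.5, 11.0, 10.2, 11.8]

# Descriptive statistics: summarize the data actually collected.
n = len(sample)
m = statistics.fmean(sample)
median = statistics.median(sample)
s = statistics.stdev(sample)

# Inferential statistics: use the sample to say something about the population.
# Approximate 95% confidence interval for the population mean,
# assuming a normal approximation is acceptable.
z = statistics.NormalDist().inv_cdf(0.975)
standard_error = s / math.sqrt(n)
ci = (m - z * standard_error, m + z * standard_error)

print(f"n = {n}, m = {m:.2f}, median = {median:.2f}, s = {s:.2f}")
print(f"Approximate 95% CI for the population mean: ({ci[0]:.2f}, {ci[1]:.2f})")
```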
"An unsophisticated forecaster uses statistics as a drunken man uses lamp-posts, for support rather than for illumination." ~ Andrew Lang