Jan 13, 2018

© All Rights Reserved

tools

research

Major Categories of Data

2 Major Categories

1. Primary Data

2. Secondary Data

Major Categories of Data

1. Primary Data- data that has never been gathered before, whether in a particular way or at a certain period of time.

2. Secondary Data- data that come from studies done by other researchers, institutions, or organizations.

Secondary data is no less valid, but you should be well informed about how it was collected.

Sources of Primary data

Questionnaire survey

It is distributed—or made accessible if online—to a

predetermined selection of individuals.

Individuals complete and return the questionnaire or

submit online.

Face-to-face interview

Interviewer asks questions, usually following a guide

or protocol.

Interviewer records answers.

Sources of Primary data

Telephone interview

Interviewer asks questions, usually following a guide or protocol.

Interviewer records responses.

Group techniques (interview, facilitated workshop, focus group)

This involves group discussion of a predetermined issue or topic, in person or through teleconferencing.

Group members share certain common characteristics.

Facilitator or moderator leads the group.

Assistant moderator usually records responses.

Sources of Secondary Data

Document review

Researchers review documents, and identify relevant information.

They keep track of the information retrieved from documents.

Types of Data

1. Qualitative- descriptive data

2. Quantitative- numerical data

Can be Discrete or Continuous

Types of Data

Qualitative → Quality

Overview:

Deals with descriptions.

Data can be observed but not measured.

Colors, textures, smells, tastes, appearance, beauty, etc.

Quantitative → Quantity

Overview:

Deals with numbers.

Data which can be measured.

Length, height, area, volume, weight, speed, time, temperature, humidity, sound levels, cost, members, ages, etc.

Examples of types of Data:

Qualitative vs Quantitative

Types of Quantitative Data:

Continuous vs Discrete

Continuous: a set of data is said to be continuous if the values belonging to the set can take on ANY value within a finite or infinite interval.

Discrete: a set of data is said to be discrete if the values belonging to the set are distinct and separate (unconnected values).

Examples of Continuous vs Discrete

data

Continuous

Examples:

• The height of a horse (could be any value within the range of horse heights).

• Time to complete a task (which could be measured to fractions of seconds).

• The outdoor temperature at noon (any value within the possible temperature range).

• The speed of a car on Route 3 (assuming legal speed limits).

NOTE: Continuous data usually requires a measuring device (ruler, stopwatch, thermometer, speedometer, etc.).

Discrete

Examples:

• The number of people in your class (no fractional parts of a person).

• The number of TV sets in a home (no fractional parts of a TV set).

• The number of puppies in a litter (no fractional puppies).

• The number of questions on a math test (no incomplete questions).

NOTE: Discrete data is counted, in whole numbers. The description of the task is usually preceded by the words "number of".

Types of Measurement Scales

Used to categorize the different types of variables

Types of Variables

something that we can manipulate and something we can control.

1. Independent variable, sometimes called an experimental or predictor variable, is a variable that is manipulated in an experiment in order to observe the effect on a dependent variable.

2. Dependent variable, sometimes called an outcome variable, is the variable that is observed for the effect of the independent variable.

Types of scales: Nominal

Used for labeling variables, without any quantitative value.

May be called “names” or “labels”

These scales are mutually exclusive, meaning there is no overlap

None of them has any numerical significance

Examples of nominal scale

single

married

widow/er

separated

Age

Educational Attainment

Occupation

Religion

Examples of nominal scale

A nominal scale with only two categories (e.g. male/female) is called “dichotomous”

Types of Scales: Ordinal

Typically measures non-numeric concepts such as happiness or discomfort

The order of the values is what is important and significant,

but the differences between each one are not really known

Examples of ordinal scale

Types of Scales: Interval

Numeric scales in which we know not only the order, but also the exact differences between the values

“Interval” itself means “space in between”

The problem with interval scales: they don’t have a “true zero.”

Examples of interval scale

Temperature (in degrees Celsius or Fahrenheit) is a classic example, because the difference between each value is the same.

For example, the difference between 60 and 50 degrees is a measurable 10 degrees, as is the difference between 80 and 70 degrees.

But zero on these scales does not mean “no temperature”.

Without a true zero, it is impossible to compute ratios.

Time is another good example of an interval scale in which the

increments are known, consistent, and measurable.
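The point that ratios are not meaningful without a true zero can be checked with a small sketch. It assumes the readings are in degrees Celsius (the slides do not name the scale) and compares them with the Kelvin scale, which does have a true zero. The function name and the two readings are illustrative only.

```python
def celsius_to_kelvin(celsius):
    """Convert degrees Celsius (interval scale) to kelvins (ratio scale)."""
    return celsius + 273.15


warm_c, cool_c = 20.0, 10.0  # hypothetical readings in degrees Celsius

# Differences are meaningful on an interval scale: the gap is the same
# whether it is expressed in Celsius or in kelvins.
diff_c = warm_c - cool_c
diff_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

# Ratios are not: 20 / 10 = 2 in Celsius, but the same two temperatures
# give a very different ratio once measured from a true zero.
ratio_c = warm_c / cool_c
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

print("difference:", diff_c, round(diff_k, 2))
print("ratio:", ratio_c, round(ratio_k, 3))
```

So 20 degrees is not "twice as hot" as 10 degrees, even though the 10-degree difference itself is real.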

Types of Scales: Ratio

Ratio scales are the most informative of the measurement scales

because they tell us about the order,

they tell us the exact value between units,

and they also have an absolute zero, which allows for a wide range of both descriptive and inferential statistics to be applied.

Ratio scales provide a wealth of possibilities when it comes to statistical

analysis.

These variables can be meaningfully added, subtracted, multiplied, and divided (ratios).

Central tendency can be measured by mode, median, or mean; measures of dispersion, such as standard deviation and coefficient of variation, can also be calculated.

Examples of ratio scales

Summary of types of measurements

Ordinal scales provide good information about the order of choices,

such as in a customer satisfaction survey.

Interval scales give us the order of values + the ability to quantify

the difference between each one.

Ratio scales give us the most: order, interval values, plus the ability to calculate ratios since a “true zero” can be defined.

Tests of Relationships and Difference

Univariate Descriptive Statistics

A. Measures of Central Tendency

1. Mean

2. Median

3. Mode

B. Measures of Dispersion

1. Range

2. Variance

3. Standard Deviation

Measures of Central Tendency

A measure of central tendency is a single value that attempts to describe a set of data by identifying the central position within that set of data.

Such measures are sometimes called measures of central location.

They are also classed as summary statistics.

Measures of Central Tendency:

the MEAN

1. Mean

the mathematical average

the most popular and well known measure of central

tendency

It can be used with both discrete and continuous data,

although its use is most often with continuous data
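As a minimal sketch of "the mathematical average", the mean is the sum of the values divided by how many values there are. The data below are the 11 marks used in this document's median example. The variable names are illustrative.

```python
from statistics import mean

# The 11 marks that appear in this document's median example.
marks = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45, 92]

# Mean = sum of all values divided by the number of values.
manual_mean = sum(marks) / len(marks)

# The standard library computes the same quantity.
library_mean = mean(marks)

print(manual_mean, library_mean)
```

For these marks the sum is 649, so the mean is 649 / 11 = 59.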

Measures of Central Tendency:

the MEDIAN

2. Median

is the middle score for a set of data that has

been arranged in order of magnitude.

Measures of Central Tendency:

the MEDIAN

65 55 89 56 35 14 56 55 87 45 92

We first need to rearrange that data into order of magnitude (smallest first):

14 35 45 55 55 56 56 65 87 89 92

Our median mark is the middle mark, in this case 56. It is the middle mark because there are 5 scores before it and 5 scores after it. This works fine when you have an odd number of scores.

Measures of Central Tendency:

the MEDIAN

But what happens when you have an even number of scores? What if you had only 10

scores? Well, you simply have to take the middle two scores and average the result. So, if

we look at the example below:

65 55 89 56 35 14 56 55 87 45

We again rearrange that data into order of magnitude (smallest first):

14 35 45 55 55 56 56 65 87 89

Only now we have to take the 5th and 6th score in our data set and average them to

get a median of 55.5.
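The two cases above (odd and even number of scores) can be sketched as follows. The helper `manual_median` is a hypothetical name written for this illustration; `statistics.median` from the standard library is shown alongside as a cross-check.

```python
from statistics import median

eleven_marks = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45, 92]
ten_marks = eleven_marks[:-1]  # the same marks without the final 92


def manual_median(values):
    """Middle value of the sorted data; average of the two middle values if even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(manual_median(eleven_marks), median(eleven_marks))
print(manual_median(ten_marks), median(ten_marks))
```

With 11 marks the function returns the 6th sorted value; with 10 marks it averages the 5th and 6th.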

Measures of Central Tendency:

the MODE

3. Mode

The mode is the most frequent score in our data set.

You can therefore sometimes consider the mode as being the most popular option.

It is used for categorical data where we wish to know which is the most common category.

Measures of Central Tendency:

the MODE

On a bar chart or histogram, the mode is represented by the highest bar.

Measures of Central Tendency:

the MODE

The most common form of transport, in this particular data set, is the bus.
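A minimal sketch of finding the mode of categorical data follows. The chart from the slide is not available, so the survey answers below are hypothetical and were chosen only so that "bus" comes out as the most common category.

```python
from collections import Counter

# Hypothetical answers to "How do you travel to work?" (not the slide's data).
transport = ["bus", "car", "bus", "train", "bus", "car", "walk", "bus"]

# Count how often each category occurs, then take the most frequent one.
counts = Counter(transport)
mode_value, mode_count = counts.most_common(1)[0]

print(mode_value, mode_count)
```

The counts are exactly the bar heights a bar chart of this data would show, so the mode is the tallest bar.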

Measures of Dispersion

If everything were the same, we would have no need of statistics.

But, people's heights, weights, ages, etc., do vary.

We often need to measure the extent to which scores in a

dataset differ from each other.

Such a measure is called the dispersion of a distribution.

Measures of Dispersion: the RANGE

It can be thought of in two ways.

As a quantity: the difference between the highest and lowest scores in a distribution.

Example: if the highest score is 100 and the lowest is 68, "The range of scores on the exam was 32."

As an interval: the lowest and highest scores may be reported as the range.

Example: "The range was 62 to 94," which would be written (62, 94).

Measures of Dispersion: the

VARIANCE

Measures of Dispersion:

the STANDARD DEVIATION

Researchers often need to determine whether a particular set of data is significantly different from a control set of data.

In order to make that conclusion, one must know the variability of the data.

The measure of variability used is nearly always the standard deviation.
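The three measures of dispersion named in this part (range, variance, standard deviation) can be sketched on one small data set. The scores are hypothetical. The slides' own variance formula did not survive extraction, so both the population form (divide by n) and the sample form (divide by n - 1) are shown rather than assuming which one was intended.

```python
from statistics import pstdev, variance

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores

# Range as a quantity and as an interval.
score_range = max(scores) - min(scores)
range_interval = (min(scores), max(scores))

# Variance: average squared deviation from the mean.
mean_score = sum(scores) / len(scores)
squared_devs = [(x - mean_score) ** 2 for x in scores]
population_variance = sum(squared_devs) / len(scores)    # divide by n
sample_variance = sum(squared_devs) / (len(scores) - 1)  # divide by n - 1

# Standard deviation: square root of the variance.
population_sd = population_variance ** 0.5
sample_sd = sample_variance ** 0.5

print(score_range, range_interval)
print(population_variance, population_sd)
print(round(sample_variance, 3), round(sample_sd, 3))
```

The standard library functions `statistics.pvariance`, `statistics.pstdev`, `statistics.variance` and `statistics.stdev` compute the same quantities.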

We have discussed the Univariate

Descriptive Statistics

Multivariate Inferential Statistical Tests

Bi and Multivariate Inferential

Statistical Tests

Bivariate analysis examines the relationship between two different variables.

It shows whether those variables are interrelated or not.

This kind of data analysis is useful for testing hypotheses of association and causality.

It helps to verify how easy it is to predict the value of a dependent variable from a known value of an independent variable.

Bi and Multivariate Inferential

Statistical Tests

1. chi-square test

2. T-test

3. Z-test

4. ANOVA

The Chi-Square Test

The chi-square test is used to compare observed data with data we would expect to obtain according to a specific hypothesis.

For example, if, according to Mendel's laws, you expected 10 of 20 offspring from a cross to be male and the actual observed number was 8 males, then you might want to know about the "goodness of fit" between the observed and expected.

Were the deviations (differences between observed and expected) the result of chance, or were they due to other factors?

The Chi-Square Test

How much deviation can occur before you must conclude that something other than chance is at work, causing the observed to differ from the expected?

The chi-square test always tests what scientists call the null hypothesis (H0), which states that there is no significant difference between the expected and observed results.

The Chi-Square Test formula

$$ \chi^2 = \sum \frac{(O - E)^2}{E} $$

Here,

O = Observed frequency

E = Expected frequency

∑ = Summation

χ² = Chi-square value
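A minimal sketch of the chi-square calculation, applied to the offspring example from this section. The count of 12 females is inferred from 20 offspring minus the 8 observed males. The critical value 3.841 is the standard table value for 1 degree of freedom at the 0.05 level. The function name is illustrative.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [8, 12]   # males, females (12 inferred from 20 offspring in total)
expected = [10, 10]  # a 1:1 ratio out of 20 offspring

chi2 = chi_square(observed, expected)

# Two categories give 1 degree of freedom; 3.841 is the 0.05 critical value.
CRITICAL_VALUE_DF1_ALPHA_05 = 3.841
reject_null = chi2 > CRITICAL_VALUE_DF1_ALPHA_05

print(round(chi2, 3), reject_null)
```

Here the statistic is (8 - 10)²/10 + (12 - 10)²/10 = 0.8, well below the critical value, so the null hypothesis is not rejected and the deviation is consistent with chance.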

The Paired T-test

The paired sample t-test is used to compare two population means in the case of two samples that are correlated.

The paired sample t-test is used in ‘before-after’ studies, when the samples are matched pairs, or in a case-control study.

The Paired T-test

For example, if we give training to employees and want to know whether or not the training had any impact on their efficiency, we could use the paired sample t-test.

We collect data from each employee using a questionnaire, before the training and after the training.

By using the paired sample t-test, we can statistically conclude whether or not training has improved the efficiency of the employees.

In medicine, by using the paired sample t-test, we can figure out whether or not a particular medicine will cure the illness.

The Paired T-test

$$ t = \frac{\bar{d}}{\sqrt{s^2 / n}} $$

Where,

d̄ (d bar) is the mean difference between the two samples,

s² is the sample variance of the differences,

n is the sample size and

t is the paired sample t statistic with n-1 degrees of freedom
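A minimal sketch of the paired t statistic for a before-after study. The efficiency scores are hypothetical, and `statistics.variance` is used because it returns the sample variance (dividing by n - 1). The critical value 2.776 is the standard two-tailed table value for 4 degrees of freedom at the 0.05 level.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical efficiency scores for 5 employees, before and after training.
before = [60, 62, 58, 65, 70]
after = [64, 66, 59, 70, 73]

# Work on the paired differences, one per employee.
differences = [a - b for a, b in zip(after, before)]
n = len(differences)

d_bar = mean(differences)           # mean difference
s_squared = variance(differences)   # sample variance of the differences
t = d_bar / sqrt(s_squared / n)     # t statistic with n - 1 degrees of freedom

CRITICAL_T_DF4_ALPHA_05 = 2.776     # two-tailed table value
significant = abs(t) > CRITICAL_T_DF4_ALPHA_05

print(round(t, 3), n - 1, significant)
```

The test works on the differences alone, which is why the two samples must be paired measurement by measurement.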

The Z-test

The Z test assumes a normal distribution under the null hypothesis.

The Z test is performed on a large amount of data or on population data.

For a small amount of data or for sample data, on the other hand, the T test is performed.

The Z-test

$$ z = \frac{x - \bar{x}}{\sigma} $$

Where,

z = Standardized random variable

x = Value being standardized

x̄ = Mean of the data

σ = Population standard deviation
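A minimal sketch of the standardization used by the Z test, following the symbols listed above: subtract the mean and divide by the population standard deviation. The numbers are hypothetical. The cut-off 1.96 is the standard two-tailed critical value of the normal distribution at the 0.05 level.

```python
def z_score(x, mean, sigma):
    """Distance of x from the mean, in units of the population standard deviation."""
    return (x - mean) / sigma


# Hypothetical values: a score of 75 where the mean is 70 and sigma is 5.
z = z_score(75, 70, 5)

CRITICAL_Z_ALPHA_05 = 1.96  # two-tailed critical value of the normal distribution
significant = abs(z) > CRITICAL_Z_ALPHA_05

print(z, significant)
```

A value one standard deviation above the mean gives z = 1, which is inside the ±1.96 band and so is not significant at the 0.05 level.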

The ANOVA

ANOVA (analysis of variance) is a statistical technique for testing the existence of differences among several population means.

The technique requires the analysis of different forms of variances, hence the name.

But note: ANOVA is not used to show that variances are different; rather, it is used to show that means are different.

$$ F = \frac{MST}{MSE} $$

Where,

F = ANOVA coefficient (F statistic)

MST = Mean sum of squares due to treatment

MSE = Mean sum of squares due to error.

The ANOVA

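Formula for MST is given below:

$$ MST = \frac{SST}{p - 1}, \qquad SST = \sum n \, (\bar{x}_i - \bar{x})^2 $$

Where,

SST = Sum of squares due to treatment

p = Total number of populations

n = Total number of samples in a population.

x̄ᵢ = Mean of population i, x̄ = Mean of all observations

Formula for MSE is given below:

$$ MSE = \frac{SSE}{N - p}, \qquad SSE = \sum (n - 1) \, S^2 $$

Where,

SSE = Sum of squares due to error

S = Standard deviation of the samples

N = Total number of observations.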
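A minimal sketch of a one-way ANOVA F calculation using the quantities defined in this section (SST, SSE, MST, MSE, p, n, N). The three groups are hypothetical and deliberately tiny so the arithmetic can be followed by hand. The variable names are illustrative.

```python
from statistics import mean, variance

# Hypothetical samples from p = 3 populations, n = 3 observations each.
groups = [
    [1, 2, 3],
    [2, 3, 4],
    [3, 4, 5],
]

p = len(groups)                            # number of populations
N = sum(len(g) for g in groups)            # total number of observations
grand_mean = mean(x for g in groups for x in g)

# SST: group size times squared distance of each group mean from the grand mean.
sst = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

# SSE: (n - 1) times the sample variance S^2 of each group.
sse = sum((len(g) - 1) * variance(g) for g in groups)

mst = sst / (p - 1)   # mean sum of squares due to treatment
mse = sse / (N - p)   # mean sum of squares due to error
f_value = mst / mse

print(sst, sse, mst, mse, f_value)
```

Here SST = 6 and SSE = 6, so MST = 3, MSE = 1 and F = 3. The F value would then be compared against an F table with (p - 1, N - p) degrees of freedom.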
