
Chapter Two

Methods, Measurement, and Stats in I/O Psychology

What Is Science?
Science: a process for generating a body of knowledge
A logic of inquiry, or way of doing things

Scientists strive for an understanding of the world in which we live

Goals of Science
Description
Accurate portrayal of phenomenon

Explanation
Better understand phenomenon

Prediction
Anticipate occurrence of events

Control
Able to manipulate variables to affect
behavior

Assumptions of Science
Empiricism
Best way to understand behavior is to
generate theory and test with data

Determinism
Behavior is orderly and systematic

Discoverability
It is possible to discover the orderliness

Theories
Set of interrelated concepts and propositions that present a systematic view of phenomena
Purpose is to describe, explain, and predict
So, theories further the goals of science
Some say the primary objective of science is theory building

What makes a good theory?

Parsimony
Precision
Testability
Usefulness
Generativity

Chicken or Egg? Data or Theory?


Induction: approach to science that consists of working from data to theory
Deduction: start with theory and then collect data
Good scientists use both approaches
The Cyclical Inductive-Deductive Model of
Research

[Diagram: the cyclical inductive-deductive model. Deduction starts at theory development; induction starts at data collection. The cycle runs theory development, data collection, theory refinement, additional data collection, and further theory refinement.]

The Cyclical Inductive-Deductive


Model of Research
Can start with either data or theory
Most research is probably driven by inductive processes; the original theory is likely based on data from somewhere

Research Terminology and Basics


Overview
Causal inference: we want to conduct experiments so that we can infer causality from our studies
Key Terms:
Independent and Dependent Variables
Internal and External Validity
Control

Independent and Dependent Variables


IVs and DVs
IV: anything that is systematically manipulated by the researcher
Also called a predictor, precursor, or antecedent

DV: the variable of interest that we design our studies to assess
Also called a criterion, outcome, or consequence

Extraneous variable: any other variable that can contaminate results

Internal and External Validity


Internal: extent to which we can draw causal inferences about variables
Are results due to the IV?
Control, control, control

External: extent to which results obtained generalize to/across other people, settings, and times
Student samples to employees?

Tradeoff between internal and external validity is important to consider

Control
Must ensure that we can make a causal inference about our IV affecting our DV
Ways to control:
Hold extraneous variables constant
Systematically manipulate different levels of extraneous variables as part of the design
Statistically control for them

Research Process Model


Formulate the Hypotheses

Design the Study

Collect the Data

Analyze the Data

Report the Findings

Research Designs
Two critical issues concerning the experimental method: random assignment and variable manipulation
Random Assignment
Each participant has an equally likely chance of being
assigned to any condition

Manipulation
Control, Control, Control

Laboratory Designs
Field and Quasi-Experiments
Observational Methods
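The random-assignment idea above can be sketched in a few lines of Python; the function name and data are hypothetical, not from the chapter:

```python
import random

def randomly_assign(participants, conditions, seed=None):
    """Shuffle participants, then deal them round-robin into conditions,
    so each participant has an equal chance of landing in any condition."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    return {cond: shuffled[i::len(conditions)]
            for i, cond in enumerate(conditions)}

# 20 hypothetical participants split into two conditions
groups = randomly_assign(range(20), ["treatment", "control"], seed=1)
```

Seeding is only for reproducibility; in a real study the shuffle would be left unseeded.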

Laboratory Experiments
Random assignment and manipulation
of IVs are used to increase control
Contrived setting for study
Trade external validity for internal validity
Lab typically has high internal validity
Questioned in terms of external validity
Depends on purpose of study

Try to balance both concerns

Field and Quasi-Experiments


Field experiments
Take advantage of realism to address external validity concerns
Use random assignment and manipulation within real settings

Quasi-experiments: field studies without random assignment
Random assignment is not always practical, so intact groups are used
Very common in I/O psychology

Observational Methods
Neither random assignment nor
manipulation
Also called correlational designs or
descriptive approaches

Make use of available resources, which limits the ability to make causal inferences
Still very important
Able to describe relationships
Great deal to be gained from description
Can still be useful in predicting behavior

Data Collection Techniques


Overview
Naturalistic Observation
Participant Observation
Unobtrusive Observation

Case Studies
Archival Research
Surveys
Self-Administered Questionnaires
Interviews

Naturalistic Observation
Observation of someone or something
in its natural environment
Participant observation: observer tries to blend in completely with those being observed
Unobtrusive observation: tries to observe objectively and unobtrusively, but not to blend in

Case Studies
Examination of individual, group,
company, or society
Professional history of famous CEO

Very beneficial in terms of description and providing details about typical or exceptional firms or individuals

Archival Research
Rely on secondary data sets collected
by other people for general or specific
purposes
Quality of the research is affected by the strength of the data set, over which the researcher has very little control
GATB dataset

Usually include both cross-sectional and longitudinal data

Surveys Overview
Involve the selection of a sample and
administration of a questionnaire
Very commonly used in I/O research
Two approaches are:
Self-administered questionnaire
Interview

Self-Administered Questionnaires
Surveys completed by respondents in
absence of investigator
Commonly used in both lab and field
research

Useful for three reasons:


Easy to administer
Can administer to large group at one time
Provide respondents with anonymity

Interviews
Investigator asks a series of questions
either verbally or in written form
Time intensive, but can get rich data
Clear benefits:
Response rates higher
Ambiguities about questions can be
cleared up

Measurement
Measurement: assignment of numbers to objects or events in such a way as to represent specified attributes of the objects
Attribute: dimension along which individuals can be measured and along which they vary
Measurement error: things that can make measurement inaccurate

Measurement Overview
Because of measurement error, we
must carefully consider two important
measurement concerns:
Reliability
Validity

Reliability
The consistency or stability of a measure
It is imperative that a predictor be measured
reliably
Unsystematic measurement error renders a measure unreliable
We cannot predict attitudes, performance, or behaviors without reliable measurement; unreliability places a limit on validity

Test-Retest Reliability
Test-Retest reflects consistency of a
test over time
Stability coefficient
Administer test at time 1 and time 2 and
see if individuals have a similar rank order
at both time 1 and time 2

[Figure: scatterplots of mechanical comprehension at the first test administration (x-axis, low to high) against the second administration (y-axis, low to high). A diffuse cloud of points illustrates low reliability; a tight upward trend illustrates high reliability.]

Parallel Forms Reliability


Parallel forms: extent to which two independent forms of a test are similar measures of the same construct
Coefficient of equivalence
Two different forms of a final exam
Survey on paper and on computer
Alternate test form for applicants with disabilities

Inter-Rater Reliability
Inter-rater: extent to which multiple raters or judges agree on ratings made about a person, thing, or behavior
Examine the correlation between the ratings of two different judges rating the same person
Helps protect against interpersonal biases

Internal Consistency Reliability


Internal consistency: indication of the interrelatedness of items
Tells us how well the items hang together
Split-half: split the test in half, e.g., odd- versus even-numbered questions
Inter-item: look at the relationships among every item to test for consistency (Cronbach's alpha)

Rule of thumb: reliability should be greater than .70
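Cronbach's alpha can be computed directly from an item-by-respondent score matrix; a minimal sketch with made-up ratings:

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """item_scores: one list per item, each holding respondents' scores."""
    k = len(item_scores)
    item_var_sum = sum(pvariance(item) for item in item_scores)
    totals = [sum(resp) for resp in zip(*item_scores)]  # per-respondent totals
    return (k / (k - 1)) * (1 - item_var_sum / pvariance(totals))

# Three items answered by five respondents (hypothetical 1-5 ratings)
items = [[3, 4, 2, 5, 4],
         [3, 5, 2, 4, 4],
         [2, 4, 3, 5, 5]]
alpha = cronbach_alpha(items)  # rule of thumb: want alpha > .70
```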

Validity
Construct validity: extent to which a test measures the underlying construct it was intended to measure
Construct: an abstract quality that is not observable and is difficult to measure
Self-esteem, intelligence, cognitive ability

Validity Overview
Two types of evidence used to
demonstrate Construct Validity
Content validity: degree to which a test covers a representative sample of the quality being assessed
Not established in a quantitative sense

Criterion-related validity: degree to which a test is a good predictor of attitudes, behavior, or performance

Approaches to Criterion-Related
Validity
Predictive validity: extent to which scores obtained at one point in time predict criteria at some later time
GREs, GPAs, research experience, etc. predicting success in graduate school

Concurrent validity: extent to which a test predicts a criterion that is measured at the same time as the test
Want to see if newly developed selection tests predict the performance of current employees

Predictive Designs
Gather predictor data on all of the applicants.
Some of the applicants would be hired to fill the
open positions based on predictors that are
not part of our selection battery.
If we hire only high scorers on the new predictors,
then we will not be able to examine if low scorers are
unsuccessful on the job.

After months on the job, we gather performance data, which serve as the criteria.
A validity coefficient is computed between the
predictor score and the criterion score that
indicates the strength of the relationship.

Concurrent Designs
Data on both predictors and criteria are
collected from incumbent employees at
the same time.
A validity coefficient is computed
between the predictor score and the
criterion score which indicates the
strength of the relationship

[Figure: scatterplots of cognitive ability score (x-axis, low to high) against job performance (y-axis, poor to excellent). A strong upward trend illustrates high validity; a shapeless cloud illustrates low validity.]

Element               Concurrent Design            Predictive Design
Participants          Incumbents                   Applicants
Predictor             Measured at Time 1           Measured at Time 1
Criterion             Measured at Time 1           Measured at Time 2
Selection decision    Made prior to Time 1 and     To be made between Time 1
                      based on other predictors    and Time 2 and to be based
                                                   on other predictors
Validity coefficient  Correlation between          Correlation between
                      predictor and criterion      predictor and criterion

Components of Construct Validity


Convergent validity: measure is related to other measures of similar constructs
Divergent validity: measure is not related to measures of dissimilar constructs
These are demonstrated by using concurrent and/or predictive validity designs

Summary of Reliability Types and Validity Approaches


Reliability
Test-Retest: stability of a test over time
Parallel Forms: equivalence of two test forms
Internal Consistency: consistency among test items

Construct Validity
Content: test representative of the domain
Criterion-Related
Predictive: test scores predict a future criterion (convergent and divergent evidence)
Concurrent: test scores predict a current criterion (convergent and divergent evidence)

Statistics Overview
Statistic: an efficient device for summarizing in numbers the values, characteristics, or scores describing a series of cases
Types of Stats
Measures of Central Tendency
Measures of Dispersion
Correlation and Regression
Meta-Analysis

Measures of Central Tendency


Characterize a typical member of the group
Mode: most frequent score in a distribution
Best for categorical data

Median: score in the middle of a distribution
Best when some numbers are outliers

Mean: arithmetic average of a group of scores
Most useful and common measure
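The three measures, and the median's resistance to outliers, in a short sketch (made-up scores):

```python
from statistics import mode, median, mean

scores = [2, 3, 3, 4, 5, 5, 5, 6, 7]
score_mode = mode(scores)      # most frequent score
score_median = median(scores)  # middle of the sorted scores
score_mean = mean(scores)      # arithmetic average

# The median resists outliers; the mean does not
with_outlier = scores + [100]
outlier_median = median(with_outlier)  # barely moves
outlier_mean = mean(with_outlier)      # pulled far upward
```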

Measures of Dispersion
Tell how closely scores are grouped around the mean (their "spread-outedness")
Range: spread of scores from low to high
Variance: more useful measure of dispersion than the range
Standard deviation: square root of the variance; retains the original metric of the scores
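Range, variance, and standard deviation for a small set of made-up scores:

```python
from statistics import pvariance, pstdev

scores = [4, 6, 8, 10, 12]
score_range = max(scores) - min(scores)  # spread from low to high
variance = pvariance(scores)             # average squared deviation from mean
std_dev = pstdev(scores)                 # square root of variance,
                                         # back in the original metric
```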

Shapes of Distributions
Normal distribution: depicted by a bell-shaped curve; most scores are around the mean, with fewer at the extremes of the distribution
Lots of qualities/characteristics are distributed normally (performance, intelligence, height)
Use of the normal distribution:
Calculate a percentile score showing where a person ranks compared to the population
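A percentile score can be read off the standard normal CDF; a sketch using only the standard library (the mean and SD below are illustrative):

```python
import math

def percentile(score, mean, sd):
    """Percent of the population falling at or below this score,
    assuming the attribute is normally distributed."""
    z = (score - mean) / sd
    return 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))

# A score one SD above the mean sits near the 84th percentile
p = percentile(115, mean=100, sd=15)
```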

[Figure: the normal curve, marked in standard deviations from the mean. About 34% of cases fall between the mean and one SD on either side, 13.5% between one and two SDs, and 2% beyond two SDs.]

Correlation
Correlation coefficient: index of the strength of the relationship between two variables (r)
Direction
Positive (elevator) vs. negative (teeter-totter)

Magnitude
0 to +/- 1.00

[Figure: scatterplots of mechanical comprehension against job performance at r = .00 (no relationship), r = .70 (strong positive), and r = -.70 (strong negative).]

Regression
Regression: can predict one variable from another
Regress the DV on the IV!!
Validity coefficient: just an r
Coefficient of determination: percentage of variance in the criterion accounted for by the predictor (r²)

Coefficient of Determination (r²)
[Figure: overlapping circles for mechanical comprehension and job performance. With r = .60, r² = .36: the predictor accounts for 36% of the variance in the criterion.]
Figure 7.1: Selection Battery Example
[Figure: three predictors of criterion performance: assessment center (r² = .25), work sample (r² = .36), and interview (r² = .10). Combined, the test battery yields a Multiple R of .74 and an R² of .55.]
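For two predictors there is a closed form for Multiple R built from the pairwise correlations; the values below are invented, not the battery in Figure 7.1:

```python
import math

# Hypothetical validity coefficients and predictor intercorrelation
r_y1, r_y2 = 0.50, 0.45  # each predictor with the criterion
r_12 = 0.30              # the two predictors with each other

# Squared multiple correlation for two predictors
r_squared = (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)
multiple_r = math.sqrt(r_squared)  # beats either predictor alone
```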

Meta-Analysis
Methodology used to do quantitative literature reviews
Literature reviews used to be simple narrative reviews
Combines empirical findings to quantify the relationship between two variables
Not a panacea!!
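The core of a bare-bones meta-analysis is a sample-size-weighted mean correlation; the study r's and N's below are invented:

```python
# (r, N) from three hypothetical studies of the same relationship
studies = [(0.30, 120), (0.22, 80), (0.41, 200)]

total_n = sum(n for _, n in studies)
# Weight each study's r by its sample size, so large studies count more
mean_r = sum(r * n for r, n in studies) / total_n
```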
