
Chapter Two

Methods, Measurement, and Stats in I/O Psychology

What Is Science?
Science: a process for generating a body of knowledge
A logic of inquiry, or way of doing things

Scientists strive for an understanding of the world in which we live

Goals of Science
Description
Accurate portrayal of phenomenon

Explanation
Better understand phenomenon

Prediction
Anticipate occurrence of events

Control
Able to manipulate variables to affect
behavior

Assumptions of Science
Empiricism
Best way to understand behavior is to
generate theory and test with data

Determinism
Behavior is orderly and systematic

Discoverability
It is possible to discover the orderliness

Theories
Set of interrelated concepts and propositions that present a systematic view of phenomena
Purpose is to describe, explain, and predict
So, theories further the goals of science
Some say the primary objective of science is theory building

What makes a good theory?

Parsimony
Precision
Testability
Usefulness
Generativity

Chicken or Egg? Data or Theory?


Induction: approach to science that consists of working from data to theory
Deduction: start with theory and then collect data
Good scientists use both approaches
The Cyclical Inductive-Deductive Model of
Research

[Diagram: the cyclical inductive-deductive model. Deduction starts at theory development; induction starts at data collection. The cycle runs theory development, data collection, theory refinement, additional data collection, and further theory refinement.]

The Cyclical Inductive-Deductive


Model of Research
Can start with either data or theory
Most research is probably driven by inductive processes; the original theory is likely based on data from somewhere

Research Terminology and Basics


Overview
Causal inference: we want to conduct experiments so that we can infer causality from our studies
Key Terms:
Independent and Dependent Variables
Internal and External Validity
Control

Independent and Dependent Variables


IVs and DVs
IV: anything that is systematically manipulated by the researcher
Also called a predictor, precursor, or antecedent

DV: the variable of interest that we design our studies to assess
Also called a criterion, outcome, or consequence

Extraneous variable: any other variable that can contaminate results

Internal and External Validity


Internal: extent to which we can draw causal inferences about variables
Are results due to the IV?
Control, control, control

External: extent to which results obtained generalize to/across other people, settings, and times
Student samples to employees?

Tradeoff between internal and external validity is important to consider

Control
Must ensure that we can make a causal inference about our IV affecting our DV
Ways to control:
Hold extraneous variables constant
Systematically manipulate different levels of extraneous variables as part of the design
Statistically control for them

Research Process Model


Formulate the Hypotheses

Design the Study

Collect the Data

Analyze the Data

Report the Findings

Research Designs
Two critical issues concerning the experimental method: random assignment and variable manipulation
Random Assignment
Each participant has an equally likely chance of being
assigned to any condition

Manipulation
Control, Control, Control

Laboratory Designs
Field and Quasi-Experiments
Observational Methods
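The random-assignment idea above can be sketched in a few lines of Python; the function name and data are hypothetical, not from the chapter:

```python
import random

def randomly_assign(participants, conditions, seed=None):
    """Shuffle participants, then deal them round-robin into conditions,
    so each participant has an equal chance of landing in any condition."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    return {cond: shuffled[i::len(conditions)]
            for i, cond in enumerate(conditions)}

# 20 hypothetical participants split into two conditions
groups = randomly_assign(range(20), ["treatment", "control"], seed=1)
```

Seeding is only for reproducibility; in a real study the shuffle would be left unseeded.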

Laboratory Experiments
Random assignment and manipulation
of IVs are used to increase control
Contrived setting for study
Trade external validity for internal validity
Lab typically has high internal validity
Questioned in terms of external validity
Depends on purpose of study

Try to balance both concerns

Field and Quasi-Experiments


Field experiments
Take advantage of realism to address external validity concerns
Use random assignment and manipulation within real settings

Quasi-experiments: field studies without random assignment
Random assignment is not always practical, so intact groups are used
Very common in I/O psychology

Observational Methods
Neither random assignment nor
manipulation
Also called correlational designs or
descriptive approaches

Make use of available resources, which limits the ability to make causal inferences
Still very important
Able to describe relationships
Great deal to be gained from description
Can still be useful in predicting behavior

Data Collection Techniques


Overview
Naturalistic Observation
Participant Observation
Unobtrusive Observation

Case Studies
Archival Research
Surveys
Self-Administered Questionnaires
Interviews

Naturalistic Observation
Observation of someone or something
in its natural environment
Participant observation: observer tries to blend in completely with those being observed
Unobtrusive observation: tries to observe objectively and unobtrusively, but not to blend in

Case Studies
Examination of individual, group,
company, or society
Professional history of famous CEO

Very beneficial in terms of description and providing details about typical or exceptional firms or individuals

Archival Research
Rely on secondary data sets collected
by other people for general or specific
purposes
Quality of the research is affected by the strength of the data set, over which the researcher has very little control
GATB dataset

Usually include both cross-sectional and longitudinal data

Surveys Overview
Involve the selection of a sample and
administration of a questionnaire
Very commonly used in I/O research
Two approaches are:
Self-administered questionnaire
Interview

Self-Administered Questionnaires
Surveys completed by respondents in
absence of investigator
Commonly used in both lab and field
research

Useful for three reasons:


Easy to administer
Can administer to large group at one time
Provide respondents with anonymity

Interviews
Investigator asks a series of questions
either verbally or in written form
Time intensive, but can get rich data
Clear benefits:
Response rates higher
Ambiguities about questions can be
cleared up

Measurement
Measurement: assignment of numbers to objects or events in such a way as to represent specified attributes of the objects
Attribute: dimension along which individuals can be measured and along which they vary
Measurement error: things that can make measurement inaccurate

Measurement Overview
Because of measurement error, we
must carefully consider two important
measurement concerns:
Reliability
Validity

Reliability
The consistency or stability of a measure
It is imperative that a predictor be measured
reliably
Unsystematic measurement error renders a measure unreliable
We cannot predict attitudes, performance, or behaviors without reliable measurement; unreliability places a limit on validity

Test-Retest Reliability
Test-Retest reflects consistency of a
test over time
Stability coefficient
Administer test at time 1 and time 2 and
see if individuals have a similar rank order
at both time 1 and time 2

[Figure: scatterplots of mechanical comprehension at the first test administration (x-axis, low to high) against the second administration (y-axis, low to high). A diffuse cloud of points illustrates low reliability; a tight upward trend illustrates high reliability.]

Parallel Forms Reliability


Parallel forms: extent to which two independent forms of a test are similar measures of the same construct
Coefficient of equivalence
Two different forms of a final exam
Survey on paper and on computer
Alternate test form for applicants with disabilities

Inter-Rater Reliability
Inter-rater: extent to which multiple raters or judges agree on ratings made about a person, thing, or behavior
Examine the correlation between the ratings of two different judges rating the same person
Helps protect against interpersonal biases

Internal Consistency Reliability


Internal consistency: indication of the interrelatedness of items
Tells us how well the items hang together
Split-half: split the test in half, e.g., odd- versus even-numbered questions
Inter-item: look at the relationships among every item to test for consistency (Cronbach's alpha)

Rule of thumb: reliability should be greater than .70
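Cronbach's alpha can be computed directly from an item-by-respondent score matrix; a minimal sketch with made-up ratings:

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """item_scores: one list per item, each holding respondents' scores."""
    k = len(item_scores)
    item_var_sum = sum(pvariance(item) for item in item_scores)
    totals = [sum(resp) for resp in zip(*item_scores)]  # per-respondent totals
    return (k / (k - 1)) * (1 - item_var_sum / pvariance(totals))

# Three items answered by five respondents (hypothetical 1-5 ratings)
items = [[3, 4, 2, 5, 4],
         [3, 5, 2, 4, 4],
         [2, 4, 3, 5, 5]]
alpha = cronbach_alpha(items)  # rule of thumb: want alpha > .70
```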

Validity
Construct validity: extent to which a test measures the underlying construct it was intended to measure
Construct: an abstract quality that is not observable and is difficult to measure
Self-esteem, intelligence, cognitive ability

Validity Overview
Two types of evidence used to
demonstrate Construct Validity
Content validity: degree to which a test covers a representative sample of the quality being assessed
Not established in a quantitative sense

Criterion-related validity: degree to which a test is a good predictor of attitudes, behavior, or performance

Approaches to Criterion-Related
Validity
Predictive validity: extent to which scores obtained at one point in time predict criteria at some later time
GREs, GPAs, research experience, etc. predicting success in graduate school

Concurrent validity: extent to which a test predicts a criterion that is measured at the same time as the test
Want to see if newly developed selection tests predict the performance of current employees

Predictive Designs
Gather predictor data on all of the applicants.
Some of the applicants would be hired to fill the
open positions based on predictors that are
not part of our selection battery.
If we hire only high scorers on the new predictors,
then we will not be able to examine if low scorers are
unsuccessful on the job.

After months on the job, we gather performance data, which serve as the criteria.
A validity coefficient is computed between the
predictor score and the criterion score that
indicates the strength of the relationship.

Concurrent Designs
Data on both predictors and criteria are
collected from incumbent employees at
the same time.
A validity coefficient is computed
between the predictor score and the
criterion score which indicates the
strength of the relationship

[Figure: scatterplots of cognitive ability score (x-axis, low to high) against job performance (y-axis, poor to excellent). A strong upward trend illustrates high validity; a shapeless cloud illustrates low validity.]

Element               Concurrent Design            Predictive Design
Participants          Incumbents                   Applicants
Predictor             Measured at Time 1           Measured at Time 1
Criterion             Measured at Time 1           Measured at Time 2
Selection decision    Made prior to Time 1 and     To be made between Time 1
                      based on other predictors    and Time 2 and to be based
                                                   on other predictors
Validity coefficient  Correlation between          Correlation between
                      predictor and criterion      predictor and criterion

Components of Construct Validity


Convergent validity: measure is related to other measures of similar constructs
Divergent validity: measure is not related to measures of dissimilar constructs
These are demonstrated by using concurrent and/or predictive validity designs

Summary of Reliability Types and Validity Approaches


Reliability
Test-Retest: stability of a test over time
Parallel Forms: equivalence of two test forms
Internal Consistency: consistency among test items

Construct Validity
Content: test representative of the domain
Criterion-Related
Predictive: test scores predict a future criterion (convergent and divergent evidence)
Concurrent: test scores predict a current criterion (convergent and divergent evidence)

Statistics Overview
Statistic: an efficient device for summarizing in numbers the values, characteristics, or scores describing a series of cases
Types of Stats
Measures of Central Tendency
Measures of Dispersion
Correlation and Regression
Meta-Analysis

Measures of Central Tendency


Characterize a typical member of the group
Mode: most frequent score in a distribution
Best for categorical data

Median: score in the middle of a distribution
Best when some numbers are outliers

Mean: arithmetic average of a group of scores
Most useful and common measure
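The three measures, and the median's resistance to outliers, in a short sketch (made-up scores):

```python
from statistics import mode, median, mean

scores = [2, 3, 3, 4, 5, 5, 5, 6, 7]
score_mode = mode(scores)      # most frequent score
score_median = median(scores)  # middle of the sorted scores
score_mean = mean(scores)      # arithmetic average

# The median resists outliers; the mean does not
with_outlier = scores + [100]
outlier_median = median(with_outlier)  # barely moves
outlier_mean = mean(with_outlier)      # pulled far upward
```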

Measures of Dispersion
Tell how closely scores are grouped around the mean (their "spread-outedness")
Range: spread of scores from low to high
Variance: more useful measure of dispersion than the range
Standard deviation: square root of the variance; retains the original metric of the scores
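Range, variance, and standard deviation for a small set of made-up scores:

```python
from statistics import pvariance, pstdev

scores = [4, 6, 8, 10, 12]
score_range = max(scores) - min(scores)  # spread from low to high
variance = pvariance(scores)             # average squared deviation from mean
std_dev = pstdev(scores)                 # square root of variance,
                                         # back in the original metric
```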

Shapes of Distributions
Normal distribution: depicted by a bell-shaped curve; most scores are around the mean, with fewer at the extremes of the distribution
Lots of qualities/characteristics are distributed normally (performance, intelligence, height)
Use of the normal distribution:
Calculate a percentile score showing where a person ranks compared to the population
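A percentile score can be read off the standard normal CDF; a sketch using only the standard library (the mean and SD below are illustrative):

```python
import math

def percentile(score, mean, sd):
    """Percent of the population falling at or below this score,
    assuming the attribute is normally distributed."""
    z = (score - mean) / sd
    return 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))

# A score one SD above the mean sits near the 84th percentile
p = percentile(115, mean=100, sd=15)
```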

[Figure: the normal curve, marked in standard deviations from the mean. About 34% of cases fall between the mean and one SD on either side, 13.5% between one and two SDs, and 2% beyond two SDs.]

Correlation
Correlation coefficient: index of the strength of the relationship between two variables (r)
Direction
Positive (elevator) vs. negative (teeter-totter)

Magnitude
0 to +/- 1.00

[Figure: scatterplots of mechanical comprehension against job performance at r = .00 (no relationship), r = .70 (strong positive), and r = -.70 (strong negative).]

Regression
Regression: can predict one variable from another
Regress the DV on the IV!!
Validity coefficient: just an r
Coefficient of determination: percentage of variance in the criterion accounted for by the predictor (r²)

Coefficient of Determination (r²)
[Figure: overlapping circles for mechanical comprehension and job performance. With r = .60, r² = .36: the predictor accounts for 36% of the variance in the criterion.]
Figure 7.1: Selection Battery Example
[Figure: three predictors of criterion performance: assessment center (r² = .25), work sample (r² = .36), and interview (r² = .10). Combined, the test battery yields a Multiple R of .74 and an R² of .55.]
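For two predictors there is a closed form for Multiple R built from the pairwise correlations; the values below are invented, not the battery in Figure 7.1:

```python
import math

# Hypothetical validity coefficients and predictor intercorrelation
r_y1, r_y2 = 0.50, 0.45  # each predictor with the criterion
r_12 = 0.30              # the two predictors with each other

# Squared multiple correlation for two predictors
r_squared = (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)
multiple_r = math.sqrt(r_squared)  # beats either predictor alone
```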

Meta-Analysis
Methodology used to do quantitative literature reviews
Literature reviews used to be simple narrative reviews
Combines empirical findings to quantify the relationship between two variables
Not a panacea!!
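The core of a bare-bones meta-analysis is a sample-size-weighted mean correlation; the study r's and N's below are invented:

```python
# (r, N) from three hypothetical studies of the same relationship
studies = [(0.30, 120), (0.22, 80), (0.41, 200)]

total_n = sum(n for _, n in studies)
# Weight each study's r by its sample size, so large studies count more
mean_r = sum(r * n for r, n in studies) / total_n
```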
