
Using a Validation Set
After this video you will be able to:
• Describe how a validation set can be used to avoid overfitting
• Articulate how training, validation, and test sets are used
• List three ways that validation can be performed
Training vs. Testing Phases
Training Phase: Training Data → Learning Algorithm (Build Model) → Model
Testing Phase: Test Data → Model (Apply Model) → Results
Avoiding Overfitting
Overly complex model → Overfitting
When to stop training before the model gets too complex?
Validation Set
Available data: Training Data | Test Data
Training Data is split further: Training Data | Validation Data
Validation Data: used to determine when to stop training to avoid overfitting
Training & Validation Errors
[Plot: error rate vs. number of nodes, showing a validation error curve and a training error curve]
When to Stop Training
[Plot: error rate vs. number of nodes, showing validation error and training error curves, with a point marked "Stop training here"]
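
As a rough sketch of the stopping rule, the snippet below assumes the marked stopping point is the complexity level with the lowest validation error. The function name `pick_stopping_point` and every error value are hypothetical, made up purely for illustration; they are not taken from the plot.

```python
def pick_stopping_point(validation_errors):
    """Return the index of the lowest validation error.

    validation_errors[i] is the validation error measured at the i-th
    complexity level (for example, a given number of nodes).
    Ties go to the earlier, simpler model.
    """
    best_index = 0
    for i, err in enumerate(validation_errors):
        if err < validation_errors[best_index]:
            best_index = i
    return best_index


# Hypothetical error rates, invented for illustration only.
node_counts = [1, 2, 4, 8, 16, 32]
training_errors = [0.30, 0.22, 0.15, 0.10, 0.05, 0.02]    # keeps falling
validation_errors = [0.32, 0.25, 0.19, 0.17, 0.21, 0.27]  # falls, then rises

stop = pick_stopping_point(validation_errors)
print("stop at", node_counts[stop], "nodes")
```

Note that the training error alone would never signal a stop in this toy data, since it keeps decreasing as nodes are added.
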
Ways to Create & Use Validation Set

• Holdout method
• Random subsampling (repeated holdout)
• K-fold cross-validation
• Leave-one-out cross-validation
Holdout Method
All data available for building model: Training Data | Validation Data
• Training Data: used for training the model
• Validation Data: holdout set used to determine when training should stop
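
A minimal sketch of a holdout split, using only the Python standard library. The name `holdout_split`, the 30% validation fraction, and the fixed seed are assumptions chosen for the example, not values given in the slides.

```python
import random


def holdout_split(samples, validation_fraction=0.3, seed=0):
    """Split samples once into (training, validation) lists."""
    rng = random.Random(seed)           # fixed seed so the split is repeatable
    indices = list(range(len(samples)))
    rng.shuffle(indices)                # shuffle so the holdout set is random
    n_val = int(round(len(samples) * validation_fraction))
    val_idx = indices[:n_val]           # held out, never used for fitting
    train_idx = indices[n_val:]         # used for training the model
    train = [samples[i] for i in train_idx]
    val = [samples[i] for i in val_idx]
    return train, val


data = list(range(10))
train, val = holdout_split(data)
print(len(train), "training samples,", len(val), "validation samples")
```
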
Repeated Holdout
[Diagram: Training Data with a Validation Data portion]
• Repeat the holdout method several times
• Randomly select a different holdout set each iteration
• Average validation errors over all repetitions
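
The three steps above could be sketched as follows. The helper `mean_predictor_error` is a toy stand-in for "fit a model on the training part, then measure its error on the holdout part"; it, the function name `repeated_holdout`, the default parameters, and the sample values are all hypothetical.

```python
import random


def repeated_holdout(samples, evaluate, repetitions=5,
                     validation_fraction=0.3, seed=0):
    """Average the validation error over several random holdout splits."""
    rng = random.Random(seed)
    n_val = int(round(len(samples) * validation_fraction))
    errors = []
    for _ in range(repetitions):
        shuffled = list(samples)
        rng.shuffle(shuffled)           # a different holdout set each iteration
        val, train = shuffled[:n_val], shuffled[n_val:]
        errors.append(evaluate(train, val))
    return sum(errors) / len(errors)    # average over all repetitions


def mean_predictor_error(train, val):
    """Toy model: predict the training mean, report mean absolute error."""
    prediction = sum(train) / len(train)
    return sum(abs(v - prediction) for v in val) / len(val)


avg_error = repeated_holdout([2.0, 4.0, 6.0, 8.0, 10.0, 12.0],
                             mean_predictor_error)
print("average validation error:", avg_error)
```

Because the holdout sets are drawn independently, some samples may land in the validation set several times and others never.
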
K-Fold Cross-Validation
[Diagram: Training Data and Validation Data partitions for Iter 1, Iter 2, Iter 3, ..., Iter k]
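
In k-fold cross-validation the data is divided into k segments, and each iteration uses a different segment as validation data and the rest as training data. The sketch below generates the index sets for each iteration; the name `k_fold_indices` is hypothetical, and the sketch assumes the samples were already shuffled beforehand.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of the k iterations."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = indices[start:start + size]                  # this iteration's fold
        train = indices[:start] + indices[start + size:]   # everything else
        yield train, val
        start += size


folds = list(k_fold_indices(10, 3))
for i, (train, val) in enumerate(folds, start=1):
    print("Iter", i, "validation indices:", val)
```

Unlike repeated holdout, every sample is used for validation exactly once across the k iterations.
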
Leave-One-Out Cross-Validation
[Diagram: Training Data and Validation Data partitions for Iter 1, Iter 2, Iter 3, ..., Iter N]
N = number of samples in dataset
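
Leave-one-out is k-fold cross-validation with k = N, so each iteration validates on a single sample. A minimal sketch, with a hypothetical function name and placeholder samples:

```python
def leave_one_out(samples):
    """Yield (training, validation) pairs, one per sample in the dataset."""
    for i in range(len(samples)):
        train = samples[:i] + samples[i + 1:]   # all samples except the i-th
        val = [samples[i]]                      # the single held-out sample
        yield train, val


splits = list(leave_one_out(["a", "b", "c", "d"]))
for i, (train, val) in enumerate(splits, start=1):
    print("Iter", i, "train:", train, "validate:", val)
```

The number of iterations, and so the number of models to train, grows with the dataset size, which can make this approach expensive for large N.
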
Uses of Validation Set
Uses of validation data:
• Address overfitting
• Estimate generalization performance
Datasets
• Training Data: adjust model parameters
• Validation Data: determine when to stop training (avoid overfitting); estimate generalization performance
• Test Data: evaluate performance on new data. Cannot be used in any way in model creation!
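
One way to keep the three roles separate is to carve off the test set first and not touch it again until the final evaluation. The sketch below assumes a 60/20/20 split; the fractions, the seed, and the name `three_way_split` are illustrative choices, not values from the slides.

```python
import random


def three_way_split(samples, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Split samples into (training, validation, test) lists."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    n_val = int(round(len(shuffled) * val_fraction))
    test = shuffled[:n_test]                  # set aside; not used in model creation
    val = shuffled[n_test:n_test + n_val]     # decides when to stop training
    train = shuffled[n_test + n_val:]         # adjusts model parameters
    return train, val, test


train, val, test = three_way_split(range(10))
print(len(train), len(val), len(test))
```
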
Validation Set Summary
[Diagram: Training Data, Validation Data, Test Data]
• Datasets: training, validation, test
• Validation set: avoid overfitting, estimate generalization
• Using validation: holdout, repeated holdout, cross-validation (k-fold, leave-one-out)
