
Ensemble learning

Lecture 13

David Sontag
New York University

Slides adapted from Navneet Goyal, Tan, Steinbach, Kumar, Vibhav Gogate
Ensemble methods
Machine learning competition with a $1 million prize
Bias/Variance Tradeoff

Hastie, Tibshirani, Friedman “Elements of Statistical Learning” 2001
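
For reference, the decomposition behind that figure, in its standard textbook form (the symbols y = f(x) + noise, estimator \hat{f}, and noise variance \sigma^2 are our notation, not the slide's):

    \mathbb{E}\big[(y - \hat{f}(x))^2\big]
      = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
      + \underbrace{\mathrm{Var}\big(\hat{f}(x)\big)}_{\text{variance}}
      + \underbrace{\sigma^2}_{\text{irreducible noise}}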



Reduce Variance Without
Increasing Bias
•  Averaging reduces variance:

       Var( (1/N) * Sum(X_i) ) = Var(X) / N

   (when predictions are independent)

Average models to reduce model variance
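
A minimal numerical sketch of this point (not from the slides; the constants and the use of numpy are illustrative):

    # Averaging k independent predictors with equal variance cuts the ensemble
    # variance by a factor of k.
    import numpy as np

    rng = np.random.default_rng(0)
    k, n_trials = 10, 100_000

    # k independent "model predictions": true value 0 plus unit-variance noise.
    preds = rng.normal(0.0, 1.0, size=(n_trials, k))

    print("variance of one model:        ", preds[:, 0].var())         # ~1.0
    print("variance of the k-model mean: ", preds.mean(axis=1).var())  # ~1/k = 0.1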


One problem:
we only have one training set,
so where do multiple models come from?

Bagging: Bootstrap Aggregation
•  Leo Breiman (1994)
•  Take repeated bootstrap samples from training set D
•  Bootstrap sampling: Given set D containing N
training examples, create D’ by drawing N
examples at random with replacement from D.

•  Bagging:
–  Create k bootstrap samples D1 … Dk.
–  Train distinct classifier on each Di.
–  Classify new instance by majority vote / average.
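
A minimal sketch of these three steps, assuming scikit-learn's DecisionTreeClassifier as the base learner and labels in {-1, +1} (both assumptions are ours, not from the slides):

    # Bagging: train k classifiers on bootstrap samples, predict by majority vote.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def bagging_fit(X, y, k=25, seed=0):
        rng = np.random.default_rng(seed)
        n = len(X)
        models = []
        for _ in range(k):
            idx = rng.integers(0, n, size=n)            # draw N examples with replacement
            models.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
        return models

    def bagging_predict(models, X):
        votes = np.stack([m.predict(X) for m in models])  # shape (k, n_samples)
        return np.sign(votes.sum(axis=0))                 # majority vote; ties map to 0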
General Idea
Bagging
•  Sampling with replacement
[Figure: table of training-data IDs drawn with replacement in each bagging round]
•  Build classifier on each bootstrap sample


•  Each data point has probability (1 – 1/n)^n of never
   being selected, i.e., of being left out as test data
•  The training data therefore contains 1 – (1 – 1/n)^n
   of the original data
The 0.632 bootstrap
•  This method is also called the 0.632 bootstrap
–  A particular training example has a probability of
   1 – 1/n of not being picked in a single draw
–  Thus its probability of ending up in the test
   data (never selected in n draws) is:

       (1 – 1/n)^n ≈ e^(–1) ≈ 0.368

–  This means the training data will contain
   approximately 63.2% of the (distinct) instances
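
A quick numerical check of the quantity above (a sketch, not part of the slides):

    # (1 - 1/n)^n approaches e^{-1} ~= 0.368, so a bootstrap sample of size n
    # contains roughly 63.2% of the distinct original instances.
    import math

    for n in (10, 100, 1000, 100_000):
        p_out = (1 - 1 / n) ** n          # probability an instance is never drawn
        print(f"n={n:>6}: out-of-bag fraction = {p_out:.4f}, in training = {1 - p_out:.4f}")

    print("limit: e^-1 =", math.exp(-1))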
[Figure: decision tree learning algorithm; very similar to ID3]

[Figure: bagged ensemble predictions; shades of blue/red indicate strength of vote for particular classification]
Example of Bagging

Assume that the training data is:

    x <= 0.3:         +1
    0.4 <= x <= 0.7:  -1
    x >= 0.8:         +1
Goal: find a collection of 10 simple thresholding classifiers that
collectively can classify the data correctly.
-  Each simple (or weak) classifier is of the form:
   (x <= K → class = +1 or -1, depending on which value
   yields the lowest error; K is determined by entropy
   minimization)
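
A sketch of this 1-D example, assuming 10 points at x = 0.1, ..., 1.0 and error-minimizing threshold selection in place of the entropy criterion mentioned above (both simplifications are ours):

    import numpy as np

    rng = np.random.default_rng(0)
    X = np.arange(1, 11) / 10                         # 0.1, 0.2, ..., 1.0
    y = np.where((X >= 0.4) & (X <= 0.7), -1, 1)      # -1 in the middle, +1 at both ends

    def fit_stump(X, y):
        """Return (K, s) such that prediction = s if x <= K else -s, minimizing error."""
        best = (np.inf, None, None)
        for K in np.unique(X):
            for s in (1, -1):
                err = np.mean(np.where(X <= K, s, -s) != y)
                if err < best[0]:
                    best = (err, K, s)
        return best[1], best[2]

    def bagged_stumps(X, y, k=10):
        stumps = []
        for _ in range(k):
            idx = rng.integers(0, len(X), size=len(X))   # bootstrap sample
            stumps.append(fit_stump(X[idx], y[idx]))
        return stumps

    stumps = bagged_stumps(X, y)
    votes = sum(np.where(X <= K, s, -s) for K, s in stumps)
    print("ensemble predictions:", np.sign(votes))       # majority vote over 10 stumps
    print("true labels:         ", y)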
Random Forests
•  Ensemble method specifically designed for
decision tree classifiers

•  Introduce two sources of randomness:
   “Bagging” and “Random input vectors”
–  Bagging method: each tree is grown using a bootstrap
sample of training data
–  Random vector method: At each node, best split is
chosen from a random sample of m attributes instead
of all attributes
Random Forests
Methods for Growing the Trees
•  Fix m <= M (the total number of attributes). At each node:
–  Method 1:
•  Choose m attributes randomly, compute their information
gains, and choose the attribute with the largest gain to split
–  Method 2:
•  (When M is not very large): select L of the attributes
   at random. Compute a linear combination of the L attributes
   using weights Wi drawn uniformly at random from [-1,+1]. That is,
   the new attribute A = Sum(Wi*Ai), i=1..L.
–  Method 3:
•  Compute the information gain of all M attributes. Select the
   top m attributes by information gain. Randomly select one of
   the m attributes as the splitting attribute.
Random Forest Algorithm (method 1 from the previous slide)
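
A minimal sketch of that per-node rule; the helper names and the details of the gain computation are illustrative, not taken from the slides:

    # At each node: sample m attributes at random, compute the information gain
    # of each candidate split, and split on the one with the largest gain.
    import numpy as np

    def entropy(y):
        _, counts = np.unique(y, return_counts=True)
        p = counts / counts.sum()
        return -np.sum(p * np.log2(p))

    def best_random_split(X, y, m, rng):
        """Return (attribute index, threshold) chosen among m randomly sampled attributes."""
        candidates = rng.choice(X.shape[1], size=min(m, X.shape[1]), replace=False)
        best, best_gain = None, -1.0
        for j in candidates:
            for t in np.unique(X[:, j]):
                left, right = y[X[:, j] <= t], y[X[:, j] > t]
                if len(left) == 0 or len(right) == 0:
                    continue                              # degenerate split, skip
                gain = entropy(y) - (len(left) * entropy(left)
                                     + len(right) * entropy(right)) / len(y)
                if gain > best_gain:
                    best, best_gain = (j, t), gain
        return best

Each tree in the forest is grown with this splitting rule on its own bootstrap sample, and the forest classifies new instances by majority vote, just as in bagging.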

Reduce Bias^2 and Decrease Variance?
•  Bagging reduces variance by averaging
•  Bagging has little effect on bias
•  Can we average and reduce bias?
•  Yes: Boosting
