[Figure: a decision tree for the weather data. The root tests windy (true/false); lower nodes test outlook (sunny/overcast/rain), temp (hot/mild/cool), and humidity (high/normal); leaves are labeled good or bad, with some left as ???.]
Measuring Information
Information gain is a popular way to select an attribute.
Let I(p, n) be the information in p positive examples and
n negative examples.
I(p, n) = -\frac{p}{p+n}\log_2\frac{p}{p+n} - \frac{n}{p+n}\log_2\frac{n}{p+n}
Suppose there are p_i positive and n_i negative examples
for the i-th value of an attribute. Then, for an attribute
with two values, the information gain G(p_1, n_1, p_2, n_2)
can be defined as:
G(p_1, n_1, p_2, n_2) = I(p_1 + p_2, n_1 + n_2) - \frac{p_1 + n_1}{p_1 + n_1 + p_2 + n_2} I(p_1, n_1) - \frac{p_2 + n_2}{p_1 + n_1 + p_2 + n_2} I(p_2, n_2)
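As a concrete reading of these two formulas, here is a minimal Python sketch (the names info and gain2 are my own, not from the notes):

from math import log2

def info(p, n):
    # I(p, n): information (in bits) of p positive and n negative examples.
    # The term 0 * log2(0) is treated as 0.
    total = p + n
    return sum(-c / total * log2(c / total) for c in (p, n) if c > 0)

def gain2(p1, n1, p2, n2):
    # G(p1, n1, p2, n2): information gain of a two-valued attribute split.
    total = p1 + n1 + p2 + n2
    return (info(p1 + p2, n1 + n2)
            - (p1 + n1) / total * info(p1, n1)
            - (p2 + n2) / total * info(p2, n2))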
This graph shows I(p, n) assuming p + n = 100.
[Plot: I(p, n = 100 - p) as a function of p; the curve is 0 at p = 0 and p = 100 and peaks at 1 when p = 50.]
[Plot: the corresponding surface over p1 and p2, each ranging from 0 to 50.]
Example of Attribute Selection
Both candidate splits start from 9 good, 5 bad examples.

Outlook:  Sunny: 2 good, 3 bad   Overcast: 4 good, 0 bad   Rain: 3 good, 2 bad
Temp:     Cool:  3 good, 1 bad   Mild:     4 good, 2 bad   Hot:  2 good, 2 bad
G(Outlook) ≈ 0.246 G(Temp) ≈ 0.029
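These gains can be checked numerically. The following self-contained snippet (my own verification, not part of the notes) reproduces them up to rounding:

from math import log2

def info(p, n):
    # I(p, n) in bits; 0 * log2(0) is treated as 0.
    return sum(-c / (p + n) * log2(c / (p + n)) for c in (p, n) if c > 0)

# Outlook: Sunny 2/3, Overcast 4/0, Rain 3/2 out of 9 good, 5 bad.
g_outlook = info(9, 5) - (5/14*info(2, 3) + 4/14*info(4, 0) + 5/14*info(3, 2))
# Temp: Cool 3/1, Mild 4/2, Hot 2/2.
g_temp = info(9, 5) - (4/14*info(3, 1) + 6/14*info(4, 2) + 4/14*info(2, 2))

print(round(g_outlook, 3), round(g_temp, 3))   # ~0.247 and ~0.029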
• Chi-Squared Statistic
\chi^2 = \sum_{j=1}^{v} \left[ \frac{(p_j - p\,s_j)^2}{p\,s_j} + \frac{(n_j - n\,s_j)^2}{n\,s_j} \right]
where s_j = (p_j + n_j)/(p + n)
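A small Python sketch of this statistic (the name chi_squared and the representation of branches as (p_j, n_j) pairs are my own choices):

def chi_squared(branches):
    # branches: one (p_j, n_j) pair per attribute value, each assumed non-empty.
    p = sum(pj for pj, nj in branches)
    n = sum(nj for pj, nj in branches)
    total = 0.0
    for pj, nj in branches:
        sj = (pj + nj) / (p + n)       # fraction of all examples in branch j
        total += (pj - p * sj) ** 2 / (p * sj) + (nj - n * sj) ** 2 / (n * sj)
    return total

# The Outlook split from the example above:
print(chi_squared([(2, 3), (4, 0), (3, 2)]))   # ~3.55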
Overfitting
A hypothesis h overfits the training data if there is
another hypothesis h′ that is worse than h on the training
data, but better over the whole distribution of examples.
Reasons for overfitting (a small demonstration follows this list):
• Noise
• Coincidence
• Lack of Data
• Boundary Approximation
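To make the noise point concrete, here is a small scikit-learn sketch (my own illustration, not from the slides). With 20% of the labels flipped, an unrestricted tree typically memorizes the training set yet scores worse on held-out data than a depth-limited tree:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Noisy two-class data: flip_y mislabels about 20% of the examples.
X, y = make_classification(n_samples=300, n_features=5, flip_y=0.2, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

deep = DecisionTreeClassifier(random_state=1).fit(X_tr, y_tr)          # fits the noise
shallow = DecisionTreeClassifier(max_depth=3, random_state=1).fit(X_tr, y_tr)

print("deep   :", deep.score(X_tr, y_tr), deep.score(X_te, y_te))
print("shallow:", shallow.score(X_tr, y_tr), shallow.score(X_te, y_te))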
[Plot: Glass2 examples, Aluminum (0 to 1.5) vs. Refractive Index (1.51 to 1.535), with the tree's decision regions drawn as boxes.]
In this region, note the boxes with only one example.
A Decision Tree for Glass2
[Plot: detail of the Glass2 tree's regions, Aluminum (1.0 to 1.8) vs. Refractive Index (1.5155 to 1.5195).]
[Plot: Glass2 examples, Aluminum vs. Refractive Index, over the full range 1.51 to 1.535.]
[Plot: Iris2 examples, Petal Width (0 to 2) vs. Petal Length (1 to 7), with decision-tree regions.]
Pruned Example (c4.5 -m 1)
Pruned Decision Tree for Iris2
[Plot: Iris2 examples, Petal Width (0 to 2.5) vs. Petal Length (1 to 7), with the pruned tree's regions.]
Postpruning Algorithm
Prune-DT(N : node, examples)
1. leaferr ← number of examples whose class ≠ N.class
2. revise leaferr upward if examples were training set
3. if N is a leaf then return leaferr
4. treeerr ← 0
5. for each value vj of N.test
6. examplesj ← examples with N.test = vj
7. suberr ← Prune-DT(N.branchj , examplesj )
8. treeerr ← treeerr + suberr
9. if leaferr ≤ treeerr
10. then make N a leaf and return leaferr
11. else return treeerr
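A fairly direct Python transcription of Prune-DT, assuming examples are dicts keyed by attribute name with a "class" entry, and a Node whose fields cls, test, and branches stand in for N.class, N.test, and N.branch_j (these names are mine):

from dataclasses import dataclass, field

@dataclass
class Node:
    cls: str                                      # class predicted if this node is a leaf
    test: str | None = None                       # attribute tested here (None = leaf)
    branches: dict = field(default_factory=dict)  # attribute value -> child Node

def prune_dt(node, examples):
    # Steps 1-2: error if node were a leaf predicting node.cls
    # (revising leaferr upward for training data is not done in this sketch).
    leaf_err = sum(1 for ex in examples if ex["class"] != node.cls)
    # Step 3: a leaf just reports its own error.
    if node.test is None:
        return leaf_err
    # Steps 4-8: sum the errors of the recursively pruned subtrees.
    tree_err = 0
    for value, child in node.branches.items():
        tree_err += prune_dt(child, [ex for ex in examples if ex[node.test] == value])
    # Steps 9-11: collapse to a leaf when that is no worse than keeping the subtree.
    if leaf_err <= tree_err:
        node.test, node.branches = None, {}
        return leaf_err
    return tree_err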
Comments on Pruning
The training and validation set approach (sketched in code below) is:
Remove “validation” exs. from training exs.
Grow decision tree using training exs.
Prune decision tree using validation set.
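A rough scikit-learn analogue of these three steps (scikit-learn does not implement reduced-error pruning directly, so here the held-out set is used to pick a cost-complexity pruning level instead; treat this as an analogue, not the algorithm above):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
# 1. Remove "validation" examples from the training examples.
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

# 2. Grow a full decision tree on the training examples.
full_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# 3. "Prune" by picking the cost-complexity level that does best on the validation set.
alphas = full_tree.cost_complexity_pruning_path(X_train, y_train).ccp_alphas
best = max(
    (DecisionTreeClassifier(random_state=0, ccp_alpha=a).fit(X_train, y_train) for a in alphas),
    key=lambda t: t.score(X_val, y_val),
)
print(best.get_n_leaves(), best.score(X_val, y_val))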
Subtree raising is replacing a tree with one of
its subtrees.
Rule post-pruning as described in the book is
performed by the C4.5rules program, not C4.5.