
Exam Preparation

Evgueni’s part
Question 1
1. As you probably know, a learning problem is said to be well-posed if and only if the class of tasks T, the performance measure P, and the experience E are determined. Please provide one learning problem described in terms of T, P, and E. In addition, provide a possible solution to the problem.
A Possible Answer to Question 1
• Task T: to improve the classification skills of a medical doctor.
• Performance Measure P: the accuracy of the doctor on new patient cases.
• Experience E: previous patient cases considered by the doctor. Each case can be: (1) positive, if the doctor's diagnosis was correct, or (2) negative, if the doctor's diagnosis was incorrect.
A Possible Answer to Question 1
[Table: an example training case, e.g. a case in which an emergency C-section was performed and the doctor's classification is labeled as class Incorrect.]
A Possible Answer to Question 1
Approach to the Learning Problem:
Given:
• Historical data D describing the classification performance of the doctor.
Find:
• A classifier C that predicts whether the doctor's classifications of new cases will be correct or incorrect.
A Possible Answer to Question 1
• The classifier C can be used by the doctor when s/he decides on new cases. If the doctor sees that the classifier's probability for the class "incorrect" is high, s/he might reconsider her/his classification.
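The solution above can be sketched in code. The following is a minimal illustration with entirely hypothetical data and a simple per-signature majority-vote "classifier" standing in for any real learner; the feature encoding and case data are assumptions, not from the slides.

```python
# Experience E: past cases as (features, label) pairs, where the label says
# whether the doctor's diagnosis was correct or incorrect (hypothetical data).
from collections import Counter, defaultdict

def train(cases):
    """Build a lookup from case signature to the majority label seen for it."""
    votes = defaultdict(Counter)
    for features, label in cases:
        votes[features][label] += 1
    overall = Counter(label for _, label in cases)
    fallback = overall.most_common(1)[0][0]   # majority class for unseen cases
    return {sig: c.most_common(1)[0][0] for sig, c in votes.items()}, fallback

def predict(model, features):
    table, fallback = model
    return table.get(features, fallback)

# Hypothetical experience E (symptom pattern -> doctor's track record on it):
E = [
    (("fever", "rash"), "correct"),
    (("fever", "rash"), "correct"),
    (("fever", "cough"), "incorrect"),
]

model = train(E)
print(predict(model, ("fever", "rash")))    # "correct"
print(predict(model, ("headache",)))        # unseen: falls back to majority
```

The performance measure P would then be this classifier's accuracy on new, held-out cases.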
Question 2
2. Consider two version spaces VS(D1) and
VS(D2). Derive the condition when:
VS(D1) ⊆ VS(D2).
Answer: VS(D1) ⊆ VS(D2) holds if and only if every hypothesis h in VS(D1) belongs to VS(D2). By the definition of version spaces, the latter holds if and only if every hypothesis h in VS(D1) is consistent with D2.
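The condition can be checked by brute force on a toy hypothesis space. The threshold encoding below is an assumption chosen for illustration, not part of the question.

```python
# Toy hypothesis space: thresholds t with h_t(x) = (x >= t).
# VS(D1) ⊆ VS(D2) holds exactly when every hypothesis consistent with D1
# is also consistent with D2.
thresholds = range(0, 11)

def consistent(t, data):
    return all((x >= t) == label for x, label in data)

def version_space(data):
    return {t for t in thresholds if consistent(t, data)}

D1 = [(2, False), (7, True)]   # the more constraining data set
D2 = [(7, True)]               # a subset of those constraints

VS1, VS2 = version_space(D1), version_space(D2)
print(VS1 <= VS2)                              # True: VS(D1) ⊆ VS(D2)
print(all(consistent(t, D2) for t in VS1))     # the equivalent condition
```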
Question 3
The reduced-error pruning technique makes it possible to partially overcome the overfitting problem of decision trees. This technique assumes that the training data is split randomly into a growing set and a validation set. The growing set is used for growing the decision tree, and the validation set is used for validating each step of decision-tree pruning. Please explain whether you expect the size of the pruned decision tree to increase or decrease if we decrease the size of the validation set (which also means that we increase the size of the growing set).
[Figure: the data with known results is split into a training (growing) set, fed to the model builder for tree learning, and a validation set used to accept or reject (Y/N) each pruning step.]
A Possible Answer to Question 3
The fact that the growing set is large implies that the grown decision tree is large, while the validation set is small. Thus the accuracy estimates for the tree's nodes on the validation set are low and unreliable, which implies that we prune more. Hence the final decision tree will be small. In addition, the final decision tree will be inaccurate, since the small validation set is unrepresentative.
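The key effect here, that accuracy estimates from a smaller validation set are noisier, can be checked with a small simulation. The numbers below (a subtree's "true" accuracy of 0.75, the set sizes) are hypothetical, not from the slides.

```python
# Simulate measuring a subtree's accuracy on validation sets of various sizes
# and compare the spread (standard deviation) of the resulting estimates.
import random
import statistics

random.seed(0)
TRUE_ACCURACY = 0.75   # assumed true accuracy of some subtree

def estimate(n):
    """Accuracy of the subtree as measured on a validation set of size n."""
    return sum(random.random() < TRUE_ACCURACY for _ in range(n)) / n

stds = {}
for n in (10, 100, 1000):
    estimates = [estimate(n) for _ in range(2000)]
    stds[n] = statistics.stdev(estimates)
    print(n, round(stds[n], 3))
# The spread of the estimate shrinks as the validation set grows, so a small
# validation set yields unreliable estimates for pruning decisions.
```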
Question 4
Consider the following data table, describing people, where 'class' (0 or 1) is the class of the instances for training a classifier.

hair   location  children  size   SIN           class
brown  ottawa    3         big    '650786281'   0
blond  toronto   3         small  '568326546'   1
brown  ottawa    3         big    '743284021'   0
brown  toronto   3         big    '342140966'   0
brown  ottawa    3         big    '167432928'   0
brown  toronto   3         small  '789032643'   1
blond  ottawa    3         small  '124780945'   1
brown  toronto   3         big    '643826437'   0

– Which attribute will be selected by the ID3 algorithm as the root of the decision tree? Explain in your own words.
– Which attribute can be removed, and why? Explain in your own words.
Answer to Question 4
• There are two possibilities for the attribute choice: (a) the attribute Size, since it separates the instances into the two classes, and (b) the attribute SIN, since it has a unique value for each training instance. Thus, ID3 has to make a random choice in this case.
• The attribute Children can be safely removed since it has the same value for all the instances.
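The answer can be verified by computing ID3's information gain for each attribute of the table above; the sketch below does so with the standard entropy formula (the code itself is an illustration, not the slides' own implementation).

```python
# Compute ID3's information gain for each attribute of the Question 4 table.
from math import log2
from collections import Counter, defaultdict

rows = [
    # hair,   location,  children, size,    SIN,         class
    ("brown", "ottawa",  3, "big",   "650786281", 0),
    ("blond", "toronto", 3, "small", "568326546", 1),
    ("brown", "ottawa",  3, "big",   "743284021", 0),
    ("brown", "toronto", 3, "big",   "342140966", 0),
    ("brown", "ottawa",  3, "big",   "167432928", 0),
    ("brown", "toronto", 3, "small", "789032643", 1),
    ("blond", "ottawa",  3, "small", "124780945", 1),
    ("brown", "toronto", 3, "big",   "643826437", 0),
]
names = ["hair", "location", "children", "size", "SIN"]

def entropy(labels):
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def gain(col):
    labels = [r[-1] for r in rows]
    groups = defaultdict(list)
    for r in rows:
        groups[r[col]].append(r[-1])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

gains = {name: gain(i) for i, name in enumerate(names)}
for name, g in gains.items():
    print(f"{name:8s} gain = {g:.3f}")
# 'size' and 'SIN' tie at the maximal gain (the full dataset entropy, ~0.954),
# so ID3 must break the tie at random; 'children' has gain 0.
```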
Question 5
5. Consider the ROC space below. The classifiers that lie on the diagonal (0,0)-(1,1) are random classifiers; i.e., their accuracy rate is 0.5 if the number P of positive instances equals the number N of negative instances. Prove this analytically.

               Predicted
True           pos    neg
pos            tpr    fnr
neg            fpr    tnr
Answer to Question 5
The classifiers that lie on the diagonal (0,0)-(1,1) have the property that tpr is equal to fpr. Since P = N, we have:

Ar = (tpr · P + tnr · N) / (P + N)          (tnr = 1 − fpr)
   = (tpr · P + (1 − fpr) · N) / (P + N)    (tpr = fpr)
   = (tpr · P + (1 − tpr) · N) / (P + N)
   = (tpr · (P − N) + N) / (P + N)          (P = N)
   = N / (P + N)
   = 0.5
[Figure: ROC space, showing the diagonal from (0,0) to (1,1)]
Question 6
Assume that we have an instance space described
by two discrete attributes and a training set
consisting of a single instance repeated 100 times.
In 80 of the 100 cases, the instance is labeled as
positive; in the other 20, it is labeled as negative.
What will be the posterior positive-class probability that the Naïve Bayes classifier provides for this instance, assuming that this classifier has been trained using the 100-example data set? Please explain your answer.
Answer to Question 6
• P(+) = 0.8; P(A1|+) = 1.0; P(A2|+) = 1.0
• P(−) = 0.2; P(A1|−) = 1.0; P(A2|−) = 1.0
Thus,
• P(+|A1,A2)
= P(+) · P(A1,A2|+) / P(A1,A2)
= P(+) · P(A1,A2|+) / (P(A1,A2|+) · P(+) + P(A1,A2|−) · P(−))    (total probability)
= P(+) · P(A1|+) · P(A2|+) / (P(A1|+) · P(A2|+) · P(+) + P(A1|−) · P(A2|−) · P(−))    (naïve independence)
= 0.8 / (0.8 + 0.2) = 0.8
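The same computation, spelled out numerically: with a single repeated instance, every conditional probability P(Ai|class) equals 1, so the posterior reduces to the prior. This is just the arithmetic above in code form.

```python
# Naive Bayes posterior for the single repeated instance of Question 6.
p_pos, p_neg = 0.8, 0.2                 # priors: 80 positive / 20 negative labels
p_a1, p_a2 = 1.0, 1.0                   # the one instance always has values A1, A2
                                        # in both classes, so all conditionals are 1

numerator = p_pos * p_a1 * p_a2
denominator = numerator + p_neg * p_a1 * p_a2
print(numerator / denominator)          # 0.8: the posterior equals the prior
```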