Introduction to machine learning, Pazmany University 2017-2018
Supervised: classification and regression
Classification: a classification problem is when the output variable is a category, such as "red" or "blue" or "disease" and "no disease".
Unsupervised: clustering and association
Clustering: a clustering problem is where you want to discover the inherent groupings in the data, such as grouping customers by purchasing behavior.
Supervised Classification
The most important achievements of deep learning come from supervised classification.
Supervised Regression
Our aim is to predict a target value (Y) based on our data (X).
L1 error: the mean absolute difference between predictions and targets.
L2 error: the mean squared difference between predictions and targets.
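A minimal sketch of both error measures, using hypothetical predictions and targets:

```python
import numpy as np

# Hypothetical target values and model predictions, for illustration only
y_true = np.array([3.0, 2.5, 4.0, 5.0])
y_pred = np.array([2.8, 2.7, 3.5, 5.4])

l1_error = np.mean(np.abs(y_true - y_pred))  # mean absolute error
l2_error = np.mean((y_true - y_pred) ** 2)   # mean squared error
```

The L2 error punishes large deviations more strongly than the L1 error, which makes it more sensitive to outliers.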
One has to gather a large amount of data to ensure that good results are not just a lucky coincidence.
1. Gather data
3. Data preparation
4. Choosing a model
6. Parameter tuning
7. Go back to point 1
Gathering data
Let's have an example dataset:
The dataset contains 506 rows (different data points) and 14 columns
per capita crime rate by town
proportion of residential land zoned for lots over 25,000 sq. ft.
proportion of non-retail business acres per town
Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
nitric oxides concentration (parts per 10 million)
average number of rooms per dwelling
proportion of owner-occupied units built prior to 1940
weighted distances to five Boston employment centers
index of accessibility to radial highways
full-value property-tax rate per $10,000
pupil-teacher ratio by town
1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town
% lower status of the population
median value of owner-occupied homes in $1000s (the target)
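As a sketch, the dataset can be held in a pandas DataFrame. The column names below follow the description above; the values here are random placeholders, whereas the real data would be loaded from a file (e.g. with pd.read_csv):

```python
import numpy as np
import pandas as pd

# Short column names for the 14 columns of the Boston housing data
columns = ["CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
           "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT", "MEDV"]

# Random placeholder values standing in for the real measurements
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((506, 14)), columns=columns)

print(df.shape)   # (506, 14): 506 data points, 14 columns
print(df.head())  # write out the first few rows
```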
Data preparation
Write out some values:
Data preparation
Plot some values as histograms:
Data preparation
Plot some values related to the average price:
Data preparation
Plot correlation between the different features:
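The exploration steps above (histograms, plotting features against the price, and feature correlations) might be sketched like this; the data is a random stand-in and only three illustrative columns are used:

```python
import matplotlib
matplotlib.use("Agg")  # render plots off-screen
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Random stand-in for the housing data, with three of the columns
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((506, 3)), columns=["RM", "LSTAT", "MEDV"])

# Histogram of one feature
df["RM"].hist(bins=30)
plt.savefig("rm_hist.png")
plt.close()

# One feature plotted against the price
df.plot.scatter(x="LSTAT", y="MEDV")
plt.savefig("lstat_vs_medv.png")
plt.close()

# Correlation between the different features
corr = df.corr()
plt.imshow(corr, cmap="coolwarm")
plt.colorbar()
plt.savefig("corr.png")
plt.close()
```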
The bias is error from erroneous assumptions in the learning algorithm. High bias can cause an algorithm to miss the relevant relations between features and target outputs (underfitting).
The variance is error from sensitivity to small fluctuations in the training set. High variance can cause an algorithm to model the random noise in the training data, rather than the intended outputs (overfitting).
If we fit our data perfectly on the training set, the model might not generalize. Our aim is to have a general model.
Let's reserve some of our data to see how general our model is.
Once we have a good model, we should re-run the training and evaluation on a differently divided dataset, and the accuracy should be consistent. This is called cross-validation.
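A sketch of both ideas on stand-in data: a held-out test split, then a simple k-fold division where each fold serves as the test set once:

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.random((506, 13))  # stand-in features
y = rng.random(506)        # stand-in targets

# Shuffle the indices and reserve 20% of the data as a test set
idx = rng.permutation(len(X))
n_test = len(X) // 5
test_idx, train_idx = idx[:n_test], idx[n_test:]
X_train, y_train = X[train_idx], y[train_idx]
X_test, y_test = X[test_idx], y[test_idx]

# Cross-validation: split into k folds; each fold is the test set
# once, and the k accuracies should be consistent
k = 5
folds = np.array_split(idx, k)
for i in range(k):
    test_i = folds[i]
    train_i = np.concatenate([folds[j] for j in range(k) if j != i])
    # ... train on X[train_i], y[train_i]; evaluate on X[test_i] ...
```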
Scaling features
The features in our data can differ by orders of magnitude.
Even models that can learn these differences work faster and better with scaled data.
Let's transform all features so that each of them has zero mean and a variance of 1.
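This standardization can be done directly with NumPy; the columns below are random stand-ins with very different magnitudes:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in features whose columns differ by orders of magnitude
X = rng.random((506, 3)) * np.array([1.0, 100.0, 10000.0])

# Standardize each feature: subtract its mean, divide by its std
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)
```

In practice the mean and standard deviation should be computed on the training set only and then reused on the test set (scikit-learn's StandardScaler follows this convention).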
Detecting outliers
It really helps to have a look at the data:
In this case the values are also thresholded in the test data, so it is good if the model can learn this fact.
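A sketch of two related checks on hypothetical target values: flagging points far from the mean, and applying a cap consistently when the data is known to be thresholded (the cap of 50 here is an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(22.0, 9.0, 506)  # hypothetical target values

# Flag points more than 3 standard deviations from the mean
z = (y - y.mean()) / y.std()
outliers = np.abs(z) > 3

# If the data is thresholded (capped) at some maximum, apply the
# same cap consistently, e.g. at an assumed value of 50:
y_capped = np.clip(y, None, 50.0)
```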
Linear regression
We assume that there is a correlation in our data (x and y) which can be described as:
Y=wX+b
We fit the model by minimizing the error with respect to w and b.
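A minimal sketch of this fit via least squares, on synthetic data generated from known w and b:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(100)
y = 3.0 * x + 1.0 + rng.normal(0.0, 0.01, 100)  # Y = wX + b plus noise

# Least-squares estimate of w and b: stack x with a column of ones
# so the intercept b is fitted alongside the slope w
A = np.column_stack([x, np.ones_like(x)])
(w, b), *_ = np.linalg.lstsq(A, y, rcond=None)
```

The recovered w and b should be close to the generating values 3.0 and 1.0.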
Ridge and ElasticNet use a more complex penalization of the parameters depending on the data (Tikhonov regularization).
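A sketch of the ridge (Tikhonov) penalty via its closed-form solution, on synthetic data; in practice scikit-learn's Ridge and ElasticNet estimators implement these penalties:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(0.0, 0.01, 100)

# Ridge regression minimizes ||Xw - y||^2 + alpha * ||w||^2;
# its closed form is w = (X^T X + alpha I)^-1 X^T y
alpha = 0.1
w = np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)
```

The penalty shrinks the coefficients toward zero, which stabilizes the fit when features are correlated.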
Non-linear regression
If we can differentiate the model with respect to all the parameters, we get an equation for every parameter.
We could fit any model if it is known, but in practice the problem is that we do not know the model.
How could we create a general model that can approximate any possible model?
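When the model form is known, the fit sketched above can be done with scipy.optimize.curve_fit, which differentiates the model with respect to its parameters (numerically) and solves for the best fit. The exponential model here is a hypothetical example:

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical known non-linear model: y = a * exp(b * x)
def model(x, a, b):
    return a * np.exp(b * x)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
y = model(x, 2.0, 1.5) + rng.normal(0.0, 0.01, 200)

# Fit the parameters (a, b), starting from an initial guess p0
popt, pcov = curve_fit(model, x, y, p0=(1.0, 1.0))
a_fit, b_fit = popt
```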
Ensemble regression
http://arogozhnikov.github.io/2016/06/24/gradient_boosting_explained.html
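Following the idea in the linked post, gradient boosting fits each new tree to the residual errors of the ensemble so far; a minimal sketch with scikit-learn on synthetic data:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.random((300, 1))
y = np.sin(6 * X[:, 0])  # a non-linear target

# Each new shallow tree corrects the residuals of the previous ones
model = GradientBoostingRegressor(n_estimators=100, learning_rate=0.1,
                                  max_depth=2, random_state=0)
model.fit(X, y)
score = model.score(X, y)  # R^2 on the training data
```

Many weak learners combined this way can fit a strongly non-linear function that no single shallow tree could.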
Neural networks
Linear regression:
C1(wx+b) + C2(wx+b) + C3(wx+b) + C4(wx+b) + C5(wx+b) + C6(wx+b)
A weighted sum of linear functions is itself linear, so stacking them adds nothing without a non-linearity.
Applying a non-linear activation f between the layers:
f(w3 f(w2 f(w1 x + b1) + b2) + b3)
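This three-layer composition can be written out directly in NumPy; the layer sizes and the ReLU activation here are illustrative choices:

```python
import numpy as np

def f(z):
    return np.maximum(z, 0.0)  # ReLU, one common non-linear activation

rng = np.random.default_rng(0)
x = rng.random(4)                         # input vector
w1, b1 = rng.random((8, 4)), np.zeros(8)  # layer 1
w2, b2 = rng.random((8, 8)), np.zeros(8)  # layer 2
w3, b3 = rng.random((1, 8)), np.zeros(1)  # output layer

# f(w3 f(w2 f(w1 x + b1) + b2) + b3)
out = f(w3 @ f(w2 @ f(w1 @ x + b1) + b2) + b3)
```

With the non-linearity f in place, the composition is no longer a linear function of x.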
Neural networks
- Backpropagation algorithm
Neural networks
Regularization:
- Using batches
- Adding Dropout
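Both techniques can be sketched in NumPy: splitting the data into mini-batches, and an "inverted dropout" helper that randomly zeroes activations during training (the batch size and dropout rate are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((506, 13))  # stand-in training data

# Mini-batches: train on small random subsets instead of the full set
batch_size = 32
idx = rng.permutation(len(X))
batches = [idx[i:i + batch_size] for i in range(0, len(X), batch_size)]

# Inverted dropout: zero each activation with probability p during
# training and rescale the survivors so the expected value is kept
def dropout(a, p, rng):
    mask = rng.random(a.shape) >= p
    return a * mask / (1.0 - p)

h = rng.random(64)              # hypothetical hidden activations
h_drop = dropout(h, 0.5, rng)   # roughly half are zeroed
```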