
What is machine learning?

Machine Learning is the science of getting computers to learn and act like humans do, and to improve their learning over time in an autonomous fashion, by feeding them data and information in the form of observations and real-world interactions.

“Machine Learning at its most basic is the practice of using algorithms to parse data, learn from it, and then make a determination or prediction about something in the world.”

There are many algorithms for getting machines to learn, from basic decision trees to clustering to layers of artificial neural networks, depending on the task you’re trying to accomplish and the type and amount of data you have available.

There are three types of machine learning


1. Supervised Machine Learning
2. Unsupervised Machine Learning
3. Reinforcement Machine Learning
Supervised Machine Learning
It is a type of learning in which both input and desired output data are provided. Input and output data are labeled for classification to provide a learning basis for future data processing. These algorithms have a target/outcome variable (or dependent variable) that is to be predicted from a given set of predictors (independent variables). Using this set of variables, we generate a function that maps inputs to desired outputs. The training process continues until the model achieves the desired level of accuracy on the training data.

Algorithms for Supervised Machine Learning


1. Linear regression
2. Logistic regression
3. Support Vector Machines
4. Naive Bayes
5. K-nearest neighbour algorithm
6. Random Forest Algorithm

Applications of Supervised Machine Learning


1. Bioinformatics
2. Quantitative structure–activity relationship
3. Database marketing
4. Handwriting recognition
5. Information retrieval
6. Learning to rank
7. Information extraction
8. Object recognition in computer vision
9. Optical character recognition
10. Spam detection
11. Pattern recognition
Unsupervised Machine Learning
Unsupervised learning is the training of an algorithm using information that is neither classified nor labeled, allowing the algorithm to act on that information without guidance. The main idea behind unsupervised learning is to expose the machines to large volumes of varied data and allow them to learn and infer from the data. However, the machines must first be programmed to learn from data.

Unsupervised learning problems can be further grouped into clustering and association problems.

Clustering: A clustering problem is where you want to discover the inherent groupings in the data, such as grouping customers by purchasing behaviour.

Association: An association rule learning problem is where you want to discover rules that describe large portions of your data, such as people that buy X also tend to buy Y.

Algorithms for Unsupervised Machine Learning
1. K-means Algorithm
2. Apriori Algorithm
3. Expectation–maximization algorithm (EM)
4. Principal Component Analysis (PCA)
Applications of Unsupervised Machine Learning
1. Human Behaviour Analysis
2. Social Network Analysis to define groups of
friends.
3. Market Segmentation of companies by
location, industry, vertical.
4. Organizing computing clusters based on
similar event patterns and processes.

Reinforcement Machine Learning


Reinforcement Learning is a type of Machine Learning which allows machines to automatically determine the ideal behaviour within a specific context, in order to maximize performance. Simple reward feedback is required for the agent to learn its behaviour; this is known as the reinforcement signal. It differs from standard supervised learning in that correct input/output pairs need not be presented, and sub-optimal actions need not be explicitly corrected. Instead, the focus is on performance, which involves finding a balance between exploration of uncharted territory and exploitation of current knowledge.

Algorithms for Reinforcement Machine Learning

1. Q-Learning Algorithm
2. State–action–reward–state–action Algorithm
(SARSA)
3. Deep Q Network Algorithm (DQN)
4. Deep Deterministic Policy Gradient Algorithm
(DDPG)
Applications of Reinforcement Machine Learning
1. Resources management in computer clusters
2. Traffic Light Control
3. Robotics
4. Web System Configuration
5. Personalized Recommendations
6. Deep Learning

Linear Regression
It is a basic and commonly used type of
predictive analysis. These regression estimates
are used to explain the relationship between one
dependent variable and one or more
independent variables.
Y = a + bX
where
Y – Dependent Variable
a – intercept
X – Independent variable
b – Slope

Example:
Predicted University GPA = (0.675)(High School GPA) + 1.097
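For instance, a student with a High School GPA of 3.0 would get a predicted University GPA of (0.675)(3.0) + 1.097 = 3.122.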
Code for Linear Regression
# x_train, y_train: predictor values and target values from the training data
# x_test: predictor values from the test data
x_train <- input_variables_values_training_datasets
y_train <- target_variables_values_training_datasets
x_test <- input_variables_values_test_datasets
# Combine predictors and target into one data frame and fit the model
train_data <- data.frame(x_train, y_train)
linear <- lm(y_train ~ ., data = train_data)
summary(linear)
# Predict on the test set
predicted <- predict(linear, newdata = data.frame(x_test))

Logistic Regression
It’s a classification algorithm that is used when the response variable is categorical. The idea of Logistic Regression is to find a relationship between the features and the probability of a particular outcome.

odds = p(x) / (1 - p(x)) = probability of the event occurring / probability of the event not occurring
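As a quick illustration with an assumed probability value, the odds and the mapping from log-odds back to a probability can be computed directly in R:

p <- 0.8                             # assumed probability of the event
odds <- p / (1 - p)                  # 4: the event is 4 times as likely to occur as not
log_odds <- log(odds)                # the quantity a logistic model is linear in
p_back <- 1 / (1 + exp(-log_odds))   # maps the log-odds back to the probability 0.8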

Example
When we have to predict whether a student passes or fails an exam, given the number of hours spent studying as a feature, the response variable has two values: pass and fail.
Code for Logistic Regression
# Fit a binomial (logistic) GLM on the combined training data
train_data <- data.frame(x_train, y_train)
logistic <- glm(y_train ~ ., data = train_data, family = "binomial")
# type = "response" returns predicted probabilities for the test set
predicted <- predict(logistic, newdata = data.frame(x_test), type = "response")

Support Vector Machine


Support Vector Machines are perhaps one of the most popular and talked-about machine learning algorithms. SVM is primarily a classifier method that performs classification tasks by constructing hyperplanes in a multidimensional space that separate cases with different class labels. SVM supports both regression and classification tasks and can handle multiple continuous and categorical variables.

Example:
One class is linearly separable from the others. For example, if we only had two features, such as the height and hair length of an individual, we would first plot these two variables in a two-dimensional space where each point has two coordinates.
Code for Support Vector Machine
library(e1071)  # provides svm()
train_data <- data.frame(x_train, y_train)
fit <- svm(y_train ~ ., data = train_data)
predicted <- predict(fit, newdata = data.frame(x_test))

Naive Bayes Algorithm


A naive Bayes classifier is not a single algorithm but a family of machine learning algorithms that use probability theory to classify data, with an assumption of independence between predictors. It is easy to build and particularly useful for very large data sets. Along with simplicity, Naive Bayes is known to outperform even highly sophisticated classification methods.
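Concretely, with predictors x1, ..., xn the classifier scores each class c using Bayes' rule together with the independence assumption:

P(c | x1, ..., xn) ∝ P(c) × P(x1 | c) × ... × P(xn | c)

and predicts the class with the highest score.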

Example:
Emails are given and we have to identify the spam emails among them. A spam filter looks at email messages for certain keywords and puts them in a spam folder if they match.
Code for Naive Bayes
library(e1071)  # provides naiveBayes()
train_data <- data.frame(x_train, y_train)
fit <- naiveBayes(y_train ~ ., data = train_data)
predicted <- predict(fit, newdata = data.frame(x_test))

K-Nearest Neighbour Algorithm


It can be used for both classification and regression, but in most cases it is used for classification. It does not learn an explicit model; instead, it stores the entire training data set and uses it as its representation.

KNN for Regression


When KNN is used for regression problems, the prediction is based on the mean or the median of the K most similar instances.

KNN for Classification


When KNN is used for classification, the output is calculated as the class with the highest frequency among the K most similar instances. Each instance in essence votes for its class, and the class with the most votes is taken as the prediction.
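A minimal sketch of this voting idea in plain R, using a hypothetical toy data set of six points with two features and a binary label:

train_x <- matrix(c(1, 1, 1, 2, 2, 1, 5, 5, 6, 5, 5, 6), ncol = 2, byrow = TRUE)
train_y <- c("A", "A", "A", "B", "B", "B")
query <- c(1.5, 1.5)
k <- 3
# Euclidean distance from the query point to every training point
dists <- sqrt(rowSums((train_x - matrix(query, nrow(train_x), 2, byrow = TRUE))^2))
# The K nearest neighbours vote; the most frequent class wins
neighbours <- train_y[order(dists)[1:k]]
prediction <- names(which.max(table(neighbours)))  # "A"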
Example:
Should the bank give a loan to an individual?
Would an individual default on his or her loan? Is
that person closer in characteristics to people
who defaulted or did not default on their loans?

Code for KNN Algorithm


# knn3() from the caret package gives KNN a formula interface
library(caret)
train_data <- data.frame(x_train, y_train)
# Fitting model
fit <- knn3(y_train ~ ., data = train_data, k = 5)
# Predict output (class labels for the test set)
predicted <- predict(fit, newdata = data.frame(x_test), type = "class")

Random Forest
Random forest is a collection of decision trees (a “forest”): it builds multiple decision trees and merges them together to get a more accurate and stable prediction. It can be used for both classification and regression problems.
Example:
Suppose we have a bowl of 100 unique numbers from 0 to 99, and we want to select a random sample of numbers from the bowl. If we put each number back in the bowl after drawing it, a number may be selected more than once.
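This sampling with replacement (the bootstrap step that random forest applies to the training rows when building each tree) can be reproduced in one line of R:

bootstrap_sample <- sample(0:99, size = 100, replace = TRUE)  # duplicates are expected
table(bootstrap_sample)  # shows how often each number was drawn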

Code for Random Forest


library(randomForest)
train_data <- data.frame(x_train, y_train)
fit <- randomForest(y_train ~ ., data = train_data, ntree = 500)
predicted <- predict(fit, newdata = data.frame(x_test))

K-Means Algorithm
K-means clustering is a type of unsupervised learning which is used when you have unlabeled data; the goal of this algorithm is to find groups in the data.
Steps of the algorithm:
1. Decide the number of clusters k (k is predefined).
2. Select k points at random as cluster centres.
3. Assign each object to its closest cluster centre according to the Euclidean distance function.
4. Calculate the centroid, or mean, of all objects in each cluster.
5. Repeat steps 3 and 4 until the cluster assignments no longer change.
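A minimal from-scratch sketch of one assignment/update pass over these steps, assuming a small toy matrix X of points and k = 2 (in practice steps 3 and 4 are repeated until the assignments stabilise):

set.seed(42)
X <- matrix(rnorm(40), ncol = 2)    # 20 toy points with 2 features
k <- 2
centres <- X[sample(nrow(X), k), ]  # step 2: pick k random points as centres
# Step 3: assign each point to its closest centre (Euclidean distance)
dist_to <- function(centre) sqrt(rowSums((X - matrix(centre, nrow(X), 2, byrow = TRUE))^2))
assignment <- apply(sapply(1:k, function(j) dist_to(centres[j, ])), 1, which.min)
# Step 4: recompute each centre as the mean of its assigned points
centres <- t(sapply(1:k, function(j) colMeans(X[assignment == j, , drop = FALSE])))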

Examples:
Behavioural segmentation, such as segmenting customers by purchase history or by activities on an application, website, or platform.
Separating valid activity groups from bots.

Code for K-Means Algorithm

model <- kmeans(X, centers = 3)  # fit 3 clusters to the training matrix X
# Assign each test point to the nearest fitted cluster centre
predicted <- apply(x_test, 1, function(p) which.min(colSums((t(model$centers) - p)^2)))
Apriori Algorithm
It is an association rule learning algorithm that operates on database records, particularly transactional records, i.e. records consisting of a certain number of fields or items. It is mainly used for finding frequent item sets in large amounts of data, and association rules are then derived from those frequent item sets.
Example:
Analysing transaction data for frequent if/then patterns, using the support and confidence criteria to identify the most important relationships.
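The two criteria mentioned above are defined as:

support(X => Y) = fraction of transactions containing both X and Y
confidence(X => Y) = support(X => Y) / support(X)

For instance (with made-up numbers), if 10 of 100 transactions contain both bread and butter and 25 contain bread, then support(bread => butter) = 0.10 and confidence(bread => butter) = 0.10 / 0.25 = 0.40.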

Code for Apriori Algorithm


library(arules)
library(RColorBrewer)  # for brewer.pal()
data <- as(data, "transactions")
tl <- as(data, "tidLists")
rules <- apriori(data, parameter = list(supp = 0.001, conf = 0.80))
arules::itemFrequencyPlot(data, topN = 20,
  col = brewer.pal(8, "Pastel2"),
  main = "Relative Item Frequency Plot",
  type = "relative", ylab = "Item Frequency (Relative)")
Expectation Maximization (EM)
It is an algorithm for maximum likelihood estimation when your data is incomplete or has missing data points. The EM algorithm can find model parameters even when some data are missing. It works by choosing random values for the missing data points and using those guesses to estimate a second set of values; the new values are used to create a better guess for the first set, and the process continues until the algorithm converges on a fixed point. Its main limitation is that it can be very slow, even on the fastest computer.
Example:
Estimating the parameters of a two-component mixture, such as the biases of two coins from pooled coin-toss results when it is unknown which coin produced each toss.

Code for Expectation Maximisation Algorithm


# `data` is assumed to be a numeric vector of observations; a two-component
# Gaussian mixture is fitted to it by Expectation-Maximization
library(mixtools)  # for normalmixEM() used at the end
x <- data

# Initial guesses for the mixing weights, means and standard deviations
pi1 <- 0.5; pi2 <- 0.5
mu1 <- -0.01; mu2 <- 0.01
sigma1 <- sqrt(0.01); sigma2 <- sqrt(0.02)

mysum <- function(x) sum(x[is.finite(x)])

loglik <- rep(NA, 1000)
loglik[1] <- 0
loglik[2] <- mysum(pi1 * (log(pi1) + log(dnorm(x, mu1, sigma1)))) +
  mysum(pi2 * (log(pi2) + log(dnorm(x, mu2, sigma2))))

k <- 2
while (abs(loglik[k] - loglik[k - 1]) >= 0.00001) {
  # E-step: responsibility of each component for every observation
  d1 <- pi1 * dnorm(x, mean = mu1, sd = sigma1)
  d2 <- pi2 * dnorm(x, mean = mu2, sd = sigma2)
  tau1 <- d1 / (d1 + d2)
  tau2 <- d2 / (d1 + d2)
  tau1[is.na(tau1)] <- 0.5
  tau2[is.na(tau2)] <- 0.5

  # M-step: update mixing weights, means and standard deviations
  pi1 <- mysum(tau1) / length(x)
  pi2 <- mysum(tau2) / length(x)
  mu1 <- mysum(tau1 * x) / mysum(tau1)
  mu2 <- mysum(tau2 * x) / mysum(tau2)
  sigma1 <- sqrt(mysum(tau1 * (x - mu1)^2) / mysum(tau1))
  sigma2 <- sqrt(mysum(tau2 * (x - mu2)^2) / mysum(tau2))

  loglik[k + 1] <- mysum(tau1 * (log(pi1) + log(dnorm(x, mu1, sigma1)))) +
    mysum(tau2 * (log(pi2) + log(dnorm(x, mu2, sigma2))))
  k <- k + 1
}

# The same model fitted with the mixtools package
gm <- normalmixEM(x, k = 2, lambda = c(0.5, 0.5), mu = c(-0.01, 0.01), sigma = c(0.01, 0.02))
Principal Component Analysis (PCA)
It is an important method for dimensionality reduction. It extracts a low-dimensional set of features from a high-dimensional data set, with the aim of capturing as much information as possible. It also helps to visualise high-dimensional data, reduces noise, and makes other algorithms work better because we are feeding them fewer inputs.
Example:
When we have to bring out strong patterns in a
data set or to make data easy to explore and
visualize
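As a minimal sketch of how such components are computed (using R's built-in mtcars data purely for illustration):

prin_comp <- prcomp(mtcars, scale. = TRUE)  # centre, scale, then rotate the data
summary(prin_comp)                          # proportion of variance explained per component
head(prin_comp$x[, 1:2])                    # scores on the first two principal components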

Code for Principal Component Analysis (PCA)


# Assumes prin_comp <- prcomp(pca.train, scale. = TRUE) was computed on the
# training predictors, and pca.test holds the matching test predictors
library(rpart)
train.data <- data.frame(Item_Outlet_Sales = train$Item_Outlet_Sales, prin_comp$x)
train.data <- train.data[, 1:31]  # target plus the first 30 principal components
rpart.model <- rpart(Item_Outlet_Sales ~ ., data = train.data, method = "anova")
# Project the test data onto the same principal components
test.data <- predict(prin_comp, newdata = pca.test)
test.data <- as.data.frame(test.data)
test.data <- test.data[, 1:30]
rpart.prediction <- predict(rpart.model, test.data)
Algorithms for Reinforcement Machine
Learning

1. Q-Learning Algorithm
2. State–action–reward–state–action Algorithm
(SARSA)
3. Deep Q Network Algorithm (DQN)
4. Deep Deterministic Policy Gradient Algorithm
(DDPG)

Q-Learning

It is a model-free Reinforcement Learning algorithm based on the Bellman equation. The goal of the algorithm is to learn a policy, which tells an agent what action to take under what circumstances. It does not require a model of the environment and can handle problems with stochastic transitions and rewards without requiring adaptations.
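The standard update rule that the code below implements (with learning rate alpha; the discount factor gamma is fixed at 1 in this sketch) is:

Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))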

Code for Q-Learning Algorithm:


# `actions` and simulateEnvironment() are assumed to be defined elsewhere for the problem at hand
learnEpisode <- function(s_0, s_terminal, epsilon, learning_rate, Q) {
  state <- s_0  # set cursor to initial state
  while (state != s_terminal) {
    # epsilon-greedy action selection
    if (runif(1) <= epsilon) {
      action <- sample(actions, 1)  # pick random action
    } else {
      action <- which.max(Q[state, ])  # pick first best action
    }
    response <- simulateEnvironment(state, action)
    # update rule for Q-learning (implicit discount factor of 1)
    Q[state, action] <- Q[state, action] + learning_rate *
      (response$reward + max(Q[response$state, ]) - Q[state, action])
    state <- response$state  # move to next state
  }
  return(Q)
}
State-Action-Reward-State-Action
(SARSA)

It is an algorithm for learning a Markov decision process policy. It is also called "Modified Connectionist Q-Learning" (MCQ-L). The major difference between SARSA and Q-Learning is that the maximum reward for the next state is not necessarily used for updating the Q-values. Instead, a new action, and therefore a new reward, is selected using the same policy that determined the original action.
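In update-rule form, SARSA replaces the max in the Q-Learning update with the action a' actually chosen in the next state s' by the current policy:

Q(s, a) <- Q(s, a) + alpha * (r + gamma * Q(s', a') - Q(s, a))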



Code for SARSA Algorithm

# Install the ReinforcementLearning package from GitHub
devtools::install_github("nproellochs/ReinforcementLearning")
# Alternatively, install from a local source archive:
# devtools::install_local("ReinforcementLearning_1.0.0.tar.gz")
library(ReinforcementLearning)

# Example data set shipped with the package
data("tictactoe")
head(tictactoe, 5)

# An environment is any function that maps a state and an action
# to the next state and a reward
environment <- function(state, action) {
  ...
  return(list("NextState" = newState, "Reward" = reward))
}

# Use the built-in gridworld environment
env <- gridworldEnvironment
print(env)

# Define state and action sets
states <- c("s1", "s2", "s3", "s4")
actions <- c("up", "down", "left", "right")

# Sample N = 1000 random sequences from the environment
data <- sampleExperience(N = 1000, env = env, states = states, actions = actions)
head(data)

# Define reinforcement learning parameters
control <- list(alpha = 0.1, gamma = 0.5, epsilon = 0.1)

# Perform reinforcement learning
model <- ReinforcementLearning(data, s = "State", a = "Action", r = "Reward",
                               s_new = "NextState", control = control)

# Print result
print(model)
Deep Q Network (DQN)

DQN leverages a neural network to estimate the Q-value function. The input to the network is the current state, while the output is the corresponding Q-value for each action. In the classic Atari setup, it takes a stack of four frames as input; these pass through the network, which outputs a vector of Q-values, one for each action possible in the given state. We take the biggest Q-value of this vector to find our best action.



Code for DQN Algorithm
source("Agent.R")
source("Memory.R")
source("train.R")
N_EPISODE <- 500
BATCH_SIZE <- 32
gym <- import("gym")
env <- gym$make("CartPole-v0")
env <- gym$wrappers$Monitor(env, "monitor",
force = TRUE)
tf$reset_default_graph()
agent <-
Agent$new(
input_shape = 4,
output_dim = 2,
epsilon_last_episode = 100
)
memory <- Memory$new(capacity = 50000)
rewards <- c()
with(tf$Session() %as% sess, {
init <- tf$global_variables_initializer()
sess$run(init)
for (episode_i in 1:N_EPISODE) {
done <- FALSE
s <- env$reset()
total_reward = 0
while (!done) {
a <- agent$get_action(state_ = s, step =
episode_i)
ret <- env$step(action = a)
s2 <- ret[[1]]
r <- ret[[2]]
done <- ret[[3]]
memory$push(s, a, r, done, s2)
if (memory$length > BATCH_SIZE) {
batch <- memory$sample(BATCH_SIZE)
train(agent, batch)
}
s <- s2
total_reward <- total_reward + r
}
cat(
sprintf(
"[Episode: %4d] Reward: %4d, Epsilon: %.3f\n",
episode_i,
total_reward,
agent$epsilon
)
)
rewards <- append(rewards, total_reward)
if (length(rewards) > 100) {
rewards <- rewards[2:length(rewards)]
if (mean(rewards) > 195) {
cat("Game Cleared")
break
}
}
}
})
env$close()

Deep Deterministic Policy Gradient Algorithm (DDPG)

The DDPG algorithm applies deep learning to continuous control. It relies on the actor-critic architecture with two eponymous elements, an actor and a critic, and it uses a stochastic behaviour policy for good exploration while estimating a deterministic target policy, which is much easier to learn.

Code for DDPG Algorithm

# Assumes helpers/tensorflow_graph.R and helpers/main_methods.R define
# PolicyGraphBuilder, ValueGraphBuilder, RunEpisode and ProcessMemory,
# and that a gym HTTP API server is running locally
library(tensorflow)
library(gym)
source("helpers/tensorflow_graph.R")
source("helpers/main_methods.R")

remote_base <- "http://127.0.0.1:5000"
client <- create_GymClient(remote_base)
env_name <- "CartPole-v0"
instance_id <- env_create(client, env_name)

log_dir <- file.path(getwd(), "log", "train")
env_monitor_start(client, instance_id, log_dir, force = T, resume = F)

episode_count <- 500
max_steps <- 2000
model_path <- "model/model.ckpt"

# Build the actor (policy) and critic (value) graphs
tf$reset_default_graph()
policy.graph <- PolicyGraphBuilder(lr = 0.01, hidden_dim = 8L)
value.graph <- ValueGraphBuilder(lr = 0.01, hidden_dim = 8L)
saver <- tf$train$Saver()

sess <- tf$InteractiveSession()
init <- tf$global_variables_initializer()
sess$run(init)

for (i in 1:episode_count) {
  # Run one episode with the current policy ($obs, $rewards, $actions)
  output <- RunEpisode(client, instance_id, policy.graph, sess,
                       bad.reward = -10, timestep = max_steps)
  # Compute advantages and target values ($advantages, $values.true)
  result <- ProcessMemory(output, discount.rate = 0.9, value.graph, sess)

  # Train the critic (value network)
  value.input.states <- value.graph$input$states
  value.input.true.values <- value.graph$input$true_values
  sess$run(value.graph$op$train_op,
           feed_dict = dict(value.input.states = output$obs,
                            value.input.true.values = result$values.true))

  # Train the actor (policy network)
  input.states <- policy.graph$input$states
  input.advant <- policy.graph$input$advantages
  input.action <- policy.graph$input$actions
  sess$run(policy.graph$op$train_op,
           feed_dict = dict(input.states = output$obs,
                            input.advant = result$advantages,
                            input.action = output$actions))

  episode_reward <- sum(output$rewards)
}

# Evaluate the trained policy
test_log_dir <- file.path(getwd(), "log", "test")
env_monitor_start(client, instance_id, directory = test_log_dir, force = T, resume = F)
N <- 200
result <- numeric(N)
for (i in 1:N) {
  out <- RunEpisode(client, instance_id, policy.graph, sess,
                    timestep = 300, bad.reward = 0, render = T)
  cat(sprintf("\nReward (total): %d", sum(out$rewards)))
  result[i] <- sum(out$rewards)
}
env_monitor_close(client, instance_id)
env_close(client, instance_id)

