
Route Optimization in Logistics Operations Using Data Science Tools

B.E. Project (Production Engineering), 8th Semester

Deepak Kumar Tiwary – BE/10529/15
Mohit Baranwal – BE/10301/15
Prerna Esther Kiro – BE/10314/15

• Objectives of the Project
• Methodology
• Results and Discussions
Logistics is often defined as the art of bringing the right amount of the right product to the
right place at the right time.

The efficiency of a logistics system is influenced by many factors. One of them is deciding
how many distribution centers (DCs) to open and where to locate them, so that
customer demand is satisfied at minimum DC opening cost and minimum shipping cost.

Supply chain management goals include transportation network design, plant/DC location,
production schedule streamlining, and efforts to improve order response time.
Transportation network design is one of the most important fields of SCM. It offers great
potential to reduce costs and to improve service quality.

The problem aims to find out the optimum location of the Distribution Center.
Introduction to Problem

The project addresses cost reduction through route optimization of a
logistics operation, using data science techniques for the optimization.
A brief description of the problem follows.

We have a data set that records each delivery destination and its pin
code. The destination is the location of the customer. We first use simple
regression analysis to find the best-fit line through the data points.
A genetic algorithm is then used to pick the optimal solution from the set
of candidate solutions. The genetic algorithm starts from parent solutions
and generates offspring from them, moving toward optimality.

The objective of the project is to find a location for the distribution center
such that the cost of transportation is minimized, using regression
analysis and a genetic algorithm on the data set.

The problem therefore consists of 2 major steps:

1. Implementation of Regression Analysis: The best-fit line is
obtained on the data set, which gives the range of probable
locations of the Distribution Center.
2. Optimization by using Genetic Algorithm: The probable locations
so obtained are then passed through the algorithm, which iterates
and gives the optimum location in the end.

Schematic Diagram of Methodology
Data Collection → Analysis Tools (Excel, R, Python) → Location Analysis (DC): Regression Analysis → Cost Optimization: Genetic Algorithm → Results and Discussions

Summary of Data
The data set records all deliveries made between a warehouse and different locations in
India. The raw data contains many fields, but only the destination pin code is relevant here. The raw data
has 189986 observations.
The raw data summary was obtained in R:

The summary shows the fields in the raw data: destination, destination pin code,
mode of transport, city, state, etc.

After removing the records with a missing pin code, 123276 observations remain, in which many pin codes have multiple
deliveries. Treating each pin code as a single data point gives 6780 distinct data points, which are
the pin codes of the destinations.
Location Analysis: Simple Regression
Simple regression is used to find the line that best describes the relationship between the
scattered data points. The points may or may not fall on a straight line, so regression analysis
uses the least squares method to generate a best-fit line: the line that minimizes the sum of
squared vertical distances (residuals) from all the points to the line. The equation of the line is
obtained from the relationship between the dependent and the independent variable. Since our data
set mainly holds the latitude and longitude of each destination, with latitude as the independent
variable and longitude as the dependent variable, the line has the basic form:

Longitude = a*Latitude + b

where a is the slope and b is the intercept.

Equation of best fit line is:

Longitude = 0.305928*Latitude + 78.7166.
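
The least squares fit described above can be sketched in Python as follows. This is a minimal illustration, not the project's own code: the function name `fit_line` and the sample points are hypothetical, and the synthetic data only mimics a line with a slope near 0.3.

```python
import numpy as np

def fit_line(latitude, longitude):
    """Ordinary least squares fit of longitude = slope * latitude + intercept."""
    lat = np.asarray(latitude, dtype=float)
    lon = np.asarray(longitude, dtype=float)
    lat_mean, lon_mean = lat.mean(), lon.mean()
    # Closed-form least squares estimates for a single predictor
    slope = np.sum((lat - lat_mean) * (lon - lon_mean)) / np.sum((lat - lat_mean) ** 2)
    intercept = lon_mean - slope * lat_mean
    return slope, intercept

# Hypothetical sample points (NOT the project data)
rng = np.random.default_rng(0)
sample_lat = rng.uniform(22.0, 25.0, size=200)
sample_lon = 0.3 * sample_lat + 78.7 + rng.normal(0.0, 0.05, size=200)

slope, intercept = fit_line(sample_lat, sample_lon)
print(f"Longitude = {slope:.4f} * Latitude + {intercept:.4f}")
```

On the real data, the latitude and longitude of the 6780 distinct pin codes would take the place of the synthetic arrays.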

Optimization: Genetic Algorithm
A genetic algorithm is used because it tends to search for the global optimum rather than
settling in a local one. In each generation the fitness value is monitored, offspring are
generated by crossover and mutation, and the initial parents are chosen by roulette wheel selection.

Random coordinates are first generated on the best-fit line and their fitness is calculated.
Members with good fitness are selected, and offspring are then generated from them.

Instead of restricting candidates to the line itself, we allow a band around it to increase the
probability of obtaining valid offspring. The bandwidth is 0.2 in our case.

Fig: x axis is latitude, y axis is longitude.
For the generation of offspring, every parent coordinate is converted into its binary equivalent.
The fitness of each candidate is evaluated and the parents are selected as follows:
1. An array of individual distances is created from each point on the line to every destination point in the data.
2. Each distance is multiplied by the demand at that destination (cost), and the products are summed to
give the total cost.
3. A fitness function is defined as the inverse of the cost function, so that a higher fitness
indicates a better solution.
4. The fitness value is calculated for each cost and the values are sorted in descending order.
5. The roulette wheel method is used for the initial selection of parents.
6. The relative fitness value is calculated as fitness[i] / sum(fitness).
7. Bins of cumulative relative fitness are created.
8. A random number is generated in the interval (0, 1).
9. Each random number falls into one bin, which identifies a fitness value and therefore the
coordinate of the parent it belongs to.
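
The roulette wheel steps above can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `roulette_select` and the example costs are hypothetical, and the sorting step is omitted because it does not change the selection probabilities.

```python
import numpy as np

def roulette_select(costs, n_parents, rng):
    """Pick parent indices with probability proportional to fitness = 1 / cost."""
    costs = np.asarray(costs, dtype=float)
    fitness = 1.0 / costs                   # step 3: fitness is the inverse of cost
    relative = fitness / fitness.sum()      # step 6: relative fitness
    cumulative = np.cumsum(relative)        # step 7: bins of cumulative relative fitness
    draws = rng.random(n_parents)           # step 8: random numbers in [0, 1)
    # step 9: find the bin each random number falls into
    idx = np.searchsorted(cumulative, draws, side="right")
    # Guard against floating-point round-off in the last bin
    return np.minimum(idx, len(costs) - 1)

# Hypothetical costs for five candidate locations (NOT project results)
example_costs = [1200.0, 1150.0, 1300.0, 1135.0, 1500.0]
parents = roulette_select(example_costs, n_parents=4, rng=np.random.default_rng(42))
print(parents)  # indices into example_costs; lower-cost candidates are more likely
```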
Crossover:
Before crossover, every parent must be put into the same representation. The steps are:
1. Each parent coordinate is converted from a decimal to a whole number.
2. The whole number is converted into its binary equivalent.
3. The binary forms of the x and y coordinates are concatenated.
4. A random number is generated between 0 and the length of the concatenated string.
5. Both parents' concatenated binary strings are cut at that position, and the right-hand parts
are swapped to generate the offspring.
6. The coordinates of each offspring are recovered from its binary string.
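
The crossover steps above can be sketched as follows. The source does not state how decimals are scaled to whole numbers or how many bits are used, so `SCALE` and `BITS` are assumptions made for illustration, as are the function names and the sample parents. The sketch assumes non-negative coordinates and draws the cut strictly inside the string so that each child mixes both parents.

```python
import random

SCALE = 10_000   # assumption: keep 4 decimal places when converting to a whole number
BITS = 21        # assumption: fixed width per coordinate (2**21 > 180 * SCALE)

def encode(lat, lon):
    """Steps 1-3: scale to whole numbers, convert to binary, concatenate."""
    lat_int = round(lat * SCALE)
    lon_int = round(lon * SCALE)
    return format(lat_int, f"0{BITS}b") + format(lon_int, f"0{BITS}b")

def decode(bits):
    """Step 6: recover (latitude, longitude) from a concatenated bit string."""
    return int(bits[:BITS], 2) / SCALE, int(bits[BITS:], 2) / SCALE

def crossover(parent_a, parent_b, rng):
    """Steps 4-5: single-point crossover on the concatenated bit strings."""
    a, b = encode(*parent_a), encode(*parent_b)
    cut = rng.randint(1, len(a) - 1)   # randint includes both endpoints
    child_a = a[:cut] + b[cut:]        # swap the right-hand parts
    child_b = b[:cut] + a[cut:]
    return decode(child_a), decode(child_b)

# Hypothetical parents (latitude, longitude)
p1, p2 = (23.3441, 85.3096), (24.1000, 86.0900)
print(crossover(p1, p2, random.Random(7)))
```

Offspring produced this way can land far from the parents when the cut falls in the high-order bits, which is why a validity check on the offspring is needed afterwards.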

Constraints:

For an offspring to be valid, it must satisfy the following conditions:
1. The latitude of the new coordinate must lie within the range of latitude values in the Jharkhand data.
2. The new coordinate must lie inside the band around the best-fit line.

To check the second condition we use a simple property of lines: if a point lies between 2 lines,
then substituting the point into the equations of the 2 lines gives values of opposite sign.
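
A minimal sketch of this validity check is given below. The slope and intercept are the fitted values reported earlier and the bandwidth is 0.2 as stated. The sketch assumes the band is formed by shifting the intercept by plus and minus the bandwidth, which the source does not spell out. The function name `is_valid` and the latitude bounds are placeholders, not the actual range in the Jharkhand data.

```python
SLOPE = 0.305928      # from the fitted line
INTERCEPT = 78.7166   # from the fitted line
BANDWIDTH = 0.2       # from the source; applied here as +/- on the intercept (assumption)

def is_valid(lat, lon, lat_min, lat_max):
    """Return True if the offspring satisfies both constraints."""
    # Condition 1: latitude within the range seen in the data
    if not (lat_min <= lat <= lat_max):
        return False
    # Condition 2: substitute the point into both boundary lines of the band
    upper = lon - (SLOPE * lat + INTERCEPT + BANDWIDTH)
    lower = lon - (SLOPE * lat + INTERCEPT - BANDWIDTH)
    # Opposite signs (or zero) mean the point lies between the two lines
    return upper * lower <= 0

# Placeholder latitude bounds, for illustration only
print(is_valid(23.0, 85.75, lat_min=22.0, lat_max=25.0))
print(is_valid(23.0, 86.50, lat_min=22.0, lat_max=25.0))
```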
Model Formulation
Objective Function:

Min Z = ∑_i ∑_j t_i x_ij

t_i = Demand at location i

x_ij = Distance between locations i and j

We aim to minimize the cost of transportation, but no direct cost data was available. Since the distances
between locations were available, we used distance as a proxy for cost, assuming that cost depends directly on,
and varies linearly with, distance.
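
For a single candidate distribution center, the objective reduces to a demand-weighted sum of distances, which can be sketched as follows. The source does not say which distance measure was used, so straight-line (Euclidean) distance in latitude and longitude units is an assumption here. The function name `total_cost` and the sample values are hypothetical.

```python
import numpy as np

def total_cost(candidate, destinations, demand):
    """Demand-weighted sum of distances from one candidate DC to all destinations."""
    candidate = np.asarray(candidate, dtype=float)        # (lat, lon) of the candidate DC
    destinations = np.asarray(destinations, dtype=float)  # shape (n, 2)
    demand = np.asarray(demand, dtype=float)              # t_i, shape (n,)
    # x_ij: Euclidean distance (assumption) from each destination i to candidate j
    distances = np.linalg.norm(destinations - candidate, axis=1)
    return float(np.sum(demand * distances))

# Hypothetical example: two destinations with demands 2 and 1
print(total_cost((0.0, 0.0), [(0.0, 0.0), (3.0, 4.0)], [2, 1]))  # 5.0
```

The fitness of a candidate is then the inverse of this cost.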

Crossover Technique: Crossover is done to explore new regions of the solution space. The crossover operator
exchanges parts of the strings between selected parents.

Mutation: Mutation is done to prevent premature convergence and to explore new regions of the solution space.
We are implementing and comparing 2 mutation techniques: insert mutation and swap mutation.
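
The two mutation operators can be sketched on a bit string as follows. This follows the common textbook definitions of swap and insert mutation. The source does not show how the operators were applied, so the function names and the example string are illustrative assumptions.

```python
import random

def swap_mutation(bits, rng):
    """Exchange the values at two randomly chosen positions."""
    i, j = rng.sample(range(len(bits)), 2)
    chars = list(bits)
    chars[i], chars[j] = chars[j], chars[i]
    return "".join(chars)

def insert_mutation(bits, rng):
    """Remove the value at one random position and reinsert it at another."""
    i, j = rng.sample(range(len(bits)), 2)
    chars = list(bits)
    gene = chars.pop(i)
    chars.insert(j, gene)
    return "".join(chars)

# Hypothetical chromosome
chromosome = "1011001110001011"
rng = random.Random(5)
print(swap_mutation(chromosome, rng))
print(insert_mutation(chromosome, rng))
```

Both operators rearrange existing bits and never flip one, so the number of 1s in the string is unchanged. This limits how far a mutated coordinate can move and is worth keeping in mind when comparing the two.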
Results and Discussions
Simple linear regression and the genetic algorithm were applied to the general set of
solutions, and the best solution was obtained by iterating for two generations.
Fitness Value:

Fig: Fitness value of the offspring. Fig: Fitness Value of the parent.

The fitness value of the offspring and of the parent determines how good each solution is with respect to the
problem. An individual with a higher fitness value is considered better. In most cases the fitness value of the
offspring is higher than that of the parent individual.
Cost of Offspring and Parent:

Fig: Cost of the offspring (child), i.e. the cost from the probable distribution center to the destination points after the final iteration.
Fig: Cost of the parent, i.e. the cost from the probable distribution center to the destination points after the initial iteration.

The two figures above list the costs of the offspring and of the parent individuals in
ascending order, so the minimum cost appears first. After crossover, the offspring give a better
solution than the parent individuals.
As the figures show, the lowest cost among the parent chromosomes is 1135.88 units and
the lowest cost among the offspring is 1115.80 units, a reduction of 20.08 units (about 1.8%).
This shows that the iterations move the solution toward optimality.
Since we iterated for only a few generations, the result is the best among the iterations
performed and not necessarily the global optimum. With more iterations the solution is
expected to move closer to optimality.
In recent times computational power has increased exponentially, which opens the door to better solutions for
optimization problems in supply chain management through the use of evolutionary optimization.
Our project is only a small demonstration of that, and many companies have started to apply these
technologies to their own problems.
There is a lot of scope for further work on this project. Several variables should be
considered before applying it in the real world, such as:
1. Mode of transport.
2. Weight of the parcel being transported.
3. Order in which transfers are made.
4. Variable demand, etc.
In our project we iterated for 2 generations. Increasing the number of generations may give a
better result.
The methods of crossover, mutation and initial selection can also be changed to see how the
result varies.
The problem can also be extended to a two-stage, multiple distribution center problem: more than one DC is used
to optimize the cost, and the two stages are warehouse to distribution center and distribution center to customer.