
Big Data Analytics for Supply Chain Management

Jens Leveling¹, Matthias Edelbrock², Boris Otto³


¹ Software Engineering, Fraunhofer-Institute for Material Flow and Logistics IML, Dortmund, Germany
² Audi-Endowed Chair of Supply Net Order Management, Technical University of Dortmund, Dortmund, Germany
³ Information Management and Engineering, Fraunhofer-Institute for Material Flow and Logistics IML, Dortmund,
Germany
(jens.leveling@iml.fraunhofer.de, matthias.edelbrock@tu-dortmund.de, boris.otto@iml.fraunhofer.de)

Abstract – A growing number of business cases are characterized by increased complexity. This results from closer collaboration between companies, customers and governmental organizations on the one hand and more individual products and services on the other hand. Companies are therefore planning to address these issues with Big Data solutions. This paper deals with Big Data solutions focusing on supply chains, which represent a key discipline for handling the increased collaboration next to vast amounts of exchanged data. Today, the main focus lies on optimizing Supply Chain Visibility in order to handle complexity and to support decision making for handling risks and interruptions along supply chains. Big Data concepts and technologies will therefore play a key role. This paper describes the current situation and available solutions and presents exemplary use cases for illustration. A classification regarding the area of application and the potential benefits arising from Big Data Analytics is also given. Furthermore, this paper outlines general technologies to show the capabilities of Big Data Analytics.

Keywords – supply chain management, supply chain visibility, supply chain risk management, big data

I. INTRODUCTION

Nowadays, many companies are planning to extend their existing IT infrastructure with Big Data solutions. This is needed due to the challenges resulting from on-going globalization, increasing data volumes and customer demand for more individual products and service configurations. Globalization enables the acquisition of new customers and requires new and changing business models due to changing market requirements. New suppliers and distributors have to be integrated into the supply chain. Additionally, supply chain operations and processes have to be designed for global application while considering new product variants. The products and offered solutions themselves are more and more custom-designed as a result of requirements adapted to regional specifics and competitors' offerings. This economic competition leads to short product life cycles, not only in the mobile phone and IT sector.

Next to the direct operational aspects, the focus of each company lies more and more on the company's context and the supply chain environment. Both are influenced not only by the market itself, but also by risks or bottlenecks that might occur along the supply chains or within production lines. For example, supply chain risk management is in the focus of many companies, but is supported by IT systems only to a limited extent. Disturbances are treated reactively in most cases. An early identification of upcoming or future risks is typically missing. Accordingly, the required reaction time for almost all actions is not available. Even the transparency of the first level of the supply chain – the direct suppliers and customers – is not always given. All these issues are in the focus of companies, with different emphases due to individual business strategies, manufactured products and offered services. Therefore, data becomes more important and is turning into one main part of new business models in the future, as has been shown by [1].

Big Data solutions have to be adapted to the company's requirements and environment, since the data of every enterprise is individual due to different semantic meanings and structural changes, not only between different companies but also within departments and divisions. Big Data activities often start by setting up a (Big) Data strategy for the entire company. Therefore, the value drivers and targets, which have to be achieved with the new IT solutions, should be clarified [2]. For these issues, a general classification of (Big) Data strategy targets for companies and a potential usage based on the categorization are illustrated by two exemplary use cases, as shown by [2].

This paper is structured as follows: In the next chapter, big data is defined and a general description of Big Data technologies and data sources is given. In the third section, related solutions for Supply Chain Management (SCM) are characterized. The fourth chapter presents a categorization in terms of the area of application and shows exemplary use cases of big data analytics. Finally, a conclusion of the explained approach and an outlook on future work are given.

II. BIG DATA

A. Big Data term definition

Big data describes a way of collecting, managing and analysing large amounts of data. Therefore, big data is mostly referenced with the three Vs as described by [3, 4]. They are defined as volume, velocity and variety.

The volume characterizes the vast amounts of data stored within the IT infrastructure. In general, storing a large amount of data already represents a big issue for the data storage itself. One of the main challenges for IT infrastructures dealing with big data is to ensure the availability of storage space and efficient accessibility.


Velocity describes the large amounts of data that arrive irregularly in real time. This fast-arriving data has to be handled if further usage is planned.

The last one – variety – relates to the various structures of the information handled within the big data environment. The stored data can either be based on a structure (data type) or consist of unstructured information. The three Vs, including their challenges for IT infrastructures and big data architectures, are considered in the next sections. The value of big data is often described as the fourth V. It relates to the fact that the data needs to be processed and analysed for further usage.

Handling large amounts of data requires both concepts for managing these datasets and concepts for processing them. In this section, current concepts for both challenges are described.

B. Data Sources

Data sources are mainly divided into internal and external data. Internal data is mostly available in business IT systems and databases, for example from an ERP system. The internal communication of production systems is also available as data streams, for example from radio-frequency identification (RFID) devices.

External data sources are also available as data streams, for example from social media like Facebook, and as datasets from data portals. Both are described in this subsection. Social media sources are mostly unstructured, and their data semantics are diverse and change frequently. Social media data is mostly not directly accessible. For example, Twitter only provides a data stream that contains a limited amount of tweets. Therefore, companies like DataSift offer purchasable streams of social media data [5]. On the other hand, external sources like search engine trends are also less structured, but the provided APIs, like the Google Search API, are mostly free to use. Some external data sources can be searched and captured via technologies like the Semantic Web (SW) as shown in [6, 7, 8] or Linked Open Data (LOD) as described in [1].
Open data sources, like Eurostat (epp.eurostat.ec.europa.eu), are free for commercial and non-commercial use. For example, they contain statistical, geographical and political information about regions and countries. These sources can be accessed through open data portals like the European Union Open Data Portal, which represents a catalogue for further open data sources [10].

Next to the open data portals, there are many other platforms that offer closed data. Closed data means that datasets have to be purchased or licenced for access. Example platforms are:

Factual is an open data platform founded in 2009. Factual offers different data services like data mapping or ad targeting. All services are based on its global location database containing data for more than 65 million local businesses and places of interest spread all over the world.

Microsoft Azure Data Market is a cloud computing infrastructure. Closed data and open data can be accessed through this data market platform. The datasets are provided by many different companies and organizations.

Data.com is a service platform focussing on business data management like contacts and company profiles. The service offers access to millions of company profiles and allows current updates of the customer data.
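As a simple illustration of how such externally provided datasets can be pulled into an analytics environment, the following sketch downloads a CSV export over HTTP and loads it into memory. The URL and column layout are placeholders rather than an actual portal endpoint; a real integration would use the respective portal's API and, for closed data, its licensing and authentication mechanisms.

```python
import csv
import io
import urllib.request

# Placeholder URL; a real integration would point at a concrete open data
# portal export (e.g. a Eurostat or EU Open Data Portal dataset).
DATASET_URL = "https://example.org/open-data/regional-statistics.csv"

def fetch_open_dataset(url):
    """Download a CSV dataset and return it as a list of dictionaries."""
    with urllib.request.urlopen(url) as response:
        text = response.read().decode("utf-8")
    return list(csv.DictReader(io.StringIO(text)))

if __name__ == "__main__":
    records = fetch_open_dataset(DATASET_URL)
    print(f"Loaded {len(records)} records")
```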
C. Data Management and Analytics

New architectural designs are required due to the various data structures and the high amount of data. Conventional architectural components, like SQL databases, are no longer able to handle these amounts of data. Additionally, existing enterprise architectures, which focus on the business processes themselves, are often designed without any data-centred characteristic. However, a modern enterprise architecture has to focus on data to enable big data analytics [3].

Data is mostly structured too individually to store it within the given structures of relational databases. Therefore, several new database concepts have been developed to store and manage individually structured data. These concepts are generally known as NoSQL (Not only SQL). These databases have no or only limited schema restrictions compared to relational concepts. The benefit is that data structures can be updated without rewriting database tables [5]. In addition, the focus of NoSQL databases lies on distributed accessibility and scalability. For supply chain modelling, NoSQL graph databases are very attractive as a result of their close relation to applications like transport schedule optimization, navigation systems or social networks [12]. The data model of these databases organizes the data within a graph structure, comparable to road networks. The information in graph databases is stored within the nodes. Additional information is mostly defined as properties describing a node [12]. Finally, a link organizes the connection between different database entries.
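To make the graph idea concrete, the following minimal sketch models a small supply chain as a property graph, with sites as nodes and transport relations as edges. It uses the networkx library as a stand-in for a dedicated graph database; the node and edge attributes are purely illustrative assumptions.

```python
import networkx as nx

# Sites (suppliers, plants, customers) become nodes with descriptive properties;
# transport relations become directed edges carrying lead-time attributes.
supply_chain = nx.DiGraph()
supply_chain.add_node("supplier_a", role="supplier", country="DE")
supply_chain.add_node("plant_1", role="manufacturer", country="DE")
supply_chain.add_node("dc_north", role="distribution_center", country="NL")
supply_chain.add_node("retailer_x", role="customer", country="FR")

supply_chain.add_edge("supplier_a", "plant_1", mode="road", lead_time_days=2)
supply_chain.add_edge("plant_1", "dc_north", mode="rail", lead_time_days=3)
supply_chain.add_edge("dc_north", "retailer_x", mode="road", lead_time_days=1)

# Typical graph queries: downstream reachability and fastest delivery path.
affected = nx.descendants(supply_chain, "supplier_a")
path = nx.shortest_path(supply_chain, "supplier_a", "retailer_x",
                        weight="lead_time_days")
print("Downstream of supplier_a:", affected)
print("Fastest route:", path)
```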
The right data management concept for handling big data is linked to the data analytics approach. Batch analytics requires large distributed data storages. In general, batch analytics works with distributed tasks on large data storages to search and extract information [5]. Batch analytics is executed and managed through large-scale processing frameworks. Today, the Apache Hadoop framework is widely used for batch analytics. It consists of two main components: the Hadoop Distributed File System (HDFS) for managing vast amounts of data and a MapReduce-based framework for executing the analytics, as characterized by [4, 11].
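The following sketch illustrates the MapReduce programming model behind such batch analytics in plain Python, counting read events per supply chain site. It only mimics the map, shuffle and reduce phases locally and is not Hadoop code; on a real cluster the same logic would be expressed with Hadoop MapReduce or a comparable framework against files stored in HDFS. The record format is an assumption for illustration.

```python
from collections import defaultdict

# Example input: one log record per line, e.g. "2014-03-01;dc_north;pallet_4711".
records = [
    "2014-03-01;dc_north;pallet_4711",
    "2014-03-01;plant_1;pallet_4711",
    "2014-03-02;dc_north;pallet_0815",
]

def map_phase(record):
    """Emit (site, 1) for every read event."""
    _, site, _ = record.split(";")
    yield site, 1

def reduce_phase(site, counts):
    """Sum all partial counts for one site."""
    return site, sum(counts)

# Local stand-in for the shuffle step a cluster performs between map and reduce.
shuffled = defaultdict(list)
for record in records:
    for key, value in map_phase(record):
        shuffled[key].append(value)

results = dict(reduce_phase(site, counts) for site, counts in shuffled.items())
print(results)  # e.g. {'dc_north': 2, 'plant_1': 1}
```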
A challenge within the use of data analytics is to analyse irregularly arriving data streams next to existing large data sets. Data streams are often characterized as real-time data. Therefore, the Lambda Architecture has been designed to enable data stream analytics in addition to batch analytics, as described by [3, 13]. Data streams can be processed by technologies like Complex Event Processing (CEP) as shown by [3, 14]. CEP is, for example, used to integrate and process events from RFID devices, barcode readers and sensors.
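A minimal sketch of this event-processing idea is shown below: read events from RFID devices arrive as a stream and a simple rule flags handling units that have not been seen again within an expected time window. The event format and threshold are assumptions for illustration; dedicated CEP engines express such rules declaratively and handle far higher event rates.

```python
from datetime import datetime, timedelta

# Simulated stream of read events: (timestamp, handling_unit, read_point).
event_stream = [
    (datetime(2014, 3, 1, 8, 0), "pallet_4711", "plant_1_gate"),
    (datetime(2014, 3, 1, 9, 30), "pallet_0815", "plant_1_gate"),
    (datetime(2014, 3, 1, 11, 0), "pallet_4711", "dc_north_inbound"),
]

MAX_TRANSIT = timedelta(hours=2)  # assumed threshold for a delayed unit
last_seen = {}

def process_event(timestamp, unit, read_point):
    """Update state and emit an alert if the unit exceeded the transit window."""
    if unit in last_seen:
        previous_time, previous_point = last_seen[unit]
        if timestamp - previous_time > MAX_TRANSIT:
            print(f"ALERT: {unit} took {timestamp - previous_time} "
                  f"from {previous_point} to {read_point}")
    last_seen[unit] = (timestamp, read_point)

for event in event_stream:
    process_event(*event)
```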


III. RELATED WORK

A. Big Data for Supply Chain Management

Nowadays, the optimization of supply chain visibility (SCV) is one of the core developments in SCM. SCV, as shown by [15], is a complex problem due to the interaction between the involved people, processes, technologies and information flows. The key target of SCV is to show the current activities and involvements along a supply chain. This can be used to collect information for decision makers [16] in many cases, for example if an interruption occurs.

Today, available data sets are often outdated, but current information is required for decision making [12, 16, 17, 18]. This issue has to be addressed by new IT solutions for SCM. Some Big Data based solutions for Supply Chain Management are presented below.
In [9], LOD and SW are used to increase SCV. Here, business transactions are evaluated and analysed to collect external data and to describe the supply chain network. Furthermore, existing supply chain data is used. The analytic results are compared with the existing data to update the supply chain network information. Afterwards, the up-to-date supply chain information is used to search for upcoming risks and interruptions for all sites and transport routes within the supply chain. For this solution, an ontology network as described by [19] is needed. The background is that in a supply chain many companies exchange data, but every company has an individual semantic understanding of the given information.
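The core matching step of such a solution can be sketched as follows: the supply chain network is held as a set of sites and routes, and externally collected event data (e.g. strikes or floods) is compared against the locations of those sites to flag potentially affected parts of the network. The data layout is an assumption for illustration; the approach in [9] works on semantically annotated (LOD/SW) data rather than plain dictionaries.

```python
# Simplified supply chain network: sites with locations and routes between them.
sites = {
    "supplier_a": "Rotterdam",
    "plant_1": "Dortmund",
    "dc_north": "Hamburg",
}
routes = [("supplier_a", "plant_1"), ("plant_1", "dc_north")]

# Externally collected disruption events (e.g. from news feeds or open data).
external_events = [
    {"type": "port_strike", "location": "Rotterdam"},
    {"type": "flood", "location": "Passau"},
]

def affected_elements(sites, routes, events):
    """Return sites and routes whose location matches a reported disruption."""
    hit_sites = {name for name, loc in sites.items()
                 if any(e["location"] == loc for e in events)}
    hit_routes = [r for r in routes if r[0] in hit_sites or r[1] in hit_sites]
    return hit_sites, hit_routes

print(affected_elements(sites, routes, external_events))
# ({'supplier_a'}, [('supplier_a', 'plant_1')])
```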
The RiskVis framework as developed by [16] combines available solutions for the visualization of supply chains based on Google Maps and Baidu Maps with analysis approaches for internal and external data. An analysis module extracts problems and disruptions. Additionally, the existing information about how to handle unplanned disruptions, along with the supply chain network configuration, is made available to the system through dedicated interfaces. Furthermore, the data for the risk extraction is stored next to the analytics module on a server cluster. The analysis is focused on real-time data to provide a real-time view of the supply chain for risk monitoring, enabling both reduced complexity and the provision of necessary information for decision makers [16]. So far, interruption forecasting or early risk identification is not part of the design.

B. Industrial Awareness of Big Data Solutions

As of 2011, there was already a high interest in new solutions to increase supply chain visibility based on big data analytics, but existing security policies stopped the implementation and integration of different data sources. The industry spotted high capabilities for solutions based on HDFS and MapReduce in 2011.

The situation regarding dealing with large amounts of data has not changed. Existing data is often stored in large files like web logs or in complex XML documents. All those files are transformed into structured data as demanded by conventional databases. The transformation is necessary if further usage is planned. This transformation is often too time-intensive, and the stored data is of limited use later on. Distributed processing is needed to extract the required information quickly from large datasets. This is not yet implemented due to the restricted accessibility of different internal data sources. Furthermore, security policies mostly complicate the integration of external data into existing enterprise IT infrastructures.

Current solutions for increasing the SCV focus on optimizing the collaboration within the supply chain. Additionally, 44% of enterprises improved their internal visibility and 40% are optimizing their operations to improve monitoring, usability or efficiency, as [15] reports. This leads to reconsidering security policies in order to enable access to different data sources for analytic methods. The understanding that data is valuable has highly increased throughout the industry, but the valuable information still needs to be processed.

According to the Gartner survey [20], around 26-28% of manufacturing companies and retailers currently invest in big data solutions. In the transportation sector, only 20% of the surveyed companies have already invested, but at 50% this sector shows the highest share of investments planned within the next two years. Summarized over all industries, improving risk management is the problem addressed by about 32% of the big data initiatives.

IV. CATEGORIZATION AND POTENTIAL BENEFITS

A. Classification

A categorization regarding the analysed type of data delivers seven potential scenarios. These scenarios use both structured and unstructured data and can be further categorized into click-stream, social media, server-log, sensor, location, text, and video/audio data [21].

The application of Big Data Analytics can contribute to different targets and provide benefits in various fields. These can be divided into operational efficiency, customer experience and new business models. Big Data Analytics can enable new business models and generate new revenue streams. Operational efficiency based on Big Data capabilities uses data for better decision-making and improvement in terms of process quality, performance or resource consumption. The field of customer experience is mainly assigned to marketing, e.g. focusing on more precise customer assessment, which also supports a company's SCM [2, 22]. Focusing on logistics and supply chains, automatic identification and data capture (Auto-ID) technologies like RFID and bar codes are widely used to track handling units and the transported goods. Therefore, many read points have to be shared along the supply chain.

Figure 1 shows the overarching model of data categorization. Data in the outer circle is of higher fuzziness, volume and change frequency. A wider distance to the circle's center implies less control and increasing ambiguity of both the data and its source. The inner circle consists mostly of data that is owned by the company and administrated within the company's IT infrastructure, e.g. ERP systems.


Fig. 1. Overarching model of data categorization

Measuring the benefits of Big Data is possible with specific IT-related key performance indicators (KPIs). These KPIs refer to qualitative and quantitative aspects and can be time-, quality- or financially oriented. The five most recommended KPIs are lead-time reduction, increase of customer satisfaction, decrease of product costs, reduced time to market and contribution to sales increase [22].
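As an example of how such a KPI can be derived from operational data, the sketch below computes the average lead time before and after an improvement from order and delivery timestamps; the record layout and figures are assumptions for illustration only.

```python
from datetime import date

# Illustrative order records: (order date, delivery date, period label).
orders = [
    (date(2013, 5, 2), date(2013, 5, 9), "before"),
    (date(2013, 6, 1), date(2013, 6, 9), "before"),
    (date(2014, 5, 3), date(2014, 5, 7), "after"),
    (date(2014, 6, 2), date(2014, 6, 5), "after"),
]

def average_lead_time(records, label):
    """Average number of days between order and delivery for one period."""
    days = [(delivered - ordered).days
            for ordered, delivered, period in records if period == label]
    return sum(days) / len(days)

before = average_lead_time(orders, "before")
after = average_lead_time(orders, "after")
print(f"Lead-time reduction: {before - after:.1f} days "
      f"({(before - after) / before:.0%})")
```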
Both categorizations show that there is a high degree of interlinking from an organisational point of view. The application of Big Data Analytics for SCM affects other departments, e.g. marketing and sales. For that reason, KPIs have to consider various facets for a holistic evaluation of use cases.

B. Potential Benefits

A Capgemini study [22] identifies machine-to-machine communication (M2M) as the topic that gained the most importance within the last year. M2M data is derived from sensors attached to machines or products, which can collect various types of data like temperature, usage data or positioning data. In 2013, 11% of the surveyed companies stated that they already use data generated from machine-to-machine communication. This increased significantly, as in 2014 23% of the surveyed companies use M2M data. Additionally, 12% of the companies are currently implementing such solutions and another 13% plan to implement them. M2M enables automatic information exchange between different objects, e.g. vending machines, cameras, transport vehicles, containers and their corresponding databases. Possible use cases include activities like monitoring areas and machines, improving the maintenance of facilities and automatic ordering if demand is recognized. The use case of automatic ordering can be fully automated up to a self-distribution of the necessary goods. Regarding the above-mentioned classification, machine-to-machine communication will enable new business models and has the potential to highly increase operational efficiency.
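A minimal sketch of the automatic ordering use case is given below, assuming each machine periodically reports its fill level via M2M; the threshold-based rule and the message format are illustrative assumptions rather than a concrete product.

```python
# Latest fill-level messages reported by vending machines via M2M.
m2m_messages = [
    {"machine": "vm_017", "item": "cola", "fill_level": 0.15},
    {"machine": "vm_017", "item": "water", "fill_level": 0.80},
    {"machine": "vm_042", "item": "cola", "fill_level": 0.05},
]

REORDER_THRESHOLD = 0.20  # assumed: replenish below 20% fill level

def generate_replenishment_orders(messages, threshold=REORDER_THRESHOLD):
    """Turn low fill-level reports into replenishment orders automatically."""
    return [
        {"machine": msg["machine"], "item": msg["item"], "action": "replenish"}
        for msg in messages
        if msg["fill_level"] < threshold
    ]

for order in generate_replenishment_orders(m2m_messages):
    print(order)
```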
The goal of the smaRTI research project [23] was to increase the intelligence of the material flow. Therefore, the identification and localization of handling units, like pallets, is achieved through the usage of Auto-ID technologies. A wide spread of read points, like barcode or RFID readers, leads to an increasing transparency of freight deliveries between companies. When a handling unit is detected, the read points generate events, which are available in real time and enable better transparency and sped-up processes. This is achieved through optimized planning of deliveries and supported detection of bottlenecks and risks. Latencies can also be avoided or at least reduced.

The project finished in 2013 with a pilot implementation for a candy product supply chain with 18 read points shared between the manufacturer, a supermarket and a handling unit pooling operator. The implementation focused on tracking 2,500 handling units, which are transported in a loop between the three companies. Around 90,000 read events were created. The outcome of the pilot implementation was a completely transparent handling unit flow between the three participants. All companies see a benefit in a roll-out of the smaRTI results to a complete supply chain, not only for one product. Every handling unit, and the transported goods, could be directly located as soon as they reached a read point location. The next step is to analyze the smaRTI data alongside the business process data to detect not only bottlenecks or risks along the supply chain, but also to find actions within the process which can be optimized or eliminated. An example would be a manual quality measurement for counting the received goods and comparing them with the expected numbers. In summary, the smaRTI approach can decrease the loss of handling units by 50% and can reduce the goods in stock by up to 10% in the consumer goods domain.
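The kind of follow-up analysis the project envisages can be sketched as follows: from the events produced at the read points, dwell times per transport leg are computed to point at possible bottlenecks in the loop. The event layout and locations are illustrative assumptions, not smaRTI pilot data.

```python
from collections import defaultdict
from datetime import datetime

# Read events from a pilot-style setup: (handling unit, read point, timestamp).
read_events = [
    ("pallet_001", "manufacturer_out", datetime(2013, 9, 2, 7, 0)),
    ("pallet_001", "supermarket_in", datetime(2013, 9, 2, 13, 0)),
    ("pallet_001", "pooling_operator_in", datetime(2013, 9, 5, 9, 0)),
]

def dwell_times(events):
    """Compute how long each handling unit spent between consecutive read points."""
    per_unit = defaultdict(list)
    for unit, point, ts in sorted(events, key=lambda e: (e[0], e[2])):
        per_unit[unit].append((point, ts))
    durations = defaultdict(list)
    for unit, scans in per_unit.items():
        for (p_from, t_from), (p_to, t_to) in zip(scans, scans[1:]):
            durations[(p_from, p_to)].append(t_to - t_from)
    return durations

for leg, times in dwell_times(read_events).items():
    print(leg, max(times))  # longest observed duration per transport leg
```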
Use cases with regard to the anticipatory shipping intention – as described in [24] – will speed up delivery times of goods and increase the utilization ratio of distribution capacities. One use case example is DHL's volume forecast with predictive capacity utilization and planning. The parcel volume analytics helps to improve the prediction accuracy of expected parcels and freight within the network. This is realized by correlating data from different sources and with different degrees of privacy protection. Input data could be the company's internal historic shipment data as well as external events, public holidays, weather conditions and forecasts, Google search terms, and the shopping behavior of online customers.
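A strongly simplified sketch of such a correlation-based forecast is given below: a linear model relates daily parcel volume to a public holiday flag and a search-interest indicator. The numbers are invented for illustration; a production forecast would rest on far richer features and models than an ordinary least-squares fit.

```python
import numpy as np

# Illustrative history: [is_public_holiday, search_interest_index] -> parcels.
features = np.array([
    [0, 40], [0, 55], [1, 40], [0, 70], [1, 60],
], dtype=float)
parcel_volume = np.array([10000, 11500, 7000, 13000, 9000], dtype=float)

# Fit a linear model via least squares (adding an intercept column).
X = np.hstack([np.ones((len(features), 1)), features])
coeffs, *_ = np.linalg.lstsq(X, parcel_volume, rcond=None)

# Predict tomorrow's volume for a non-holiday with high search interest.
tomorrow = np.array([1.0, 0.0, 75.0])
print("Expected parcels:", int(tomorrow @ coeffs))
```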
Another example [25] is Amazon's US patent for anticipatory shipping from December 2013. The aim of this patent is to ship goods prior to the customers' orders in order to reduce delivery time. A prediction of upcoming orders is the key element of the patent. The patent enables several applications. First, a shipment is transported to a destination area without knowing the complete delivery address. During the shipment to the specific geographical location, this address is completed based on orders placed in the meantime. This optimized distribution improves the lead time as well as customer satisfaction and can help to increase sales.


Secondly, Amazon tries to match some of the goods, while they are already in transit to a specific geographical location, with current customer orders. During the transport, the goods are added to orders which have been placed – as expected – by a specific customer in this area. The motivation is to use the disadvantage of lower-cost transportation (non-expedited delivery) for buffering the speculatively selected items. If one of the items in transit fits a customer's demand, the item is delivered within a short delivery time. This results in reduced transportation costs for both Amazon and the customer.
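This second application – matching goods already in transit to incoming orders – can be sketched as a simple assignment step; the data structures below are illustrative assumptions and not the patented method itself.

```python
# Speculative shipments already moving towards a destination area.
in_transit = [
    {"shipment": "S1", "item": "ebook_reader", "area": "10115"},
    {"shipment": "S2", "item": "headphones", "area": "80331"},
]

# Orders placed while the shipments are on the road.
incoming_orders = [
    {"order": "O7", "item": "ebook_reader", "area": "10115",
     "address": "Invalidenstr. 1"},
]

def match_orders_to_transit(orders, shipments):
    """Assign an incoming order to a matching in-transit shipment, if any."""
    assignments = []
    for order in orders:
        for shipment in shipments:
            if (shipment["item"] == order["item"]
                    and shipment["area"] == order["area"]):
                assignments.append((order["order"], shipment["shipment"]))
                shipments.remove(shipment)  # shipment is now bound to this order
                break
    return assignments

print(match_orders_to_transit(incoming_orders, in_transit))
# [('O7', 'S1')] – the speculative shipment is completed with the final address.
```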
The shopping process itself has a major influence on the forecast data generated for shipment and delivery. Information influencing the forecast could include the specific web pages viewed and the duration of the views, the links hovered over with the mouse pointer and the duration of hovering, the shopping cart, wish lists and the relatedness to previous purchases, e.g. in the case of new product releases, where no historical buying patterns from similar customers exist.

V. CONCLUSION AND OUTLOOK

Supply chain management requires new solutions due to the increasing complexity. Existing solutions and IT systems are not able to address this problem. Therefore, new solutions have been developed in recent years. As the use cases show, there are already systems that not only allow a better handling of the vast amounts of data, but also show how Big Data Analytics can influence business models. This can also lead to new business models, as shown by the Amazon patent.

In this paper, we presented ongoing research work about developing solutions for increasing supply chain visibility based on a data source classification and potential benefits. The expectation for the future is that various companies from different industries will form Big Data ecosystems for gaining new business models and providing new services to customers. This will lead to even more rapidly increasing complexity for SCM. Based on this, our future work will be about designing solutions and IT systems that include various external sources. Therefore, we need the classification of data sources to design a rule catalog for data cleansing – including data quality measurement – and a terminology for both data cleansing and data analytics. The smaRTI project shows that using Auto-ID technologies to track handling units can optimize material flow transparency. Here, the next step is to analyze these data together with business process data and external data, like weather information, to detect further risks and bottlenecks along supply chains. Our next research work will focus on this.

REFERENCES

[1] Smart Service Welt Working Group, "Smart Service Welt: Recommendations for the Strategic Initiative Web-based Services for Businesses", acatech, Berlin, 2014.
[2] M. Jeseke, M. Grüner, F. Wieß, "Big Data in Logistics: A DHL perspective on how to move beyond the hype", DHL Customer Solutions & Innovation, 12.2013.
[3] S. Robak, B. Franczyk, M. Robak, "Applying big data and linked data concepts in supply chains management", 2013 Federated Conference on Computer Science and Information Systems (FedCSIS), pp. 1215-1221, 2013.
[4] A. Katal, M. Wazid, R. H. Goudar, "Big data: Issues, challenges, tools and good practices", IEEE Sixth International Conference on Contemporary Computing (IC3), pp. 404-409, 08.2013.
[5] S. Ghemawat, H. Gobioff, S. T. Leung, "The Google File System", ACM SIGOPS Operating Systems Review, ACM, pp. 29-43, 08.2003.
[6] T. Berners-Lee, et al., "W3C Semantic Web Activity" [Online]. Available: http://www.w3.org/2001/sw, 2001.
[7] S. K. Rajapaksha, et al., "Internal Structure and Semantic Web Link Structure Based Ontology Ranking", ICIAFS 2008, 4th International Conference on Information and Automation for Sustainability, pp. 86-90, 2008.
[8] D. Pfisterer, et al., "SPITFIRE: Toward a Semantic Web of Things", IEEE Communications Magazine, vol. 49, no. 11, pp. 40-48, 2011.
[9] W. Hofman, "Supply Chain Visibility with Linked Open Data for Supply Chain Risk Analysis", Workshop on IT Innovations Enabling Seamless and Secure Supply Chains, pp. 20-31, 08.2011.
[10] Supply Chain Council (SCC), "Supply Chain Operations Reference (SCOR) Overview, Version 10.0", 2010, https://supply-chain.org/f/Web-Scor-Overview.pdf
[11] K. Bakshi, "Considerations for big data: Architecture and approach", 2012 IEEE Aerospace Conference, ISBN 978-1-4577-0556-4, pp. 1-7, 2012.
[12] C. Sooriaarachchi, T. Gunawardena, B. Kulasuriya, T. Dayaratne, "A study into the capabilities of NoSQL databases in handling a highly heterogeneous tree", IEEE 6th International Conference on Information and Automation for Sustainability (ICIAfS), ISBN 978-1-4673-1976-8, pp. 106-111, 2012.
[13] N. Marz and J. Warren, "Big Data: Principles and best practices of scalable realtime data systems", Manning Publications, MEAP Edition, Manning Early Access Program Big Data version 7, 2012.
[14] J. Xingyi, "Efficient Complex Event Processing over RFID Data Stream", Seventh IEEE/ACIS International Conference on Computer and Information Science, ISBN 978-0-7695-3131-1, pp. 75-81, 2008.
[15] B. Heaney, "Supply Chain Visibility – A Critical Strategy to Optimize Cost and Service", [Online] http://aberdeen.com/Aberdeen-Library/8509/RA-supply-chain-visibility.aspx, 2013.
[16] R. Siow Mong, Z. Wang, et al., "RiskVis: Supply Chain Visualization with Risk Management and Real-Time Monitoring", IEEE International Conference on Automation Science and Engineering (CASE), pp. 207-212, 2013.
[17] M. Barratt and A. Oke, "Antecedents of supply chain visibility in retail supply chains: A resource-based theory perspective", Journal of Operations Management, vol. 25, pp. 1217-1233, 2007.
[18] J. L. Griffiths, A. Phelan, K. A. Osman, and A. Furness, "Using item attendant information and communications technologies to improve supply chain visibility", Agile Manufacturing, ICAM 2007, pp. 172-180, 2007.
[19] P. Haase, "D1.1.5 – Updated Version of the Networked Ontology Model", EU FP6 Project NeOn: Lifecycle Support for Networked Ontologies, 2009.
[20] L. Kart, N. Heudecker, F. Buytendijk, "Survey Analysis: Big Data Adoption in 2013 Shows Substance Behind the Hype", Gartner, 2013.
[21] BITKOM-Arbeitskreis Big Data, "Big-Data-Technologien – Wissen für Entscheider" (in German), BITKOM, 2014.
[22] Capgemini Deutschland Holding GmbH, "Studie IT-Trends 2014: IT-Kompetenz im Management steigt" (in German), Capgemini Deutschland Holding GmbH, 2014.
[23] Effizienzcluster Management GmbH, "Smart Reusable Transport Items (smaRTI) – Intelligent Material Flow", [Online] http://www.effizienzcluster.de/en/leitthemen_projekte/projekt.php?proPid=5, 2014.
[24] N. Bubner, Ni. Bubner, R. Helbig, M. Jeske, "Logistics Trend Radar: Delivering insight today. Creating value tomorrow!", DHL Customer Solutions & Innovation, 2014.
[25] J. R. Spiegel, M. T. McKenna, G. S. Lakshman, P. G. Nordstrom, "Amazon US Patent Anticipatory Shipping", Amazon Technologies Inc., 12.2013.

