Data Quality Management for Industry 4.0: A Survey

Diamond Williams and Herman Tang

SQP VOL. 22 | NO. 2 | ©2020, ASQ | www.asq.org

DATA QUALITY SUMMARY

Effectively harnessing the potential of Big Data, cyber-physical systems (CPSs), and the Internet of Things (IoT) to drive successful transformation in manufacturing organizations requires high-quality data. These elements of Industry 4.0 introduce new challenges related to increasing dependence on enormous quantities of data, including minimizing biases in data collection, ensuring data accuracy, and standardizing processes to remove noise. This article surveys 51 papers from the academic and practitioner literature to identify how select applications of Industry 4.0 technologies can be impacted by data quality. The authors identified relevant characteristics and frameworks that reduce the potential for bad or poor data. This illustrates how essential it is for manufacturing organizations to invest in data quality assurance while integrating Industry 4.0 technologies.

KEY WORDS

Big Data, data assessment, data quality, Industry 4.0, Internet of Things (IoT), quality in use, SQuaRE

INTRODUCTION

Industry 4.0 introduces cyber-physical systems (CPSs) into production systems, increasing opportunities for intelligence, automation, and enhanced human-machine interaction. This enables identity and communication for every entity in the value stream and can facilitate IT-enabled mass customization in manufacturing (Sanders, Elangeswaran, and Wulfsberg 2016). When processes can more effectively use and incorporate data, workers and managers are better equipped to make sound, thorough, data-driven business decisions.

Industry 4.0 allows for faster response to customer needs by improving the flexibility, speed, productivity, and quality of the production process. Similarly, it lays the foundation for the adoption of new business models, production processes, and other innovations. This will enable a new level of mass customization as more industrial producers invest in Industry 4.0 technologies to enhance and customize their offerings (Rüßmann et al. 2015).

This article focuses on three elements of Industry 4.0: Big Data, CPSs, and the Internet of Things (IoT). These technologies make it possible to collect data about production environments that have previously been inaccessible. This leads to new opportunities to improve the quality of production processes, for example, by reliably forecasting errors from material handling machines and other processing equipment, and defects in products (Illés et al. 2017). At the same time, data quality issues can impede these goals. This article examines 51 papers from the academic and practitioner literature to provide actionable recommendations for ensuring data quality in Industry 4.0.

DATA QUALITY IN INDUSTRY 4.0 SCENARIOS

In manufacturing, Industry 4.0 technologies can be used to transform ordinary machines into self-aware machines that learn. This improves their overall performance while easing maintenance management and enabling life cycle optimization of assets (Imani and Radziwill 2020). These changes create an open, smart manufacturing platform for industrial-networked information, to include real-time data monitoring, tracking the status and positions of products, and maintaining instructions to control production processes (Vaidya, Ambad, and Bhosle 2018).

This article examines scenarios catalyzed by Industry 4.0 that require greater attention to data quality management: 1) the integration of Industry 4.0 into lean; 2) smart factories powered by CPSs; 3) IoT asset and maintenance optimization; and 4) improving the efficiency of the global supply chain.

INTEGRATION OF INDUSTRY 4.0 INTO LEAN

Lean is an established concept in manufacturing. Many organizations have adopted elements of the Toyota Production System (TPS), which emphasizes reducing waste, just-in-time (JIT) production, kanban, and other problem-solving approaches. Lean processes are more standardized and more transparent, and they reduce activities to only the essential work (Kolberg and Zühlke 2015). As Industry 4.0 begins to transform manufacturing organizations, lean tools are being used as entry points to lower the risks associated with change. For example:

JIT: JIT reduces overproduction by ensuring the necessary inventory for a process is available when it is needed. Automated guided vehicles (AGVs), for instance, can transport objects within the material flow automatically. The use of AGVs can minimize both human error and empty trips. CPSs and IoT monitor the condition of the transported goods and navigate the AGVs efficiently, leveraging self-organization to help build robust logistics networks for production (Mayr et al. 2018).

Kanban: Kanban promotes continuous flow of materials and information. In manufacturing, continuous material flow is achieved by maintaining a predefined stock level to guarantee an uninterrupted supply. Through simulation or virtual real-time representation of physical objects based on digital twins, new kanban cycles can be planned with greater foresight and seamlessly integrated into the existing production environment. Constant monitoring of work-in-process is made possible, while transparency of material movements is increased (Mayr et al. 2018).

Total productive maintenance (TPM): Several Industry 4.0 technologies can be used to empower operators. For example, virtual reality (VR), augmented reality (AR), and head-mounted displays can facilitate training and be used to deliver clear, context-rich maintenance instructions. Since maintenance can involve nonrecurring activities and tasks that are sensitive to changing conditions, these technologies can also be used to facilitate interaction with maintenance experts (Mayr et al. 2018).

The aim of lean integration within Industry 4.0 is to introduce data-reliant technologies that optimize the performance of existing systems. This transition requires companies to address the pedigree of the data being used. Without proper data quality management, even minor errors can accumulate, resulting in revenue loss, process inefficiency, and failure to comply with industry and government regulations (Kiel et al. 2017).

SMART FACTORY POWERED BY CYBER-PHYSICAL SYSTEMS

CPSs are collections of collaborating computational entities. They are embedded into the surrounding physical world and its ongoing processes, and they can provide and use services to access and process information that is available on local networks or the broader internet. CPSs are the foundational building blocks for Industry 4.0’s “smart factory,” and existing technologies can be retrofitted with CPS components.

CPSs can be generally characterized as “physical and engineered systems whose operations are monitored, controlled, coordinated, and integrated by a computing and communicating core” (Monostori et al. 2016). These researchers conceptualize a CPS in terms of a five-level architecture (5C) to guide deployment:

Smart connection: This is the foundation for a CPS. It requires the manufacturing organization to acquire reliable data from the machinery. Data can be retrieved directly from sensors, from fixed assets like CNC machines, or from software systems, including enterprise resource planning (ERP). A seamless and tether-free method to manage data acquisition and a reliable method for transferring data to the central server are required (Lee, Bagheri, and Kao 2015).

Data-to-information conversion: This is the second step. Raw data from the machinery are only as valuable as the actionable insights that can be gained from them.

Connectivity (cyber): This is the third step of deploying a CPS. The connected level acts as a central information hub. Information is pushed to the hub from every connected machine on the network, and analytics are used to extract additional insights that describe the status and projected states of individual machines within the system (Lee, Bagheri, and Kao 2015).

Cognition: This is the fourth step. At this level, proper presentation of analytic information is provided to users to make decisions. Maintenance tasks can be more easily determined and prioritized due to the availability of comparative information and individual machine statuses (Ahmadi et al. 2017).

Configuration: This is the fifth level of deploying a CPS. Configuration provides feedback from the cyber part to the physical part and performs the supervisory control to help machines self-configure and adapt. This layer acts as a “resilience control system” to apply the decisions that are made at the cognition level to the operation of the machines (Ahmadi et al. 2017).

Implementing a CPS within manufacturing transforms how data for decision-making are obtained, analyzed, and shared, creating opportunities for managers to make better decisions based on analysis of real-time performance. Making this possible is the intelligent network that supplements physical assets with a digital representation of that asset (the “digital twin”). The digital twin is meant to be provided with intelligence to enable self-awareness, self-comparison, self-management, and self-adaptivity at an operational and procedural level. Hence, the physical and cyber contexts are integrated to act synergistically (Longo, Nicoletti, and Padovano 2017). These systems generate large amounts of data, requiring sophisticated technology to handle processing and streaming. When all the details of the production process are transparent, it can be controlled and managed more efficiently (Yan et al. 2017).

The digital twin also enables simulation-based optimization (SBO), which can be used to perform experimental trials on equipment and policies without negatively impacting production. Nearly optimal solutions can be found, reducing the time-consuming aspects of creating many different what-if scenarios (Goienetxea Uriarte et al. 2018). New manufacturing facilities are being designed from the very beginning using SBO to support the transition from mass production to mass customization while preserving profitability (Zúñiga, Moris, and Syberfeldt 2017).

Standards for metadata are particularly critical for these scenarios. Without standards for metadata, it is a challenge to incorporate diverse data repositories with semantics for advanced data analytics (Vaidya, Ambad, and Bhosle 2018).

INTERNET OF THINGS ASSET AND MAINTENANCE OPTIMIZATION

Smarter machine maintenance may be one of the biggest advantages of introducing IoT and other CPSs into manufacturing environments. Corrective maintenance, preventive maintenance, and setup time can all be improved when tools, assets, and software are connected to the internet directly or through external adapters (Mourtzis, Vlachou, and Milas 2016).

With Industry 4.0 technologies, the time required to recognize issues and mobilize for corrective maintenance can be greatly reduced. In a smart factory, when a machine breaks down, it automatically sends alerts to relevant shop-floor and maintenance personnel, and sometimes autonomously takes action to correct the failure.

Preventive maintenance can increase the longevity and reduce the life cycle costs of machinery and other assets. With more advanced analytics employed on the Big Data from the machines, machines are equipped to assess their own health and degradation and can utilize data from other machines to avoid potential maintenance issues (Sanders, Elangeswaran, and Wulfsberg 2016). This may be particularly useful in resource-constrained environments, where preventive maintenance can be neglected.

Setup time can introduce added costs and variability to manufacturing processes, especially during gauging. Manufacturers rely on methodologies like single minute exchange of die (SMED) (also referred to as “quick changeover”) to shorten the time, identify non-value-added exchanges that took place during shift and product changes, and use problem-solving techniques to decrease the time needed to perform tasks. When radio-frequency identification (RFID) is used, operations to be performed on a part can be carried with the part. As it reaches a machine, it communicates which actions are required. This results in quicker changeover of machine parameters according to the instructions read from the part (Sanders, Elangeswaran, and Wulfsberg 2016).

The enhanced connectedness of Industry 4.0 will make it possible to gather and analyze data across machines, enabling faster, more flexible, and more efficient decision making. This in turn will reduce costs (that can be passed along to the customer via price), increase manufacturing productivity, foster industrial growth, and enhance the competitiveness of companies and regions (Rüßmann et al. 2015).

High-quality data are required for intelligent planning and control of machines due to the real-time nature of the tasks (Vaidya, Ambad, and Bhosle 2018). For maintenance activities in particular, it can be difficult to collect, clean, integrate, and obtain the necessary high-quality data within a reasonable time frame to generate actionable insights for operations (Cai and Zhu 2015). While it is more difficult (and sometimes impossible) to enhance aging equipment with Industry 4.0 technologies, operators know how the older machinery works. New machinery may be better able to support intelligent algorithms and connectivity; however, the time and effort to train and retool operators with the requisite data literacy could be prohibitive.

IMPROVING EFFICIENCY OF THE GLOBAL SUPPLY CHAIN

Collecting information about all the components in a production system (for example, products, machines, material handling equipment, workers) using IoT can be used to improve the efficiency of logistics (Illés et al. 2017). When deployed in conjunction with intelligent systems, IoT can learn from tracking past behaviors. Improvement is accomplished by greater levels of tracking and control of inventories, which in turn leads to decreased delays in fulfillment. This can impact the entire supply chain, from procurement of raw materials to production of the product, logistics, and distribution to customers (Jayaram 2016). In traditional supply chain management, overstocking, delivery delays, and stockouts are common. Software systems and Industry 4.0 solutions reduce the complexity and uncertainty in real supply chains by making information more visible (Abdel-Basset, Manogaran, and Mohamed 2018).

Pull systems, which help companies produce based on actual customer demand instead of forecast demand, are greatly enabled by Industry 4.0. Physical inventory levels can be controlled using visual management (for example, number of parts below threshold), triggers in ERP systems, or e-kanban systems (which can recognize missing and empty bins automatically from sensor data). Response time becomes shorter with software triggers, and shorter yet again with the Industry 4.0 solution. Charging levels of the bin also can be monitored, and data can be transmitted wirelessly to an inventory control system in real time. If the real inventory matches the value in the manufacturing execution system, faults in production control due to lost kanban can be avoided (Sanders, Elangeswaran, and Wulfsberg 2016).

Data acquisition is a particularly challenging task in these scenarios due to data coming from a broad variety of technologies, machines, sensors, CPSs, IoT devices, and communication networks.

DATA QUALITY MANAGEMENT

For Industry 4.0, the aforementioned scenarios illustrate that data are the vehicles that drive and sustain its impact. Therefore, the greatest challenge for Industry 4.0 is ensuring the integrity of the data themselves. None of the insights made possible by a highly connected, intelligent system will be valuable unless the data that are recorded from the manufacturing system and its IoT or CPS endpoints are sound.

The Big Data that are produced in manufacturing environments are by nature heterogeneous, and data structures can be complex. Errors, missing information, inconsistencies, noise, and related issues covered in previous sections can impact the quality of the insights that are generated from those data. Data must meet their users’ needs and be fit for purpose, that is, of high enough quality to generate useful insights (Taleb, Serhani, and Dssouli 2018). According to ISO 25012, data quality is defined as “the capability of data to satisfy stated and implied needs when used under specified conditions” (ISO 2008).

Data quality management is, as a result, becoming increasingly important. A systematic, quality-oriented approach to data management treats data as an asset and values them through their life cycle stages from provisioning to archival and disposal (Silvola et al. 2016). Data quality management is a holistic practice that only begins with initial data quality assessment:

“While measurement is an integral part of the data quality journey, data quality management involves much more than measurement. It also involves the management of people, processes, policies, technology, standards, and data within an enterprise. Data quality management is data, people, process, and technology intensive, with data being at the core, and as such to succeed, would need all these elements to work in an integrated manner to ensure success.” (Mahanti 2019)

Without proper data quality management, even minor errors can accumulate, potentially resulting in revenue loss, process inefficiency, and failure to comply with industry and government regulations (Kiel et al. 2017). Data quality management thus focuses on providing superior value to the customer while indirectly improving the efficiency of processes. Continuous improvement leads to increased revenues through product reliability and reduced costs through process efficiency. As a result, customer satisfaction leads to increased revenues that enable the firm to gain a competitive advantage (Barrera, Navarro-Garcia, and Moreno 2015).

Production processes often operate on input obtained from external sources, which are frequently unknown in advance. Consequently, the overall quality of the data within a complex information system can disintegrate due to a lack of process controls. Because this potential increases as the variety of data types expands, process controls on data quality are critical. Most significantly, to ensure high-quality data, companies need to gain broad commitment at all levels of the organization to data quality management principles and develop processes and programs that reduce data defects over time (Brous, Janssen, and Vilminko-Heikkinen 2016).

DATA QUALITY METHODOLOGY

If decisions based on the given data are to be trusted by the organization, the data quality must be acceptable given the organization’s risk appetite (Marchildon et al. 2018). There are four core activities that are used to make this happen: 1) state reconstruction and data profiling; 2) data quality measurement or assessment; 3) data cleansing or improvement; and 4) the establishment of continuous data quality monitoring.

State reconstruction and data profiling: These are the activities and processes to determine the context and metadata of a given dataset (Abedjan, Golab, and Naumann 2017). State reconstruction describes the collection of contextual information on the observed data as well as on the organization where a data quality project is carried out (Ehrlinger, Rusz, and Wolfram 2019). Data profiling is essential prior to any data quality measurement or monitoring activity. Information gathered during data profiling includes distinct or missing (null) values, data types of attributes, or occurring patterns and their frequency (Ehrlinger et al. 2019).

Data quality measurement or assessment: Measurement focuses on the measurement capabilities of data quality tools, independent of how the results are interpreted by a user. Assessment extends the concept of measurement by evaluating the measurement results and drawing a conclusion about the object of assessment. This stage provides an initial estimation of data quality and an impact analysis of data quality problems (Ehrlinger, Rusz, and Wolfram 2018).

Data cleansing: Data cleansing (also known as cleaning, scrubbing, or wrangling) is the process of detecting and correcting incorrect, inaccurate, irrelevant, or improperly formatted records in a record set, table, or database. This may require replacing, modifying, or deleting dirty or coarse data (Patel 2015). The purpose is to detect and remove errors and inconsistencies (Cai and Zhu 2015).

Data quality monitoring: This describes continuous monitoring of data according to user-defined constraints (Patel 2015).

The suitability and usefulness of data must be evaluated in the context in which the data are supposed to be used (Gudivada, Apon, and Ding 2017; Juddoo 2015). Since inadequate and inaccurate data at facilities that are data driven can create highly visible failures in manufacturing processes and issues with product quality, without data quality assessment, organizations cannot have a proper understanding of how data quality impacts their ability to satisfy the market (Mirzaie, Behkamal, and Paydar 2019).

A data quality methodology that includes these four stages is important for Industry 4.0 because it ensures data and metadata best reflect the current-state conditions of processes and machinery. With sound information, it becomes easier to understand anomalies within systems proactively, improving overall efficiency and performance.

DATA TYPES

Due to the large variety of sources from which data are collected and integrated, coupled with their sheer volume and changing nature, it is often impossible to globally specify data quality rules. Big Data come with a major promise: having more data allows the “data to speak for themselves” instead of relying on unproven assumptions, weak correlations, or inference (Saha and Srivastava 2014). Data types are usually structured, unstructured, or semi-structured.

Unstructured data have a structure that is fluid or flexible and may change from record to record. They may be textual or non-textual, human or machine generated, and stored in a non-relational database like NoSQL (Taleb, Serhani, and Dssouli 2018).

Structured data are stored in regular, predictable formats that ease retrieval and problem solving. Relational databases store data in formats specified by data models.

Semi-structured data neither conform to the formal structure of data models associated with relational databases nor are readied for computation through tags or metadata. This includes natural language texts, newspaper articles, conversation records, comments, books, and similar (Batini et al. 2016).

Because the proportion of unstructured to structured data in industrial applications is very high, it will take a lot of time to transform unstructured types into structured types and further process the data. This is a great challenge to existing techniques for data processing. Data change very fast, and the timeliness of data is very short, which necessitates more intensive requirements for processing (Cai and Zhu 2015).

DATA QUALITY DIMENSIONS

Data quality is a factor throughout the life cycle of the data, including collection, processing, storage, internal use, external disclosure, and use in downstream applications (Hoeren 2018). Each quality dimension needs different measurement tools, techniques, and processes, which leads to differences in assessment times, costs, and human resources.

When data are fit for use, they are able to meet the requirements of users and stakeholders. For this to occur, different quality dimensions must be met. The dimensions that are most relevant to the Industry 4.0 scenarios described previously are (Cappiello, Sama, and Vitali 2018; Cai and Zhu 2015):

Accuracy refers to data being correct and concise, defined as the degree to which a value is correct.

Completeness refers to data being comprehensive. It is evaluated as the ratio of the amount of data that have been collected to the amount of data that could be collected.

Timeliness refers to ensuring data are current and have not become irrelevant to the problem at hand.

Readability means the data can be easily explained and understood.

Consistency refers to the violation of semantic rules defined over a set of data items. Therefore, this dimension can be calculated only if a set of rules that represent dependencies between attributes is available.

Integrity means that data have a complete structure, and values are standardized according to a data model and/or data type. All characteristics, including business rules, relations, dates, and definitions, must be correct for the data to have integrity.

Distinctness is related to the absence of duplicates and measures the percentage of unique or distinct attributes in a dataset.

Precision can be calculated only for numerical attributes and can be defined as the degree to which the values of an attribute are close to each other.

Accessibility refers to the ease with which stakeholders can obtain data.

It was seen as most important that item data be securely maintained, easily accessible, originate from a credible source, and be accurate and free of errors, while still providing advantages from their use (Saha and Srivastava 2014). Still, data quality dimensions of completeness, interpretability, reputation, and timeliness were rated nearly as important in relation to item data by Silvola et al. (2016).

For assessing Big Data quality, it is not sufficient to consider the traditional data quality dimensions alone: additional dimensions need to be considered. For instance, the large number of sources makes trust, credibility, and data provenance important. Considering data providers and all the operations or applications that manipulate data is essential for evaluating data quality in context (Cappiello et al. 2019).

DATA QUALITY GOVERNANCE

To properly improve and maintain the quality of their data, organizations should implement data governance. Data governance is a set of best practices and controls to actively manage and improve data quality. In particular, it aims to guide processes, policies, practices, and structures to orchestrate people, processes, and technologies toward optimizing the collection, storage, use, and dissemination of data (Ibrahim, Sammon, and Daly 2016). This process-oriented framework embeds data quality activities such as planning, cleaning, profiling, assessing, issue tracking, and monitoring into all relevant operational processes (Marchildon et al. 2018) and establishes the extent to which data are considered an enterprise-wide asset (Brous, Janssen, and Vilminko-Heikkinen 2016).

This step is distinct from data quality management. Governance refers to decision-making and strategic ownership of data quality to ensure effective management and use of resources, while management implements those decisions (Ibrahim, Sammon, and Daly 2016). Thus, governance defines roles, decision areas, and responsibilities, and identifies whether specialists must be hired, trained, and integrated into the organization to meet its data quality goals (Brous, Janssen, and Vilminko-Heikkinen 2016). Communities of practice for governance, stewardship, and information management should also be established at this time (Koltay et al. 2016).

Data governance reduces operational friction and encourages the adoption of common approaches to data issues. This is particularly important in Industry 4.0 environments, where workers may not be as accustomed to working extensively with software and data. Establishing a solid foundation through governance helps identify the key process owners and the changes needed to reduce the potential for data quality issues and ensure the pedigree of the data remains consistent.

Khatri and Brown (2010) proposed a data governance framework with five concerns, which are further elaborated by Ibrahim, Sammon, and Daly (2016):

Data principles: Establish ground rules for all other decisions. The principles set the boundary requirements for acceptable use of data assets, which in turn inform standards for data quality.

Data quality: Establishes practices for assessing data quality dimensions.

Metadata: Describes information about who, what, when, where, and how the data were collected, or any contextual information that illuminates how the data can be used.

Data access: Identifies who should have access to the data.

Data life cycle: Defines the production, retention, and retirement of data assets, which plays a fundamental role in operationalizing the data quality principles.

Since Khatri’s initial proposal of this data governance framework, the Data Governance Institute (DGI) has expanded the concept of governance. Table 1 shows its recommended structure.

DATA QUALITY STANDARDS

The ISO 25000 series of standards (referred to as SQuaRE, for Software product Quality Requirements and Evaluation) can be used “to define and evaluate data quality requirements in data production, acquisition and integration processes; to identify data quality assurance criteria... for reengineering, assessment and improvement of data; [and] to evaluate the compliance of data with legislation and/or requirements” (ISO 2008; Basson et al. 2016).

The data quality model in ISO/IEC 25012 is particularly well suited for Big Data in manufacturing environments because it is sensitive to the relative nature of data quality evaluated for different production processes. While some data may be of high quality for one process, they may not be well suited to drive insights from another. In ISO/IEC 25012, inherent data quality refers to the degree to which the quality characteristics of data have the intrinsic potential to satisfy stated and implied needs when data are used under specified conditions. System-dependent data quality addresses fitness for use in the context of a given process (Merino et al. 2015). As an extension, ISO/IEC 25024 provides general

Table 1 Data Governance Institute (DGI) framework.

Area Application
1. Policy/Standards/Strategy Review, approve, monitor policy
Collect, choose, review, approve, monitor standards
Align policies and standards
Contribute to business rules
Contribute to data strategies
Identify stakeholders and establish decision rights
2. Data Quality Set direction for data quality
Monitor data quality
Report status for quality-focused initiatives
Identify stakeholders, establish decision rights, clarify accountabilities
3. Privacy/Compliance/ Help protect sensitive data through support for access management and security requirements
Security Align frameworks and initiatives
Help assess risk and define controls to manage risk
Help enforce regulatory, contractual, architectural compliance requirements
Identify stakeholders, establish decision rights, clarify accountabilities
4. Architecture/Integration Ensure consistent data definitions
Support architectural policies and standards
Support metadata programs, service-oriented architecture (SOA), Master data management (MDM),
enterprise data management (EDM)
Bring cross-functional attention to integration challenges
Identify stakeholders, establish decision rights, clarify accountabilities
5. Data Warehouse & Establish rules for data usage and data definitions
Business Intelligence Identify stakeholders, establish decision rights, clarify accountabilities
Identify SDLC embedded governance steps and loop-outs for projects
Clarify the value of data assets and data-related projects
6. Management Support Measure the value of data and data-related efforts
Align frameworks and initiatives
Identify stakeholders, establish decision rights, clarify accountabilities
Identify SDLC embedded governance steps and loop-outs for projects
Monitor and report on data-related projects
Promote data-related messages, positions, and principles
©2020 ASQ
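Area 2 of the DGI framework calls for setting direction for data quality, monitoring it, and reporting status. The following sketch shows what a minimal monitoring routine of that kind might look like; it is illustrative only and not part of the DGI framework itself, and the record fields, rules, and pass-rate threshold are assumptions made for the example.

```python
# Illustrative data quality monitoring sketch (DGI Area 2: monitor data
# quality, report status). Fields, rules, and threshold are assumed.
from dataclasses import dataclass

@dataclass
class Rule:
    name: str
    check: callable  # predicate returning True when a record passes

def monitor(records, rules, threshold=0.95):
    """Return {rule name: (pass rate, meets threshold)} for each rule."""
    report = {}
    for rule in rules:
        passed = sum(1 for r in records if rule.check(r))
        rate = passed / len(records) if records else 1.0
        report[rule.name] = (rate, rate >= threshold)
    return report

records = [
    {"machine_id": "M1", "temp_c": 71.2},
    {"machine_id": "M2", "temp_c": None},   # missing reading
    {"machine_id": "", "temp_c": 69.8},     # missing identifier
]
rules = [
    Rule("temp_present", lambda r: r["temp_c"] is not None),
    Rule("machine_id_present", lambda r: bool(r["machine_id"])),
]
for name, (rate, ok) in monitor(records, rules).items():
    print(f"{name}: {rate:.0%} {'OK' if ok else 'BELOW THRESHOLD'}")
```

In a governance program, the reported pass rates would feed the status reporting and stakeholder accountability activities listed in Table 1 rather than stand alone.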

measures to quantify the external and internal quality of the data in compliance with the characteristics from ISO/IEC 25012 (Merino et al. 2015).

Because business value can only be assessed with respect to context of use in Industry 4.0, Caballero, Serrano, and Piattini (2014) propose a “3Cs” model to evaluate the quality-in-use of big datasets: contextual consistency, operational consistency, and temporal consistency. For each of these categories, they select appropriate data quality dimensions to evaluate the quality of Big Data in terms of their volume, velocity, and variety (see Table 2).

By design, the definitions in ISO/IEC 25012 are quite generic as well, leaving space for an organization’s specific context of use and opening space for further research. Because research into the practical applications of quality for Big Data in manufacturing environments is preliminary, the scale of these systems and the potential for poor data to have widespread impacts should compel organizations to adopt relevant standards (Laranjeiro, Soydemir, and Bernardino 2015).

CONCLUSIONS

This article examined the relationship between Industry 4.0 scenarios and data quality management approaches. By exploring 51 papers, the authors identified key characteristics and frameworks that are applicable for ensuring data quality in modern manufacturing organizations. As these facilities continue to become more reliant on data to monitor and streamline operations, optimizing the performance and longevity of their assets, they should be cognizant of how information is acquired, processed, and used to make business decisions.
32 | SQP VOL. 22 | NO. 2 | ©2020, ASQ

Table 2 Linking data quality dimensions to ISO/IEC 25012 and characteristics of Big Data. (Caballero, Serrano, and Piattini 2014)

Contextual Consistency
  Velocity: Consistency, Credibility, Confidentiality
  Volume: Completeness, Credibility
  Variety: Accuracy, Consistency, Understandability

Temporal Consistency
  Velocity: Consistency, Credibility, Currentness, Availability
  Volume: Availability
  Variety: Consistency, Currentness, Compliance

Operational Consistency
  Velocity: Completeness, Accessibility, Efficiency, Traceability, Availability, Recoverability
  Volume: Completeness, Accessibility, Efficiency, Availability, Recoverability
  Variety: Accuracy, Accessibility, Compliance, Efficiency, Precision, Traceability, Availability, Recoverability

©2020 ASQ
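To make the dimensions in Table 2 concrete, the sketch below shows one way a batch of sensor records might be scored on completeness, currentness, and consistency. This is a hedged illustration: ISO/IEC 25012 defines the characteristics but does not prescribe these formulas, and the field names, plausible-value range, and freshness window are assumptions for the example.

```python
# Illustrative measures for three Table 2 dimensions; field names and
# thresholds are assumptions, not prescribed by ISO/IEC 25012.
import time

def completeness(records, required):
    """Fraction of records with all required fields present and non-null."""
    ok = sum(1 for r in records if all(r.get(f) is not None for f in required))
    return ok / len(records) if records else 1.0

def currentness(records, now, max_age_s=60.0):
    """Fraction of records no older than max_age_s (a velocity concern)."""
    ok = sum(1 for r in records if now - r["ts"] <= max_age_s)
    return ok / len(records) if records else 1.0

def consistency(records, low, high):
    """Fraction of readings inside a physically plausible range."""
    ok = sum(1 for r in records if low <= r["value"] <= high)
    return ok / len(records) if records else 1.0

now = time.time()
records = [
    {"ts": now - 5, "value": 70.1, "unit": "C"},
    {"ts": now - 300, "value": 69.5, "unit": "C"},   # stale record
    {"ts": now - 10, "value": -400.0, "unit": "C"},  # impossible reading
]
print(completeness(records, ["ts", "value", "unit"]))  # 1.0
print(currentness(records, now))                       # ~0.67
print(consistency(records, low=-50.0, high=150.0))     # ~0.67
```

Scores like these can then be aggregated per category of the 3Cs model to characterize quality-in-use for a given context.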

Some specific recommendations from this survey include:

• Data quality management is an ongoing process. Companies need to gain broad commitment at all levels of the organization to data quality management principles and develop processes and programs that continually reduce data defects over time.
• Establishing a framework to ensure the quality of data at its source (IoT/CPS endpoints) is critical to reduce the likelihood of poor data quality throughout the production system.
• Process controls on data quality are also important. In addition to studying the material flow in manufacturing processes, data flow should be treated as just as important as Industry 4.0 technologies continue to be integrated.
• Manufacturing facilities should consider applying standards for data quality like ISO/IEC 25012 or the 3Cs model derived from this standard. These have traditionally only been used in software organizations, and while they are unproven in manufacturing settings, they hold great potential.

Poor data quality has a cascading effect within all organizations that can spread through the global supply chain. The compounding impact of poor data quality can not only lead to erroneous business decisions and low efficiency but also to reduced credibility and loss of reputation.
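The recommendation to ensure quality at the source can be sketched as a validation step that runs at the IoT/CPS endpoint before a reading ever enters the production data stream. The schema, plausible range, and rejection policy below are illustrative assumptions, not a prescribed implementation.

```python
# Sketch of quality-at-source validation for an IoT/CPS endpoint.
# The expected schema and value range are assumptions for the example.
EXPECTED_FIELDS = {"sensor_id": str, "ts": float, "value": float}

def validate_at_source(reading):
    """Return (ok, reason); reject malformed readings before transmission."""
    for field, ftype in EXPECTED_FIELDS.items():
        if field not in reading:
            return False, f"missing field: {field}"
        if not isinstance(reading[field], ftype):
            return False, f"wrong type for {field}"
    if not (-50.0 <= reading["value"] <= 150.0):  # assumed plausible range
        return False, "value out of range"
    return True, "ok"

good = {"sensor_id": "S1", "ts": 1700000000.0, "value": 72.4}
bad = {"sensor_id": "S1", "ts": 1700000000.0, "value": 9999.0}
print(validate_at_source(good))  # (True, 'ok')
print(validate_at_source(bad))   # (False, 'value out of range')
```

Rejecting or quarantining defective readings at the endpoint keeps downstream process controls focused on genuine process variation rather than transmission artifacts.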

REFERENCES

Abdel-Basset, M., G. Manogaran, and M. Mohamed. 2018. Internet of Things (IoT) and its impact on supply chain: A framework for building smart, secure and efficient systems. Future Generation Computer Systems 86, 614-628.

Abedjan, Z., L. Golab, and F. Naumann. 2017. Data profiling: A tutorial. In SIGMOD ’17: Proceedings of the 2017 ACM International Conference on Management of Data, 1747-1751.

Ahmadi, A., C. Cherifi, V. Cheutet, and Y. Ouzrout. 2017. A review of CPS 5 components architecture for manufacturing based on standards. In 2017 11th International Conference on Software, Knowledge, Information Management and Applications (SKIMA). New York: IEEE, 1-6.

Barrera, R., A. Navarro-García, and M. Moreno. 2015. The relationship between the online consumer’s profile and the type of service encounter in the online travel agencies. In Integrated Management Systems. New York: Springer.

Basson, H., M. Bouneffa, M. Matsuda, A. Ahmad, D. Chung, et al. 2016. Qualitative evaluation of manufacturing software units interoperability using ISO 25000 quality model. In I-ESA 2016, 8th International Conference Interoperability for Enterprise Systems and Applications, Guimarães, Portugal.

Batini, C., A. Rula, M. Scannapieco, and G. Viscusi. 2016. From data quality to big data quality. In Big Data: Concepts, Methodologies, Tools, and Applications, 1934-1956. Hershey, PA: IGI Global.

Brous, P., M. Janssen, and R. Vilminko-Heikkinen. 2016. Coordinating decision-making in data management activities: A systematic review of data governance principles. In International Conference on Electronic Government, 115-125. Springer, Cham.

Caballero, I., M. Serrano, and M. Piattini. 2014. A data quality in use model for big data. In International Conference on Conceptual Modeling, 65-74. Springer, Cham.

Cai, L., and Y. Zhu. 2015. The challenges of data quality and data quality assessment in the big data era. Data Science Journal 14.

Cappiello, C., M. Comuzzi, F. Daniel, and G. Meroni. 2019. Data quality control in blockchain applications. In International Conference on Business Process Management, 166-181. Springer, Cham.


Cappiello, C., W. Samá, and M. Vitali. 2018. Quality awareness for a successful big data exploitation. IDEAS 2018.

Ehrlinger, L., E. Rusz, and W. Wolfram. 2019. A survey of data quality measurement and monitoring tools.

Goienetxea Uriarte, A., A. H. Ng, and M. Urenda Moris. 2018. Supporting the lean journey with simulation and optimization in the context of Industry 4.0. Procedia Manufacturing 25, 586-593.

Gudivada, V., A. Apon, and J. Ding. 2017. Data quality considerations for big data and machine learning: Going beyond data cleaning and transformations. International Journal on Advances in Software 10, 1-20.

Hoeren, T. 2018. Big data and data quality. In Big Data in Context, T. Hoeren and B. Kolany-Raiser, eds. SpringerBriefs in Law. Springer, Cham.

Ibrahim, A., D. Sammon, and M. Daly. 2016. Data governance activities: An analysis of the literature. Journal of Decision Systems 25, 64-75.

Illés, B., P. Tamás, P. Dobos, and R. Skapinyecz. 2017. New challenges for quality assurance of manufacturing processes in Industry 4.0. Solid State Phenomena 261, 481-486. Switzerland: Trans Tech Publications.

Imani, M., and N. Radziwill. 2020. Computerized maintenance and asset management: Improving quality through systems integration. Software Quality Professional 22, no. 2 (March).

ISO. 2008. ISO/IEC 25012:2008. Software engineering — Software product quality requirements and evaluation (SQuaRE) — Data quality model. Available at: https://www.iso.org/standard/35736.html.

Jayaram, A. 2016. Lean six sigma approach for global supply chain management using Industry 4.0 and IIoT. In 2016 2nd International Conference on Contemporary Computing and Informatics (IC3I). New York: IEEE, 89-94.

Juddoo, S. 2015. Overview of data quality challenges in the context of Big Data. International Conference on Computing, Communications, and Security (December).

Khatri, V., and C. V. Brown. 2010. Designing data governance. Communications of the ACM 53, no. 1:148-152.

Kiel, D., J. M. Müller, C. Arnold, and K. I. Voigt. 2017. Sustainable industrial value creation: Benefits and challenges of Industry 4.0. International Journal of Innovation Management 21, no. 8.

Kolberg, D., and D. Zühlke. 2015. Lean automation enabled by Industry 4.0 technologies. IFAC-PapersOnLine 48, no. 3:1870-1875.

Koltay, T. 2016. Data governance, data literacy and the management of data quality. IFLA 42, no. 4:303-312.

Laranjeiro, N., S. N. Soydemir, and J. Bernardino. 2015. A survey on data quality: Classifying poor data. In 2015 IEEE 21st Pacific Rim International Symposium on Dependable Computing (PRDC). New York: IEEE, 179-188.

Lee, J., B. Bagheri, and H. A. Kao. 2015. A cyber-physical systems architecture for Industry 4.0-based manufacturing systems. Manufacturing Letters 3, 18-23.

Longo, F., L. Nicoletti, and A. Padovano. 2017. Smart operators in Industry 4.0: A human-centered approach to enhance operators’ capabilities and competencies within the new smart factory context. Computers and Industrial Engineering 113, 144-159.

Mahanti, R. 2019. Data quality and data quality dimensions. Software Quality Professional 22, no. 1.

Marchildon, P., S. Bourdeau, P. Hadaya, and A. Labissière. 2018. Data governance maturity assessment tool: A design science approach. Projectics / Proyéctica / Projectique 20, no. 2:155-193.

Mayr, A., M. Weigelt, A. Kühl, S. Grimm, A. Erll, M. Potzel, and J. Franke. 2018. Lean 4.0 - A conceptual conjunction of lean management and Industry 4.0. Procedia CIRP 72, 622-628.

Merino, J., I. Caballero, B. Rivas, M. Serrano, and M. Piattini. 2015. A data quality in use model for big data. Future Generation Computer Systems.

Mirzaie, M., B. Behkamal, and S. Paydar. 2019. Big data quality: A systematic literature review and future research directions.

Monostori, L., B. Kádár, T. Bauernhansl, S. Kondoh, S. Kumara, G. Reinhart, et al. 2016. Cyber-physical systems in manufacturing. CIRP Annals 65, no. 2:621-641.

Mourtzis, D., E. Vlachou, and N. Milas. 2016. Industrial big data as a result of IoT adoption in manufacturing. Procedia CIRP 55, 290-295.

Patel, J. 2015. Data quality management – Tools and techniques. International Journal of Computer Science and Communications 7, no. 1. Available at: http://csjournals.com/IJCSC/PDF7-1/29.%20Jigna.pdf.

Rüßmann, M., M. Lorenz, P. Gerbert, M. Waldner, J. Justus, P. Engel, and M. Harnisch. 2015. Industry 4.0: The future of productivity and growth in manufacturing industries. Boston Consulting Group 9, no. 1:54-89.

Saha, B., and D. Srivastava. 2014. Data quality: The other face of big data. In Proceedings of the International Conference on Data Engineering, 1294-1297.

Sanders, A., C. Elangeswaran, and J. P. Wulfsberg. 2016. Industry 4.0 implies lean manufacturing: Research activities in Industry 4.0 function as enablers for lean manufacturing. Journal of Industrial Engineering and Management 9, no. 3:811-833.


Silvola, R., J. Harkonen, O. Vilppola, H. Kropsu-Vehkapera, and H. Haapasalo. 2016. Data quality assessment and improvement. International Journal of Business Information Systems 22, no. 1:62-81.

Taleb, I., M. A. Serhani, and R. Dssouli. 2018. Big data quality assessment model for unstructured data. In 2018 International Conference on Innovations in Information Technology (IIT). New York: IEEE, 69-74.

Thomas, G. 2014. The DGI data governance framework. White paper. Available at: http://www.datagovernance.com/wp-content/uploads/2014/11/dgi_framework.pdf.

Vaidya, S., P. Ambad, and S. Bhosle. 2018. Industry 4.0 – A glimpse. Procedia Manufacturing 20, 233-238.

Yan, J., Y. Meng, L. Lu, and L. Li. 2017. Industrial big data in an Industry 4.0 environment: Challenges, schemes and applications for predictive maintenance. IEEE Access.

Zhou, K., T. Liu, and L. Zhou. 2015. Industry 4.0: Towards future industrial opportunities and challenges. In 2015 12th International Conference on Fuzzy Systems and Knowledge Discovery (FSKD). New York: IEEE, 2147-2152.

Zúñiga, E. R., M. U. Moris, and A. Syberfeldt. 2017. Integrating simulation-based optimization, lean, and the concepts of Industry 4.0. In 2017 Winter Simulation Conference (WSC). New York: IEEE, 3828-3839.

BIOGRAPHIES

Diamond Williams works as a data engineer for ENGIE and operates ZaDor Consulting, LLC. She graduated from Missouri University of Science &
Technology with a bachelor’s degree in engineering management/industrial engineering and is completing a master’s degree in quality manage-
ment from Eastern Michigan University. Diamond is a member of NSBE, ASQ, and OSDC. She can be reached by email at dwill267@emich.edu.

Herman Tang is an associate professor at the School of Engineering, Eastern Michigan University. His research experience is in the areas of mechanical, manufacturing, and quality engineering. Tang has published three technical books and many scholarly journal papers. He has also served as an associate editor and a reviewer for several scholarly journals and conferences, and as a panelist for NSF. Tang earned his doctorate in mechanical engineering from the University of Michigan–Ann Arbor, a master’s degree and a bachelor’s degree in mechanical engineering from Tianjin University, and an MBA in industrial management from Baker College. Tang is a member of SAE, SME, ASME, and ASQ. He can be reached by email at htang2@emich.edu.

