
Data Quality Cockpit using SAP Information Steward and Data Services - Part 1

Setting up Data Quality Cockpit - Understanding the data quality
In our engagements with customers, we often find that the quality of master data is a major issue, and customers are keen to tackle it by implementing master data solutions. It is frequently believed that MDM/MDG solves all problems related to master data and that implementing it is sufficient to manage data quality. While it does solve the problem to some extent in terms of managing and governing master data, data quality issues are not completely resolved, because tools meant for managing master data are not good at cleansing and enriching the data. SAP has introduced two other tools, SAP Information Steward and SAP Data Services, primarily to tackle this issue of data quality. These two tools can complement the master data solution or work independently to provide a comprehensive solution for data quality management.
Below is an example of a customer situation and how we helped solve it. The customer's master data was unclean, redundant and full of duplicates, and instead of an industry-standard master data package they used a homegrown application to manage their master data processes.
Business users complained of:
1. Rules insufficient to identify bad data
2. No holistic view of data quality in the productive database
3. Data in the productive database degrading over time
4. Periodic DQ checks on the productive database not being possible due to tool limitations
5. Duplicates existing despite the rules
Technical support complained of:
1. Lack of scalability and extensibility
2. Significant effort going into building custom tools, with no funding
3. No periodic upgrades and a risk of the technology becoming outdated
4. No out-of-the-box features, so everything has to be coded as needed
A standard MDM package would have done only half the job, as MDM is meant to consolidate, centralize and harmonize master data between various connected systems, but lacks in-depth data cleansing, transformation, enrichment and ongoing data quality assessment capabilities. The crux of the problem was enabling the business to control, monitor and maintain the quality of data on a continuous basis while at the same time leveraging the client's existing master data setup.
The solution implemented was a combination of SAP Information Steward and SAP Data Services, which could be leveraged in the client's existing landscape with minimal disruption to the established business processes. SAP Info Steward provided the right tools to understand the fundamental areas of the problem and so know where to focus the solution. This, along with the ETL capabilities of SAP Data Services, provided the complete solution.
The profiling features, together with the DQ Scorecard, were the first step to solving the DQ problem. The out-of-the-box profiling feature was used to understand the quality of data in terms of fill rate, dependencies, address validity, uniqueness and redundancy in the source data files. Rules gathered from the existing process, plus additional business rules, provided the definition of what qualified as good quality data. These rules were configured in the SAP IS rule engine using the rule definition and binding features. Weights were assigned to the different quality dimensions such as completeness, uniqueness and conformity. By connecting SAP IS to the data staging environment and running the rules against the source data, we arrived at a scorecard for the as-is quality of data in staging. The score was low, with many failed records that did not meet the criteria defined for standardization and cleanliness. It also showed which dimensions of the data had the most issues.
This gave data stewards, analysts and information governance experts a very good understanding of where their data quality stood and where to focus remediation to get the maximum benefit.
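To make the scorecard idea concrete, here is a minimal, tool-independent sketch of how a weighted score across quality dimensions can be computed. The rule names, dimension weights and sample records are illustrative assumptions, not the customer's actual rules or the SAP IS rule engine:

# Conceptual sketch of a weighted data quality scorecard (illustrative only).
SAMPLE_RECORDS = [
    {"id": "1001", "name": "ACME Corp", "country": "US", "postal_code": "10001"},
    {"id": "1002", "name": "",          "country": "US", "postal_code": "1000"},
    {"id": "1003", "name": "Acme Corp", "country": "",   "postal_code": "10001"},
]

# Each rule belongs to a quality dimension and returns True when a record passes.
RULES = {
    "completeness": [
        lambda r: bool(r["name"]),             # name must be filled
        lambda r: bool(r["country"]),          # country must be filled
    ],
    "conformity": [
        lambda r: len(r["postal_code"]) == 5,  # assumed postal-code format rule
    ],
}

# Relative importance of each dimension in the overall score (assumed weights).
WEIGHTS = {"completeness": 0.6, "conformity": 0.4}

def dimension_score(records, dim_rules):
    """Share of rule evaluations that pass for one dimension."""
    checks = [rule(rec) for rec in records for rule in dim_rules]
    return sum(checks) / len(checks)

def overall_score(records):
    """Weighted average of dimension scores - the 'scorecard' value."""
    return sum(WEIGHTS[dim] * dimension_score(records, dim_rules)
               for dim, dim_rules in RULES.items())

if __name__ == "__main__":
    for dim, dim_rules in RULES.items():
        print(f"{dim}: {dimension_score(SAMPLE_RECORDS, dim_rules):.0%}")
    print(f"overall DQ score: {overall_score(SAMPLE_RECORDS):.0%}")

Running the sketch on the three sample records gives a low overall score and shows which dimension is weakest, which is exactly the kind of signal the scorecard provided in the project.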
Continuation in Part 2: In the next blog we will discuss how to fix the issues identified in the data and get better ROI.

Data Quality Cockpit using SAP Information Steward and Data Services - Part 2
Setting up Data Quality Cockpit - Fixing data for better ROI
Previous blog: We discussed how we identified issues with the data and where to focus our remediation efforts.
The data flow was set up within SAP BO Data Services with workflow steps containing the rules imported from SAP Info Steward. Additional workflow steps were included to enrich the data using transforms such as the address and company name transforms. The workflow was branched to apply different cleansing and transformation rules to data belonging to different regions. The last step of the data flow was a matching step that allowed for grouping and scoring of the matched records. The higher the score and the closer it was to the upper threshold, the higher the chance of the record being a duplicate. The cleansed and matched record file was provided to the business to verify and identify the actual duplicates based on their business knowledge. This completed the cleansing, transformation and matching of the data.
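As a rough illustration of the match-scoring idea (not the actual SAP DS Match transform), the sketch below scores record pairs with a simple string similarity and classifies them against two assumed thresholds; the field names, thresholds and sample data are made up for this example:

# Illustrative threshold-based match scoring using simple string similarity.
from difflib import SequenceMatcher
from itertools import combinations

CLEANSED = [
    {"id": "1001", "name": "ACME CORPORATION", "city": "NEW YORK"},
    {"id": "1002", "name": "ACME CORP",        "city": "NEW YORK"},
    {"id": "1003", "name": "GLOBEX GMBH",      "city": "BERLIN"},
]

UPPER_THRESHOLD = 0.90   # at or above: treat as an automatic duplicate
LOWER_THRESHOLD = 0.70   # between the thresholds: route to business review

def similarity(a, b):
    """Average string similarity over the matching fields."""
    fields = ("name", "city")
    return sum(SequenceMatcher(None, a[f], b[f]).ratio() for f in fields) / len(fields)

for rec_a, rec_b in combinations(CLEANSED, 2):
    score = similarity(rec_a, rec_b)
    if score >= UPPER_THRESHOLD:
        verdict = "duplicate"
    elif score >= LOWER_THRESHOLD:
        verdict = "suspect - send to business for verification"
    else:
        verdict = "distinct"
    print(f"{rec_a['id']} vs {rec_b['id']}: {score:.2f} -> {verdict}")

The middle band between the two thresholds corresponds to the records that were handed to the business for manual verification in the project.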
This data was then loaded back into SAP IS, and the same rules that had been applied to the initial unclean data were applied to the cleansed and enriched data. The scorecard showed a great degree of improvement and gave a sense of how much more reliable the data had become after some simple cleansing and enrichment steps. This health check on the quality of data is a continuous process and needs to be repeated periodically. Binding the rules and transforms into a web service and calling it at the point of data creation also ensures that data entering the system is clean and validated.
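A minimal sketch of the point-of-entry validation idea follows, using a plain Python HTTP server rather than the actual SAP DS web-service binding; the endpoint, port and rules shown are assumptions for illustration:

# Illustrative validation service that re-applies the scorecard rules at data entry.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def validate(record):
    """Re-apply the same assumed rules used in the scorecard; return violations."""
    violations = []
    if not record.get("name"):
        violations.append("name is empty")
    if len(record.get("postal_code", "")) != 5:
        violations.append("postal_code must have 5 characters")
    return violations

class ValidationHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the posted record, validate it and return the result as JSON.
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        record = json.loads(body or "{}")
        violations = validate(record)
        response = {"valid": not violations, "violations": violations}
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(json.dumps(response).encode())

if __name__ == "__main__":
    HTTPServer(("localhost", 8080), ValidationHandler).serve_forever()

The point of the sketch is simply that the same rules used for the periodic health check can also gate new records as they are created, so bad data is rejected before it reaches the productive database.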
To understand the ROI of investing in the tools and the additional processes, the simple impact analysis feature of the IS tool can be leveraged. Identifying the key attributes that define the data, determining the cost of each bad attribute and its effect on the record, analyzing the impact of bad data on the business and extrapolating it to the full data universe gives a sense of the magnitude bad data can have on the overall business. When this is translated into potential savings and presented in a form understandable by management, it provides answers to questions around ROI.
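The arithmetic behind such an impact analysis is simple. The sketch below extrapolates a sample failure rate to the full data universe; all figures (record counts, cost per bad record) are assumed purely for illustration:

# Back-of-the-envelope impact analysis with assumed figures.
sample_size         = 10_000      # records profiled in the staging area
failed_in_sample    = 1_800       # records failing at least one key-attribute rule
cost_per_bad_record = 25.0        # assumed cost of one bad record (returned mail,
                                  # duplicate shipments, manual rework, ...)
total_records       = 2_000_000   # size of the productive data "universe"

failure_rate    = failed_in_sample / sample_size
bad_in_universe = failure_rate * total_records
annual_impact   = bad_in_universe * cost_per_bad_record

print(f"failure rate          : {failure_rate:.1%}")
print(f"estimated bad records : {bad_in_universe:,.0f}")
print(f"estimated annual cost : {annual_impact:,.0f}")

With these assumed numbers, an 18% failure rate extrapolates to roughly 360,000 bad records and a seven-figure annual cost, which is the kind of figure that makes the ROI discussion with management concrete.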

Data Quality Cockpit using SAP Information Steward and Data Services - Part 3
Setting up Data Quality Cockpit - Challenges and Key Advantages
Previous blog: We discussed how to fix the issues identified in the data and get better ROI.
To make good use of the DQ cockpit, a proper data quality or governance organization is required. Data stewards and owners need to continuously engage with business users to understand changing data and validation needs, and to keep the rules built and up to date. As data matures and expires with time, the same rules may not always hold good, and the rules or validations that govern the data need to change and be refined continuously as the business demands. Analysts can run daily, weekly or other periodic jobs to generate reports that help the business or information stewards understand the state of data quality and take continuous measures to keep data clean and reliable.
Below are some key advantages and benefits of the tools:
1. An easy plug-in solution that can extract in-process data at a particular stage of the business process, cleanse/enrich it and put it back into the productive database
2. Rules that can easily be coded by data admins/stewards for data validation and reused for actual cleansing; rules can be exchanged between SAP IS and SAP DS
3. Usage of external services for enriching data, such as address directory services for different countries
4. Periodic health checks of the productive database and a dashboard view for governance/business to ensure data standards and quality are maintained at appropriate levels
5. Identifying duplicates, determining the survivor and maintaining a history of merged records (see the sketch after this list)
6. Financial impact analysis and ROI calculation
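For point 5, here is a simplified sketch of survivor determination and merge history within one match group; the survivorship rule (the most complete record wins, with the most recent update breaking ties) and the fields used are assumptions for illustration:

# Illustrative survivor determination and merge history for one match group.
from datetime import date

MATCH_GROUP = [
    {"id": "1001", "name": "ACME CORPORATION", "phone": "",             "updated": date(2020, 3, 1)},
    {"id": "1002", "name": "ACME CORP",        "phone": "212-555-0101", "updated": date(2021, 7, 15)},
]

def completeness(record):
    """Number of filled attributes, used as the primary survivorship criterion."""
    return sum(1 for key, value in record.items() if key != "id" and value)

def pick_survivor(group):
    """Most complete record wins; most recently updated record breaks ties."""
    return max(group, key=lambda r: (completeness(r), r["updated"]))

survivor = pick_survivor(MATCH_GROUP)
merge_history = [
    {"merged_id": r["id"], "survivor_id": survivor["id"], "merged_on": date.today().isoformat()}
    for r in MATCH_GROUP if r["id"] != survivor["id"]
]

print("survivor      :", survivor["id"])
print("merge history :", merge_history)

Keeping a merge history like this is what allows the lineage of consolidated records to be traced back later if a merge decision needs to be reviewed or reversed.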
SAP Information Steward and SAP Data Services are very important tools for stewards, analysts and business users to regularly conduct health checks on the quality of data and take timely corrective action. Data quality management is not a one-time exercise but a continuous one. The continuous improvement in data quality enabled by SAP IS and SAP DS provides the key foundational elements that enable governance and improve trust in the organization's data infrastructure.
