Documenti di Didattica
Documenti di Professioni
Documenti di Cultura
To cite this article: Henning Baars & Hans-George Kemper (2008) Management Support with Structured and Unstructured
DataAn Integrated Business Intelligence Framework, Information Systems Management, 25:2, 132-148, DOI:
10.1080/10580530801941058
To link to this article: http://dx.doi.org/10.1080/10580530801941058
Abstract In the course of the evolution of management support towards corporate wide Business
Intelligence infrastructures, the integration of components for handling unstructured data comes into
focus. In this paper, three types of approaches for tackling the respective challenges are distinguished. The
approaches are mapped to a three layer BI framework and discussed regarding challenges and business
potential. The application of the framework is exemplified for the domains of Competitive Intelligence
and Customer Relationship Management.
Motivation
The concept of Business Intelligence (BI) is increasingly
gaining in visibility and relevance within the business
realm (Gartner Group, 2006). Originally coined by Gartner
Group as a collective term for data analysis tools
(Anandarajan, Anandarajan, & Srinivasan, 2004), Business
Intelligence is now commonly understood to encompass
all components of an integrated management support
infrastructure. The increased importance of such infrastructures reflects three interacting trends: more turbulent, global business environments, additional pressures to unveil
valid risk and performance indicators to stakeholders, and
aggravated challenges of effectively managing the more
and more densely interwoven processes (Kemper, Mehanna,
& Unger, 2004). To meet the respective requirements, traditional management support systems have evolved to enterprise-spanning solutions that support all managerial levels
and business processes: Envisioned are infrastructures for
business performance management approaches that
involve strategic, tactical, and operational managers alike.
This calls for seamlessly interconnected functionality that
enables continuous business process monitoring, in-depth
data analysis, and efficient management communication.
(Kohavi, Rothleder, & Simoudis, 2002; Golfarelli, Rizzi, &
Cella, 2004; Eckerson, 2006; Kimball, & Ross, 2002).
Address correspondence to Henning Baars, Betriebswirtschaftliches Institut, Lehrstuhl fr ABWL und Wirtschaftsinformatik I,
Breitscheidstr. 2c, 70174 Stuttgart, Germany. E-mail: baars@
wi.uni-stuttgart.de
132
133
2.
2.
Layer 3: Access
1.
3.
Layer 2: Logic
1.
Layer 1: Data
3.
Section 2:
Integration
Approaches
Section 3:
Framework
Introduction
Figure 1.
Section 4:
Layer-by-Layer
Discussion
CRM
Competitive
Intelligence
Section 5:
Application
Scenarios
Section 6:
Conclusions and
Further Research
134
Integrated Presentation
Coupled Navigation
and Search Functions
Structured
Data
Figure 2.
Unstructured
Data
Integrated Presentation
In this approach, structured data and unstructured content are simultaneously accessed via an integrated user
interface. This basic idea leads to a wide spectrum of integration possibilities: Starting with a simple side-by-side
presentation of contents up to firmly coupled systems
with elaborately combined search and presentation functions (Becker et al., 2002; Priebe et al., 2003).
Example: When navigating in sales data with an OLAP
application the selection of analysis dimensions (e.g.
time and product group), of the data subset (e.g. only
East-Asian outlets), and of the granularity (e.g. results
per quarter) automatically triggers a parallel search for
fitting content in a document repository (e.g. documents
with results from market research on customer requirements in the Pacific-Asian region). The OLAP data and the
selected documents are presented side-by-side.
Figure 2 visualizes the approach by depicting the independent systems with the integrated presentation layer
on top of them.
The main benefits of this approach can be traced back
to a more convenient handling of the respective functionalities to bolster their combined usage: Functions to
access structured and unstructured data can be used
together in an efficient and straightforward manner and
users have to get accustomed to one system with one
user interface only. Moreover, an automatically generated juxtaposition of search results can uncover and
visualize otherwise neglected interrelations between
structured and unstructured content (Klesse et al., 2005;
Baars, 2005).
Structured
Data
incl. Metadata on
Unstructured Content
Figure 3.
Content
Extraction
Collections
of Metadata (Unstructured Data)
Content can be based on metadata for trouble tickets collected at a help desk: What types of problems
arise with what kinds of products in what regions
at what times? Can certain types of complaints be
linked to recent profit developments? A
Content can be based on service reports: Are there
clusters of problems that can be traced back to
certain product configurations?
135
136
Analysis
Report
Q1
Q2
Q3
Q4
UmsatzderFiliale23
nachProduktgruppenfr
dasGeschftsjahr07
Die Umsatz-entwicklung
frdasProdukt"Markgeschrei2000" weicht
deutlichvomPlan abDie
Analyseergebnisse
mssenjedochrelativert
werden
50
40
30
20
10
0
10
Q1 Q2 Q3 Q4
Analysis
Template
OLAP-Analyse/ Tool A200
Datenbasis: S_Cube
Faktentabelle: VertriebUmsatz
Dimensionen:
Produktgruppen: Kleinelektro
bFiliale einzugeben
Jahr einzugeben
Aufbereitung:
Balkendiagramm
keineLegende
Verkufe
Schriftart"Garamond light" 16 Pt
Anmerkung:
Saisonbereinigungnotwendig
Data
Figure 4.
Distribution for
Direct Use
Distribution to Enable
the Execution of
a Similar Analysis
137
Logic
Layer
BI Portal
Meta Data
Systems for
Data Analysis
Content
&
Document
Mgmt.
Data Mart
Core Data Warehouse
Operational Data Store
Data
Layer
Operational Systems
with structured and
unstructured data
Knowledge
Distribution
SCM
E-Proc.
ERP
CRM
Value Chain
Figure 5.
External
Data
138
Meta Data
1. Integrated Presentation
Logic
Layer
2. Analysis
of Content
Collections
3. Distribution
of Results
and Templates
Structured
Data
Data
Layer
Content
Collections
incl. Metadata on
unstructured Content
Operational Systems
with structured and
unstructured data
Figure 6.
SCM
E-Proc.
ERP
CRM
Value Chain
External
Data
Data Layer
A consistent base of semantically and syntactically
harmonized data pertinent for management support is a
main pillar for a solid BI solution (Kemper, 2000). Regarding
structured data, the central component to accomplish this
task in larger BI installations is the data warehouse.
According to Inmon (2005), a data warehouse is
defined as a subject-oriented, integrated, time-variant,
and non volatile collection of data in support of managements decision-making process. Many current realizations of data warehouses are based on so called core
data warehouses, which are dedicated components for
the storage of all management support data. core data
Data Warehouse
incl. Metadata on
Unstructured Content
Tools to transform
meta data and to
include it in a
Data Warehouse
139
Logic Layer
The Logic Layer focuses on the compilation, processing,
and distribution of management support data. Two types
of systems are distinguished at this layer: systems for
data analysis and components for knowledge distribution. Systems for data analysis are collecting and processing data from the Data Layer and convert them into a
form pertinent for a presentation to the user. Here two
groups of systems are distinguished (Kemper et al., 2004).
Directly Extracted
Metadata (e.g.
manually entered,
log files)
Generation of
Metadata Based
on Content Analysis
Tools to Analyze
Content and to Generate
Metadata according
Predefined Dimensions
Figure 7.
Content and
Document Management
with Collections
of Content Items
(Unstructured Data)
140
Options to Link Systems for Data Analysis with the Data Layer
Classical Data Warehousing
Real-time Data Warehousing
Closed-loop Data Warehousing
Active Data Warehousing
Concept oriented systems
Balance Scorecard
Planing and Budgeting
Financial Consolidation
Value Based Management
Figure 8.
141
Access Layer
Regarding the exchange of results and templates
between Analysis Systems and the knowledge distribution components, efficiency gains may be attained by the
introduction of a respective middleware. Such a middleware can act as a hub between several systems and
include functions to convert and translate formats. This
idea is shown in Figure 9. Reports and templates that are
The Data Access Layer is responsible for bringing all relevant components and functions from the Logic Layer
together and to present them to the user in an integrated
and personalized fashion. It is common to use portal
systems for the implementation of this layer. Portals
provide an integrated user interface for different content
OLAP-System
Report
Extraction
Data Mining
System
Template
.
.
.
Report
Mo
de
l
Model
Report
Inclusion
Analysis System
Template
Document
Document
Document
Document
Document
Document
Te
mpla
te
Template
Bericht
Template
Analysis System
Figure 9.
A middleware hub for the exchange of analysis results and analysis templates.
142
Application Scenarios
Use and relevance of the presented framework and the
three integration approaches are illustrated for two
application domains:
CRM; and
CI.
Integrated Portal
Portlet to
Access and Analyze
Structured Data
(e.g. with OLAP
Functionality)
Portlet to
Access and Navigate
Unstructured Data
(e.g. with Information
Retrieval Functionality)
Component to
Simultaneously
Access Structured
and Unstructured Data
Translation of User
Actions in the
Structured Data Portlet
into Procedure Calls for
the Component to Find
Unstructured Data
Analysis System
(e.g. a Data Mart
based OLAP System)
Figure 10.
Translation of User
Actions in the
Unstructured Data Portlet
into Procedure Calls for
the Analysis System
Components
to Find and Distribute
Unstructured Data
(e.g. from a
Document ManagementSystem)
143
CRM is subject to BI, as it needs an effective and integrated management support infrastructure to cover the
Mailings
Telephone
Internet
Face-toFace
Marketing
Automation
TV/Radio
Sales
Automation
Data
Mining
Service
Automation
Operational
CRM
Front
Office
WAP
Customer
Interaction
Center
Collaborative
CRM
OLAP
Analytical
C RM
Back
Office
Customer Data Warehouse
Part of Management Business Intelligence
Figure 11.
Customer relationship management and business intelligence (based on Hippner & Wilde, 2004).
144
While analyzing sales results with an OLAP application, corresponding market reports are selected and
presented to the user. This way she can distinguish
more easily between those that can be attributed to
general market trends and those that are more
likely the results of sales performance (Approach 1).
All customer letters arriving at an organizational
unit to handle complaints are scanned and stored
digitally in a document management system.
Relevant metadata is entered (manually) for each
document, e.g. affected product, relevant product
parts, occurrence time, problem type. The metadata
is extracted and fed into the Customer Data Warehouse. Using analytical CRM applications an accumulation of defects in certain usage environments
can be identified and be associated with a decline in
sales. Based on these results sales representatives
can optimize their selection of recommended products (Approach 2).
Customer suggestions regarding product enhancements from a web form are classified with text
mining software. Afterwards suggestions can be
browsed and selected according to these classifications. Moreover, the classification metadata is fed
into the customer data warehouse; thereby, it
becomes analyzable with data mining software. For
example, customer clusters can be generated under
consideration of their suggestions (Approach 2).
A data-mining expert conducts a complex analysis
regarding up-selling possibilities based on a basket
analysis. The respective result is adequately formatted, enriched with annotations, and made available
to the sales and service staff in document form with
components for knowledge distribution. This
involves an editorial workflow to ensure intelligibility of the document from a target users perspective
and storage in a content management system. Sales
and service reps can use effective search and navigation options that facilitating finding relevant
reports whenever needed. Because the analysis will
be obsolete in the foreseeable future due to changes
in the product portfolio, the expert also generates a
template which allows for a quick renewal of the
analysis (Approach 3).
Competitive Intelligence
Despite the resemblance of their names, Competitive
Intelligence (CI) and Business Intelligence have
evolved independently of each other. The origins of CI
can be traced back over two decades when Porter introduced the concept Competitor Intelligence (Porter,
1980). Nowadays CI is commonly understood to focus on
all processes for gathering and analyzing information
about the competition and the general market environment (Ghoshal, & Westney, 1991; Vedder et al., 1999;
SCIP, 2005).
Up until recently, CI practitioners and researchers did
not stress technological and infrastructural support
(Vedder et al., 1999). That does not come as a surprise
considering the heterogeneous and mostly external
nature of the information sources used in CI; e.g.,
product brochures, conventions and conferences, customer and Delphi surveys, or patent databases (Dugal,
1998; Meier, 2004).
With the increasing importance of the internet and of
electronic documents, a trend towards integrated IT
infrastructures for CI can be observed (Vedder et al.,
1999). It becomes more and more important to understand CI applications as part of a wider management
support IT infrastructure. CI is therefore nowadays seen
as an application domain of business intelligence
(Kemper, & Baars, 2006).
Considering its broad subject it does not come as a surprise that one finds a wide spectrum of heterogeneous CI
analysis applications which differ widely in contents, frequency of usage, urgency of need, analysis depth, and target groups (Dugal, 1998). In CI, this is even more the case
than in CRM, which, also subsumes highly different systems, is driven much more by an integrative orientation.
Because of the strong focus on external information,
some important commonalities among all CI applications do exist, though. Most relevant CI information is
coming from outside company borders: financial reports
of competitors, government publications, patent databases, research publications etc. (Lux, & Peske, 2002).
Most of those are nowadays available in electronic form
and can be accessed via Internet technologies (Vedder
et al., 1999). A particularity of these sources is that they
usually vary in format, quality, and structure, and have
to be gathered from a huge number of sites. Efficient
data retrieval, quality assurance, and data consolidation
is of central importance in CI.
When designing CI applications one needs to explicitly address the dominance of qualitative and unstructured data. Particularly data published on the Internet is
(still) mostly only available in document form, e.g. as
HTML or PDF files (Kemper, & Baars, 2006).
The discussed requirements result in dominance of
systems for handling unstructured data, namely content
145
BI Portal
Meta Data
Logic
Layer
Knowledge
Distribution
Systems for
Data Analysis
Data Mart
Core Data Warehouse
Operational Data Store
Data
Layer
Operational Systems
with structured and
unstructured data
Figure 12.
SCM
E-Proc.
ERP
ValueChain
CRM
Content
&
Document
Mgmt.
A middleware hub to
exchange results and
templates between data
analysis systems and
knowledge distribution
components
Proposed tools and components for the realization of the three approaches.
146
Table 1.
Integrated presentation
Complexity of
implementation
Costs of usage,
maintenance
and Support
Benefits
Author Bios
HENNING BAARS received his doctorate degree from
the University of Cologne. He currently lectures and
researches at the University of Stuttgart at the Chair
of Information Systems I. His research focus lies on
Business Intelligence applications in the industrial
sector. He can be reached atbaars@wi.uni-stuttgart.de.
HANS-GEORG KEMPER is a Full Professor of Information
Systems and holder of the Chair of Information
Systems I at the University of Stuttgart. He has conducted extensive research in the fields of Business
Intelligence and Management Support.
References
Alter, A. (2003). Business IntelligenceAre Your BI Systems
Making You Smarter. CIO Insight, 05/2003, 7785.
Anandarajan, M., Anandarajan, A., & Srinivasan, C. A. (2004).
Business Intelligence Techniques. Berlin: Springer.
Ariyachandra, T., & Watson, J. (2006). Which Data Warehouse
Architecture is the Most Successful. Business Intelligence Journal,
11(1), 24.
Baars, H. (2005). Knowledge Management und Business-Intelligence. In WM 2005 - Professional Knowledge Management - Experiences and Visions. Contributions to the 3rd Conference Professional
Knowledge Management Experiences and Visions, Althoff, K.D.,
Dengel, A., Bergmann, R., Nick, M. & Roth-Berghofer, T., Eds.
(pp. 429433). Heidelberg: Physica.
Baars, H. (2006). Distribution von Business-Intelligence-Wissen
Diskussion eines Ansatzes zur Nutzung von Wissensmanagement-Systemen fr die Verbreitung von Analyseergebnissen
und Analysetemplates. In, Analytische Informationssysteme
Business Intelligence-Technologien und -Anwendungen P. Chamoni &
P. Gluchowski, Eds. (3rd ed., pp. 409424). Berlin: Springer.
Becker, J., Knackstedt, R., & Serries, T. (2002). Informationsportale
fr das Management Integration von Data-Warehouse- und
Content-Management-Systemen. In Vom Data Warehouse zum
Corporate Knowledge CenterProceedings der Data Warehousing 2002,
E. Maur & R. Winter, Eds. (pp. 241261). Heidelberg: Physica.
Boulding, W., Staelin, W., Ehret, M., & Johnston, W.J. (2005). A
Customer Relationship Management Roadmap: What Is
Known, Potential Pitfalls, and Where to Go. Journal of Marketing,
69(4), 155166.
Codd, E. F., Codd, S. B., & Salley, C. T. (1993). Providing OLAP to
User-Analysts: An IT Mandate. Retrieved September 16 2006 from
http://dev.hyperion.com/download_files/resource_library/
white_papers/providing_olap_to_user_analysts.pdf
Cody, W. F., Kreulen, J. T., Krishna, V., & Spangler, W.S. (2002).
The Integration of Business Intelligence and Knowledge
Management. IBM Systems Journal, 41(4), 697713.
Davydov, M. (2001). Corporate Portals and e-Business Integration.
New York: McGraw-Hills.
Devlin, B. (1996). Data Warehouse: From Architecture to Implementation. Boston: Addison-Wesley Professional.
Drre, J., Gerstel, P., & Seiffert, R. (1999). Text Mining. Finding
Nuggets in Mountains of Textual Data. In Proceedings of the Fifth
ACM SIFKDD International Conference on Knowledge Discovery and
Data Mining,, San Diego, 1999 (pp. 398401). New York. Association for Computing Machinery.
Dugal, M. (1998). CI Product LineA Tool for Enhancing User
Acceptance of CI. Competitor Intelligence Review, 9(2), 1725.
Eckerson, W.W. (2006). Performance Dashboards. Measuring, Monitoring,
and Managing Your Business. Hoboken, NJ: John Wiley & Sons.
Fan, W., Wallace, L., Rich, S., & Zhang, Z. (2006). Tapping the
Power of Text Mining. Communications of the ACM, 49(9), 7782.
Furness, P. (2004): Techniques for Customer Modelling in CRM.
Journal of Financial Services Marketing, 5(4), 293307.
Gartner Group (2006). Gartner Survey of 1,400 CIOs Shows
Transformation of IT Organisation is Accelerating. Retrieved
June 19 2006 from http://www.gartner.com/press_releases/
asset_143678_11.htm.
Ghoshal, S. & Westney, D. E. (1991): Organizing Competitor
Analysis Systems. Strategic Management Journal, 12, 1731.
Giovinazzo, W. A. (2002): Internet-Enabled Business Intelligence.
Upper Saddle River, NJ: Prentice-Hall.
Golfarelli, M., Rizzi, S., & Cella, I. (2004). Beyond Data WarehousingWhats Next in Business Intelligence. In Proceedings
of the DOLAP 2004, pp. 16, New York, November 2004. Association for Computing Machinery.
Gtzer, K., Schneiderach, U., Maier, B., Boehmelt, W., & Komke,
T. (2001): DokumentenmanagementInformationen im Unternehmen
effizient nutzen (2nd ed.) Heidelberg: Computerwoche.
Gregorzik, S. (2002). Multidimensionales Knowledge Management. In: U. Hannig (Ed.), Knowledge Management und Business
Intelligence (pp. 4351). Berlin: Springer.
Gronau, N., Dilz, S., & Kalisch, A. (2004): Anwendungen und
Systeme fr das Wissensmanagementein aktueller berblick. Berlin:
GITO Verlag.
Hand, D., Mannila, H., & Smyth, P. (2001): Principles of Data
Mining. Cambridge, Massachusets: MIT Press.
Hippner, H., & Wilde, K. D. (2004): IT-Systeme im CRM, Aufbau und
Potenziale. Wiesbaden: Gabler.
Inmon, W. H. (1999). Building the Operational Data Store (2nd ed.).
J Hoboken, NJ: John Wiley & Sons.
Inmon, W. H. (2005). Building the Data Warehouse (5th ed.). J Hoboken,
NJ: John Wiley & Sons.
Inmon, W. H. (2007). DW 2.0The Architecture for the Next Generation of Data Warehouse. Retrieved July, 24 2007 from the Corporate Information Factory website: http://www.inmoncif.com/
Kantardzic, M. (2003). Data Mining. Concepts, Models, Methods, and
Algorithms. J Hoboken, NJ: John Wiley & Sons.
Kaplan, R. S., & Norton, D. P. (1996): The Balanced Scorecard
Translating Strategy into Action, Boston: Harvard Business
School Press.
Keith, S., Kaser, O., & Lemire, D. (2005). Analyzing Large Collections of Electronic Text Using OLAP. In: 29th Conf. Atlantic
Provinces Council on the Sciences (APICS 2005), Wolville (Canada),
October, 2005.
Kemper, H. G. (2000): Conceptual Architecture of Data WarehousesA Transformation-oriented View. In Proceedings of the
2000 American Conference On Information Systems, pp. 108118,
147
148