Documenti di Didattica
Documenti di Professioni
Documenti di Cultura
(DSS)
Ashis Talukder
Assistant Professor,
Department of MIS
Dhaka University
Chapter 5
DATA WAREHOUSING
Learning Objectives
Understand the basic definitions and concepts
of data warehouses
Understand data warehousing architectures
Describe the processes used in developing and
managing data warehouses
Explain data warehousing operations
Explain the role of data warehouses in decision
support
Learning Objectives
Explain data integration and the
extraction, transformation, and load (ETL)
processes
Describe real-time (active) data
warehousing
Understand data warehouse
administration and security issues
Data Warehousing
Definitions and Concepts
Data warehouse
A physical repository where relational data are
specially organized to provide enterprise-wide,
cleansed data in a standardized format
Current and historical data
Data are structured to be available in the form
ready for analytical processing (OLAP), mining,
querying, reporting
Subject oriented, integrated, time-variant, nonvolatile collection of data for managements
decision making.
Ashis Talukder, MIS, DU
Data Warehousing
Definitions and Concepts
Characteristics of data warehousing
Subject oriented
Integrated
Time variant (time series)
Nonvolatile
Web based
Relational/multidimensional
Client/server
Real-time
Include metadata
Data Warehousing
Definitions and Concepts
Data mart
A departmental data warehouse that stores only
relevant data
Subset of DW
Data Warehousing
Definitions and Concepts
Operational data stores (ODS)
A type of database often used as an interim area for a data
warehouse, especially for customer information files
Recent form of customer info file (CIF)
Used for short term decision involving mission-critical application
Like short term memory with recent info unlike DW which is long
term memory with permanent info
Oper marts
An operational data mart. An oper mart is a small-scale data mart
typically used by a single department or functional area in an
organization
Data comes from ODS
Data Warehousing
Definitions and Concepts
Enterprise data warehouse (EDW)
A technology that provides a vehicle for pushing data from source
systems into a data warehouse
Large scale DW, integrates data from many sources
Used to provide data for many types of DSS:
Metadata
Data about data. In a data warehouse, metadata describe the
contents of a data warehouse and the manner of its use
Ashis Talukder, MIS, DU
Data Warehousing
Process Overview
Organizations continuously collect data,
information, and knowledge at an
increasingly accelerated rate and store
them in computerized systems
The number of users needing to access the
information continues to increase as a
result of improved reliability and availability
of network access, especially the Internet
Ashis Talukder, MIS, DU
10
Data Warehousing
Process Overview
11
Data Warehousing
Process Overview
Organizations need to create
data warehouses massive
data stores of time series data
for decision support.
Data are imported form
various internal and external
sources, cleansed and
organized in a manner
consentient with the
organizations need.
Then Data mart can be loaded
for any department
Ashis Talukder, MIS, DU
12
Data Warehousing
Process Overview
The major components
of a data warehousing
process
Data sources
Data extraction
Data loading
Comprehensive database
Metadata
Middleware tools
Ashis Talukder, MIS, DU
13
Data Warehousing
Process Overview
Data sources
Data extraction:
By custom-written or commercial software
Data loading
Loaded into staging area where transformed and
cleansed
Then data are ready to load Ashis
in DW
Talukder, MIS, DU
14
Data Warehousing
Process Overview
Comprehensive database
This is EDW to support all decision analysis by
providing relevant summarized and detailed information
from different sources.
Metadata
Includes software programs about data and rules for
organizing data summaries that are easy to index or
search, with web tools
Middleware tools
Enables access to DW
May use SQL queries, Front end application
Ex: data reporting, mining, OLAP, reporting tools,
visulizing tool
Ashis Talukder, MIS, DU
15
Data Warehousing
Architectures
Three tire architecture
Two Tire Architecture
One tire Architecture
16
Data Warehousing
Architectures
Three parts of the data warehouse
The data warehouse that contains the data and
associated software
Data acquisition (back-end) software that
extracts data from legacy systems and external
sources, consolidates and summarizes them,
and loads them into the data warehouse
Client (front-end) software that allows users to
access and analyze data from the warehouse
Ashis Talukder, MIS, DU
17
Data Warehousing
Architectures
18
19
20
Data Warehousing
Architectures
Issues to consider when deciding which
architecture to use:
Which database management system (DBMS)
should be used?
Will parallel processing and/or partitioning be
used?
Will data migration tools be used to load the
data warehouse?
What tools will be used to support data retrieval
and analysis?
Ashis Talukder, MIS, DU
21
Data Warehousing
Process Overview
22
Data Warehousing
Process Overview
23
Data Warehousing
Process Overview
24
Data Warehousing
Process Overview
25
Data Warehousing
Process Overview
26
Data Warehousing
Process Overview
27
Data Warehousing
Process Overview
28
Data Warehousing
Architectures
Ten factors that potentially affect the architecture selection
decision (Ariyachandra & Watson 2005):
1.
2.
3.
4.
Information
interdependence
between organizational
units
Upper managements
information needs
Urgency of need for a
data warehouse
Nature of end-user tasks
5. Constraints on resources
6. Strategic view of the data
warehouse prior to
implementation
7. Compatibility with existing
systems
8. Perceived ability of the inhouse IT staff
9. Technical issues
10.
Social/political factors
Ashis Talukder, MIS, DU
29
Data integration
Integration that comprises three major processes:
data access,
data federation, and
change capture.
30
31
32
Transformation:
converting the extracted data from its previous form
into the form in which it needs to be so that it can be
placed into a data warehouse or simply another
database, and
Load:
putting the data into the data
warehouse
Ashis Talukder, MIS, DU
33
34
35
36
37
38
Financial strength
ERP linkages
Qualified consultants
Market share
Industry experience
Established partnerships
39
40
41
Data Warehouse
Development
A Central Fact Table
Contains
42
Data Warehouse
Development
A Central Fact Table
Surrounded by number of
dimension tables.
Dimension table contains
Classification and
aggregation information
Data that describe data
in the fact tables
One-to-many relationship
43
Grain
A definition of the highest level of detail that is
supported in a data warehouse
Whether DW is highly summarized or have
detailed transaction data
Drill-down
The process of probing beyond a summarized
value to investigate each of the detail
transactions that comprise the summary
Ashis Talukder, MIS, DU
44
45
Data Warehouse
Development
2.
3.
4.
5.
Establishment of
service-level
agreements and datarefresh requirements
Identification of data
sources and their
governance policies
Data quality planning
Data model design
ETL tool selection
6. Relational database
software and platform
selection
7. Data transport
8. Data conversion
9. Reconciliation process
10. Purge (Cleaning) and
archive planning
11. End-user support
Ashis Talukder, MIS, DU
46
47
48
49
50
51
52
Real-Time Data
Warehousing
53
Real-Time Data
Warehousing
1.
2.
3.
4.
5.
54
Real-Time Data
Warehousing
55
Real-Time Data
Warehousing
56
Data Warehouse
Administration and Security
Issues
Data warehouse administrator (DWA)
A person responsible for the administration
and management of a data warehouse
57
Data Warehouse
Administration and Security
Issues
Effective security in a data warehouse
should focus on four main areas:
Establishing effective corporate and security
policies and procedures
Implementing logical security procedures and
techniques to restrict access
Limiting physical access to the data center
environment
Establishing an effective internal control review
process with an emphasis on security and
privacy
Ashis Talukder, MIS, DU
58