
DATA WAREHOUSE

Definition
• A data warehouse is a repository of an
organization's electronically stored data.

In simple language,

A collection of data from a variety of sources, organized to provide
useful guidance to an organization's decision makers.
Purpose of Data Warehousing
Keeping analysis and reporting (non-production use of data)
separate from production.
Integrating information from multiple systems into a single
point of access.
Ensuring data consistency and quality.
Fast response time: production databases are tuned
to the expected transaction load, not to analytical queries.
Providing an adaptive and flexible source of information.
Establishing the foundation for decision support.
Process
Detailed Process
Accurately identifying the business information

Identifying and prioritizing subject areas to be included

Managing the scope of each subject area which will be implemented into the Warehouse on an iterative basis

Developing a scalable architecture to serve as the Warehouse’s technical and application foundation, and identifying and
selecting the hardware/software/middleware components to implement it

Extracting, cleansing, aggregating, transforming and validating the data to ensure accuracy and consistency
Defining the correct level of summarization to support business decision making

Establishing a refresh program that is consistent with business needs, timing and cycles

Providing user-friendly, powerful tools at the desktop to access the data in the Warehouse

Educating the business community about the realm of possibilities that are available to them through Data Warehousing

Establishing a Data Warehouse Help Desk and training users to effectively utilize the desktop tools

Establishing processes for maintaining, enhancing, and ensuring the ongoing success and applicability of the Warehouse
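
The extract, cleanse, transform, aggregate and validate step in the list above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the sample rows, the field names and the function names are hypothetical, and the order of the stages (cleansing and transforming before aggregating) is one reasonable choice rather than a prescribed one.

```python
from collections import defaultdict

# Hypothetical raw rows from two source systems (illustrative data only).
RAW_ROWS = [
    {"source": "pos", "region": " north ", "amount": "120.50"},
    {"source": "pos", "region": "North", "amount": "79.50"},
    {"source": "web", "region": "south", "amount": "200"},
    {"source": "web", "region": "south", "amount": "n/a"},  # bad value
]

def extract():
    """Extract: return copies of the raw source rows."""
    return [dict(row) for row in RAW_ROWS]

def cleanse(rows):
    """Cleanse: drop rows whose amount cannot be parsed as a number."""
    clean = []
    for row in rows:
        try:
            float(row["amount"])
        except ValueError:
            continue
        clean.append(row)
    return clean

def transform(rows):
    """Transform: normalise region names and convert amounts to float."""
    return [
        {"region": row["region"].strip().lower(), "amount": float(row["amount"])}
        for row in rows
    ]

def aggregate(rows):
    """Aggregate: total amount per region."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)

def validate(totals, rows):
    """Validate: aggregated totals must add up to the sum of the detail rows."""
    return abs(sum(totals.values()) - sum(r["amount"] for r in rows)) < 1e-9

detail = transform(cleanse(extract()))
totals = aggregate(detail)
is_valid = validate(totals, detail)
print(totals, is_valid)
```

The validation here is only a reconciliation check between detail and summary; a real warehouse load would typically add checks against the source systems as well.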
Types
• Offline Operational Data Warehouse:
Offline Operational Data Warehouses are data warehouses in which data is simply
copied from real-time data networks into an offline system where it
can be used. This is usually the simplest and least technical type of data warehouse.
• Offline Data Warehouse:
Offline Data Warehouses are data warehouses that are updated at regular intervals (daily,
weekly or monthly); the data is then stored in an integrated structure where
others can access it and run reports.
• Real Time Data Warehouse:
Real Time Data Warehouses are data warehouses that are updated continuously
as new data arrives. For instance, a Real Time Data Warehouse
might incorporate data from a point-of-sale system and be updated with each sale
that is made.
• Integrated Data Warehouse:
Integrated Data Warehouses are data warehouses that other systems can
access for operational purposes. Some Integrated Data
Warehouses are also used by other data warehouses, which access them to
process reports and to look up current data.
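
The difference between a periodically refreshed (offline) warehouse and a real-time one comes down to when new rows become visible. The sketch below illustrates that under simplifying assumptions: the class names, the `refresh` and `on_sale` methods and the append-only `source` list are all hypothetical stand-ins, not part of any product.

```python
class OfflineWarehouse:
    """Refreshed in batches: new rows wait in the source until the next load."""

    def __init__(self):
        self.rows = []

    def refresh(self, source_rows):
        # Copy everything not yet loaded (assumes source_rows is append-only).
        self.rows.extend(source_rows[len(self.rows):])


class RealTimeWarehouse:
    """Updated on every event, e.g. each point-of-sale transaction."""

    def __init__(self):
        self.rows = []

    def on_sale(self, sale):
        self.rows.append(sale)


source = []  # hypothetical point-of-sale log
offline = OfflineWarehouse()
real_time = RealTimeWarehouse()

for sale in [{"id": 1, "amount": 10.0}, {"id": 2, "amount": 25.0}]:
    source.append(sale)
    real_time.on_sale(sale)  # visible in the warehouse immediately

rows_before_refresh = len(offline.rows)  # the batch load has not run yet
offline.refresh(source)                  # e.g. the nightly load
rows_after_refresh = len(offline.rows)
```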
Architecture
Operational database layer
The source data for the data warehouse — An organization's Enterprise
Resource Planning systems fall into this layer.
Data access layer
The interface between the operational and informational access layer —
Tools to extract, transform, load data into the warehouse fall into this
layer.
Metadata layer
The data dictionary — This is usually more detailed than an operational
system data dictionary. There are dictionaries for the entire warehouse
and sometimes dictionaries for the data that can be accessed by a
particular reporting and analysis tool.
Informational access layer
The data accessed for reporting and analyzing and the tools for reporting
and analyzing data — This is also called the data mart. Business
intelligence tools fall into this layer.
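
As a rough illustration of how the four layers relate, the following sketch builds all of them in a single in-memory SQLite database using Python's standard `sqlite3` module. The table names (`erp_orders`, `dw_sales`, `dw_dictionary`) and the sample rows are hypothetical; in practice each layer would normally live in a separate system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Operational database layer: a hypothetical ERP orders table.
cur.execute(
    "CREATE TABLE erp_orders (order_id INTEGER, customer TEXT, amount_cents INTEGER)"
)
cur.executemany(
    "INSERT INTO erp_orders VALUES (?, ?, ?)",
    [(1, "acme", 1250), (2, "acme", 750), (3, "globex", 4000)],
)

# Data access layer: extract from the source, transform, load into the warehouse.
cur.execute("CREATE TABLE dw_sales (customer TEXT, amount REAL)")
cur.execute(
    "INSERT INTO dw_sales SELECT UPPER(customer), amount_cents / 100.0 FROM erp_orders"
)

# Metadata layer: a tiny data dictionary describing the warehouse columns.
cur.execute(
    "CREATE TABLE dw_dictionary (table_name TEXT, column_name TEXT, description TEXT)"
)
cur.executemany(
    "INSERT INTO dw_dictionary VALUES (?, ?, ?)",
    [
        ("dw_sales", "customer", "Customer name, upper-cased during load"),
        ("dw_sales", "amount", "Order amount in currency units"),
    ],
)

# Informational access layer: a reporting query over the warehouse table.
report = cur.execute(
    "SELECT customer, SUM(amount) FROM dw_sales GROUP BY customer ORDER BY customer"
).fetchall()
conn.commit()
print(report)
```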
There are two strategies for building a data warehouse:
Top-Down Approach (suggested by Bill Inmon)
Bottom-Up Approach (suggested by Ralph Kimball)
Top-Down Design
Bottom-Up Design
Data Mart
• A data mart (DM) is the access layer of the data
warehouse (DW) environment that is used to get data out to
the users.
• The DM is a subset of the DW, usually oriented to a
specific business line or team.
• Gives easy access to frequently needed data.
• Creates a collective view for a group of users.
• Improves end-user response time.
• Is easy to create.
• Costs less than implementing a full data warehouse.
• Potential users are more clearly defined than in a full data
warehouse.
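
One simple way to picture a data mart as a subset of the warehouse is a database view restricted to one team's rows. The sketch below uses Python's standard `sqlite3` module; the `dw_sales` table, the `marketing_mart` view and the sample data are hypothetical, and a production data mart is often a separate physical store rather than a view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A hypothetical warehouse table covering several departments.
conn.execute("CREATE TABLE dw_sales (department TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO dw_sales VALUES (?, ?, ?)",
    [
        ("marketing", "campaign-a", 100.0),
        ("finance", "audit", 300.0),
        ("marketing", "campaign-b", 50.0),
    ],
)

# The data mart: a view exposing only the rows one team needs.
conn.execute(
    "CREATE VIEW marketing_mart AS "
    "SELECT product, amount FROM dw_sales WHERE department = 'marketing'"
)

mart_rows = conn.execute(
    "SELECT product, amount FROM marketing_mart ORDER BY product"
).fetchall()
print(mart_rows)
```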
Advantages
• Enhanced access to data and information.
• Easy report creation.
• Ease of access and flexibility of use for key corporate
data.
• Provides retrieval of data without slowing down
operational systems.
• Inconsistencies are identified and resolved. This
greatly simplifies reporting and analysis.
• Facilitates decision support system applications such as
trend reports, exception reports, and reports that show
actual performance versus goals.
• The DW can record historical information for data source
tables that are not set up to save an update history.
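
The last advantage, keeping history for source tables that overwrite their rows, can be sketched as follows. This is a minimal illustration under assumed names: the `history` list, the `snapshot` function and the `customer_id`/`city` fields are hypothetical, and the approach (closing the old version and appending a new one whenever a change is detected) is one common way to do it, not the only one.

```python
from datetime import date

history = []  # warehouse-side history table (hypothetical structure)

def snapshot(source_row, as_of):
    """Append a new version only when the source row has changed."""
    key = source_row["customer_id"]
    # Find the most recent version stored for this key, if any.
    current = next(
        (h for h in reversed(history) if h["customer_id"] == key), None
    )
    if current is not None and current["city"] == source_row["city"]:
        return  # nothing changed, keep the current version open
    if current is not None:
        current["valid_to"] = as_of  # close the previous version
    history.append({**source_row, "valid_from": as_of, "valid_to": None})

snapshot({"customer_id": 7, "city": "Delhi"}, date(2024, 1, 1))
snapshot({"customer_id": 7, "city": "Delhi"}, date(2024, 2, 1))   # unchanged
snapshot({"customer_id": 7, "city": "Mumbai"}, date(2024, 3, 1))  # overwritten at source
```

The source system only ever holds the latest city, while `history` retains both versions together with the dates on which each was valid.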
Disadvantages
• Preparation may be time consuming.
• Compatibility problems with existing systems.
• Security issues.
• Cost.
THANK YOU
Presented by:
• Aditi Sharma
• Ajay Jain
• Anuj Atrey
• Ashwani Kumar
• Bharat Sharma
• Deepshikha Sharma
