Data Warehouse
Submitted By: RITESH
Registration no.: 123456789
Branch/Roll no.: IT/12345
Guided by: Dr. ABC XYZ
Definition
“A data warehouse is a subject-oriented,
integrated, time-variant, and nonvolatile
collection of data in support of
management’s decision-making process”
W. H. Inmon
Subject-Oriented
Data is arranged and
optimized to provide answers
to questions from diverse
functional areas
Data is organized and
summarized by topic
-Sales / Marketing / Finance /
Distribution / etc.
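
As a rough illustration of summarizing by topic, the sketch below totals a few made-up transaction rows per subject area. The rows, the `transactions` name, and the `summarize_by_subject` helper are all assumptions for illustration and do not come from any particular warehouse product.

```python
from collections import defaultdict

# Hypothetical transaction rows: (subject_area, amount).
transactions = [
    ("Sales", 1200.0),
    ("Marketing", 300.0),
    ("Sales", 800.0),
    ("Finance", 450.0),
]


def summarize_by_subject(rows):
    """Total the amounts per subject area."""
    totals = defaultdict(float)
    for subject, amount in rows:
        totals[subject] += amount
    return dict(totals)


summary = summarize_by_subject(transactions)
```

Here `summary` holds one total per subject area, so a question about Sales can be answered without scanning the individual transactions again.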
Integrated
The data warehouse is a
centralized, consolidated
database that integrates data
derived from the entire
organization
Multiple Sources
Diverse Sources
Diverse Formats
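
One way to picture integration is to convert records from different sources into a single common shape. The sketch below assumes two hypothetical sources (a CRM export using text fields and an ERP export using tuples with amounts in cents); the field names and formats are invented for the example.

```python
from datetime import date, datetime

# Hypothetical source records in two different formats.
crm_rows = [{"cust_id": "17", "signup": "2023-01-15", "revenue_usd": "120.50"}]
erp_rows = [("C-0042", "15/02/2023", 9900)]  # (id, dd/mm/yyyy, revenue in cents)


def from_crm(row):
    """Convert a CRM dictionary into the common record shape."""
    return {
        "customer_id": int(row["cust_id"]),
        "signup_date": date.fromisoformat(row["signup"]),
        "revenue": float(row["revenue_usd"]),
        "source": "crm",
    }


def from_erp(row):
    """Convert an ERP tuple into the common record shape."""
    raw_id, raw_date, cents = row
    return {
        "customer_id": int(raw_id.split("-")[1]),
        "signup_date": datetime.strptime(raw_date, "%d/%m/%Y").date(),
        "revenue": cents / 100,
        "source": "erp",
    }


# One consolidated list, regardless of where each record came from.
integrated = [from_crm(r) for r in crm_rows] + [from_erp(r) for r in erp_rows]
```

After conversion every record has the same keys, types, and units, which is what lets the warehouse treat data from the whole organization uniformly.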
Time-Variant
The Data Warehouse represents
the flow of data through time
Can contain projected data from
statistical models
Data is periodically uploaded, then
time-dependent data is
recomputed
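
A small sketch of what time variance can mean in practice: values are stored with the date from which they apply, so a question can be answered as of any point in time. The `price_history` data and the `price_as_of` helper are hypothetical.

```python
from datetime import date

# Hypothetical price history: each row records the date from which a value applies.
price_history = [
    (date(2023, 1, 1), 10.0),
    (date(2023, 6, 1), 12.0),
    (date(2024, 1, 1), 15.0),
]


def price_as_of(history, when):
    """Return the most recent value whose effective date is on or before `when`."""
    current = None
    for effective, value in sorted(history):
        if effective <= when:
            current = value
    return current
```

Calling `price_as_of(price_history, date(2023, 7, 4))` looks up the value that was in force on that day, and a date before the first entry yields `None`.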
Nonvolatile
Once data is entered it is NEVER
removed
Represents the company’s entire
history
Near term history is continually
added to it
Always growing
Must support terabyte databases
and multiprocessors
Read-only database for data analysis
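
The nonvolatile idea can be sketched as a store that only offers append and read operations. `AppendOnlyStore` is a hypothetical class written for this illustration; real warehouses enforce the same idea through permissions and load processes and not through a class like this.

```python
class AppendOnlyStore:
    """Toy store: rows can be added and read, never updated or removed."""

    def __init__(self):
        self._rows = []

    def append(self, row):
        # Store a copy so later changes to the caller's dict do not alter history.
        self._rows.append(dict(row))

    def rows(self):
        # Hand out copies so readers cannot modify the stored rows.
        return tuple(dict(r) for r in self._rows)


store = AppendOnlyStore()
store.append({"order_id": 1, "amount": 100.0})
store.append({"order_id": 2, "amount": 250.0})

snapshot = store.rows()
snapshot[0]["amount"] = 0.0  # changes only the reader's copy
```

The store keeps growing as near-term history is appended, while analysis code works on read-only copies.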
Why Data Warehousing?
Which are our lowest/highest margin customers?
Who are my customers and what products are they buying?
What is the most effective distribution channel?
What product promotions have the biggest impact on revenue?
Which customers are most likely to go to the competition?
What impact will new products/services have on revenue?
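
To make the first question concrete, the sketch below ranks customers by margin using Python's built-in `sqlite3` module. The `sales` table, its columns, and the sample rows are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer TEXT, revenue REAL, cost REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("Acme", 1000.0, 700.0),
        ("Acme", 500.0, 300.0),
        ("Birch", 800.0, 760.0),
        ("Cedar", 400.0, 200.0),
    ],
)

# Margin per customer = (total revenue - total cost) / total revenue.
margins = conn.execute(
    """
    SELECT customer,
           (SUM(revenue) - SUM(cost)) / SUM(revenue) AS margin
    FROM sales
    GROUP BY customer
    ORDER BY margin DESC
    """
).fetchall()

highest, lowest = margins[0], margins[-1]
```

With the rows ordered by margin, the first and last entries identify the highest and lowest margin customers.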
DATA WAREHOUSE DEPENDENTS
Data warehouse dependents are
those parts of a data warehouse
that are not directly connected
with the warehouse's functioning.
The commonly used dependents
are data marts and metadata.
Data Warehouse Functionality
[Diagram: relational databases, ERP systems, purchased data, and legacy data feed extraction and cleansing, then an optimized loader, into the data warehouse engine, which serves analysis and query; a metadata repository accompanies the warehouse.]
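
The extraction, cleansing, and loading stages can be sketched as three small functions. This is a minimal illustration using an in-memory SQLite database; the source rows, the `orders` table, and the cleansing rules (drop rows with no amount, tidy customer names) are assumptions chosen for the example.

```python
import sqlite3


def extract():
    """Stand-in for reading from a source system; the rows are made up."""
    return [
        {"order_id": "1", "customer": " Acme ", "amount": "100.50"},
        {"order_id": "2", "customer": "birch", "amount": ""},  # missing amount
        {"order_id": "3", "customer": "Cedar", "amount": "75"},
    ]


def cleanse(rows):
    """Drop rows without an amount and normalize types and names."""
    clean = []
    for row in rows:
        if not row["amount"]:
            continue
        clean.append(
            (int(row["order_id"]), row["customer"].strip().title(), float(row["amount"]))
        )
    return clean


def load(conn, rows):
    """Write cleansed rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


warehouse = sqlite3.connect(":memory:")
load(warehouse, cleanse(extract()))
loaded = warehouse.execute(
    "SELECT order_id, customer, amount FROM orders ORDER BY order_id"
).fetchall()
```

Keeping the stages separate means each source system only needs its own `extract`, while the cleansing and loading logic stays shared.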
Data Warehouse Architecture
Data Warehouse Components
•Staging Area
  -A preparatory repository where transaction data can be transformed for use in the data warehouse
•Data Mart
  -Traditional dimensionally modeled set of dimension and fact tables
  -Per Kimball, a data warehouse is the union of a set of data marts
•Operational Data Store (ODS)
  -Modeled to support near-real-time reporting needs
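
A dimensionally modeled data mart can be pictured as one fact table surrounded by dimension tables. The sketch below builds a tiny example in SQLite; the table names (`dim_product`, `dim_date`, `fact_sales`), columns, and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables describe the "who/what/when" of each fact.
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);

    -- The fact table holds measurements keyed by the dimensions.
    CREATE TABLE fact_sales (
        product_key INTEGER REFERENCES dim_product(product_key),
        date_key INTEGER REFERENCES dim_date(date_key),
        units INTEGER,
        revenue REAL
    );

    INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware'), (2, 'Manual', 'Books');
    INSERT INTO dim_date VALUES (202401, 2024, 1), (202402, 2024, 2);
    INSERT INTO fact_sales VALUES
        (1, 202401, 3, 30.0),
        (1, 202402, 5, 50.0),
        (2, 202401, 1, 12.0);
    """
)

# Typical query: join facts to a dimension and aggregate.
revenue_by_category = conn.execute(
    """
    SELECT p.category, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY p.category
    ORDER BY p.category
    """
).fetchall()
```

The same fact table can be sliced by any dimension, for example by joining `dim_date` to report revenue per month.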
Very Large Data Bases
WAREHOUSES ARE VERY LARGE DATABASES
Petabytes -- 10^15 bytes: Geographic Information Systems, National Medical Records
Exabytes -- 10^18 bytes: Weather images
Incomplete errors
-Missing fields
-Records or fields that, by design, are not being recorded
Incorrect errors
-Wrong calculations, aggregations
-Duplicate records
-Wrong information entered into source system
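
Two of these error types, missing fields and duplicate records, are easy to check for mechanically. The sketch below is one possible approach; the `REQUIRED_FIELDS` tuple, the sample `records`, and both helper functions are assumptions for illustration.

```python
# Hypothetical list of fields every record is expected to carry.
REQUIRED_FIELDS = ("customer_id", "amount")

records = [
    {"customer_id": 1, "amount": 10.0},
    {"customer_id": 2, "amount": None},  # field present but empty
    {"customer_id": 1, "amount": 10.0},  # exact duplicate of the first row
    {"customer_id": 3},                  # field missing entirely
]


def find_incomplete(rows):
    """Return indexes of rows where a required field is missing or None."""
    return [
        i
        for i, row in enumerate(rows)
        if any(row.get(field) is None for field in REQUIRED_FIELDS)
    ]


def find_duplicates(rows):
    """Return indexes of rows that repeat an earlier row exactly."""
    seen = set()
    duplicates = []
    for i, row in enumerate(rows):
        key = tuple(sorted(row.items()))
        if key in seen:
            duplicates.append(i)
        else:
            seen.add(key)
    return duplicates


incomplete = find_incomplete(records)
duplicates = find_duplicates(records)
```

Checks like these catch incomplete and duplicated rows; wrong calculations or wrong values entered at the source generally need business rules or reconciliation against the source system.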
SUCCESS & FUTURE OF DATA WAREHOUSE
The Data Warehouse has successfully supported
the increased needs of the State over the past
eight years.
The need for growth continues, however, as the
desire for more integrated data increases.
The Data Warehouse has software and tools in
place to provide the functionality needed to
support new enterprise Data Warehouse projects.
The future capabilities of the Data Warehouse can
be expanded to include other programs and
agencies.
Data Warehouse Pitfalls
Best Practices
Complete requirements and design
Prototyping is key to business understanding
Utilizing proper aggregations and detailed data
Training is an ongoing process
Build data integrity checks into your system.
Useful URLs
Bottom-Up Architecture
Hybrid Data Mart Architecture