
Introduction to DSS & Data Warehousing Concepts
Atul Gandre
TATA
INFOTECH Ltd

Contents
What is DSS ?
DSS architecture and its components
Extraction, Transformation & Loading
Data Access & Analysis
Data marts
Data Mining


What is DSS?

Management Objectives

Increased profits
Improved margins
Reduced overheads
Larger market share


Typical Questions a Decision Maker may ask...
Give performance of all TVs, over the past 3 years
Show Sales by volume, value and margin contribution
By different time periods
By product model
By region

Compare 3 years sales monthwise


Compare sales v/s type of promotion, by region
Best distributors / Worst distributors
Margin Analysis, Cost Breakup
Capacity utilisation over time


Typical Questions a Decision Maker may ask ...

Best Products / Worst Products by

Sales Value
Volume
Profit / Margins
Market Share
% growth


Typical Questions a Decision Maker may ask ...

Top 10 / Bottom 10 Salesmen this year by

Sales value
% of target met
% over last year
By region
By product


Typical Questions a Decision Maker may ask ...
Show the Top 20 Customers

By month / quarter / year


By product / product group / all products
For a region / all regions
Ranking over the last 12 months
Recovery of dues


Can Transaction systems answer such queries?
The problems :
Highly normalised structures make queries more complex
Increase in complexity of queries due to :

Aggregation, Summarisation, Ranking, Cumulations, Running totals, Comparison
Various dimensions

Little historical data stored on-line for comparison


High resource utilisation will result in slow response to
complex queries
Difficulty in making ad-hoc queries
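
Two of the computations named above, running totals and ranking, can be sketched in a few lines. This is only an illustration: the monthly figures are hypothetical sample data, not taken from any system described in these slides.

```python
from itertools import accumulate

# Hypothetical monthly sales values (illustrative only).
monthly_sales = {"Jan": 120.0, "Feb": 90.0, "Mar": 150.0}

# Running total (cumulation): each month carries the sum of all months so far.
running_total = dict(zip(monthly_sales, accumulate(monthly_sales.values())))

# Ranking: months ordered from best to worst by sales value.
ranked_months = sorted(monthly_sales, key=monthly_sales.get, reverse=True)
```

Against a highly normalised transaction schema, the same results would first need several joins to assemble the monthly figures, which is part of what makes such queries expensive.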

DSS - a definition
Decision Support Systems use computers to
facilitate the decision making process of semi-structured
tasks. These systems are designed not
to replace managerial judgement but to support
it and make the decisions more effective. DSS
helps managers react quickly to changing needs.
- W H Inmon


DSS - a paradigm for analysis
Transactional Applications                  DSS Applications

Operational                                 Analytical
Run the Business                            Gather strategic information
Long development cycles                     Constant prototype mode
Detailed data                               Detailed and summarized data
No redundancy                               Redundancy allowed
Data is normally updated                    Data is normally loaded
Amount of data used in a process is small   Amount of data used in a process is large
Serves the clerical community               Serves the managerial community

- W H Inmon


DSS architecture and its components
Data warehouse architecture
Data extraction, transformation and loading
Data access and analysis.


DSS Architecture
[Diagram. Components shown: External Sources; Data Sources (Operational Systems); Data Extraction, Scrubbing, & Transformation; Data Warehouse; Data Access; Data Mining, EIS, OR, OLAP, AI.]

...DSS Architecture
The DSS Architecture consists of...
Extraction of data from various operational systems
on different platforms, then transforming and
loading to the Data Warehouse
The Data Warehouse contains historical data as well
as current data.
The data in the Data Warehouse is accessed by the
front-end tools

Data Warehouse - the heart of DSS
The Data Warehouse is that portion of an
overall architected data environment that
serves as the single integrated source of
data for decision support systems
- W H Inmon


Data Warehouse :
Another definition

A Data Warehouse is a
subject-oriented,
integrated,
time variant and
non-volatile
collection of data in support of
management's decision-making process.
- W H Inmon

Characteristics of a
Data Warehouse
The DW provides access to
corporate / organizational data
The data in the DW is consistent
The data in the DW can be separated and combined by
means of every possible measure in the business
The DW is where data is published


DW Data Model Characteristics

Data centric not process based


Simple to understand
Flexible to add/modify
Design reflects business information
Query driven design
Denormalised
Intuitive and easy to use

The Dimensional Model


The CEO of a company says: "We sell products in various markets and we
measure our performance over time."
Dimensions: Time, Market, Product

The Dimensional Model


Each Cell in the cube contains business
measures for a particular combination of
Product, Market and Time
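
A minimal sketch of the cube idea, assuming a single measure (sales revenue) and hypothetical sample rows: each cell is addressed by one (product, market, time) combination and holds the measure for that combination.

```python
from collections import defaultdict

# Hypothetical detail rows: (product, market, quarter, sales_revenue).
rows = [
    ("Emerald", "North", "1Q96", 120.0),
    ("Emerald", "North", "1Q96", 30.0),
    ("Ruby", "East", "4Q95", 75.0),
]

# The cube maps each (product, market, time) combination to its measure.
cube = defaultdict(float)
for product, market, quarter, revenue in rows:
    cube[(product, market, quarter)] += revenue

# Look up one cell of the cube.
north_emerald_1q96 = cube[("Emerald", "North", "1Q96")]
```

A real multidimensional store would hold several measures per cell and pre-computed aggregates; the dictionary here only illustrates how a cell is addressed.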

Other Names
Star Join Schema, Star Schema,
Multidimensional model

DW Data Modelling - Multidimensionality : An example
[Cube diagram. Time axis: 2Q95, 3Q95, 4Q95, 1Q96, 2Q96. Product axis: Ruby, Emerald, Saffire. Region axis: East, West, North. Highlighted cell: SalesRevenue for North, Emerald, 1Q96.]

DW Data Modelling - Star Schema
Fact table: Sales
Dimension tables: Time, Product, Region, Customer

Star Schema Features


One large central table called Fact Table
A number of attendant tables, called Dimension Tables,
each attached to the Fact Table by a single join
A Time Dimension
Fact Table Primary key is a composite key
of foreign keys
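
The features above can be sketched with an in-memory SQLite database. The table and column names (sales_fact, time_dim, product_dim, region_dim, revenue) and the sample rows are assumptions made for illustration; the slides do not prescribe a schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_dim    (time_id INTEGER PRIMARY KEY, quarter TEXT);
CREATE TABLE product_dim (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE region_dim  (region_id INTEGER PRIMARY KEY, name TEXT);

-- Fact table: its primary key is a composite of the dimension foreign keys.
CREATE TABLE sales_fact (
    time_id    INTEGER REFERENCES time_dim(time_id),
    product_id INTEGER REFERENCES product_dim(product_id),
    region_id  INTEGER REFERENCES region_dim(region_id),
    revenue    REAL,
    PRIMARY KEY (time_id, product_id, region_id)
);

INSERT INTO time_dim VALUES (1, '4Q95'), (2, '1Q96');
INSERT INTO product_dim VALUES (1, 'Ruby'), (2, 'Emerald');
INSERT INTO region_dim VALUES (1, 'North'), (2, 'East');
INSERT INTO sales_fact VALUES (1, 1, 1, 100.0), (2, 2, 1, 150.0), (2, 1, 2, 80.0);
""")

# A dimension attaches to the fact table through a single join.
query = """
SELECT r.name, SUM(f.revenue)
FROM sales_fact f
JOIN region_dim r ON r.region_id = f.region_id
GROUP BY r.name
ORDER BY r.name
"""
revenue_by_region = conn.execute(query).fetchall()
```

Because every dimension is one join away from the fact table, a query by region, product or time follows the same shape, which is what keeps ad-hoc queries simple.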

Snowflake Design
Snowflake refers to normalising dimension
tables
Creating Outrigger tables
containing descriptions of codes in the dimension table
containing additional attributes
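
A minimal sketch of snowflaking, assuming a hypothetical store dimension whose region code is described in a separate outrigger table. All table names, column names and rows are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Outrigger table: descriptions of the region codes used by the dimension.
CREATE TABLE region_outrigger (region_code TEXT PRIMARY KEY, description TEXT);

-- Normalised dimension: it stores only the code, not the description.
CREATE TABLE store_dim (
    store_id    INTEGER PRIMARY KEY,
    store_name  TEXT,
    region_code TEXT REFERENCES region_outrigger(region_code)
);

INSERT INTO region_outrigger VALUES ('N', 'North'), ('E', 'East');
INSERT INTO store_dim VALUES (1, 'Store A', 'N'), (2, 'Store B', 'E');
""")

# Reading the description now costs one extra join beyond the dimension itself.
stores = conn.execute("""
SELECT s.store_name, o.description
FROM store_dim s
JOIN region_outrigger o ON o.region_code = s.region_code
ORDER BY s.store_id
""").fetchall()
```

The trade-off is less redundancy in the dimension at the price of an additional join per outrigger.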

Snowflake Schema Example
[Diagram. Tables shown: Sales Fact, Supplier, Store, District, Location, Region, Product, Sales Assoc., Seller, Sales Dept., Time, Day, Month, Season.]

Data Modelling Steps


Starting point - End users and source data
Identify a subject area
Find out the facts
Associate the facts with the business dimensions
Define the attributes in the dimensions
Decide on the level of detail - the granularity
Decide the summarisation and purge periods

Data extraction, transformation and upload
[Diagram: Operational Systems -> Data Extraction -> Data Transformation -> Data Upload -> Data Warehouse -> EIS]

Data Extraction
The Extract program

Rummages through a file or database


Uses some criteria for selection
Identifies qualified data and
Transports the data over onto another
file or database.
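
The behaviour of an extract program can be sketched as follows. The selection criterion (rows changed on or after a cut-off date) and the field names are assumptions for illustration; a real extract would read from a file or database rather than a list.

```python
from datetime import date

def extract(source_rows, since):
    """Select rows changed on or after `since` and copy them to a new list."""
    return [dict(row) for row in source_rows if row["updated"] >= since]

# Hypothetical operational records.
source = [
    {"order_id": 1, "amount": 250.0, "updated": date(1996, 1, 15)},
    {"order_id": 2, "amount": 90.0, "updated": date(1995, 11, 2)},
]

# Only qualified rows are transported to the staging area.
staging = extract(source, since=date(1996, 1, 1))
```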


Data Extraction Cleanup

Restructuring of records or fields


Removal of Operational-only data
Supply of missing field values
Data Integrity checks
Data Consistency and Range checks, etc...
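
Some of the cleanup steps above can be sketched as one function. The field names (terminal_id, region, quantity), the default value and the permitted range are hypothetical choices made for the example.

```python
def clean(record, default_region="UNKNOWN", max_quantity=10_000):
    """Return a cleaned copy of `record`, or None if it fails the range check."""
    # Removal of operational-only data: drop a field the warehouse does not need.
    cleaned = {k: v for k, v in record.items() if k != "terminal_id"}
    # Supply of missing field values.
    if not cleaned.get("region"):
        cleaned["region"] = default_region
    # Range check on the quantity.
    qty = cleaned.get("quantity")
    if qty is None or not (0 <= qty <= max_quantity):
        return None
    return cleaned

good = clean({"order_id": 1, "terminal_id": "T9", "region": "", "quantity": 5})
bad = clean({"order_id": 2, "region": "East", "quantity": -3})
```

Whether a failing record is rejected, corrected or routed to an error file is a design decision; rejecting it here is only the simplest option to show.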


Data transformation
Integrating dissimilar data types
Changing codes
Adding a time attribute
Summarising data
Calculating derived values
Denormalising data
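
Several of these transformations can be sketched together: changing codes, adding a time attribute, calculating a derived value and summarising. The code mapping, field names and sample rows are assumptions for illustration only.

```python
from collections import defaultdict
from datetime import date

# Hypothetical mapping from the codes of different source systems to one standard value.
REGION_CODES = {"N": "North", "NTH": "North", "E": "East", "EST": "East"}

def transform(rows, load_date):
    """Standardise codes, derive revenue and stamp each row with the load date."""
    return [
        {
            "region": REGION_CODES[row["region"]],           # changing codes
            "revenue": row["quantity"] * row["unit_price"],  # derived value
            "load_date": load_date,                          # added time attribute
        }
        for row in rows
    ]

def summarise(rows):
    """Total the revenue per region."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["revenue"]
    return dict(totals)

raw = [
    {"region": "N", "quantity": 2, "unit_price": 10.0},
    {"region": "NTH", "quantity": 1, "unit_price": 5.0},
    {"region": "E", "quantity": 3, "unit_price": 1.0},
]
transformed = transform(raw, load_date=date(1996, 3, 31))
summary = summarise(transformed)
```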


Data loading

Initial and incremental loading


Updating of metadata
Updating of the load log
Rollback in case of loading errors
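
A minimal sketch of a load step with a log update and rollback, using SQLite transactions. The tables sales_fact and load_log, their columns and the batch contents are hypothetical; the point is that the data rows and the log entry are committed together or not at all.

```python
import sqlite3

def load_batch(conn, batch_id, rows):
    """Load `rows` and record the batch in the log; roll back both on error."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.executemany(
                "INSERT INTO sales_fact (sale_id, revenue) VALUES (?, ?)", rows
            )
            conn.execute(
                "INSERT INTO load_log (batch_id, row_count) VALUES (?, ?)",
                (batch_id, len(rows)),
            )
        return True
    except sqlite3.Error:
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_fact (sale_id INTEGER PRIMARY KEY, revenue REAL NOT NULL)")
conn.execute("CREATE TABLE load_log (batch_id INTEGER PRIMARY KEY, row_count INTEGER)")

# Initial load.
initial_ok = load_batch(conn, 1, [(1, 100.0), (2, 50.0)])
# Incremental load that repeats sale_id 2, so the whole batch is rolled back.
incremental_ok = load_batch(conn, 2, [(3, 70.0), (2, 10.0)])
```

After the failed batch the warehouse still holds only the rows of the initial load, and the log has no entry for batch 2.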


ETL Tools

Informatica
Ardent DataStage
Oracle Warehouse Builder
Microsoft Data Transformation Services (DTS)


Data Access & Analysis

Ease of navigation across screen


Value addition by better information presentation
(graphs, charts and maps)
Highlighting exception information by
Alarms and Alerts
Drill-down / roll-up through successive levels of
data
What-if analysis


Reporting Tools

MicroStrategy
Business Objects
Cognos
Brio
Hyperion


What is OLAP?
OLAP (On-Line Analytical Processing) is a
technology which creates new business
information from existing data, through a rich
set of business transformations and numerical
calculations.


OLAP Characteristics
Always involves interactive query and analysis of the
data. The interaction is usually multiple passes
Involves drilling down into successively lower levels
of detail data
Involves roll-ups to higher levels of summarization
and aggregation.
Offers analytical modelling capabilities
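
Roll-up and drill-down can be sketched over a small set of detail rows. The data and the two-level hierarchy (region, then product) are assumptions for illustration; an OLAP tool would do the same over many dimensions and levels.

```python
from collections import defaultdict

# Hypothetical detail rows: (region, product, revenue).
sales = [
    ("North", "Ruby", 100.0),
    ("North", "Emerald", 150.0),
    ("East", "Ruby", 80.0),
]

def roll_up(rows, key_index):
    """Aggregate revenue to the level identified by `key_index`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key_index]] += row[-1]
    return dict(totals)

def drill_down(rows, region):
    """Break one region's total down into the next level of detail (product)."""
    return roll_up([row for row in rows if row[0] == region], key_index=1)

by_region = roll_up(sales, key_index=0)   # higher level of summarisation
north_detail = drill_down(sales, "North")  # lower level of detail
```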


Data marts
Architecture
Characteristics
Example


Data mart in the DSS Architecture
[Diagram: the DSS architecture with a Data Mart added. Components shown: External Sources; Data Sources (Operational Systems); Data Extraction, Scrubbing, & Transformation; Data Warehouse; Data Mart; Data Access; Data Mining, EIS, OR, OLAP, AI.]

What are Data marts ?


Data marts are scaled-down and less
expensive versions of data warehouses
Data marts utilize large-scale data
warehousing concepts on a smaller, more
focussed level
Data marts are focussed on departmental
users
Decentralised approach

What is data mining?


Data mining is the extraction of hidden
information from large databases. It is a
powerful new technology with great potential
to help companies focus on the most
important information in the data warehouse.


Data mining capabilities
Discovery of unknown patterns
Prediction of trends and behaviors
Discovery of anomalies in data
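
As one simple illustration of anomaly discovery, the sketch below flags values that lie far from the mean, measured in standard deviations (a z-score test). The slides do not name a technique; the z-score, the threshold of 2.0 and the sample figures are all assumptions chosen for the example.

```python
from statistics import mean, pstdev

def find_anomalies(values, threshold=2.0):
    """Return the values more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical daily sales figures with one unusual day.
daily_sales = [100, 102, 98, 101, 99, 103, 97, 500]
anomalies = find_anomalies(daily_sales)
```

Data mining products typically use far richer methods; this only shows the kind of question being asked of the data.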
