
Data Warehousing

Naveed Iqbal, Assistant Professor


FAST-NU, Islamabad
(Lecture Slides Week # 1)

WELCOME
 To this course – Data Warehousing
 Highly Demanding True Professional Course
 A Versatile Learning Opportunity
 Learning and Fun

Instructor Profile – Naveed Iqbal
 9+ years hands-on Versatile and Multidisciplinary Experience in
the areas of:
 Geographic Information Systems (GIS)
 Data Warehousing and Decision Support Systems
 IT Management / IT Service Management and Project Management
 Software Development, Implementation and Infrastructure
Management
 University Level Teaching
 Pioneering Career, Techno-Managerial skill-set and expertise.
 MS (CS) –FAST-NU, M.Sc (CS), M.Sc (Mathematics)
 Professional Courses from LUMS, IMS, NUST, COMSATS,
ORACLE UNIVERSITY and UK
 Certified ITIL / IT Service Management Professional

FAST-NU, Islamabad Data Warehousing - Fall 2010 3

Code of Conduct
 Regularity
 Attendance criteria as per university policy
 Punctuality
 No entry more than 5 minutes after class start time (N/A for habitual
latecomers)
 Discipline
 ABSOLUTELY NO COMPROMISE
 Positive Attitude
 High Level of Class Participation
 No Plagiarism, Cheating …
 No Change in Deadlines
 No Usage of Mobile / Other Devices
 …

Approach of the Course
 Develop an understanding of the underlying RDBMS
concepts.
 Apply these concepts to VLDB / DSS environments
and understand where and why they break down.
 Expose the differences between RDBMS and Data
Warehouse in the context of VLDB.
 Provide the basics of DSS tools such as OLAP, Data
Mining and demonstrate their applications.
 Demonstrate the application of DSS concepts and
limitations of the OLTP concepts through lab
exercises.

Summary of the Course
 Introduction & Background
 De-Normalization
 Online Analytical Processing (OLAP)
 Dimensional Modeling
 Extract-Transform-Load (ETL)
 Data Quality Management (DQM)
 Need for Speed (Parallelism, Join and Indexing Techniques)
 DWH Implementation Steps
 Complete Implementation Case Study
 Lab and Tool Usage
 …

Books

 Reference Books
 W. H. Inmon, Building the Data Warehouse,
John Wiley & Sons Inc., NY
 R. Kimball, The Data Warehouse Toolkit,
John Wiley & Sons Inc., NY
 A. Abdullah, “Data Warehousing for Beginners:
Concepts & Issues”.
 Paulraj Ponniah, Data Warehousing
Fundamentals, John Wiley & Sons Inc., NY
 ...

Course Execution Plan

 Lecturing / Discussions
 Assignments
 Case Studies
 Projects
 Marks Breakup:
   Mid-I:    15%     Quizzes:      5%
   Mid-II:   15%     Assignments:  7%
   Final:    35%     Case Study*:  8%
                     Project*:    15%
   * Mandatory (Missing means F)

Data Warehousing
(Introduction and Background)

Why this Course?

 The world is changing (in fact, it has already changed).
 Either change or be left behind.
 Missing opportunities or going in the
wrong direction has prevented us from
growing.
 What is the right direction?
 Harnessing data in the knowledge-driven
economy.
 Doing what cannot be automated, or is
difficult to automate.

Historical Overview

 1960: Master Files and Reports
 1965: Lots of Master Files
 1970: Direct Memory Access and DBMS
 1975: Online High Performance Transaction
Processing
 1980: PCs and 4GL Technology (MIS/DSS)
 1985: Extract Programs, Extract Processing
 1990: The Legacy System’s Web

The Need of the Time

 Drowning in data AND/BUT starving for
information.

 Knowledge is power BUT Intelligence is
absolute/super power.

The Need of the Time
[Diagram: pyramid rising from Data through Information, Knowledge and
Intelligence to POWER ($/£).]

Scenario 1

ABC Pvt Ltd is a company with branches in
Karachi, Quetta, Peshawar and Lahore. The Sales
Manager wants a quarterly sales report. Each
branch has a separate operational system.

Scenario 1 : ABC Pvt Ltd.

[Diagram: the separate Karachi, Quetta, Peshawar and Lahore branch systems on
one side; on the other, the Sales Manager, who needs sales per item type per
branch for the first quarter.]

Solution 1: ABC Pvt Ltd.

 Extract sales information from each database.
 Store the information in a common repository
at a single site.
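
The two steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the in-memory SQLite databases stand in for the four branch systems, and the table layout, item types and amounts are made-up assumptions, not part of the scenario.

```python
import sqlite3

# Hypothetical branch data: (sale_date, item_type, amount). Invented for illustration.
BRANCH_SALES = {
    "Karachi": [("2010-01-15", "Dairy", 120.0), ("2010-02-03", "Bread", 80.0)],
    "Quetta": [("2010-01-20", "Dairy", 60.0), ("2010-04-11", "Meat", 95.0)],
    "Peshawar": [("2010-03-05", "Bread", 40.0)],
    "Lahore": [("2010-02-17", "Meat", 150.0), ("2010-03-28", "Dairy", 70.0)],
}


def make_branch_db(rows):
    """Simulate one branch's separate operational system."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE sales (sale_date TEXT, item_type TEXT, amount REAL)")
    db.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    return db


def build_repository(branch_dbs):
    """Extract sales from every branch and store them at a single site."""
    repo = sqlite3.connect(":memory:")
    repo.execute(
        "CREATE TABLE sales (branch TEXT, sale_date TEXT, item_type TEXT, amount REAL)"
    )
    for branch, db in branch_dbs.items():
        rows = db.execute("SELECT sale_date, item_type, amount FROM sales").fetchall()
        # Tag every extracted row with the branch it came from.
        repo.executemany(
            "INSERT INTO sales VALUES (?, ?, ?, ?)",
            [(branch, *row) for row in rows],
        )
    return repo


def first_quarter_report(repo, year="2010"):
    """Sales per item type per branch for the first quarter of `year`."""
    return repo.execute(
        """
        SELECT branch, item_type, SUM(amount)
        FROM sales
        WHERE sale_date >= ? AND sale_date < ?
        GROUP BY branch, item_type
        ORDER BY branch, item_type
        """,
        (f"{year}-01-01", f"{year}-04-01"),
    ).fetchall()


branch_dbs = {name: make_branch_db(rows) for name, rows in BRANCH_SALES.items()}
repository = build_repository(branch_dbs)
report = first_quarter_report(repository)
for branch, item_type, total in report:
    print(f"{branch:10s} {item_type:6s} {total:8.2f}")
```

Once the rows sit in one repository, the quarterly report is a single query. Without it, the Sales Manager would have to run and merge four separate reports.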

Solution 1: ABC Pvt Ltd.

[Diagram: data from the Karachi, Quetta, Peshawar and Lahore branches flows
into a Data Warehouse; Query & Analysis tools produce the report for the
Sales Manager.]

Scenario 2

One Stop Shopping Super Market has a huge
operational database. Whenever executives want
a report, the OLTP system becomes slow and
data entry operators have to wait for some time.

Scenario 2 : One Stop Shopping

[Diagram: two Data Entry Operators and Management share one Operational
Database; the operators wait while a management report is produced.]

Solution 2

 Extract the data needed for analysis from the
operational database.
 Store it in the warehouse.
 Refresh the warehouse at regular intervals so that it
contains up-to-date information for analysis.
 The warehouse will contain data with a historical
perspective.
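
A minimal sketch of such a periodic refresh is shown here, assuming a hypothetical `orders` table in the operational database and an `order_history` table in the warehouse (both names are invented for illustration). It uses the highest order id already loaded as a simple watermark, so it only picks up newly inserted rows. Handling updated or deleted rows would need a different change-detection scheme.

```python
import sqlite3

# Hypothetical operational (OLTP) table and warehouse table, both in memory.
oltp = sqlite3.connect(":memory:")
oltp.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, item TEXT, qty INTEGER)")

warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE order_history (order_id INTEGER, item TEXT, qty INTEGER, loaded_on TEXT)"
)


def refresh(oltp_db, dwh_db, load_date):
    """Copy rows added since the previous refresh into the warehouse.

    The highest order_id already loaded acts as a watermark, so each refresh
    reads only new rows and never rewrites the history already stored.
    """
    (last_id,) = dwh_db.execute(
        "SELECT COALESCE(MAX(order_id), 0) FROM order_history"
    ).fetchone()
    new_rows = oltp_db.execute(
        "SELECT order_id, item, qty FROM orders WHERE order_id > ? ORDER BY order_id",
        (last_id,),
    ).fetchall()
    dwh_db.executemany(
        "INSERT INTO order_history VALUES (?, ?, ?, ?)",
        [(*row, load_date) for row in new_rows],
    )
    dwh_db.commit()
    return len(new_rows)


# Day 1: operators enter two orders, then the scheduled refresh runs.
oltp.executemany("INSERT INTO orders (item, qty) VALUES (?, ?)", [("milk", 2), ("bread", 1)])
first_load = refresh(oltp, warehouse, "2010-09-01")

# Day 2: one more order arrives; only that row is extracted.
oltp.execute("INSERT INTO orders (item, qty) VALUES (?, ?)", ("eggs", 12))
second_load = refresh(oltp, warehouse, "2010-09-02")

# Management reports read the warehouse, not the operational database.
history = warehouse.execute(
    "SELECT loaded_on, COUNT(*) FROM order_history GROUP BY loaded_on ORDER BY loaded_on"
).fetchall()
print(first_load, second_load, history)
```

Because reports query `order_history`, data entry operators are not slowed down by reporting, and the `loaded_on` column keeps the historical perspective.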

Solution 2

[Diagram: Data Entry Operators send transactions to the operational database;
data is extracted from it into the Data Warehouse, which supplies the
Manager's reports.]

Scenario 3

Cakes & Cookies is a small, new company. The President
of the company wants the company to grow. He
needs information so that he can make correct
decisions.

Solution 3

 Improve the quality of data before
loading it into the warehouse.
 Perform data cleaning and
transformation before loading the data.
 Use query analysis tools to support
ad hoc queries.
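
As a rough illustration of cleaning and transformation before loading, the sketch below standardises text fields, converts dates to one format, rejects invalid rows and drops duplicates. The raw rows, the accepted date formats and the rule that identical rows are duplicates are all assumptions made for this example.

```python
from datetime import datetime

# Hypothetical raw rows as they might arrive from a small company's records.
RAW_ROWS = [
    {"product": "  swiss rolls ", "city": "lahore", "date": "05/01/2010", "units": "4"},
    {"product": "Cheese", "city": "KARACHI", "date": "2010-01-12", "units": "3"},
    {"product": "Cheese", "city": "KARACHI", "date": "2010-01-12", "units": "3"},
    {"product": "Wheat Bread", "city": "Lahore", "date": "2010-03-09", "units": "-2"},
    {"product": "", "city": "Karachi", "date": "2010-02-01", "units": "5"},
]

DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")  # assumed input formats


def parse_date(text):
    """Return an ISO date string, or None if no assumed format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def clean(rows):
    """Standardise, validate and de-duplicate rows; return (clean, rejected)."""
    seen, good, rejected = set(), [], []
    for row in rows:
        # Transformation: trim whitespace and use one capitalisation style.
        product = " ".join(row["product"].split()).title()
        city = row["city"].strip().title()
        date = parse_date(row["date"].strip())
        try:
            units = int(row["units"])
        except ValueError:
            units = None
        # Validation: keep bad rows aside for review instead of loading them.
        if not product or not city or date is None or units is None or units <= 0:
            rejected.append(row)
            continue
        key = (product, city, date, units)
        if key in seen:
            continue  # assumed rule: identical rows are duplicates
        seen.add(key)
        good.append({"product": product, "city": city, "date": date, "units": units})
    return good, rejected


clean_rows, rejected_rows = clean(RAW_ROWS)
for row in clean_rows:
    print(row)
print("rejected:", len(rejected_rows))
```

Only `clean_rows` would be loaded into the warehouse; the rejected rows stay available for correction at the source.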

Solution 3

[Diagram: the President uses a Query and Analysis tool on the Data Warehouse;
a plot of sales against time supports decisions on Expansion and
Improvement.]

Case Study

 AFCO Foods & Beverages is a new company
that produces dairy, bread and meat
products, with its production unit located in
Gujranwala.
 Its products are sold in all regions of
Pakistan.
 It has sales units at the provincial
headquarters.
 The President of the company wants sales
information.

Sales Information

Report: The number of units sold.

113

Report: The number of units sold over time

January  February  March  April
     14        41     33     25

Sales Information

Report: The number of items sold for each product over time

Product       Jan  Feb  Mar  Apr
Wheat Bread     -    -    6   17
Cheese          6   16    6    8
Swiss Rolls     8   25   21    -

Dimensions: Product (rows), Time (columns)

Sales Information

Report: The number of items sold in each city for each product over time

City     Product       Jan  Feb  Mar  Apr
Karachi  Wheat Bread     -    -    3   10
         Cheese          3   16    6    -
         Swiss Rolls     4   16    6    -
Lahore   Wheat Bread     -    -    3    7
         Cheese          3    -    -    8
         Swiss Rolls     4    9   15    -

Dimensions: City, Product, Time

Sales Information

Report: The number of items sold and income in each region for
each product over time (Rs = income in rupees, U = units sold).

                             Jan        Feb        Mar        Apr
City     Product          Rs   U     Rs   U     Rs   U     Rs   U
Karachi  Wheat Bread       -   -      -   -   7.44   3  24.80  10
         Cheese         7.95   3  42.40  16  15.90   6      -   -
         Swiss Rolls    7.32   4  29.98  16  10.98   6      -   -
Lahore   Wheat Bread       -   -      -   -   7.44   3  17.36   7
         Cheese         7.95   3      -   -      -   -  21.20   8
         Swiss Rolls    7.32   4  16.47   9  27.45  15      -   -
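
The sequence of reports above moves from one grand total to finer and finer detail. The sketch below shows how all of them could be derived from the same detailed rows by summing over the dimensions that a report leaves out. The unit values come from the reports; the month assigned to each value is inferred from the monthly totals, and the `roll_up` helper is a hypothetical name used only for this illustration.

```python
from collections import defaultdict

# Unit sales as (city, product, month, units). Values are taken from the
# reports; the month of each value is inferred from the monthly totals.
FACTS = [
    ("Karachi", "Wheat Bread", "Mar", 3), ("Karachi", "Wheat Bread", "Apr", 10),
    ("Karachi", "Cheese", "Jan", 3), ("Karachi", "Cheese", "Feb", 16),
    ("Karachi", "Cheese", "Mar", 6),
    ("Karachi", "Swiss Rolls", "Jan", 4), ("Karachi", "Swiss Rolls", "Feb", 16),
    ("Karachi", "Swiss Rolls", "Mar", 6),
    ("Lahore", "Wheat Bread", "Mar", 3), ("Lahore", "Wheat Bread", "Apr", 7),
    ("Lahore", "Cheese", "Jan", 3), ("Lahore", "Cheese", "Apr", 8),
    ("Lahore", "Swiss Rolls", "Jan", 4), ("Lahore", "Swiss Rolls", "Feb", 9),
    ("Lahore", "Swiss Rolls", "Mar", 15),
]
DIMENSIONS = ("city", "product", "month")


def roll_up(facts, keep):
    """Sum units over every dimension that is not listed in `keep`."""
    idx = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for *dims, units in facts:
        totals[tuple(dims[i] for i in idx)] += units
    return dict(totals)


grand_total = roll_up(FACTS, ())                     # units sold
by_month = roll_up(FACTS, ("month",))                # units sold over time
by_product_month = roll_up(FACTS, ("product", "month"))
by_city_product_month = roll_up(FACTS, DIMENSIONS)   # most detailed report

print(grand_total)
print(by_month)
```

Each report is the same data viewed at a different level of detail, which is the idea that OLAP tools build on.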

Data Warehousing includes

 Build Data Warehouse
 Online Analysis/Analytical Processing (OLAP).
 Presentation.

[Diagram: RDBMS and flat-file sources pass through cleaning, selection and
integration into the Warehouse & OLAP server, which serves the presentation
client.]
