
DATA WAREHOUSING

IT PRESENTATION

Manpreet Singh (2K19/DMBA/48)


Manish Jamwal (2K19/DMBA/47)
Jigyasa Rawat (2K19/DMBA/41)
Rishabh Kelkar (2K19/DMBA/80)
WHAT IS A DATA WAREHOUSE?

• Data warehousing (DW) is the process of collecting and managing data from varied sources to provide meaningful
business insights.
• A data warehouse is typically used to connect and analyze business data from heterogeneous sources.
• It is a blend of technologies and components that aids the strategic use of data.
• It is the electronic storage of a large amount of business information, designed for query and analysis rather than
transaction processing.
OTHER NAMES FOR DATA WAREHOUSE
HISTORY OF DATA WAREHOUSE

• 1960s: Dartmouth and General Mills, in a joint research project, develop the terms "dimensions" and "facts".
• 1970s: ACNielsen and IRI introduce dimensional data marts for retail sales.
• 1983: Teradata Corporation introduces a database management system specifically designed for
decision support.
• Late 1980s: Data warehousing begins when IBM workers Paul Murphy and Barry Devlin develop the
Business Data Warehouse.
• However, the concept as it is known today was set out by Bill Inmon, who is widely regarded as the father of the
data warehouse. He has written on a variety of topics covering the building, usage, and maintenance of the
warehouse and the Corporate Information Factory.
WHO NEEDS A DATA WAREHOUSE?

• Decision makers who rely on large amounts of data.
• Users who use customized, complex processes to obtain information from multiple data sources.
• Users who need fast performance on huge amounts of data for reports, grids, or charts.
• Anyone who wants to discover "hidden patterns" in data flows and groupings; a data warehouse is the first step.
• People who want a systematic approach to making decisions.
APPLICATIONS OF DATA WAREHOUSE - I

• Airline:
In airline systems, a data warehouse is used for operational purposes such as crew assignment, analysis of route
profitability, frequent flyer program promotions, etc.
• Healthcare:
The healthcare sector uses data warehouses to strategize and predict outcomes, generate patients' treatment
reports, and share data with tie-in insurance companies, medical aid services, etc.
• Retail chain:
In retail chains, data warehouses are widely used for distribution and marketing. They help track items,
customer buying patterns, and promotions, and are also used to determine pricing policy.
APPLICATIONS OF DATA WAREHOUSE - II

• Telecommunication:
In this sector, a data warehouse is used for product promotions, sales decisions, and distribution
decisions.
• Public sector:
In the public sector, a data warehouse is used for intelligence gathering. It helps government agencies
maintain and analyze tax records and health policy records for every individual.
BEST PRACTICES TO IMPLEMENT A DATA WAREHOUSE

• Decide on a plan to test the consistency, accuracy, and integrity of the data.
• The data warehouse must be well integrated, well defined, and time stamped.
• While designing the data warehouse, use the right tool, stick to the life cycle, and take care of data conflicts.
• Involve all stakeholders, including business personnel, in the implementation process. Establish that data
warehousing is a joint team project; you don't want to create a data warehouse that is not useful to the end users.
• Prepare a training plan for the end users.
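A test plan for consistency, accuracy, and integrity can start as a handful of automated checks. The sketch below is only illustrative: the row layout (`sale_id`, `product_id`, `amount`), the function name `find_quality_problems`, and the sample data are assumptions, not part of any particular tool.

```python
def find_quality_problems(sales_rows, known_product_ids):
    """Return human-readable descriptions of problems found in the rows."""
    problems = []
    seen_sale_ids = set()
    for row in sales_rows:
        sale_id = row["sale_id"]
        # Consistency: each sale should appear exactly once.
        if sale_id in seen_sale_ids:
            problems.append(f"duplicate sale_id {sale_id}")
        seen_sale_ids.add(sale_id)
        # Integrity: every sale must reference a known product.
        if row["product_id"] not in known_product_ids:
            problems.append(f"sale {sale_id}: unknown product_id {row['product_id']}")
        # Accuracy: amounts must be present and non-negative.
        amount = row["amount"]
        if amount is None or amount < 0:
            problems.append(f"sale {sale_id}: invalid amount {amount}")
    return problems


# Hypothetical sample data with one duplicate and one bad row.
SAMPLE_ROWS = [
    {"sale_id": 1, "product_id": "P1", "amount": 10.0},
    {"sale_id": 1, "product_id": "P1", "amount": 10.0},
    {"sale_id": 2, "product_id": "P9", "amount": -5.0},
]
problems = find_quality_problems(SAMPLE_ROWS, known_product_ids={"P1", "P2"})
```

Running checks like these on every load, rather than once, is what turns them into a plan.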
CHARACTERISTICS OF DATA WAREHOUSE

Subject-Oriented
Integrated
Time-variant
Non-volatile
• Subject-Oriented
A data warehouse is subject-oriented because it offers information about a theme rather than a company's ongoing
operations. These subjects can be sales, marketing, distribution, etc.

• Integrated
In a data warehouse, integration means establishing a common unit of measure for all similar data from
dissimilar databases. The data also needs to be stored in the data warehouse in a common and universally acceptable
manner.
• Time-Variant
The time horizon of a data warehouse is quite extensive compared with operational systems. The data collected in a
data warehouse is identified with a particular period and offers information from a historical point of view. It
contains an element of time, explicitly or implicitly.

• Non-volatile
A data warehouse is also non-volatile, meaning that previous data is not erased when new data is entered.
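Three of these characteristics can be illustrated in a few lines. The sketch below is a toy, assuming a hypothetical product-weight feed that arrives in kilograms from one source and pounds from another; the class name `WarehouseTable` and its methods are made up for illustration.

```python
from datetime import date

# Conversion factors to the common unit (kilograms).
KG_PER_UNIT = {"kg": 1.0, "lb": 0.45359237}


class WarehouseTable:
    """Toy table: rows are appended with a date and never updated or deleted."""

    def __init__(self):
        self._rows = []

    def load(self, product, weight, unit, as_of):
        # Integrated: convert every source unit to kilograms before storing.
        weight_kg = round(weight * KG_PER_UNIT[unit], 3)
        # Time-variant and non-volatile: append a dated row, never overwrite.
        self._rows.append({"product": product, "weight_kg": weight_kg, "as_of": as_of})

    def history(self, product):
        return [r for r in self._rows if r["product"] == product]

    def value_as_of(self, product, when):
        """Return the value that was current on the given date, or None."""
        versions = [r for r in self.history(product) if r["as_of"] <= when]
        if not versions:
            return None
        return max(versions, key=lambda r: r["as_of"])["weight_kg"]


table = WarehouseTable()
table.load("P1", 2.0, "kg", date(2020, 1, 1))
table.load("P1", 4.4, "lb", date(2020, 6, 1))
```

Because the second load does not erase the first, `table.value_as_of("P1", date(2020, 3, 1))` can still answer a question about the past.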
DATA WAREHOUSE ARCHITECTURE

Data warehouse architecture is complex because it is an information system that contains historical and
cumulative data from multiple sources.
There are three approaches to constructing a data warehouse:
• Single-tier architecture
The objective of a single layer is to minimize the amount of data stored by removing data redundancy. This
architecture is not frequently used in practice.
DATA WAREHOUSE ARCHITECTURE - 2 TIER

• Two-tier architecture
Two-layer architecture physically separates the available sources from the data warehouse. This architecture is not
expandable and does not support a large number of end users. It also has connectivity problems because of network
limitations.
DATA WAREHOUSE ARCHITECTURE - 3 TIER

• Three-tier architecture
This is the most widely used architecture.
It consists of the top, middle, and bottom tiers.
Bottom tier: The database of the data warehouse serves as the bottom tier. It is usually a relational database system.
Data is cleansed, transformed, and loaded into this layer using back-end tools.
Middle tier: The middle tier is an OLAP server, implemented using either the ROLAP or
MOLAP model. For a user, this application tier presents an abstracted view of the database.
Top tier: The top tier is a front-end client layer. It holds the tools and APIs that you use to connect to the data
warehouse and get data out of it. These can be query tools, reporting tools, managed query tools, analysis tools, and
data mining tools.
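The three tiers can be sketched in miniature. This is an illustration under stated assumptions, not a production design: an in-memory SQLite database stands in for the warehouse database, and the `sales` table, `total_by`, and `report` are hypothetical names.

```python
import sqlite3

# Bottom tier: a relational store (in-memory SQLite stands in for the warehouse).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("North", "A", 100.0), ("North", "B", 50.0), ("South", "A", 70.0)],
)

# Middle tier: a ROLAP-style layer that turns an analytical request into SQL.
ALLOWED_DIMENSIONS = {"region", "product"}


def total_by(dimension):
    """Return total sales amount per value of the requested dimension."""
    if dimension not in ALLOWED_DIMENSIONS:
        raise ValueError(f"unknown dimension: {dimension}")
    sql = (
        f"SELECT {dimension}, SUM(amount) FROM sales "
        f"GROUP BY {dimension} ORDER BY {dimension}"
    )
    return dict(conn.execute(sql).fetchall())


# Top tier: a front-end tool that formats the result for the user.
def report(dimension):
    return "\n".join(f"{key}: {total:.2f}" for key, total in total_by(dimension).items())
```

The front end never writes SQL itself; it only asks the middle tier for a dimension, which is the "abstracted view" the middle tier provides.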
METADATA

• Metadata is data about data that defines the data warehouse. It is used for building, maintaining, and managing
the data warehouse.
For example, a line in a sales database may contain:
4030 KJ732 299.90
This data is meaningless until we consult the metadata, which tells us that it is:
Model number: 4030
Sales agent ID: KJ732
Total sales amount: $299.90
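The same example can be expressed as a short sketch in which the metadata is an explicit list of field names and types. The names `SALES_RECORD_METADATA` and `decode` are hypothetical, and the record is assumed to be whitespace-separated.

```python
# Hypothetical metadata: the name and type of each position in the record.
SALES_RECORD_METADATA = [
    ("model_number", int),
    ("sales_agent_id", str),
    ("total_sales_amount_usd", float),
]


def decode(raw_line, metadata):
    """Attach names and types from the metadata to a whitespace-separated record."""
    values = raw_line.split()
    if len(values) != len(metadata):
        raise ValueError("record does not match metadata")
    return {name: cast(value) for (name, cast), value in zip(metadata, values)}


record = decode("4030 KJ732 299.90", SALES_RECORD_METADATA)
```

Without `SALES_RECORD_METADATA`, the three values are just strings; with it, each value has a name and a type.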
Metadata can be classified into the following categories:

• Technical metadata: Information about the warehouse that is used by data
warehouse designers and administrators.

• Business metadata: Details that give end users an easy way to understand the
information stored in the data warehouse.
DATA WAREHOUSE VS. OPERATIONAL DBMS

• OLTP (on‐line transaction processing)


– Major task of traditional relational DBMS
– Day‐to‐day operations: purchasing, inventory, banking, manufacturing,
payroll, registration, accounting, etc.
– Fast response time

• OLAP (on‐line analytical processing)


– Major task of data warehouse system
– Data analysis and decision making
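The contrast can be sketched with one small table. The example assumes an in-memory SQLite database and a hypothetical `orders` table; `place_order` stands for OLTP-style work and `revenue_by_month` for OLAP-style work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, month TEXT, amount REAL)"
)


def place_order(customer, month, amount):
    """OLTP-style work: one short transaction that writes a single row."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "INSERT INTO orders (customer, month, amount) VALUES (?, ?, ?)",
            (customer, month, amount),
        )
    return cursor.lastrowid


def revenue_by_month():
    """OLAP-style work: a read-only aggregate over many rows."""
    rows = conn.execute(
        "SELECT month, SUM(amount) FROM orders GROUP BY month ORDER BY month"
    )
    return dict(rows.fetchall())


place_order("alice", "2020-01", 20.0)
place_order("bob", "2020-01", 30.0)
place_order("alice", "2020-02", 15.0)
```

In practice the two workloads usually run on separate systems, so that long analytical scans do not slow down day-to-day transactions.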
OLTP VS OLAP
TYPES OF OLAP SERVERS

• Relational OLAP (ROLAP)
• Multidimensional OLAP (MOLAP)
• Hybrid OLAP (HOLAP)
• Specialized SQL servers
OLAP OPERATIONS

The OLAP operations are:

• Roll-up
• Drill-down
• Slice and dice
• Pivot (rotate)
ROLL UP

Roll-up performs aggregation on a data cube in either of the following ways:
• By climbing up a concept hierarchy for a dimension
• By dimension reduction
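Both forms of roll-up can be sketched on a tiny cube stored as a dictionary. The cube contents, the city-to-country hierarchy, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical cube cells: (city, quarter, product) -> units sold.
CUBE = {
    ("Delhi", "Q1", "Phone"): 10,
    ("Delhi", "Q2", "Phone"): 20,
    ("Mumbai", "Q1", "Phone"): 5,
    ("Toronto", "Q1", "Laptop"): 7,
}
# Hypothetical concept hierarchy for the location dimension: city -> country.
CITY_TO_COUNTRY = {"Delhi": "India", "Mumbai": "India", "Toronto": "Canada"}


def roll_up_location(cube):
    """Climb the hierarchy: aggregate cities into countries."""
    totals = defaultdict(int)
    for (city, quarter, product), units in cube.items():
        totals[(CITY_TO_COUNTRY[city], quarter, product)] += units
    return dict(totals)


def roll_up_drop_quarter(cube):
    """Dimension reduction: aggregate away the time dimension."""
    totals = defaultdict(int)
    for (city, _quarter, product), units in cube.items():
        totals[(city, product)] += units
    return dict(totals)
```

In both cases the result has fewer, coarser cells, and the total number of units is unchanged.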
DRILL DOWN

Drill-down is the reverse operation of roll-up. It is performed in either of the following ways:
• By stepping down a concept hierarchy for a dimension
• By introducing a new dimension
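Stepping down a hierarchy can be sketched as follows, assuming the detailed month-level facts are still available. The data, the month-to-quarter mapping, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical detailed facts at month level: (city, month) -> units sold.
MONTHLY = {
    ("Delhi", "Jan"): 3,
    ("Delhi", "Feb"): 4,
    ("Delhi", "Mar"): 3,
    ("Delhi", "Apr"): 8,
}
MONTH_TO_QUARTER = {"Jan": "Q1", "Feb": "Q1", "Mar": "Q1", "Apr": "Q2"}


def quarterly_view(facts):
    """Summary view: units per (city, quarter)."""
    totals = defaultdict(int)
    for (city, month), units in facts.items():
        totals[(city, MONTH_TO_QUARTER[month])] += units
    return dict(totals)


def drill_down(facts, city, quarter):
    """Step down the time hierarchy: show the months behind one quarterly cell."""
    return {
        month: units
        for (fact_city, month), units in facts.items()
        if fact_city == city and MONTH_TO_QUARTER[month] == quarter
    }
```

Drill-down only works if the finer-grained data was kept; a warehouse that stored only quarterly totals could not answer the monthly question.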
SLICE

• The slice operation selects a single value on one dimension of a given cube and provides a new sub-cube with
that dimension fixed.
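A slice can be sketched on a small dictionary-based cube. The cube contents and the function name `slice_cube` are hypothetical; here the time dimension is fixed to one quarter.

```python
# Hypothetical cube cells: (city, quarter, product) -> units sold.
CUBE = {
    ("Delhi", "Q1", "Phone"): 10,
    ("Delhi", "Q2", "Phone"): 20,
    ("Mumbai", "Q1", "Phone"): 5,
    ("Toronto", "Q1", "Laptop"): 7,
}


def slice_cube(cube, quarter):
    """Fix the time dimension to one quarter; the result has one fewer dimension."""
    return {
        (city, product): units
        for (city, cell_quarter, product), units in cube.items()
        if cell_quarter == quarter
    }


q1_slice = slice_cube(CUBE, "Q1")
```

The keys of `q1_slice` no longer carry the quarter, because every remaining cell belongs to the same one.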
DICE

• The dice operation applies selections on two or more dimensions of a given cube and provides a new sub-cube.
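A dice can be sketched the same way, this time filtering on two dimensions at once. The cube contents and the function name `dice` are hypothetical.

```python
# Hypothetical cube cells: (city, quarter, product) -> units sold.
CUBE = {
    ("Delhi", "Q1", "Phone"): 10,
    ("Delhi", "Q2", "Phone"): 20,
    ("Mumbai", "Q1", "Phone"): 5,
    ("Toronto", "Q1", "Laptop"): 7,
}


def dice(cube, cities, quarters):
    """Keep only cells whose city and quarter both fall in the selected sets."""
    return {
        cell: units
        for cell, units in cube.items()
        if cell[0] in cities and cell[1] in quarters
    }


sub_cube = dice(CUBE, cities={"Delhi", "Mumbai"}, quarters={"Q1"})
```

Unlike a slice, the result keeps all three dimensions; it is simply a smaller cube.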
PIVOT

• The pivot operation is also known as rotation. It rotates the data axes in view to provide an alternative
presentation of the data.
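For a two-dimensional view, a pivot amounts to swapping rows and columns. The sketch below assumes a hypothetical view stored as nested dictionaries, with cities as rows and quarters as columns.

```python
# Hypothetical 2-D view: rows are cities, columns are quarters.
VIEW = {
    "Delhi": {"Q1": 10, "Q2": 20},
    "Mumbai": {"Q1": 5, "Q2": 0},
}


def pivot(view):
    """Swap the row and column axes without changing any cell value."""
    rotated = {}
    for row_key, columns in view.items():
        for column_key, value in columns.items():
            rotated.setdefault(column_key, {})[row_key] = value
    return rotated


by_quarter = pivot(VIEW)
```

No value is aggregated or removed, so pivoting twice returns the original view.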
ADVANTAGES OF DATA WAREHOUSE

• A data warehouse saves time
• A data warehouse delivers enhanced business intelligence
• A data warehouse enhances data quality and consistency
• A data warehouse provides historical intelligence
DATA MART

A data mart is a subject-oriented database that is often a partitioned segment of an
enterprise data warehouse. The subset of data held in a data mart typically aligns with
a particular business unit like sales, finance, or marketing. Data marts accelerate
business processes by allowing access to relevant information in a data warehouse or
operational data store within days, as opposed to months or longer. Because a data
mart only contains the data applicable to a certain business area, it is a cost-effective
way to gain actionable insights quickly.
REASONS FOR CREATING DATA MARTS

1. Easy access to frequently required data
2. Improved end-user response time
3. Easy creation
4. Lower cost
5. Potential users are clearly defined
6. Easily leads you to more specific data
TYPES OF DATA MART

1. Dependent data mart:
• The data is extracted from the OLTP systems and then populated in the central data warehouse.
• From the data warehouse, the data then travels down to the data mart.

2. Independent data mart:
• The data is obtained directly from the source systems (OLTP).
• This is suitable for small organizations or smaller groups within an organization.

3. Hybrid data mart:
• The data is fed from both OLTP systems and the data warehouse.
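The dependent case can be sketched as a department-specific table filled from the central warehouse rather than from the source systems. The example assumes an in-memory SQLite database; the `warehouse_facts` table, the `mart_` naming scheme, and `build_dependent_mart` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in for the central warehouse, already loaded from the OLTP sources.
conn.execute("CREATE TABLE warehouse_facts (department TEXT, metric TEXT, value REAL)")
conn.executemany(
    "INSERT INTO warehouse_facts VALUES (?, ?, ?)",
    [("sales", "revenue", 500.0), ("finance", "cost", 300.0), ("sales", "orders", 42.0)],
)


def build_dependent_mart(department):
    """Populate a department-specific table from the warehouse, not from sources."""
    if not department.isalpha():
        raise ValueError("department must be alphabetic")
    table = f"mart_{department}"
    conn.execute(f"CREATE TABLE {table} (metric TEXT, value REAL)")
    conn.execute(
        f"INSERT INTO {table} SELECT metric, value FROM warehouse_facts WHERE department = ?",
        (department,),
    )
    return table


sales_mart = build_dependent_mart("sales")
```

An independent data mart would run the same kind of load directly against the OLTP tables, and a hybrid one would combine both inputs.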
ADVANTAGES OF DATA MART

• Implementing a data mart takes less time than implementing a data warehouse, because a data mart is
designed for a particular department of an organization.
• Organizations can choose a data mart model depending on cost and their business needs.
• Data can be easily accessed from a data mart.
• It holds the data behind frequently run queries, which makes it possible to analyze business trends.
DISADVANTAGES OF DATA MART

• Because it stores only the data related to a specific function, it does not hold the large
volume of data covering every department of an organization that a data
warehouse does.
• Creating too many data marts can become cumbersome.
DATA WAREHOUSE VS DATA MART

• A data warehouse is a large repository of data collected from different sources, whereas a data mart is only a
subset of a data warehouse.
• A data warehouse covers all departments in an organization, whereas a data mart focuses on a specific group.
• A data warehouse is complicated to design, whereas a data mart is easy to design.
• A data warehouse takes a long time for data handling, whereas a data mart takes a short time.
• A data warehouse typically ranges from 100 GB to 1 TB+, whereas a data mart is less than 100 GB.
• Implementing a data warehouse takes 1 month to 1 year, whereas implementing a data mart takes a few months.
THANK YOU
