
DATA WAREHOUSING

SCHEMAS AND OBJECTS


Schema

• A schema is a logical description of the entire database. It includes the name and description of all record types, including their associated data items and aggregates.

• A schema is a collection of database objects, including tables, views, indexes, and synonyms.
• The model of your source data and the requirements of your users help you design the data warehouse schema. You can sometimes get the source model from your company's enterprise data model.
• The physical implementation of the logical data warehouse model may require some changes to adapt it to your system parameters: size of machine, number of users, storage capacity, type of network, and software.
Star Schemas

• The star schema is the simplest data warehouse schema. It is called a star schema because the diagram resembles a star, with points radiating from a center. The center of the star consists of one or more fact tables and the points of the star are the dimension tables, as shown in Figure 2-1.
• A star schema optimizes performance by keeping queries simple and providing fast response time. All the information about each level is stored in one row.
• Each dimension has only one dimension table, and each table holds a set of attributes. For example, the location dimension table contains the attribute set {location_key, street, city, province_or_state, country}.
• This constraint may cause data redundancy. For example, "Vancouver" and "Victoria" are both cities in the Canadian province of British Columbia. The entries for such cities repeat the same values in the attributes province_or_state and country.
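
The redundancy described above can be made concrete with a small sketch. It builds the location dimension table in an in-memory SQLite database and counts how often each (province_or_state, country) pair is stored. The street values are invented for illustration; only the column names and the two cities come from the slide.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE location_dim (
        location_key      INTEGER PRIMARY KEY,
        street            TEXT,
        city              TEXT,
        province_or_state TEXT,
        country           TEXT
    )
    """
)

# Street addresses are hypothetical sample data.
conn.executemany(
    "INSERT INTO location_dim VALUES (?, ?, ?, ?, ?)",
    [
        (1, "100 Main St", "Vancouver", "British Columbia", "Canada"),
        (2, "200 Oak Ave", "Victoria", "British Columbia", "Canada"),
    ],
)

# Every (province_or_state, country) pair stored more than once is redundant.
repeated = conn.execute(
    """
    SELECT province_or_state, country, COUNT(*)
    FROM location_dim
    GROUP BY province_or_state, country
    HAVING COUNT(*) > 1
    """
).fetchall()

print(repeated)  # [('British Columbia', 'Canada', 2)]
```

Each additional city in British Columbia would store the same province and country strings again, which is the redundancy the single-table design accepts in exchange for simpler queries.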
Characteristics of Star Schema:

• Every dimension in a star schema is represented by exactly one dimension table.
• Each dimension table contains a set of attributes.
• Each dimension table is joined to the fact table using a foreign key.
• The dimension tables are not joined to each other.
• The fact table contains keys and measures.
• The star schema is easy to understand, although its denormalized dimension tables use more disk space than a normalized design would.
• The dimension tables are not normalized. For instance, in the figure above, Country_ID does not have a Country lookup table, as an OLTP design would have.
• The schema is widely supported by BI tools.
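
As a rough illustration of these characteristics, the sketch below joins one fact table to two dimension tables through foreign keys and aggregates a measure. The table names (`sales_fact`, `location_dim`, `time_dim`), the `dollars_sold` measure, and all sample rows are assumptions made for this example, not part of the original slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables: one per dimension, never joined to each other.
    CREATE TABLE location_dim (location_key INTEGER PRIMARY KEY, city TEXT, country TEXT);
    CREATE TABLE time_dim (time_key INTEGER PRIMARY KEY, year INTEGER);

    -- Fact table: foreign keys plus a measure.
    -- SQLite only enforces REFERENCES when PRAGMA foreign_keys is ON;
    -- here the clauses simply document the join paths.
    CREATE TABLE sales_fact (
        location_key INTEGER REFERENCES location_dim(location_key),
        time_key     INTEGER REFERENCES time_dim(time_key),
        dollars_sold REAL
    );

    INSERT INTO location_dim VALUES (1, 'Vancouver', 'Canada'), (2, 'Victoria', 'Canada');
    INSERT INTO time_dim VALUES (1, 2023), (2, 2024);
    INSERT INTO sales_fact VALUES (1, 1, 100.0), (1, 2, 150.0), (2, 2, 50.0);
    """
)


def sales_by_city_and_year(connection):
    """Join the fact table to each dimension once and sum the measure."""
    return connection.execute(
        """
        SELECT l.city, t.year, SUM(f.dollars_sold)
        FROM sales_fact AS f
        JOIN location_dim AS l ON l.location_key = f.location_key
        JOIN time_dim AS t ON t.time_key = f.time_key
        GROUP BY l.city, t.year
        ORDER BY l.city, t.year
        """
    ).fetchall()


result = sales_by_city_and_year(conn)
print(result)
# [('Vancouver', 2023, 100.0), ('Vancouver', 2024, 150.0), ('Victoria', 2024, 50.0)]
```

Every join in the query goes from the fact table to a dimension table, which is what keeps star-schema queries short.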
Snowflake Schema

• A Snowflake Schema is an extension of a Star Schema that adds additional dimension tables. It is called a snowflake schema because its diagram resembles a snowflake.
• The dimension tables are normalized, which splits data into additional tables. In the following example, Country is further normalized into an individual table.
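
A minimal sketch of that normalization step follows, assuming hypothetical tables `location_dim` and `country_dim` and made-up sample rows. The country name is stored once in its own table, and the location dimension refers to it by key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Country is split out of the location dimension into its own lookup table.
    CREATE TABLE country_dim (
        country_id   INTEGER PRIMARY KEY,
        country_name TEXT
    );
    CREATE TABLE location_dim (
        location_key INTEGER PRIMARY KEY,
        city         TEXT,
        country_id   INTEGER REFERENCES country_dim(country_id)
    );

    INSERT INTO country_dim VALUES (1, 'Canada');
    INSERT INTO location_dim VALUES (1, 'Vancouver', 1), (2, 'Victoria', 1);
    """
)

# The country name is now stored exactly once.
country_rows = conn.execute("SELECT COUNT(*) FROM country_dim").fetchone()[0]

# Reading the country name back requires one extra join.
cities = conn.execute(
    """
    SELECT l.city, c.country_name
    FROM location_dim AS l
    JOIN country_dim AS c ON c.country_id = l.country_id
    ORDER BY l.city
    """
).fetchall()

print(country_rows)  # 1
print(cities)        # [('Vancouver', 'Canada'), ('Victoria', 'Canada')]
```

The saving in storage and the extra join are the two sides of the trade-off between star and snowflake designs.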
Characteristics of Snowflake Schema:

• The main benefit of the snowflake schema is that it uses less disk space.
• It is easier to implement when a dimension is added to the schema.
• Query performance is reduced because queries must join multiple tables.
• The primary challenge you will face with the snowflake schema is the extra maintenance effort caused by the larger number of lookup tables.
Galaxy Schema

• A Galaxy Schema contains two or more fact tables that share dimension tables. It is also called a Fact Constellation Schema. The schema is viewed as a collection of stars, hence the name Galaxy Schema.
• As you can see in the figure above, there are two fact tables:
    • Revenue
    • Product
• In a Galaxy schema, shared dimensions are called Conformed Dimensions.
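
The idea of two fact tables sharing a conformed dimension can be sketched as follows. The slides name the fact tables Revenue and Product but do not list their columns, so the shared `time_dim` dimension, the `revenue` and `units_produced` measures, and the sample rows are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- One conformed dimension shared by both fact tables.
    CREATE TABLE time_dim (time_key INTEGER PRIMARY KEY, year INTEGER);

    CREATE TABLE revenue_fact (
        time_key INTEGER REFERENCES time_dim(time_key),
        revenue  REAL
    );
    CREATE TABLE product_fact (
        time_key       INTEGER REFERENCES time_dim(time_key),
        units_produced INTEGER
    );

    INSERT INTO time_dim VALUES (1, 2023), (2, 2024);
    INSERT INTO revenue_fact VALUES (1, 500.0), (1, 250.0), (2, 900.0);
    INSERT INTO product_fact VALUES (1, 10), (2, 20), (2, 5);
    """
)

# Aggregate each fact table on its own first, then line the results up
# through the shared dimension. Joining the two fact tables directly
# would multiply rows and inflate the sums.
combined = conn.execute(
    """
    SELECT t.year, r.total_revenue, p.total_units
    FROM time_dim AS t
    JOIN (SELECT time_key, SUM(revenue) AS total_revenue
          FROM revenue_fact GROUP BY time_key) AS r ON r.time_key = t.time_key
    JOIN (SELECT time_key, SUM(units_produced) AS total_units
          FROM product_fact GROUP BY time_key) AS p ON p.time_key = t.time_key
    ORDER BY t.year
    """
).fetchall()

print(combined)  # [(2023, 750.0, 10), (2024, 900.0, 25)]
```

Because both fact tables use the same `time_key` values, their measures can be compared year by year without any mapping between them.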
Characteristics of Galaxy Schema:

• The dimensions in this schema are split into separate dimensions based on the levels of the hierarchy.
• For example, if geography has four levels of hierarchy (region, country, state, and city), then the Galaxy schema should have four dimensions.
• Moreover, it is possible to build this type of schema by splitting one star schema into several star schemas.
• The dimensions in this schema are large, because they must be built for each level of the hierarchy.
• This schema is helpful for aggregating fact tables for better understanding.
Fact tables

• Fact tables are the large tables in your warehouse schema that store business measurements. Fact tables typically contain facts and foreign keys to the dimension tables.

• Fact tables represent data, usually numeric and additive, that can be analyzed and examined. Examples include sales, cost, and profit.
Creating a New Fact Table

• You must define a fact table for each star schema. From a modeling standpoint, the primary key of the fact table is usually a composite key that is made up of all of its foreign keys.
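
One way to express such a composite key is sketched below, using SQLite and hypothetical table and column names (`sales_fact`, `product_dim`, `time_dim`). The measures sales, cost, and profit are taken from the examples above; the sample values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE time_dim (time_key INTEGER PRIMARY KEY, month TEXT);

    -- The primary key is the combination of all foreign keys.
    CREATE TABLE sales_fact (
        product_key INTEGER NOT NULL REFERENCES product_dim(product_key),
        time_key    INTEGER NOT NULL REFERENCES time_dim(time_key),
        sales       REAL,
        cost        REAL,
        profit      REAL,
        PRIMARY KEY (product_key, time_key)
    );

    INSERT INTO product_dim VALUES (1, 'Widget');
    INSERT INTO time_dim VALUES (1, '2024-01'), (2, '2024-02');
    INSERT INTO sales_fact VALUES (1, 1, 100.0, 60.0, 40.0), (1, 2, 200.0, 120.0, 80.0);
    """
)


def insert_fact(connection, row):
    """Try to insert one fact row; return False if the composite key already exists."""
    try:
        connection.execute("INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


# Same (product_key, time_key) as an existing row, so the insert is rejected.
duplicate_accepted = insert_fact(conn, (1, 1, 5.0, 3.0, 2.0))

# Additive measures can simply be summed across rows.
totals = conn.execute(
    "SELECT SUM(sales), SUM(cost), SUM(profit) FROM sales_fact"
).fetchone()

print(duplicate_accepted)  # False
print(totals)              # (300.0, 180.0, 120.0)
```

The composite key guarantees at most one fact row per combination of dimension values, which is what makes summing the measures safe.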
Dimension Tables

• A dimension is a structure, often composed of one or more hierarchies, that categorizes data.
• Dimensional attributes help to describe the dimensional value.
• They are normally descriptive, textual values. Several distinct dimensions, combined with facts, enable you to answer business questions.
• Commonly used dimensions are customers, products, and time.
• Dimension data is typically collected at the lowest level of detail and then aggregated into higher-level totals that are more useful for analysis. These natural rollups or aggregations within a dimension table are called hierarchies.
Hierarchies

• Hierarchies are logical structures that use ordered levels as a means of organizing data.
• A hierarchy can be used to define data aggregation. For example, in a time dimension, a hierarchy might aggregate data from the month level to the quarter level to the year level.
• Within a hierarchy, each level is logically connected to the levels above and below it. Data values at lower levels aggregate into the data values at higher levels.
• A dimension can be composed of more than one hierarchy. For example, in the product dimension, there might be two hierarchies: one for product categories and one for product suppliers.
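
The month-to-quarter-to-year rollup can be sketched in a few lines of Python. The `time_dim` mapping, the `roll_up` helper, and the sales figures are hypothetical and exist only to show values at a lower level aggregating into the levels above.

```python
from collections import defaultdict

# Each month (lowest level) records its parent quarter and year.
time_dim = {
    "2024-01": {"quarter": "2024-Q1", "year": "2024"},
    "2024-02": {"quarter": "2024-Q1", "year": "2024"},
    "2024-04": {"quarter": "2024-Q2", "year": "2024"},
    "2025-01": {"quarter": "2025-Q1", "year": "2025"},
}

# Made-up measure values collected at the month level.
monthly_sales = {"2024-01": 10, "2024-02": 20, "2024-04": 5, "2025-01": 7}


def roll_up(values, level):
    """Aggregate month-level values into the given higher level."""
    totals = defaultdict(int)
    for month, amount in values.items():
        totals[time_dim[month][level]] += amount
    return dict(totals)


by_quarter = roll_up(monthly_sales, "quarter")
by_year = roll_up(monthly_sales, "year")

print(by_quarter)  # {'2024-Q1': 30, '2024-Q2': 5, '2025-Q1': 7}
print(by_year)     # {'2024': 35, '2025': 7}
```

A second hierarchy on the same dimension would just be another set of parent attributes on each lowest-level member.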
Levels

[Figure: levels in a customer hierarchy]

• A level represents a position in a hierarchy.
• For example, a time dimension might have a hierarchy that represents data at the month, quarter, and year levels.
• Levels range from general to specific, with the root level as the highest or most general level.
