The document discusses different data warehouse schemas including star schemas, snowflake schemas, and galaxy schemas. It provides details on schema components like fact tables, dimension tables, and hierarchies. Specifically, it explains that a star schema has one or more fact tables connected to dimension tables, while a snowflake schema extends a star schema by normalizing dimensions into additional tables. A galaxy schema contains multiple fact tables that share dimension tables.
database. It includes the name and description of records of all record types including all associated data-items and aggregates.
A schema is a collection of database objects, including tables, views, indexes, and synonyms.

Schema
The model of your source data and the requirements of your users help you design the data warehouse schema. You can sometimes get the source model from your company's enterprise data model. The physical implementation of the logical data warehouse model may require some changes to adapt it to your system parameters: size of machine, number of users, storage capacity, type of network, and software.

Star Schemas
The star schema is the simplest data warehouse schema. It is called a star schema because the diagram resembles a star, with points radiating from a center. The center of the star consists of one or more fact tables and the points of the star are the dimension tables, as shown in Figure 2-1. A star schema optimizes performance by keeping queries simple and providing fast response time. All the information about each level is stored in one row.
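The star layout can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumed names: the tables sales_fact, location_dim, and product_dim, their columns, and the sample rows are all hypothetical and do not come from Figure 2-1.

```python
import sqlite3

# Hypothetical star schema: one fact table in the center, two dimension tables
# as the points. All table names, column names, and rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE location_dim (
    location_key INTEGER PRIMARY KEY,
    street TEXT,
    city TEXT,
    province_or_state TEXT,
    country TEXT
);
CREATE TABLE product_dim (
    product_key INTEGER PRIMARY KEY,
    product_name TEXT,
    category TEXT
);
CREATE TABLE sales_fact (
    location_key INTEGER REFERENCES location_dim(location_key),
    product_key INTEGER REFERENCES product_dim(product_key),
    units_sold INTEGER,
    revenue REAL
);
""")
conn.executemany("INSERT INTO location_dim VALUES (?, ?, ?, ?, ?)", [
    (1, "1 Example St", "Vancouver", "British Columbia", "Canada"),
    (2, "2 Example Ave", "Victoria", "British Columbia", "Canada"),
])
conn.executemany("INSERT INTO product_dim VALUES (?, ?, ?)", [
    (10, "Laptop", "Electronics"),
    (11, "Desk", "Furniture"),
])
conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?)", [
    (1, 10, 2, 2000.0),
    (1, 11, 1, 300.0),
    (2, 10, 1, 1000.0),
])

# A typical star query: every dimension is a single join away from the fact
# table, so the query joins once and aggregates a measure.
revenue_by_city = conn.execute("""
    SELECT l.city, SUM(f.revenue)
    FROM sales_fact AS f
    JOIN location_dim AS l ON l.location_key = f.location_key
    GROUP BY l.city
    ORDER BY l.city
""").fetchall()
print(revenue_by_city)
```

Because each dimension sits in one table, grouping by any location attribute needs only the one join shown here.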
Each dimension has only one dimension table, and each table holds a set of attributes. For example, the location dimension table contains the attribute set {location_key, street, city, province_or_state, country}. This constraint may cause data redundancy. For example, Vancouver and Victoria are both cities in the Canadian province of British Columbia, so the entries for these cities repeat the same values in the province_or_state and country attributes.

Characteristics of Star Schema:
- Every dimension in a star schema is represented by only one dimension table.
- The dimension table contains the set of attributes.
- The dimension table is joined to the fact table using a foreign key.
- The dimension tables are not joined to each other.
- The fact table contains keys and measures.
- The Star schema is easy to understand and provides optimal disk usage.
- The dimension tables are not normalized. For instance, in the above figure, Country_ID does not have a Country lookup table as an OLTP design would have.
- The schema is widely supported by BI tools.

Snowflake Schema
A Snowflake Schema is an extension of a Star Schema that adds additional dimension tables. It is called a snowflake schema because its diagram resembles a snowflake.
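As a rough sketch of what snowflaking a dimension involves, the example below moves the country attribute of a location dimension into its own lookup table. The table names location_dim_star, location_dim_snow, and country_dim, and the sample cities, are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Star form (hypothetical): the country name is repeated on every row.
CREATE TABLE location_dim_star (
    location_key INTEGER PRIMARY KEY,
    city TEXT,
    country TEXT
);
-- Snowflake form (hypothetical): country moves into its own lookup table.
CREATE TABLE country_dim (
    country_id INTEGER PRIMARY KEY,
    country TEXT
);
CREATE TABLE location_dim_snow (
    location_key INTEGER PRIMARY KEY,
    city TEXT,
    country_id INTEGER REFERENCES country_dim(country_id)
);
""")
conn.executemany("INSERT INTO location_dim_star VALUES (?, ?, ?)", [
    (1, "Vancouver", "Canada"),
    (2, "Victoria", "Canada"),
    (3, "Seattle", "United States"),
])

# Normalize: keep one country_dim row per distinct country name.
conn.execute("""
    INSERT INTO country_dim (country)
    SELECT DISTINCT country FROM location_dim_star ORDER BY country
""")
conn.execute("""
    INSERT INTO location_dim_snow
    SELECT s.location_key, s.city, c.country_id
    FROM location_dim_star AS s
    JOIN country_dim AS c ON c.country = s.country
""")

# Reading the country name back now costs one extra join.
snow_rows = conn.execute("""
    SELECT l.city, c.country
    FROM location_dim_snow AS l
    JOIN country_dim AS c ON c.country_id = l.country_id
    ORDER BY l.location_key
""").fetchall()
country_count = conn.execute("SELECT COUNT(*) FROM country_dim").fetchone()[0]
print(snow_rows, country_count)
```

The country name is stored once per country instead of once per location, at the price of an additional join whenever it is needed.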
The dimension tables are normalized, which splits data into additional tables. In the following example, Country is further normalized into an individual table.

Characteristics of Snowflake Schema:
- The main benefit of the snowflake schema is that it uses less disk space.
- It is easier to implement when a dimension is added to the schema.
- Query performance is reduced because of the multiple tables.
- The primary challenge that you will face while using the snowflake schema is the additional maintenance effort caused by the larger number of lookup tables.

Galaxy Schema
A Galaxy Schema contains two or more fact tables that share dimension tables. It is also called a Fact Constellation Schema. The schema is viewed as a collection of stars, hence the name Galaxy Schema.
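A minimal sketch of two fact tables sharing one dimension, using Python's sqlite3 module. The fact table names follow the Revenue and Product facts of this deck, but the shared time_dim table, every column, and all rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One shared dimension (hypothetical) referenced by both fact tables.
CREATE TABLE time_dim (time_key INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE revenue_fact (
    time_key INTEGER REFERENCES time_dim(time_key),
    amount REAL
);
CREATE TABLE product_fact (
    time_key INTEGER REFERENCES time_dim(time_key),
    units INTEGER
);
""")
conn.executemany("INSERT INTO time_dim VALUES (?, ?)", [(1, 2023), (2, 2024)])
conn.executemany("INSERT INTO revenue_fact VALUES (?, ?)",
                 [(1, 100.0), (1, 50.0), (2, 200.0)])
conn.executemany("INSERT INTO product_fact VALUES (?, ?)",
                 [(1, 3), (2, 4), (2, 1)])

# Aggregate each fact table separately, then line the totals up on the shared
# dimension. Joining both fact tables directly would multiply rows.
per_year = conn.execute("""
    SELECT t.year,
           (SELECT SUM(r.amount) FROM revenue_fact AS r
             WHERE r.time_key = t.time_key),
           (SELECT SUM(p.units) FROM product_fact AS p
             WHERE p.time_key = t.time_key)
    FROM time_dim AS t
    ORDER BY t.year
""").fetchall()
print(per_year)
```

Since both fact tables use the same time_key values, their measures can be compared year by year without any mapping between them.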
As you can see in the above figure, there are two fact tables:
- Revenue
- Product

In a Galaxy schema, shared dimensions are called Conformed Dimensions.

Characteristics of Galaxy Schema:
- The dimensions in this schema are split into separate dimensions based on the various levels of hierarchy. For example, if geography has four levels of hierarchy (region, country, state, and city), then the Galaxy schema should have four dimensions.
- It is possible to build this type of schema by splitting one star schema into several star schemas.
- The dimensions in this schema are large and need to be built based on the levels of hierarchy.
- This schema is helpful for aggregating fact tables for better understanding.

Fact Tables
Fact tables are the large tables in your warehouse
schema that store business measurements. Fact tables typically contain facts and foreign keys to the dimension tables.
Fact tables represent data, usually numeric and additive, that can be analyzed and examined. Examples include sales, cost, and profit.

Creating a New Fact Table
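One way to define such a fact table is sketched below with Python's sqlite3 module: the primary key is the composite of the foreign keys, and the measures are additive. The names sales_fact, customer_dim, product_dim, and time_dim, along with the sample values, are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE customer_dim (customer_key INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE time_dim (time_key INTEGER PRIMARY KEY, month TEXT);
-- Hypothetical fact table: its primary key is made up of all its foreign keys.
CREATE TABLE sales_fact (
    customer_key INTEGER NOT NULL REFERENCES customer_dim(customer_key),
    product_key INTEGER NOT NULL REFERENCES product_dim(product_key),
    time_key INTEGER NOT NULL REFERENCES time_dim(time_key),
    sales REAL,
    cost REAL,
    PRIMARY KEY (customer_key, product_key, time_key)
);
""")
conn.execute("INSERT INTO customer_dim VALUES (1, 'Acme')")
conn.execute("INSERT INTO product_dim VALUES (1, 'Laptop')")
conn.executemany("INSERT INTO time_dim VALUES (?, ?)",
                 [(1, "2024-01"), (2, "2024-02")])
conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?)", [
    (1, 1, 1, 1000.0, 700.0),
    (1, 1, 2, 500.0, 350.0),
])

# A second row with the same combination of keys violates the composite key.
try:
    conn.execute("INSERT INTO sales_fact VALUES (1, 1, 1, 10.0, 5.0)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Additive measures can simply be summed; profit is derived from the totals.
total_sales, total_cost = conn.execute(
    "SELECT SUM(sales), SUM(cost) FROM sales_fact"
).fetchone()
profit = total_sales - total_cost
print(duplicate_rejected, total_sales, total_cost, profit)
```

The composite key guarantees at most one fact row per combination of customer, product, and time period in this sketch.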
You must define a fact table for each star schema. From a modeling standpoint, the primary key of the fact table is usually a composite key that is made up of all of its foreign keys.

Dimension Tables
A dimension is a structure, often composed of one or more hierarchies, that categorizes data. Dimensional attributes help to describe the dimensional value. They are normally descriptive, textual values. Several distinct dimensions, combined with facts, enable you to answer business questions. Commonly used dimensions are customers, products, and time.
Dimension data is typically collected at the lowest level of detail and then aggregated into higher level totals that are more useful for analysis. These natural rollups or aggregations within a dimension table are called hierarchies.

Hierarchies
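A rollup along a time hierarchy can be sketched in plain Python. The monthly figures and the helper quarter_of are invented for this example; the point is only that values at the lowest level add up into the totals of each higher level.

```python
from collections import defaultdict

# Hypothetical sales collected at the lowest level of detail:
# (year, month) -> amount.
monthly_sales = {
    (2024, 1): 100, (2024, 2): 120, (2024, 3): 90,
    (2024, 4): 200, (2024, 5): 50,
    (2025, 1): 300,
}


def quarter_of(month: int) -> int:
    """Map a month number (1-12) to its quarter (1-4)."""
    return (month - 1) // 3 + 1


# Roll months up into quarters.
quarterly_sales = defaultdict(int)
for (year, month), amount in monthly_sales.items():
    quarterly_sales[(year, quarter_of(month))] += amount

# Roll quarters up into years.
yearly_sales = defaultdict(int)
for (year, _quarter), amount in quarterly_sales.items():
    yearly_sales[year] += amount

print(dict(quarterly_sales))
print(dict(yearly_sales))
```

Each level is computed from the level directly below it, which mirrors how a hierarchy defines the aggregation path.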
Hierarchies are logical structures that use ordered levels as a means of organizing data. A hierarchy can be used to define data aggregation. For example, in a time dimension, a hierarchy might aggregate data from the month level to the quarter level to the year level. Within a hierarchy, each level is logically connected to the levels above and below it. Data values at lower levels aggregate into the data values at higher levels. A dimension can be composed of more than one hierarchy. For example, in the product dimension, there might be two hierarchies: one for product categories and one for product suppliers.

Levels
Figure: levels in a customer hierarchy.

A level represents a position in a hierarchy. For example, a time dimension might have a hierarchy that represents data at the month, quarter, and year levels. Levels range from general to specific, with the root level as the highest or most general level.
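The idea of a level as a position can be shown with a small Python sketch. The list TIME_LEVELS and the helpers parent_level and child_level are hypothetical names, and the ordering assumes the time hierarchy from the example, with year as the root.

```python
from typing import Optional

# Levels of a hypothetical time hierarchy, ordered from the root
# (most general) to the most specific.
TIME_LEVELS = ["year", "quarter", "month"]


def parent_level(level: str) -> Optional[str]:
    """Return the next more general level, or None for the root level."""
    position = TIME_LEVELS.index(level)
    return TIME_LEVELS[position - 1] if position > 0 else None


def child_level(level: str) -> Optional[str]:
    """Return the next more specific level, or None for the lowest level."""
    position = TIME_LEVELS.index(level)
    return TIME_LEVELS[position + 1] if position < len(TIME_LEVELS) - 1 else None


root_level = TIME_LEVELS[0]
print(root_level, parent_level("month"), child_level("year"))
```

Storing the levels in order makes the position of each level, and therefore its neighbours above and below, explicit.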