A fact table is usually designed at a low level of granularity. This means that we need
to find the lowest level of information that can be stored in a fact table. For example,
employee performance is a very high level of granularity, whereas
employee_performance_daily and employee_performance_weekly can be
considered lower levels of granularity.
The granularity is the lowest level of information stored in the fact table. The depth of
the data level is known as granularity. In a date dimension, the level of granularity
could be year, quarter, month, period, week, or day.
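As a rough sketch of why the grain matters, the Python snippet below rolls hypothetical daily-grain rows up to weekly grain. The row layout, the tasks_completed measure, and the function name are assumptions made for illustration. Rolling up is always possible, but weekly rows could not be split back into days.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily-grain fact rows: (employee_id, day, tasks_completed)
employee_performance_daily = [
    (1, date(2024, 1, 1), 5),
    (1, date(2024, 1, 2), 7),
    (1, date(2024, 1, 8), 4),
    (2, date(2024, 1, 1), 3),
]

def roll_up_to_week(daily_rows):
    """Aggregate daily-grain rows to weekly grain, keyed by (employee, ISO year, ISO week)."""
    weekly = defaultdict(int)
    for employee_id, day, tasks in daily_rows:
        iso_year, iso_week, _ = day.isocalendar()
        weekly[(employee_id, iso_year, iso_week)] += tasks
    return dict(weekly)

employee_performance_weekly = roll_up_to_week(employee_performance_daily)
```

Four daily rows collapse into three weekly rows here, which is the detail lost when a fact table is stored at the coarser grain.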
View:
A view provides a tailored representation of the data in its underlying tables.
It has a logical structure that does not occupy storage space.
Changes made through the view take effect in the corresponding tables.
Materialized view:
In scenarios where certain data may not be appropriate to store in the schema, the
data (or attributes) can be stored in a junk dimension. The data in a junk
dimension usually consists of Boolean or flag values.
A single dimension formed by lumping together a number of small dimensions is
called a junk dimension. A junk dimension has unrelated attributes. It is built by
grouping miscellaneous flags and text attributes and moving them out of the other
tables into a separate dimension of their own.
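One common way to build a junk dimension is to enumerate every combination of the flag values and give each combination its own key. The sketch below assumes three made-up flags. The flag names, junk_key, and both helper functions are hypothetical and only illustrate the idea.

```python
from itertools import product

# Hypothetical low-cardinality flags that do not belong to any regular dimension.
flag_values = {
    "is_gift_wrapped": [False, True],
    "is_rush_order": [False, True],
    "payment_type": ["cash", "card"],
}

def build_junk_dimension(flags):
    """Return one row per combination of flag values, each with a surrogate key."""
    names = list(flags)
    rows = []
    for key, combo in enumerate(product(*(flags[n] for n in names)), start=1):
        row = {"junk_key": key}
        row.update(zip(names, combo))
        rows.append(row)
    return rows

def lookup_junk_key(rows, **attrs):
    """Find the junk_key a fact row should reference for a given flag combination."""
    for row in rows:
        if all(row[name] == value for name, value in attrs.items()):
            return row["junk_key"]
    raise KeyError(attrs)

junk_dimension = build_junk_dimension(flag_values)
```

The fact table then carries a single junk_key column instead of three separate flag columns.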
SCDs (slowly changing dimensions) are dimensions in which the data changes
slowly and unpredictably, rather than on a regular time-based schedule.
SCD1: The new record replaces the original record. The current data is
overwritten by the new data, so no history is kept.
SCD2: A new record is added to the dimension table. The dimension then holds
both the current data and the previous data, which is kept as
history.
SCD3: The original record is modified to hold the new data. The row keeps
two values: the current value and, in a separate column, the previous value
that it replaced.
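The three types can be sketched in plain Python, following the common reading of Type 1 as overwrite, Type 2 as add a row, and Type 3 as add a column. The customer row, the column names (is_current, previous_city), and the function names are assumptions for illustration, not a fixed standard.

```python
from copy import deepcopy

# Hypothetical customer dimension row; column names are illustrative only.
original = [{"customer_key": 1, "customer_id": "C100", "city": "Pune"}]

def scd_type1(rows, customer_id, new_city):
    """Type 1: overwrite in place, so no history is kept."""
    rows = deepcopy(rows)
    for row in rows:
        if row["customer_id"] == customer_id:
            row["city"] = new_city
    return rows

def scd_type2(rows, customer_id, new_city):
    """Type 2: expire the current row and append a new row with a new surrogate key."""
    rows = deepcopy(rows)
    for row in rows:
        row.setdefault("is_current", True)
    for row in rows:
        if row["customer_id"] == customer_id and row["is_current"]:
            row["is_current"] = False
    next_key = max(row["customer_key"] for row in rows) + 1
    rows.append({"customer_key": next_key, "customer_id": customer_id,
                 "city": new_city, "is_current": True})
    return rows

def scd_type3(rows, customer_id, new_city):
    """Type 3: keep the prior value in an extra column of the same row."""
    rows = deepcopy(rows)
    for row in rows:
        if row["customer_id"] == customer_id:
            row["previous_city"] = row["city"]
            row["city"] = new_city
    return rows

type1 = scd_type1(original, "C100", "Mumbai")
type2 = scd_type2(original, "C100", "Mumbai")
type3 = scd_type3(original, "C100", "Mumbai")
```

Only the Type 2 result grows by a row. Type 1 and Type 3 still hold a single row per customer.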
8. What is a star schema?
Star schema is a schema used in data warehousing where a single fact table
references a number of dimension tables. In a star schema, “keys” from all the
dimension tables flow into the fact table. This entity-relationship diagram resembles a
star, hence it is named a Star schema.
What is a fact table? Explain how many fact tables are there in a star schema?
A fact table is a table that holds the measurements, facts, and metrics
of a business process. It is usually located in the center of a star schema. A snowflake schema
is a variant of the star schema in which the dimension tables are further normalized.
Usually, a fact table consists of two types of columns:
1. Columns that hold the fact data (the measures)
2. Columns that hold the foreign keys to the dimension tables
A star schema or snowflake schema usually has only one fact table. A schema in which multiple
fact tables share dimension tables is called a fact constellation schema.
Question 88. What Is a Surrogate Key? Where Do We Use It? Explain With
Examples.
Answer :
A surrogate key is a substitute for the natural primary key. It is just a unique identifier
or number for each row that can be used as the primary key of the table. The only
requirement for a surrogate primary key is that it is unique for each row in the table.
Data warehouses typically use a surrogate key (also known as an artificial or identity
key) for the primary keys of the dimension tables. They can use Info sequence generator, or
Oracle sequence, or SQL Server Identity values for the surrogate key.
It is useful because the natural primary key (e.g. Customer Number in the Customer
table) can change, and this makes updates more difficult.
Some tables have columns such as AIRPORT_NAME or CITY_NAME which are
stated as the primary keys (according to the business users). Not only can these
change, but indexing on a numerical value is probably better, so you could consider
creating a surrogate key called, say, AIRPORT_ID. This would be internal to the
system, and as far as the client is concerned, you may display only the
AIRPORT_NAME.
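A minimal sketch of the AIRPORT_ID idea, using Python's built-in sqlite3 module rather than any of the tools named above. The airport_dim and flight_fact tables, their columns, and the sample values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE airport_dim (
        airport_id   INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        airport_name TEXT NOT NULL UNIQUE                -- natural key
    )
""")
conn.execute("CREATE TABLE flight_fact (airport_id INTEGER, passengers INTEGER)")

cur = conn.execute("INSERT INTO airport_dim (airport_name) VALUES (?)", ("Bombay Airport",))
airport_id = cur.lastrowid
conn.execute("INSERT INTO flight_fact VALUES (?, ?)", (airport_id, 180))

# The natural key changes; the surrogate key and the fact rows are untouched.
conn.execute("UPDATE airport_dim SET airport_name = ? WHERE airport_id = ?",
             ("Mumbai Airport", airport_id))

rows = conn.execute("""
    SELECT d.airport_name, f.passengers
    FROM flight_fact f JOIN airport_dim d ON d.airport_id = f.airport_id
""").fetchall()
```

Because the fact table references airport_id, renaming the airport is a one-row update in the dimension table.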
Question 212. Explain The Difference Between The Truncate And Delete
Commands?
Answer :
Truncate :
It is a DDL command, used to remove all rows from a table or cluster. In databases such
as Oracle, a DDL command commits automatically, so it cannot be rolled back. It is
faster than delete.
Delete:
It is a DML command, generally used to delete some or all rows from a table. The
Rollback command can be used to restore rows deleted earlier in the transaction. To
make the deletion permanent, the "commit" command should be used.
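The DELETE half of this answer can be tried with Python's built-in sqlite3 module. SQLite has no TRUNCATE statement, so only the rollback and commit behaviour of DELETE is sketched here. The staging table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging (id INTEGER)")
conn.executemany("INSERT INTO staging VALUES (?)", [(1,), (2,), (3,)])
conn.commit()

def row_count(connection):
    return connection.execute("SELECT COUNT(*) FROM staging").fetchone()[0]

# DELETE is DML: it runs inside a transaction and can be undone.
conn.execute("DELETE FROM staging WHERE id > 1")
count_before_rollback = row_count(conn)
conn.rollback()
count_after_rollback = row_count(conn)

# The same DELETE followed by COMMIT becomes permanent.
conn.execute("DELETE FROM staging WHERE id > 1")
conn.commit()
count_after_commit = row_count(conn)
```

The WHERE clause also shows the other practical difference: DELETE can target specific rows, which TRUNCATE cannot.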
A fact table contains the measurements of business processes, along with foreign
keys to the dimension tables.
24. What are the key columns in Fact and dimension tables?
Foreign keys of dimension tables are primary keys of entity tables. Foreign keys of fact
tables are the primary keys of the dimension tables.
25. What is SCD?
SCD stands for slowly changing dimension, and it applies to cases where a record
changes over time.
A star schema is a way of organizing the tables so that results can
be retrieved from the database quickly in a data warehouse environment.
A snowflake schema has a primary dimension table to which one or more other dimensions
can be joined. The primary dimension table is the only table that can be joined with the
fact table.
A core dimension is a dimension table that is dedicated to a single
fact table or data mart.
As the name implies, data cleaning is a self-explanatory term. It is the cleaning of orphan
records, data that breaches business rules, inconsistent data, and missing information in a
database.
Metadata is defined as data about the data. The metadata contains information such as
the number of columns used, fixed width and limited width, the ordering of fields, and the
data types of the fields.
Surrogate key is nothing but a substitute for the natural primary key. It is set to be a
unique identifier for each row that can be used for the primary key to a table.
https://www.complexsql.com/data-warehouse-interview-questions/
What is Star-schema?
This schema is used in data warehouse models where one centralized fact
table references a number of dimension tables, so that the primary keys
from all the dimension tables flow into the fact table as foreign keys,
alongside the stored measures. The entity-relationship diagram looks like
a star, hence the name.
Consider a fact table that stores sales quantity for each product and customer
at a certain time. Sales quantity will be the measure here, and keys from the
customer, product and time dimension tables will flow into the fact table.
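The sales example can be sketched as a small star schema using Python's built-in sqlite3 module. All table names, column names, and sample rows below are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_dim (customer_key INTEGER PRIMARY KEY, customer_name TEXT);
    CREATE TABLE product_dim  (product_key  INTEGER PRIMARY KEY, product_name  TEXT);
    CREATE TABLE time_dim     (time_key     INTEGER PRIMARY KEY, calendar_date TEXT);

    -- The fact table holds the measure plus one foreign key per dimension.
    CREATE TABLE sales_fact (
        customer_key   INTEGER REFERENCES customer_dim (customer_key),
        product_key    INTEGER REFERENCES product_dim (product_key),
        time_key       INTEGER REFERENCES time_dim (time_key),
        sales_quantity INTEGER
    );

    INSERT INTO customer_dim VALUES (1, 'Asha'), (2, 'Ben');
    INSERT INTO product_dim  VALUES (1, 'Widget'), (2, 'Gadget');
    INSERT INTO time_dim     VALUES (1, '2024-01-01'), (2, '2024-01-02');
    INSERT INTO sales_fact   VALUES (1, 1, 1, 10), (1, 2, 2, 5), (2, 1, 2, 7);
""")

# A typical star-schema query: join the fact table to a dimension and aggregate.
quantity_by_product = conn.execute("""
    SELECT p.product_name, SUM(f.sales_quantity)
    FROM sales_fact f
    JOIN product_dim p ON p.product_key = f.product_key
    GROUP BY p.product_name
    ORDER BY p.product_name
""").fetchall()
```

Each query needs only one join per dimension it uses, which is the main practical appeal of the star layout.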
Dimension Table
i.) A view is a virtual table formed from one or more base tables or views. It
doesn't physically hold any data.
ii.) Since a View is not pre-computed and stored on a disk, you always get
the updated data in a View when any changes are made to the original base
table.
iii.) It is used for security purposes. Using Views, you can restrict users
from accessing sensitive information in a database.
iv.) It reduces the complexity of queries by getting data from several tables
into a single customized View.
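Points ii.) and iii.) above can be checked with a short sketch using Python's built-in sqlite3 module. The employee table, the employee_public view, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER);
    INSERT INTO employee VALUES (1, 'Asha', 5000);
    -- The view exposes only the non-sensitive columns.
    CREATE VIEW employee_public AS SELECT id, name FROM employee;
""")

before = conn.execute("SELECT name FROM employee_public ORDER BY id").fetchall()
conn.execute("INSERT INTO employee VALUES (2, 'Ben', 4000)")
# The view is evaluated at query time, so the new base-table row is visible at once.
after = conn.execute("SELECT name FROM employee_public ORDER BY id").fetchall()

view_columns = [d[0] for d in conn.execute("SELECT * FROM employee_public").description]
```

The salary column never appears in view_columns, so a user limited to the view cannot read it.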
Materialized view
i.) A Materialized View is the physical copy of the original base tables. It
holds data physically in a table.
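SQLite has no materialized views, so the sketch below only emulates one with a stored summary table that is rebuilt on demand. The sales table, the sales_summary_mv name, and the refresh function are assumptions for illustration. Databases with native materialized views provide their own refresh mechanisms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (product TEXT, quantity INTEGER);
    INSERT INTO sales VALUES ('Widget', 10), ('Widget', 5);
""")

def refresh_sales_summary(connection):
    """Rebuild the stored summary table from the base table."""
    connection.executescript("""
        DROP TABLE IF EXISTS sales_summary_mv;
        CREATE TABLE sales_summary_mv AS
            SELECT product, SUM(quantity) AS total_quantity FROM sales GROUP BY product;
    """)

def read_summary(connection):
    return connection.execute(
        "SELECT product, total_quantity FROM sales_summary_mv").fetchall()

refresh_sales_summary(conn)
conn.execute("INSERT INTO sales VALUES ('Widget', 1)")
stale = read_summary(conn)   # still reflects the data as of the last refresh
refresh_sales_summary(conn)
fresh = read_summary(conn)   # picks up the new row after the rebuild
```

The trade-off is that reads are cheap because the result is already stored, but the copy is only as current as its last refresh.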
In this model, we have two popular schemas available. They are the Star
Schema and the Snowflake Schema. (A schema is simply an arrangement of
tables, or the database structure.)
Star Schema
Properties of Sub-Query:
A clustered index reorders the way records in the table are physically
stored.
A non-clustered index creates a separate object within the table and does
not reorder the way records in the table are stored.
Database lock provides exclusive access to the record. A user can only
modify those records to which he has applied a lock. This prevents data from
being corrupted when multiple users try to write to the database.
Types of lock
1. Shared Lock
When a shared lock is applied on a data item, other transactions can only read
the item, but can't write into it.
2. Exclusive Lock
When an exclusive lock is applied on a data item, other transactions can't read
or write into the data item.
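The two rules above amount to a small compatibility check. The sketch below is a toy model, not how any particular database implements locking. The function name and the "shared"/"exclusive" strings are assumptions.

```python
def can_grant(requested_mode, held_modes):
    """Return True if a lock request is compatible with locks held by other transactions."""
    if requested_mode not in ("shared", "exclusive"):
        raise ValueError(requested_mode)
    if not held_modes:
        return True
    # Readers can coexist; any exclusive lock (held or requested) excludes everyone else.
    return requested_mode == "shared" and all(m == "shared" for m in held_modes)

# Compatibility of each requested mode against a single lock already held.
compatibility = {
    (requested, held): can_grant(requested, [held])
    for requested in ("shared", "exclusive")
    for held in ("shared", "exclusive")
}
```

Only the shared-on-shared combination is allowed, which is why many readers can proceed together while a writer must wait for all of them.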
A foreign key is used to link two tables together. A foreign key in one table
points to a primary key in another table.
They are used to enforce referential integrity and prevent any actions that
would destroy links between tables with the corresponding data values.
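Referential integrity can be seen in action with Python's built-in sqlite3 module. The department and employee tables and the try_sql helper are hypothetical. Note that SQLite enforces foreign keys only after the PRAGMA shown below is set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        dept_id INTEGER REFERENCES department (dept_id)
    );
    INSERT INTO department VALUES (1, 'Finance');
    INSERT INTO employee VALUES (100, 1);
""")

def try_sql(connection, sql):
    """Run a statement and report whether a constraint rejected it."""
    try:
        connection.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        connection.rollback()
        return "rejected"

# A child row pointing at a missing parent is refused.
orphan_insert = try_sql(conn, "INSERT INTO employee VALUES (101, 99)")
# Deleting a parent that still has children would break the link, so it is refused too.
parent_delete = try_sql(conn, "DELETE FROM department WHERE dept_id = 1")
valid_insert = try_sql(conn, "INSERT INTO employee VALUES (102, 1)")
```

Both rejected statements are exactly the "actions that would destroy links between tables" that the constraint exists to prevent.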
Advantages:
1. The result set of a view is not stored physically, so it doesn't consume extra
disk space.
2. A view hides some of the columns and the complexity of joins from the user.
3. Views help limit data access to specific users.
Disadvantages:
TRUNCATE is used to delete all the data in a table. Here, the Commit and Rollback
statements can't be used. A WHERE condition can't be used with the
TRUNCATE statement.
Drop command is used to drop a table definition and all the data, indexes,
triggers, constraints and permission specifications for that table.
https://cognosbitech.wordpress.com/2018/08/10/data-warehouse-concepts-interview-questions-answers-part-i/
https://www.folkstalk.com/2012/12/how-delete-duplicate-records-table-oracle-sql.html#more