Ans:- Yes. But the datatype of those columns will be char (only the values can be
numeric/char).
2. Use the Assign Key Values component (if your GDE version is higher than 1.10).
3. Write a stored procedure for this and call it wherever you need.
Yes, a dimension table can contain numeric values, but it does not contain
measures or facts.
Ans:- A star schema contains the dimension tables mapped around one or
more fact tables.
It is a denormalised model.
Snowflake schema
7. What is ER Diagram ?
Simply stated the ER model is a conceptual data model that views the real
world as entities and relationships. A basic component of the model is the
Entity-Relationship diagram, which is used to visually represent data
objects.
Since Chen wrote his paper the model has been extended, and today it is
commonly used for database design. For the database designer, the utility of
the ER model is:
It maps well to the relational model. The constructs used in the ER model
can easily be transformed into relational tables.
It is simple and easy to understand with a minimum of training. Therefore,
the model can be used by the database designer to communicate the design
to the end user.
9. What is VLDB?
Ans:- Metadata is data about data. A business analyst or data modeler
usually captures information about data, namely the source (where and how the
data originated), the nature of the data (char, varchar, nullable, existence, valid
values, etc.) and the behavior of the data (how it is modified or derived, and its life
cycle), in a data dictionary, a.k.a. metadata. Metadata is also presented at the
data mart level, for subsets, facts and dimensions, the ODS, etc. For a DW user,
metadata provides vital information for analysis / DSS.
Ans:- Incremental loading means loading only the ongoing changes from the OLTP system.
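As a rough illustration of incremental loading, here is a minimal sketch using Python's built-in sqlite3 module. The table names (orders, orders_dw), the updated_at column and the "watermark" approach are assumptions made for the example, not something the answer above prescribes.

```python
import sqlite3

def incremental_load(source, target, last_loaded_at):
    """Copy only rows changed after the last load; return the new watermark."""
    rows = source.execute(
        "SELECT order_id, amount, updated_at FROM orders WHERE updated_at > ?",
        (last_loaded_at,),
    ).fetchall()
    # INSERT OR REPLACE applies both brand-new rows and changes to existing rows.
    target.executemany(
        "INSERT OR REPLACE INTO orders_dw (order_id, amount, updated_at) VALUES (?, ?, ?)",
        rows,
    )
    target.commit()
    # Advance the watermark to the newest change that was loaded.
    return max((r[2] for r in rows), default=last_loaded_at)

# Hypothetical OLTP source and warehouse target, both in memory.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL, updated_at TEXT)")
source.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                   [(1, 10.0, "2024-01-01"), (2, 20.0, "2024-01-02")])
target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE orders_dw (order_id INTEGER PRIMARY KEY, amount REAL, updated_at TEXT)")

watermark = incremental_load(source, target, "")         # first run loads everything
source.execute("UPDATE orders SET amount = 25.0, updated_at = '2024-01-03' WHERE order_id = 2")
source.execute("INSERT INTO orders VALUES (3, 30.0, '2024-01-03')")
watermark = incremental_load(source, target, watermark)  # second run loads only the changes
loaded = target.execute("SELECT order_id, amount FROM orders_dw ORDER BY order_id").fetchall()
```

The second call reads only the updated order and the new order, because both carry a timestamp later than the stored watermark.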
DWH Schema
* Used for OLAP systems.
* New generation schema.
* Denormalized.
* Easy to understand and navigate.
* Extract and complex problems can be easily solved.
* Very good model.
Ans:- Hierarchies
Hierarchies are logical structures that use ordered levels as a means of
organizing data. A hierarchy can be used to define data aggregation. For
example, in a time dimension, a hierarchy might aggregate data from the
month level to the quarter level to the year level. A hierarchy can also be
used to define a navigational drill path and to establish a family structure.
Within a hierarchy, each level is logically connected to the levels above and
below it. Data values at lower levels aggregate into the data values at higher
levels. A dimension can be composed of more than one hierarchy. For
example, in the product dimension, there might be two hierarchies--one for
product categories and one for product suppliers.
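The time-dimension example above (month to quarter to year) can be sketched in a few lines of Python. The sales figures and the roll_up helper are hypothetical and only illustrate how values at a lower level aggregate into the level above.

```python
from collections import defaultdict

# Hypothetical monthly sales keyed by (year, month).
monthly_sales = {
    (2023, 1): 100, (2023, 2): 150, (2023, 4): 200,
    (2024, 1): 50,
}

def roll_up(values, parent_of):
    """Aggregate values from one level into its parent level."""
    totals = defaultdict(int)
    for key, value in values.items():
        totals[parent_of(key)] += value
    return dict(totals)

# month -> quarter: months 1-3 map to quarter 1, months 4-6 to quarter 2, and so on.
quarterly_sales = roll_up(monthly_sales, lambda ym: (ym[0], (ym[1] - 1) // 3 + 1))
# quarter -> year
yearly_sales = roll_up(quarterly_sales, lambda yq: yq[0])
```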
Levels
A level represents a position in a hierarchy. For example, a time dimension
might have a hierarchy that represents data at the month, quarter, and year
levels. Levels range from general to specific, with the root level as the
highest or most general level. The levels in a dimension are organized into
one or more hierarchies.
Level Relationships
Level relationships specify top-to-bottom ordering of levels from most
general (the root) to most specific information. They define the parent-child
relationship between the levels in a hierarchy.
17. What are the data validation strategies for data mart validation after
the loading process?
Ans:- Data validation makes sure that the loaded data is accurate and
meets the business requirements.
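A few typical post-load checks can be sketched as follows. This is only an illustration under assumptions: the table names (staging_sales, mart_sales), the columns and the three checks are made up for the example, and real validation rules come from the business requirements.

```python
import sqlite3

def validate_load(conn, source_table, target_table, amount_col, key_col):
    """Run simple post-load checks and return a dict of check name -> passed."""
    def scalar(sql):
        return conn.execute(sql).fetchone()[0]

    # Table and column names are interpolated, so they must come from trusted code.
    return {
        # Every source row should have arrived in the target.
        "row_count": scalar(f"SELECT COUNT(*) FROM {source_table}")
                     == scalar(f"SELECT COUNT(*) FROM {target_table}"),
        # The total of a measure should reconcile between source and target.
        "amount_total": scalar(f"SELECT COALESCE(SUM({amount_col}), 0) FROM {source_table}")
                        == scalar(f"SELECT COALESCE(SUM({amount_col}), 0) FROM {target_table}"),
        # Key columns in the target must not be NULL.
        "no_null_keys": scalar(f"SELECT COUNT(*) FROM {target_table} WHERE {key_col} IS NULL") == 0,
    }

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging_sales (sale_id INTEGER, amount INTEGER)")
conn.execute("CREATE TABLE mart_sales (sale_id INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO staging_sales VALUES (?, ?)", [(1, 100), (2, 250)])
conn.executemany("INSERT INTO mart_sales VALUES (?, ?)", [(1, 100), (2, 250)])
checks = validate_load(conn, "staging_sales", "mart_sales", "amount", "sale_id")
```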
A view is nothing but an alias, and it can be used to resolve loops in the
universe.
It is just a unique identifier or number for each row that can be used for the
primary key to the table. The only requirement for a surrogate primary key
is that it is unique for each row in the table.
On the 1st of January 2002, Employee 'E1' belongs to Business Unit 'BU1'
(that's what would be in your Employee Dimension). This employee has
turnover allocated to him on Business Unit 'BU1'. But on the 2nd of June
Employee 'E1' is moved from Business Unit 'BU1' to Business Unit 'BU2'.
All the new turnover has to belong to the new Business Unit 'BU2', but the
old turnover should belong to Business Unit 'BU1'.
If you used the natural business key 'E1' for your employee within your
data warehouse, everything would be allocated to Business Unit 'BU2', even
what actually belongs to 'BU1'.
If you use surrogate keys, you could create on the 2nd of June a new record
for Employee 'E1' in your Employee Dimension with a new surrogate key.
This way, in your fact table, you have your old data (before the 2nd of June)
with the SID of Employee 'E1' + 'BU1'. All new data (after the 2nd of June)
would take the SID of Employee 'E1' + 'BU2'.
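The E1 / BU1 / BU2 scenario can be sketched with sqlite3 as below. The table and column names (employee_dim, turnover_fact, valid_from, valid_to) and the turnover amounts are assumptions for illustration. The idea shown is that each change of business unit creates a new dimension row with a new surrogate key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee_dim (
    employee_sid INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
    employee_id TEXT,                                -- natural business key
    business_unit TEXT,
    valid_from TEXT,
    valid_to TEXT                                    -- NULL marks the current row
);
CREATE TABLE turnover_fact (employee_sid INTEGER, sale_date TEXT, amount INTEGER);
""")

def assign_unit(employee_id, business_unit, change_date):
    """Close the current dimension row (if any) and open a new one with a new SID."""
    conn.execute(
        "UPDATE employee_dim SET valid_to = ? WHERE employee_id = ? AND valid_to IS NULL",
        (change_date, employee_id),
    )
    conn.execute(
        "INSERT INTO employee_dim (employee_id, business_unit, valid_from) VALUES (?, ?, ?)",
        (employee_id, business_unit, change_date),
    )

def record_turnover(employee_id, sale_date, amount):
    """Store the fact against the SID that is current for the employee."""
    sid = conn.execute(
        "SELECT employee_sid FROM employee_dim WHERE employee_id = ? AND valid_to IS NULL",
        (employee_id,),
    ).fetchone()[0]
    conn.execute("INSERT INTO turnover_fact VALUES (?, ?, ?)", (sid, sale_date, amount))

assign_unit("E1", "BU1", "2002-01-01")
record_turnover("E1", "2002-03-15", 1000)
assign_unit("E1", "BU2", "2002-06-02")   # E1 moves; a new surrogate key is issued
record_turnover("E1", "2002-07-01", 400)

turnover_by_unit = conn.execute("""
    SELECT d.business_unit, SUM(f.amount)
    FROM turnover_fact f JOIN employee_dim d USING (employee_sid)
    GROUP BY d.business_unit ORDER BY d.business_unit
""").fetchall()
```

Because the facts point at surrogate keys rather than at 'E1', the old turnover stays with BU1 and the new turnover goes to BU2.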
Ans:- A linked cube is one in which a subset of the data can be analysed in great
detail. The linking ensures that the data in the cubes remains consistent.
Ans:- View - stores the SQL statement in the database and lets you use it as a
table. Every time you access the view, the SQL statement executes.
Materialized view - stores the results of the SQL in table form in the
database. The SQL statement executes only once, and after that, every time you
run the query, the stored result set is used (until the materialized view is
refreshed). Pros include quick query results.
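The difference can be seen in a small sqlite3 sketch. Note the assumption: SQLite has no materialized views, so a table created with CREATE TABLE ... AS SELECT stands in for one here; databases such as Oracle have a dedicated materialized view feature with refresh options. The table and view names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", [("A", 10), ("B", 20)])

# A view stores only the query; it is re-run on every access.
conn.execute("CREATE VIEW sales_total_v AS SELECT SUM(amount) AS total FROM sales")
# Stand-in for a materialized view: the result set itself is stored as a table.
conn.execute("CREATE TABLE sales_total_mv AS SELECT SUM(amount) AS total FROM sales")

# New data arrives after both objects were created.
conn.execute("INSERT INTO sales VALUES ('C', 30)")

view_total = conn.execute("SELECT total FROM sales_total_v").fetchone()[0]
stored_total = conn.execute("SELECT total FROM sales_total_mv").fetchone()[0]
```

The view reflects the new row immediately, while the stored result stays at its old value until it is rebuilt.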
Ans:- Basically, the fact table consists of the index keys of the
dimension/lookup tables and the measures.
So whenever we have the keys in a table, that itself implies that the table is
in normal form.
Ans:- The basic difference is that E-R modeling has both a logical and a physical
model, while dimensional modeling has only a physical model.
E-R modeling is used for normalizing the OLTP database design.
Dimensional modeling is used for de-normalizing the ROLAP/MOLAP design.
Ans:- Conformed dimensions are dimensions that can be used across
multiple data marts in combination with multiple fact tables.
Ans:- Every company has a methodology of its own. But to name a few,
the SDLC methodology and the AIM methodology are commonly used. Other
methodologies are AMM, World class methodology and many more.
32. What is BUS Schema?
Ans:- A BUS schema is composed of a master suite of conformed dimensions
and standardized definitions of facts.
Ans:- You can disconnect the report from the catalog to which it is attached
by saving the report with a snapshot of the data. However, you must
reconnect to the catalog if you want to refresh the data.
38. Are OLAP databases called decision support systems?
True/false?
Ans:- True.
Ans:- You can schedule any report using Business Objects (Reporter):
1) Open the report in BO.
2) Select the option "File -> Send To -> BCA".
3) Select the BCA name to which the report has to be scheduled.
4) Set other options for report scheduling, like time, any macro, user, etc.
42. What are an aggregate table and an aggregate fact table? Any
examples of both?
Ans:- An aggregate table contains summarised data. Materialized views
are aggregate tables. For example, in sales we have only date-level transactions. If we
want to create a report like sales by product per year, we
aggregate the date values into week_agg, month_agg, quarter_agg and
year_agg tables. To retrieve data from these tables we use the @Aggregate_Aware function.
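As a hedged sketch of how a year_agg table might be built from date-level sales, the following uses sqlite3. The sales_fact table, its columns and the sample rows are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_fact (sale_date TEXT, product TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", [
    ("2023-01-15", "A", 100), ("2023-06-30", "A", 50),
    ("2023-03-01", "B", 70), ("2024-02-10", "A", 20),
])

# Summarise the date-level rows once, so yearly reports read the small table.
conn.execute("""
    CREATE TABLE year_agg AS
    SELECT strftime('%Y', sale_date) AS sale_year, product, SUM(amount) AS total_amount
    FROM sales_fact
    GROUP BY strftime('%Y', sale_date), product
""")

year_agg = conn.execute(
    "SELECT sale_year, product, total_amount FROM year_agg ORDER BY sale_year, product"
).fetchall()
```

A "sales by product per year" report can then query year_agg directly instead of scanning every transaction.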
Ans:- A data warehouse is the place where the data is stored for analysis,
whereas OLAP is the process of analyzing the data, managing
aggregations, and partitioning information into cubes for in-depth visualization.
· Transformation
The transform data task allows point-to-point generation, modification and
transformation of data.
· Loading
Load data task adds records to a database table in a warehouse.
Data modeling is probably the most labor intensive and time consuming part
of the development process. Why bother especially if you are pressed for
time? A common response by practitioners who write on the subject is that
you should no more build a database without a model than you should build
a house without blueprints.
The goal of the data model is to make sure that all the data objects required
by the database are completely and accurately represented. Because the
data model uses easily understood notations and natural language, it can be
reviewed and verified as correct by the end users.
Since Chen wrote his paper the model has been extended, and today it is
commonly used for database design. For the database designer, the utility of
the ER model is:
It maps well to the relational model. The constructs used in the ER model
can easily be transformed into relational tables.
It is simple and easy to understand with a minimum of training. Therefore,
the model can be used by the database designer to communicate the design
to the end user.
Ans:- A star schema is a way of organising the tables such that we can
retrieve results from the database easily and quickly in the warehouse
environment. Usually a star schema consists of one or more dimension tables
around a fact table, which looks like a star; that is how it got its name.
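A minimal star schema, sketched with sqlite3: one fact table whose foreign keys point at two dimension tables. All table names, columns and rows are hypothetical and only illustrate the layout and a typical query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE date_dim (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
-- The fact table holds the dimension keys as foreign keys, plus the measure.
CREATE TABLE sales_fact (
    date_key INTEGER REFERENCES date_dim (date_key),
    product_key INTEGER REFERENCES product_dim (product_key),
    amount INTEGER
);
INSERT INTO date_dim VALUES (1, 2023, 1), (2, 2023, 2);
INSERT INTO product_dim VALUES (10, 'Pen', 'Stationery'), (11, 'Lamp', 'Furniture');
INSERT INTO sales_fact VALUES (1, 10, 5), (1, 11, 40), (2, 10, 7);
""")

# Each dimension joins directly to the fact table; no dimension-to-dimension joins.
sales_by_category = conn.execute("""
    SELECT p.category, d.year, SUM(f.amount)
    FROM sales_fact f
    JOIN date_dim d ON d.date_key = f.date_key
    JOIN product_dim p ON p.product_key = f.product_key
    GROUP BY p.category, d.year
    ORDER BY p.category
""").fetchall()
```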
2. Data Contents
3. Database Design
4. View
Ans:- A lookup table is one that is used when updating a warehouse.
When the lookup is placed on the target table (fact table / warehouse) based
upon the primary key of the target, it updates the table by allowing only
new records or updated records through, based on the lookup condition.
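The lookup behaviour described above can be sketched in plain Python, with a dictionary keyed by primary key standing in for the target table. The apply_with_lookup helper and the sample records are hypothetical; an ETL tool would provide this as a built-in lookup step.

```python
def apply_with_lookup(target, incoming):
    """Insert new records, update changed ones, and skip unchanged ones."""
    inserted, updated = [], []
    for key, row in incoming.items():
        if key not in target:       # lookup miss: a new record
            target[key] = row
            inserted.append(key)
        elif target[key] != row:    # lookup hit with different values: an update
            target[key] = row
            updated.append(key)
        # lookup hit with identical values: nothing to do
    return inserted, updated

# Target rows keyed by primary key, and a batch of incoming rows.
target = {1: {"name": "Ann", "city": "Rome"}, 2: {"name": "Bob", "city": "Oslo"}}
incoming = {
    2: {"name": "Bob", "city": "Paris"},   # changed
    3: {"name": "Cy", "city": "Lima"},     # new
    1: {"name": "Ann", "city": "Rome"},    # unchanged
}
inserted, updated = apply_with_lookup(target, incoming)
```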
Ans:- Data marts are designed to help managers make strategic decisions
about their business.
Data marts are subsets of the corporate-wide data that are of value to a
specific group of users.
There are two types of data marts:
1. Independent data marts – sourced from data captured from OLTP systems,
external providers or from data generated locally within a particular
department or geographic area.
2. Dependent data marts – sourced directly from enterprise data warehouses.
Ans:- Star schema - all dimensions are linked directly to a fact table.
Snowflake schema - dimensions may be interlinked or may have one-to-many
relationships with other tables.
64. Which columns go to the fact table and which columns go to the
dimension table?
The Primary Key columns of the Dimension Tables go to the Fact Tables as
Foreign Keys.
Ans:- Granularity
The first step in designing a fact table is to determine the granularity of the
fact table. By granularity, we mean the lowest level of information that will be
stored in the fact table. This constitutes two steps:
Ans:- Level of granularity means the level of detail that you put into the fact
table in a data warehouse. For example, based on the design you can decide to
store the sales data for each transaction. The level of granularity then means
how much detail you are willing to store for each transactional fact: for instance,
product sales for each individual transaction, or sales aggregated up to the
minute and stored at that level.
Window of opportunity refers to the time interval; if the DBA is
unable to take a backup within the specified time, the database is
considered a VLDB.
73. What are Semi-additive and factless facts and in which scenario
will you use such kinds of fact tables?
Ans:- Conformed dimensions mean the exact same thing with every possible
fact table to which they are joined.
Ex: The Date dimension is connected to all facts, such as Sales facts, Inventory
facts, etc.
77. Why are OLTP database designs not generally a good idea for a
Data Warehouse?
Ans:- In OLTP, tables are normalised, so query response will be
slow for the end user. Also, OLTP does not contain years of data and hence cannot
be used for historical analysis.
Ans:- On the fact table it is best to use bitmap indexes. Dimension tables
can use bitmap and/or the other types of clustered/non-clustered,
unique/non-unique indexes.
To my knowledge, SQL Server does not support bitmap indexes, whereas Oracle
does.
79. Why should you put your data warehouse on a different system
than your OLTP system?
Ans:- An OLTP system is basically "data oriented" (ER model) and not
"subject oriented" (dimensional model). That is why we design a separate
system that will have a subject-oriented OLAP system.
Moreover, a complex query fired on an OLTP system will cause heavy
overhead on the OLTP server, which will directly affect day-to-day business.
The loading of a warehouse will likely consume a lot of machine
resources. Additionally, users may create queries or reports that are very
resource intensive because of the potentially large amount of data available.
Such loads and resource needs will conflict with the needs of the OLTP systems
for resources and will negatively impact those production systems.
Ans:- Transaction logs are written sequentially and rarely need to be read.
The ideal is to have each on RAID 1/0 because it has much better write
performance than RAID 5.
RAID 1 is also better for TX logs and costs less than 1/0 to implement. It has
a tad less reliability and performance is a little worse generally speaking.
RAID 5 is best for data generally because of cost and the fact it provides
great read capability.