
SS G515 - Comprehensive Examination

NAME: IDNO:
BIRLA INSTITUTE OF TECHNOLOGY & SCIENCE, PILANI
I SEMESTER 2006-2007
SS G515 DATA WAREHOUSING
Comprehensive Examination
PART A (CLOSED BOOK)
12th Dec. 2006 Weightage: 25% Time: 3 hours
Points to note:
• Answer multiple-choice questions in the question paper itself
• Some questions may have more than one correct option. You will get credit only if you mark all the correct options
• There is NO NEGATIVE marking
• ENCIRCLE the correct option(s) using ink
• Short answer questions are to be solved in the answer sheet provided
Multiple-Choice Questions (30*0.5=15)
1. Pick the correct statement(s):
(a) Snowflaking affects the fact table
(b) Outriggers affect the fact table
(c) Mini-dimensions affect the fact table
(d) Role-playing dimensions affect the fact table
2. Real-time data warehousing is closely related to:
(a) Class I ODS
(b) Class II ODS
(c) Class III ODS
(d) Class IV ODS
3. Granularity of the operational systems is definitely captured in:
(a) Data warehouse
(b) Super mart
(c) Operational data store
(d) Data mart
4. Bitmap indexes
(a) Retrieve the records satisfying a given condition
(b) Identify the records satisfying a given condition
(c) Both identify and retrieve
(d) None
5. Coverage tables contain
(a) Products on promotion that were sold
(b) Products on promotion that were not sold
(c) Products that were sold
(d) (a) & (b)
6. The number of summary tables (assume separate fact table approach to store summary
data), if we have seven dimension tables and one level of hierarchy along all dimensions:
(a) 43
(b) 63
(c) 49
(d) 243
7. Pick the odd one out:
(a) Dependent data marts
(b) Independent data marts
(c) Data warehouse
(d) ODS
8. A dimension can be added to an existing star schema when it is at:
(a) A finer granularity than the fact table
(b) A coarser granularity than the fact table
(c) The same granularity as that of the fact table
(d) Granularity has nothing to do with adding a dimension table
9. Periodic refreshing of the data warehouse always
(a) Adds new data and archives old data
(b) Adds new data and deletes old data
(c) Adds new data
(d) Updates existing values in the data warehouse
10. Pick the odd one out:
(a) Data vault
(b) Business dimension modeling
(c) ER modeling
(d) Dimensional modeling
11. Pick the correct statement(s):
(a) A data mart is a complete subset of a data warehouse
(b) A supermart is a complete subset of a data warehouse
(c) Both are complete subsets of a data warehouse
(d) All data marts in an organization have the same granularity
12. Most common kind of queries in a data warehouse:
(a) Inside-out queries
(b) Outside-in queries
(c) Browse queries
(d) Range queries
13. An OLAP tool provides for:
(a) Multidimensional Analysis
(b) Roll-up and drill-down
(c) Slicing and dicing
(d) Rotation
14. Lookup tables:
(a) Speed up the loading of dimension tables
(b) Speed up the loading of fact tables
(c) Speed up outside-in queries
(d) Speed up inside-out queries
15. Snowflaking is not recommended in a data warehouse because:
(a) It makes browsing difficult
(b) It prohibits the use of bitmap indexes
(c) Queries require more joins
(d) All of the above
16. Partitioning advantages include:
(a) Speeding up queries
(b) Allowing incremental backups
(c) Ease of data purging
(d) Ease of refreshing
17. Pick the odd view maintenance scheme:
(a) Lazy
(b) Periodic
(c) Immediate
(d) Forced
18. Data modeling technique used for data marts:
(a) Dimensional modeling
(b) ER – model
(c) Extended ER – model
(d) All
19. Using the top-down approach of Inmon for building an enterprise data warehouse involves:
(a) High risk high reward
(b) High risk low reward
(c) Low risk low reward
(d) Low risk high reward
20. Building a data mart for a business process/department that is very critical for your
organization is a ______________ project:
(a) High risk high reward
(b) High risk low reward
(c) Low risk low reward
(d) Low risk high reward
21. A rapidly changing monster dimension can be handled using:
(a) Outrigger
(b) Mini-dimension
(c) Snow-flaking
(d) Vertical splitting
22. Pick the correct statement(s) about fact tables
(a) Natural keys can appear in the fact table
(b) The same dimension can appear many times in a fact table
(c) Base level & summarized data can appear in the same fact table
(d) Null values can appear in a fact table
23. The number of aggregated tables in a data mart for a particular business process depends
on:
(a) Number of facts
(b) Number of dimensions
(c) Levels of hierarchies in each dimension
(d) Method of storing aggregated records
24. Examples of multi-valued dimensions
(a) Patient having many diagnoses
(b) Customer buying many products
(c) Different salespersons on a given day in a daily-grain sales fact table
(d) All of the above
25. Partitioning with respect to the time dimension is generally recommended because:
(a) It can be easily done using range partitioning
(b) It simplifies the refresh process
(c) It allows for incremental backups
(d) Definition of time never changes
26. Pick the odd one out:
(a) Inside-out queries
(b) Outside-in queries
(c) Fact-focused
(d) Dimension-focused
27. Aggregate navigator is a:
(a) Middleware
(b) Form of query optimization
(c) View maintenance software
(d) Aggregate generator
28. Finding out about products that were on promotion but did not sell requires:
(a) Roll-up
(b) Slicing & dicing
(c) Drill-through
(d) Drill-across
29. Dimensional modeling is more restrictive than ER modeling because:
(a) Data is always classified as fact or dimension
(b) Dimension tables must have single field primary keys
(c) Dimension tables cannot be normalized
(d) Two dimension tables cannot be linked through foreign keys
30. The CUBE operator of SQL:
(a) Stores aggregate records in separate tables
(b) Stores aggregate records in multidimensional arrays
(c) Computes all possible aggregates
(d) Stores aggregate records in the same table
Short Answer Questions (5*2=10)
1. Give a situation when the use of multi-dimensional databases is not recommended. List the
problems associated with using multidimensional databases in the data warehouse
architecture.
2. Give steps of the aggregate navigation algorithm. Also give the architecture of the aggregate
navigator.
3. Compare and contrast the concept and use of data marts in Kimball’s and Inmon’s
approaches.
4. What points should one keep in mind while creating aggregates?
5. Discuss the role of an ODS in data warehousing.

BIRLA INSTITUTE OF TECHNOLOGY & SCIENCE, PILANI
I SEMESTER 2006-2007
SS G515 DATA WAREHOUSING
Comprehensive Examination
Date: 12th December 2006
Time: 3 Hours
Weightage: 40% [Part A (closed book) – 25 & Part B (open book) – 15]
Part B – Open Book

1. There is a hotel chain in the US with 500 hotels spread over 40 states.
There are two brand lines of hotels. The primary line of hotels features
larger-than-average rooms, most of which are suites. The target customers are
business travelers and upscale vacationers. The second line of hotels features
competitive rates, though with limited facilities. The target customers are
price-sensitive customers. There are different types of rooms, such as standard
rooms and suites. Rooms are also categorized by size as small, medium, and
large. Each room may also incorporate certain optional features, such as a
refrigerator or kitchenette.
The management wants to analyze the use of the hotel chain's capacity, i.e.
the occupancy rate. The biggest challenge they face is determining how to
price the hotel rooms.
You are required to design a data mart for the hotel management based on
the following requirements:
(a) For any day, allow the occupancy rate to be analyzed across products or
locations. Products are particular room types.
Occupancy rate = occupied rooms / (occupied rooms + vacant rooms +
unavailable rooms)
(b) Over time, allow analysis of average utilization levels for specific hotels,
products or groups.
(c) For each room type and hotel, capture the accommodation revenues for
comparison to occupancy levels.
Design a star schema keeping in mind the above requirements. You are
required to identify all dimension and facts. Also classify each fact as additive,
semi-additive, or non-additive. Generate some reports using SQL that you
feel will be useful for the management.
[10]
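
The occupancy-rate formula in requirement (a) can be illustrated with a minimal Python sketch. The function name `occupancy_rate` and the room counts for the two hotels are hypothetical and are not taken from the question. The sketch only shows how the ratio is evaluated for one hotel on one day, and how a combined rate for several hotels can be obtained from summed room counts; it is not a model answer.

```python
def occupancy_rate(occupied: int, vacant: int, unavailable: int) -> float:
    """Occupied rooms divided by all rooms (occupied + vacant + unavailable)."""
    total = occupied + vacant + unavailable
    if total == 0:
        raise ValueError("no rooms counted")
    return occupied / total


# Hypothetical one-day room counts: (occupied, vacant, unavailable)
hotel_a = (80, 15, 5)     # 100 rooms in total
hotel_b = (30, 150, 20)   # 200 rooms in total

rate_a = occupancy_rate(*hotel_a)
rate_b = occupancy_rate(*hotel_b)

# Combined rate for both hotels: add the room counts first, then divide.
combined_counts = tuple(a + b for a, b in zip(hotel_a, hotel_b))
rate_combined = occupancy_rate(*combined_counts)

# A plain average of the two per-hotel rates ignores hotel size.
mean_of_rates = (rate_a + rate_b) / 2

print(f"hotel A: {rate_a:.3f}, hotel B: {rate_b:.3f}")
print(f"combined from counts: {rate_combined:.3f}")
print(f"mean of the two rates: {mean_of_rates:.3f}")
```

With these assumed counts the combined rate computed from summed room counts differs from the plain average of the two per-hotel rates, because the hotels differ in size.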
2. SPARSITY FAILURE
Consider a factless fact table containing the finest granularity attendance data
of students at BITS. The table contains data from academic year 2000-2001
onwards. Queries requiring aggregated attendance data are quite common.
Give the schema of the fact table and name the associated dimensions.
Suggest a suitable aggregation strategy. Will the phenomenon of sparsity
failure occur when you pre-compute the aggregates? Justify your answer.
[5]
