
NAME: IDNO:

BIRLA INSTITUTE OF TECHNOLOGY & SCIENCE, PILANI


I SEMESTER 2004-2005
SS G515 DATA WAREHOUSING
Comprehensive Examination
Date: 08th December 2004
Time: 3 Hours
Weightage: 35% [Part A (closed book) – 19 & Part B (open book) – 16]
Part A – Closed Book
Points to note:
- Answer multiple-choice questions in the question paper itself.
- Some questions may have more than one correct option. You will get credit only if you
  mark all the correct options.
- There is NO NEGATIVE MARKING.
- PUT A TICK on the correct option(s).
- Short-answer questions are to be solved in the supplementary answer sheet provided.
Multiple-Choice Questions (20*0.5=10)
1. Pick the correct statement(s):
(a) A data mart is a complete subset of a data warehouse
(b) A supermart is a complete subset of a data warehouse
(c) Both are complete subsets of a data warehouse
(d) All data marts in an organization have the same granularity
2. Pick the correct statement(s):
(a) ER modeling is more restrictive than dimensional modeling
(b) ER modeling represents a logical model
(c) ER model represents relationships explicitly
(d) ER model is not a conceptual model
dimensional models are logical models
3. Pick the correct statement(s):
(a) Dimensional model is a physical model
(b) Dimensional model represents relationships explicitly
(c) According to Inmon, dimensional modeling is preferred for data marts
(d) Dimensional modeling is more restrictive than ER modeling
4. For doing Market-Basket Analysis, we need to add a degenerate dimension,
Transaction ID, to the sales fact table of a grocery store data warehouse. Pick the
correct statement(s):
(a) We cannot alter the fact table
(b) We can always add transaction ID to the fact table
(c) Transaction ID can be added only when we have a daily grain
(d) Transaction ID can be added only when we have a receipt line grain
5. If the telephone numbers of a group of customers living in a particular city
change, you would handle the changes using:
(a) Type I change
(b) Type II change
(c) Type III change
(d) Minidimensions
6. Surrogate keys are used to:
(a) Buffer the data warehouse from changes in the operational system
(b) Save space
(c) Implement Type II changes
(d) Implement Type III changes
7. Surrogate keys are used to:
(a) Replace natural keys in dimension tables
(b) Replace natural keys in fact tables
(c) Reduce space requirements of dimension tables
(d) Reduce space requirements of fact tables
8. Most common kind of queries in a data warehouse:
(a) Inside-out queries
(b) Outside-in queries
(c) Browse queries
(d) Range queries
9. An OLAP tool provides for:
(a) Multidimensional Analysis
(b) Roll-up and drill-down
(c) Slicing and dicing
(d) Rotation
10. Lookup tables:
(a) Speed up the loading of dimension tables
(b) Speed up the loading of fact tables
(c) Speed up outside-in queries
(d) Speed up inside-out queries
11. Snowflaking is not recommended in a data warehouse because:
(a) It makes browsing difficult
(b) It prohibits the use of bitmap indexes
(c) Queries require more joins
(d) All of the above
12. In a data warehouse bitmap indexes are created on:
(a) Fact tables
(b) Dimension tables
(c) Normalized dimension tables
(d) Any kind of table
13. Partitioning advantages include:
(a) Speeding up queries
(b) Allowing incremental backups
(c) Ease of data purging
(d) Ease of refreshing
14. Pick the odd one out:
(a) Lazy
(b) Periodic
(c) Immediate
(d) Forced
15. A Business Intelligence system requires data from:
(a) Data warehouse
(b) Operational systems
(c) All possible sources within the organization and possibly from external
sources
(d) Web servers
16. A business intelligence system will have the following tools:
(a) OLAP tool
(b) Data mining tool
(c) Query tool
(d) Reporting tool
17. Data modeling technique used for data marts:
(a) Dimensional modeling
(b) ER model
(c) Extended ER model
(d) All
18. Building a data mart for a business process/department that is very critical for
your organization is a ______________ project:
(a) High risk high reward
(b) High risk low reward
(c) Low risk low reward
(d) Low risk high reward
19. In a data warehouse, if D1 and D2 are two conformed dimensions, then:
(a) D1 may be an exact replica of D2
(b) D1 may be at a rolled up level of granularity compared to D2
(c) Columns of D1 may be a subset of D2 and vice versa
(d) Rows of D1 may be a subset of D2 and vice versa
20. Granularity is related to:
(a) Fact tables only
(b) Dimension tables only
(c) Both fact and dimension tables
(d) Neither fact nor dimension table

Short Answer Questions (6*1.5=9)

1. What are supermarts? Compare and contrast them with data marts.
2. Both Kimball and Inmon agree on one aspect of data modeling for data
warehouses. What is it?
3. Suppose the marital status of a customer of AMAZON.COM changed from single
to married and also the customer moved to a new state. If in your warehouse you
don’t keep history, what problems would you face? Give two specific examples.
4. Explain how rapidly changing monster dimensions can be handled.
5. Explain how multivalued dimensions are handled in dimensional modeling.
Illustrate with a suitable example.
6. What are inside-out queries? Give an example.
Part B – Open Book

1. There is a hotel chain in the US with 500 hotels spread over 40 states. There are
two brand lines of hotels. The primary line of hotels features larger-than-average
rooms, most of which are suites. The target customers are business travelers and
upscale vacationers. The second line of hotels features competitive rates, though with
limited facilities. The target customers are price-sensitive customers. There are
different types of rooms, such as standard rooms and suites. Rooms are also categorized
by size as small, medium, and large. Each room may also incorporate certain optional
features, such as a refrigerator or kitchenette.
The management wants to analyze the use of the hotel chain’s capacity, i.e. the occupancy
rate. The biggest challenge they face is determining how to price the hotel rooms.
You are required to design a data mart for the hotel management based on the
following requirements:
(a) For any day, allow the occupancy rate to be analyzed across products or locations.
Products are particular room types.
Occupancy rate = occupied rooms / (occupied rooms + vacant rooms + unavailable rooms)
(b) Over time, allow analysis of average utilization levels for specific hotels, products
or groups.
(c) For each room type and hotel, capture the accommodation revenues for
comparison to occupancy levels.
Design a star schema keeping in mind the above requirements. You are required to
identify all dimensions and facts. Also classify each fact as additive, semi-additive, or
non-additive. Generate some reports using SQL that you feel will be useful for the
management.
[7]
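As an illustration only, the occupancy-rate formula from requirement (a) of question 1 can be sketched as follows. The function name and the room counts are hypothetical and are not part of the question; this is not a model answer.

```python
def occupancy_rate(occupied: int, vacant: int, unavailable: int) -> float:
    """Occupancy rate as defined in requirement (a):
    occupied rooms / (occupied + vacant + unavailable rooms)."""
    total = occupied + vacant + unavailable
    if total == 0:
        # No rooms counted at all, so the ratio is undefined.
        raise ValueError("no rooms counted")
    return occupied / total


# Hypothetical counts for one hotel on one day (made-up numbers).
rate = occupancy_rate(occupied=150, vacant=40, unavailable=10)
print(f"{rate:.2%}")  # 150 / 200 -> 75.00%
```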
2. What are the two main types of analysis that can be performed on the data warehouse
data? Give an overview of the tools that are required to do these kinds of analyses.
[2]
3. Computer company A keeps data about the PC models it sells in the schema:
Computers (number, proc, speed, memory, hd)
Monitors (number, screen, maxResX, maxResY)
For instance, the tuple (123, PIII, 500, 128, 18.7) in Computers means that model
123 has a Pentium III processor running at 500 megahertz with 128M of memory and
an 18.7G hard disk. The tuple (234, 19, 1024, 1024) in Monitors means that model
234 has a 19-inch screen with a maximum resolution of 1024×1024. The attribute
number in the relation Computers denotes a model (a number of the model) of a
computer while the attribute number in the relation Monitors denotes a model (a
number of the model) of a monitor.
Computer company B only sells complete systems, consisting of a computer and
monitor. Its schema is:
Systems (id, processor, mem, disk, screenSize)
The attribute processor holds the processor speed as an integer; the type of the
processor is not recorded. Neither is the maximum resolution of the monitor recorded.
Attributes id, mem, and disk are analogous to number, memory, and hd from company A,
but the disk size is measured in megabytes instead of gigabytes.
(a) Suggest a global data warehouse schema that would allow us to maintain as
much information as we could about the products sold by companies A and B.
(b) Write SQL queries to gather the information from the data at companies A and
B and put it in a warehouse with your global schema from part (a).
[7]
