
Birla Institute of Technology & Science, Pilani
Distance Learning Programmes Division
Second Semester 2008-2009
Comprehensive Examination (EC-2 Regular)

Course No.: SS ZG515
Course Title: DATA WAREHOUSING
Nature of Exam: Open Book
Weightage: 60%
Duration: 3 Hours
Date of Exam: 04/04/2009 (AN)
No. of Pages = 1
No. of Questions = 6

Note:
1. Please follow all the Instructions to Candidates given on the cover page of the answer book.
2. All parts of a question should be answered consecutively. Each answer should start from a fresh page.
3. Mobile phones and computers of any kind should not be used inside the examination hall.
4. Use of any unfair means will result in severe disciplinary action.

Q.1 What new operators are being added to SQL to make it more suitable for OLAP? Give details about these operators. [10]

Q.2 Compare and contrast aggregates and indexes. Also compare and contrast aggregates and data marts. [10]

Q.3 What role do views play in dimensional modeling? Give example situations where one would use views. [10]

Q.4 A bank offers 25 different types of services. The bank has 250 branches spread over 50 states. On any given day it is observed that only 15% of the account holders utilize the bank services. On average each branch has 30,000 accounts. The bank is currently holding data for 900 days (monthly aggregates are required).
(a) Estimate the size of the warehouse in terms of records.
(b) Assuming sparsity for 1-way, 2-way, and 3-way aggregates as 50%, 75%, and 100% respectively, estimate the size of the warehouse when you create all possible aggregates.
(c) Did the phenomenon of sparsity failure occur? Explain.
[2 + 6 + 2 = 10]

Q.5 (a). Should the application developers be aware of the partitioning strategy and aggregations available? Give reasons in support of your answer. [5]

Q.5 (b). Why is it recommended to partition data only with respect to the time dimension? What additional benefits do we get by doing so? (Compare with the benefits of partitioning with respect to dimensions other than time.) [5]

Q.6 What are degenerate dimensions? Explain how a degenerate dimension can help us in carrying out market-basket analysis. Market-basket analysis allows you to find out what kinds of products customers are buying together during a visit to the grocery store. This analysis is not linked to individual customers, in the sense that a customer visiting twice would be considered a different instance of customer. Assuming finest-granularity data, modify the star schema in such a way that you are able to do the market-basket analysis. Also assume that there are three dimensions, viz. product, store, and time, associated with the sales fact table. How would you modify your star schema in order to allow for this kind of analysis? [10]

**********

Q.7 In a university data warehouse, give a star schema to capture the attendance of students in different courses. It is required to generate a report that would give the semester-wise percentage attendance in different courses. Suggest any aggregate(s) that would facilitate generation of this report. Give the aggregated schemas also. It is given that the average attendance in courses is 50%.
Percentage attendance = (Σ Ni) / (Total no. of students × Total no. of lectures)
where Ni is the no. of students attending the ith lecture. [10]

Q.8 Why is it important to have a time dimension in a data warehouse? What makes the time dimension different from other dimensions? What points should be kept in mind while designing the time dimension? [10]

Q.9 In a university data warehouse, how would you model the grades of students: as a fact or as a dimension? Give a detailed justification in support of your answer. Also design a star schema that would capture, for each semester, the grades of students in different courses. (Assume the BITS education model.) [10]

Q.10 A Direct-to-home (DTH) TV company wants to build a data warehouse to analyze data that it has collected over the years. It has a huge base of customers all over India. The customers are basically households in different parts of the country. The country is divided into four regions (N, S, E, & W). The company offers 100+ channels, which are clubbed into different packages, and there are some core channels which are part of every package. Each package has a different rate. The company offers different subscription models (monthly, half-yearly, & yearly). From time to time the company offers some promotional schemes to its existing and potential customers. The company keeps track of all the subscriptions sold and their current status (new, old, & discontinued). The subscriptions are sold through agents.
The company wants to design some tailor-made packages for households based on the composition of the households (elderly men, elderly women, adult males, adult females, teenagers, & kids) and their demographics. You are required to design a data warehouse for the DTH company. First do a detailed requirements analysis and combine it with the information provided above to come up with a detailed star schema. Use all the advanced dimensional modeling techniques you have learnt in the course to model the above case study. You are allowed to distort the classical star schema, but give proper justification for doing so. Suggest a suitable aggregation and partitioning strategy. Give example tuples for each relation you have in your design. [10+4+4+2]
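
Illustration (not part of the question paper): the percentage attendance formula in the attendance question (Q.7) can be checked with a minimal sketch. It assumes that Ni is summed over all lectures of a course and that the ratio is multiplied by 100 to express it as a percentage; neither step is spelled out in the paper. The function name, parameter names, and sample figures are hypothetical.

```python
def percentage_attendance(attendance_per_lecture, total_students):
    """Return the percentage attendance for one course in one semester.

    attendance_per_lecture: list whose i-th entry is Ni, the number of
        students who attended the i-th lecture.
    total_students: number of students registered in the course.
    """
    total_lectures = len(attendance_per_lecture)
    if total_lectures == 0 or total_students <= 0:
        raise ValueError("need at least one lecture and one student")
    # Assumption: Ni is summed over all lectures, then scaled by 100.
    attended = sum(attendance_per_lecture)
    return 100.0 * attended / (total_students * total_lectures)


# Hypothetical course: 50 registered students, 4 lectures.
print(percentage_attendance([20, 30, 25, 25], 50))
```

With these sample figures the result agrees with the 50% average attendance stated in the question.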
