
1. (3 points for each)

a. A Datalog query is safe when every variable that appears on the right appears in a non-negated, non-arithmetic body part.

b. Inserts and Updates.

c. Updates and Deletes.

d. SELECT A.Aonly, A.common, B.Bonly FROM A,B WHERE A.common1 = B.common1 AND A.common2 = B.common2 AND ...

(where Aonly are the attributes of A not in B, Bonly are the attributes of B not in A, A.common are the attributes common to A and B - comprised of common1, common2, and so on).

e. Preservation of MDs.

f. None.

g. Tuning the schema, reformulating the query, and controlling the use of indices.

h. True.

i. True.

j. Database systems. This question is for fun only (as in all my exams) and points are awarded irrespective of what you write.

2. There are five entity sets - books, authors, book publishers, book sellers, and book stores. Books and authors are connected by a many-many relationship. Books and book publishers are connected by a many-one relationship. There is a referential integrity constraint from books to book publishers.

Book sellers has a one-many relationship to book stores. Two subclasses inherit from book stores - brick and web-based. Book stores has a many-many relationship to book publishers.

There are at least two notes: (i) A book seller must have at least one book store. (ii) A book store can be brick or web-based but not both.

3. First decompose the schema into BCNF before decomposition into 4NF. The key is {A,B,C}. Since the B->D FD is violating BCNF, we decompose the relation schema into {B,D} and {A,B,C}. This latter schema is in 4NF (all MDs are trivial), so we are done.

4. This query is not impossible.

CREATE VIEW Occurs(name,count) AS SELECT name, COUNT(*) FROM Students GROUP BY name;

SELECT name FROM Occurs WHERE count = (SELECT MAX(count) FROM Occurs);
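As a rough check, the view and query above can be run with Python's built-in sqlite3 module. The Students columns and the sample rows below are assumptions made up for illustration; only the view and the final query come from the answer.

```python
import sqlite3

# In-memory database with a hypothetical Students table (columns assumed).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?)",
    [(1, "Ann"), (2, "Bob"), (3, "Ann"), (4, "Cy"), (5, "Bob"), (6, "Ann")],
)

# The view from the answer: one row per name with its number of occurrences.
conn.execute(
    "CREATE VIEW Occurs(name, count) AS "
    "SELECT name, COUNT(*) FROM Students GROUP BY name"
)

# The query from the answer: names whose count equals the maximum count.
most_common = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM Occurs WHERE count = (SELECT MAX(count) FROM Occurs)"
    )
]
print(most_common)
```

If several names tie for the maximum count, the query returns all of them.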

5. We can first find the gas station with the highest revenue, remove it, and find the station with the highest revenue (again). The following Datalog program mimics this idea.

CannotbeHighest(x,y,z,r1) <- GasStations(x,y,z,r1), GasStations(_,_,_,r2), r1 < r2.
CannotbeSecondHighest(x) <- CannotbeHighest(x,y,z,r1), CannotbeHighest(_,_,_,r2), r1 < r2.
SecondHighest(x) <- CannotbeHighest(x,_,_,_), NOT CannotbeSecondHighest(x).
Ans(y) <- GasStations(x,_,y,_), SecondHighest(x).
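The four rules can be mirrored with Python set comprehensions, one set per Datalog predicate. This is only a sketch: the meaning of the GasStations columns (id, name, city, revenue) and the sample rows are assumptions, since the exam schema is not reproduced here.

```python
# Hypothetical GasStations tuples: (id, name, city, revenue).
gas_stations = {
    (1, "A", "North", 500),
    (2, "B", "South", 900),
    (3, "C", "East", 700),
    (4, "D", "West", 300),
}

# CannotbeHighest: stations for which some other station has more revenue.
cannot_be_highest = {s for s in gas_stations for t in gas_stations if s[3] < t[3]}

# CannotbeSecondHighest: among the non-highest, those beaten by another non-highest.
cannot_be_second = {
    s[0] for s in cannot_be_highest for t in cannot_be_highest if s[3] < t[3]
}

# SecondHighest: non-highest stations that are not beaten within that group.
second_highest = {s[0] for s in cannot_be_highest} - cannot_be_second

# Ans: the third attribute of the second-highest station(s).
ans = {s[2] for s in gas_stations if s[0] in second_highest}
print(ans)
```

The set difference plays the role of the NOT in the third rule.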

6. This problem is merely asking you to do the divide operation, which we have seen in many homeworks (e.g., "find the customers who have rented ALL the DVD movies", "find the students who have taken ALL the courses required for graduation"). Similarly, find the As that pair up with ALL the Bs.

// first make a list of all As
AllAs = pi_{A1,A2,...,An} R

// the list of Bs is given in S
AllBs = S

// ideally, all the As occur with all the Bs
Ideally = AllAs X AllBs

// the reality is given by the relation R
Reality = R

// their difference identifies some As and the Bs they don't occur with
Notpresent = pi_{A1,A2,...,An} (Ideally - Reality)

// these must be the bad As, so let's remove them!
Ans = AllAs - Notpresent
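The same division steps can be traced on a small example with Python sets. The sketch assumes R is stored as (a, b) pairs and S as a set of b values; the sample data is invented for illustration.

```python
# R pairs each A value with the B values it occurs with (sample data assumed).
R = {("a1", "b1"), ("a1", "b2"), ("a2", "b1"), ("a3", "b1"), ("a3", "b2")}
S = {"b1", "b2"}

all_as = {a for (a, b) in R}                        # AllAs = pi_A R
all_bs = set(S)                                     # AllBs = S
ideally = {(a, b) for a in all_as for b in all_bs}  # Ideally = AllAs X AllBs
reality = R                                         # Reality = R
not_present = {a for (a, b) in ideally - reality}   # As missing at least one B
ans = all_as - not_present                          # As paired with ALL Bs
print(ans)
```

Here a2 is removed because it never occurs with b2.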

7. The first part can be solved by: sigma_{gender <> 'M' AND gender <> 'F'} (Students) = {}. Notice that we need the AND here, not the OR.

The second part can be solved quite simply by: pi_id (Students) - pi_ssn (Individuals) = {}, or we can say pi_id (Students) <subset of> pi_ssn (Individuals).

In other words, the id field of *every* record in Students must map into the ssn of *some* record in Individuals.
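Both constraints have the form "some expression must be empty", which can be sketched as two small Python checks. The record layout (dicts with id, gender and ssn keys) and the sample rows are assumptions.

```python
# Hypothetical sample data.
students = [{"id": 111, "gender": "F"}, {"id": 222, "gender": "M"}]
individuals = [{"ssn": 111}, {"ssn": 222}, {"ssn": 333}]

def gender_ok(students):
    # Violating tuples are those that are neither 'M' nor 'F' (AND, not OR).
    violating = [s for s in students if s["gender"] != "M" and s["gender"] != "F"]
    return violating == []

def gender_ok_with_or(students):
    # With OR, every tuple is "violating", because no value equals both 'M' and 'F'.
    violating = [s for s in students if s["gender"] != "M" or s["gender"] != "F"]
    return violating == []

def ids_ok(students, individuals):
    # pi_id(Students) - pi_ssn(Individuals) must be empty.
    return {s["id"] for s in students} - {i["ssn"] for i in individuals} == set()

print(gender_ok(students), gender_ok_with_or(students), ids_ok(students, individuals))
```

The OR variant rejects even valid data, which is why the answer insists on AND.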

8. This problem is taken straight from the last homework.

Spring 2001 Final Solution Sketches

-----------------------

1. a. m+n

b. Snap-together visualization: a way to rapidly prototype visualizations by connecting together information from database tables.

c. zero

d. Ones which are joins, involve aggregates etc.

e. Because it is cheaper than checking every tuple to ensure there are no duplicates. The answer is NOT "because that is how they designed it." The answer is NOT "to do this you need to use DISTINCT."

f. A referential integrity constraint is a stricter form of a -one constraint. Not only can there be at most one entity connected by the -one relationship, but that entity *has to exist*.

g. False.

h. Horizontal partitioning, vertical partitioning, decompositions, controlling the use of indices, avoiding subquery blocks etc.

i. pi_a (sigma_b (R times S))

where b = condition where attributes with similar names are matched up and a = list of attributes, with duplicates removed.

j. Opun.

2. Highways has to be an entity set. Thus the diagram has to be pushed out and the arrow in the "passes through" relationship towards Cities has to be removed.

3. The relational algebra condition is just short for B->CD. Therefore {A,B,E} is a key and the relation can be decomposed along the above FD to produce: {{B,C,D},{B,A,E}}.

4. Impossible.

5. The neatest solution I found is:

Answer(a) <- RequiredForGraduation(b), HasTaken(a,b), RequiredForGraduation(c), NOT HasTaken(a,c).

i.e., there is a course (b) that is required for graduation and that is taken by a, and there is a course (c) that is required for graduation and that is not taken by a. This is the definition of "some but not all."
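The rule translates almost literally into a Python set comprehension. The course and student names below are made-up sample data.

```python
# Hypothetical sample relations.
required_for_graduation = {"c1", "c2", "c3"}
has_taken = {
    ("ann", "c1"), ("ann", "c2"), ("ann", "c3"),  # ann has taken all required courses
    ("bob", "c1"),                                # bob has taken some but not all
    ("cy", "x9"),                                 # cy has taken no required course
}

# Answer(a) <- RequiredForGraduation(b), HasTaken(a,b),
#              RequiredForGraduation(c), NOT HasTaken(a,c).
answer = {
    a
    for (a, b) in has_taken
    if b in required_for_graduation
    for c in required_for_graduation
    if (a, c) not in has_taken
}
print(answer)
```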

6. Nobody seems to have got this right. Again, the pitfall is (as I have said numerous times), trying to solve it in one (or few) steps.

First find Ancestors (this was demonstrated in class):

Ancestor(x,y) <- Father(x,y).
Ancestor(x,y) <- Father(x,z), Ancestor(z,y).

Then, find CommonAncestors:

CommonAncestors(x,y,z) <- Ancestor(x,z), Ancestor(y,z), x<>y.

i.e., z is a common ancestor of x and y if z is an ancestor of x and z is an ancestor of y. But there could be many such CommonAncestors. The goal is to find the earliest. There is no way to do this in one step. Find the CommonAncestors that cannot be earliest and subtract.

CannotBeEarliest(x,y,z) <- CommonAncestors(x,y,z), CommonAncestors(x,y,w), Ancestor(w,z).

In other words, z cannot be the earliest common ancestor (ECA) of x and y, if z is himself an ancestor of w who is also a common ancestor of x and y. Thus,

ECA(x,y,z) <- CommonAncestors(x,y,z), NOT CannotBeEarliest(x,y,z).

Partial credit is given for good attempts.
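The whole program for problem 6 can be traced in Python on a tiny family tree. The sketch reads Father(x,y) as "y is the father of x", which is the reading that makes Ancestor(x,z) mean "z is an ancestor of x" as in the explanation; that reading and the sample names are assumptions.

```python
# Father facts as (child, father); sample data is invented.
father = {("al", "zed"), ("bob", "al"), ("carl", "bob"), ("dan", "bob")}

# Ancestor: start from Father and apply the recursive rule until nothing new appears.
ancestor = set(father)
while True:
    new = {(x, y) for (x, z) in father for (z2, y) in ancestor if z == z2} - ancestor
    if not new:
        break
    ancestor |= new

# CommonAncestors(x,y,z) <- Ancestor(x,z), Ancestor(y,z), x<>y.
common = {(x, y, z) for (x, z) in ancestor for (y, z2) in ancestor if z == z2 and x != y}

# CannotBeEarliest(x,y,z) <- CommonAncestors(x,y,z), CommonAncestors(x,y,w), Ancestor(w,z).
cannot_be_earliest = {
    (x, y, z)
    for (x, y, z) in common
    for (x2, y2, w) in common
    if (x, y) == (x2, y2) and (w, z) in ancestor
}

# ECA(x,y,z) <- CommonAncestors(x,y,z), NOT CannotBeEarliest(x,y,z).
eca = common - cannot_be_earliest
print({z for (x, y, z) in eca if (x, y) == ("carl", "dan")})
```

For carl and dan the common ancestors are bob, al and zed; al and zed are discarded because each is an ancestor of another common ancestor.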

7. CREATE VIEW T2(ship,battle,date,result) AS SELECT ship,battle,date,result FROM Outcomes, Battles WHERE Outcomes.battle = Battles.name;

SELECT T3.ship FROM T2 AS T3, T2 AS T4
WHERE T3.ship = T4.ship /* ships are the same */
AND T3.battle <> T4.battle /* "another" battle */
AND T3.date < T4.date /* "later" */
AND T3.result = 'damaged';
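A quick way to exercise this self-join is Python's sqlite3 module. The table layouts Outcomes(ship, battle, result) and Battles(name, date) are inferred from the query, and the rows are invented; the view column is named result so that it matches the final condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Battles (name TEXT, date TEXT)")
conn.execute("CREATE TABLE Outcomes (ship TEXT, battle TEXT, result TEXT)")
conn.executemany(
    "INSERT INTO Battles VALUES (?, ?)",
    [("B1", "1941-05-24"), ("B2", "1942-11-15")],
)
conn.executemany(
    "INSERT INTO Outcomes VALUES (?, ?, ?)",
    [
        ("Alpha", "B1", "damaged"),  # damaged first, fights again later
        ("Alpha", "B2", "ok"),
        ("Beta", "B1", "ok"),
        ("Beta", "B2", "damaged"),   # damaged in its last battle only
    ],
)

# View joining each outcome with the date of its battle.
conn.execute(
    "CREATE VIEW T2(ship, battle, date, result) AS "
    "SELECT ship, battle, date, result FROM Outcomes, Battles "
    "WHERE Outcomes.battle = Battles.name"
)

# Ships damaged in one battle that appear in another, later battle.
ships = [
    row[0]
    for row in conn.execute(
        "SELECT T3.ship FROM T2 AS T3, T2 AS T4 "
        "WHERE T3.ship = T4.ship AND T3.battle <> T4.battle "
        "AND T3.date < T4.date AND T3.result = 'damaged'"
    )
]
print(ships)
```

A ship with several later battles would appear once per later battle; adding DISTINCT would remove the repeats.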

8. Again, very few people solved this problem. Create a relation that has all numbers in it (0-9). Call it Numbers.
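To see what the cross product over Numbers is searching for, here is a brute-force sketch in Python of the SEND + MORE = MONEY puzzle. It adds two conventions that the answer sketch does not spell out and that are therefore assumptions: the eight letters stand for distinct digits, and the leading digits S and M are nonzero.

```python
from itertools import permutations

solutions = []
# Try every assignment of distinct digits to the eight letters.
for s, e, n, d, m, o, r, y in permutations(range(10), 8):
    if s == 0 or m == 0:
        continue  # assumed convention: no leading zeros
    send = 1000 * s + 100 * e + 10 * n + d
    more = 1000 * m + 100 * o + 10 * r + e
    money = 10000 * m + 1000 * o + 100 * n + 10 * e + y
    if send + more == money:
        solutions.append((send, more, money))

print(solutions)
```

The SQL form expresses the same search declaratively: each copy of Numbers supplies one letter's digit and the WHERE clause keeps the assignments that satisfy the sum.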

SELECT S,E,N,... FROM Numbers AS S, Numbers AS E, Numbers AS N, ... WHERE 1000*S+100*E+10*N+D + 1000*M+100*O+10*R+E = 10000*M+1000*O+100*N+10*E+Y;

Spring 2000 Final Answer key and Grading Checklist

--------------------------------

Question 1 (30 points): Short Answer Questions

----------------------------------------------------------

a. Lossless Join Decomposition

b. Any variable appearing on the right hand side of a query must appear in a non-negated, non-arithmetic clause.

c. Let S = R. Then FD AB->C holds if:

(S Theta_Join_{c} R) = {}

where c = (R.A = S.A AND R.B = S.B and R.C <> S.C)

d. Constraints and Triggers

e. Because operations on bags are easier to implement, do not require checking for membership (e.g. UNIONs) and no comparisons are necessary. See section 4.6.1 of your textbook.

f. nCf(n/2) where f() is either the floor or the ceiling operator.

g. Because every object of a class is assigned a unique OID, that can serve as its key.

h. superkey

i. Any number can be expressed as the sum of 1s. People who are good at knitting tend not to have the Y chromosome.

The answer is *not* beers and diapers, since that is a useful, interesting, and non-obvious pattern.

j. The best answer I found was that it leaves the right hand to have a drink! :-)
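The test in answer c (join R with a copy of itself and require that no pair agrees on A and B while differing on C) can be sketched in Python as follows. The function name and the sample relations are hypothetical.

```python
def fd_ab_to_c_holds(relation):
    """Return True if the FD AB -> C holds in the given list of tuples (dicts)."""
    # Theta-join of the relation with a copy of itself under the condition
    # R.A = S.A AND R.B = S.B AND R.C <> S.C; the FD holds iff this is empty.
    violations = [
        (r, s)
        for r in relation
        for s in relation
        if r["A"] == s["A"] and r["B"] == s["B"] and r["C"] != s["C"]
    ]
    return violations == []

good = [{"A": 1, "B": 1, "C": 5}, {"A": 1, "B": 2, "C": 6}, {"A": 1, "B": 1, "C": 5}]
bad = [{"A": 1, "B": 1, "C": 5}, {"A": 1, "B": 1, "C": 7}]
print(fd_ab_to_c_holds(good), fd_ab_to_c_holds(bad))
```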

Question 2 (10 points): E/R Diagram

----------------------------------------------------------

Everything except the last two sentences talks about "prescribe" as a verb and then suddenly it becomes a noun. This is a hint for "pushing out" a relationship into an entity set. Thus, prescriptions is a connecting entity set which typically will have many-one relationships to Doctors, Drugs, and Patients.

Due to the constraint in the last sentence, the arrows entering Drugs and Patients have to be removed.

Using prescription as a relationship will cost you five points. More mistakes will have additional penalties.

Question 3 (10 points): Decompositions

----------------------------------------------------------

Only one person answered this question satisfactorily (Subhangan Venkatesan).

We know that both BCNF and 3NF provide lossless join decompositions. All that needs to be done is to examine closely how we do decompositions in both.

In both cases, for a violating FD X->Y, we get {X,Y} on one side and {X,something} on the other.

So,

- the intersection of the two decompositions must be X (the left side of the FD)

- the right side of the FD must be on one side of the decomposition.

To phrase it in the form the question asks for:

A decomposition of R into S and T is a lossless-join decomposition if:

S Intersection T -> S OR S Intersection T -> T

either of the two holds.

As you can see, in the second example, S = {ABC}, T = {CDE}, but neither of C->ABC or C->CDE holds, hence it is not lossless. In the first example, A->ABC holds, so it is lossless.

Most of you got the first bullet right but extrapolated it to say that the intersection should be a key. It so happens in this case that A is a key, but this rule will not hold in general. Partial credit will be given for attempt.
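The test stated above can be sketched in Python by computing the attribute closure of S Intersection T and checking whether it covers S or T. The FD set and the first decomposition used below are hypothetical, since the exam's schemas and FDs are not reproduced in this answer key; only the second decomposition ({ABC}, {CDE}) is taken from the text.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, given as (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_lossless(s, t, fds):
    """True if (S Intersection T) -> S or (S Intersection T) -> T follows from fds."""
    common_closure = closure(set(s) & set(t), fds)
    return set(s) <= common_closure or set(t) <= common_closure

fds = [("A", "BC")]                     # assumed FD: A -> BC
print(is_lossless("ABC", "ADE", fds))   # intersection A determines ABC
print(is_lossless("ABC", "CDE", fds))   # intersection C determines neither side
```

Note that the intersection does not have to be a key of the whole relation: in the first call A determines ABC but not D or E.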

Question 4 (20 points): Queries and Misc.

----------------------------------------------------------

a. This is not impossible.

MultiAdvisor(Y) <- Advisor(X,Y), Advisor(Z,Y), X<>Z.
Answer(X) <- Student(X,Y), NOT MultiAdvisor(Y).

b. Minimum = 0. This will occur when no tuples match in R and S. Maximum = mn. This will occur when all tuples in R match all in S.

c. It will not hold if R, S and T are bags instead of sets.
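The bounds in part b can be checked on small relations in Python. The sketch assumes R has m = 3 tuples, S has n = 2 tuples, and the join compares the first component of each tuple; all data is invented.

```python
def join(r, s):
    # Join on the first component of each tuple.
    return [(x, y) for x in r for y in s if x[0] == y[0]]

# No tuples match: the join is empty (minimum = 0).
r_disjoint = [(1, "a"), (2, "b"), (3, "c")]
s_disjoint = [(4, "x"), (5, "y")]

# Every tuple of R matches every tuple of S: the join has m * n tuples.
r_same = [(1, "a"), (1, "b"), (1, "c")]
s_same = [(1, "x"), (1, "y")]

print(len(join(r_disjoint, s_disjoint)), len(join(r_same, s_same)))
```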

Question 5 (10 points): Monotonic Operators

----------------------------------------------------------

Only the difference operator is not monotone. In the expression A-B, adding more tuples to B can actually make the answer smaller. This is an exercise problem from the textbook.
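A two-line experiment with Python sets shows the effect; the values are arbitrary sample data.

```python
A = {1, 2, 3}
B = {3}
B_bigger = {2, 3}   # B with one more tuple added

diff_before = A - B          # difference with the original B
diff_after = A - B_bigger    # difference after B has grown

# Union, by contrast, can only grow when an argument grows.
union_before = A | B
union_after = A | B_bigger

print(diff_before, diff_after)
```

Growing B shrinks A - B, whereas the union with the larger B still contains the union with the smaller one.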

Question 6 (10 points): Relational Algebra

----------------------------------------------------------

Again, this is not impossible. Here's a sketch.

Ideally = ...
Reality = HasTaken
BadStudents = Ideally - Reality

Now, look at BadStudents and pick out the students who only occur once (i.e., are bad because of only one course). This is easily done by finding students who occur two or more times and then subtracting them from BadStudents.
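The sketch can be traced in Python. Since the definition of Ideally is elided above, the code assumes it is every student paired with every required course; that assumption, the relation names other than HasTaken, and the sample data are all hypothetical.

```python
# Hypothetical sample data.
students = {"ann", "bob", "cy"}
required = {"c1", "c2", "c3"}
has_taken = {
    ("ann", "c1"), ("ann", "c2"), ("ann", "c3"),  # misses nothing
    ("bob", "c1"), ("bob", "c2"),                 # misses exactly one course
    ("cy", "c1"),                                 # misses two courses
}

ideally = {(s, c) for s in students for c in required}  # assumed form of Ideally
reality = has_taken
bad_students = ideally - reality                         # (student, missing course) pairs

# Students occurring two or more times in BadStudents (a self-join on student).
two_or_more = {s for (s, c) in bad_students for (s2, c2) in bad_students
               if s == s2 and c != c2}

# Students occurring exactly once: bad because of only one course.
answer = {s for (s, c) in bad_students} - two_or_more
print(answer)
```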

Question 7 (10 points): SQL

----------------------------------------------------------

SELECT S1.name FROM Student S1, Student S2, Student S3
WHERE S1.gpa = S2.gpa AND S2.gpa = S3.gpa
AND S1.age < S2.age AND S1.age < S3.age AND S2.id <> S3.id

Leaving out the last AND will cost you three points.

Question 8 (15 points): SQL

----------------------------------------------------------

Again, this is not impossible. This problem is a solved one in the puzzles section on the course web page.
