Normalization
See: Pernul, Unland: Datenbanken im Unternehmen, Chapter 3.6.3.3
See also: Elmasri, Navathe: Fundamentals of Database Systems, Chapters 10-11
This means that the domain of the attribute may only contain single, indivisible
(atomic) values and that the value of the attribute in each tuple may only be a
single value from the domain. 1NF therefore prevents a set of values or a tuple
from being the attribute value of a single tuple. In other words, 1NF prevents
"relations inside relations" or "relations as attributes of tuples".
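As a minimal sketch of what "no relations inside relations" means in practice, the following Python snippet flattens a set-valued attribute into one tuple per value. The department data and the attribute names (dept_no, name, locations) are hypothetical and only serve as an illustration.

```python
# Hypothetical data: "locations" holds a set of values per tuple,
# so this relation is not in 1NF.
departments = [
    {"dept_no": 1, "name": "Research", "locations": ["Berlin", "Munich"]},
    {"dept_no": 2, "name": "Sales", "locations": ["Hamburg"]},
]

def to_1nf(rows, multi_attr):
    """Emit one tuple per element of the multivalued attribute."""
    flat = []
    for row in rows:
        for value in row[multi_attr]:
            new_row = {k: v for k, v in row.items() if k != multi_attr}
            new_row[multi_attr] = value  # now a single atomic value
            flat.append(new_row)
    return flat

flat_departments = to_1nf(departments, "locations")
for row in flat_departments:
    print(row)
```

After flattening, every attribute value is atomic; the price is that dept_no alone no longer identifies a tuple, so the key has to include the location.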
To make this more clear, take a look at the following example "company":
Transitive Dependency:
In the section on 2NF we did not consider that Zip Code → City. Let's expand
our relation (not in 3NF):
Now, we have the FDs Department Number → Zip Code and Zip Code → City,
so City depends on Department Number transitively because of Zip Code. In
the table of the example we can see that the information "City Berlin belongs to
Zip Code 13629" is redundant. So again we could form a new relation (City, Zip
Code) to eliminate this redundancy.
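The transitive dependency described above can be checked mechanically with an attribute closure. The following is a sketch under stated assumptions: the helper name closure and the sample rows are hypothetical, and only the pair 13629 / Berlin comes from the text.

```python
def closure(attrs, fds):
    """Return all attributes functionally determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

fds = [
    ({"Department Number"}, {"Zip Code"}),
    ({"Zip Code"}, {"City"}),
]

dept_closure = closure({"Department Number"}, fds)
zip_closure = closure({"Zip Code"}, fds)
# City is reached from Department Number only via Zip Code,
# and Zip Code does not determine Department Number.
print("City" in dept_closure, "Department Number" in zip_closure)

# Hypothetical rows (Department Number, Zip Code, City).
rows = [(1, "13629", "Berlin"), (2, "13629", "Berlin"), (3, "10115", "Berlin")]
dept_zip = {(d, z) for d, z, _ in rows}   # remaining relation
zip_city = {(z, c) for _, z, c in rows}   # new relation (Zip Code, City)
print(sorted(zip_city))
```

In the decomposed form the fact "13629 belongs to Berlin" is stored once, no matter how many departments share that zip code.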
If we choose Y as a proper subset of a candidate key, we can see that partial
functional dependencies are a special kind of transitive dependency; so 3NF
implies 2NF.
Boyce-Codd Normal Form:
BCNF is an even stronger normal form than 3NF, because dependencies between
prime attributes can still occur in a 3NF relation. A relation schema R is in
BCNF if for every nontrivial FD X → A, X is a superkey of R; this means that
every nontrivial FD has to have a superkey on the left side. Technically:
A 3NF relational schema RS(S;F) is in BCNF if for each Y ⊆ S and for each
attribute A ∈ S\Y the following holds: Y → A ⇒ Y → S.
In the table above, we can also notice the following FDs among the attributes
Zip Code, City, Street and Street Number:
City, Street, Street Number → Zip Code
Zip Code → City
The candidate keys for these four attributes are {City, Street, Street Number}
and {Zip Code, Street, Street Number}. So all four attributes are prime, but
there is still a dependency between some of them (see above): Zip Code → City,
whose left side is not a superkey.
We will not go into how to obtain BCNF from a 3NF relation.
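The BCNF condition for the address attributes above can be tested directly: every nontrivial FD must have a superkey on its left side. This is a minimal sketch; the function names closure and bcnf_violations are assumptions made for this illustration.

```python
def closure(attrs, fds):
    """Return all attributes functionally determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(attrs, fds):
    """Return the nontrivial FDs whose left side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != set(attrs)
    ]

address = {"City", "Street", "Street Number", "Zip Code"}
address_fds = [
    ({"City", "Street", "Street Number"}, {"Zip Code"}),
    ({"Zip Code"}, {"City"}),
]

violations = bcnf_violations(address, address_fds)
print(violations)
```

Only Zip Code → City is reported. Because City is a prime attribute, this FD does not violate 3NF, which is exactly the gap between 3NF and BCNF.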
Examples
Given:
RS(S;F) with S = {A, B, C, D, E} and F = {AB → CE, E → AB, C → D}
F is already minimal.
Given:
RS(S;F) with S = {A, B, C, D} and F = {AC → BD, D → A, CD → A}
Nonprime attribute: B
RS(S;F) is in 2NF, because B is fully functionally dependent on every candidate key.
It is in 3NF, because B is not transitively dependent on a candidate key.
It is not in BCNF, because D → A is a nontrivial functional dependency whose
left side D is not a superkey.
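For schemas as small as the two examples above, the candidate keys and the nonprime attributes can be found by brute force. The sketch below assumes single-letter attribute names; the helpers fd, closure and candidate_keys are hypothetical names introduced for this illustration.

```python
from itertools import combinations

def fd(lhs, rhs):
    """Build an FD from two strings of single-letter attributes."""
    return (set(lhs), set(rhs))

def closure(attrs, fds):
    """Return all attributes functionally determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(attrs, fds):
    """Try subsets from smallest to largest; exponential, fine for small schemas."""
    all_attrs = set(attrs)
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == all_attrs:
                keys.append(candidate)
    return keys

keys1 = candidate_keys("ABCDE", [fd("AB", "CE"), fd("E", "AB"), fd("C", "D")])
keys2 = candidate_keys("ABCD", [fd("AC", "BD"), fd("D", "A"), fd("CD", "A")])
nonprime2 = set("ABCD") - set().union(*keys2)

print(keys1)      # keys of the first example
print(keys2)      # keys of the second example
print(nonprime2)  # nonprime attributes of the second example
```

For the second example this yields the candidate keys AC and CD, so B is the only nonprime attribute, in agreement with the analysis above.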
8.2 Normalization
When a relation schema is normalized, it is decomposed into smaller relation
schemas that have the desired properties. These smaller relation schemas
are in a higher normal form. However, there are some restrictions:
r:
A B C
1 1 1
1 2 2
2 1 2
Decomposition:
πAB(r):
A B
1 1
1 2
2 1
πBC(r):
B C
1 1
2 2
1 2
Join:
A B C
1 1 1
1 1 2
1 2 2
2 1 1
2 1 2
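The decomposition and join shown above can be reproduced in a few lines of Python, which makes the spurious tuples explicit. The variable names are chosen for this sketch only.

```python
# Relation r with tuples (A, B, C), as in the table above.
r = {(1, 1, 1), (1, 2, 2), (2, 1, 2)}

# Projections onto AB and BC.
r_ab = {(a, b) for a, b, _ in r}
r_bc = {(b, c) for _, b, c in r}

# Natural join of the two projections on the shared attribute B.
joined = {(a, b, c) for a, b in r_ab for b2, c in r_bc if b == b2}

# Tuples that appear in the join but were never in r.
spurious = joined - r

print(sorted(joined))
print(sorted(spurious))
```

The join returns five tuples instead of the original three; the two extra tuples show that this decomposition does not have the nonadditive (lossless) join property.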
The FD AE → B is lost!
Definition:
• (attribute preservation)
Algorithm NORMALIZATION(RS(S;F))
Input: a relation schema RS(S;F) in an undesirable form
Output: a valid 3NF-decomposition of RS(S;F) into {RSi(Xi;Fi)} (i=1, ..., n)
NORMALIZATION(RS(S;F))
BEGIN
F := REDUCE(F);
MERGE(RSi(Xi;Fi), RSj(Xj;Fj))
MERGE(RSi(Xi;Fi), RSj(Xj;Fj))
BEGIN
IF Xi ⊆ Xj and RSi(Xi ∪ Xj; Fi ∪ Fj) satisfies 3NF THEN
merge RSi(Xi;Fi) and RSj(Xj;Fj)
END
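The following Python sketch shows one common textbook variant of 3NF synthesis. It is not a transcription of the algorithm above, whose listing is incomplete here. It assumes that the FD set passed in is already a reduced cover, it groups FDs with equivalent left sides, and it absorbs a schema into another one that contains all of its attributes. The explicit 3NF test from MERGE is omitted, and all function names are hypothetical.

```python
def closure(attrs, fds):
    """Return all attributes functionally determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def minimal_key(attrs, fds):
    """Shrink the full attribute set to one candidate key."""
    key = set(attrs)
    for a in sorted(attrs):
        if closure(key - {a}, fds) == set(attrs):
            key.discard(a)
    return key

def synthesize_3nf(attrs, fds):
    """Simplified synthesis; `fds` is assumed to be a reduced cover."""
    # 1. Group FDs whose left sides are equivalent (same closure).
    groups = {}
    for lhs, rhs in fds:
        groups.setdefault(frozenset(closure(lhs, fds)), []).append((lhs, rhs))
    schemas = []
    for group in groups.values():
        x = set()
        for lhs, rhs in group:
            x |= lhs | rhs
        schemas.append((x, list(group)))
    # 2. Absorb each schema into a larger one that contains its attributes.
    schemas.sort(key=lambda s: len(s[0]), reverse=True)
    merged = []
    for x, f in schemas:
        for target_x, target_f in merged:
            if x <= target_x:
                target_f.extend(f)
                break
        else:
            merged.append((x, f))
    # 3. Make sure some schema contains a key of the whole relation.
    if not any(closure(x, fds) == set(attrs) for x, _ in merged):
        merged.append((minimal_key(attrs, fds), []))
    return merged

# Reduced FD set of the exercise example, using the one-letter abbreviations.
f_reduced = [(set(l), set(r)) for l, r in
             [("E", "L"), ("ES", "G"), ("TR", "E"),
              ("TS", "E"), ("TL", "E"), ("TE", "R")]]
result = synthesize_3nf("ELSGTR", f_reduced)
for x, f in result:
    print(sorted(x))
```

Grouping by closure puts TR → E, TL → E and TE → R into one schema, because TR, TL and TE determine each other; the schema {E, L} is then absorbed into it.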
Example: Normalization
F={
Exercise → Lecturer (E → L)
/* Each exercise is coached by a lecturer */
F'={ E → L, ES → G, TR → E, TS → E, TL → E, TE → R }
RS1({E,L}; {E → L}),
The relation schemas RS1 and RS3 can be merged because RS1 ⊆ RS3 and
because the resulting relation schema does not violate any 3NF constraint. The
key TS of the relation schema RS4 is also the only candidate key of the original
schema RS(S;F), so the nonadditive join property is guaranteed. All FDs of F'
are contained in the decomposition, so the dependency preservation property is
guaranteed as well. A possible solution could look like this:
test: ({Exercise, Student, Grade}; {Exercise Student → Grade})
candidate key: Exercise Student; BCNF
assignment: ({Time, Lecture Room, Exercise, Lecturer}; {Time Lecture Room →
Exercise, Time Lecturer → Exercise, Time Exercise → Lecture Room,
Exercise → Lecturer})
candidate keys: Time Lecture Room, Time Lecturer, Time Exercise; 3NF
timetable: ({Time, Student, Exercise}; {Time Student → Exercise})
candidate key: Time Student; BCNF
Note: The relation schema "assignment" is not in BCNF. Hence, anomalies such as
the insertion anomaly can still occur: an exercise can only be inserted into the
database if a time is available for it; a lecturer and a lecture room can only be
inserted if they are assigned to an exercise. The first anomaly can be avoided
by not merging RS1 and RS3. The others stem from the fact that no further
properties are specified for lecturer and lecture room.
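The claims about the final decomposition can be checked with a short script. This is a sketch using the one-letter abbreviations (E, L, S, G, T, R); the dictionary layout and helper names are assumptions made for this illustration. The lossless test used here, that some schema contains a key of the whole relation, is a sufficient condition when the FDs are preserved; it is not a general chase test.

```python
def closure(attrs, fds):
    """Attribute closure; FDs are (lhs, rhs) pairs of single-letter strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

original_fds = [("E", "L"), ("ES", "G"), ("TR", "E"),
                ("TS", "E"), ("TL", "E"), ("TE", "R")]

decomposition = {
    "test": ("ESG", [("ES", "G")]),
    "assignment": ("TREL", [("TR", "E"), ("TL", "E"), ("TE", "R"), ("E", "L")]),
    "timetable": ("TSE", [("TS", "E")]),
}

# Dependency preservation: every original FD follows from the local FDs.
union_fds = [f for _, fds in decomposition.values() for f in fds]
preserved = all(set(r) <= closure(l, union_fds) for l, r in original_fds)

# Lossless join (sufficient test): one schema is a superkey of the whole relation.
lossless = any(closure(x, original_fds) == set("ELSGTR")
               for x, _ in decomposition.values())

# BCNF check per schema: FDs whose left side is not a superkey of that schema.
non_bcnf = {}
for name, (x, fds) in decomposition.items():
    bad = [(l, r) for l, r in fds if closure(l, fds) != set(x)]
    if bad:
        non_bcnf[name] = bad

print(preserved, lossless, non_bcnf)
```

Only "assignment" is reported, with the offending FD Exercise → Lecturer, which matches the note above.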