Sei sulla pagina 1di 35

Database design with a relational

model

Functional dependencies
Normalization
Functional Dependencies

 Definition:
– An integrity constraint between two sets of attributes
 Think of it as a relationship between two sets of
attributes
 Generalizes the concept of a key, where both X and Y
represent sets of attributes
X→Y
– X functionally determines Y
– Y is functionally determined by X
FD {X → Y}
i.e., when two tuples agree on their values for the X domains, they
agree on their values for the Y domains
{X} {Y}
1,2,a,0 m
1,2,a,0 m
Functional Dependencies

 Legal instances of a relation must satisfy all ICs,


including FDs
 FD are a property of the intension, not the extension,
but can be “ruled out” using the extension
 Primary key is a particular case of a FD (but not all
FDs are keys)
 F+ denotes the closure of F
– The set of all functional dependencies implied by a given
set of functional dependencies
Functional Dependency

A B C D …

A1 B1 C1 D1 …
Determinant
A1 B1 C1 D2 …
AB → C
A1 B2 C2 D1 …

A2 B1 C3 D1 … Dependent
Functional Dependency Closure

 Set F of Functional Dependencies (given)


Relation:
– EmpProj: SSN, Pnumber, Hours, Ename, Pname, Plocation
– FDs F:
 {SSN → Ename}
 Pnumber → {Pname, Plocation}
 {SSN, Pnumber} → Hours}
 Closures:
– {SSN}+ = {SSN, Ename}
– {Pnumber}+ = {Pnumber, Pname, Plocation}
 F+
– {SSN, Pnumber}+ = {SSN, Pnumber, Ename, Pname,
Plocation, Hours}
Attribute Closure

 Determine FDs that can easily be identified (given)


 To see if a given FD is in the closure of a set F of FDs, use
Armstrong’s Axioms to infer additional dependencies
 Armstrong’s Axioms
– 3 rules of inference can generate all FDs:
 Reflexivity:
If X ⊇Y, then X → Y
 Augmentation:
If X → Y, then XZ → YZ for any Z
 Transitivity:
If X → Y and Y → Z, then X → Z

 But keep in mind 2 other rules that are useful:


 Union:
If X → Y and X → Z, then X → YZ
 Decomposition:
If X → YZ, then X → Y and X → Z
Minimal Set

 Requirements:
– Every dependency has a single attribute for its right side
– No X → A in F can be replaced with Y → A where Y is a subset of
X and still have dependencies equivalent to F
– No dependency can be removed and still have a set of
dependency equivalent to F
 Definition:
– If the right-hand (dependent) side of every FD in S involves just
one attribute, and
– if the left-hand (determinant) side of every FD in S is irreducible
(can’t discard another attribute without making a new set not
equivalent to S - aka, changing the closure of S+), and
– no FD in S can be discarded without changing the closure of S+,
then the set S of FDs is irreducible (minimal).
Example

 Given:
A → B, ABCD → E, EF → G, EF → H, and ACDF → EG
 First rewrite:
ACDF → E and ACDF → G
 Then consider ACDF → G
(implied by: A → B, ABCD → E, and EF → G)
Delete it (and, in the same way, ACD → E replaces ABCD →
E)
Minimal Set: A → B, ACD → E, ACDF → G, and ACDF → H.

Single attribute on right, Minimize left, Delete redundant


FDs
Another example…
A = employee number, B = department number, C = manager’s
employee number, D = project number for a project unique to a
manager, E = department name, F = time specified for a project

Given: A -> BC, B -> E, CD -> EF

Determine: Does AD->F hold?


A-> BC (given),
A->C (decomposition),
AD->CD (augmentation),
CD->EF (given),
AD->EF (transitivity),
AD->F (decomposition)
Cover

 Let S1 and S2 be two sets of FDs.


 If every FD implied by S1 is implied by S2 (S1+⊆S2+),
S2 is a cover for S1.
So, if FDs are enforced for S2, they are enforced for S1.
If S1 is also a cover for S2, they are equivalent.
From another
direction
 For P, the following FDs are part of the closure:
P# ->Pname, P# -> Color, P# -> Weight, P# -> City
The following FDs are not irreducible:
 P# -> {Pname, Color}
 P# -> Weight
 P# -> City
– (right-hand side is not a single attribute)
 {P#, Pname} -> Color
 P# -> Pname
 P# -> Weight
 P# -> City
– (left-hand side can be further reduced without changing closure
 P# -> P#
 P# -> Pname
 P# -> Color
 P# -> Weight
 P# -> City
– (first FD can be dropped without changing closure)
Functional Dependency from the SQL perspective

An SQL consideration:
If X functionally determines Y, then Y is functionally dependent on X.

SQL Translation: In a legal instance with one or more determinant columns


and one dependent columns, if you create a SELECT statement with a
WHERE clause containing “=“ for some value in the determinant columns,
any tuples returned by the dependent column in your FROM clause must
always be the same for all tuples.

So the SELECT contains the right side and where or group by contains the
left side of a Functional Dependency statement

SELECT DependentColumn FROM Table WHERE DeterminantColumn1 =


‘value1’…AND DeterminantColumnN = ‘valueN’
Normalization
Product Support Coverage

ProductNumber Date Time EmployeeI PhoneNumber PayRate Withholding


D
CR76 05/13/0 13:30 SG5 3651101 12 1.80
CR56 2 12:00 SG5 3651101 12 1.80
CR74 05/13/0 12:00 SG37 3651102 12 1.80
2
CR56 10:30 SG5 3651102 18 2.40
05/13/0
2
07/04/0
ProductNumber,Date -> Time, EmployeeID, PhoneNumber
2

EmployeeID, Date, Time -> ProductNumber


PhoneNumber, Date, Time -> EmployeeID, ProductNumber
EmployeeID, Date -> PhoneNumber
Date -> PayRate
PayRate -> Withholding
Product Coverage Support
ProductNumber,Date -> Time, EmployeeID,
PhoneNumber
EmployeeID, Date, Time -> ProductNumber
PhoneNumber, Date, Time -> EmployeeID,
Illustrate FDs ProductNumber
EmployeeID, Date -> PhoneNumber
Date -> PayRate
PayRate -> Withholding

ProductNumb D T EmployeeI PhoneNumbe PayRate Withholding


er ate ime D r
Decompose

Product Support Coverage 1NF

Pay 2NF

Coverage Wages Taxes 3NF

Product Coverage Phone Assignment BCNF


Sample Set in First Normal Form

Ssn Name Department Rank Wage Hours Bonded


123223666 Amsden Human 8 10 40 Y
Resource

231515368 Skaggs Management 8 10 30


Information
Systems
131243560 Sider Marketing & 5 7 30
Sales

434263751 Gallow Marketing & 5 7 32


Sales

612674134 Moore Marketing & 8 10 40


Sales
Normal Forms

 The key, the whole key, and nothing but the key
 Assumptions:
– FDs exist
– A designated primary key can be found
 Remember, a key is a FD, but a FD is not always a key
 A property of a relation schema indicating the type of
redundancy in the relation schema
 The goal is to remove redundancy based on
dependencies
 The desire is to minimize additional integrity
constraints
Normal Forms

 Considerations:
– Relational design by analysis
– Normal forms are based on functional dependencies
(FDs)
– Intuitive, perhaps, but identifying a strictly controlled
procedure allows a programmatic process
– Should consider 2 additional properties
 Lossless join (nonadditive join property)
– required
 Dependency preservation property
– use when possible
Lossless Joins and Dependency
Preservation

 If relation R and FDs F hold over R, then decomposing R


into R1 and R2 is lossless if the closure of F contains either:
– FD R1 ∩ R2 -> R1
or
– FD R1 ∩ R2 -> R2

 If the closure of the attributes in R1, independent of those


attributes in R2, unioned with the closure of attributes of
R2, independent of those attributes in R1, are equivalent to
the closure F, then dependency is preserved
2NF to 3NF

PN PM

AD3111 20
PN PM D
IF1000 20
AD3111 20 A00
MA2100 10
MA2100 10 D11
OP1000 10
OP1000 10 D11

IF1000 20 A00
PM D
20 A00
10 D11
Sample Set

Ssn Name Department Rank Wage Hours Bonded


123223666 Amsden Human 8 10 40 Y
Resource

231515368 Skaggs Management 8 10 30


Information
Systems
131243560 Sider Marketing & 5 7 30
Sales

434263751 Gallow Marketing & 5 7 32


Sales

612674134 Moore Marketing & 8 10 40


Sales
Normal Forms

 The key, the whole key, and nothing but the key
 Assumptions:
– FDs exist
– A designated primary key can be found
 Remember, a key is a FD, but a FD is not always
a key
 A property of a relation schema indicating the
type of redundancy in the relation schema
 The goal is to remove redundancy based on
dependencies
 The desire is to minimize additional integrity
constraints
Normal Forms

 Considerations:
– Relational design by analysis
– Normal forms are based on functional dependencies (FDs)
– Intuitive, perhaps, but identifying a strictly controlled
procedure allows a programmatic process
– Should consider 2 additional properties
 Lossless join (nonadditive join property)
– required
 Dependency preservation property
– use when possible
Lossless Joins and Dependency
Preservation

 If relation R and FDs F hold over R, then decomposing


R into R1 and R2 is lossless if the closure of F
contains either:
– FD R1 ∩ R2 -> R1
or
– FD R1 ∩ R2 -> R2

 If the closure of the attributes in R1, independent of


those attributes in R2, unioned with the closure of
attributes of R2, independent of those attributes in
R1, are equivalent to the closure F, then dependency
is preserved
Normal Forms

 Definitions:
– Superkey –
 R={A1,A2,…,An}
 A set of attributes S ⊆ R
 No 2 tuples (t1,t2) in any legal relation state r of R has t1[S]=t2[S]
– Key –
 A superkey that is disqualified by any attribute being removed, &
 Must be minimal (can’t remove another attribute without
disqualifying as key of R)
– Candidate Key –
 A set of attributes in a relation satisfying key definition
– Primary Key –
 Arbitrarily designated
– Primary Attribute –
 A member of a candidate key
Levels of
Normalization

BCNF

3 NF

2 NF

1 NF
BCNF table

EmployeeNumb Employee DepartmentNumb HireDate Salary


er er
100 Mauser E21 6/19/1980 26150

110 Roberts A00 5/16/1958 38170

120 O’Connell A00 12/5/1963 37950

130 Clark C01 7/28/1971 26150

140 Ross C01 12/15/1976 35420

150 Adamson D11 2/12/1972 30280

160 Clark D11 10/11/1977 26150

290 Parker D31 5/30/1980 15340

310 Skaggs D31 9/12/1964 15900


First Normal Form (1NF)

 Requirements:
– 1NF disallows multivalued attributes, or composite
attributes, or their combinations, by requiring only single
atomic (indivisible) values in the domain of an attribute
– In other words, no “relation within relations”
Business Rules Example
1NF

PN AN PM D S
 Staffing hours (S) are on a per
project activity (activities AD3111 20 20 A00 0.8
within projects) basis - AN AD3111 30 20 A00 1.5
 Managers (PM) and their AD3111 40 20 A00 1
departments (D) are assigned AD3111 50 20 A00 1.25
to projects (PN)
– A department is assigned to a
AD3111 60 20 A00 0.75
project managers AD3111 70 20 A00 0.35
– A project manager is assigned MA2100 20 10 D11 0.5
to projects
MA2100 30 10 D11 1
OP1000 30 10 D11 0.25
IF1000 30 20 A00 1
IF1000 50 20 A00 0.5
IF1000 60 20 A00 0.5
Second Normal Form (2NF)

 2NF requires full functional dependency on the


whole key
 If FD X→Y, removal of A eliminates the FD
 For any attribute A∈X, (X-{A})→Y
 If A can be removed and FD remains, X→Y is a
partial functional dependency (a violation of 2NF)
\
2NF violations

 Partial dependencies – X is a subset of


some key K

K X A
1NF
Staffing is on a per project activity ( and activities within projects)
basis
2NF
Managers and their departments are assigned to projects

{PN,AN→STAFF}
PN AN PM D S PN →{PM,DN} PN AN STAF
AD3111 20 20 A00 0.8 AD3111 20 F
0.8
AD3111 30 20 A00 1.5 AD3111 30 1.5
AD3111 40 20 A00 1 AD3111 40 1
AD3111 50 20 A00 1.25 AD3111 50 1.25
AD3111 60 20 A00 0.75 AD3111 60 .75
AD3111 70 20 A00 0.35 AD3111 70 .35
MA2100 20 10 D11 0.5 MA2100 20 .5
PN PM D
MA2100 30 10 D11 1 MA2100 30 1
AD3111 20 A0
OP1000 30 10 D11 0.25 0 OP1000 30 0.25
MA210 10 D1
IF1000 30 20 A00 1 0 1 IF1000 30 1
OP1000 10 D1
IF1000 50 20 A00 0.5 IF1000 50 .5
1
IF1000 60 20 A00 0.5 IF1000 20 A0
IF1000 60 .5
0
Third Normal Form (3NF)

 3NF requires elimination of transitive dependency of


nonprime (non-key) attributes on another nonprime
(non-key) attribute (i.e., dependency is directly on
the primary key)
{X→Y,Y→Z}⊨X→Z
– If a functional dependency X→A holds in R, one of
these must be true:
 X is a superkey of R, or
 A ∈X
 (i.e., each attribute in A - X is part of some candidate key for R)
3NF violations

 Transitive dependencies – X is not


a subset of the key

X A
K

X
A department is assigned to a project manager
2NF A project manager is assigned to projects 3NF

PN PM
PN PM D PN→PM →D AD311 20
AD311 20 A0 1
MA210 10
1
MA210 10 0
D1 0
OP100 10
0
OP100 10 1
D1 0
IF1000 20
0
IF1000 20 1
A0
0
PM D
20 A0
10 0
D1
1

Potrebbero piacerti anche