
Database Lecture Notes: Normalization 2, How to Normalize

Dr. Meg Murray mcmurray@kennesaw.edu

Normalization Why?
Not all relations are equal.
Tables that are not normalized experience issues known as modification problems:
Insertion problems
Difficulties inserting data into a relation

Modification problems
Difficulties modifying data in a relation

Deletion problems
Difficulties deleting data from a relation

Deletion Anomaly

If you delete any row, you delete information about both the machine and the repair
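
The deletion anomaly can be sketched in a few lines of Python. The rows below are hypothetical values invented for illustration, not the contents of the textbook figure; the point is only that machine facts and repair facts share one row.

```python
# Hypothetical EQUIPMENT_REPAIR rows: machine facts and repair facts share a row.
equipment_repair = [
    {"ItemNumber": 100, "Type": "Drill Press", "AcquisitionCost": 3500,
     "RepairNumber": 2000, "RepairAmount": 375},
    {"ItemNumber": 200, "Type": "Lathe", "AcquisitionCost": 4750,
     "RepairNumber": 2100, "RepairAmount": 255},
]

def delete_repair(rows, repair_number):
    """Delete one repair; the machine data stored in the same row goes with it."""
    return [r for r in rows if r["RepairNumber"] != repair_number]

remaining = delete_repair(equipment_repair, 2100)
known_items = {r["ItemNumber"] for r in remaining}
print(known_items)  # item 200 (the lathe) is no longer recorded anywhere
```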

KROENKE and AUER - DATABASE CONCEPTS (3rd Edition) 2008 Pearson Prentice Hall

Modification Anomalies
The EQUIPMENT_REPAIR table before and after an incorrect update operation on AcquisitionCost for Type = Drill Press:
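
A comparable situation can be sketched in Python. The rows and costs are hypothetical; the sketch assumes the same machine appears in two repair rows, so its acquisition cost is stored twice.

```python
# Hypothetical rows: the same drill press (item 100) appears once per repair.
rows = [
    {"ItemNumber": 100, "Type": "Drill Press", "AcquisitionCost": 3500, "RepairNumber": 2000},
    {"ItemNumber": 100, "Type": "Drill Press", "AcquisitionCost": 3500, "RepairNumber": 2200},
]

# Incorrect update: only one of the two rows for the machine is changed.
rows[0]["AcquisitionCost"] = 5500

# The table now holds two conflicting costs for the same machine.
costs_for_item_100 = {r["AcquisitionCost"] for r in rows if r["ItemNumber"] == 100}
print(sorted(costs_for_item_100))  # [3500, 5500]
```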


Normalization
Normalization is a process of analyzing a relation to ensure that it is well formed.
More specifically, if a relation is normalized (well formed), rows can be inserted, deleted, or modified without creating update anomalies.


Normalization Review:

Solving Modification Problems


Most modification problems are solved by breaking an existing table into two or more tables through a process known as normalization.
So the question becomes: how many tables?


How Many Tables?

Should we store these two tables as they are, or should we combine them into one table in our new database?

Normal Forms
Relations are categorized into normal forms according to the modification anomalies or other problems to which they are subject:


Normal Forms
1NF: A table that qualifies as a relation is in 1NF
2NF: A relation is in 2NF if all of its nonkey attributes are dependent on all of the primary key [focus is on composite primary keys]
3NF: A relation is in 3NF if it is in 2NF and has no determinants except the primary key
Boyce-Codd Normal Form (BCNF): A relation is in BCNF if every determinant is a candidate key

BCNF
Boyce-Codd Normal Form (BCNF): A relation is in BCNF if every determinant is a candidate key
"I swear to construct my tables so that all nonkey columns are dependent on the key, the whole key and nothing but the key, so help me Codd."

Normalization Review:

Definition Review
Determinant
The attribute that can be used to find the value of another attribute in the relation
The left-hand side of a functional dependency
StudentID --> (StudentName, DormName, DormRoom)
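
As a rough illustration, a functional dependency can be tested against sample rows in Python. The function name `fd_holds` and the student rows are hypothetical, introduced only for this sketch.

```python
def fd_holds(rows, determinant, dependents):
    """Return True if equal determinant values always imply equal dependent values."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependents)
        # setdefault stores the first value seen for this determinant value.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical sample data.
students = [
    {"StudentID": 1, "StudentName": "Ann", "DormName": "East", "DormRoom": 101},
    {"StudentID": 2, "StudentName": "Bob", "DormName": "East", "DormRoom": 102},
]

print(fd_holds(students, ["StudentID"], ["StudentName", "DormName", "DormRoom"]))  # True
print(fd_holds(students, ["DormName"], ["DormRoom"]))  # False
```

Sample data can only refute a dependency. Whether a dependency truly holds is a business rule, not something a handful of rows can prove.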


Normalization Review:

Definition Review II
Candidate key
The value of a candidate key can be used to find the value of every other attribute in the table
A simple candidate key consists of only one attribute
A composite candidate key consists of more than one attribute
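
One way to make this definition concrete is to compute which attributes a set of attributes determines (its closure) and to check that no attribute in the set is redundant. The sketch below is a minimal Python version; the function names and the way dependencies are written as (determinant, dependents) tuples are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Attributes determined by `attrs` under the functional dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_candidate_key(attrs, all_attrs, fds):
    """True if attrs determines every attribute and no attribute in attrs is redundant."""
    attrs = set(attrs)
    if closure(attrs, fds) != set(all_attrs):
        return False
    return all(closure(attrs - {a}, fds) != set(all_attrs) for a in attrs)

STUDENT_ATTRS = {"StudentID", "StudentName", "DormName", "DormRoom"}
STUDENT_FDS = [(("StudentID",), ("StudentName", "DormName", "DormRoom"))]

print(is_candidate_key({"StudentID"}, STUDENT_ATTRS, STUDENT_FDS))              # True
print(is_candidate_key({"StudentID", "DormName"}, STUDENT_ATTRS, STUDENT_FDS))  # False
```

The second call returns False because the set still determines everything but is not minimal: DormName can be dropped.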

The CUSTOMER Table

CUSTOMER (CustomerNumber, CustomerName, StreetAddress, City, State, ZIP, ContactName, Phone)

What is the primary key?
What are the candidate keys?
What are the nonkey attributes?

Simple Examples
Remember the question:
Is every determinant a candidate key or are all nonkey columns dependent on the key, the whole key and nothing but the key?

Normalization Example
(StudentID) --> (StudentName, DormName, DormCost)
What are the determinants?
Does StudentID determine StudentName?
Does StudentID determine DormName?
Does StudentID determine DormCost? Probably not; more likely DormName does. If so, DormName is a determinant of DormCost.
Is StudentID a candidate key? Is DormName a candidate key?
Is every determinant a candidate key? Are all nonkey columns dependent on the key, the whole key and nothing but the key?

Normalization Example
(StudentID) --> (StudentName, DormName, DormCost)

However, if

(DormName) --> (DormCost)

Then DormCost should be placed into its own relation, resulting in the relations:

(StudentID) --> (StudentName, DormName)
(DormName) --> (DormCost)
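
The split can be sketched in Python as two projections of the original table, followed by a join to confirm that no information is lost. The student rows and dorm costs are hypothetical, and `project` is a helper written for this sketch.

```python
# Hypothetical rows for the single, unnormalized relation.
student = [
    {"StudentID": 1, "StudentName": "Ann", "DormName": "East", "DormCost": 3000},
    {"StudentID": 2, "StudentName": "Bob", "DormName": "East", "DormCost": 3000},
    {"StudentID": 3, "StudentName": "Cy", "DormName": "West", "DormCost": 3500},
]

def project(rows, attrs):
    """Relational projection: keep the given attributes and drop duplicate rows."""
    unique = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in sorted(unique)]

students = project(student, ["StudentID", "StudentName", "DormName"])
dorms = project(student, ["DormName", "DormCost"])  # each dorm cost is stored once

# Rejoin on DormName to confirm the decomposition loses nothing.
cost_of = {d["DormName"]: d["DormCost"] for d in dorms}
rejoined = [{**s, "DormCost": cost_of[s["DormName"]]} for s in students]
print(rejoined == student)  # True
```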

Normalization Example
ATTORNEY (AttorneyID, ClientID, ClientName, MeetingDate, Duration)
(AttorneyID, ClientID) --> (ClientName, MeetingDate, Duration)

However, if
(ClientID) --> (ClientName)

Then ClientName should be placed into its own relation, resulting in the relations:

SCHEDULE (AttorneyID, ClientID, MeetingDate, Duration)
(AttorneyID, ClientID) --> (MeetingDate, Duration)

CLIENT (ClientID, ClientName)
(ClientID) --> (ClientName)

Walking through the forms

1st Normal Form [1NF]


Eliminate Repeating Groups
Eliminate duplicative columns from the same table.
Create separate tables for each group of related data

Give each table a primary key (unique identifier)

Putting a Table into 1NF makes it a Relation


Do you remember the rules of a relation?

Is this in 1NF?

Characteristics of 1NF
Characteristics
Table format
No repeating groups
Primary key (PK) identified

Steps to 1NF
1. Eliminate repeating groups.
Present data in a tabular format, where each cell has a single value and there are no repeating groups.

Repeating Groups
Do you see the repeating group in this table?


Steps to 1NF
2. Identify the Primary Key (PK)
At first glance, AIRCRAFT_NUMBER seems a good candidate for a PK, but it would not uniquely identify all of the remaining row attributes. The combination of AIRCRAFT_NUMBER and PILOT_NUMBER is a PK candidate that will uniquely identify all row attributes.
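
The two steps can be sketched in Python. The aircraft and pilot values below are hypothetical, and only a few of the columns are shown; the repeating group is modeled as a list of pilots stored inside each aircraft row.

```python
# Hypothetical unnormalized data: each aircraft row holds a repeating group of pilots.
unnormalized = [
    {"AIRCRAFT_NUMBER": "A1", "AIRCRAFT_NAME": "Falcon",
     "PILOTS": [("P1", "Smith"), ("P2", "Jones")]},
    {"AIRCRAFT_NUMBER": "A2", "AIRCRAFT_NAME": "Hawk",
     "PILOTS": [("P1", "Smith")]},
]

def to_1nf(rows):
    """Emit one row per (aircraft, pilot) pair so every cell holds a single value."""
    flat = []
    for row in rows:
        for pilot_number, pilot_name in row["PILOTS"]:
            flat.append({
                "AIRCRAFT_NUMBER": row["AIRCRAFT_NUMBER"],
                "AIRCRAFT_NAME": row["AIRCRAFT_NAME"],
                "PILOT_NUMBER": pilot_number,
                "PILOT_NAME": pilot_name,
            })
    return flat

flat_rows = to_1nf(unnormalized)

# AIRCRAFT_NUMBER alone repeats; the composite key identifies each row uniquely.
aircraft_only = {r["AIRCRAFT_NUMBER"] for r in flat_rows}
composite = {(r["AIRCRAFT_NUMBER"], r["PILOT_NUMBER"]) for r in flat_rows}
print(len(flat_rows), len(aircraft_only), len(composite))  # 3 2 3
```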

Table in 1NF

Table with primary key identified [attributes listed vertically]

Identify Dependencies
AIRCRAFT_NUMBER, PILOT_NUMBER --> AIRCRAFT_NAME, PILOT_NAME, MISSION_CLASS, FLYING_HOUR, COST_HOUR
Primary Key (PK) dependency. The PK is also a composite key.

AIRCRAFT_NUMBER --> AIRCRAFT_NAME


Partial dependency ... aircraft name is only dependent on a part of the composite AIRCRAFT_NUMBER, PILOT_NUMBER key.

Identify Dependencies
PILOT_NUMBER --> PILOT_NAME
Partial dependency ... pilot name is only dependent on a part of the composite AIRCRAFT_NUMBER, PILOT_NUMBER key.

PILOT_NUMBER --> PILOT_NAME, MISSION_CLASS, COST_HOUR


Partial dependencies

MISSION_CLASS --> COST_HOUR


Transitive dependency ... COST_HOUR, a non-key attribute, is dependent on MISSION_CLASS, another non-key attribute.
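
The dependency types above can be told apart by comparing each determinant with the primary key. The Python sketch below is a simplification that assumes a single composite primary key and no other candidate keys; the function name `classify` is hypothetical.

```python
def classify(determinant, key):
    """Classify a dependency by how its determinant relates to the primary key."""
    determinant, key = set(determinant), set(key)
    if determinant == key:
        return "full"        # depends on the whole key
    if determinant < key:
        return "partial"     # depends on only part of the key
    if not determinant & key:
        return "transitive"  # determinant is itself a non-key attribute
    return "other"

KEY = {"AIRCRAFT_NUMBER", "PILOT_NUMBER"}

print(classify(["AIRCRAFT_NUMBER", "PILOT_NUMBER"], KEY))  # full
print(classify(["AIRCRAFT_NUMBER"], KEY))                  # partial
print(classify(["MISSION_CLASS"], KEY))                    # transitive
```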

Helpful to Create Dependency Diagram

2nd Normal Form [2NF]


Characteristics
For tables with composite keys
In 1NF
No partial dependencies
In other words, a non-key field must provide a fact about the whole key - not just one part of the key

2nd Normal Form [2NF]


If an attribute depends on only part of a composite key, remove it to a separate table.
The new tables often map to components [themes, entities]

Create relationships between these new tables and their predecessors through the use of foreign keys.
http://databases.about.com/od/specificproducts/a/2nf.htm

Look for Partial Dependencies


Ask questions such as:
Are both Aircraft_Number and Pilot_Number needed to determine Pilot_Name?
Are both Aircraft_Number and Pilot_Number needed to determine Mission_Class?

Look for Partial Dependencies


Move partial dependencies to their own tables
AIRCRAFT (Aircraft_Number)
PILOT (Pilot_Number)
FLYING HOURS (Aircraft_Number, Pilot_Number)

Steps to 2NF
Notice how moving partial dependencies separates the table into its key components:
AIRCRAFT, PILOT, FLYING HOURS

Steps to 2NF
Assign dependent attributes to each key component
AIRCRAFT ( AIRCRAFT_NUMBER, AIRCRAFT_NAME )

PILOT ( PILOT_NUMBER, PILOT_NAME, MISSION_CLASS, COST_HOUR )

FLIGHT ( AIRCRAFT_NUMBER, PILOT_NUMBER, FLYING_HOUR )
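
The assignment step can be sketched as grouping attributes by the determinant they depend on. This Python sketch assumes the dependencies are already known and written as (determinant, dependents) pairs; `decompose_2nf` is a hypothetical helper, not a general normalization algorithm.

```python
def decompose_2nf(fds):
    """Build one table per determinant: the determinant plus its dependent attributes."""
    tables = {}
    for determinant, dependents in fds:
        tables.setdefault(tuple(determinant), []).extend(dependents)
    return {det: list(det) + deps for det, deps in tables.items()}

fds = [
    (("AIRCRAFT_NUMBER",), ["AIRCRAFT_NAME"]),
    (("PILOT_NUMBER",), ["PILOT_NAME", "MISSION_CLASS", "COST_HOUR"]),
    (("AIRCRAFT_NUMBER", "PILOT_NUMBER"), ["FLYING_HOUR"]),
]

tables_2nf = decompose_2nf(fds)
for key, columns in tables_2nf.items():
    print(key, "->", columns)
```

Each printed entry corresponds to one of the three tables, with the determinant serving as that table's primary key.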

Draw New Dependency Diagram

3rd Normal Form [3NF]


Eliminate Columns Not Dependent On Key - If attributes do not contribute to a description of the key, remove them to a separate table.
Third normal form is violated when a non-key field is a fact about another non-key field [transitive dependency]

3rd Normal Form


Characteristics
In 2NF
No transitive dependencies


3rd Normal Form


Back to the ERD


Is the Air Pilot Example in BCNF?

Are all determinants candidate keys?
A relation that is in 3NF is often also in BCNF.

Steps to BCNF


Another Example

Putting a Relation into BCNF: EQUIPMENT_REPAIR


Identify Functional Dependencies


EQUIPMENT_REPAIR (ItemNumber, Type, AcquisitionCost, RepairNumber, RepairDate, RepairAmount)

FD:
ItemNumber --> (Type, AcquisitionCost)
RepairNumber --> (ItemNumber, Type, AcquisitionCost, RepairDate, RepairAmount)

Is there a determinant that is not a candidate key?
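
The question can be answered mechanically from the functional dependencies. The Python sketch below is a minimal check under one simplification: it tests whether each determinant determines every attribute (a superkey), without also testing minimality. The function names are hypothetical.

```python
def closure(attrs, fds):
    """Attributes determined by `attrs` under the functional dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(all_attrs, fds):
    """Return the determinants that do not determine every attribute of the relation."""
    return [lhs for lhs, _ in fds if closure(lhs, fds) != set(all_attrs)]

ATTRS = {"ItemNumber", "Type", "AcquisitionCost",
         "RepairNumber", "RepairDate", "RepairAmount"}
FDS = [
    (("ItemNumber",), ("Type", "AcquisitionCost")),
    (("RepairNumber",), ("ItemNumber", "Type", "AcquisitionCost",
                         "RepairDate", "RepairAmount")),
]

violations = bcnf_violations(ATTRS, FDS)
print(violations)  # [('ItemNumber',)]
```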



Put into Tables


ItemNumber is a determinant but not a candidate key, so
Move it and its attributes to a new table
ITEM (ItemNumber, Type, AcquisitionCost)

The determinant becomes the primary key


ITEM (ItemNumber, Type, AcquisitionCost)

Leave a foreign key in the original table


REPAIR (ItemNumber, RepairNumber, RepairDate, RepairAmount)
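
A minimal sketch of the two relations as SQLite tables, created through Python's `sqlite3` module. The column types, the sample values, and the choice of RepairNumber as the primary key of REPAIR are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE ITEM (
    ItemNumber      INTEGER PRIMARY KEY,
    Type            TEXT NOT NULL,
    AcquisitionCost REAL NOT NULL
);
CREATE TABLE REPAIR (
    RepairNumber INTEGER PRIMARY KEY,
    ItemNumber   INTEGER NOT NULL REFERENCES ITEM (ItemNumber),
    RepairDate   TEXT,
    RepairAmount REAL
);
""")

# Hypothetical sample data.
conn.execute("INSERT INTO ITEM VALUES (100, 'Drill Press', 3500)")
conn.execute("INSERT INTO REPAIR VALUES (2000, 100, '2008-05-05', 375)")

# Deleting the repair no longer deletes what is known about the machine.
conn.execute("DELETE FROM REPAIR WHERE RepairNumber = 2000")
items_left = conn.execute("SELECT COUNT(*) FROM ITEM").fetchone()[0]
print(items_left)  # 1
```

With the machine stored once in ITEM, a cost change is a single-row update, and a machine can be recorded before it has any repairs.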


Putting a Relation into BCNF: New Relations


Putting a Relation into BCNF: SKU_DATA


Putting a Relation into BCNF: SKU_DATA


SKU_DATA (SKU, SKU_Description, Department, Buyer)
SKU --> (SKU_Description, Department, Buyer)
SKU_Description --> (SKU, Department, Buyer)
Buyer --> Department

SKU_DATA (SKU, SKU_Description, Buyer)
BUYER (Buyer, Department)
Where SKU_DATA.Buyer must exist in BUYER.Buyer


Putting a Relation into BCNF: New Relations


Multivalued Dependencies
A multivalued dependency occurs when a determinant determines a particular set of values:
Employee -->> Degree
Employee -->> Sibling
PartKit -->> Part

The determinant of a multivalued dependency can never be a primary key



Multivalued Dependencies


Eliminating Anomalies from Multivalued Dependencies


Multivalued dependencies are not a problem if they are in a separate relation, so:
Always put multivalued dependencies into their own relation
This is known as Fourth Normal Form (4NF)
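
The effect of this rule can be sketched in Python. The employees, degrees, and siblings are hypothetical; the sketch compares one relation per multivalued dependency with a single combined relation.

```python
# Hypothetical employees; degrees and siblings are independent multivalued facts.
employees = {
    "Chau": {"degrees": ["BS"], "siblings": ["Eileen", "Jonathan"]},
    "Green": {"degrees": ["BS", "MS"], "siblings": ["Nikki", "Sam"]},
}

# 4NF: one relation per multivalued dependency.
employee_degree = [(e, d) for e, facts in employees.items() for d in facts["degrees"]]
employee_sibling = [(e, s) for e, facts in employees.items() for s in facts["siblings"]]

# A single combined relation must pair every degree with every sibling.
combined = [(e, d, s) for e, facts in employees.items()
            for d in facts["degrees"] for s in facts["siblings"]]

# Recording one new degree for Green: one row in 4NF, one row per sibling otherwise.
rows_to_add_4nf = 1
rows_to_add_combined = len(employees["Green"]["siblings"])
print(len(employee_degree), len(employee_sibling), len(combined))  # 3 4 6
print(rows_to_add_4nf, rows_to_add_combined)  # 1 2
```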

References
Example for AirPilot:
http://dotnet.org.za/willy/archive/2008/04/10/taking-a-step-back-database-normalisation-1nf2nf-3nf-bcnf-and-4nf-part-1.aspx

Good reference
http://www.bkent.net/Doc/simple5.htm
