Digital Assignment 1: Normalization

Digital Assignment 1
Submitted by:
18BCE2436 Prashant Kumar Jha
18BCE2414 Saugat Malla
18BCE0100 Chandrasekhar
Normalization
Normalization is used to generate a set of relation schemas that allows us to store
information without unnecessary redundancy, yet also allows us to retrieve
information easily. The approach is to design schemas that are in an appropriate
normal form. To determine whether a relation schema is in one of
the desirable normal forms, we need additional information about the real-world
enterprise that we are modeling with the database. The most common approach
is to use functional dependencies.
Characteristics of Normalization:
o Normalization is the process of organizing the data in the database.
o Normalization is used to minimize redundancy in a relation or set of
relations. It is also used to eliminate undesirable characteristics such as
insertion, update and deletion anomalies.
o Normalization divides a larger table into smaller tables and links them
using relationships.
o Normal forms are used to reduce redundancy in database tables.
Functional Dependencies:
A functional dependency is a relationship that exists when one attribute (or set of attributes) uniquely
determines another attribute.
If R is a relation with attributes X and Y, a functional dependency between them is written
X->Y, which specifies that Y is functionally dependent on X. Here X is the determinant and Y is the
dependent attribute: each value of X is associated with exactly one value of Y.
A functional dependency serves as a constraint between two sets of attributes. Identifying
functional dependencies is an important part of relational database design and is the basis of
normalization.
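As a minimal sketch of the definition above, the following Python function checks whether X->Y holds in a set of rows by confirming that each X value maps to exactly one Y value. The function name holds_fd and the sample rows are illustrative assumptions, not part of the assignment.

```python
def holds_fd(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows: list of dicts (one per tuple); lhs, rhs: lists of attribute names.
    """
    seen = {}  # maps each X value to the single Y value it must determine
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen, then returns the stored value
        if seen.setdefault(x, y) != y:
            return False  # same X, different Y: the dependency is violated
    return True


# Hypothetical sample data, for illustration only.
rows = [
    {"X": 1, "Y": "a", "Z": "p"},
    {"X": 1, "Y": "a", "Z": "q"},
    {"X": 2, "Y": "b", "Z": "p"},
]

print(holds_fd(rows, ["X"], ["Y"]))  # True: each X has one Y
print(holds_fd(rows, ["X"], ["Z"]))  # False: X = 1 appears with both p and q
```

Note that a check like this can only refute a dependency on the data at hand; whether X->Y is a real constraint depends on the enterprise being modeled.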
Types of Normal Forms
1NF:
A relation is in 1NF if every attribute contains only atomic values. As per the rule of the first
normal form, an attribute (column) of a table cannot hold multiple values in a single row.
Example:
Emp_id  Emp_name  Emp_address  Emp_mobile
1       John      New York     9213345890, 9124134643
2       Sam       New Delhi    9348328745
3       Tom       Mumbai       9343475436, 9383487687
4       Ron       California   8343243434
As the table shows, two employees (John and Tom) each have two mobile numbers, so
Emp_mobile is not atomic and the table is not in 1NF.
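One common way to bring such a table into 1NF is to store one row per mobile number. The sketch below assumes the multi-valued column is held as a Python list; the helper name to_1nf is hypothetical.

```python
# The employee table from the example, with Emp_mobile held as a list
# to model the non-atomic column.
employees = [
    {"Emp_id": 1, "Emp_name": "John", "Emp_address": "New York",
     "Emp_mobile": ["9213345890", "9124134643"]},
    {"Emp_id": 2, "Emp_name": "Sam", "Emp_address": "New Delhi",
     "Emp_mobile": ["9348328745"]},
    {"Emp_id": 3, "Emp_name": "Tom", "Emp_address": "Mumbai",
     "Emp_mobile": ["9343475436", "9383487687"]},
    {"Emp_id": 4, "Emp_name": "Ron", "Emp_address": "California",
     "Emp_mobile": ["8343243434"]},
]


def to_1nf(rows, multi_attr):
    """Emit one row per value of multi_attr so every cell is atomic."""
    flat = []
    for row in rows:
        for value in row[multi_attr]:
            new_row = dict(row)          # copy the other attributes unchanged
            new_row[multi_attr] = value  # replace the list with a single value
            flat.append(new_row)
    return flat


flat_employees = to_1nf(employees, "Emp_mobile")
for row in flat_employees:
    print(row["Emp_id"], row["Emp_name"], row["Emp_mobile"])
```

The result has six rows, two each for John and Tom. The repeated name and address are a new redundancy, which the higher normal forms address.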
2NF:
A relation is in 2NF if it is in 1NF and every non-key attribute is fully
functionally dependent on the primary key.
A table is in 2NF if the following conditions hold:
o The table is in 1NF.
o No non-prime attribute is dependent on a proper subset of any candidate
key of the table.
Example:
T_id  subject    Teacher_Age
1     Physics    23
1     Maths      23
3     Chemistry  30
Here the candidate key is the combination {T_id, subject}, where T_id is the teacher's id.
Non-prime attribute: Teacher_Age
The above table is in 1NF because each attribute holds atomic values, but it is not in
2NF: the non-prime attribute Teacher_Age depends on T_id alone, which is a proper
subset of the candidate key, so the rule of 2NF is violated.
To make the table comply with 2NF, we can break it into two tables:

T_id  Teacher_Age
1     23
3     30

T_id  subject
1     Physics
1     Maths
3     Chemistry

Both tables now comply with second normal form.
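The split above is a pair of projections of the original table. A minimal sketch, assuming rows are stored as Python dicts; the helper name project is hypothetical.

```python
teachers = [
    {"T_id": 1, "subject": "Physics", "Teacher_Age": 23},
    {"T_id": 1, "subject": "Maths", "Teacher_Age": 23},
    {"T_id": 3, "subject": "Chemistry", "Teacher_Age": 30},
]


def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples (order preserved)."""
    seen = set()
    out = []
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out


teacher_age = project(teachers, ["T_id", "Teacher_Age"])
teacher_subject = project(teachers, ["T_id", "subject"])

print(teacher_age)      # two rows: the age of teacher 1 is now stored once
print(teacher_subject)  # three rows: one per (teacher, subject) pair
```

Dropping duplicates in the projection is what removes the redundancy: the age 23 for teacher 1 is stored in a single row instead of once per subject.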
3NF:
A relation is in 3NF if it is in 2NF and no transitive dependency exists.
A table is said to be in 3NF if both of the following conditions hold:
o The table is in 2NF.
o No non-prime attribute is transitively dependent on a super key.
An attribute that is not part of any candidate key is called a non-prime
attribute.
3NF can equivalently be stated as follows:
A table is in 3NF if it is in 2NF and, for each functional dependency X->Y, at least
one of the following conditions holds:
o X is a super key of the table
o Y is a prime attribute of the table
An attribute that is part of some candidate key is known as a prime attribute.
Example:
Suppose a company wants to store the complete address of each employee and creates a
table named employee_details as shown below:
Emp_id  Emp_name  Emp_zip  Emp_state  Emp_city
1       John      12444    UP         Agra
2       Tom       13434    TN         Chennai
3       Bob       12412    UK         Chennai
4       Rob       12435    MP         Gwalior
Super keys: {Emp_id}, {Emp_id, Emp_name}, {Emp_id, Emp_name, Emp_zip}, and so on
Candidate key: {Emp_id}
Non-prime attributes: All attributes except Emp_id are non-prime, as they are not
part of the candidate key.
In the table above, Emp_state and Emp_city depend on Emp_zip, and Emp_zip
depends on Emp_id. This makes the non-prime attributes Emp_state and Emp_city
transitively dependent on the super key Emp_id, which violates the rule
of 3NF.
To make the above table comply with 3NF, we break it into two tables:

Employee table:
Emp_id  Emp_name  Emp_zip
1       John      12444
2       Tom       13434
3       Bob       12412
4       Rob       12435

Employee_zip table:
Emp_zip  Emp_state  Emp_city
12444    UP         Agra
13434    TN         Chennai
12412    UK         Chennai
12435    MP         Gwalior
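The 3NF condition (X is a super key, or Y is prime) can be tested mechanically with attribute closure. The sketch below is an illustration under stated assumptions: the dependency Emp_id -> Emp_name, Emp_zip is inferred from Emp_id being the key, and the function names closure and violates_3nf are hypothetical.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


ALL_ATTRS = {"Emp_id", "Emp_name", "Emp_zip", "Emp_state", "Emp_city"}
FDS = [
    (frozenset({"Emp_id"}), frozenset({"Emp_name", "Emp_zip"})),    # assumed: Emp_id is the key
    (frozenset({"Emp_zip"}), frozenset({"Emp_state", "Emp_city"})),  # stated in the text
]
PRIME = {"Emp_id"}  # attributes of the only candidate key


def violates_3nf(lhs, rhs, all_attrs, fds, prime):
    """True if lhs -> rhs breaks 3NF: lhs is no super key and rhs is not all prime."""
    is_superkey = closure(lhs, fds) == all_attrs
    rhs_all_prime = (rhs - lhs) <= prime
    return not (is_superkey or rhs_all_prime)


for lhs, rhs in FDS:
    print(sorted(lhs), "->", sorted(rhs), "violates 3NF:",
          violates_3nf(lhs, rhs, ALL_ATTRS, FDS, PRIME))
```

Only the dependency with Emp_zip on the left is reported as a violation, which is why Emp_zip, Emp_state and Emp_city move to their own table.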
Multi-Valued Dependency
Definition:
A multivalued dependency (MVD) on R, X->->Y, says that if two tuples
of R agree on all the attributes of X, then their components in Y may be
swapped, and the result will be two tuples that are also in the relation. That is,
for each value of X, the values of Y are independent of the values of R-X-Y.
o A multivalued dependency generally occurs when two attributes in a table
are independent of each other but both depend on a third attribute.
o A multivalued dependency therefore involves at least two attributes that
depend on a third, so at least three attributes are required.
A table is said to have a multi-valued dependency if the following conditions are
all true:
1. For a dependency A->->B, a single value of A is associated with multiple values of B.
2. The table has at least 3 columns.
3. For a relation R(A,B,C), if there is a multi-valued dependency between A
and B, then B and C are independent of each other.
Also, if a table has attributes P, Q and R, and Q and R are independent multi-valued facts of P,
this is represented with double arrows:
P->->Q
P->->R
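The swapping condition in the definition can be checked on sample data: within each group of tuples that agree on X, every Y value must appear with every value of the remaining attributes. The following is a sketch with hypothetical names (holds_mvd, make_rows) and made-up rows over the attributes P, Q and R.

```python
ATTRS = ["P", "Q", "R"]


def holds_mvd(rows, attrs, x, y):
    """Return True if the multivalued dependency x ->-> y holds in rows."""
    z = [a for a in attrs if a not in x and a not in y]  # the remaining attributes
    groups = {}
    for row in rows:
        xv = tuple(row[a] for a in x)
        yv = tuple(row[a] for a in y)
        zv = tuple(row[a] for a in z)
        ys, zs, pairs = groups.setdefault(xv, (set(), set(), set()))
        ys.add(yv)
        zs.add(zv)
        pairs.add((yv, zv))
    # pairs is always a subset of ys x zs, so equal sizes mean every
    # combination is present, i.e. Y and Z are independent given X.
    return all(len(pairs) == len(ys) * len(zs)
               for ys, zs, pairs in groups.values())


def make_rows(tuples):
    return [dict(zip(ATTRS, t)) for t in tuples]


# Hypothetical data: P = 1 has two Q values and two R values.
complete = make_rows([(1, "q1", "r1"), (1, "q1", "r2"),
                      (1, "q2", "r1"), (1, "q2", "r2")])
incomplete = complete[:-1]  # drop (1, q2, r2)

print(holds_mvd(complete, ATTRS, ["P"], ["Q"]))    # True: all four combinations exist
print(holds_mvd(incomplete, ATTRS, ["P"], ["Q"]))  # False: one combination is missing
```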
General Example:
Drinkers(name, addr, phones, beersLiked)
• A drinker’s phones are independent of the beers they like.
• Thus, each of the drinker’s phones appears with each of the beers they like, in all
combinations.
– If a drinker has 3 phones and likes 10 beers, then the drinker has 30 tuples,
where each phone is repeated 10 times and each beer 3 times.
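The count of 30 tuples is simply the size of the cross product of phones and beers. A small sketch; the drinker name, address, and phone and beer labels are placeholders.

```python
from collections import Counter
from itertools import product

# Placeholder values: 3 phones and 10 beers for one drinker.
phones = [f"phone{i}" for i in range(1, 4)]
beers = [f"beer{i}" for i in range(1, 11)]

# Every phone paired with every beer, as the MVD requires.
tuples = [("Alice", "addr1", p, b) for p, b in product(phones, beers)]

phone_counts = Counter(t[2] for t in tuples)
beer_counts = Counter(t[3] for t in tuples)

print(len(tuples))               # 30
print(phone_counts["phone1"])    # 10: each phone is repeated once per beer
print(beer_counts["beer1"])      # 3: each beer is repeated once per phone
```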
Example: A car manufacturing company produces cars of different colors every year.

Car_Model  Manufacturing_year  Color
C1         2009                Blue
C2         2010                Yellow
C3         2015                Orange
C4         2014                Green
C5         2019                Blue

In the above example, Manufacturing_year and Color are independent of each other
but both depend on Car_Model. The two columns are therefore said to be
multivalue dependent on Car_Model.
These dependencies can be represented as follows:
Car_Model->->Manufacturing_year
Car_Model->->Color
Rules of Multivalued dependencies:
o Every functional dependency is a multivalued dependency: if X->Y, then
swapping Y’s between two tuples that agree on X does not change the tuples.
o Complementation: if X->->Y and Z is all the other attributes, then X->->Z.
Rules for Manipulating Multivalued Dependencies
o Trivial dependencies rule: If A->->B is an MVD, then A->->AB is
also an MVD.
o Splitting does not hold:
  o As with functional dependencies, we cannot generally split
  the left side of a multivalued dependency.
  o Unlike functional dependencies, we cannot split the
  right side either.
Fourth Normal Form
o The redundancy that comes from multivalued dependencies cannot be removed
by putting the database schema in Boyce-Codd Normal Form.
o 4NF is a stronger normal form that treats multivalued dependencies like
functional dependencies when it comes to decomposition.
Definition:
A relation R is in 4NF if whenever X->->Y is a nontrivial multivalued dependency,
then X is a superkey.
Nontrivial means that:
o Y is not a subset of X, and
o X and Y are not, together, all the attributes.
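The two conditions for nontriviality translate directly into set operations. A minimal sketch, using the Drinkers attributes from the earlier example; the function name is_trivial_mvd is hypothetical.

```python
def is_trivial_mvd(all_attrs, x, y):
    """An MVD x ->-> y is trivial if y is a subset of x, or x and y together are all attributes."""
    x, y = set(x), set(y)
    return y <= x or (x | y) == set(all_attrs)


drinkers = ["name", "addr", "phones", "beersLiked"]

# Nontrivial in the full relation, so 4NF requires name to be a superkey.
print(is_trivial_mvd(drinkers, ["name"], ["phones"]))            # False

# Trivial in a two-attribute relation: name and phones are all the attributes.
print(is_trivial_mvd(["name", "phones"], ["name"], ["phones"]))  # True
```

This is why a relation with only two attributes can never violate 4NF through an MVD between them.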
Fourth Normal Form comes into the picture when a multi-valued dependency occurs in
a relation. The rest of this section covers how to remove such a dependency and
make a table satisfy the fourth normal form.
Rules for 4th Normal Form

For a table to satisfy the Fourth Normal Form, it should satisfy the following two
conditions:
1. It should be in the Boyce-Codd Normal Form.
2. It should not have any nontrivial multi-valued dependency whose left side is not a superkey.
Example:
STU_ID  Course     Hobby
1       Computer   Hockey
2       Maths      Football
2       Physics    Singing
4       Chemistry  Cricket
The above table is in 3NF, but Course and Hobby are two independent entities,
so there is no relationship between them.
In the above table, the student with STU_ID 2 has two courses and two hobbies,
so there is a multivalued dependency on STU_ID, which leads to
unnecessary repetition of data.
To bring the above table into 4NF, we can decompose it into two tables:

Student_Course
STU_ID  Course
1       Computer
2       Maths
2       Physics
4       Chemistry

Student_Hobby
STU_ID  Hobby
1       Hockey
2       Football
2       Singing
4       Cricket
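To see how the two tables recombine, the sketch below joins them on STU_ID. The function name natural_join is hypothetical, and the rows are taken from the two tables above.

```python
student_course = [(1, "Computer"), (2, "Maths"), (2, "Physics"), (4, "Chemistry")]
student_hobby = [(1, "Hockey"), (2, "Football"), (2, "Singing"), (4, "Cricket")]


def natural_join(courses, hobbies):
    """Join the two tables on STU_ID, pairing every course with every hobby."""
    return sorted((sid, course, hobby)
                  for sid, course in courses
                  for hid, hobby in hobbies
                  if sid == hid)


joined = natural_join(student_course, student_hobby)
for row in joined:
    print(row)
print(len(joined))  # 6
```

For student 2 the join produces four rows, since each of the two courses is paired with each of the two hobbies. Two of those pairings, (2, Maths, Singing) and (2, Physics, Football), are not listed in the single-table example; they are exactly the combinations that the multivalued dependencies STU_ID->->Course and STU_ID->->Hobby imply. Storing the facts in two tables avoids having to repeat them.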
Decomposition and 4NF:

If X->->Y is a 4NF violation for relation R, we can decompose R using the same
technique as for BCNF.
Example:
Drinkers(name, addr, phones, beersLiked)
FD: name->addr
MVDs: name->->phones
name->->beersLiked
- The key is {name, phones, beersLiked}.
Since all of these dependencies violate 4NF, we can decompose as follows:
1. Drinkers1(name, addr)
- In 4NF; the only dependency is name->addr.
2. Drinkers2(name, phones, beersLiked)
- MVDs: name->->phones and name->->beersLiked
- All three attributes form the key, so name alone is not a superkey and Drinkers2 is
still not in 4NF. It decomposes further into Drinkers3(name, phones) and
Drinkers4(name, beersLiked), both of which are in 4NF.
Relationship among normal forms:

o 4NF implies BCNF, i.e., if a relation is in 4NF, it is also in BCNF
o BCNF implies 3NF, i.e, if a relation is in BCNF, it is also in 3NF
Property                            3NF    BCNF   4NF
Eliminates redundancy due to FDs    Maybe  Yes    Yes
Eliminates redundancy due to MVDs   No     No     Yes
Preserves FDs                       Yes    Maybe  Maybe
Preserves MVDs                      Maybe  Maybe  Maybe
