Sei sulla pagina 1di 11

REDUCING DATA REDUNDANCY A table structure has been suggested for employees database.

After testing the structure with data we find that there is a problem in adding, deleting and modifying data. We see that the table structure could lead to inconsistency in data and that it occupies lot of disk space. The table structure is as follows Employee code First name Last name Candidate code Address City Telno Qualification Birthdate Skill The table with some sample data is as follows Emp First Last City . code Name Name 0000 Angela Smith Mumbai 1 0000 Angela Smith Mumbai 1 0000 Angela Smith Mumbai 1 0000 Betty Williams Pune 2 0000 Betty Williams Pune 2 0000 Betty Williams Pune 2 0000 Betty Williams Pune 2 Skill Communication Presentation Team Leading Communication Presentation Convincing Planning

0000 3 0000 3 0000 3

Carol Carol Carol

Jones Jones Jones

Hyderabad Hyderabad Hyderabad

Communication Team leading Inter personal

Solution Identify how to reduce data redundancy and inconsistency in updating, adding and deleting data. Understanding Data Redundancy Redundancy means repetition of data. Redundancy increases the time involved in updating, adding and deleting data. It also increases utilization of disk space and hence disk I/O. Redundancy can therefore lead to Update anomalies Inserting, modifying and deleting data may cause inconsistencies Inconsistencies Errors are most likely to occur when facts are repeated Un-necessary utilization of disk space We can use systematic approaches like normalization to reduce redundancy and duplicity Normalization is a scientific method of breaking down complex table structures into simple table structures by using certain rules. We can therefore reduce redundancy in a table and eliminate the problems of inconsistency and disk space usage.

Normalization has numerous benefits It enables faster sorting and index creation, few nulls and an increase in compactness of database. Normalization helps to simplify the structure of tables Performance of an application is directly linked to the database design A poor design hinders the performance of the system Logical design of the database lays the foundation for an optimal database Rules to be followed to achieve a good database design Each table should have an identifier (tablename) Each table should store data for single type of entity Columns that accept nulls should be avoided Repetition of columns or values should be avoided Normalization results in formation of tables that satisfy certain specified rules and represent certain normal forms

The normal forms are used to ensure that various types of anomalies and inconsistencies are not introduced in the database. A table structure is always in certain normal form

First Normal Form (1 NF) A table is said to be in 1st Normal Form when each cell contains precisely one value. Following is the content of project table Emp code E 101 E 305 E 508 Dept Systems Sales Admin Proj Code P27 P51 P20 P27 P 22 P 51 P27 Hours 90 101 60 109 98 NULL 72

The data in the table is not normalized because the cells in Projcode and Hours have more than 1 value. By applying 1NF definition to the project table we would arrive at the following:Emp code E 101 E 101 Dept Systems Systems Proj Code P27 P51 Hours 90 101

E 101 E 305 E 305 E 508 E 508

Systems Sales Sales Admin Admin

P20 P27 P 22 P 51 P27

60 109 98 NULL 72

Functional Dependency Normalization theory is based on the fundamental notion of functional dependency Given a relation R, attribute A is functionally dependent on attribute B if each value of A in R is associated with precisely one value of B In other words attribute A is functionally dependent on B if and only if for each value of B there is exactly one value of A Thus attribute B is called determinant For Example consider the employee table Code Name City E1 Rajesh Delhi E2 Ashish Mumbai E3 Manish Kolkata Given a particular value of code there is precisely one corresponding value for name. For example for code E1 there is exactly one value of Name (Rajesh). Hence name is functionally dependent on code. Similarly there is exactly one value of city for each value of code. Hence attribute city is functionally dependent on code. The attribute code is determinant. We can also say that Code determines name and city.

Second Normal Form (2NF) A table is said to be in 2nd normal form when it is in 1NF and every attribute in the row is functionally dependent upon the whole key and not just part of the key. Consider the project table Empcode Projcode Dept Hours The table has following rows Empcode E101 E305 E508 E101 E101 E508 Projcode P27 P27 P51 P51 P20 P27 Dept Systems Finance Admin Systems Systems Admin Hours 90 10 NULL 101 60 72

The situation could lead to following problems

Insertion - The department of a particular employee cannot be recorded until the employee is assigned a project Updation - For a given employee the employee code and department are repeated several times. Hence if an employee is transferred to another department, this change will have to be recorded in every row of the employee table pertaining to that employee. Any omission will lead to inconsistencies. Deletion - If an employee completes work on a project the employees record will be deleted as a result the information regarding the department to which the employee belongs will also be lost.

The primary key here is composite (Empcode + Projcode) The table satisfies definition of 1NF, we now need to check if it satisfies 2NF In the table for each value of empcode, there is more than 1 value of hours. Thus for empcode E101 there are 3 values of hours 90,101 and 60 respectively. Hence hours is not functionally dependent on Empcode. Similarly for each value of projcode there is more than 1value of hours thus for projcode P27, there are three values of hours 90, 10 and 72. However for a combination of ecode and projcode there is exactly one value of hours hence hours is functionally dependent on the whole key ecode + projcode Now we must check if dept is functionally dependent on the whole key ecode + projcode

For each value of ecode there is exactly 1 value of dept, for ecode E101 there is exactly 1 value system department. Hence dept is functionally dependent on ecode. However for each value of projcode there is more than 1 value of department, for projcode P27 there are 2 values of department System and Finance. Hence Dept is not functionally dependent on p4rojcode. Dept is therefore functionally dependent on part of key (which is ecode) and not functionally dependent on whole key (empcode + projcode). Therefore the table Project is not in 2NF, for a table to be in 2NF, the non key attributes must be fully functionally dependent on the whole key and not just part of key. Converting a table into 2NF Find and remove attributes that are functionally dependent on only a part of the key and not on the whole key, place them in a different table. Group the remaining attributes. To convert the project table into 2NF we must remove the attributes that are not fully functionally dependent on the whole key and place them in a different table along with the attribute that it is functionally dependent on. In the above example since Dept is not fully functionally dependent on the whole key ecode+projcode, we place dept along with ecode in a separate table called Employeedept Now the project table will contain Ecode, Projcode and Hours EmployeeDept Ecode E101 E305 E508 Dept Systems Sales Admin

Project Ecode E101 E101 E101 E305 E508 E508

Projcode P27 P51 P20 P27 P51 P27

Hours 90 101 60 10 NULL 72

Third Normal Form (3 NF) A relation is said to be in 3 NF when it is in 2 NF and every non key attribute is functionally dependent only on the primary key. Consider the employee table Ecode Dept E101 Systems E305 Finance E402 Sales E508 Admin E607 Finance E608 Finance Depthead E901 E906 E906 E908 E909 E909

The problem with dependencies of this kind are :Insertion The department head of a new department that does not have any employees at present cannot be entered in the depthead column . This is because the primary key is unknown. Updation For a given department, the code for a particular department head (depthead) is repeated several times. Hence, if a

depthead moves to another department, the change will have to be made consistently across the table. Deletion If the record of an employee is deleted, information regarding the head of the department will also be deleted, hence there is loss of information. We must check if the table is in 3 NF, since each cell in the table has a single value the table is in 1 NF The primary key in employee table is ecode . For each value of ecode there is exactly one value of dept. Hence the attribute dept is functionally dependent on the primary key, Ecode. Similarly for each value of Ecode there is exactly one value of depthead. Hence depthead is functionally dependent on the primary key ecode. Hence all the attributes are functionally dependent on the whole key Ecode. Hence table is in 2 NF However the attribute depthead is dependent on the attribute dept also. As per 3 NF all non-key attributes have to be functionally dependent only on the primary key. This table is not in 3 NF since depthead is functionally dependent o dept which is not a primary key. Converting a table into 3 NF Find and remove non key attributes that are functionally dependent on attributes that are not the primary key. Place them in a different table. Group the remaining attributes. To convert the table employee into 3NF we must remove the column depthead since it is not functionally dependent only on the primary key Ecode and place it in another table called department along with the attribute dept which it is functionally dependent on. Employee Ecode Dept E101 Systems E305 Finance

E402 E508 E607 E608

Sales Admin Finance Finance

Department Dept Depthead Systems E901 Sales E906 Admin E908 Finance E909

Potrebbero piacerti anche