
Database Normalization

Database normalization is the process of organizing the data in your database. As we
all know, it’s important for a database to maintain data consistency and reduce data
redundancy. Let’s use a simple example to illustrate these principles.

The diagram below (figure 1.1) shows a simple Customers table and Orders table and the
information that we wish to store in our database. Recall how relationships work in
databases: the primary key of the parent table is carried in the child table as a
foreign key. In this case, Customers is the parent table and Orders is the child
table, since the Orders table carries the primary key of the Customers table as a
foreign key. This way, the order records for a specific customer can be associated
with that customer.

Figure 1.1

Customers
Customer ID (primary key)
Customer Name
Customer Type
Contact Name
Category Name

Orders
OrderID (primary key)
CustomerID (foreign key)
Order title
Order description
Amount
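
The parent and child relationship in figure 1.1 can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration, not the schema of any particular system: the table and column names follow figure 1.1, while the column types, the sample orders, and the in-memory database are assumptions made for the example.

```python
import sqlite3

# In-memory database. Names follow figure 1.1; the column types and
# the sample order rows are assumptions for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE Customers (
    CustomerID   INTEGER PRIMARY KEY,
    CustomerName TEXT NOT NULL,
    CustomerType TEXT,
    ContactName  TEXT,
    CategoryName TEXT
);
CREATE TABLE Orders (
    OrderID          INTEGER PRIMARY KEY,
    CustomerID       INTEGER NOT NULL REFERENCES Customers(CustomerID),
    OrderTitle       TEXT,
    OrderDescription TEXT,
    Amount           REAL
);
""")

# One parent row (taken from figure 1.2) and two hypothetical child rows.
conn.execute("INSERT INTO Customers VALUES (2, 'Nike', 'Advertiser', 'James', 'Sports')")
conn.execute("INSERT INTO Orders VALUES (1, 2, 'Banner ad', 'Hypothetical order', 500.0)")
conn.execute("INSERT INTO Orders VALUES (2, 2, 'Print ad', 'Hypothetical order', 250.0)")

# The foreign key lets us find every order that belongs to a given customer.
orders_for_nike = conn.execute(
    "SELECT o.OrderID, o.OrderTitle FROM Orders o "
    "JOIN Customers c ON c.CustomerID = o.CustomerID "
    "WHERE c.CustomerName = ? ORDER BY o.OrderID",
    ("Nike",),
).fetchall()

# An order that points at a customer that does not exist is rejected.
try:
    conn.execute("INSERT INTO Orders VALUES (3, 99, 'Orphan', 'No such customer', 1.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Here the child table cannot hold an order for a customer that is missing from the parent table, which is what keeps the two tables consistent.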

Figure 1.2

Customer ID   Customer Name       Customer Type   Contact Name   Category Name
1             Microsoft Systems   Advertiser      Bill           Hardware
2             Nike                Advertiser      James          Sports
3             SUI Shop            Advertiser      Victor         Retail
4             Microsoft Office    Advertiser      Bill           Software

This database would be able to maintain data consistency and have minimal
redundancy if each customer had only 1 contact person and belonged to only 1
category (see figure 1.2). It would then be considered normalized.

But what if a customer record had more than 1 contact person? (See figure 1.3.) This
would mean that for a given customer, 1 or more contacts can exist.

Figure 1.3

Customer ID   Customer Name       Customer Type   Contact Name   Category Name
1             Microsoft Systems   Advertiser      Bill           Hardware
                                                  Jesse
2             Nike                Advertiser      James          Sports
3             SUI Shop            Advertiser      Victor         Retail
                                                  Zhiwang
5             Microsoft Office    Advertiser      Bill           Software

So now, we’re going to have to reorganize the attributes of the Customers table.

First Normal Form: Eliminating Repeating Groups

First normal form involves the removal of repeating groups. So what is a repeating
group? Well, in the previous example, our repeating group would be the contacts, since
for a given customer, 1 or more contacts can exist.

Therefore, each repeating group that you encounter is moved to a separate table.
And so, you’ll end up with a new table to store the contact information, the
Contacts table.

Customer
Customer ID (primary key)
Customer Name
Customer Type
Category Name

Contacts
ContactID
Contact Name
Customer ID (foreign key)
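
As a rough sketch of that step, the plain Python below takes the rows of figure 1.3, with the contacts held as a list per customer, and splits them into Customer rows and Contacts rows. The dictionary layout and the way ContactID values are generated (a simple counter) are assumptions for illustration. The source does not say how ContactID is assigned.

```python
# Rows as in figure 1.3: each customer carries a list of contact names,
# which is the repeating group.
unnormalized = [
    {"CustomerID": 1, "CustomerName": "Microsoft Systems", "CustomerType": "Advertiser",
     "Contacts": ["Bill", "Jesse"], "CategoryName": "Hardware"},
    {"CustomerID": 2, "CustomerName": "Nike", "CustomerType": "Advertiser",
     "Contacts": ["James"], "CategoryName": "Sports"},
    {"CustomerID": 3, "CustomerName": "SUI Shop", "CustomerType": "Advertiser",
     "Contacts": ["Victor", "Zhiwang"], "CategoryName": "Retail"},
    {"CustomerID": 5, "CustomerName": "Microsoft Office", "CustomerType": "Advertiser",
     "Contacts": ["Bill"], "CategoryName": "Software"},
]

customers = []       # rows for the Customer table
contacts = []        # rows for the new Contacts table
next_contact_id = 1  # assumed: ContactID is just a running counter

for row in unnormalized:
    # The customer row keeps everything except the repeating group.
    customers.append({k: v for k, v in row.items() if k != "Contacts"})
    # Each contact becomes its own row, linked back by CustomerID.
    for name in row["Contacts"]:
        contacts.append({
            "ContactID": next_contact_id,
            "ContactName": name,
            "CustomerID": row["CustomerID"],
        })
        next_contact_id += 1
```

Every value in the resulting rows is a single item, and a customer with several contacts simply has several rows in `contacts`.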

And since the Contacts table carries CustomerID as the foreign key, that would
make it the child table of the Customer table.

Let’s consider the benefits of splitting our data up like this.

With the old database design, if we wanted to store more than 1 contact person’s
name in the customer record, we would have to add more columns to the Customer
table. And every time a customer wanted to store more contacts, we would have to
add yet more columns, forcing us to keep changing the database structure
and model. This is something that database designers try to avoid. A solid and
stable database model is the foundation on which everything else rests.

But by splitting the data, adding another contact person’s name to a
customer record only means inserting a new row in the new Contacts table,
which can be done with a simple INSERT query and requires no change to
our existing database design. This design is also robust enough to allow
customers to add any number of contacts easily and comfortably.
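
To make this concrete, here is a small sketch using Python's built-in sqlite3 module. The Customer and Contacts tables follow the first normal form layout above. The column types, the in-memory database, and the extra contact name "Maria" are hypothetical and used only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (
    CustomerID   INTEGER PRIMARY KEY,
    CustomerName TEXT NOT NULL,
    CustomerType TEXT,
    CategoryName TEXT
);
CREATE TABLE Contacts (
    ContactID   INTEGER PRIMARY KEY,  -- assigned automatically by SQLite
    ContactName TEXT NOT NULL,
    CustomerID  INTEGER NOT NULL REFERENCES Customer(CustomerID)
);
INSERT INTO Customer VALUES (2, 'Nike', 'Advertiser', 'Sports');
INSERT INTO Contacts (ContactName, CustomerID) VALUES ('James', 2);
""")

def contact_columns(connection):
    """Return the column names of the Contacts table."""
    return [row[1] for row in connection.execute("PRAGMA table_info(Contacts)")]

columns_before = contact_columns(conn)

# Adding another contact is a single INSERT; no ALTER TABLE is needed.
# 'Maria' is a made-up contact name.
conn.execute(
    "INSERT INTO Contacts (ContactName, CustomerID) VALUES (?, ?)",
    ("Maria", 2),
)

columns_after = contact_columns(conn)

nike_contacts = [
    name
    for (name,) in conn.execute(
        "SELECT ContactName FROM Contacts WHERE CustomerID = ? ORDER BY ContactID",
        (2,),
    )
]
```

The customer gains a contact while `columns_before` and `columns_after` stay identical, so the table structure is untouched.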

Now that we have achieved first normal form, we can move on to second normal
form, which involves the elimination of redundant data.

Second Normal Form: Eliminating Redundant Data

When you review our Contacts table,
