
Introduction to Data Modeling & Relational Database Design

Prof. Dr. Abdalla Eldaoushy Institute of National Planning, Cairo, EGYPT



Objectives

o DBMS, RDBMS, ORDBMS.
o Shared & Integrated Data.
o Primary and Foreign Keys & Data Integrity.
o Base Tables & Views.
o Main Elements of ERD.
o Normalization.
o Data Privacy (Views, Passwords, Locks).
o Distributed DataBase (Partitioned & Replicated).


Terminology:

- DATA
- INFORMATION
- KNOWLEDGE
- HIERARCHICAL DATA STRUCTURES : Bit, Byte, Field, Record, File, DataBase

- Entity : Any object about which we wish to record data and obtain information. It is usually distinguishable. Examples : Employees, Departments, Products, . . .

What Is A DBMS ?

It can be defined as a SYSTEM which facilitates shared access to data in a dataBase, and which maintains the reliability, security, and integrity of the dataBase by controlling access to it and supervising updates.

A DataBase Management System (DBMS) involves 4 major components : Data, H/W, S/W, and Users.


Components of DBMS:

[Figure: components of a DBMS, showing the DataBase, the DBMS, Application Programs, and Users]

1. DATA: Must be integrated and usually shared.
2. Hardware: Secondary storage devices, I/O, Processor(s), and Main memory.
3. Software: DBMS and Access methods (an ORDBMS, Oracle in our case).
4. Users: Application Programs, Online Terminal Users, and DBA.


DATA :

For a Single-User-System :
At most one user can access the dataBase at any given time.

For a Multi-User-System :
Many users can access the dataBase concurrently (at the same time). A major objective of most multi-user-systems is precisely to allow each individual-user to behave as if he/she were working with a single-user-system.

The data in the dataBase will be both integrated and shared. These two aspects represent a major advantage of dataBase systems. By integrated, we mean that the dataBase can be thought of as a unification of several distinct data files, with any redundancy among those files eliminated.


Example 1 :
A given dataBase might contain this table:
EMPNO  ENAME   JOB    DEPTNO  DNAME       LOC
~~~~~  ~~~~~~  ~~~~~  ~~~~~~  ~~~~~~~~~~  ~~~~~~~~
7934   MILLER  . . .  10      ACCOUNTING  NEW YORK
7369   SMITH   . . .  20      RESEARCH    DALLAS
7499   ALLEN   . . .  30      SALES       CHICAGO
7521   WARD    . . .  30      SALES       CHICAGO
7782   CLARK   . . .  10      ACCOUNTING  NEW YORK

If we decompose the above table into the following two tables :

DEPT (deptno, dname, loc)
EMP (empno, ename, job, mgr, hiredate, sal, comm, deptno)

then using deptno in both tables eliminates the redundancy within the above table: each department's name and location are stored once, in DEPT, instead of once per employee.

By shared, we mean that individual pieces of data in the dataBase can be shared among several different users, in the sense that each of these users can have access to the same piece of data (and different users can use it for different purposes). Different users can access the same piece of data at the same time (concurrent access). Such sharing is partly a consequence of the fact that the dataBase is integrated.


Why An Enterprise Should Have DBMS ?

An enterprise may be :

o Universities with Student data.
o Banks with Accounting data.
o Manufacturing Companies with Product data.
o Hospitals with Patient data.
o Governmental Departments with Planning data.

Why ?

A DBMS provides the enterprise with centralized control of its data, with the following advantages :

o Redundancy can be reduced.
o Inconsistency can be avoided.
o Data can be shared.
o Security restrictions can be applied.
o Integrity can be maintained.
o Data independence: We say that an application is data-dependent if it is impossible to change the storage structure (how the data is physically recorded) without affecting the application program. Data independence is a major objective of a DBMS. It allows changing storage structures or access requirements without affecting existing programs. Once tables are created, they continue to exist whether the application programs are changed or even deleted.
o Decision needs are supported.

Data Administration & Database Administration :

There will be some Identifiable Person who has central responsibility for the data. This person is the Data Administrator (DA). The DA is the person who understands the data, and the needs of the enterprise with respect to that data, at a senior management level. Thus, it is the DA's job to decide what data should be stored in the dataBase in the first place, and to establish policies for maintaining and dealing with that data once it has been stored. The DA is a Manager, not a Technician.

The Technical Person responsible for implementing the DA's decisions is the DBA. The job of the DBA is to create the actual dataBase and to implement the technical controls needed to enforce the various policy decisions made by the DA. The DBA is also responsible for ensuring that the system operates with adequate performance and for providing a variety of other related technical services. The DBA will typically have a staff of systems programmers and other technical assistants (i.e., the DBA function will typically be performed in practice by a team of several people, not just by one person).

For simplicity, however, it is convenient to assume that the DBA is indeed a single individual.


Relational Systems :
Almost all the dataBase products developed since the late 1970s have been based on what is called the Relational Approach, due to Dr. Codd, who proposed the Relational Model for database systems (RDBMS) in his paper : A Relational Model of Data for Large Shared Data Banks. The more popular models used at that time were the Hierarchical & Network Data Structures. RDBMS soon became very popular, especially for their ease of use and flexibility in structure. What is more, the vast majority of dataBase research over the last 25 years has also been based (if only indirectly in some cases) on that approach. For these reasons (plus the additional reason that the relational model is based on certain aspects of mathematics), the emphasis here is very heavily on relational systems and the relational approach.

Definition : Briefly, a Relational Database System is a System in which :
o The data is seen by the user as TABLEs, and
o The Users can generate new tables from existing ones.


Example 2 : Given the following tables :

DEPT

DEPTNO  DNAME       LOC
~~~~~~  ~~~~~~~~~~  ~~~~~~~~
10      ACCOUNTING  NEW YORK
20      RESEARCH    DALLAS
30      SALES       CHICAGO
40      OPERATIONS  BOSTON

EMP

EMPNO  ENAME   DEPTNO  SAL
~~~~~  ~~~~~~  ~~~~~~  ~~~~
7369   SMITH   20      800
7499   ALLEN   30      1600
7521   WARD    30      1250
7566   JONES   20      2975
7654   MARTIN  30      1250
7698   BLAKE   30      2850
7782   CLARK   10      2450
7788   SCOTT   20      3000
7839   KING    10      5000
7844   TURNER  30      1500
7876   ADAMS   20      1100
7900   JAMES   30      950
7902   FORD    20      3000
7934   MILLER  10      1300

Then,

SQL> SELECT deptno, dname FROM dept;

Will Display :

DEPTNO  DNAME
~~~~~~  ~~~~~~~~~~
10      ACCOUNTING
20      RESEARCH
30      SALES
40      OPERATIONS

SQL> SELECT ename, sal FROM emp WHERE sal >= 3000;

Will Display :

ENAME  SAL
~~~~~  ~~~~
SCOTT  3000
KING   5000
FORD   3000

These retrievals are in fact examples of the SELECT statement of the SQL Language.

SQL> SELECT dept.deptno, dname, empno, ename, sal
     FROM dept, emp
     WHERE dept.deptno = emp.deptno
     ORDER BY deptno;

Will Display :

DEPTNO  DNAME       EMPNO  ENAME   SAL
~~~~~~  ~~~~~~~~~~  ~~~~~  ~~~~~~  ~~~~
10      ACCOUNTING  7782   CLARK   2450
10      ACCOUNTING  7839   KING    5000
10      ACCOUNTING  7934   MILLER  1300
20      RESEARCH    7369   SMITH   800
20      RESEARCH    7876   ADAMS   1100
20      RESEARCH    7902   FORD    3000
20      RESEARCH    7788   SCOTT   3000
20      RESEARCH    7566   JONES   2975
30      SALES       7499   ALLEN   1600
30      SALES       7698   BLAKE   2850
30      SALES       7654   MARTIN  1250
30      SALES       7900   JAMES   950
30      SALES       7844   TURNER  1500
30      SALES       7521   WARD    1250

Note : The join condition (dept.deptno = emp.deptno) will be discussed in more detail when we discuss the SQL Language.
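The retrievals above can be tried without an Oracle installation. The following is only a minimal sketch: it assumes SQLite (through Python's standard sqlite3 module) as a stand-in for Oracle, loads the DEPT and EMP rows of Example 2, and runs the restriction and the join. The extra ORDER BY columns are added here only to make the row order predictable.

```python
import sqlite3

# Assumption: SQLite (Python standard library) stands in for Oracle.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE dept (deptno INTEGER, dname TEXT, loc TEXT)")
con.execute("CREATE TABLE emp (empno INTEGER, ename TEXT, deptno INTEGER, sal INTEGER)")
con.executemany("INSERT INTO dept VALUES (?, ?, ?)", [
    (10, "ACCOUNTING", "NEW YORK"), (20, "RESEARCH", "DALLAS"),
    (30, "SALES", "CHICAGO"), (40, "OPERATIONS", "BOSTON"),
])
con.executemany("INSERT INTO emp VALUES (?, ?, ?, ?)", [
    (7369, "SMITH", 20, 800), (7499, "ALLEN", 30, 1600), (7521, "WARD", 30, 1250),
    (7566, "JONES", 20, 2975), (7654, "MARTIN", 30, 1250), (7698, "BLAKE", 30, 2850),
    (7782, "CLARK", 10, 2450), (7788, "SCOTT", 20, 3000), (7839, "KING", 10, 5000),
    (7844, "TURNER", 30, 1500), (7876, "ADAMS", 20, 1100), (7900, "JAMES", 30, 950),
    (7902, "FORD", 20, 3000), (7934, "MILLER", 10, 1300),
])

# Restriction: keep only the EMP rows that satisfy the WHERE condition.
high_paid = con.execute(
    "SELECT ename, sal FROM emp WHERE sal >= 3000 ORDER BY empno").fetchall()

# Join: pair each EMP row with the DEPT row that has the same deptno.
joined = con.execute(
    "SELECT dept.deptno, dname, empno, ename, sal "
    "FROM dept, emp WHERE dept.deptno = emp.deptno "
    "ORDER BY dept.deptno, empno").fetchall()

print(high_paid)
print(len(joined), joined[0])
```

Department 40 (OPERATIONS) does not appear in the join result because no EMP row carries deptno 40.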

Data Integrity :
Consider the DEPT & EMP tables once again. These tables would be subject to numerous integrity rules. For example :

o Employees' salaries might have to be in the range of $800 to $5000.
o Moreover, each row in the DEPT table must include a unique DEPTNO value; likewise, each row in the EMP table must include a unique EMPNO value.
o Each DEPTNO value in the EMP table (if it exists) must exist as a DEPTNO value in the DEPT table (to reflect the fact that every employee must be assigned to an existing department).
o The column DEPTNO in the DEPT table and the column EMPNO in the EMP table are called PRIMARY KEYs for their tables.
o Column DEPTNO in the EMP table is called a FOREIGN KEY which references the PRIMARY KEY of the DEPT table.
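These integrity rules can be declared as constraints so that the DBMS itself rejects bad rows. The sketch below is illustrative only: it assumes SQLite (Python's sqlite3 module) instead of Oracle, and the helper is_rejected is a hypothetical name introduced here. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable FK enforcement

con.execute("""CREATE TABLE dept (
    deptno INTEGER PRIMARY KEY,               -- unique DEPTNO
    dname  TEXT,
    loc    TEXT)""")
con.execute("""CREATE TABLE emp (
    empno  INTEGER PRIMARY KEY,               -- unique EMPNO
    ename  TEXT,
    sal    INTEGER CHECK (sal BETWEEN 800 AND 5000),
    deptno INTEGER REFERENCES dept (deptno))  -- FOREIGN KEY to DEPT""")

con.execute("INSERT INTO dept VALUES (10, 'ACCOUNTING', 'NEW YORK')")
con.execute("INSERT INTO emp VALUES (7782, 'CLARK', 2450, 10)")

def is_rejected(sql):
    """Hypothetical helper: True if the statement violates a constraint."""
    try:
        con.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

print(is_rejected("INSERT INTO emp VALUES (7782, 'DUP', 1000, 10)"))     # duplicate EMPNO
print(is_rejected("INSERT INTO emp VALUES (8000, 'LOW', 500, 10)"))      # salary out of range
print(is_rejected("INSERT INTO emp VALUES (8001, 'NODEPT', 1000, 99)"))  # no such department
```

A NULL deptno is still accepted, which matches the "if it exists" wording of the foreign-key rule.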


In Summary,
1. A dataBase System can be thought of as a computerized record-keeping system. Such a system involves : DATA, H/W, S/W, and USERS.

   USERS can be divided into Application Programmers (Developers), End Users, and DBA. The DBA is responsible for administering the dataBase and the dataBase System in accordance with policies established by the DA.

2. DataBases are integrated and usually shared.
3. A Data Model can usually be considered as representing entities together with relationships among these entities, although in fact a relationship is really just a special kind of entity.
4. DataBase Systems provide a number of benefits, of which one of the most important is data independence.
5. DataBase Systems can be based on a number of different approaches, including, in particular, the Relational Approach.
6. The Relational Approach is easily the most important. In a relational system, the data is seen by the user as tables.
7. The standard language for dealing with Relational Systems is the SQL language.

Note : SQL is standardized as ANSI SQL. Oracle adds its own procedural extension to SQL, called PL/SQL.


Practice 1 :

Consider again the previous DEPT & EMP tables. Show the output of the following SQL Retrieval Commands :

A, SELECT deptno, dname FROM dept;

B, SELECT deptno, dname, loc FROM dept WHERE deptno = 20;

C, SELECT dept.deptno, loc, empno, ename, sal FROM dept, emp WHERE dept.deptno = emp.deptno;

D, INSERT INTO dept (deptno, dname, loc) VALUES (50, 'AGRICULTURE', 'ZAGAZIG');
   SELECT * FROM dept;

E, INSERT INTO temp (empno, ename) ( SELECT empno, ename FROM emp WHERE deptno = 30 );

   This example assumes that we have previously created another table TEMP with just 2 columns (empno, ename). This can be done as follows (for example) :
   CREATE TABLE temp (empno, ename) AS SELECT empno, ename FROM emp WHERE 1=2;
   The INSERT statement inserts into that table the employee numbers & names of all employees in dept 30.

F, UPDATE emp SET sal = sal*1.1 WHERE deptno = 20 ;

   The UPDATE statement updates the dataBase to reflect the fact that all employees in dept 20 have been given a 10% salary increase.

G, DELETE FROM dept WHERE deptno = 50 ;

   The DELETE statement deletes the DEPT row for dept 50 (the row inserted in D).

Base Tables & Views :


We have seen that, starting with a given set of tables such as DEPT and EMP, relational commands (SQL statements) allow us to obtain further tables from that given set of tables. The original (given) tables are called BASE TABLES. BASE TABLES have independent existence, while DERIVED TABLES do not --- they depend on base tables.

BASE TABLES have to be named; most DERIVED TABLES, by contrast, are not named. Relational Systems usually support one particular kind of DERIVED TABLE (called a VIEW) that does have a NAME. A VIEW is thus a named table that, unlike a base table, does not have an independent existence of its own, but is instead defined in terms of one or more named tables (base tables or other views).

Example 3 :
The following statement might be used to define a VIEW called EMPDEPT30 :

SQL> CREATE VIEW empdept30 AS
     SELECT empno, ename, sal FROM emp WHERE deptno = 30;


SQL> SELECT * FROM empdept30;

EMPNO  ENAME   SAL
~~~~~  ~~~~~~  ~~~~
7499   ALLEN   1600
7521   WARD    1250
7654   MARTIN  1250
7698   BLAKE   2850
7844   TURNER  1500
7900   JAMES   950

When the SQL statement above is executed, the expression following the AS (in the CREATE VIEW command) --which is in fact the view-definition ---- is not evaluated but is merely remembered by the system in some way.

In Summary :
BASE Tables really exist, in the sense that they represent data that is actually stored in the dataBase. VIEWs, by contrast, do not really exist but merely provide different ways of looking at the real data. A View logically represents subsets of data from one or more tables.
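One consequence of a view being a remembered definition rather than stored data is that it always reflects the current contents of its base table. The following is a small sketch of that behaviour, assuming SQLite (Python's sqlite3 module) rather than Oracle and using only three EMP rows; view_rows is a hypothetical helper name.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, ename TEXT, sal INTEGER, deptno INTEGER)")
con.executemany("INSERT INTO emp VALUES (?, ?, ?, ?)", [
    (7499, "ALLEN", 1600, 30), (7521, "WARD", 1250, 30), (7782, "CLARK", 2450, 10),
])

# Only the defining query is stored; no rows are copied into the view.
con.execute("CREATE VIEW empdept30 AS "
            "SELECT empno, ename, sal FROM emp WHERE deptno = 30")

def view_rows():
    """Hypothetical helper: evaluate the view against the current base table."""
    return con.execute("SELECT * FROM empdept30 ORDER BY empno").fetchall()

before = view_rows()
con.execute("INSERT INTO emp VALUES (7900, 'JAMES', 950, 30)")  # change the base table
after = view_rows()

print(before)
print(after)  # the new dept-30 employee shows up without touching the view
```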


System Development Life Cycle


Strategy and Analysis -> Design -> Build and Document -> Transition -> Production

From concept to production, you can develop a database by using the system development life cycle, which contains multiple stages of development. This top-down, systematic approach to database development transforms business information requirements into an operational database.

Strategy and Analysis :
o Study and analyze the business requirements. Interview users and managers to identify the information requirements. Incorporate the enterprise and application mission statements as well as any future system specifications.
o Build models of the system. Transfer the business narrative into a graphical representation of business information needs and rules. Confirm and refine the model with the analysts and experts.


Design :
o Design the database based on the model developed in the strategy and analysis phase.

Build and Document :
o Build the prototype system.
o Write and execute the commands to create the tables and supporting objects for the database.
o Develop user documentation, help text, and operations manuals to support the use and operation of the system.

Transition :
o This phase is concerned with moving the application into production, with user acceptance testing, conversion of existing data, and parallel operation. It also includes making any modifications required.

Production :
o Roll out the system to the users. Monitor its performance, and enhance the system.

Note : The various phases can be carried out iteratively. This course focuses on the Build Phase of the System.


Main Elements of Entity Relationship Diagram (ERD) :

Entity Definitions :

o An Entity is an Object of Interest to the Business.
o An Entity is a named thing.
o An entity is a thing of significance about which the business needs information.

Example 4 :
For a Manufacturing Company, the following are examples of different entities : EMPLOYEES, DEPARTMENTS, PROJECTS, SUPPLIERS, PARTS, WAREHOUSES, . . .

Attribute Definition :
Specific pieces of information which need to be known. An entity should have attributes.

Entity Diagramming Convention :
o Soft box (see coming examples).
o Singular, unique name in UPPERCASE.
o Attribute names in lower case.

Relationship Definition :
o The way one entity relates to another.
o What one thing has to do with another.
o A named association between entities.
o A relationship is a bi-directional, significant association between two entities, or between one instance of an entity and another instance of the same entity (Recursive Relationship).

Example 5 :
o Consider the entities COURSE and INSTRUCTOR : A COURSE has a relationship with an INSTRUCTOR and an INSTRUCTOR has a relationship with a COURSE. A COURSE can be taught by an INSTRUCTOR. An INSTRUCTOR can teach a COURSE.

Diagramming Convention of Relationships :

o A line between two entities (see coming examples).
o Lowercase relationship name.
o Optionality (Mandatory or Optional).
o Degree (maximum cardinality). {Cardinality is a synonym of Degree . . .}


Example 6 :
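The following ERD shows that Optionality (Mandatory or Optional) is shown on the part of the line nearest to the entity. Degree (1 or M) is also shown :

[Figure: ERD of COPY and TITLE. The relationship line carries the Degree (Cardinality) labels M at the COPY end and 1 at the TITLE end; one half of the line is marked Mandatory and the other half Optional.]

(Note : Cardinality is a synonym for the term Degree)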

Reading ERD Diagram :


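[Figure: COPY and TITLE, joined by a relationship labeled "of" (read from COPY) and "available in" (read from TITLE)]

Each COPY must be of one and only one TITLE. Each TITLE may be available in one or more COPIES.

[Figure: COURSE and INSTRUCTOR, joined by a relationship labeled "taught by" (read from COURSE) and "assigned to" (read from INSTRUCTOR)]

Each COURSE may be taught by one and only one INSTRUCTOR. Each INSTRUCTOR may be assigned to one or more COURSES.

[Figure: PATIENT and DOCTOR, with the relationship labels "examined by", "assigned to", and "has"]

?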

Attribute Optionality :

Mandatory Attributes :
o A mandatory attribute means that a value must be stored for this attribute for each entity instance (NOT NULL).
o A mandatory attribute will be tagged with *.

Optional Attributes :
o A value may be stored for each entity instance (or may be NULL, which is the default).
o Will be tagged with o.

Example 7 :

EMPLOYEE
* employee num
* first name
* last name
o title
o weight


Unique Identifier Definition :

o Each entity-instance must be uniquely identified. This can be done by a Unique Identifier.
o A combination of attributes or relationships that serve to identify an instance of an entity is called a Unique Identifier (UID).
o That is, a UID is any combination of attributes and relationships that serve to identify an instance of an entity uniquely.
o You can specify unique identifiers at any time during analysis, but every entity must have a UID in order to begin design.

Example 8 :
Consider the following entity :

CUSTOMER
# * customer num

(an entity with a single attribute. It is a UID and is tagged with #)


Compound UID --- Composite :

Consider the following ERD . . .

[Figure: ACCOUNT (* num) related to BANK (# * num)]

What would you need to know to identify a Specific Instance of an ACCOUNT?

[Figure: ACCOUNT (# * num) related to BANK (# * num), with a UID bar across the relationship at the ACCOUNT end]

o Use # to indicate that this attribute is a part of the entity's UID (the other part comes from the relationship).
o Use a UID bar to indicate that a relationship is part of the entity's UID.

An entity may be identified by attributes, but it can also use its relationship with another entity as part of its UID. This is known as a composite UID. The ACCOUNT has a unique number, but it is only unique within a specific bank branch. This means that you need a combination of the bank's UID and the account number to get a specific instance of an account. When the UID of a parent entity forms part of the UID of the child entity, mark the relationship between parent and child by drawing a bar across the relationship at the child end.


The above ERD maps, for example, to the following tables :

BANK table

BRANCH_NO  BRANCH_NAME
~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
104        National Bank Of Egypt, Heliopolis
105        National Bank Of Egypt, Nasr City
106        National Bank Of Egypt, ElTayaran
...

ACCOUNT table

ACCOUNT_NO  BALANCE    BRANCH_NO
~~~~~~~~~~  ~~~~~~~~~  ~~~~~~~~~
75760       12,000.50  104
77956       1000.00    104
89570       55,775.00  150
...
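A composite UID of this kind typically becomes a composite primary key in which the parent's key is also a foreign key. The sketch below is one possible rendering, assuming SQLite (Python's sqlite3 module) in place of Oracle; the column types and the helper try_insert are assumptions made for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable FK enforcement

con.execute("CREATE TABLE bank (branch_no INTEGER PRIMARY KEY, branch_name TEXT NOT NULL)")
con.execute("""CREATE TABLE account (
    branch_no  INTEGER NOT NULL REFERENCES bank (branch_no),  -- UID bar: part of the key
    account_no INTEGER NOT NULL,                              -- '#' attribute
    balance    NUMERIC,
    PRIMARY KEY (branch_no, account_no))""")

con.executemany("INSERT INTO bank VALUES (?, ?)", [
    (104, "National Bank Of Egypt, Heliopolis"),
    (105, "National Bank Of Egypt, Nasr City"),
])
con.execute("INSERT INTO account VALUES (104, 75760, 12000.50)")
# The same account number in a different branch is a different account.
con.execute("INSERT INTO account VALUES (105, 75760, 1000.00)")

def try_insert(branch_no, account_no):
    """Hypothetical helper: True if the new ACCOUNT row is accepted."""
    try:
        con.execute("INSERT INTO account VALUES (?, ?, 0)", (branch_no, account_no))
        return True
    except sqlite3.IntegrityError:
        return False

print(try_insert(104, 75760))  # same branch and number: rejected
print(try_insert(104, 77956))  # new number in the branch: accepted
```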


Transferability :
Transferability is concerned with moving a relationship from one instance of the master entity to a different instance of the same master entity.

Transferable Relationships :
o In the following relationship, a DEPT has many employees working in it. An employee works for only one DEPT. However, most companies allow their employees to apply for internal vacancies in other departments. When an employee in dept 10 does this and is accepted in dept 20 (for example), he becomes employed by dept 20 and not by dept 10. The relationship still exists between EMP and DEPT, but now it is with a different instance of DEPT.

[Figure: EMP (# * empno, * ename, . . ., o deptno) "employed by" / "employs" DEPT (# * deptno, * dname, * loc); an employee of Deptno 10 can move to Deptno 20, 30, or 40]

o All relationships are considered transferable unless you state that they should not be transferable.

In other words, in DEPT & EMP, you can UPDATE EMP and change DEPTNO for any employee. This is TRANSFERABILITY (which is the default). For NON-TRANSFERABILITY, either write a database trigger, or specify it when you use Oracle Designer.


Non-Transferable Relationships :

Example 9 :

o Sometimes, you may need to restrict transferability for business reasons. For example, a COPY is supplied by a specific instance of a SUPPLIER. Once a COPY has been supplied, the COMPANY that supplied it cannot be changed (at the child entity).
o But notice that if the COPY was mistakenly assigned to a SUPPLIER, the COPY would have to be deleted and a new COPY created. If this is the required business procedure, then the relationship should be NON-TRANSFERABLE (this can be marked at the child entity by a diamond symbol, as shown).

[Figure: COMPANY (# * id, * name) containing SUPPLIER (* supplier num, * sales contact) and OTHER ...; COPY (* inventory num, o condition) "is supplied by" / "the source of" SUPPLIER. COPY is the child and SUPPLIER the parent; a diamond marks the relationship at the COPY end.]

What about the following Commands :

UPDATE dept SET deptno = 50 WHERE deptno = 10;
UPDATE emp SET deptno = 30 WHERE deptno = 10;
UPDATE emp SET deptno = 50 WHERE deptno = 10;


Example 10 :
o Suppose that, as a Business Rule, a person who works for a specific dept cannot be moved to another dept. If this is the required business procedure, then the relationship should be NON-TRANSFERABLE.
o Add a diamond at the detail end of the relationship to show NON-TRANSFERABILITY.

[Figure: EMP (# * empno, * ename, . . ., o deptno) "employed by" / "employs" DEPT (# * deptno, * dname, * loc), with a diamond at the EMP end; an employee of Deptno 10 cannot be moved to Deptno 20, 30, or 40]

How can it be applied ? :

1. Using Oracle Designer, or
2. Database Triggers (or Column Constraints, or Column Privileges)
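The database-trigger option can be sketched as follows. This is an illustration under assumptions, not Oracle syntax: it uses SQLite's trigger dialect through Python's sqlite3 module, and the trigger name and the helper try_update are hypothetical. In Oracle the same idea would be written as a PL/SQL trigger.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, ename TEXT, sal INTEGER, deptno INTEGER)")
con.execute("INSERT INTO emp VALUES (7782, 'CLARK', 2450, 10)")

# Hypothetical trigger: reject any UPDATE that moves an employee to another dept.
con.execute("""
    CREATE TRIGGER emp_deptno_not_transferable
    BEFORE UPDATE OF deptno ON emp
    FOR EACH ROW
    WHEN NEW.deptno IS NOT OLD.deptno
    BEGIN
        SELECT RAISE(ABORT, 'EMP-DEPT relationship is non-transferable');
    END
""")

def try_update(sql):
    """Hypothetical helper: True if the UPDATE is accepted."""
    try:
        con.execute(sql)
        return True
    except sqlite3.DatabaseError:
        return False

print(try_update("UPDATE emp SET deptno = 20 WHERE empno = 7782"))  # blocked by the trigger
print(try_update("UPDATE emp SET sal = 2600 WHERE empno = 7782"))   # other columns stay updatable
```

The trigger only fires for updates that name the deptno column, so ordinary changes such as a salary update are unaffected.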


Mutually Exclusive Relationship --- Exclusivity (Arcs) :

[Figure: ORDER "is placed by" CUSTOMER or "is placed by" DEPARTMENT (each of which "places" ORDERs); an arc spans the two relationships at the ORDER end]

Each ORDER may be paired with only one of the related entity types, not both.

There are two types of Exclusivity :

o Explicit Arc Design, and
o Generic Arc Design.


Explicit Arc Design : Example . . .

[Figure: OFFICE_SUITE (# * building id, # * suite no) related through an arc to INDIVIDUAL (# * id), PARTNERSHIP (# * code), and COMPANY (# * num)]

o An arc represents alternative Foreign Keys.
o Explicit Arc Design creates a FK column for each relationship.

Note : Even if the relationships are designated as mandatory, the FK columns cannot be recorded as NN because, for each row, only one of the FKs has a value.

OFFICE_SUITES

Col. Name    BUILDING_ID  SUITE_NO  IND_ID  PAR_CODE  COM_NUM
Key Type     PK           PK        FK1     FK2       FK3
NULLs        NN           NN
Sample Data  1024         101       30045
             512          210               A4431
             799          144       54532
             3041         510                         10844


Generic Arc Design : Same Example . . .

[Figure: OFFICE_SUITE (# * building id, # * suite no) related through an arc to INDIVIDUAL (# * id), PARTNERSHIP (# * code), and COMPANY (# * num)]

OFFICE_SUITES

Col. Name    BUILDING_ID  SUITE_NO  RENTER_ID  RENTER_TYPE
Key Type     PK           PK        FK,UID
NULLs        NN           NN        NN         NN
Sample Data  1024         101       30045      I
             512          210       A4431      P
             799          144       54532      I
             3041         510       10844      C

The Generic Arc Design creates a single foreign-key column (RENTER_ID) and an additional column (RENTER_TYPE) to use as a flag for the type. Since the relationships are EXCLUSIVE, only one FK value exists for each row in the table. If all of the relationships in the ARC are mandatory, then you must make the FK column NN.


Practice I-1 : Using an Explicit Arc Design, develop a table design for the ERD :

[Figure: STUDENT (# * id, * name) is "from" exactly one of CITY (# * city code, * city name), OTHER_STATE (# * state id, * state name), or FOREIGN_COUNTRY (# * country code, * country name); each of these is "the home of" STUDENTs. An arc spans the three relationships at the STUDENT end.]

CITIES

Col. Name  CITY_CODE  CITY_NAME
Key Type   PK
NULLs      NN         NN

OTHER_STATES

Col. Name  STATE_ID  STATE_NAME
Key Type   PK
NULLs      NN        NN

FOREIGN_COUNTRIES

Col. Name  COUNTRY_CODE  COUNTRY_NAME
Key Type   PK
NULLs      NN            NN

STUDENTS

Col. Name  ID  NAME  CIT_CITY_CODE  OTH_STATE_ID  FOR_COUNTRY_CODE
Key Type   PK        FK1            FK2           FK3
NULLs      NN  NN


Practice I-2 : Using a Generic Arc Design, develop a table design for the ERD :

[Figure: STUDENT (# * id, * name) is "from" exactly one of CITY (# * city code, * city name), OTHER_STATE (# * state id, * state name), or FOREIGN_COUNTRY (# * country code, * country name); each of these is "the home of" STUDENTs. An arc spans the three relationships at the STUDENT end.]

CITIES

Col. Name  CITY_CODE  CITY_NAME
Key Type   PK
NULLs      NN         NN

OTHER_STATES

Col. Name  STATE_ID  STATE_NAME
Key Type   PK
NULLs      NN        NN

FOREIGN_COUNTRIES

Col. Name  COUNTRY_CODE  COUNTRY_NAME
Key Type   PK
NULLs      NN            NN

STUDENTS

Col. Name  ID  NAME  HOME_CODE  HOME_TYPE
Key Type   PK        FK,UID
NULLs      NN  NN    NN         NN


Implementing Arcs :

[Figure: table C (ID, . . ., FK_A, FK_B) referencing table A (ID, . .) and table B (ID, . .) through an arc]

o When you implement an arc relationship using the Explicit Method, there is a requirement that only one of the FKs in an arc references a valid row; the other FKs must be NULL.
o For example, in the figure above, you can implement this arc relationship using the following CHECK CONSTRAINT on Table C :

CHECK ( ( FK_A IS NOT NULL AND FK_B IS NULL )
     OR ( FK_B IS NOT NULL AND FK_A IS NULL ) )

o Another example : if the REVIEWS Table references either the PUBLICATIONS Table or the CATALOGS Table in an Explicit Arc Implementation, then the following Check Constraint is required on the REVIEWS Table :

CHECK ( ( pub_reference IS NOT NULL AND cat_reference IS NULL )
     OR ( cat_reference IS NOT NULL AND pub_reference IS NULL ) )

[Figure: an arc spanning three foreign keys FK_A, FK_B, and FK_C]

Check Constraint to validate Arcs :

CHECK ( ( FK_A IS NOT NULL AND FK_B IS NULL AND FK_C IS NULL )
     OR ( FK_B IS NOT NULL AND FK_A IS NULL AND FK_C IS NULL )
     OR ( FK_C IS NOT NULL AND FK_A IS NULL AND FK_B IS NULL ) )
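To see the three-way arc constraint in action, the sketch below applies it to the OFFICE_SUITES table of the Explicit Arc Design. It is an illustration only: SQLite (Python's sqlite3 module) stands in for Oracle, the three parent tables are left out for brevity so the FK columns are plain columns here, and the helper rent is a hypothetical name.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE office_suites (
    building_id INTEGER NOT NULL,
    suite_no    INTEGER NOT NULL,
    ind_id      INTEGER,   -- FK1 to INDIVIDUALS (parent table omitted in this sketch)
    par_code    TEXT,      -- FK2 to PARTNERSHIPS (omitted)
    com_num     INTEGER,   -- FK3 to COMPANIES (omitted)
    PRIMARY KEY (building_id, suite_no),
    CHECK ( (ind_id   IS NOT NULL AND par_code IS NULL AND com_num  IS NULL)
         OR (par_code IS NOT NULL AND ind_id   IS NULL AND com_num  IS NULL)
         OR (com_num  IS NOT NULL AND ind_id   IS NULL AND par_code IS NULL) ))""")

def rent(building_id, suite_no, ind_id=None, par_code=None, com_num=None):
    """Hypothetical helper: True if the row satisfies the arc constraint."""
    try:
        con.execute("INSERT INTO office_suites VALUES (?, ?, ?, ?, ?)",
                    (building_id, suite_no, ind_id, par_code, com_num))
        return True
    except sqlite3.IntegrityError:
        return False

print(rent(1024, 101, ind_id=30045))                # exactly one FK set: accepted
print(rent(512, 210, par_code="A4431"))             # exactly one FK set: accepted
print(rent(799, 144, ind_id=54532, com_num=10844))  # two FKs set: rejected
print(rent(3041, 510))                              # no FK set: rejected
```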

Hint: You need Generic Arc method in case there is a composite key in the detail entity with one of the other master entities


Relationship Types :

One-to-One (1:1) :

o Have a degree of one and only one in both directions.
o Are rare.

Example 11 :
[Figure: BICYCLE and CYCLIST, with the relationship labels "ridden by" and "the rider of"]

Many-to-One (M:1 or 1:M) :

o Have a degree of one or more in one direction and a degree of one and only one in the other direction.
o Are very common.

Example 12 :

[Figure: CUSTOMER and SALES REPRESENTATIVE, with the relationship labels "visited by" and "assigned to"]

[Figure: COPY and TITLE, with the relationship labels "has" and "available as"]


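[Figure: EMPLOYEE (# * id, * last_name, o first_name) and JOB (# * code); the relationship names are left as questions (? ?)]

[Figure: ITEM (# * id, o price, o quantity) "in" / "made up of" ORDER (# * id, * date ordered, o date shipped); ORDER "taken by" / "the sales Rep. for" EMPLOYEE (# * id, * last name, o first name)]

[Figure: a hierarchy of TEAM, DEPARTMENT, DIVISION, and COMPANY, each level "within" the next and each higher level "made up of" the level below]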


Many-to-Many Relationship (M:M) :


o Have a degree of one or more in both directions.
o Are resolved with an intersection (or resolving) entity. See later.

Example 13 :
[Figure: PATIENT and DOCTOR, with the relationship labels "examined by" and "responsible for"]


Normalization Terminology :

Redundancy & Duplication in Database :


Consider the following table :

S#   P#   DESC
~~~  ~~~  ~~~~~
S2   P1   bolt
S7   P1   bolt
S2   P4   nut
S5   P1   bolt

Mathematically : If bolt, for example, had been omitted from the 4th record, it could be deduced from the 1st or the 2nd record. This is what we call REDUNDANCY.

Database : We notice that bolt is repeated in many rows of the same column, DESC. This is duplication (repetition). To eliminate this REDUNDANCY (repeated items), split the above table into the following two tables :

S#   P#   QTY
~~~  ~~~  ~~~~~
S2   P1   . . .
S7   P1   . . .
S2   P4   . . .
S5   P1   . . .

P#   DESC
~~~  ~~~~~
P1   bolt
P4   nut
..   ..
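The split loses no information: joining the two tables on P# gives back the original rows, while each description is now stored only once. The following sketch checks this, assuming SQLite (Python's sqlite3 module) and the hypothetical names shipment, supply, part, sno, pno, and descr (DESC is a reserved word in SQL). QTY is left out because its values are not given.

```python
import sqlite3

original_rows = [("S2", "P1", "bolt"), ("S7", "P1", "bolt"),
                 ("S2", "P4", "nut"), ("S5", "P1", "bolt")]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE shipment (sno TEXT, pno TEXT, descr TEXT)")
con.executemany("INSERT INTO shipment VALUES (?, ?, ?)", original_rows)

# Decompose: one table of (supplier, part) pairs, one table describing each part once.
con.execute("CREATE TABLE supply AS SELECT sno, pno FROM shipment")
con.execute("CREATE TABLE part AS SELECT DISTINCT pno, descr FROM shipment")

parts = con.execute("SELECT pno, descr FROM part ORDER BY pno").fetchall()

# Rejoin on the shared column to recover the original table.
rejoined = con.execute(
    "SELECT s.sno, s.pno, p.descr FROM supply s JOIN part p ON p.pno = s.pno"
).fetchall()

print(parts)                                      # 'bolt' is stored once
print(sorted(rejoined) == sorted(original_rows))  # nothing was lost
```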


Normal Form :
Normalization comes from the work of Dr. Ted Codd, an Oxford-educated mathematician, in the early 1970s. He defined a series of normal forms through which data should pass to create a set of relational tables. Academically, there are a considerable number of Normal Forms. Practically, three are sufficient (1NF, 2NF, 3NF). There is an extension to the 3NF called BCNF (or 3.5 NF) which can also be considered.

Definitions : Data Group :


Within each normal form, the data is divided into groups according to the rule being applied. These are neither entities nor tables.

Repeating Group :
When a piece of data can have more than one value for a given value of the key, it is said to repeat. A repeating group is a group of data for which there can be more than one value for a given value of the key. For example, the following table has a Repeating Group.

Student#  Course#
~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1         SA, SD, Oracle, . . .
2         DB, MATH1, OR, Oracle, . . .
3         . . .
.         . . .


Dependency :
When a data item is said to be dependent on another, it means that the data has no meaning without the other, and therefore cannot be accessed without the determinant.

Determinant :
The determinant is the item of data upon which something depends : Y -> X (i.e., Y determines X).
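Whether one item determines another can be checked against sample data: Y determines X if no value of Y ever appears with two different values of X. The function below is a small sketch of that test (the name determines is hypothetical), applied to the S# / P# / DESC table from the redundancy example. Sample data can only refute a dependency; it cannot prove that the rule holds for all future data.

```python
def determines(rows, determinant, dependent):
    """Return True if each value of `determinant` maps to one value of `dependent`."""
    seen = {}
    for row in rows:
        key = row[determinant]
        if key in seen and seen[key] != row[dependent]:
            return False  # same determinant value, two different dependent values
        seen[key] = row[dependent]
    return True

rows = [
    {"S#": "S2", "P#": "P1", "DESC": "bolt"},
    {"S#": "S7", "P#": "P1", "DESC": "bolt"},
    {"S#": "S2", "P#": "P4", "DESC": "nut"},
    {"S#": "S5", "P#": "P1", "DESC": "bolt"},
]

print(determines(rows, "P#", "DESC"))  # P# -> DESC holds in this sample
print(determines(rows, "S#", "DESC"))  # S2 appears with both bolt and nut
```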

Key :
Within normalization, the key is the data that uniquely identifies the group. If you use normalization within ER modeling, keys and UIDs are the same.

Foreign Key (FK) :


A FK is data within one group (a table, for example) which is used as a primary key within another group. If you use normalization within ER modeling, foreign keys and relationships are the same.

Transitive (Inter-Data-Dependencies):
This term is used when data depends upon a key but the key itself depends upon another key. This happens when a data item was not available for normalization but is found later. For example the INSTRUCTOR & his Office.

Hint : Resolving Inter-Data-Dependencies leads to the 3NF.


Reasons For Normalization :


o Allows creating a complete set of related data and then using a set of rules to eliminate redundancy through a series of normal forms.
o The terms 3NF and Relational Data Analysis (RDA) are also used as synonyms for normalization.
o At the end of normalization, you have a set of data groups which may become entities, each with attributes and UIDs, and with foreign keys also defined.
o You can use these data groups to create an ERD.

Rules of Normalization :

Collect and list the raw data (0NF) :

o The possible sources are : forms, reports, screens, interview notes, existing documentation, and file layouts.
o Do not attempt to normalize an entire system at one go.
o Use data names that reflect the nature of the data.
o From the list of data, select a key that has the following properties :
   - It must have a unique value within the thing being normalized.
   - It should be non-textual, if possible.
   - Only invent a data item to use as a key if no practical alternative exists (it is called a surrogate key).


Remove Repeating Groups (1NF) :
o Identify and list all of the data items which may have more than one value for a single value of the key. This is a Repeating Group. Remove it.
o Select a key for the repeating group from within the data that you have extracted (or removed).
o Copy the original key to join the newly selected key and mark it as a foreign key as well as a primary key. This preserves the association between the repeating group and the remaining data.
o List all of the remaining data and the original keys as a group, RELATION, or table.
o Note that a repeating group always has more than one item in its key (Composite Key).

Hint : Within the Repeating Groups, there may be other Repeating sub-Groups. Treat these sub-groups in the same way; i.e., remove the repeated data, select a key for the removed group, and copy the Original Key.
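The 1NF steps above can be sketched as a small function that removes a repeating group and copies the original key into it. The function name `remove_repeating_group`, the field name `Orders`, and the sample record are illustrative assumptions.

```python
def remove_repeating_group(records, key, group_field):
    """Split nested records into a parent group and a repeating (child) group.

    The original key is copied into each child row, where it acts as a
    foreign key and as part of the child's composite primary key.
    """
    parents, children = [], []
    for record in records:
        # Remaining data: everything except the repeating group.
        parents.append({k: v for k, v in record.items() if k != group_field})
        for item in record[group_field]:
            child = {key: record[key]}  # copy the original key
            child.update(item)
            children.append(child)
    return parents, children


# Hypothetical unnormalized record with a repeating group of orders.
unnormalized = [
    {
        "CustName": "Jon Doe",
        "CustAddress": "Heliopolis",
        "Orders": [
            {"OrdNum": 12345, "ProdNum": "A345"},
            {"OrdNum": 12346, "ProdNum": "B345"},
        ],
    },
]

customers, orders = remove_repeating_group(unnormalized, "CustName", "Orders")
print(customers)
print(orders)
```

In this sketch the child group's composite key would be (CustName, OrdNum), matching the rule that a repeating group always has more than one item in its key.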

Remove Part-Key-Dependencies (2NF) :
o This rule is only applicable to groups of data with a multipart key (Composite Key).
o Identify and remove all of the data items which are dependent upon only a part of the key.
o Copy the part of the key upon which the data is dependent to join the removed group and mark it as a foreign key as well as a PK. This preserves the association between the groups.
o List all of the remaining data and the original keys as a group, RELATION, or a table.
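One way to look for part-key dependencies is to test each non-key item against every proper subset of the composite key. The sketch below does this on sample rows; the function names and the rows are assumptions for illustration, and a dependency found in sample data is only a candidate that still has to be confirmed against the business rules.

```python
from itertools import combinations


def determines(rows, determinant_cols, dependent_col):
    """Return True if the determinant columns map to a single dependent value."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in determinant_cols)
        if seen.setdefault(key, row[dependent_col]) != row[dependent_col]:
            return False
    return True


def part_key_dependencies(rows, key_cols, attr_cols):
    """Map each attribute to the proper subsets of the key that determine it."""
    found = {}
    for attr in attr_cols:
        for size in range(1, len(key_cols)):  # proper subsets only
            for part in combinations(key_cols, size):
                if determines(rows, part, attr):
                    found.setdefault(attr, []).append(part)
    return found


# Hypothetical rows with the composite key (EMPNO, PROJNO).
rows = [
    {"EMPNO": 7902, "PROJNO": 15, "PROJNAME": "FEASIBILITY", "HW": 100},
    {"EMPNO": 7902, "PROJNO": 35, "PROJNAME": "TESTING", "HW": 100},
    {"EMPNO": 7899, "PROJNO": 15, "PROJNAME": "FEASIBILITY", "HW": 200},
    {"EMPNO": 7899, "PROJNO": 25, "PROJNAME": "ANALYSIS", "HW": 250},
]

violations = part_key_dependencies(rows, ["EMPNO", "PROJNO"], ["PROJNAME", "HW"])
print(violations)  # {'PROJNAME': [('PROJNO',)]}
```

Here PROJNAME depends only on PROJNO, so it would be removed into its own group keyed by PROJNO, while HW stays with the full key.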

Remove Inter-Data-Dependencies (3NF) :
o List and remove all the data items which are dependent upon another data item which is not the key.
o Copy the data item that the others depend upon and mark that item as a key. Mark it as a foreign key in the original group. Unlike in 1NF and 2NF, this foreign key is not also a primary key.
o Repeat this for all possible data combinations.
o List all of the remaining data and the original keys as a group or RELATION.
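The 3NF step can be sketched as a split on a non-key determinant: the dependent items move to a new group keyed by the determinant, and the determinant stays behind as a foreign key. The function name `split_on_dependency` and the sample rows are illustrative assumptions.

```python
def split_on_dependency(rows, determinant, dependents):
    """Move `dependents` into a lookup group keyed by `determinant`.

    Returns (remaining_rows, lookup). The determinant stays in the
    remaining rows, where it acts as a foreign key.
    """
    lookup = {}
    remaining = []
    for row in rows:
        values = {d: row[d] for d in dependents}
        previous = lookup.setdefault(row[determinant], values)
        if previous != values:
            raise ValueError(f"{determinant} does not determine {dependents}")
        remaining.append({k: v for k, v in row.items() if k not in dependents})
    return remaining, lookup


# Hypothetical rows: DNAME depends on DEPTNO, which is not the key (EMPNO).
rows = [
    {"EMPNO": 7902, "ENAME": "SMITH", "DEPTNO": 10, "DNAME": "SALES"},
    {"EMPNO": 7899, "ENAME": "JONES", "DEPTNO": 20, "DNAME": "MARKET"},
    {"EMPNO": 7562, "ENAME": "SMITH", "DEPTNO": 10, "DNAME": "SALES"},
]

emp, dept = split_on_dependency(rows, "DEPTNO", ["DNAME"])
print(emp)
print(dept)  # {10: {'DNAME': 'SALES'}, 20: {'DNAME': 'MARKET'}}
```

After the split, the department name is stored once per department instead of once per employee, which is the redundancy that 3NF removes.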

BCNF :
o BCNF is most often found where a data grouping has a multipart key but no data. It is often referred to as resolving a Key-Only RELATION.

Test and Identify Transitive Dependencies


o Does each item of data depend upon the key, the whole key and nothing but the key?

Optimize
o Join together groups of data where the key is identical.
o Check that all foreign keys are marked. Remember that a foreign key is a data item which appears as a primary key in another group.
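Joining groups with an identical key can be sketched as a merge keyed on the (sorted) key items. The function name `merge_groups_by_key` and the sample groups are assumptions for illustration.

```python
def merge_groups_by_key(groups):
    """Merge data groups that share the same key.

    `groups` is a list of (key_items, attribute_items) pairs. Key order is
    ignored, so ("A", "B") and ("B", "A") count as the same key.
    """
    merged = {}
    for key, attributes in groups:
        bucket = merged.setdefault(tuple(sorted(key)), [])
        for attribute in attributes:
            if attribute not in bucket:  # avoid listing an attribute twice
                bucket.append(attribute)
    return merged


# Hypothetical groups left over after 3NF; two of them are keyed by OrdNum.
groups = [
    (["OrdNum"], ["QtyOrdered", "DateOrdered"]),
    (["ProdNum"], ["ProdDesc"]),
    (["OrdNum"], ["CustNum", "ProdNum"]),
]

merged = merge_groups_by_key(groups)
print(merged)
```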

Retest.


Practice I-3 :

Consider the following set of Data :

Customer    Customer     Order    Product  Product      Qty      Date
Name        Address      Number   Number   Description  Ordered  Ordered

Jon Doe     Heliopolis   12345    A345     BOWLS        6        3/4/99
                         12346    B345     BOLTS        8        3/5/99
                         12347    . . .    . . .        .        . . .
                         12348    . . .    . . .        .        . . .
                         12349    . . .    . . .        .        . . .

John Ali    Nasr City    23456    X123     . . .        .        . . .
                         23457    Y123     . . .        .        . . .
                         23458    Z123     . . .        .        . . .

Smith Joe   Maadi        12345    D123     . . .        .        . . .
                         12346    E123     . . .        .        . . .
                         12347    F123     . . .        .        . . .


0NF
# CustName, CustAddress, OrdNum, ProdNum, ProdDesc, QtyOrdered, DateOrdered

1NF
Group 1 : # CustName, CustAddress
Group 2 : # CustName FK, # OrdNum, ProdNum, ProdDesc, QtyOrdered, DateOrdered

2NF
Group 1 : # CustName, CustAddress
Group 2 : # CustName FK, # OrdNum, ProdNum, ProdDesc
Group 3 : # OrdNum FK, QtyOrdered, DateOrdered

3NF
Group 1 : # CustName, CustAddress
Group 2 : # CustName FK, # OrdNum, ProdNum FK
Group 3 : # OrdNum FK, QtyOrdered, DateOrdered
Group 4 : # ProdNum, ProdDesc

BCNF / Optimize
Group 1 : # CustNum, CustName, CustAddress
Group 2 : # OrdNum, # CustNum FK, ProdNum FK, QtyOrdered, DateOrdered
Group 3 : # ProdNum, ProdDesc

BCNF :

The 2nd group under 3NF (# CustName FK, # OrdNum, ProdNum FK) is a group that has a multipart key but no data. Therefore, it can be distributed among other groups.

RELATIONs
0NF : Collect Raw Data and select a Key (CustName).
1NF :
o Remove Repeating Group.
o Select a Key for the Repeating Group (OrdNum).
o Copy the original key (CustName) to join the newly selected key and mark it as a FK as well as a part of the PK.

2NF :
o Remove Part-Key-Dependencies.
o Copy the part of the key upon which the data is dependent (OrdNum) and mark it as FK as well as a PK.

3NF :
o Remove Inter-Data-Dependencies (Transitive).
o List all data items which are dependent upon other non-key data items (ProdDesc).
o Copy the data item that the others depend on (ProdNum) and mark it as PK.
o Mark it as FK in the Original Group. Unlike in 1NF and 2NF, this FK is not a PK in the Original Group.


Draw and use the model.

o Groups become entities.
o Data becomes attributes.
o Foreign Keys become relationships.
o Keys become UIDs.

CUSTOMER
# CustNum CustName CustAddress

ORDER
# OrdNum QtyOrdered DateOrdered

PRODUCT
# ProdNum ProdDesc
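As a rough sketch of how this model could be turned into tables, the example below creates the three entities with Python's standard `sqlite3` module (the course itself uses Oracle). Several details are assumptions: the table is named `orders` because ORDER is a reserved word in SQL, OrdNum alone is taken as the key of ORDER, the relationships are implemented as the foreign keys CustNum and ProdNum, and the customer number 1 is an invented surrogate value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.executescript("""
CREATE TABLE customer (
    CustNum     INTEGER PRIMARY KEY,
    CustName    TEXT NOT NULL,
    CustAddress TEXT
);
CREATE TABLE product (
    ProdNum  TEXT PRIMARY KEY,
    ProdDesc TEXT
);
CREATE TABLE orders (
    OrdNum      INTEGER PRIMARY KEY,
    CustNum     INTEGER NOT NULL REFERENCES customer(CustNum),
    ProdNum     TEXT    NOT NULL REFERENCES product(ProdNum),
    QtyOrdered  INTEGER,
    DateOrdered TEXT
);
""")

conn.execute("INSERT INTO customer VALUES (1, 'Jon Doe', 'Heliopolis')")
conn.execute("INSERT INTO product VALUES ('A345', 'BOWLS')")
conn.execute("INSERT INTO orders VALUES (12345, 1, 'A345', 6, '3/4/99')")

# Rebuild the original unnormalized row by following the foreign keys.
report = conn.execute(
    "SELECT c.CustName, p.ProdDesc, o.QtyOrdered "
    "FROM orders o "
    "JOIN customer c ON c.CustNum = o.CustNum "
    "JOIN product p ON p.ProdNum = o.ProdNum"
).fetchall()
print(report)  # [('Jon Doe', 'BOWLS', 6)]
```

The join shows that nothing was lost by normalizing: the original raw data can be reconstructed from the three groups.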

In Summary,
0NF : Collection and listing of Raw Data.
1NF : When Repeating Groups are removed from the tables (All attributes must be single-valued).
2NF : When Part-Key-Dependencies are removed from the tables (An attribute must depend upon its entity's entire UID).
3NF : When Inter-Data-Dependencies are removed from the tables (No non-UID attribute can be dependent upon another non-UID attribute).
BCNF : When Inter-Key-Dependencies are removed from the tables.


Practice I-4: Put the following data into 1NF, 2NF, 3NF, and BCNF.

Emp   EName  Dept  Dept    MGR   MGR     Proj  Proj         Hire       HW
No           No    Name    No    Name    No    Name         Date

7902  SMITH  10    SALES   7988  JONES   15    FEASIBILITY  01-SEP-95  100
                                         35    TESTING      01-OCT-95  100
                                         45    HANDOVER     01-NOV-95  150

7899  JONES  20    MARKET  7699  WALKER  15    FEASIBILITY  01-AUG-95  200
                                         25    ANALYSIS     01-SEP-95  250
                                         45    HANDOVER     01-OCT-95  200

7562  SMITH  10    SALES   7099          25    ANALYSIS     01-MAY-95  150


Practice I-4 : Solution . . .

0NF
# EMPNO, ENAME, DEPTNO, DNAME, MGRNO, MGRNAME, PROJNO, PROJNAME, HIREDATE, HW

1NF
Group 1 : # EMPNO, ENAME, DEPTNO, DNAME, MGRNO, MGRNAME
Group 2 : # EMPNO FK, # PROJNO, PROJNAME, HIREDATE, HW

2NF
Group 1 : # EMPNO, ENAME, DEPTNO, DNAME, MGRNO, MGRNAME
Group 2 : # EMPNO FK, # PROJNO, HIREDATE, HW
Group 3 : # PROJNO FK, PROJNAME

3NF
Group 1 : # EMPNO, ENAME, DEPTNO FK, MGRNO FK
Group 2 : # DEPTNO, DNAME
Group 3 : # MGRNO, MGRNAME
Group 4 : # EMPNO FK, # PROJNO, HIREDATE, HW
Group 5 : # PROJNO FK, PROJNAME

Optimized
Group 1 : # EMPNO, ENAME, DEPTNO FK, MGRNO FK
Group 2 : # DEPTNO, DNAME
Group 3 : # EMPNO FK, # PROJNO, HIREDATE, HW
Group 4 : # PROJNO FK, PROJNAME

0NF : Collect Raw Data and select a Key (EMPNO).

1NF :
o Remove Repeating-Group.
o Select a Key (PROJNO) for the Repeating-Group.
o Copy the original key (EMPNO) to join the newly selected key and mark it as a FK as well as a part of the PK.


2NF :
o Remove Part-Key-Dependencies.
o Copy the part of the key upon which the data is dependent (PROJNO) and mark it as FK as well as a PK in the Original Group.

3NF :
o Remove Inter-Data-Dependencies (Transitive).
o List all data items which are dependent upon other non-key data items.
o Copy the data items that the others depend on and mark them as PK.
o Mark them as FK in the Original Group. Unlike in 1NF and 2NF, these FKs are not also PKs in the Original Group.


Practice I-5 :

Create a basic ERD from the Normalized Groups of Data obtained in Practice I-4.

Practice I-5 : Solution . . .


EMP
# empno ename mgr

ASSIGNMENT
hiredate hw

PROJECT
# projno projname

DEPT
# deptno dname

Hint :
o Notice the Composite Primary Key of the Entity ASSIGNMENT, which comes from the relationships (it is also a Resolving Entity).
o Notice also the Self-Joining of the Entity EMP.
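The two points in the hint can be tried out with a small sketch using Python's standard `sqlite3` module (the course itself uses Oracle). The column types, the foreign key columns `deptno` and `mgr`, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.executescript("""
CREATE TABLE dept (
    deptno INTEGER PRIMARY KEY,
    dname  TEXT
);
CREATE TABLE emp (
    empno  INTEGER PRIMARY KEY,
    ename  TEXT,
    deptno INTEGER REFERENCES dept(deptno),
    mgr    INTEGER REFERENCES emp(empno)      -- self-join: a manager is an employee
);
CREATE TABLE project (
    projno   INTEGER PRIMARY KEY,
    projname TEXT
);
CREATE TABLE assignment (
    empno    INTEGER REFERENCES emp(empno),
    projno   INTEGER REFERENCES project(projno),
    hiredate TEXT,
    hw       INTEGER,
    PRIMARY KEY (empno, projno)               -- composite key from the relationships
);
INSERT INTO dept VALUES (20, 'MARKET');
INSERT INTO emp VALUES (7699, 'WALKER', 20, NULL);
INSERT INTO emp VALUES (7899, 'JONES', 20, 7699);
INSERT INTO project VALUES (15, 'FEASIBILITY');
INSERT INTO assignment VALUES (7899, 15, '01-AUG-95', 200);
""")

# Self-join: pair each employee with the name of his or her manager.
pairs = conn.execute(
    "SELECT e.ename, m.ename FROM emp e JOIN emp m ON e.mgr = m.empno"
).fetchall()
print(pairs)  # [('JONES', 'WALKER')]

# The composite key allows only one row per (employee, project) pair.
try:
    conn.execute("INSERT INTO assignment VALUES (7899, 15, '01-SEP-95', 250)")
    duplicate = "accepted"
except sqlite3.IntegrityError:
    duplicate = "rejected"
print(duplicate)  # rejected
```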


Practice I-6 :

For the following ERD, evaluate each Entity against the rules of NORMALIZATION. Identify any misplaced Attributes and explain which rule each one violates:

ENROLLMENT
grade code grade description course name

COURSE
# course number course name teacher number teacher name dept number dept name

STUDENT
# student number student name


Practice I-6 : Solution . . .


COURSE Entity :
o This Entity is only in 2NF.
o Dept number and Dept name, Teacher number and Teacher name violate 3NF (Inter-Data-Dependencies) and therefore are misplaced.

Details :
1NF : All Attributes are single-valued for each instance.
2NF : All Attributes are dependent upon the UID (course number). [Assuming that there is only one teacher and one dept per course].
3NF :
o Dept name is dependent upon Dept number (a non-UID). Move both Dept number and Dept name to a separate DEPT Entity with a relationship to COURSE Entity.
o Teacher name is also dependent upon Teacher number (a non-UID). Move both Teacher number and Teacher name to a separate TEACHER Entity with a relationship to COURSE Entity.


ENROLLMENT Entity :
o This Entity is only in 1NF.
o course name violates 2NF (Part-Key-Dependencies) and therefore is misplaced.
o grade code and grade description violate 3NF (Inter-Data-Dependencies) and therefore are misplaced.

Details :
1NF : All Attributes are single-valued for each instance.
2NF :
o All Attributes are dependent upon the UID (course number, student number from the relationships to STUDENT and COURSE Entities).
o However, course name is dependent upon only the course number, which is a subset of the UID; therefore, move it to the COURSE Entity.
3NF :
o grade description is dependent upon grade code (a non-UID). Move both grade code and grade description to a separate GRADE Entity with a relationship to ENROLLMENT Entity.


Practice I-7 : Redraw the ERD of Practice I-6 in 3NF.

Practice I-7 : Solution . . .

ENROLLMENT

STUDENT
# student number student name

GRADE
# grade code grade description

COURSE
# course number course name

TEACHER
# teacher number teacher name

DEPT
# dept number dept name

Relationships (labels as in the diagram) :
o ENROLLMENT / STUDENT : for / the receiver of
o ENROLLMENT / COURSE : for / assigned
o ENROLLMENT / GRADE : completed with / assigned to
o COURSE / TEACHER : taught by / the teacher of
o COURSE / DEPT : offered by / the offerer of
