Sei sulla pagina 1di 90

Database Management Systems

Entity-Relationship Model
Database Design
Why do we need it?
Agree on structure of the database before
deciding on a particular implementation.
Consider issues such as:
What entities to model
How entities are related
What constraints exist in the domain
How to achieve good designs
Database Design Formalisms
1. Object Definition Language (ODL):
Closer in spirit to object-oriented models
2. Entity/Relationship model (E/R):
More relational in nature.

Both can be translated (semi-automatically) to


relational schemas
ODL to OO-schema: direct transformation (C++
or Smalltalk based system).
Purpose of E/R Model
The E/R model allows us to sketch the
design of a database informally.
Designs are pictures called entity-
relationship diagrams.
Fairly mechanical ways to convert E/R
diagrams to real implementations like
relational databases exist.

4
Entity / Relationship Diagrams

Entities Product

Attributes address

Relationships between entities buys


name category
name

price
makes Company
Product

stockprice

buys employs

Person

address name ssn


University Example

A college contains many departments


Each department can offer any number of courses
Many instructors can work in a department
An instructor can work only in one department
For each department there is a Head
An instructor can be head of only one department
Each instructor can take any number of courses
A course can be taken by only one instructor
A student can enroll for any number of courses
Each course can have any number of students
Modeling
A database can be modeled as:
a collection of entities,
relationship among entities.
An entity is an object that exists and is distinguishable from
other objects.
Example: person, company, event, plant

Entities have attributes


Example: people have names and addresses
An entity set is a set of entities of the same type that share
the same properties.
Example: set of all persons, companies, trees, holidays
Entity Sets instructor and student
instructor_ID instructor_name student-ID student_name
ER Case Study
Banks Database
Each bank has a unique name.
Each branch has a number, name, address (number, street, city), and set of
phones.
Customer includes their name, set of address (P.O. Box, city, zip code, country),
set of phones, and social security number.
Accounts have numbers, types (e.g. saving, checking) and balance. Other
branches might use the same designation for accounts. So to name an account
uniquely, we need to give both the branch number to which this account belongs
to and the account number.
Not all bank customers must own accounts and a customer may have at most 5
accounts in the bank.
An account must have only one customer.
A customer may have many accounts in different branches.
Relationship Sets
A relationship is an association among several entities
Example:
44553 (Ahmed) advisor 22222 (Hassan)
student entityrelationship set instructor entity
A relationship set is a mathematical relation among n 2
entities, each taken from entity sets
{(e1, e2, en) | e1 E1, e2 E2, , en En}

where (e1, e2, , en) is a relationship


Example:
(44553,22222) advisor
Relationship Set advisor
Relationship Sets (Cont.)
An attribute can also be property of a relationship set.
For instance, the advisor relationship set between entity sets instructor and student
may have the attribute date which tracks when the student started being associated
with the advisor
What is a Relation ?
A mathematical definition:
if A, B are sets, then a relation R is a subset of
Ax B 1 a
A={1,2,3}, B={a,b,c,d}, 2 b
A=
R = {(1,a), (1,c), (3,b)} c
3
B= d
- makes is a subset of Product x Company:
makes Company
Product
Relationships
Relationships indicate a meaningful connection
between two entity types
Relationships may have attributes,
but they cannot have key attributes.
Identifying relationships connect a weak entity type
to some other entity type
indicates where the weak entity gets a key
to complete its own partial key

WorksOn DependentOf

Relationship Identifying
Relationship
Multiplicity of E/R Relations
one-one:
1 a
2 b
3 c
d
many-one
1 a
2 b
3 c
d
many-many
1 a
2 b
3 c
d
Mapping Cardinalities

One to one One to many

Note: Some elements in A and B may not be mapped to any


elements in the other set
Mapping Cardinalities

Many to one Many to many

Note: Some elements in A and B may not be mapped to any


elements in the other set
One-to-One Relationship
one-to-one relationship between an instructor and a student
an instructor is associated with at most one student via advisor
and a student is associated with at most one instructor via advisor
One-to-Many Relationship
one-to-many relationship between an instructor and a student
an instructor is associated with several (including 0) students via
advisor
a student is associated with at most one instructor via advisor,
Many-to-Many Relationship
An instructor is associated with several (possibly 0)
students via advisor
A student is associated with several (possibly 0)
instructors via advisor
Alternative Notation for Cardinality Limits

Cardinality limits can also express participation constraints


Multi-way Relationships
How do we model a purchase relationship between buyers,
products and stores?

Product

Purchase Store

Person

Can still model as a mathematical set (how ?)


Relational Roles
It is sometimes convenient
to name an entitys role in a relationship.
particularly useful in recursive relationships
removes ambiguity in direction of relationship

Supervision
supervisor supervisee

EMPLOYEE
Roles in Relationships
What if we need an entity set twice in one relationship?

Product

Purchase Store

salesperson buyer

Person
Data Modeling Case Study
The following is description by a pharmacy owner:
Ahmed Hassan catches a cold and what he suspects
is a flu virus. He makes an appointment with his family
doctor who confirm his diagnosis. The doctor
prescribes an antibiotic and nasal decongestant tablets.
Ahmed leaves the doctor's office and drives to his local
drug store. The pharmacist packages the medication
and types the labels for pill bottles.
The label includes information about customer, the
doctor who prescribe the drug, the drug (e.g.,
Penicillin), when to take it, and how often, the content
of the pill (250 mg), the number of refills, expiration
date, and the date of purchase."
Please develop a data model for the entities and
relationships within the context of pharmacy. 26
Attributes
An entity is represented by a set of attributes, that is descriptive
properties possessed by all members of an entity set.
Example:
instructor = (ID, name, street, city, salary )
course= (course_id, title, credits)
Domain the set of permitted values for each attribute
Attribute types:
Simple and composite attributes.
Single-valued and multivalued attributes
Example: multivalued attribute: phone_numbers
Derived attributes
Can be computed from other attributes
Example: age, given date_of_birth
Sets and Derived Attributes
Multivalued attributes double lined oval
multivalued = set valued
that there may be more than one value for the attribute.
Derived attributes dashed line ovals
the attribute is computed from other data

Locations NumEmployees
Multivalued Derived
Attribute Attribute
Composite Attributes
Composite attributes tree
composed of other attributes.
used for a set of related attributes,
when the set is not a conceptual entity
the composite doesnt have identity it doesnt have a key

Address

City
ZipCode
Street State
Composite
Attribute
Composite Attributes
Attributes on Relationships

date
Product

Purchase Store

Person
Converting Multi-way
Relationships to Binary
date
ProductOf Product

Purchase
StoreOf Store

BuyerOf Person
ER Case Study
Television Series Database

A Television network wishes to create a database to keep track of its TV


series. A television series has one or more episode. Television series
identified by name and season number, and includes their production
company name and Num_of_Episodes ( i.e. total number of episodes in a
specific season of a series ).

Episode of a specific season of a series is identified by episode number


and has a title and a length. No episode can exist without a corresponding
television series. Also each episode has only one writer. A writer is
identified by name, and also has birth date and a literary agency that
represents him or her.

An actor appears as a performer in a television series or a guest star on an


episode. An actor is identified by name and also has a nationality and birth
date. An actor plays a particular character in a television series or episode.
Keys in E/R Diagrams
Every entity must have a key

name category

price

Product
Attributes and Keys
Key attributes must be unique for each entity
Keys are used to identify particular entities
Partial keys are only partially unique
used for weak entity types

Age SSN Date

Attribute Key Attribute Partial Key


Attribute
Keys
A super key of an entity set is a set of one
or more attributes whose values uniquely
determine each entity.
A candidate key of an entity set is a
minimal super key
ID is candidate key of instructor
course_id is candidate key of course
Although several candidate keys may exist,
one of the candidate keys is selected to be
the primary key.
Keys in E/R Diagrams
name category
Underline:

price

No formal way Product


to specify multiple
keys in E/R diagrams
Person

address name ssn


Entity With Composite, Multivalued, and Derived Attributes
E-R Diagrams

Rectangles represent entity sets.


Diamonds represent relationship sets.
Attributes listed inside entity rectangle
Underline indicates primary key attributes
name category
name

price
makes Company
Product

stockprice

buys employs

Person

address name ssn


Relationship Sets with Attributes
Entity Set to Relation
name category

price

Product

Product(name, category, price)

name category price

gizmo gadgets $19.99


Classroom Design Exercise
Imagine we are creating a database for a dorm, which
includes a cooperative kitchen.
We want to record certain information about each
resident. What?
Not all residents belong to the kitchen coop. Those that do
interact in various ways:
1. They take turns at various jobs: preparer, cleanup, buyer (for
supplies). No one should have two jobs on one day.
2. They may or may not be vegetarian. Each meal must have at
least one vegetarian entre.
3. They pay fees to the coop.
For each meal, there is a menu. Each menu item requires
certain ingredients, which must be on hand.
Constraints in E/R Diagrams
Finding constraints is part of the modeling process.
Commonly used constraints:

Keys: social security number uniquely identifies a person.

Single-value constraints: a person can have only one father.

Referential integrity constraints: if you work for a company, it


must exist in the database.

Other constraints: peoples ages are between 0 and 150.


Other Constraints

<100
Product makes Company

What does this mean ?


Entity Types
Entity types boxes
Weak entity type double box

EMPLOYEE DEPENDENT

Entity Weak Entity


Weak Entity Sets
Entity sets are weak when their key comes from other
classes to which they are related.

Team affiliation University

sport number name


Weak Entity Sets
Sometimes an E.S. E s key comes not (completely) from its own
attributes, but from the keys of one or more E.S.s to which E is
linked by a supporting many-one relationship.
Called a weak E.S.
Represented by putting double rectangle around E and a double
diamond around each supporting relationship.
Many-one-ness of supporting relationship (includes 1-1)
essential.
With many-many, we wouldn't know which entity provided the key value.
Exactly one also essential, or else we might not be able to
extract key attributes by following the supporting relationship.
Example: Logins (Email Addresses)
Login name = user name + host name, e.g., ark@soe.ucsc.edu.
A login entity corresponds to a user name on a particular host,
but the passwd table doesnt record the host, just the user name,
e.g., ark.
Key for a login = the user name at the host (which is unique for
that host only) + the IP address of the host (which is unique
globally). name name

Logins @ Hosts
Example Schema
DEPENDENT Age
DependentOf

EID
Name EMPLOYEE
Phone

Name
PROJECT
WorksOn Budget

StartDate
Design Principles
Setting: client has (possibly vague) idea of what he/she wants. You
must design a database that represents these thoughts and only
these thoughts.
Avoid redundancy
= saying the same thing more than once.
Wastes space and encourages inconsistency.

Example
Good:
name name addr

product ManfBy Manfs


Example
Bad: repeats manufacturer address for each beer
they manufacture.
name manf

product Manf
addr

Bad: manufacturers name said twice.


name manf name addr

product ManfBy Manfs


Entity Sets Vs. Attributes
You may be unsure which concepts are worthy of being entity
sets, and which are handled more simply as attributes.
Especially tricky for the class design project, since there is a
temptation to create needless entity sets to make project
larger.
name name
Wrong:
Cars ManfBy Manfs

Right: name manf

Cars
Intuitive Rule for E.S. Vs. Attribute

Make an entity set only if it either:


1. Is more than a name of something; i.e., it
has nonkey attributes or relationships with
a number of different entity sets, or
2. Is the many in a many-one relationship.
Don't Overuse Weak E.S.
There is a tendency to feel that no E.S. has its entities
uniquely determined without following some
relationships.
However, in practice, we almost always create unique
ID's to compensate: social-security numbers, VIN's, etc.
The only times weak E.S.'s seem necessary are when:
a) We can't easily create such ID's; e.g., no one is going to accept
a species ID as part of the standard nomenclature (species is
a weak E.S. supported by membership in a genus).
b) There is no global authority to create them, e.g., crews and
studios.
Notation Summary
Equivalent Schema defined
Design an ER schema for the following
enterprise:
University Example

A college contains many departments


Each department can offer any number of courses
Many instructors can work in a department
An instructor can work only in one department
For each department there is a Head
An instructor can be head of only one department
Each instructor can take any number of courses
A course can be taken by only one instructor
A student can enroll for any number of courses
Each course can have any number of students
Steps in ER Modeling
Step 1: Identify the Entities

DEPARTMENT

STUDENT

COURSE

INSTRUCTOR
Steps in ER Modeling
Step 2: Find the relationships
One course is enrolled by multiple students and one student enrolls for
multiple courses, hence the cardinality between course and student is
Many to Many.
The department offers many courses and each course belongs to only
one department, hence the cardinality between department and course
is One to Many.
One department has multiple instructors and one instructor belongs to
one and only one department , hence the cardinality between
department and instructor is one to Many.
Each department there is a Head of department and one instructor is
Head of department ,hence the cardinality is one to one .
One course is taught by only one instructor, but the instructor teaches
many courses, hence the cardinality between course and instructor is
many to one.
Steps in ER Modeling
Step 3: Identify the key attributes

Deptname is the key attribute for the Entity Department, as it identifies the
Department uniquely.
Course# (CourseId) is the key attribute for Course Entity.
Student# (Student Number) is the key attribute for Student Entity.
Instructor Name is the key attribute for Instructor Entity.

Step 4: Identify other relevant attributes

For the department entity, the relevant attribute is location


For course entity, course name, duration, prerequisite
For instructor entity, room#, telephone#
For student entity, student name, date of birth
Steps in ER Modeling
Step 5:

Draw complete E-R diagram with all


attributes including Primary Key
E-R Diagram for a University Enterprise
Case Study 2
Design a DB representing cities, counties, and states in the US:
For states, record the name, population, and state capital (a city).
For counties, record the name, the population, and the located
state.
For cities, record the name, the population, the located state and
the located county.
Uniqueness assumptions:
Names of states are unique.
Names of counties are unique within a state (e.g., 26 states have
Washington Counties).
Cities are unique only within a state (e.g., there are 24 Springfields
among the 50 states).
Some counties and cities have the same name, even within a state
(e.g., Los Angeles).
All cities are located within a single county 69
Design 1: bad

Co. Popu. Co. name


Popu.

cities Located states name


Ci. Popu.

Ci. name capita


l

Problem: County Population is repeated for each city.

70
Design 2: good

Co. Popu. Co. name Popu. name

counties Located states

Belongs-to

cities capitals

Ci. Popu. Ci. name

71
Case Study 3
Design a DB consistent with the following facts.
Trains are either local trains or express trains, but never
both.
A train has a unique number and an engineer.
Stations are either express stops or local stops, but never
both.
A station has a unique name and an address.
All local trains stop at all stations.
Express trains stop only at express stations.
For each train and each station the train stops at, there is
a time.
72
Design 1: bad

number type name


time
addr
trains StopsAt stations
engineer type

Problem: does not capture the constraints that express trains only stop only at
express stations and local trains stop at all local stations
73
Design 2: good
Lname
address

local stations
Lnumber engineer time

local trains StopsAt2


Enumber engineer

express trains
time

StopsAt1 express stations

Ename
address
Case Study 4 (Pine Valley Furniture Company
1. The company sells a number of different furniture
products.
These products are grouped into several product lines.
The identifier for a product is Product_ID, while the
identifier for a product line is Product_Line_ID.
Referring to the customer invoice, we identify the
following additional attributes for product:
Product_Description, Product_Finish, and Unit_Price.
Another attribute for product line is Product_Line_Name.
A product line may group any number of products, but
must group at least one product.
Each product must belong to exactly one product line.
Case Study 4
2. Customers submit orders for products The
identifier for an order is Order-ID, and another
attribute is Order_Date.
A customer may submit any number of orders, but
need not submit any orders.
Each order is submitted by exactly one customer.
The identifier for a customer is Customer_ID.
Other attributes include Customer_Name and
Customer_Address.
Case Study4
3. A given customer order must request at
least one product.
Any product sold by Pine Valley Furniture
may not be requested on any order, or may be
requested on one or more orders.
An attribute associated with each order and
product is Quantity, which is the number of
units requested.
Case Study4
4. Pine Valley Furniture has established sales
territories for its customers.
Each customer does business in one or more of
these sales territories.
The identifier for a sales territory is
Territory_ID.
A sales territory may have any number of
customers, or may not have any customers
doing business.
Case Study4
5. Pine Valley Furniture Company has several
salespersons. The identifier for a salesperson is
Salesperson_ID.
Other attributes include Salesperson_Name,
Salesperson_Telephone, and Salesperson_Fax.
A salesperson serves exactly one sales territory.
Each sales territory is served by one or more
salespersons.
Case Study4
6. Each product is assembled from one or
more raw materials.
The identifier for the raw material entity is
Material_ID.
Other attributes include Unit_of_Measure and
Unit_Price.
Each raw material may be assembled into one
or more products.
Case Study4
7. Raw materials are supplied by vendors.
The identifier for a vendor is Vendor_ID.
Other attributes include Vendor_Name and
Vendor_Address.
Each raw material can be supplied by one or more
vendors.
A vendor may supply any number of raw materials, or
may not supply any raw materials to Pine Valley
Furniture.
An attribute of the relationship between vendor and
raw material is Unit_Price
Case Study4
8. Pine Valley Furniture has established a
number of work centers.
The identifier for a work center is
Work_Center_ID.
Another attribute is Location. Each product is
produced in one or more work centers.
A work center may be used to produce any
number of products, or may not be used to
produce any products.
Case Study4
9. The company has over 100 employees.
The identifier for employee is Employee_ID.
Other attributes are Employee_Name,
EmployeeAddress, and Skill.
An employee may have more than one skill.
And Each skill can be mastered by many
employees or none.
Case Study4
10: Each employee works in one or more
work centers.
A work center must have at least one
employee working in that center, but may
have any number of employees.
Case Study4
11. Each employee has exactly one
supervisor.
An employee who is a supervisor may
supervise any number of employees, but not
all employees are supervisors.
Library Case Study
When a library first receives a book from a publisher it is sent,
together with the accompanying delivery note, to the library
desk. Here the delivery note is checked against a file of books
ordered.
If no order can be found to match the note, a letter of enquiry is
sent to the publishers. If a matching order is found, a catalogue
note is prepared from the details on the validated delivery note.
The catalogue note, together with the book, is sent to the
registration department. The validated delivery note is sent to
the accounts department where it is stored.
On receipt of an invoice from the publisher, the accounts
department checks its store of delivery notes. If the
corresponding delivery note is found then an instruction to pay
the publishers is made, and subsequently a cheque is sent. If no
corresponding delivery note is found, the invoice is stored in a
pending file.
A Case Study
Conference centre booking system

A conference centre takes bookings from clients who wish to hold courses or conferences at
the
centre. When clients make bookings they specify how many people are included in the
booking,
and of these, how many will be resident during the booking, and how many will require
catered or
non-catered accommodation at the centre.
The centre contains a number of facilities which may be required by clients making bookings
as
follows:
A. There are 400 bedrooms for clients who will be resident during the Course or
conference.
B. A maximum of 250 catered people can be handled at any one time.
C. Six main lecture theatres providing seating for 200 people.
D. Twenty seminar rooms each able to accommodate 25 people.
E. Video conference facilities. The video conference facilities consist of four separate
video conference networks. Each video conference network has a large screen
based
in one of the main lecture theatres, along with 3 satellite screens each of which is
based in one of the seminar rooms.
Shipping company example
The London and Ireland Shipping Company PLC (LISC) was founded in
1852 and owns a fleet of cargo ships. The company had historically run
passenger liners, but recent policy decisions involved the sale of all
passenger-carrying vessels. The company currently has 14 vessels,
including one oil tanker and one tugboat operating out of Liverpool.
Most of the vessels are registered in Liberia for tax reasons.
Each ship has one or more holds divided into spaces. The holds are
defined by steel bulkheads and the spaces are defined by shelf racks or
other physical dividers. Sister ships, built by the same shipbuilders and
to the same designs have similar names, such as Pride of Ireland, Queen
of Ireland, Song of Ireland and Warrior of Ireland. Sister ships also have
identical cargo storage facilities.
LISC issues contracts to agents for one or more manifests (lists of cargo
items to be shipped). LISC's charges for cargo carried are based on the
number of spaces the cargo requires for storage. The types of cargo
typically carried by LISC include grain, coal and ores (carried only in
ships equipped with bulk cargo holds). They also transport sacked grain,
heavy cases, containers (which may be carried on deck), pallets and so
on.
Shipping company example
Cargo items may take up less than one space in a hold, or one or more
spaces, depending on the size of the item. A space may therefore contain
several small cargo items.
The ships owned by LISC are kept as busy and as full as possible, in order
to maximise the profits that each vessel makes and minimise running &
operating costs. LISC's ships ply most of the seas of the world, but tend to
operate mainly in the Mediterranean, the North and Mid Atlantic and the
Indian Ocean. Different ships require different crew complements.
LISC intends to create a computer based information system that will be
able to perform the following tasks:
record the voyages of each ship with the start and end ports.
record the cargo held by a ship on each voyage
keep records of their employees and the ships they are assigned to
producing invoices for agents and customers
keep a record of customers' payments on invoices
analyse the efficiency of use of cargo space and of percentage wasted
cargo space for ships voyages
Film Club Case Study:
Film Club UK is a company that owns or leases a number of
small cinemas in the UK. They have commissioned a database
designer to design a database solution to enable them to
maintain details about their cinemas and the films that they
show. Note that it is possible to have two cinemas in the same
location with the same name (there used to be two Odeons in
Newcastle). It is also possible to have different films with the
same title (for example, different versions of a Shakespeare
play).

Films are scheduled for one or more showings at a cinema


within a season. Season details are to be notified in advance
of the dates and times of showings, takings, etc. to be notified
later.

Any one film may have more than one season at any one
cinema (for example, a cinema showing The Snowman each
Christmas). At present, all cinemas are single-screen.

Potrebbero piacerti anche