
Lecture Notes on Database

Chapter Three

DATABASE DESIGN

Overview
Defining Goals and Objectives
Analyzing the Existing/Current System
Identifying Entities
Identifying Attributes
Identifying Entities and Relationships
Determining and Defining Business Rules
Creating Data Structures/Tables
Defining Relationships
Classifying Relationships
Determining and Defining Views
Reviewing the Design
Testing
Entity Relationship Notations
Naming Entities and Attributes
Create the Entity-Relationship Diagram (First Draft)
Refining the Entity-Relationship Diagram

Terrence Brunton

Defining Goals and Objectives


Every database is created for a specific purpose, whether it's to solve a specific business
problem, to manage the daily transactions of a business or organization, or to be used as
part of an information system. By identifying the goals and objectives of the database you
will ensure that the appropriate design is created and the appropriate data is collected to
support the intended purpose of the database.
It may be helpful to distinguish between goals and objectives. The goals can be thought
of as the bigger picture, more strategic in terms of the scope of the project. These goals
are critical in terms of fitting the database into the overall information system used in the
organization as a data store. For example, the database may be located within a data flow
diagram that depicts a range of processes within an organization.
The top management must therefore engage in discussion with the Database Analyst to
determine the goals. These goals must be clearly documented by the Analyst and
approved by the client and management.
Objectives are more in the realm of the instrumental implementation of the database,
related to the intermediate steps that must be completed in order for goals to be achieved.
Objectives must be developed by Analysts interacting with management and user
personnel. Some key objectives may be identified, such as the production of a blueprint
of the database, the database design, and the database implementation in MS Access.

Analyzing the Existing System


Having defined goals and objectives, the database Analyst should then examine the
existing information systems in place in the organization. There may be an existing
database, which is often described as a legacy database.
Alternatively, the organization may simply have a loose collection of paper-based records
such as forms, index cards, and manila folders. In each of the above situations, a close
study of the existing records will yield valuable information on the type of data being
collected, stored and used. The database Analyst will be concerned to identify how data
is collected by paper forms, and how it is presented to users by reports. In the case of a
computerized database, the Analyst will also examine how the data is collected on-screen,
and how reports are presented on-screen. The people in the organization who use the
system should also be approached to provide information on the database.
Questionnaires and interviews can be used to source information from users.

Finding Data & Procedures


There are a number of ways of finding out about existing data procedures and problems.
These include:
1. Observation: spending some time in the department concerned, seeing at first
hand the procedures used, workloads and bottlenecks
2. Reading the documentation associated with the system
3. Asking clerical staff to keep special counts during a trial period to establish where
problems might lie
4. Questionnaires: these can be useful when a lot of people will be affected by a new
system
5. Interviews: the most common and most useful way of fact finding. Interviews
must be well planned and consideration given to factors such as:

Whom to interview
When to interview
What to ask
Where to hold the interview

Investigating Hard Data


Documentation
Hard Data is generally classified as quantitative or qualitative documents. Facts and
figures, financial information, organizational contexts, organizational problems, and
organizational structure are revealed through a study of documents. The Analyst also
needs to understand that organization members may each interpret hard data in their
own way. The Analyst asks for whom the documents were originally produced, and
why they were kept. He/she seeks to understand the role of the document in the
organization.

Generally, well-documented organizations with procedures may be less flexible than
others with less documentation, since documentation is often used to perform a
control function. It follows that the changing of documentation facilitates organization
change. Since documentation can provide an easily accessed picture of where the
organization has been, and where it is going, the Analyst needs to start with a study of the
documentation.

Quantitative Documents
Reports used for operational decision-making - These are generally paper reports on
inventory status, sales, production and cash flow. Mid-level management personnel and
supervisors typically use these.
Performance Reports - These generally work within a framework of budgeting and
exception reporting. They include ABC analysis and variance analysis and are important
for trend analysis.
Records and Data Capture Forms - These are generally the paper 'raw data' used as
'raw material' or inputs to the information system. Analysis of such 'raw data' is
particularly significant in the small business because of inherent source data input
inefficiencies.

Qualitative Documents
Memos, Bulletin Boards, Procedure Manuals and Policy Handbooks - These
documents usually reveal what their writers expect of the behavior of others.
Policy and Procedure Manuals - these are particularly important since they document
systems already in place in the organization. These can be compared with observations of
actual procedures practiced in the organization.

Using Questionnaires
Questionnaires are an information-gathering technique that allows the study of attitudes,
beliefs, behaviors and characteristics of key people in the organization. These are less
appropriate for the small business with few employees, and will not be considered in
depth.
The main issues are the choice of recipients, the use of open or closed questions, the
choice of language, scaling, and the administration of the questionnaire.
When designing questionnaires, students should:
1. Clarify the purpose of the enquiry
2. Devise clear, unambiguous questions
3. Use language intelligible to the respondent
4. Avoid leading questions
5. Follow a logical sequence in questions
6. Avoid questions that tax the memory too much
7. Do not use multiple-choice questions where one of the offered answers appears to
confer some status to respondents
8. Avoid questions on topics which respondents will be reluctant to answer
9. Confine questions to the personal experience of respondents
10. Introduce some control questions

Interviewing
Before you interview someone else you must, in effect, interview yourself. You need to
know your biases, and how they will affect your perceptions. Your education, intellect,
upbringing, and emotions all serve as powerful filters for what you will be hearing in
your interview.
Before you start, visualize why you are doing this, what you will ask, what you want to
find out. You should try to anticipate how you will make the interview fulfilling for the
interviewee (your respondent), as well as how to put the respondent at ease and receptive
to your needs, and if possible identify some benefits for your respondent and discuss
these early in the process.
An information-gathering interview is a directed conversation with a specific purpose
that uses a question-and-answer format. The objective is to get the opinions of the
interviewee and his/her feelings about the current state of the system, organizational and
personal goals, and informal procedures.
Opinions of key employees are extremely important, since they can reveal key problems
in the organization. Feelings are also important since they reveal emotions and attitudes.
Goals are important for pointing out the future direction of the organization.
A record of the interview should be made. A tape recorder can be used to do this if it is
acceptable to the respondent, or notes can be taken.

Planning the Interview


Read background material - Read background material about the interviewee and the
organization. Pay attention to the language and culture of the organization to facilitate
communication with your respondent. Having done this reading, you avoid wasting
interview time on general background questions.
Establish interviewing objectives - From the background material above, identify four
to six key areas concerning information processing and decision-making behavior you
want to question. Include information sources, information formats, information quality
and decision-making style.
Decide whom to interview
Prepare the Interviewee
1. Call and make an appointment.
2. Submit questionnaire to respondent before the interview.
3. Keep the interview short (30 - 45 minutes).
Decide on Question Types and Structure
1. Open-ended Questions
2. Closed Questions
3. Probing Questions
4. AVOID Leading Questions and Double-Barreled Questions
5. Think about and write down Questions before the interview.

Interviews may be successful, but there are a number of problems that could occur, for
example:
1. Interviewers have different skills and abilities
2. The way the question is asked may influence answers
3. Interviewees may give answers they think are expected of them or they
may not be very motivated and give inaccurate answers.

Identifying Entities
When the Analyst seeks to determine the data to be modeled for the proposed system,
he/she must think about the real-world objects that exist in the organization, such as
people, places, things and events.
By way of example, let us consider a Video Club. Say we are looking at the renting of
DVDs. Some of the entities we may find in such a situation would be customer, DVD,
movie title, and actor. These are the more obvious ones.
There may also be others which provide even more detail about objects such as the
number of units of a particular movie title in stock or the status of a particular title, e.g.
whether it is out of stock, out on rental or never stocked. Thinking about the entities as
representing information about people we should remember that there are some standard
aspects of people that we want to represent in data, e.g. name, ID#. These we call
attributes. Examples of people are student, employee, customer, manager, player,
umpire, driver, gardener, etc. So we can identify people (or rather groups of similar
people) in an organization.
What about places? What are some of the kinds of places that we may encounter in an
organization? Let us take a typical organization in Trinidad & Tobago, a company that is
involved in the distribution of consumer goods, such as pharmaceuticals or furniture,
which are purchased from a manufacturer and sold to retailers. One of the places we
would surely encounter would be a warehouse. Other places in such an environment
may be offices, trucks, and delivery addresses. Another example of a place with which
all students would be familiar is the classroom.
In the business setting, material things called goods are artifacts, i.e. things that
people make. These goods, e.g. widgets, bolts and nuts, drugs, toys, food items, etc.,
are also objects of our attention. They are objects about which we need to collect and
maintain data. We want to know how many are in stock or were sold, and we are also
interested in the price of an item. They are entities.
Events - these include events that you may be familiar with, e.g. parties and sports
meetings, but within the organization events may also be meetings, classes, interviews,
and over-the-counter purchases. Any non-material thing for which people gather together
for a period of time can be represented as an event.
A transaction is also a type of event. So if you want to keep information on a particular
transaction, say between a customer and a customer sales representative then that
transaction is also an event. And we call it an event when we want to maintain some data
relevant to that specific confluence of people and time. Relevance is a key concept - we
need to consider what information is relevant. What information will be useful to the
organization? The analyst determines relevance by making a selection of what data is to
be represented in the database.
Entities are the main data objects about which data is to be collected and stored.
Specific examples of entities are:
INVENTORY

CUSTOMER

ORDER

INVOICE

Identifying Attributes
For each of the entities that we are concerned with, we must further determine the
characteristics of the entity that we wish to represent. With regard to INVENTORY,
what we choose to represent is quantity in stock, the movie title, actor, etc.
Certain types of entities suggest that a certain minimum number of attributes must be
represented, e.g. CUSTOMER - since the customer is a person we would expect that a
name must be represented, and here we have name, address, phone no., fax no., and
e-mail. For an Invoice we would also expect certain types of attributes; for instance,
invoice no., customer name, name of item, quantity of item, etc.
Attributes describe the entities. Attributes may be descriptors or identifiers. Descriptors
represent non-unique characteristics of an entity. Identifiers are also called keys and
describe unique characteristics of an entity.

The following examples show entities with their attributes identified.


INVENTORY
Movie Title
Category
Quantity in stock
Actor

CUSTOMER
Name
Address
Phone No.
Fax
e-mail

INVOICE
Invoice Number
Customer Name
Customer Address
Quantity of item
Name of item
Item Price
Extended Price
Total

ORDER
Order Number
Supplier Name
Supplier Address
Supplier Phone No.
Items Ordered
Delivery Date
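
As a rough illustration of how entities and their attributes might be written down in code, the sketch below models CUSTOMER and INVENTORY as Python dataclasses. The attribute names follow the lists above; the class names, field types, and sample values are assumptions made for this example only, not part of the design method.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    # Attributes copied from the CUSTOMER list above (types are assumed).
    name: str
    address: str
    phone_no: str
    fax: str
    email: str


@dataclass
class InventoryItem:
    # Attributes copied from the INVENTORY list above (types are assumed).
    movie_title: str
    category: str
    quantity_in_stock: int
    actor: str


# Two instances of the INVENTORY entity; the values are invented for illustration.
stock = [
    InventoryItem("Hamlet", "Drama", 3, "Actor A"),
    InventoryItem("King Lear", "Drama", 0, "Actor B"),
]

# Attributes let us answer questions about the entity, e.g. which titles are available.
titles_in_stock = [item.movie_title for item in stock if item.quantity_in_stock > 0]
```

Each class corresponds to an entity, each field to an attribute, and each object in `stock` to one occurrence of the entity.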

Identifying Entities, Attributes and Relationships


In order to begin constructing the basic model, the modeler must analyze the information
gathered during the requirements analysis for the purpose of:

classifying data objects as either entities or attributes
identifying and defining relationships between entities
naming and defining identified entities, attributes, and relationships
documenting this information in the data document

To accomplish these goals the modeler must analyze narratives from users, notes from
meetings, policy and procedure documents, and, if lucky, design documents from the
current information system.
Although it is easy to define the basic constructs of the ER model, it is not an easy task to
distinguish their roles in building the data model. What makes an object an entity or
attribute? For example, given the statement "employees work on projects", should
employees be classified as an entity or an attribute? Very often, the correct answer
depends upon the requirements of the database. In some cases, employee would be an
entity; in others it would be an attribute.
While the definitions of the constructs in the ER Model are simple, the model does not
address the fundamental issue of how to identify them. Some commonly given
guidelines are:
entities contain descriptive information
attributes either identify or describe entities
relationships are associations between entities
These guidelines are discussed in more detail below.

Entities
Attributes
o Validating Attributes
o Derived Attributes and Code Values
Relationships
Naming Entities and Attributes
Defining Entities
Recording Information in Design Document

Entities
There are various definitions of an entity:
"Any distinguishable person, place, thing, event, or concept, about which
information is kept"
"A thing which can be distinctly identified"
"Any distinguishable object that is to be represented in a database"
"...anything about which we store information (e.g. supplier, machine tool,
employee, utility pole, airline seat, etc.). For each entity type, certain
attributes are stored"
These definitions contain common themes about entities:
an entity is a "thing", "concept", or "object". However, entities can sometimes
represent the relationships between two or more objects. This type of entity is
known as an associative entity.
entities are objects which contain descriptive information. If a data object you
have identified is described by other objects, then it is an entity. If there is no
descriptive information associated with the item, it is not an entity. Whether or not
a data object is an entity may depend upon the organization or activity being
modeled.
an entity represents many things which share properties. They are not single
things. For example, King Lear and Hamlet are both plays that share common
attributes such as name, author, and cast of characters. The entity describing these
things would be PLAY, with King Lear and Hamlet being instances of the entity.
entities which share common properties are candidates for being converted to
generalization hierarchies (see below).
entities should not be used to distinguish between time periods. For example, the
entities 1st Quarter Profits, 2nd Quarter Profits, etc. should be collapsed into a
single entity called Profits. An attribute specifying the time period would be used
to categorize by time.
not everything the users want to collect information about will be an entity. A
complex concept may require more than one entity to represent it. Other "things"
users think important may not be entities.
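
The guideline about time periods can be sketched in a few lines of Python. Rather than one entity per quarter, a single Profits entity carries the period as an attribute. The class name `Profit`, the `year` and `quarter` fields, and the amounts are assumptions made for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Profit:
    year: int
    quarter: int   # the time period is an attribute, not a separate entity
    amount: float


# Sample rows; the figures are invented.
profits = [
    Profit(2024, 1, 1200.0),
    Profit(2024, 2, 950.0),
    Profit(2024, 3, 1100.0),
]


def profit_for(rows, year, quarter):
    """Total profit recorded for one quarter of one year."""
    return sum(p.amount for p in rows if p.year == year and p.quarter == quarter)
```

Adding a new quarter then means adding a row, not redesigning the model.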

Attributes
Attributes are data objects that either identify or describe entities. Attributes that identify
entities are called key attributes. Attributes that describe an entity are called non-key
attributes. Key attributes will be discussed in detail in a later section.
The process for identifying attributes is similar to identifying entities except now you
want to look for and extract those names that appear to be descriptive noun phrases.

Validating Attributes
Attribute values should be atomic, that is, present a single fact. Having disaggregated
data allows simpler programming, greater reusability of data, and easier implementation
of changes. Normalization also depends upon the "single fact" rule being followed.
Common types of violations include:
simple aggregation - a common example is Person Name which concatenates first
name, middle initial, and last name. Another is Address which concatenates street
address, city, and zip code. When dealing with such attributes, you need to find
out if there are good reasons for decomposing them. For example, do the end-users
want to use the person's first name in a form letter? Do they want to sort by
zip code?
complex codes - these are attributes whose values are codes composed of
concatenated pieces of information. An example is the code attached to
automobiles and trucks. The code represents over 10 different pieces of
information about the vehicle. Unless part of an industry standard, these codes
have no meaning to the end user. They are very difficult to process and update.
text blocks - these are free-form text fields. While they have a legitimate use, an
over reliance on them may indicate that some data requirements are not met by
the model.
mixed domains - this is where a value of an attribute can have different meaning
under different conditions
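
To make the "single fact" rule concrete, here is a minimal Python sketch of the Person Name example, assuming the end-users do want to sort by last name and greet people by first name. The class `Person`, its field names, and the sample people are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Person:
    # Each attribute holds a single fact instead of one concatenated name.
    first_name: str
    middle_initial: str
    last_name: str

    @property
    def full_name(self):
        # The aggregate can always be rebuilt from the atomic parts.
        parts = [self.first_name, self.middle_initial, self.last_name]
        return " ".join(part for part in parts if part)


people = [Person("Mary", "J", "Smith"), Person("Alan", "", "Jones")]

# Sorting and form-letter greetings need no string parsing.
by_last_name = sorted(people, key=lambda p: (p.last_name, p.first_name))
greetings = [f"Dear {p.first_name}," for p in by_last_name]
```

Going the other way, from a stored full name back to its parts, would require guessing where each part begins and ends, which is why decomposition is usually decided at design time.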

Derived Attributes and Code Values


Two areas where data modeling experts disagree are whether derived attributes and
attributes whose values are codes should be permitted in the data model.
Derived attributes are those created by a formula or by a summary operation on other
attributes. Arguments against including derived data are based on the premise that derived
data should not be stored in a database and therefore should not be included in the data
model. The arguments in favor are:
derived data is often important to both managers and users and therefore should
be included in the data model
it is just as important, perhaps more so, to document derived attributes as it is to
document other attributes
including derived attributes in the data model does not imply how they will be
implemented
A coded value uses one or more letters or numbers to represent a fact. For example, the
attribute Gender might use the letters "M" and "F" as values rather than "Male" and
"Female". Those who are against this practice argue that codes have no intuitive meaning to
the end-users and add complexity to processing data. Those in favor argue that many
organizations have a long history of using coded attributes, that codes save space, and
improve flexibility in that values can be easily added or modified by means of look-up
tables.
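
Both ideas can be sketched briefly in Python. In the example below, Extended Price is treated as a derived attribute, assumed here to be quantity times item price and computed on demand rather than stored, and the Gender codes are resolved through a small look-up table. The class and function names are hypothetical.

```python
from dataclasses import dataclass

# Look-up table for the coded attribute; new codes can be added without
# changing the code that uses it.
GENDER_CODES = {"M": "Male", "F": "Female"}


def describe_gender(code):
    """Translate a stored code into the text shown to end-users."""
    return GENDER_CODES[code]


@dataclass
class InvoiceLine:
    item_name: str
    quantity: int
    item_price: float

    @property
    def extended_price(self):
        # Derived attribute: documented in the model, computed when needed.
        return self.quantity * self.item_price


# Sample invoice lines; the values are invented.
lines = [InvoiceLine("DVD rental", 2, 4.5), InvoiceLine("Late fee", 1, 1.0)]
invoice_total = sum(line.extended_price for line in lines)
```

This reflects the third argument in favor: documenting a derived attribute in the model says nothing about whether it is stored or computed.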

Relationships

Relationships are associations between entities. Typically, a relationship is indicated by a
verb connecting two or more entities. For example:
employees are assigned to projects
As relationships are identified they should be classified in terms of cardinality,
optionality, direction, and dependence. As a result of defining the relationships, some
relationships may be dropped and new relationships added. Cardinality quantifies the
relationships between entities by measuring how many instances of one entity are related
to a single instance of another. To determine the cardinality, assume the existence of an
instance of one of the entities. Then determine how many specific instances of the second
entity could be related to the first. Repeat this analysis reversing the entities. For
example:
employees may be assigned to no more than three projects at a time; every
project has at least two employees assigned to it.
Here the cardinality of the relationship from employees to projects is three; from projects
to employees, the cardinality is two. Therefore, this relationship can be classified as a
many-to-many relationship.
If a relationship can have a cardinality of zero, it is an optional relationship. If it must
have a cardinality of at least one, the relationship is mandatory. Optional relationships are
typically indicated by the conditional tense. For example:
an employee may be assigned to a project
Mandatory relationships, on the other hand, are indicated by words such as must have.
For example:
a student must register for at least three courses each semester
In the case of the specific relationship form (1:1 and 1:M), there is always a parent entity
and a child entity. In one-to-many relationships, the parent is always the entity with the
cardinality of one. In one-to-one relationships, the choice of the parent entity must be
made in the context of the business being modeled. If a decision cannot be made, the
choice is arbitrary.
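
The employee and project rule quoted above (no more than three projects per employee, at least two employees per project) can be checked mechanically. The following Python sketch assumes assignments are held as (employee, project) pairs; the function name and the message wording are invented for this example.

```python
from collections import Counter

MAX_PROJECTS_PER_EMPLOYEE = 3   # "no more than three projects at a time"
MIN_EMPLOYEES_PER_PROJECT = 2   # "at least two employees assigned"


def check_cardinality(assignments):
    """Return a list of messages for every cardinality rule that is broken.

    Only employees and projects that appear in at least one pair are checked.
    """
    pairs = set(assignments)  # ignore duplicate pairs
    projects_per_employee = Counter(employee for employee, _ in pairs)
    employees_per_project = Counter(project for _, project in pairs)

    problems = []
    for employee, count in sorted(projects_per_employee.items()):
        if count > MAX_PROJECTS_PER_EMPLOYEE:
            problems.append(
                f"{employee} is on {count} projects (maximum {MAX_PROJECTS_PER_EMPLOYEE})"
            )
    for project, count in sorted(employees_per_project.items()):
        if count < MIN_EMPLOYEES_PER_PROJECT:
            problems.append(
                f"{project} has {count} employees (minimum {MIN_EMPLOYEES_PER_PROJECT})"
            )
    return problems
```

Because both sides of the relationship can hold more than one related instance, the pairs themselves form the many-to-many relationship; the two constants only bound how many pairs each side may have.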

Determining and Defining Business Rules

Determining and defining Business Rules is the fifth phase of the database design
process. As the database developer you hold interviews, identify limitations on various
aspects of the database, establish Business Rules, and define and implement Validation
Tables.
The way an organization views and uses its data will dictate a set of limitations and
requirements that have to be built into the database. Your interviews with users and
management will determine what specific limitations and requirements will be imposed
on the data, data structures, or relationships. You then establish and document these
specifications as Business Rules.
The interviews held with users will reveal specific limitations on various aspects of the
database. For example, a user working with an Order Processing database is very aware
of specific details, such as the fact that a SHIP DATE must be later than an ORDER
DATE, that there must always be a DAYTIME PHONE NUMBER, and that a METHOD
OF SHIPMENT should always be indicated. On the other hand, management interviews
are intended to reveal general limitations on various aspects of the database. The office
manager for an entertainment agency, for example, is familiar with general issues, such as
the fact that an agent can represent no more than twenty entertainers and that promotional
information for each entertainer must be updated every year.
Next, you define and implement Validation Tables, if necessary, to support certain
Business Rules. For example, if certain fields are found to have a finite range of values
owing to the manner in which they are used by the organization, Validation Tables are
used to ensure the consistency and validity of the values stored in those fields.
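
As one possible way to enforce the Order Processing rules mentioned above, the sketch below uses Python's built-in sqlite3 module. The notes themselves target MS Access, so this is only an analogy: the table names, column names, and shipment methods are assumptions, and dates are stored as ISO text so that they compare correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE shipment_method (            -- Validation Table: the allowed values
    method TEXT PRIMARY KEY
);
INSERT INTO shipment_method VALUES ('Ground'), ('Air');

CREATE TABLE customer_order (
    order_id      INTEGER PRIMARY KEY,
    order_date    TEXT NOT NULL,
    ship_date     TEXT NOT NULL,
    daytime_phone TEXT NOT NULL,          -- rule: there must always be a daytime phone
    ship_method   TEXT NOT NULL REFERENCES shipment_method(method),
    CHECK (ship_date > order_date)        -- rule: ship date later than order date
);
""")


def try_insert(order_date, ship_date, daytime_phone, ship_method):
    """Insert an order; return False if any Business Rule rejects it."""
    try:
        conn.execute(
            "INSERT INTO customer_order "
            "(order_date, ship_date, daytime_phone, ship_method) VALUES (?, ?, ?, ?)",
            (order_date, ship_date, daytime_phone, ship_method),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

A call such as `try_insert("2024-03-01", "2024-02-28", "555-0100", "Air")` is rejected because the ship date precedes the order date, and a shipment method missing from the Validation Table is rejected by the foreign key.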
The level of integrity established by Business Rules at this point is significant because it
relates directly to the way the organization views and uses its data. As the organization
grows, its perspective on the data will change, which means that the Business Rules must
change as well. This means that determining and establishing Business Rules is an
ongoing, iterative process. Constant diligence is necessary at all times to maintain this
level of integrity properly.

Creating the Data Structures/Tables

Creating the data structures for the database is the third phase in the database design
process. You define tables and fields, establish keys, and define field specifications for
every field.
Tables are the first structures you define in the database. The various subjects that each of
the tables represents are determined from the mission objectives stated in the first phase
of the design process, together with the data requirements gathered in the second phase of
the design process. Once you have identified the subjects, you establish them as tables,
and then you associate each field from the field list compiled in the second phase with an
appropriate table. You then review each table to ensure that it represents only one subject
and that it contains no duplicate fields.
Next you go on to review the fields within each table. If you find multipart or multivalued fields in a table, you modify the table so that each field stores only a single value.
A field that does not represent a characteristic of the subject of the table is moved to a
more appropriate table or deleted entirely. When your review is complete, you establish a
Primary key that will uniquely identify each record within the table.
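
A small Python sketch of the multivalued-field fix: assume a movie record whose `actors` field holds several names in one string. The helper below, with hypothetical names throughout, turns that field into one row per value, keyed by the movie's Primary key, so that each field stores only a single value.

```python
def split_multivalued(records, key, field, sep=","):
    """Return one (key value, single value) row per item in a multivalued field."""
    rows = []
    for record in records:
        for value in record[field].split(sep):
            value = value.strip()
            if value:  # skip empty pieces such as a trailing separator
                rows.append((record[key], value))
    return rows


# Sample data; the titles and actor names are invented.
movies = [
    {"movie_id": 1, "title": "Hamlet", "actors": "Actor A, Actor B"},
    {"movie_id": 2, "title": "King Lear", "actors": "Actor C"},
]

movie_actor_rows = split_multivalued(movies, "movie_id", "actors")
```

The resulting rows would populate a separate table, after which the `actors` field can be dropped from the movie table.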
The final step in this phase is to establish field specifications for each field in the
database. At this point, you conduct interviews with users and management to help
identify any specific field characteristics that may be important to them. You also review
and discuss any characteristics that they may be unfamiliar with. When the interviews are
completed, you define and document field specifications for each field. You then review
the table structures and field specifications with users and management once more for
possible refinements. When the refinements, if any, are completed, your tables are ready
for the next phase.

Defining Relationships
In the fourth phase of the database design process you establish table relationships. You
conduct interviews with users and management once again, identify relationships,
identify relationship characteristics, and establish relationship-level integrity.
Working with users and management to identify relationships is extremely helpful
because you cannot possibly be familiar with every aspect of the data being used by the
organization. Most people have a good perspective of the data they work with and can
usually identify relationships among the data rather easily. Therefore interviewing users
yields very useful information.
Once the relationships have been identified, you need to establish the logical connection
for each relationship. Depending on the type of relationship, you use either a Primary key
or a "linking" table to make the connection between a pair of tables.
Next you'll determine the type of participation and the
degree of participation for each relationship. In some cases, these participation
characteristics will be obvious due to the nature of the data stored in the tables. In other
cases, the type of participation and degree of participation will be based on specific
Business Rules.
A relationship represents an association between two or more entities.
Examples of relationships are:
Student     attends     Class
Customer    places      Order

These would be represented in diagrams as follows:

STUDENT   ----  attends  ----  CLASS
CUSTOMER  ----  places   ----  ORDER

Classifying Relationships
Relationships are classified by their degree, connectivity, cardinality, direction, type, and
existence. Not all modeling methodologies use all these classifications.
Degree of a Relationship
The degree of a relationship is the number of entities associated with the relationship.
The n-ary relationship is the general form for degree n. Special cases are the binary and
ternary relationships, where the degree is 2 and 3, respectively.
Binary relationships, the associations between two entities, are the most common type in
the real world. A recursive binary relationship occurs when an entity is related to itself.
An example might be "some employees are married to other employees".

A ternary relationship involves three entities and is used when a binary relationship is
inadequate. Many modeling approaches recognize only binary relationships. Ternary or
n-ary relationships are decomposed into two or more binary relationships.
Connectivity and Cardinality
The connectivity of a relationship describes the mapping of associated entity instances in
the relationship. The values of connectivity are "one" or "many". The cardinality of a
relationship is the actual number of related occurrences for each of the two entities. The
basic types of connectivity for relations are: one-to-one, one-to-many, and many-to-many.
A one-to-one (1:1) relationship is when, at most, one instance of an entity A is associated
with one instance of entity B. For example, employees in the company are each assigned
their own office. For each employee there exists a unique office and for each office there
exists a unique employee.
A one-to-many (1:N) relationship is when for one instance of entity A, there are zero, one,
or many instances of entity B, but for one instance of entity B, there is only one instance
of entity A. An example of a 1:N relationship is:
a department has many employees
each employee is assigned to one department
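
The department and employee example can be pictured with a plain Python mapping. Storing the department against each employee guarantees "one department per employee", while grouping by department shows the "many" side. The names and data are invented for illustration.

```python
from collections import defaultdict

# Each employee (child) refers to exactly one department (parent).
employee_department = {"Ann": "Sales", "Bob": "Sales", "Cy": "Accounts"}


def employees_by_department(mapping):
    """Group employees under their department: the 'many' side of 1:N."""
    groups = defaultdict(list)
    for employee, department in sorted(mapping.items()):
        groups[department].append(employee)
    return dict(groups)


staff = employees_by_department(employee_department)
```

In a relational table the same idea appears as a department column in the employee table that refers back to the department table.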
A many-to-many (M:N) relationship, sometimes called non-specific, is when for one
instance of entity A, there are zero, one, or many instances of entity B and for one
instance of entity B there are zero, one, or many instances of entity A. An example is:
employees can be assigned to no more than two projects at the same time;
projects must have assigned at least three employees
A single employee can be assigned to many projects; conversely, a single project can
have assigned to it many employees. Here the cardinality for the relationship between
employees and projects is two and the cardinality between project and employee is three.
Many-to-many relationships cannot be directly translated to relational tables but instead
must be transformed into two or more one-to-many relationships using associative
entities.
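A minimal sketch of this transformation, assuming SQLite via Python's sqlite3 module. The associative table (here called assignment), the column names and the sample rows are hypothetical. The two helper queries check the example cardinalities stated above (at most two projects per employee, at least three employees per project) as reports; they do not enforce them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: one M:N relationship becomes two 1:N relationships
# that meet in the associative table "assignment".
conn.executescript("""
CREATE TABLE employee (emp_id  INTEGER PRIMARY KEY, emp_name  TEXT NOT NULL);
CREATE TABLE project  (proj_id INTEGER PRIMARY KEY, proj_name TEXT NOT NULL);
CREATE TABLE assignment (
    emp_id  INTEGER NOT NULL REFERENCES employee (emp_id),
    proj_id INTEGER NOT NULL REFERENCES project (proj_id),
    PRIMARY KEY (emp_id, proj_id)
);
INSERT INTO employee VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO project  VALUES (100, 'Payroll'), (200, 'Intranet');
INSERT INTO assignment VALUES (1, 100), (2, 100), (3, 100), (1, 200);
""")


def over_assigned_employees(limit=2):
    """Employees assigned to more than `limit` projects."""
    rows = conn.execute("""
        SELECT e.emp_name
        FROM employee e JOIN assignment a ON a.emp_id = e.emp_id
        GROUP BY e.emp_id, e.emp_name
        HAVING COUNT(*) > ?
        ORDER BY e.emp_id
    """, (limit,))
    return [name for (name,) in rows]


def understaffed_projects(minimum=3):
    """Projects with fewer than `minimum` employees, including projects with none."""
    rows = conn.execute("""
        SELECT p.proj_name
        FROM project p LEFT JOIN assignment a ON a.proj_id = p.proj_id
        GROUP BY p.proj_id, p.proj_name
        HAVING COUNT(a.emp_id) < ?
        ORDER BY p.proj_id
    """, (minimum,))
    return [name for (name,) in rows]
```

With the sample rows, no employee exceeds two projects, and the Intranet project is reported as understaffed because only one employee is assigned to it.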

Direction
The direction of a relationship indicates the originating entity of a binary relationship.
The entity from which a relationship originates is the parent entity; the entity where the
relationship terminates is the child entity.
The direction of a relationship is determined by its connectivity. In a one-to-one
relationship the direction is from the independent entity to a dependent entity. If both
entities are independent, the direction is arbitrary. With one-to-many relationships, the
entity occurring once is the parent. The direction of many-to-many relationships is
arbitrary.
Type
An identifying relationship is one in which one of the child entities is also a dependent
entity. A non-identifying relationship is one in which both entities are independent.
Existence
Existence denotes whether the existence of an entity instance is dependent upon the
existence of another, related, entity instance. The existence of an entity in a relationship is
defined as either mandatory or optional. If an instance of an entity must always occur for
an entity to be included in a relationship, then it is mandatory. An example of mandatory
existence is the statement "every project must be managed by a single department". If the
instance of the entity is not required, it is optional. An example of optional existence is
the statement, "employees may be assigned to work on projects".
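In a relational table, mandatory and optional existence usually show up as a required versus a nullable foreign key. The sketch below assumes SQLite via Python's sqlite3 module. All names are hypothetical, and giving each employee at most one project is a simplification made only to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema for illustration.
conn.executescript("""
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT NOT NULL);
CREATE TABLE project (
    proj_id   INTEGER PRIMARY KEY,
    proj_name TEXT NOT NULL,
    -- mandatory: every project must be managed by a department
    dept_id   INTEGER NOT NULL REFERENCES department (dept_id)
);
CREATE TABLE employee (
    emp_id   INTEGER PRIMARY KEY,
    emp_name TEXT NOT NULL,
    -- optional: an employee may exist without a project (NULL allowed)
    proj_id  INTEGER REFERENCES project (proj_id)
);
INSERT INTO department VALUES (1, 'Finance');
""")


def can_insert(sql, params):
    """Run one INSERT and report whether the constraints allowed it."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


project_with_department = can_insert(
    "INSERT INTO project VALUES (?, ?, ?)", (100, "Payroll", 1))
project_without_department = can_insert(
    "INSERT INTO project VALUES (?, ?, ?)", (101, "Audit", None))
employee_without_project = can_insert(
    "INSERT INTO employee VALUES (?, ?, ?)", (1, "Ann", None))
```

The project without a department is refused, while the employee without a project is accepted.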
Generalization Hierarchies
A generalization hierarchy is a form of abstraction that specifies that two or more entities
that share common attributes can be generalized into a higher-level entity type called a
supertype or generic entity. The lower-level entities become the subtypes, or categories,
of the supertype. Subtypes are dependent entities.
Generalization occurs when two or more entities represent categories of the same
real-world object. For example, Wages_Employees and Classified_Employees represent
categories of the same entity, Employees. In this example, Employees would be the
supertype; Wages_Employees and Classified_Employees would be the subtypes.
Subtypes can be either mutually exclusive (disjoint) or overlapping (inclusive). A
mutually exclusive category is when an entity instance can be in only one category. The
above example is a mutually exclusive category. An employee can either be wages or
classified but not both. An overlapping category is when an entity instance may be in two
or more subtypes. An example would be a person who works for a university could also
be a student at that same university. The completeness constraint requires that all
instances of the subtype be represented in the supertype.
Generalization hierarchies can be nested. That is, a subtype of one hierarchy can be a
supertype of another. The level of nesting is limited only by the constraint of simplicity.
Subtype entities may be the parent entity in a relationship but not the child.
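One common way to carry a mutually exclusive hierarchy into tables is a supertype table with a type column plus one table per subtype. The sketch below assumes SQLite via Python's sqlite3 module. The type column emp_type and the attributes hourly_rate and annual_salary are hypothetical and are not taken from these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: the (emp_id, emp_type) pair ties each subtype row to
# a supertype row of the matching category, which keeps the subtypes disjoint.
conn.executescript("""
CREATE TABLE employee (
    emp_id   INTEGER PRIMARY KEY,
    emp_name TEXT NOT NULL,
    emp_type TEXT NOT NULL CHECK (emp_type IN ('wages', 'classified')),
    UNIQUE (emp_id, emp_type)
);
CREATE TABLE wages_employee (
    emp_id      INTEGER PRIMARY KEY,
    emp_type    TEXT NOT NULL CHECK (emp_type = 'wages'),
    hourly_rate REAL NOT NULL,
    FOREIGN KEY (emp_id, emp_type) REFERENCES employee (emp_id, emp_type)
);
CREATE TABLE classified_employee (
    emp_id        INTEGER PRIMARY KEY,
    emp_type      TEXT NOT NULL CHECK (emp_type = 'classified'),
    annual_salary REAL NOT NULL,
    FOREIGN KEY (emp_id, emp_type) REFERENCES employee (emp_id, emp_type)
);
INSERT INTO employee VALUES (1, 'Ann', 'wages');
""")


def can_insert(sql, params):
    """Run one INSERT and report whether the constraints allowed it."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


as_wages = can_insert(
    "INSERT INTO wages_employee VALUES (?, ?, ?)", (1, "wages", 12.5))
as_classified_too = can_insert(
    "INSERT INTO classified_employee VALUES (?, ?, ?)", (1, "classified", 30000.0))
```

Employee 1 is accepted as a wages employee and refused as a classified employee, because the supertype row records only one category. An overlapping hierarchy would drop that restriction.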

Determining and Defining Views


Determining and establishing Views is the sixth phase of the database design process.
Once more, you'll need to conduct interviews, identify various ways of looking at the
data, and establish the Views.
You'll ask users and management to identify the various ways that they look at the data in
the database. Whereas one group may view the data from a shared perspective, another
group within the organization may use a different perspective; some individuals have
unique ways of visualizing the data based on the work they perform. For example, some
individuals need to retrieve data from several tables at the same time in order to see
summary information; others only need to see specific fields from a certain table.
Once you have identified the various ways of seeing the data, you establish them
formally as Views. Each view is defined using the appropriate table or tables, and, in the
case of multi-table Views, fields from the appropriate tables are assigned to the View.
Once you have established all of the Views, you'll need to identify criteria for certain
Views so that they will only display specific records.
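The three kinds of Views described above (specific fields from one table, a multi-table summary, and a View with criteria) might look like this in SQLite, driven from Python's sqlite3 module. The tables, view names and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables, views and sample rows, for illustration only.
conn.executescript("""
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT NOT NULL);
CREATE TABLE employee (
    emp_id   INTEGER PRIMARY KEY,
    emp_name TEXT NOT NULL,
    phone    TEXT,
    dept_id  INTEGER NOT NULL REFERENCES department (dept_id)
);
INSERT INTO department VALUES (1, 'Sales'), (2, 'Research');
INSERT INTO employee VALUES
    (10, 'Ann', '555-0101', 1),
    (11, 'Bob', '555-0102', 1),
    (12, 'Cy',  '555-0103', 2);

-- single-table View: only specific fields from one table
CREATE VIEW employee_phone_list AS
    SELECT emp_name, phone FROM employee;

-- multi-table View: summary information drawn from two tables
CREATE VIEW dept_headcount AS
    SELECT d.dept_name, COUNT(e.emp_id) AS headcount
    FROM department d LEFT JOIN employee e ON e.dept_id = d.dept_id
    GROUP BY d.dept_id, d.dept_name;

-- View with criteria: displays only specific records
CREATE VIEW sales_staff AS
    SELECT e.emp_name
    FROM employee e JOIN department d ON d.dept_id = e.dept_id
    WHERE d.dept_name = 'Sales';
""")

phone_list = conn.execute(
    "SELECT * FROM employee_phone_list ORDER BY emp_name").fetchall()
headcount = conn.execute(
    "SELECT * FROM dept_headcount ORDER BY dept_name").fetchall()
sales = conn.execute(
    "SELECT * FROM sales_staff ORDER BY emp_name").fetchall()
```

Each View is queried like a table, so different groups of users can work with the perspective that suits them while the data is stored once.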

Reviewing the Design


The seventh and last phase in the database design process is reviewing the final database
structure for data integrity. First, you'll review each table to ensure that it meets the
criteria of a properly designed table, and you will initially check the fields within each
table for proper structure. Any inconsistencies or problems will be resolved and reviewed
once more. After the appropriate refinements are made, you'll check table-level integrity.
Second, you review and check field specifications for each field. You then make
refinements to fields as necessary and check field-level integrity after any needed
refinements have been made. This review reaffirms the field-level integrity identified and
established earlier in the database design process.
Third, you have to review the validity of each relationship, confirming the type of
relationship, as well as the type of participation and degree of participation for each table
within the relationship. You then study relationship integrity to ensure that there are
matching values between shared fields, and that there are no problems inserting,
updating, or deleting data in any of the tables within the relationship.
Finally, you go over the Business Rules to confirm the limitations placed on various
aspects of the database, as identified earlier in the database design process. If there are
any other limitations that have come to light since the last set of personnel interviews,
you establish them as new Business Rules and add them to the existing set of Business
Rules.

Once the entire database design process is complete, the logical database structure is
ready to be implemented in an RDBMS software program. However, the process is never
really complete because the database structure will always need refinement as the
organization grows.
The above was adapted from Chapter Four, Database Design for Mere Mortals, by
Michael J. Hernandez, 1997

ER Notation
There is no standard for representing data objects in ER diagrams. Each modeling
methodology uses its own notation. The original notation used by Chen is widely used in
academic texts and journals but rarely seen in either CASE tools or publications by
non-academics. Today, a number of notations are in use; among the more common are
Bachman, crow's foot, and IDEF1X.
All notational styles represent entities as rectangular boxes and relationships as lines
connecting boxes. Each style uses a special set of symbols to represent the cardinality of
a connection. The notation used in this document is from Martin. The symbols used for
the basic ER constructs are:
entities are represented by labeled rectangles. The label is the name of the entity.
Entity names should be singular nouns.
relationships are represented by a solid line connecting two entities. The name of
the relationship is written above the line. Relationship names should be verbs.
attributes, when included, are listed inside the entity rectangle. Attributes which
are identifiers are underlined. Attribute names should be singular nouns.
cardinality of many is represented by a line ending in a crow's foot. If the crow's
foot is omitted, the cardinality is one.
existence is represented by placing a circle or a perpendicular bar on the line.
Mandatory existence is shown by the bar (looks like a 1) next to the entity if an
instance is required. Optional existence is shown by placing a circle next to the
entity that is optional. Examples of these symbols are shown in the diagram
below:
[Diagram: the entity DEPARTMENT (key attribute DeptID, attribute Department Name) is
connected by the relationship "manages" to the entity PROJECT (key attribute ProjectID).
Callouts label the Entity, Entity Name, Relationship, Relationship Name, Key Attribute,
Attribute, One, Many, Mandatory Existence and Optional Existence symbols.]

Naming Entities and Attributes


The names should have the following properties:

unique
have meaning to the end-user
contain the minimum number of words needed to uniquely and accurately
describe the object

For entities and attributes, names are singular nouns while relationship names are
typically verbs.
Some authors advise against using abbreviations or acronyms because they might lead to
confusion about what they mean. Others believe using abbreviations or acronyms is
acceptable provided that they are universally used and understood within the
organization.
You should also take care to identify and resolve synonyms for entities and attributes.
This can happen in large projects where different departments use different terms for the
same thing.
Defining Entities
Complete and accurate definitions are important to make sure that all parties involved in
the modeling of the data know exactly what concepts the objects are representing.
Definitions should use terms familiar to the user and should precisely explain what the
object represents and the role it plays in the enterprise. Some authors recommend having
the end-users provide the definitions. If acronyms or terms not universally understood
are used in the definition, then these should be defined.

While defining entities, the modeler should be careful to resolve any instances where a
single entity is actually representing two different concepts (homonyms) or where two
different entities are actually representing the same "thing" (synonyms). This situation
typically arises because individuals or organizations may think about an event or process
in terms of their own function.
An example of a homonym would be a case where the Marketing Department defines the
entity MARKET in terms of geographical regions while the Sales Department thinks of
this entity in terms of demographics. Unless resolved, the result would be an entity with
two different meanings and properties.
Conversely, an example of a synonym would be a case where the Service Department has
identified an entity called CUSTOMER while the Help Desk has identified the entity
CONTACT. In reality, they may mean the same thing, a person who contacts or calls the
organization for assistance with a problem. The resolution of synonyms is important in
order to avoid redundancy and to avoid possible consistency or integrity problems.
Some examples of definitions are:

Employee
A person who works for and is paid by the organization.

Est_Time
The number of hours a project manager estimates that a project will
require to complete. Estimated time is critical for scheduling a
project and for tracking project time variances.

Assigned
Employees in the organization may be assigned to work on no
more than three projects at a time. Every project will have at least
two employees assigned to it at any given time.

Create the Entity Relationship Diagram (First Draft)


Once entities and relationships have been identified and defined, the first draft of the
entity relationship diagram can be created. This section introduces the ER diagram by
demonstrating how to diagram binary relationships. Recursive relationships are also
shown.
Binary Relationships
The illustrations below show examples of how to diagram one-to-one, one-to-many, and
many-to-many relationships.

A. ONE-TO-ONE
[Diagram: EMPLOYEE, relationship "is assigned", WORKSTATION]
Every employee is assigned one workstation; not all
workstations are assigned to employees.

B. ONE-TO-MANY
[Diagram: DEPARTMENT, relationship "is responsible", PROJECT]
A department may be responsible for many projects but each
project is the responsibility of one department.

C. MANY-TO-MANY
[Diagram: EMPLOYEE, relationships "is assigned" and "has assigned", PROJECT]
Employees may be assigned to many projects; every
project has assigned at least one employee.

One-To-One
Drawing A shows an example of a one-to-one diagram. Reading from left to right, the
diagram represents the relationship "every employee is assigned a workstation". Because
every employee must have a workstation, the symbol for mandatory existence (in this
case the crossbar) is placed next to the WORKSTATION entity. Reading from right to
left, the diagram shows that not all workstations are assigned to employees. This
condition may reflect that some workstations are kept for spares or for loans. Therefore,
we use the symbol for optional existence, the circle, next to EMPLOYEE. The
cardinality and existence of a relationship must be derived from the "business rules" of
the organization.
For example, if all workstations owned by an organization were assigned to employees,
then the circle would be replaced by a crossbar to indicate mandatory existence. One-to-one
relationships are rarely seen in "real-world" data models. Some practitioners advise
that most one-to-one relationships should be collapsed into a single entity or converted to
a generalization hierarchy.

One-To-Many
Drawing B shows an example of a one-to-many relationship between DEPARTMENT
and PROJECT. In this diagram, DEPARTMENT is considered the parent entity while
PROJECT is the child. Reading from left to right, the diagram states that departments
may be responsible for many projects. The optionality of the relationship reflects the
"business rule" that not all departments in the organization will be responsible for
managing projects. Reading from right to left, the diagram tells us that every project
must be the responsibility of exactly one department.

Many-To-Many
Drawing C. shows a many-to-many relationship between EMPLOYEE and PROJECT.
An employee may be assigned to many projects; each project must have many employees.
Note that the association between EMPLOYEE and PROJECT is optional because, at a
given time, an employee may not be assigned to a project. However, the relationship
between PROJECT and EMPLOYEE is mandatory because a project must have at least
two employees assigned. Many-To-Many relationships can be used in the initial drafting
of the model but eventually must be transformed into two one-to-many relationships. The
transformation is required because many-to-many relationships cannot be represented by
the relational model. The process for resolving many-to-many relationships is discussed
in the next section.

Recursive relationships
A recursive relationship is when an entity is associated with itself. The diagram below
shows an example of the recursive relationship.
[Diagram: EMPLOYEE related to itself by the relationship "manages" / "is managed"]
An employee may manage many employees
and each employee is managed by one employee.
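In a table, a recursive one-to-many relationship such as "manages" can be held as a foreign key that points back at the same table. A minimal sketch, assuming SQLite via Python's sqlite3 module, with hypothetical column names and sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: manager_id refers to another row of the same table.
# It is left nullable here so that the top-level manager can be stored.
conn.executescript("""
CREATE TABLE employee (
    emp_id     INTEGER PRIMARY KEY,
    emp_name   TEXT NOT NULL,
    manager_id INTEGER REFERENCES employee (emp_id)
);
INSERT INTO employee VALUES (1, 'Ann', NULL), (2, 'Bob', 1), (3, 'Cy', 1);
""")


def reports_of(manager_id):
    """Names of the employees managed by one employee."""
    rows = conn.execute(
        "SELECT emp_name FROM employee WHERE manager_id = ? ORDER BY emp_id",
        (manager_id,),
    )
    return [name for (name,) in rows]


def manager_of(emp_id):
    """Name of the employee's manager, or None if there is none."""
    row = conn.execute("""
        SELECT m.emp_name
        FROM employee e JOIN employee m ON m.emp_id = e.manager_id
        WHERE e.emp_id = ?
    """, (emp_id,)).fetchone()
    return row[0] if row else None
```

The self-join in manager_of reads the table twice, once in the role of the managed employee and once in the role of the manager.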

Refining The Entity-Relationship Diagram


This section discusses four basic rules for modeling relationships.

Entities Must Participate In Relationships


Entities cannot be modeled unrelated to any other entity. Otherwise, when the model is
transformed to the relational model, there would be no way to navigate to that table. The
exception to this rule is a database with a single table.

Resolve Many-To-Many Relationships


Many-to-many relationships cannot be used in the data model because they cannot be
represented by the relational model. Therefore, many-to-many relationships must be
resolved early in the modeling process. The strategy for resolving a many-to-many
relationship is to replace the relationship with an association entity and then relate the
two original entities to the association entity. This strategy is demonstrated below where
the many-to-many relationship is shown:
Employees may be assigned to many projects.
Each project must have assigned to it more than one employee.

A. Many-to-Many Relationship Unresolved
[Diagram: EMPLOYEE and PROJECT joined directly by a many-to-many relationship]

In addition to the implementation problem, this relationship presents other problems.
Suppose we wanted to record information about employee assignments such as who
assigned them, the start date of the assignment, and the finish date for the assignment.
Given the present relationship, these attributes could not be represented in either
EMPLOYEE or PROJECT without repeating information. The first step is to convert the
relationship assigned to a new entity we will call ASSIGNMENT. Then the original
entities, EMPLOYEE and PROJECT, are related to this new entity preserving the
cardinality and optionality of the original relationships. The solution is shown in the
diagram below:

B. Many-to-Many Relationship Resolved
[Diagram: EMPLOYEE and PROJECT each related to the new entity ASSIGNMENT]

Notice that this changes the semantics of the original relation to:
employees may be given assignments to projects
and projects must be done by more than one employee assignment.
A many-to-many recursive relationship is resolved in similar fashion.

Transform Complex Relationships into Binary Relationships


Complex relationships are classified as ternary, an association among three entities, or
n-ary, an association among more than three, where n is the number of entities involved.
For example,
Employees can use different skills on any one or more projects.
Each project uses many employees with various skills.

A. Ternary Relationship Unresolved
[Diagram: EMPLOYEE, SKILL and PROJECT joined by a single ternary relationship]
Complex relationships cannot be directly implemented in the relational model so they
should be resolved early in the modeling process. The strategy for resolving complex
relationships is similar to resolving many-to-many relationships. The complex
relationship is replaced by an association entity with each of the original entities being
related to this new association entity through binary relationships. The solution is shown
in the diagram below.

B. Ternary Relationship Resolved
[Diagram: EMPLOYEE, PROJECT and SKILL each related to the association entity
EMP/PRJ/SKILL]
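The resolved ternary relationship can be sketched as one association table whose rows each name an employee, a project and a skill. This assumes SQLite via Python's sqlite3 module. The table name emp_prj_skill mirrors the EMP/PRJ/SKILL entity, while the column names and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: three binary 1:N relationships meet in emp_prj_skill.
conn.executescript("""
CREATE TABLE employee (emp_id   INTEGER PRIMARY KEY, emp_name   TEXT NOT NULL);
CREATE TABLE project  (proj_id  INTEGER PRIMARY KEY, proj_name  TEXT NOT NULL);
CREATE TABLE skill    (skill_id INTEGER PRIMARY KEY, skill_name TEXT NOT NULL);
CREATE TABLE emp_prj_skill (
    emp_id   INTEGER NOT NULL REFERENCES employee (emp_id),
    proj_id  INTEGER NOT NULL REFERENCES project (proj_id),
    skill_id INTEGER NOT NULL REFERENCES skill (skill_id),
    PRIMARY KEY (emp_id, proj_id, skill_id)
);
INSERT INTO employee VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO project  VALUES (100, 'Payroll'), (200, 'Intranet');
INSERT INTO skill    VALUES (1, 'SQL'), (2, 'Design');
INSERT INTO emp_prj_skill VALUES (1, 100, 1), (1, 200, 2), (2, 100, 1), (2, 100, 2);
""")


def skills_used(emp_id, proj_id):
    """Skills one employee uses on one project."""
    rows = conn.execute("""
        SELECT s.skill_name
        FROM emp_prj_skill eps JOIN skill s ON s.skill_id = eps.skill_id
        WHERE eps.emp_id = ? AND eps.proj_id = ?
        ORDER BY s.skill_name
    """, (emp_id, proj_id))
    return [name for (name,) in rows]
```

Keeping all three keys in one row is what records which skill was used on which project. Three separate pairwise tables (employee-project, employee-skill, project-skill) could not answer skills_used(1, 100) reliably.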

Eliminate redundant relationships


A redundant relationship is a relationship between two entities that is equivalent in
meaning to another relationship between those same two entities that may pass through
an intermediate entity.
For example, the diagram below shows a redundant relationship between
DEPARTMENT and WORKSTATION. This relationship provides the same information
as the relationships DEPARTMENT has EMPLOYEES and EMPLOYEES assigned
WORKSTATION. The diagram below shows the solution, which is to remove the
redundant relationship DEPARTMENT assigned WORKSTATIONS.

A. Unresolved Redundant Relationship
[Diagram: DEPARTMENT has EMPLOYEE, EMPLOYEE assigned WORKSTATION, and a second
relationship "assigned" drawn directly between DEPARTMENT and WORKSTATION]

B. Redundant Relationship Resolved
[Diagram: DEPARTMENT has EMPLOYEE, EMPLOYEE assigned WORKSTATION]
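Once the redundant relationship is removed, the department's workstations are still available by passing through EMPLOYEE. A minimal sketch, assuming SQLite via Python's sqlite3 module, with hypothetical column names and sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: there is no direct department-to-workstation column.
conn.executescript("""
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT NOT NULL);
CREATE TABLE employee (
    emp_id   INTEGER PRIMARY KEY,
    emp_name TEXT NOT NULL,
    dept_id  INTEGER NOT NULL REFERENCES department (dept_id)
);
CREATE TABLE workstation (
    ws_id  INTEGER PRIMARY KEY,
    -- NULL means the workstation is a spare that nobody is using
    emp_id INTEGER UNIQUE REFERENCES employee (emp_id)
);
INSERT INTO department VALUES (1, 'Sales'), (2, 'Research');
INSERT INTO employee VALUES (10, 'Ann', 1), (11, 'Bob', 1), (12, 'Cy', 2);
INSERT INTO workstation VALUES (500, 10), (501, 11), (502, 12), (503, NULL);
""")


def workstations_of_department(dept_id):
    """Derive a department's workstations through its employees."""
    rows = conn.execute("""
        SELECT w.ws_id
        FROM department d
        JOIN employee e    ON e.dept_id = d.dept_id
        JOIN workstation w ON w.emp_id  = e.emp_id
        WHERE d.dept_id = ?
        ORDER BY w.ws_id
    """, (dept_id,))
    return [ws_id for (ws_id,) in rows]
```

Because the answer is derived by a join, there is no second copy of the department on the workstation that could drift out of step when an employee changes department.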

Example of documentation for database for Health Services Unit


UWI, St. Augustine Campus

BACKGROUND

The Health Services Unit is located at the St. Augustine campus of the University of the
West Indies.
The Health Services Unit includes several different sections. The staff comprises
medical officers, a nurse, a clerk, a pharmacist and student counselors, all of whom
are interlinked. The Health Service Unit has the following objectives:

To provide a wide variety of readily accessible, basic health services, including
quality medical diagnosis and treatment, counselling and mental health services.

To provide pharmaceutical supplies.

To provide quality, up-to-date health education programs, materials and services.

To provide preventive medical and mental health services and health counseling.

To serve as a resource to other campus and community organizations concerned
with health and safety issues.

To assume an active role in Campus Health, Safety and Prevention issues.

To assist students in achieving their optimal physical and emotional wellness by
promoting healthy life choices through health education counselling programs and
activities, which instill knowledge and skills for lifelong learning.

The Health Service Unit offers the following four Clinics: the Walk-in Clinic, Sexual
Health Clinic, Immunization Clinic and Nutrition and Wellness Clinic. Counseling and a
Pharmacy are also available.

Proposed System
A thorough analysis of the information gathered about the current system at the Health
Service Unit (HSU) was carried out to determine the compatibility of the HSU's
operations with a database application.
This proposed database focuses on the student medical records generated by the
day-to-day operations of three Clinics: Walk-In-Clinic, Sexual Health Clinic and
Immunization Clinic as follows:
Walk-In-Clinic
    Doctor Visits
    Prescriptions
    Referrals
Sexual Health Clinic
    Visits for pap smears, consultations etc.
    Contraceptive transactions.
Immunization Clinic
    Immunizations

The proposed system has been designed to integrate the records so that
they can be accessed from one platform, a facility virtually impossible in the existing
paper-based system. Thus medical personnel at the HSU can access all information about a
patient and make a more informed assessment of the patient's medical condition.
The system also allows the staff to generate reports very easily without having to
dedicate resources to reviewing medical cards. The system was also designed to be as user
friendly as possible, and this is reflected in the design of the database.
It is advisable that a training session be conducted for all users of the system. On-the-job
training may be pursued with the use of the User Guide. The recommended time to
conduct this training session is during the May to August period when student visits are
minimal.
The trainer should be advised of possible resistance to change from personnel who are
unfamiliar with computer technology. This may lead to resistance to learning due to
negative perceptions about ability and low self-esteem. This can be combated, however,
by making personnel aware of this phenomenon.
Also, a refresher course should be conducted whenever there are changes or developments
in the system, or any change in personnel. Staff could also benefit from an additional
development training session once a year.

Benefits vs. Limitations


It was proposed that the database approach for organizing, storing and retrieving data
would provide benefits to the Unit as follows:
1. The use of the relational database design model avoids the duplication of data in
various records. For example, storing student profile information (address, phone
number, gender etc.) in one table only instead of separately in the various clinics.
2. Broader data sharing, which is especially useful in allowing any user of the system to
retrieve information about the availability of patient products. For instance, the nurse can
identify the quantity of Hepatitis B (antigen) in stock, or the doctor will be able to
identify whether a prescribed drug is in stock without having to leave his office or make a
telephone call.
3. The database system provides controls that can be used to protect confidential
information, which is critical for a medical provider.

4. Database processing enables increased efficiency of operations. Ad hoc queries may be
made with easy and quick retrieval of information, thereby boosting the productivity of
the staff members at the Health Service Unit.
5. Strategic planning and decision-making can be based on the extensive supply of
information available. For instance, the medical practitioner will be able to identify areas
in which educational programmes are needed and identify which student groups appear
more susceptible to particular ailments or disorders and thus recommend appropriate
treatment.
Some possible limitations of database processing for the HSU were also considered:
1. To facilitate the implementation of the new system there is a need for hardware and
software. Each doctor, nurse, secretary, and receptionist will need to have ready access to
a computer. Training adds to the cost of the system and would consume time that could
be spent attending to patients.
2. The integration of large volumes of organizational data into a single database is risky
as the magnitude of potential loss in the event of unforeseen circumstances is high.
Backup and recovery procedures to minimize loss would be a necessary additional cost.
3. The behavioural implications of the proposed system are important. Resistance to
change, lack of interest in the utilization of technology and the perceived loss of control
by organizational members are also possible limitations to the application of the proposed
system.

Entity Relationship Diagram


[Diagram: entities STUDENT, EMPLOYEE, WALK-IN CLINIC, SEXUAL HEALTH CLINIC,
IMMUNIZATION CLINIC, PRESCRIPTION, REFERRAL, CONTRACEPTIVE, PRODUCT, MEDICAL
CODES and DIAGNOSIS. Relationship labels: attends, attends to, uses, used by, gets,
issues, is broken down into.]

Relational Schema

Walk-in Clinic (Visit Number, Student ID, Date/Time, Symptoms, Diagnosis, Additional Notes)
Sexual Health Clinic (Visit Number, Student ID, Date/Time, Visit Category, Diagnosis,
Post Dia. Activity)
Immunization Clinic (Imm. Visit No., Student ID, Date/Time, Product ID, Product Name, Dose)
Prescription (Prescription No., Gen. Visit No., Product ID, Quantity, Dosage,
Other Instructions)
Referral (Referral Number, Gen. Visit No., Student ID, Institution Name, Instructions)
Contraceptive (Contraceptive Trans., Sex Clinic Visit No., Product ID, Quantity, Student ID)
Product (Product ID, Product Name, Supplier, Category Name, Quantity, Unit Price)
Medical Code (Medical Code, Diagnosis, Diagnosis Category)
Diagnosis Category (Diagnosis Category ID)
Student (Student ID, Last Name, First Name, Address, Phone No., Medical History)
Employee (Employee ID, Last Name, First Name, Job Description, Title of Courtesy, Birth Date)

Normalization
Banner Student Profile: Table
Unnormalized
Student ID, First Name, Last Name, Date of Birth, Height, Weight, Blood type, Gender,
Nationality, Residence/Term Address, Marital Status, Home Address, Faculty,
Degree Option, Enrollment Status, Phone Contact, E-mail Address, Photograph
1NF (non repeating groups)
Student ID, First Name, Last Name, Date of Birth, Height, Weight, Blood type, Gender,
Nationality, Marital Status, Home Address, Faculty, Degree Option, Enrollment Status,
Photograph
(Repeating groups)
Student ID, Phone Contact, E-mail Address
2NF
Student ID, Faculty, Degree Option, Enrollment Status
Degree Option, Enrollment Status
3NF
Faculty

Contraceptives: Table
Unnormalized
Sexual clinic visit #, Quantity, Student ID, Product ID
1NF
Sexual clinic visit #, Student ID, Product ID, Quantity
2NF
Product ID, Student ID
3NF
Quantity, Product ID

Employees: Table
Unnormalized
Employee ID, First Name, Last Name, Address, Home Phone, Office Phone,
Office Phone Extension, Mobile, Hire Date, Date of Birth, Title of Courtesy, Country,
E-mail Address, Notes, Photograph
1NF (non repeating groups)
Employee ID, First Name, Last Name, Title of Courtesy, Hire Date, Country, Date of Birth,
Home Address, Home Phone, Office Phone, Office Phone Extension, Photograph, Mobile,
E-mail Address, Notes
(Repeating groups)
Employee ID
2NF
The table is already in 2NF so we move on.
3NF
Office Phone Extension, Office Phone

Immunization Clinic: Table
Unnormalized
Immunization visit#, Student ID, Date, Product ID, Product name, Dosage
1NF
Immunization visit#, Date, Student ID, Product ID, Product name, Dosage
2NF
Immunization visit#, Product ID, Student ID, Dosage, Product name
3NF
Product name, Dosage, Product ID

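One reading of the 3NF step above is that Product name and Dosage depend on Product ID rather than on the visit, so they are moved to a table of their own. The sketch below shows that decomposition in plain Python. The field names follow the table above, the sample rows are made up for illustration, and treating Dosage as a property of the product is an assumption.

```python
# Made-up sample rows in the unnormalized layout (one row per immunization visit).
unnormalized = [
    {"visit_no": 1, "student_id": "S001", "date": "2004-03-01",
     "product_id": "P10", "product_name": "Hepatitis B vaccine", "dosage": "1.0 ml"},
    {"visit_no": 2, "student_id": "S002", "date": "2004-03-01",
     "product_id": "P10", "product_name": "Hepatitis B vaccine", "dosage": "1.0 ml"},
    {"visit_no": 3, "student_id": "S001", "date": "2004-03-08",
     "product_id": "P20", "product_name": "Tetanus vaccine", "dosage": "0.5 ml"},
]

PRODUCT_FIELDS = ("product_name", "dosage")  # assumed to depend on product_id only


def decompose(rows):
    """Split visit rows into a product table and a visit table.

    Returns (products, visits): products maps product_id to its name and
    dosage, stored once; visits keeps only the key of the product.
    """
    products = {}
    visits = []
    for row in rows:
        details = {field: row[field] for field in PRODUCT_FIELDS}
        stored = products.setdefault(row["product_id"], details)
        if stored != details:
            # The same product described two different ways: the dependency
            # assumed above does not hold for this data.
            raise ValueError(f"conflicting details for {row['product_id']}")
        visits.append({k: v for k, v in row.items() if k not in PRODUCT_FIELDS})
    return products, visits


products, visits = decompose(unnormalized)
```

After the split, the name of a product is stored once, so renaming it means changing one entry instead of every visit row that mentions it.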
Prescriptions: Table
Unnormalized
Prescription #, General Visit #, Product ID, Student ID, Quantity, Dosage,
Other Instructions
1NF (non repeating groups)
Prescription #, General Visit #, Student ID
(Repeating groups)
Prescription #, Product ID, Quantity, Dosage, Other Instructions
2NF
The table is already in 2NF so we move on.
3NF
Quantity, Dosage, Other Instructions, Product ID

Products: Table
Unnormalized
Product ID, Product Name, Supplier ID, Category Name, Unit Price, Units in stock,
Units on order, Re-order level, Quantity per unit, Discontinued
1NF (non repeating groups)
Product ID, Product Name, Supplier ID, Quantity per unit, Units in stock, Re-order level
(Repeating groups)
Product ID, Unit Price, Units on order
2NF (remove partial dependencies)
There are none so we move on.
3NF
There are no non-key dependencies.

Referrals: Table
Unnormalized
Referral #, General visit #, Student ID, Institution Name, Medical practitioner,
Instructions
1NF
General visit #, Referral #, Student ID, Institution Name
2NF
The table is already in 2NF so we move on.
3NF
Medical practitioner, Instructions

Sexual Health Clinic: Table
Unnormalized
Sexual clinic visit #, Date/Time, Visit category, Student ID, Diagnosis, Employee ID,
Post Diagnosis Action/Notes
1NF
Sexual clinic visit #, Date/Time, Visit category, Employee ID, Student ID
2NF (remove partial dependencies)
There are none so we move on.
3NF
Diagnosis, Post Diagnosis Action/Notes, Student ID

Walk in Clinic: Table
Unnormalized
General visit #, Student ID, Date/Time, Symptoms, Diagnosis, Additional Notes,
Employee ID
1NF
General visit #, Student ID, Date/Time, Symptoms, Diagnosis, Additional Notes,
Employee ID
2NF
Symptoms, Diagnosis, Additional Notes, Student ID
3NF
Diagnosis, Additional Notes, Symptoms