
Data Modeling and Database Design

Minder Chen, Ph.D.


mchen@gmu.edu
[Figure: example entity relationship diagram]
Entity types and attributes:
Employee: Employee number, First name, Last name, Employee function, Employee salary
Team: Team number, Specialty
Division: Division number, Division name, Division address
Task: Task name, Task cost
Project: Project number, Project name, Project label, Start date, End date
Customer: Customer number, Customer name, Customer address, Customer activity, Customer telephone, Customer fax
Relationship labels: is assigned to, contains, staffed by, subcontract, member, is a member of, belongs to
Minder Chen, 1993~2002
Data Modeling - 2 -
Data Modeling and Database Design Course Outline
INTRODUCTION
Introduction to Data Modeling
Database Development Life Cycle Overview
ENTITY AND RELATIONSHIP
Develop the Subject Area Diagram
Develop Preliminary Data Model: Entity & Relationship
Identification
ATTRIBUTES AND SUBTYPES
Attributes Identification and Definition
Develop Fully Attributed Data Model
Identifiers
Data Modeling Exercise
Partitioning and Entity Subtypes
NORMALIZATION
Normalization
Normalization Exercise
De-normalization
DATA MODEL EVALUATION AND MAPPING TO RELATIONAL DBMS
Refine a Data Model: Analysis and Simplification
Transform to Physical Data Base Design
PowerDesigner: Data Architect
Physical DB Design and Data Warehouse DB Design
References
Data Modeling and Database Design
1. Batini, Ceri, Navathe, Conceptual Database Design, Redwood City, CA: The
Benjamin/Cummings Publishing Company, Inc., 1992.
2. Teorey, T. J., Database Modeling and Design: The Entity-Relationship
Approach, Morgan Kaufmann Publishers, Inc., 1990.
3. Bruce, Thomas A., Designing Quality Databases with IDEF1X Information
Models, New York, NY: Dorset House Publishing, 1991.
4. Texas Instruments, A Guide to IE Using IEF, 2nd edition, Part No. 2739756-0001,
1990.
5. Martin, James, Information Engineering Book II: Planning and Analysis,
Prentice-Hall Inc., 1989.
6. Ensor, Dave and Stevenson, Ian, Oracle Design, O'Reilly & Associates, 1997.
7. Gillette, Rob, et al., Physical Database Design for Sybase SQL Server, Prentice
Hall, 1995.
8. Ralph Kimball, The Data Warehouse Toolkit, Wiley, 1996.

JAD References
1. August, J. H., Joint Application Design: The Group Session Approach to
System Design. Englewood Cliffs, NJ: Prentice Hall, Inc., 1991.
2. Wood, J. and Silver, D., Joint Application Design: How to Design Quality
Systems in 40% Less Time. New York, NY: John Wiley & Sons, 1989.
3. Andrews, D. C. and Leventhal, N. S., Fusion: Integrating IE, CASE, and JAD: A
Handbook for Reengineering the Systems Organization, Englewood Cliffs, NJ:
Yourdon Press, 1993.
Data Modeling and Database Design: INTRODUCTION
Systems Development Life Cycle (SDLC)
in a Client/Server Environment
Introduction to Data Modeling
Database Development Life Cycle
Overview
Rationales for Data Modeling
Data is the foundation of modern information
systems enabled by data base technologies.
Data in an organization exist and can be described
independently of how these data are used.
Data should be managed as a corporate-wide
resource.
The types of data used in an organization do not
change very much.
Data have certain inherent properties which lead to
correct structuring.
If we structure data according to their inherent
properties, the structure (i.e., data models) will be
stable.
History of Data Modeling
Importance of Entity-Relationship Modeling Technique
Database
Data modeling and enterprise-wide data
Data quality
Data updating and accessing tools and procedures
Data sharing culture
ER modeling technique was first developed by Peter Chen
in 1976
A conceptual/logical data modeling tool
A user-oriented approach
A graphic-based method
The ER modeling technique is the major data modeling method
in Information Engineering and is widely supported by
most CASE tools.
Data modeling is the foundation of most database-centered
transaction processing systems and data warehouse
systems
CSC Development Strategies
RE-CREATE new business process &
systems from scratch
RE-ENGINEER business process &
systems
RE-DESIGN current systems
RE-HOST current systems
RE-IMAGE current systems
[Figure: the strategies above are ranked on a LOW to HIGH scale along four dimensions: Degree of Change, Risk, Long Term Reward, and Short Term Costs.]

Distribution of Business Function (Logic)
[Figure: layers distributed between Client and Server: Presentation Space, Presentation Service, Presentation Logic, Function Logic, Data Logic, Data Service, Data Space.]
Presentation logic
Local input validation
Output production logic
Local peripheral drivers
Performance critical processing
Functions that access data
on the server
Functions that need input
from multiple users
Functions that coordinate
the work of several users
Issues:
Distribution of data
Platform-specific capabilities and interoperability
Connectivity capabilities/platform
Frequency of change to codes
Configuration management
C/S Development Methodology
Columns are the C/S Architecture components; rows are the SDLC stages.
SDLC stage           User Interface    Application Logic           Information & Data Base
Conceptual Analysis  Work Flow         Process Flow                Data Model
Logical Design       Form Sequences    Object Interaction Model    Database Schema
Physical Design      Forms, Screens    Programs, Procedures        Tables, Indexes
Annotations on the figure: rules =>, performance =>
Source: David Vaskevitch, Client/Server Strategies, IDG Books, 1993.
Client/Server Application Development Methodology
Requirements
Information
& Data Base
Processes
Behavior
Workflow
User Interface
Architecture
Application
Design and
Development
Source: David Vaskevitch, Client/Server Strategies, IDG Books, 1993.
Data Modeling (Data Base Design) Process
Information Requirements
-> Conceptual DB Design -> Conceptual (Enterprise) DB Schema
-> Logical DB Design -> Logical DB Schema
-> Physical DB Design -> Physical DB Schema
A conceptual DB schema is a high-
level description of the database,
independent of the particular DBMS.
A logical DB schema is a description of
the structure of the database that can be
processed by a DBMS: relational, network,
or hierarchical.
A physical DB schema is a description of
the implementation of the database in
external memory; it describes the storage
structures and access methods used in
order to effectively access and maintain
data.
Source: Batini, C., Ceri, S., and Navathe, S. B., Conceptual Database Design: An Entity-
Relationship Approach, The Benjamin/Cummings Publishing Company, Inc., 1992.
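As a rough illustration of the last two levels, the sketch below uses Python's built-in sqlite3 module. The table and index names (customer, idx_customer_name) are hypothetical and not part of the course material. The CREATE TABLE statement stands for a logical schema that a relational DBMS can process, while the CREATE INDEX statement stands for one physical design decision (an access method).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical DB schema: structure the DBMS can process (tables, columns, keys).
conn.execute(
    """
    CREATE TABLE customer (
        customer_number  INTEGER PRIMARY KEY,
        customer_name    TEXT NOT NULL,
        customer_address TEXT
    )
    """
)

# Physical DB schema (one aspect): an access method chosen for performance.
conn.execute("CREATE INDEX idx_customer_name ON customer (customer_name)")

# Adding or dropping the index changes only the physical level; queries
# against the logical schema are written the same way either way.
index_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND tbl_name = 'customer'"
    )
]
print(index_names)
```

The conceptual schema has no counterpart here because it is, by definition, independent of any particular DBMS.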
Multiple Perspectives
ONE BUSINESS
We use this data (DATA): EMPLOYEE, ...
We do these things (ACTIVITY): HIRE EMPLOYEE, PAY EMPLOYEE, PROMOTE EMPLOYEE, FIRE EMPLOYEE, ...
Data Model (Entity Relationship Diagram)
[Figure: ERD with the entity types Member Agreement, Club, Member, Order, Product, and Promotion.
Relationship label pairs: is enrolled under / applies to; established by / established; sponsors / is sponsored by; is featured in / features; generates / generated by; sells / is sold on; placed by / places.]
Entity Relationship Diagram: Subject Area and Entity Type
Subject Area and Subject Area Diagram
Entity Types
Entity Instances
Finding Entity Types
Evaluating Entity Types
Subject Area (Submodel)
A natural area of interest to the business that is centered
on a major resource, inputs, outputs, or activity of the
business.
It contains a set of entity types.
We start the data modeling in the ISP stage by identifying
subject areas with names and descriptions.
In the BAA stage, subject areas are used as a high-level
grouping of entity types.
Naming: a subject area is a noun in plural form and often
has the same name as the central entity type in the subject area.
Example: the subject area Projects groups the entity types
Project, Member, and Task.
Subject Area Diagram
[Figure: subject areas Customers, Orders, Products, Sales-persons, Raw-materials, Purchase Orders, Buyers, and Suppliers, connected by associations. The legend identifies the symbols for Subject Area and Association.]
Entity Types
Definition:
An entity is an object or event, real or abstract, about
which we would like to store data. "Entity" is used as an
abbreviation of "entity type". It represents a set of
entity instances which can be described by the
same set of attribute types. The value of the same
attribute may differ from one entity instance to another.
Identifying Entity Types
What information is required by the business?
Things that are of interest to the business and that need
to be remembered in order to manage and track
them.
Things that belong to the same entity type have common
characteristics.
Naming Entity Types
The name of each entity is in singular form
a noun
an adjective + a noun
a noun + a noun => (noun string)
an adjective + a noun + a noun
Examples
Customer, Customer Order, Product, Hourly Employee, Project,
Department, Unfilled Customer Order
Be clear and concise
Avoid abbreviations
Be consistent with users' terminology
Identify synonyms
Customer = Client
Product = Merchandise
Supplier = Vendor
Teacher = Faculty
Use one name as the official name and document the others
as aliases


Exercise: Entity Type Naming
Courses
Department
Customer Order
PO
Properties of Entity Types
Name
Description
Identifier
Properties: Estimated number (Max., Min.,
Average) of entity instances
Expected growth rate of entity instances
Subject Area in which the Entity Type
resides
Attributes that describe the Entity Types
Examples of entity type instances
Definition of an Entity Type
A poor definition of Customer: Anyone
that buys something from the company.
Can an employee be a customer?
Can a lessor be a customer?
If the company sold a subsidiary to another
company, is the new owner considered a
customer?
A good definition should be:
Compatible
Precise
Concise
Clear
Complete
Good Definition
Compatible
Customer: An ORGANIZATION that purchases
PRODUCTs for personal use.
Distributor: An ORGANIZATION that purchases
PRODUCTs for resale.
Precise:
With appropriate qualifiers
Example: An ORGANIZATION is considered to have
purchased a PRODUCT when we receive a valid
PURCHASE ORDER from it.
Complete
ORGANIZATION, PRODUCT, PURCHASE ORDER
need to be defined.
Concise and Clear
Use modular definition
Example of Entity Type Descriptions
Entity Type     Description
Customer        Information about all persons or organizations who purchase Products.
Product         All goods manufactured and sold.
Raw-material    Components used to manufacture Products.
Supplier        Vendors of Raw-materials.
Buyer           Company personnel responsible for purchasing Raw-materials from Suppliers.
Entity Type and Entity Instance (Occurrence)
Entity Type     Entity Instance
Vendor          ABC Co.
Employee        John Smith
Course          Intro. to IE
Department      Marketing Department

Exercise: Entity Types or Entity Instances?
Maryland
Organization Unit
Customer
President
Bill Clinton
Department of Commerce
Address
Finding Entity Types
Interviews with users
JAD workshops
Business forms
Reports
Computer files using reverse engineering
Operation manuals
Where to Look for an Entity Type?
Tangible or Intangible Things
The nouns that are used to describe the problem domain will often
correspond to the major Entity Types of the system, at least at a
high level.
Examples: Product, Sensor, Employee, Department, and Sales
Office.
Resources
Any resources that an organization needs to manage should be
represented as an Entity Type. Information assists the efficient
and effective use of other resources through improved decisions.
Examples: Inventory, Machine, Bank Account, and Customer.
Roles Played
Roles can be played by persons or organizational units.
Examples: Customers, Managers, and Account representatives.
Events
Events are incidents that occur at points in time. An event often
involves an interaction between two Entity Types or an action that
changes the status of an Entity Type.
Examples: Sale, Delivery, and Registration of a motor vehicle.

BIAIT: Business Information Analysis and Integration Technique
Analysis of Orders
Ordered entities can be a thing, a space, or a skill.
View the order from supplier side.
If an organization receives no orders, it has no reason
for existing.
An organization unit can receive multiple types of
orders.
4 questions about the Supplier:
Billing (Cash)?
Deliver Late (Immediate)?
Profile customer?
Negotiate price (Fixed)?
3 questions about the Ordered Entity:
Rented (Sold)?
Tracked?
Made to order (Stock)?
Source: Carlson, W. M., "BIAIT: Business Information Analysis and Integration Technique -
The New Horizon," Data Base, Vol. 10, No. 4, 1979, pp. 3-9.
Criteria for Evaluating an Entity Type
Needs to be remembered by the information system in order
for the system to be functional.
Can be operated on: CREATE, READ, UPDATE, DELETE.
Has a set of operations/services that always apply to
change the status of each occurrence of the Entity Type.
Carries a set of attributes that always apply to describe each
occurrence of the Entity Type.
Has at least one relationship with another entity type.
Has more than one entity occurrence (instance) in the
Entity Type.
Has at least one unique identifier.
Domain-based requirements: Something that the system
must have in order to operate. These may be clearly
specified in the problem description or known from subject
matter experts.
Entity Relationship Modeling and Diagramming
Relationships
Entity Relationship Diagramming
Notation
Attributes
Identifiers
Partitioning and Entity Subtypes
Relationship (Type)
Definition
A Relationship Type is an association among Entity
Types. It indicates that there is a business
relationship between these Entity Types.
Relationship Membership is the participation of an
Entity Type in a Relationship.
In IE, a Relationship Type can involve only two Entity
Types (binary relationship). Some other modeling
techniques allow n-ary relationships.
Examples
CUSTOMER places ORDER
ORDER is placed by CUSTOMER
EMPLOYEE works on PROJECT
PROJECT has project member EMPLOYEE
Pairing (Relationship Instance)
Entity Type     Entity Instances
Student         Student#1, Student#2
Course          Course#A, Course#B, Course#C, Course#D
Relationship: Student takes Course
Relationship Pairings:
Student#1 takes Course#A
Student#1 takes Course#B
Student#1 takes Course#D
Student#2 takes Course#A
Student#2 takes Course#C
Student#2 takes Course#D
A relationship pairing is a pair of Entity Instances of two
Entity Types associated by a Relationship Type between
these two Entity Types.
Relationship Instances Grouping
Definition: A collection of pairings of
a Relationship Membership in which
an Entity Instance is involved.
Examples:
Student#1 takes Course#A, #B, and #D
Student#2 takes Course#A, #C, and #D
Course#A is taken by Student#1 and
Student#2
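The pairings and groupings above can be sketched in a few lines of Python. The pairing data is copied from the Student takes Course example; the function name group_pairings is a hypothetical helper introduced only for illustration.

```python
from collections import defaultdict

# Relationship pairings for "Student takes Course", copied from the example.
pairings = [
    ("Student#1", "Course#A"),
    ("Student#1", "Course#B"),
    ("Student#1", "Course#D"),
    ("Student#2", "Course#A"),
    ("Student#2", "Course#C"),
    ("Student#2", "Course#D"),
]


def group_pairings(pairs, by_first=True):
    """Collect the pairings in which each entity instance is involved."""
    groups = defaultdict(list)
    for first, second in pairs:
        key, other = (first, second) if by_first else (second, first)
        groups[key].append(other)
    return dict(groups)


# One grouping per relationship membership (Student side, Course side).
courses_by_student = group_pairings(pairings)
students_by_course = group_pairings(pairings, by_first=False)

print(courses_by_student["Student#1"])
print(students_by_course["Course#A"])
```

Each dictionary value is one Relationship Instances Grouping: all pairings that involve a single entity instance, seen from one side of the relationship.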
Relationship Cardinality
[Figure: instance diagrams between entity types E1 and E2 illustrating One-to-One (1:1), One-to-Many (1:M), and Many-to-Many (M:N) cardinality.]
Relationship Cardinality
The number of Entity Instances involved in the Relationship
Instances Grouping in a Relationship Type.
Three Forms of Cardinality
1. One-to-one (1:1)
DEPARTMENT has MANAGER
Each DEPARTMENT has one and only one MANAGER
Each MANAGER manages one and only one DEPARTMENT

2. One-to-many (1:m)
CUSTOMER places ORDER
Each CUSTOMER sometimes (95%) places one or more ORDERs
Each ORDER always is placed by exactly one CUSTOMER

3. Many-to-many (m:n)
INSTRUCTOR teaches COURSE
Each INSTRUCTOR teaches zero, one, or more COURSEs
Each COURSE is taught by one or more INSTRUCTORs
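One common way to carry the three forms of cardinality into relational tables is sketched below with Python's sqlite3 module. This is an illustration under assumptions, not the course's prescribed mapping: all table and column names are hypothetical, and only the "at most one" side of each rule is enforced by the constraints shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    -- 1:m  Each ORDER is placed by exactly one CUSTOMER.
    CREATE TABLE customer (customer_no INTEGER PRIMARY KEY);
    CREATE TABLE customer_order (
        order_no    INTEGER PRIMARY KEY,
        customer_no INTEGER NOT NULL REFERENCES customer (customer_no)
    );

    -- 1:1  UNIQUE on the foreign key stops one MANAGER from managing
    --      two DEPARTMENTs.
    CREATE TABLE manager (manager_no INTEGER PRIMARY KEY);
    CREATE TABLE department (
        department_no INTEGER PRIMARY KEY,
        manager_no    INTEGER NOT NULL UNIQUE REFERENCES manager (manager_no)
    );

    -- m:n  A separate table holds one row per INSTRUCTOR/COURSE pairing.
    CREATE TABLE instructor (instructor_no INTEGER PRIMARY KEY);
    CREATE TABLE course (course_no INTEGER PRIMARY KEY);
    CREATE TABLE teaches (
        instructor_no INTEGER NOT NULL REFERENCES instructor (instructor_no),
        course_no     INTEGER NOT NULL REFERENCES course (course_no),
        PRIMARY KEY (instructor_no, course_no)
    );
    """
)

conn.execute("INSERT INTO manager VALUES (1)")
conn.execute("INSERT INTO department VALUES (10, 1)")
try:
    # A second department for the same manager violates the 1:1 rule.
    conn.execute("INSERT INTO department VALUES (20, 1)")
    second_department_allowed = True
except sqlite3.IntegrityError:
    second_department_allowed = False
print(second_department_allowed)
```

The only structural difference between the 1:m and 1:1 cases is the UNIQUE constraint on the foreign key column.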
Entity Relationship Diagram (ERD): Notations
Graphical Notations
[Figure: Entity-X and Entity-Y joined by a line labeled relationship-description and reversed-relationship-description. Each end of the line carries a cardinality indicator made of a min and a max symbol: zero, one, or many.]
Example: Department is-managed-by Manager; Manager manages Department
Translate into two structured statements:
Each Entity-X relationship-description cardinality-indicator (one-or-many) Entity-Y
Each Entity-Y reversed-relationship-description cardinality-indicator (zero-or-one) Entity-X
Optionality of Relationship Memberships
Whether all entity instances of both entity
types need to participate in relationship
pairing.
Optionality:
Mandatory
Optional
Example:
CUSTOMER membership is optional
ORDER membership is mandatory
CUSTOMER
ORDER
places
is placed by
Relationship Statements
Graphical Notations
[Figure: CUSTOMER places ORDER; ORDER is placed by CUSTOMER]
Cardinality indicator: one, one or more
Optionality indicator: zero (sometimes), one (always)
Each CUSTOMER sometimes places one or more ORDERs.
Each ORDER always is placed by one CUSTOMER.
Each Entity X optionality relationship cardinality Entity Y
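The statement pattern can be turned into a small generator, sketched below. The function name and its parameters are hypothetical, and the pluralization (appending "s") is deliberately naive.

```python
def relationship_statement(entity_x, verb, entity_y, optional, many):
    """Build 'Each X optionality relationship cardinality Y'."""
    optionality = "sometimes" if optional else "always"
    cardinality = "one or more" if many else "one"
    # Naive plural: good enough for names such as ORDER or COURSE.
    target = entity_y + "s" if many else entity_y
    return f"Each {entity_x} {optionality} {verb} {cardinality} {target}."


forward = relationship_statement(
    "CUSTOMER", "places", "ORDER", optional=True, many=True
)
reverse = relationship_statement(
    "ORDER", "is placed by", "CUSTOMER", optional=False, many=False
)
print(forward)
print(reverse)
```

Reading both generated statements aloud to a user is a quick way to check that the optionality and cardinality drawn on the diagram match the business rule.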
Defining Relationships
Name
Description
Property
Cardinality volumes
Optionality percentage: % of Entity Type X's
instances pairing with Entity Type Y's
instances
Transferability: A relationship is transferable if
an entity instance can change its pairing within
the same relationship.
TRANSFERABLE: An EMPLOYEE can change to a
different DEPARTMENT.
NON-TRANSFERABLE: An ORDER cannot be
transferred to another CUSTOMER.
ERD: More Examples

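[Figure: three example ERDs, panels (a), (b), and (c)]
Parallel Relationship: Employee manages Project / Project is-managed-by Employee; Employee works-for Project / Project has-project-members Employee
Involuted or Looped Relationship: Part consists-of Part / Part contained-in Part
Customer places Order / Order belongs-to Customer; Order contains Product / Product is-contained-in Order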
ERD: Alternative Notations
[Figure: the relationship "Customer places Order / Order belongs-to Customer" drawn in several alternative notations, including one that marks the two ends with 1 and M.]
Identifying Relationships
Association between entity types
Entity types that are used on the
same forms or documents.
A description in a business document
that has a verb that relates two entity
types
has
consists of
uses
Attributes
Definition
Characteristics that could be used to describe Entity Types and
Relationship Types. However, in IE, relationship types are not
allowed to have attributes.
Naming Conventions:
Names that have business meaning
Don't use abbreviation or possessive case, e.g., PN and
Customer's name
Don't include entity type name because IEF will prefix the attribute
name with entity type name automatically
Use standard format:
Entity Type Name (Qualifiers) Domain Name
Customer Name
Employee Starting Date
Examples
Customer has customer name, address, and telephone number
Product has quantity-on-hand, weight, volume, color, and name.
Employee has SSN, salary, and birthday.
Employee-works-for-project has percentage-of-time, starting-date.
Attributes: Notations
[Figure: alternative graphical notations for attributes. Labels shown: Student with Student ID, Student Name, Birth date; an enrollment relationship with Student ID, Course no., Birth date; Employee with Employee number, First name, Last name, Employee function, Employee salary; Student with studentID, name, phone.]
Textual notation: Student(Student ID, Student Name, Birth Date)
Finding Attributes:
Attributes are identified progressively during the BAA phase.
Data Analysis
Activity Analysis
Interaction Analysis
Current Systems Analysis
Attribute Value
Definition
Attribute Values are instances of Attributes used to describe
specific Entity Instances
Examples
Customer Number: 011334
Customer Name: Minder Chen
State: VA
Order Total: $23,000
Sale tax: $250
An attribute of an entity type should have only one value
at any given time. (No repeating groups)
Avoid using a complex coding scheme for an attribute.
For example: PART Number: X-XXX-XXX
(segments: Part Type, Material, Sequence Number)
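One way to follow this advice is to store each component as its own attribute. The sketch below assumes that the three segments of X-XXX-XXX correspond, in order, to Part Type, Material, and Sequence Number; the class name, function name, and sample value are hypothetical.

```python
from typing import NamedTuple


class Part(NamedTuple):
    part_type: str
    material: str
    sequence_number: str


def split_part_number(coded: str) -> Part:
    """Split a coded X-XXX-XXX part number into separate attributes."""
    segments = coded.split("-")
    if [len(s) for s in segments] != [1, 3, 3]:
        raise ValueError(f"expected the form X-XXX-XXX, got {coded!r}")
    return Part(*segments)


part = split_part_number("A-123-456")  # hypothetical sample value
print(part)
```

With separate attributes, a question such as "which parts use material 123?" becomes a simple comparison instead of a substring search on the coded number.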
Type & Instance
OBJECT TYPE              OCCURRENCE
Entity Type              Entity Instance
Entity                   Entity Instance
Entity Type              Entity

Relationship (Type)      Pairing (Relationship Instance)

Attribute (Type)         (Attribute) Value
Attribute Source Categories
Basic
Definition: An Attribute Value that cannot be deduced
or calculated.
Examples: Student name and Birthday
Derived
Definition: The Attribute Value can be calculated or
deduced from relationship Groupings or from the
values of other Attributes. The value of a Derived
Attribute changes constantly.
Examples: Student Age, Account Balance, Number of
courses taken.
Designed
Definition: The Attribute is created to overcome the
system constraints. The value of a Designed
Attribute does not change.
Examples: Student ID, Course number.
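The difference between a basic and a derived attribute can be sketched as follows: Birthday is stored, while Student Age is computed whenever it is needed. The function name and the sample dates are hypothetical.

```python
from datetime import date


def student_age(birthday: date, today: date) -> int:
    """Derive age in whole years from the basic attribute Birthday."""
    had_birthday = (today.month, today.day) >= (birthday.month, birthday.day)
    return today.year - birthday.year - (0 if had_birthday else 1)


birthday = date(1980, 6, 15)  # hypothetical sample value
age_day_before = student_age(birthday, date(2002, 6, 14))
age_on_birthday = student_age(birthday, date(2002, 6, 15))
print(age_day_before, age_on_birthday)
```

Because the derived value changes with the date of the query, storing it would require constant updating; computing it from the basic attribute avoids that.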
Data Types

Properties of Attributes
Name
Description
Attribute Source Category: Basic, Derived, Designed
Domain or data type: Text, Number, Date, Time, Timestamp
Optionality: Mandatory or optional
Length and/or precision
Permitted Values (Legal Values)
Ranges
A set of values (Code Table)
Default value or algorithm

Tools such as PowerBuilder have additional properties for
table columns called extended attributes
Validation Rule
Editing Format
Reporting Format
Column Heading
Form Label
Code Table
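Several of the attribute properties listed above map onto column constraints in SQL. The sketch below uses Python's sqlite3 module; the table, the type codes 'F' and 'P', and the salary range are hypothetical values chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee (
        employee_number INTEGER PRIMARY KEY,
        last_name       TEXT NOT NULL,                        -- mandatory
        first_name      TEXT,                                 -- optional
        employee_type   TEXT NOT NULL DEFAULT 'F'             -- default value
                        CHECK (employee_type IN ('F', 'P')),  -- set of values
        salary          NUMERIC
                        CHECK (salary BETWEEN 0 AND 500000)   -- range
    )
    """
)

# Omitting employee_type lets the default value apply.
conn.execute(
    "INSERT INTO employee (employee_number, last_name) VALUES (1, 'Smith')"
)
default_type = conn.execute(
    "SELECT employee_type FROM employee WHERE employee_number = 1"
).fetchone()[0]

try:
    # 'X' is not in the permitted set, so the CHECK constraint rejects it.
    conn.execute(
        "INSERT INTO employee (employee_number, last_name, employee_type) "
        "VALUES (2, 'Jones', 'X')"
    )
    illegal_value_accepted = True
except sqlite3.IntegrityError:
    illegal_value_accepted = False

print(default_type, illegal_value_accepted)
```

Presentation properties such as editing format, column heading, and form label have no counterpart in the table definition; they stay in the tool or application layer.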
Composite Attribute
Definition:
Example:
Telephone Number =
Area code + Exchange + Extension
Most CASE tools do not support composite attribute
types. In such cases, a composite attribute must
be stored as an entity type.
Domain
A collection of values which can be taken by one
or more attributes.
Date is the domain for Ordered Date, Student's
Birthday, Employee Starting Date.
A user-defined domain can have customized
validation rules and formats.
CASE tools such as IEF support only the
following basic domains:
Text
Number
Date
Time
Timestamp
Identifiers
The identifier of an entity type is a set of
attributes and/or relationships whose
values can uniquely identify an entity.
Entity types should have one identifier.
Identifiers may consist of
A single attribute: Student ID
A set of attributes: Student ID + Course ID
An attribute and a relationship membership
(implemented as a foreign Key): Order Item No +
Order Has Order Item
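The third kind of identifier, an attribute plus a relationship membership, can be sketched as a composite primary key that includes a foreign key. The table and column names below are hypothetical; the example uses Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE customer_order (order_no INTEGER PRIMARY KEY);

    -- Identifier of order_item = Order Item No + the order it belongs to.
    CREATE TABLE order_item (
        order_no      INTEGER NOT NULL REFERENCES customer_order (order_no),
        order_item_no INTEGER NOT NULL,
        PRIMARY KEY (order_no, order_item_no)
    );

    INSERT INTO customer_order VALUES (1), (2);
    -- Item number 1 may repeat across orders, but not within one order.
    INSERT INTO order_item VALUES (1, 1), (1, 2), (2, 1);
    """
)

try:
    conn.execute("INSERT INTO order_item VALUES (1, 1)")
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False

item_count = conn.execute("SELECT COUNT(*) FROM order_item").fetchone()[0]
print(duplicate_accepted, item_count)
```

Order Item No alone is not unique; it identifies an item only in combination with the order that contains it.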
Identifying Relationship
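[Figure: ERD for the ORDERS subject area with the entity types customer, order, order item, and product. Relationship labels: places / is placed by; contains / is part of; is ordered by / has. The diagram marks the symbol for an Identifying Relationship.]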
Data Modeling Case Study
The following is a description by a pharmacy owner:
"Jack Smith catches a cold and what he suspects is a
flu virus. He makes an appointment with his family
doctor who confirms his diagnosis. The doctor
prescribes an antibiotic and nasal decongestant
tablets. Jack leaves the doctor's office and drives to
his local drug store. The pharmacist packages the
medication and types the labels for pill bottles. The
label includes information about the customer, the doctor
who prescribed the drug, the drug (e.g., Penicillin),
when to take it, and how often, the content of the pill
(250 mg), the number of refills, expiration date, and the
date of purchase."

Please develop a data model for the entities and relationships
within the context of pharmacy. Also develop a definition
for "prescription". List all your underlying assumptions
used in your data models.
Data Modeling Case Study
Given the following narrative description of entities and
their relationships, prepare a draft entity relationship
diagram (ERD). Be sure to state any reasonable assumptions that
you are making.
Burger World Distribution Center serves as a supplier
to 45 Burger World franchises. You are involved with
a project to build a database system for distribution.
Each franchise submits a day-by-day projection of
sales for each of Burger World's menu products (the
products listed on the menu at each restaurant) for
the coming month. All menu products require
ingredients and/or packaging items. Based on
projected sales for the store, the system must
generate a day-by-day ingredients need and then
collapse those needs into one-per-week purchase
requisitions and shipments.

Data Modeling Process
List entity types
Create relationships
Pick a central entity type
Work around the neighborhood
Add entity types to the diagram
Build relationships among them
Determine cardinalities of relationships
Find/Create identifiers for each entity type
Add attributes to the entity type in the data
model
Analyze and revise the data model
Classifying Attribute and Partitioning
Entity Subtype: A collection of Entities of the same
type to which a narrower definition and additional
Attributes and Relationships apply. An Entity Subtype
inherits (retains) all the Attributes and Relationships of
its parent Entity Type.
Classifying Attribute: An attribute of the Base Entity Type
whose values partition the Entity Instances into
Subtypes.
Partitioning: A basis for subdividing one entity type into
subtypes. The process of dividing an Entity Type into
several Subtypes based on a Classifying Attribute is
called Partitioning.
The Classifying Attribute is recorded as a property of the
Partitioning and it appears on the diagram.
Characteristics of Partitioning
Optionality:
Mandatory: Every Entity instance of the Entity Type
must fall into one of the Subtype categories.
Optional: Not every Entity instance of the Entity
Type must fall into one of the Subtype categories.
Entity Life Cycle: The states through which an
Entity Type can pass are used for Partitioning.
Enumeration:
Fully enumerated
Not fully enumerated
Classifying Attributes and Values
Classifying Attribute: Type
D: Domestic Subtype
F: Foreign Subtype
Partitioning and Entity Subtype: Notation
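[Figure: Employee (attributes: Employee ID, Name, Birthday) partitioned into entity subtypes. Labels on the diagram: Lecturer (attribute: Teaching Quality Indicator; relationship: Teaches Seminar), Staff, Wage, Hourly; the partitionings are labeled Type and Status.]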
Alternative Notations for Subtypes
IDEF1X:
employee (employeeID, name, phone)
Complete Category (all categories shown), discriminated by employee type:
full-time-emp (employeeID (FK), salary)
part-time-emp (employeeID (FK), hourly-rate)
PowerDesigner:
Account (Account Number, Name)
Subtypes: Savings (Rate), Checking (Fees)
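The employee example from this slide can be sketched as one table per entity type, with each subtype table sharing the parent's key. This is one possible relational mapping, not the only one; the type codes 'FT' and 'PT' and the sample rows are hypothetical, and hyphens in the slide's names are replaced by underscores to form valid SQL identifiers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE employee (
        employeeID    INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        phone         TEXT,
        -- classifying attribute for the partitioning
        employee_type TEXT NOT NULL CHECK (employee_type IN ('FT', 'PT'))
    );
    -- Each subtype row reuses the parent's key as both PK and FK.
    CREATE TABLE full_time_emp (
        employeeID INTEGER PRIMARY KEY REFERENCES employee (employeeID),
        salary     NUMERIC NOT NULL
    );
    CREATE TABLE part_time_emp (
        employeeID  INTEGER PRIMARY KEY REFERENCES employee (employeeID),
        hourly_rate NUMERIC NOT NULL
    );
    INSERT INTO employee VALUES (1, 'Ann', NULL, 'FT'), (2, 'Bob', NULL, 'PT');
    INSERT INTO full_time_emp VALUES (1, 52000);
    INSERT INTO part_time_emp VALUES (2, 18.5);
    """
)

# A subtype inherits the parent's attributes through the shared key.
full_time = conn.execute(
    """
    SELECT e.name, f.salary
    FROM employee AS e
    JOIN full_time_emp AS f ON f.employeeID = e.employeeID
    """
).fetchall()
print(full_time)
```

Note that these constraints do not stop an employee typed 'FT' from also having a part_time_emp row; enforcing that rule would need additional logic.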
Entity Subtype Partitioning
Life Cycle Partitioning
[Figure: Order partitioned by the classifying attribute Order Status into the subtypes Taken, Scheduled, Shipped, Billed, and Paid.]
Normalization
A data base is a model or an image of the
reality.
Logical Data Base Design is a process of
modeling and capturing the end-user
views of an application domain and
synthesizing them into a data base structure.
Normalization is a logical data base design
method.
The basis for normalization is the
functional dependencies among attributes
in a table.
SQL Terminology
Product Table (each line below the header is a Row; each field is a Column)
p_no  product_name  quantity  price
101   Color TV      24        500
201   B&W TV        10        250
202   PC            5         2000

Create a table in SQL
CREATE TABLE Product_table
(p_no CHAR(5) NOT NULL,
product_name CHAR(20),
quantity SMALLINT,
price DECIMAL(10, 2));
SQL Terminology
Set Theory    Relational DB           File         Example
Relation      Table                   File         Product_table
Attribute     Column                  Data item    Product_name
Tuple         Row                     Record       Product_101's info.
Domain        Pool of legal values    Data type    DATE
SQL Principles
The result of a SQL query is always a table (View
or Dynamic Table)
Rows in a table are considered to be unordered
Has dominated the market since the late 1980s
Can be used in interactive programming
environments
Provide both data definition language (DDL) and
data manipulation language (DML)
A non-procedural language
Can be embedded in 3GL:
Embedded SQL
Dynamic SQL
SQL: Data Definition Language (DDL)
CREATE, DROP: TABLE, VIEW, INDEX, DATABASE
ALTER: TABLE
SQL: Introduction
A relational data base is perceived by its users
as a collection of tables
E. F. Codd 1969
Has dominated the market since the late 1980s
Strengths:
Simplicity
End-user orientation
Standardization
Value-based instead of pointer-based
Endorsed by major computer companies
Most CASE products support the development
of relational data base centered applications
SQL: Data Manipulation Language (DML)
SELECT
UPDATE
INSERT
DELETE
p_no product_name quantity price
101 Color TV 24 500
201 B&W TV 10 250
202 PC 5 2000
SELECT [DISTINCT] column(s)
FROM table(s)
[WHERE conditions]
[GROUP BY column(s) [HAVING condition]]
[ORDER BY column(s)]
The Generic Form of the SELECT Statement
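To see the generic form in action, the sketch below loads the product rows shown on this slide into an in-memory SQLite database through Python's sqlite3 module and runs two queries. The table name Product_table follows the terminology slide; the queries themselves are illustrative and exercise only the WHERE and ORDER BY clauses plus one aggregate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Product_table ("
    "p_no CHAR(5) NOT NULL, product_name CHAR(20), "
    "quantity SMALLINT, price DECIMAL(10, 2))"
)
conn.executemany(
    "INSERT INTO Product_table VALUES (?, ?, ?, ?)",
    [
        ("101", "Color TV", 24, 500),
        ("201", "B&W TV", 10, 250),
        ("202", "PC", 5, 2000),
    ],
)

# WHERE limits the rows; ORDER BY sorts the result table.
in_stock = conn.execute(
    "SELECT product_name, price FROM Product_table "
    "WHERE quantity >= 10 ORDER BY price DESC"
).fetchall()

# An aggregate over all rows: total value of the inventory.
inventory_value = conn.execute(
    "SELECT SUM(quantity * price) FROM Product_table"
).fetchone()[0]

print(in_stock, inventory_value)
```

Both results are themselves tables (the second has one row and one column), which matches the principle that the result of a SQL query is always a table.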
Database Table
The following code retrieves only the Last Name and the
Employee ID where the Employee ID is greater than 5. The
records are retrieved in descending order of Employee ID.
SELECT LastName, EmployeeID
FROM Employees
WHERE EmployeeID > 5
ORDER BY EmployeeID DESC

WHERE Clause
WHERE: Use the WHERE clause to limit the
selection. In the third example the # symbol delimits literal
date values (Microsoft Access syntax).

SELECT * FROM Employees
WHERE LastName = 'Smith'

SELECT Employees.LastName FROM Employees
WHERE Employees.State IN ('NY','WA')

SELECT OrderID FROM Orders
WHERE OrderDate BETWEEN #01/01/93# AND #01/31/93#

Keys
A key, also called identifier, is an Attribute or a
Composite Attribute that can be used to
uniquely identify an instance of an entity type.
Examples:
Entity Type         Key
Warehouse           Warehouse Number
Product             Product Number
Student             Student ID or SSN
Ship                Name and Port of Registration
Stock of Product    Product Number and Warehouse No.
Types of Key
Primary Key: An attribute or a
set of attributes that has been used by the DBMS
as the unique identifier of a table.
Candidate (Alternative) Key: An attribute or a set
of attributes that could have been used as the
primary key of a table.
Secondary (Index) Key: An attribute or a set of
attributes that has been used to construct the
data retrieval index.
Concatenated (Combined or Composite) Key: A
set of attributes that has been used as the key.
Foreign Key: An attribute or a set of attributes
that is used as the primary key in another table.
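The key types above can be seen side by side in a table definition. The sketch below uses Python's sqlite3 module and borrows the Student and Stock of Product examples from the Keys slide; the exact table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,   -- primary key
        ssn        TEXT NOT NULL UNIQUE,  -- candidate (alternative) key
        last_name  TEXT NOT NULL
    );
    -- Secondary (index) key: speeds retrieval, need not be unique.
    CREATE INDEX idx_student_last_name ON student (last_name);

    CREATE TABLE warehouse (warehouse_no INTEGER PRIMARY KEY);
    CREATE TABLE product (product_no INTEGER PRIMARY KEY);
    CREATE TABLE stock_of_product (
        -- each column below is a foreign key to another table
        product_no   INTEGER NOT NULL REFERENCES product (product_no),
        warehouse_no INTEGER NOT NULL REFERENCES warehouse (warehouse_no),
        quantity     INTEGER NOT NULL,
        PRIMARY KEY (product_no, warehouse_no)  -- concatenated key
    );
    """
)

try:
    # Product 999 and warehouse 1 do not exist, so the foreign keys fail.
    conn.execute("INSERT INTO stock_of_product VALUES (999, 1, 5)")
    orphan_accepted = True
except sqlite3.IntegrityError:
    orphan_accepted = False
print(orphan_accepted)
```

In SQLite, foreign key checks are off unless the PRAGMA shown at the top is issued for the connection; other DBMSs enforce them by default.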
Purposes of Normalization
Avoid maintenance problems (update anomalies) such as:
Insert: There may be no place to insert new
information.
Delete: Some important information will be
lost by deletion.
Update: Inconsistency may occur because
of the existence of data redundancy.
Provide maximum flexibility to meet future
information needs by keeping tables
corresponding to object types in their
simplified forms.
A Common Sense Approach to Normalization
Don't rush to put all the information in one
table.
Create a table to correspond to a class of
a simple object type that should exist by
itself, i.e., "one fact in one place."
Include common fields (links) as ways of
joining information from several related
tables.
Avoid redundancy by using links to
retrieve data from related tables.
Normalization Theory
Normalization is a process of systematically
breaking a complex table into simpler ones.
It is built around the concept of normal forms.
A relation is in a particular normal form if it
satisfies a specific set of constraints such as
dependencies among attributes in the relation.
For any integer x > 1,
if a relation is in x-NF then it is also in (x-1)-NF.
Higher order normal forms are usually more
desirable than lower order normal forms.
The normalization process usually starts from
complex relations, which are usually drawn
from existing documents such as
business forms.
A Business Form

An Informal Example of Normalization
A CUSTOMER ORDER contains the following
information:
OrderNo
OrderDate
CustNo
CustAddress
CustType
Tax
Total
one or more than one Order-Item which has
ProductNo
Description
Quantity
UnitPrice
Subtotal.
Solution
Unnormalized table
(OrderNo, OrderDate, CustNo, CustAddress, CustType, Tax, Total,
1{ProductNo, Description, Quantity, UnitPrice, Subtotal}n)
Remove repeating group
1st NF
(OrderNo, OrderDate, CustNo, CustAddress, CustType, Tax, Total)
(OrderNo, ProductNo, Description, Quantity, UnitPrice, Subtotal)
Remove partial FD
2nd NF
(OrderNo, OrderDate, CustNo, CustAddress, CustType, Tax, Total)
(OrderNo, ProductNo, Quantity, UnitPrice, Subtotal)
(ProductNo, Description, UnitPrice)
Remove transitive FD
3rd NF
(OrderNo, OrderDate, CustNo, Tax, Total)
(CustNo, CustAddress, CustType)
(OrderNo, ProductNo, Quantity, UnitPrice, Subtotal)
(ProductNo, Description, UnitPrice)

Unnormalized Form
A relation that has multi-valued attributes (repeating
groups).
Normalization Process: Remove Multi-value Attributes
If an unnormalized relation R has a primary key K and a
multi-value attribute M, the normalization process is:
The multi-value attribute M should be removed from R.
A new relation will be created with (K,M) as the primary key of
the relation.
There may be some other attributes associated with this new
relation.
R will then be at least in 1NF.
Example: An Employee relation has an attribute
language-spoken. For some employees there may be
more than one language that they can speak.
EMP (employeeID, empName, empAddress, (language1, language2, ...))

EMP (employeeID, empName, empAddress)
EMP-LANGUAGE (employeeID, language, skillLevel)
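The removal of a multi-value attribute can be sketched in a few lines of Python. The function name remove_multivalued and the second employee (Mary Jones) are hypothetical; rows are modeled as dictionaries and the multi-value attribute as a list.

```python
def remove_multivalued(rows, key, multi):
    """Split rows holding a list-valued attribute into a parent and a child relation."""
    # Parent relation R: every attribute except the multi-value attribute M.
    parent = [{k: v for k, v in row.items() if k != multi} for row in rows]
    # Child relation: one (K, M) pair per value found in the repeating group.
    child = [{key: row[key], multi: value} for row in rows for value in row[multi]]
    return parent, child

emp_unnormalized = [
    {"employeeID": 111, "empName": "John Smith", "language": ["English", "Chinese"]},
    {"employeeID": 222, "empName": "Mary Jones", "language": ["English"]},
]
emp, emp_language = remove_multivalued(emp_unnormalized, "employeeID", "language")
```

Attributes that describe the pair itself, such as skillLevel, would be added to the child relation afterwards.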
How Do You Remove the Repeating Groups?
CREATE TABLE MEM_CONDITION (
MEMBER# VARCHAR2(12) NOT NULL,
CASE# VARCHAR2(16) NOT NULL,
DIAG_ARRAY_1 VARCHAR2(6) NOT NULL,
DIAG_ARRAY_2 VARCHAR2(6) NOT NULL,
DIAG_ARRAY_3 VARCHAR2(6) NOT NULL,
DIAG_ARRAY_4 VARCHAR2(6) NOT NULL,
DIAG_ARRAY_5 VARCHAR2(6) NOT NULL,
DIAG_EX_ARRAY_1 VARCHAR2(2) NOT NULL,
DIAG_EX_ARRAY_2 VARCHAR2(2) NOT NULL,
DIAG_EX_ARRAY_3 VARCHAR2(2) NOT NULL,
DIAG_EX_ARRAY_4 VARCHAR2(2) NOT NULL,
DIAG_EX_ARRAY_5 VARCHAR2(2) NOT NULL,
DRUG_ARRAY_1 VARCHAR2(12) NOT NULL,
DRUG_ARRAY_2 VARCHAR2(12) NOT NULL,
DRUG_ARRAY_3 VARCHAR2(12) NOT NULL,
DRUG_ARRAY_4 VARCHAR2(12) NOT NULL,
DRUG_ARRAY_5 VARCHAR2(12) NOT NULL,
LC_ARRAY_1 VARCHAR2(4) NOT NULL,
LC_ARRAY_2 VARCHAR2(4) NOT NULL,
LC_ARRAY_3 VARCHAR2(4) NOT NULL,
LC_ARRAY_4 VARCHAR2(4) NOT NULL,
LC_ARRAY_5 VARCHAR2(4) NOT NULL,
MEM_REVIEW VARCHAR2(4) NOT NULL,
OP# VARCHAR2(4) NOT NULL,
PROC_ARRAY_1 VARCHAR2(6) NOT NULL,
PROC_ARRAY_2 VARCHAR2(6) NOT NULL,
PROC_ARRAY_3 VARCHAR2(6) NOT NULL,
PROC_ARRAY_4 VARCHAR2(6) NOT NULL,
PROC_ARRAY_5 VARCHAR2(6) NOT NULL,
PROV_ARRAY_1 VARCHAR2(12) NOT NULL,
PROV_ARRAY_2 VARCHAR2(12) NOT NULL,
PROV_ARRAY_3 VARCHAR2(12) NOT NULL,
PROV_ARRAY_4 VARCHAR2(12) NOT NULL,
PROV_ARRAY_5 VARCHAR2(12) NOT NULL,
REC_TYPE VARCHAR2(2) NOT NULL,
SP_ARRAY_1 VARCHAR2(4) NOT NULL,
SP_ARRAY_2 VARCHAR2(4) NOT NULL,
SP_ARRAY_3 VARCHAR2(4) NOT NULL,
SP_ARRAY_4 VARCHAR2(4) NOT NULL,
SP_ARRAY_5 VARCHAR2(4) NOT NULL,
TRANSCODE VARCHAR2(2) NOT NULL,
TT_ARRAY_1 VARCHAR2(4) NOT NULL,
TT_ARRAY_2 VARCHAR2(4) NOT NULL,
TT_ARRAY_3 VARCHAR2(4) NOT NULL,
TT_ARRAY_4 VARCHAR2(4) NOT NULL,
TT_ARRAY_5 VARCHAR2(4) NOT NULL,
VOID VARCHAR2(2) NOT NULL,
YMDEFF VARCHAR2(8) NOT NULL,
YMDEND VARCHAR2(8) NOT NULL,
YMDTRANS VARCHAR2(8) NOT NULL,
PRIORITY VARCHAR2(2) NOT NULL
);
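One possible answer is to move every numbered *_ARRAY_n column into rows of a child table. The sketch below is written under several assumptions: that (MEMBER#, CASE#) identifies a row, that unused slots hold blanks (the columns are NOT NULL), and that a single generic child relation is acceptable. A real design would more likely create one child table per group.

```python
import re

# Matches columns such as DIAG_ARRAY_3 or DIAG_EX_ARRAY_1.
ARRAY_COLUMN = re.compile(r"^(?P<group>[A-Z_]+)_ARRAY_(?P<position>\d+)$")

def unpivot(row, key_columns=("MEMBER#", "CASE#")):
    """Split one wide row into a parent row and one child row per filled array slot."""
    parent, children = {}, []
    for column, value in row.items():
        match = ARRAY_COLUMN.match(column)
        if match is None:
            parent[column] = value          # ordinary column stays in the parent
        elif value.strip():                 # skip blank (unused) slots
            child = {k: row[k] for k in key_columns}
            child.update(
                array_name=match["group"],
                position=int(match["position"]),
                value=value,
            )
            children.append(child)
    return parent, children

wide_row = {
    "MEMBER#": "M001", "CASE#": "C01",
    "DIAG_ARRAY_1": "250.00", "DIAG_ARRAY_2": " ",
    "DRUG_ARRAY_1": "D123", "REC_TYPE": "01",
}
parent, children = unpivot(wide_row)
```

The child rows are keyed by (MEMBER#, CASE#, array_name, position), so a case is no longer limited to five entries per group.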
Functional Dependency
Notation: R.X => R.Y
Definition: Attribute Y of Relation R is
functionally dependent on the
Attribute X of Relation R when each
value of R.X is associated with
no more than one value of R.Y. R.X
and R.Y may be composite attributes.
Description:
R.Y is functionally dependent on R.X
R.X functionally determines R.Y
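The definition can be tested against sample data. The following is a minimal sketch; the function name fd_holds and the supplier rows are assumptions. Note that sample data can only refute a dependency: whether X => Y really holds is a business rule, not a property of one data set.

```python
def fd_holds(rows, x, y):
    """Return True if the attributes x functionally determine the attributes y in rows."""
    seen = {}
    for row in rows:
        x_value = tuple(row[a] for a in x)
        y_value = tuple(row[a] for a in y)
        # setdefault stores the first Y value seen for this X value.
        if seen.setdefault(x_value, y_value) != y_value:
            return False  # the same X value is paired with two different Y values
    return True

suppliers = [
    {"supplier": "S1", "city": "London", "distance": 300},
    {"supplier": "S2", "city": "Paris", "distance": 500},
    {"supplier": "S3", "city": "London", "distance": 300},
]

supplier_determines_city = fd_holds(suppliers, ["supplier"], ["city"])
city_determines_supplier = fd_holds(suppliers, ["city"], ["supplier"])
city_determines_distance = fd_holds(suppliers, ["city"], ["distance"])
```

Here supplier => city and city => distance hold, while city => supplier fails because London is associated with both S1 and S3.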
Full & Partial Dependency
R.A => R.B
If B is not functionally dependent on
any subset of A (other than A itself), B
is fully dependent on A in R.
If B is functionally dependent on a
subset of A (other than A itself), B is
partially dependent on A in R.
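A partial dependency can be detected by testing every proper subset of the composite key. This sketch repeats a small fd_holds helper so that it is self-contained; both function names and the sample rows are assumptions.

```python
from itertools import combinations

def fd_holds(rows, x, y):
    """Return True if the attributes x functionally determine the attributes y in rows."""
    seen = {}
    for row in rows:
        x_value = tuple(row[a] for a in x)
        y_value = tuple(row[a] for a in y)
        if seen.setdefault(x_value, y_value) != y_value:
            return False
    return True

def partial_determinants(rows, key, attribute):
    """List the proper, non-empty subsets of key that already determine attribute."""
    return [
        subset
        for size in range(1, len(key))
        for subset in combinations(key, size)
        if fd_holds(rows, subset, [attribute])
    ]

supplier_part = [
    {"supplier": "S1", "part": "P1", "qty": 100, "city": "London"},
    {"supplier": "S1", "part": "P2", "qty": 200, "city": "London"},
    {"supplier": "S2", "part": "P1", "qty": 300, "city": "Paris"},
]
key = ("supplier", "part")
city_partial = partial_determinants(supplier_part, key, "city")
qty_partial = partial_determinants(supplier_part, key, "qty")
```

An empty result means the attribute is fully dependent on the key in the sample; a non-empty result names the part of the key that the attribute really depends on.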
First Normal Form (1NF)
A relation R is in the first normal form (1NF) if and only if all
attributes of any tuple in R contain only atomic values.
Normalization Process:
Remove Partial Functional Dependencies
If R is in 1NF and has a composite primary key (K1,K2), an attribute
P is functionally dependent on K1 (K1 => P) (i.e., P is partially
dependent on (K1, K2)), the normalization process is:
The attribute P should be removed from R and a new relation will
be created with K1 as the primary key and P as a non-key attribute.
A relation that is in 1NF and not in 2NF must have a composite
primary key.
Example
The Supplier-Part relation has attributes supplier#, part#, qty, city, and
distance, where (supplier#, part#) is the key.
City and distance depend on supplier# alone, so they are only partially
dependent on the key (supplier#, part#).
SUPPLIER-PART (supplier#, part#, qty, city, distance)
is decomposed into:
SUPPLIER-PART (supplier#, part#, qty)
SUPPLIER (supplier#, city, distance)
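The decomposition is simply a pair of projections. Below is a minimal sketch with relations modeled as lists of dictionaries; the helper project and the sample values are assumptions, and the # suffix is dropped from attribute names for readability.

```python
def project(rows, attributes):
    """Relational projection: keep the given attributes and drop duplicate tuples."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attributes}
        if projected not in result:
            result.append(projected)
    return result

supplier_part_1nf = [
    {"supplier": "S1", "part": "P1", "qty": 100, "city": "London", "distance": 300},
    {"supplier": "S1", "part": "P2", "qty": 200, "city": "London", "distance": 300},
    {"supplier": "S2", "part": "P1", "qty": 300, "city": "Paris", "distance": 500},
]

# SUPPLIER-PART keeps the full key and the attribute that depends on all of it.
supplier_part = project(supplier_part_1nf, ["supplier", "part", "qty"])
# SUPPLIER holds the attributes that depend on supplier alone, once per supplier.
supplier = project(supplier_part_1nf, ["supplier", "city", "distance"])
```

The city and distance of S1 appeared twice in the 1NF relation and appear once in SUPPLIER.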
Non-loss Decomposition
Normalization is a reduction (decomposition)
process that replaces a relation by suitable
projections. Each of the projections is a new
relation that is in a further normalized form than
the original relation. The collection of
projections is equivalent to the original relation.
The original relation can always be recovered by
taking the natural join of these projections.
Any information that can be derived from the
original relation can also be derived from the
further normalized relations. The converse is not
true.
The process is reversible because no
information is lost in the reduction process.
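The reversibility claim can be checked directly: project, then take the natural join, and compare with the original. The helpers project, natural_join, and same_relation, and the sample rows, are assumptions made for this sketch. The last lines show a badly chosen split that is not non-loss.

```python
def project(rows, attributes):
    """Relational projection: keep the given attributes and drop duplicate tuples."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attributes}
        if projected not in result:
            result.append(projected)
    return result

def natural_join(left, right):
    """Join rows that agree on every attribute the two relations share."""
    joined = []
    for l in left:
        for r in right:
            if all(l[a] == r[a] for a in l.keys() & r.keys()):
                joined.append({**l, **r})
    return joined

def same_relation(a, b):
    """Compare two relations as sets of tuples, ignoring row order."""
    as_set = lambda rows: {tuple(sorted(row.items())) for row in rows}
    return as_set(a) == as_set(b)

original = [
    {"supplier": "S1", "part": "P1", "qty": 100, "city": "London", "distance": 300},
    {"supplier": "S1", "part": "P2", "qty": 200, "city": "London", "distance": 300},
    {"supplier": "S2", "part": "P1", "qty": 300, "city": "Paris", "distance": 500},
]

# Non-loss: the projections share supplier, which determines city and distance.
good = natural_join(project(original, ["supplier", "part", "qty"]),
                    project(original, ["supplier", "city", "distance"]))
# Lossy: the projections share part, which determines nothing on the right side.
bad = natural_join(project(original, ["supplier", "part", "qty"]),
                   project(original, ["part", "city", "distance"]))

is_lossless = same_relation(good, original)
spurious_rows = len(bad) - len(original)
```

The second join returns extra rows that were never in the original relation, so information about which supplier is in which city has been lost.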
Transitive Dependency
In a relation R,
if R.A => R.B and R.B => R.C
then attribute C is said to be transitively
dependent on attribute A.
Second Normal Form (2NF)
A relation R is in the second normal form (2NF) if and
only if it is in 1NF and every non-key attribute is fully
dependent on the primary key.
Normalization Process: Remove Transitive
Dependencies
If R is in 2NF and has two non-key attributes A1 and A2
where A2 is functionally dependent on A1 (A1 => A2),
then A2 should be removed from R and a new relation
should be created with A1 as the primary key and A2 as a
non-key attribute.
Example
The Supplier relation has attributes supplier#, city, distance, and
quality_level, where supplier# is the key and the distance to a
supplier is determined by the city of the supplier.
SUPPLIER (supplier#, city, distance, quality_level)
is decomposed into:
SUPPLIER (supplier#, city, quality_level)
CITY-DISTANCE (city, distance)
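The benefit of removing the transitive dependency is that each distance is stored once. A minimal sketch with Python's sqlite3 module follows; the column types and sample values are assumptions, and supplier# is written supplier_no.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE city_distance (
    city     TEXT PRIMARY KEY,
    distance INTEGER NOT NULL
);
CREATE TABLE supplier (
    supplier_no   TEXT PRIMARY KEY,
    city          TEXT NOT NULL REFERENCES city_distance (city),
    quality_level INTEGER
);
INSERT INTO city_distance VALUES ('London', 300), ('Paris', 500);
INSERT INTO supplier VALUES ('S1', 'London', 1), ('S2', 'Paris', 2), ('S3', 'London', 3);
""")

# The distance to London is stored once, so one UPDATE corrects it for every supplier.
conn.execute("UPDATE city_distance SET distance = 320 WHERE city = 'London'")
rows = conn.execute("""
    SELECT s.supplier_no, c.distance
    FROM supplier AS s JOIN city_distance AS c ON c.city = s.city
    ORDER BY s.supplier_no
""").fetchall()
```

Both London suppliers report the corrected distance without any supplier row having been touched.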
Third Normal Form (3NF)
A relation R is in the third normal form (3NF) if
and only if the non-key attributes (if there are any)
are fully dependent on the primary key of R (i.e.,
R is in 2NF) and are mutually independent.
Heuristic to Check Whether a Relation Is in 3NF
All the non-key attributes (which are not multi-value
attributes) are dependent on the (primary) key, the
whole key, and nothing but the key.
Explanation
All the non-key attributes have atomic values and are dependent on
the key (1NF - no multi-value attributes),
the whole key (2NF - no partial functional dependencies),
and nothing but the key (3NF - no transitive functional dependencies).
Normalization Process
[Diagram: a relation with attributes A through H is normalized in three steps.]
Unnormalized Form
remove repeating groups
1NF
remove partial dependencies
2NF
remove transitive dependencies
3NF
Normalization: Pros and Cons
Pros
Reduce data redundancy & space required
Enhance data consistency
Enforce data integrity
Reduce update cost
Provide maximum flexibility in responding to ad hoc queries
Cons
Many complex queries will be slower because joins have to be
performed to retrieve relevant data from several normalized
tables
Programmers/users have to understand the underlying data
model of a database application in order to perform proper
joins among several tables
The formulation of multiple-level queries is a nontrivial task.
Join Two Tables
SELECT Categories.CategoryName, Products.ProductName
FROM Categories, Products
WHERE Products.CategoryID = Categories.CategoryID


Tables in Relational DB
Identify Primary Keys and Foreign Keys in the
following Tables!!!
[Figure: sample tables with ID columns.]
Join Tables
SELECT Orders.OrderID, Orders.CustID,
LastName, Firstname, Orders.ItemID, Description
FROM Customer, Orders, Inventory
WHERE Customer.CustID = Orders.CustID AND
Orders.ItemID = Inventory.ItemID
ORDER BY Orders.CustID, Orders.ItemID
Foreign Keys & Primary Keys in a Sample Access Database

An Example of a Complex Query

Please list name and phone number of customers
who have ordered product number 007.
SELECT customer_name, customer_phone
FROM customer
WHERE customer_number IN
    (SELECT customer_number
     FROM "order"   -- ORDER is a reserved word in SQL, so the table name is quoted
     WHERE order_no IN
         (SELECT order_no
          FROM orderItem
          WHERE product_number = 007))
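The same request can be written either with nested subqueries or as a join across the three normalized tables. The sketch below runs both forms with Python's sqlite3 module; the column types, the sample rows, and the choice to store product numbers as text ('007') are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_number INTEGER PRIMARY KEY,
    customer_name   TEXT,
    customer_phone  TEXT
);
CREATE TABLE "order" (order_no INTEGER PRIMARY KEY, customer_number INTEGER);
CREATE TABLE orderItem (order_no INTEGER, product_number TEXT);
INSERT INTO customer VALUES (1, 'Ann', '555-0101'), (2, 'Bob', '555-0102');
INSERT INTO "order" VALUES (100, 1), (101, 2);
INSERT INTO orderItem VALUES (100, '007'), (101, '008');
""")

nested = conn.execute("""
    SELECT customer_name, customer_phone
    FROM customer
    WHERE customer_number IN
        (SELECT customer_number
         FROM "order"
         WHERE order_no IN
             (SELECT order_no FROM orderItem WHERE product_number = '007'))
""").fetchall()

joined = conn.execute("""
    SELECT DISTINCT c.customer_name, c.customer_phone
    FROM customer AS c
    JOIN "order" AS o ON o.customer_number = c.customer_number
    JOIN orderItem AS i ON i.order_no = o.order_no
    WHERE i.product_number = '007'
""").fetchall()
```

DISTINCT is needed in the join form because a customer who ordered the product several times would otherwise be listed once per order item.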
Denormalization
The process of intentionally backing away from
normalization to improve performance. Denormalization
should not be the first choice for improving performance
and should only be used for fine tuning a database for a
particular application.
Requirements
Prior normalization
Knowledge of data usage
Benefits
Minimize the need for joins
Reduce number of tables
Reduce number of foreign keys
Reduce number of indices
Knowledge of Data Usage
How often are two data items needed together
How many rows are involved
How volatile is denormalized data
How important is visibility of data to users
What is the minimum response time and frequency of a query
De-normalization: An Example
Where:
R1 (ProductNo, SupplierNo, Price)
R2 (SupplierNo, Name, Address, Phone)
R1*R2 (ProductNo, SupplierNo, Name, Address, Phone, Price)
R2 should be kept to prevent data loss.
Data redundancy in R1*R2 and R2 could cause potential
data inconsistency problems if the redundant data in
these two tables are not maintained properly.
[Diagram: R1 JOIN R2 gives the denormalized table R1 * R2; R2 is kept alongside it.]
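The example can be reproduced with Python's sqlite3 module. In this sketch the key of R1 is assumed to be (ProductNo, SupplierNo), the denormalized table is named R1R2, and all sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE R1 (ProductNo TEXT, SupplierNo TEXT, Price REAL,
                 PRIMARY KEY (ProductNo, SupplierNo));
CREATE TABLE R2 (SupplierNo TEXT PRIMARY KEY, Name TEXT, Address TEXT, Phone TEXT);
INSERT INTO R2 VALUES ('S1', 'Acme', '1 Elm St', '555-0100'),
                      ('S2', 'Apex', '2 Oak St', '555-0200');
INSERT INTO R1 VALUES ('P1', 'S1', 9.5), ('P2', 'S1', 4.0);
-- Denormalized table: supplier details are copied next to every price row.
CREATE TABLE R1R2 AS
    SELECT R1.ProductNo, R1.SupplierNo, R2.Name, R2.Address, R2.Phone, R1.Price
    FROM R1 JOIN R2 ON R2.SupplierNo = R1.SupplierNo;
""")

# Supplier S2 has no products yet, so it exists only in R2.
suppliers_in_join = conn.execute(
    "SELECT COUNT(DISTINCT SupplierNo) FROM R1R2").fetchone()[0]

# Updating R2 alone leaves the redundant copies in R1R2 out of date.
conn.execute("UPDATE R2 SET Phone = '555-0111' WHERE SupplierNo = 'S1'")
stale_rows = conn.execute("""
    SELECT COUNT(*) FROM R1R2 JOIN R2 ON R2.SupplierNo = R1R2.SupplierNo
    WHERE R1R2.Phone <> R2.Phone
""").fetchone()[0]
```

The first count shows why R2 has to be kept, and the second shows the inconsistency that appears when the redundant copies are not maintained together.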
Data Model Refinement and Transformation
Data Model Refinement
Associative Entity Type
Removing Many-to-Many Relationships
Keys
Transformation to Relational Databases
Refinement of a Data Model: Analysis and Simplification
Isolated Entity Type
Solitary Entity Type
One-to-One Relationship
Redundant Relationship
Multi-Valued Attributes
Attribute with Attributes
Many-to-Many Relationship
Isolated Entity Type
An Entity Type that does not participate in a
Relationship.
Since every Entity Type should participate in at
least one Relationship, there exist two
alternatives:
Identify a relevant Relationship
Remove the Entity Type from the model
Solitary Entity Type
An Entity Type that has only one Entity Instance.
Examples: Computer Center, Sales Tax, and Current
Order Number. Solitary Entity Types may be too
restrictive.
Alternatives:
Introduce another Entity Type with a wider scope.
Computer Center ==> Organization Unit

Define it as an Attribute of an Entity Type.
Sales Tax ==> Sales Tax of Order

Define it as a data element in a parameter table. A parameter
table has only one row.
Current Order Number ==> Current Order Number of Parameter
Table
Evaluate One-to-One Relationship
[Diagram: two entity types, Purchase Request and Purchase Order, linked one-to-one (becomes / has request): maybe incorrect. A single Purchase Order entity type: correct.]
A one-to-one relationship between two Entity Types may be
unnecessary if they have the same attributes
and relationships (i.e., they are identical).
They should then be combined into one Entity Type.
Redundant Relationship
[Diagram: customer, order, order item, and product. Relationship labels: places / is placed by; contains / is part of; has; and a direct relationship ORDERS (has ordered / is ordered by).]
Is this relationship redundant?
Differences in timing of an entity type in its life cycle:
Implemented as separate entity types or use subtypes
Use value of attributes or additional attributes to differentiate them
Redundant Relationship
[Diagram: redundant and non-redundant relationships among Customer, Order, Order Line, Order History, Product, Stock, and Warehouse. Relationship labels: stocks; contains / is contained in; places / is placed by; holds / is held as; contains / is held in.]
Multi-Valued Attribute
Definition
An Attribute that may have more than one value at a time is called a
multi-valued attribute.
Solution:
Create an Entity Type for the multi-valued attribute
Example:
Languages spoken by an Employee

Employee(ID, Name, Phone, Languages)
Employee(111, John Smith, 201-999-8888, (English, Chinese))


Employee(ID, Name, Phone)
Employee(111, John Smith, 201-999-8888)

Employee_language(ID, Language)
Employee_language(111, English)
Employee_language(111, Chinese)
Attribute with Attributes
An Attribute that can be described by other
Attributes is called an attribute with
attributes.
Example:
College Degree held by an Employee
(John Smith has a College Degree in Computer
Science from George Mason University)
Solution:
Create an Entity Type to avoid an Attribute with
Attributes.
Add new attributes to the existing Entity Type.
Associative Entity Type
An Associative Entity Type is an Entity Type
whose existence is meaningful only if it
participates in several (>=2) Relationship Types
at the same time.
Associative Entity Types are often introduced to
represent additional information in many-to-
many Relationships or to decompose a many-to-
many Relationship into two one-to-many
Relationships.
Associative Entity Types are also used to
represent n-ary Relationships in a binary data
model.
Remove Many-to-Many Relationship
Given
[Diagram: Order and Product linked by a many-to-many relationship (contains / belongs-to).]
Why?
There is no place to attach Attributes that are required to describe a
many-to-many Relationship.
It is difficult to translate many-to-many Relationships into relational
tables automatically.
How?
A many-to-many relationship can be decomposed into two
one-to-many Relationships by creating an Associative Entity
Type between the existing two Entity Types.
[Diagram: Order, Order Line, and Product linked by two one-to-many relationships (has / belongs to; contains / is contained in).]
Remove Many-to-Many Relationships: Exercises
Remove the many-to-many relationship from the
following ER diagrams
(a) Supplier and Part (offers / has-sources)
(b) Student and Course (takes / is-taken-by)
(c) Product and Product (consists-of / is-contained-in)
Bills of Material
Part
consists-of / is-a-component-in
Product Structure
Product-Structure(Parent Part No, Child Part No, Quantity)
Parent Part No   Child Part No   Quantity
A                B               2
A                C               1
B                D               1
B                E               3
C                D               2
C                F               2
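With the product structure stored as (parent, child, quantity) rows, the total requirement for each lowest-level part can be computed by walking the structure recursively. The function name explode is an assumption; the rows are the ones listed above.

```python
# Product-Structure(Parent Part No, Child Part No, Quantity)
product_structure = [
    ("A", "B", 2), ("A", "C", 1),
    ("B", "D", 1), ("B", "E", 3),
    ("C", "D", 2), ("C", "F", 2),
]

def explode(part, quantity=1, totals=None):
    """Accumulate how many of each lowest-level part `quantity` units of `part` require."""
    totals = {} if totals is None else totals
    children = [(child, qty) for parent, child, qty in product_structure if parent == part]
    if not children:  # a part with no components is a leaf
        totals[part] = totals.get(part, 0) + quantity
    for child, qty in children:
        # Quantities multiply along each path from the top-level part.
        explode(child, quantity * qty, totals)
    return totals

requirements = explode("A")
```

Part D is reached through both B and C, so its requirement is the sum over the two paths: 2 x 1 + 1 x 2 = 4.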
Using an Associative Entity Type to Represent an N-ary Relationship
[Diagram: a 3-ary relationship among Product, Project, and Supplier (is used in, supplies, uses) is replaced by the Associative Entity Type Product Usage, to which Product, Project, and Supplier are each related by "involved in product usage".]
Product Usage is an Associative Entity Type for a 3-ary Relationship.
Translate Data Models to Relational Tables
Given
[Diagram: Order, Order Line, and Product. Relationship labels: has / belongs to; contains / is contained in.]
Order
Key: Order#
Attributes: Order date, Customer ID, Sale Person ID
Order Line
Key: Order# + Product#
Attributes: Quantity, Unit Price
Product
Key: Product#
Attributes: Description, Qty-on-hand, Unit Price
Relational Tables Created
CREATE TABLE "ORDER"   -- ORDER is a reserved word in SQL, so the name is quoted
(OrderNo CHAR(10) NOT NULL,
OrderDate DATE,
CustomerID CHAR(10),
SalePersonID CHAR(10),
PRIMARY KEY (OrderNo));
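All three entity types on this slide can be translated in the same way. The sketch below runs the DDL with Python's sqlite3 module; the table names PRODUCT and ORDER_LINE, the column names, and the data types other than those shown for ORDER are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE "ORDER" (
    OrderNo      CHAR(10) NOT NULL PRIMARY KEY,
    OrderDate    DATE,
    CustomerID   CHAR(10),
    SalePersonID CHAR(10)
);
CREATE TABLE PRODUCT (
    ProductNo   CHAR(10) NOT NULL PRIMARY KEY,
    Description VARCHAR(40),
    QtyOnHand   INTEGER,
    UnitPrice   NUMERIC(8,2)
);
-- Associative entity type: its key is the concatenation of the two parent keys.
CREATE TABLE ORDER_LINE (
    OrderNo   CHAR(10) NOT NULL REFERENCES "ORDER" (OrderNo),
    ProductNo CHAR(10) NOT NULL REFERENCES PRODUCT (ProductNo),
    Quantity  INTEGER,
    UnitPrice NUMERIC(8,2),
    PRIMARY KEY (OrderNo, ProductNo)
);
""")
conn.execute("""INSERT INTO "ORDER" VALUES ('O1', '1998-07-21', 'C1', 'E1')""")
conn.execute("INSERT INTO PRODUCT VALUES ('P1', 'Widget', 50, 2.50)")
conn.execute("INSERT INTO ORDER_LINE VALUES ('O1', 'P1', 4, 2.50)")

try:
    # An order line for a product that does not exist is rejected.
    conn.execute("INSERT INTO ORDER_LINE VALUES ('O1', 'P9', 1, 1.00)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

line_count = conn.execute("SELECT COUNT(*) FROM ORDER_LINE").fetchone()[0]
```

Unit Price appears in both PRODUCT and ORDER_LINE, as on the slide, so an order line can keep the price that applied when the order was taken.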
Transformation of Data Models to Relational Database Tables
All or part of a data (entity-relationship)
model can be translated into a normalized
database design.

Objects Created
At most one relational database
One or more relations (tables)
Data structures (DDL) representing the elements
(attributes) and the primary key of each relation
Data type of each data element
Heuristics of Transformation
A table is created for each Entity Type in the ER diagram.
A table is created for each multi-valued attribute.
Relationship Types are implemented as tables or as foreign
keys in other tables.
Many-to-many relationship types are translated into tables.
Foreign keys are used for implementing one-to-one and
one-to-many Relationship Types.
For one-to-many Relationship Types, the foreign key is
placed in the table that represents the Entity Type on the
"many" end of the Relationship Type.
For identifying one-to-many Relationship Types, the PK of
the "one" table migrates to the "many" table as a FK and the
FK is also part of the PK of the "many" table.
For non-identifying one-to-many Relationship Types, the
PK of the "one" table migrates to the "many" table as a FK
and the FK is a non-key attribute of the "many" table.
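The last two heuristics can be compared on a pair of examples. This sketch uses Python's sqlite3 module and reads the key columns back from the catalog; the table and column names are borrowed from the sample project model, while the reduced column lists, the data types, and the helper key_columns are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DIVISION (DIVNUM INTEGER PRIMARY KEY, DIVNAME TEXT NOT NULL);
-- Non-identifying: DIVNUM migrates as a plain (non-key) foreign key column.
CREATE TABLE EMPLOYEE (
    EMPNUM  INTEGER PRIMARY KEY,
    DIVNUM  INTEGER NOT NULL REFERENCES DIVISION (DIVNUM),
    EMPLNAM TEXT NOT NULL
);
CREATE TABLE PROJECT (PRONUM INTEGER PRIMARY KEY, PRONAME TEXT NOT NULL);
-- Identifying: PRONUM migrates as a foreign key that is also part of the primary key.
CREATE TABLE TASK (
    PRONUM  INTEGER NOT NULL REFERENCES PROJECT (PRONUM),
    TSKNAME TEXT NOT NULL,
    TSKCOST NUMERIC NOT NULL,
    PRIMARY KEY (PRONUM, TSKNAME)
);
""")

def key_columns(table):
    """Return (primary-key columns, foreign-key columns) as reported by SQLite."""
    # table_info rows: (cid, name, type, notnull, dflt_value, pk position)
    pk = {row[1] for row in conn.execute(f"PRAGMA table_info({table})") if row[5] > 0}
    # foreign_key_list rows: (id, seq, parent table, from column, to column, ...)
    fk = {row[3] for row in conn.execute(f"PRAGMA foreign_key_list({table})")}
    return pk, fk

employee_pk, employee_fk = key_columns("EMPLOYEE")
task_pk, task_fk = key_columns("TASK")
```

In EMPLOYEE the foreign key lies outside the primary key, while in TASK it is contained in it.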
PowerDesigner: Data Architect
[Diagram: from the CDM and PDM, Data Architect supports generation and reverse engineering of the database structure and of triggers and stored procedures for the target DBMS, and of the database structure with extended attributes for the target 4GL tool.]
http://www.powersoft.com/
PowerDesigner

A Sample Conceptual Data Model

Conceptual Data Model
Project : Management
Model : Project Management
Author : User Version 6.x 7/21/98
Entities:
Division (Division number, Division name, Division address)
Employee (Employee number, First name, Last name, Employee function, Employee salary)
Customer (Customer number, Customer name, Customer address, Customer activity, Customer telephone, Customer fax)
Project (Project number, Project name, Project label)
Team (Team number, Specialty)
Task (Task name, Task cost)
Material (Material number, Material name, Material type)
Participate (Start date, End date)
Activity (Start date, End date)
Relationships:
Is member of; supervises; Is manager of; Uses; Subcontract; composes / composed of
Notations
Entity
Division (Division number, Division name, Division address)
Employee (Employee number, First name, Last name, Employee function, Employee salary)
Relationship
One-to-many
More on Relationships
Many-to-many cardinality
Employee (Employee number, First name, Last name, Employee function, Employee salary)
member / is a member of
Team (Team number, Specialty)
A project contains one or more tasks, and a task's
existence is dependent on the project.
Project (Project number, Project name, Project label)
Task (Task name, Task cost)
Advanced Concepts

Subtype
Account (Account Number, Name) with subtypes Savings (Rate) and Checking (Fees)
Reflexive relationship
Employee (Employee number, First name, Last name, Employee function, Employee salary)
Material (Material number, Material name, Material type): composes / composed of
Define Entities

Define Attributes

Check Parameters

Relationship Definition

Dependent (Identifying Relationship)
Check the box to indicate a dependent relationship. "One to many" and
"mandatory" are automatically chosen as the cardinality and optionality.
At the physical data model level, the parent entity type's primary key (PK)
will become part of the dependent child entity type's PK. It is also a
foreign key.
Inheritance (Super-Type and Sub-Type)
Generate Physical Data Model

Physical Data Model
Conceptual Data Model
Employee (Employee number, First name, Last name, Employee function, Employee salary)
belongs to
Division (Division number, Division name, Division address)
Transformation
Physical Data Model
EMPLOYEE (EMPNUM <pk>, DIVNUM <fk>, EMPFNAM, EMPLNAM, EMPFUNC, EMPSAL)
DIVISION (DIVNUM <pk>, DIVNAME, DIVADDR)
Reference: DIVNUM = DIVNUM
DIVNUM automatically migrates as a foreign key.
Do not define FK as an attribute.
Dependent Relationship
Conceptual Data Model
Project (Project number, Project name, Project label)
Task (Task name, Task cost)
Transformation
Physical Data Model
PROJECT (PRONUM <pk>, CUSNUM <fk>, EMPNUM <fk>, ACTBEG, ACTEND, PRONAME, PROLABL)
TASK (PRONUM <pk,fk>, TSKNAME <pk>, ACTBEG, ACTEND, TSKCOST)
Reference: PRONUM = PRONUM
Physical Data Model
Physical Data Model
Project : Management
Model : Project Management
Author : User Version 6.x 7/21/98
Tables:
DIVISION (DIVNUM <pk>, DIVNAME, DIVADDR)
EMPLOYEE (EMPNUM <pk>, EMP_EMPNUM <fk>, DIVNUM <fk>, EMPFNAM <ak>, EMPLNAM <ak>, EMPFUNC <ak>, EMPSAL)
CUSTOMER (CUSNUM <pk>, CUSNAME, CUSADDR, CUSACT, CUSTEL, CUSFAX)
PROJECT (PRONUM <pk>, CUSNUM <fk>, EMPNUM <fk>, ACTBEG, ACTEND, PRONAME, PROLABL)
TEAM (TEANUM <pk>, TEASPE)
TASK (PRONUM <pk,fk>, TSKNAME <pk>, ACTBEG, ACTEND, TSKCOST)
MATERIAL (MATNUM <pk>, MATNAME, MATTYPE)
PARTICIPATE (PRONUM <pk,fk>, TSKNAME <pk,fk>, EMPNUM <pk,fk>, PARBEG, PAREND)
MEMBER (TEANUM <pk,fk>, EMPNUM <pk,fk>)
USED (MATNUM <pk,fk>, EMPNUM <pk,fk>)
COMPOSE (CPD_MATNUM <pk,fk>, CPN_MATNUM <pk,fk>)
Reference join labels:
PRONUM = PRONUM; TSKNAME = TSKNAME; EMPNUM = EMPNUM; MATNUM = CPN_MATNUM;
MATNUM = CPD_MATNUM; DIVNUM = DIVNUM; EMPNUM = EMPNUM; MATNUM = MATNUM;
PRONUM = PRONUM; EMPNUM = EMPNUM; EMPNUM = EMP_EMPNUM; EMPNUM = EMPNUM;
TEANUM = TEANUM; CUSNUM = CUSNUM
EMPLOYE_MATERIAL (columns drawn from MATERIAL, PROJ.EMPLOYEE, and USED):
MATERIAL.MATNAME char(30)
PROJ.EMPLOYEE.EMPNUM numeric(5)
PROJ.EMPLOYEE.EMPFNAM char(30)
PROJ.EMPLOYEE.EMPLNAM char(30)
PROJ.EMPLOYEE.EMPFUNC char(30)
References (Relationships at the Physical Data Model)

Referential Integrity
The arrow points from the table containing the foreign key to the table
where the foreign key is used as a primary key.
Deletion Rules


Update Constraints
Delete Constraints
None
Restrict
Cascade
Set null
Set Default
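The effect of three of the delete rules can be observed with Python's sqlite3 module. Which rule is attached to which reference below is an assumption chosen for illustration, as are the reduced column lists and the EMPNUM column on TASK.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE PROJECT (PRONUM INTEGER PRIMARY KEY);
CREATE TABLE EMPLOYEE (EMPNUM INTEGER PRIMARY KEY);
CREATE TABLE TASK (
    PRONUM  INTEGER NOT NULL REFERENCES PROJECT (PRONUM) ON DELETE CASCADE,
    TSKNAME TEXT NOT NULL,
    EMPNUM  INTEGER REFERENCES EMPLOYEE (EMPNUM) ON DELETE SET NULL,
    PRIMARY KEY (PRONUM, TSKNAME)
);
CREATE TABLE MEMBER (
    EMPNUM INTEGER NOT NULL REFERENCES EMPLOYEE (EMPNUM) ON DELETE RESTRICT,
    TEANUM INTEGER NOT NULL,
    PRIMARY KEY (EMPNUM, TEANUM)
);
INSERT INTO PROJECT VALUES (1), (2);
INSERT INTO EMPLOYEE VALUES (10), (11);
INSERT INTO TASK VALUES (1, 'design', 10), (2, 'build', 10);
INSERT INTO MEMBER VALUES (11, 5);
""")

# Cascade: deleting project 1 also deletes its dependent task.
conn.execute("DELETE FROM PROJECT WHERE PRONUM = 1")
# Set null: deleting employee 10 clears the reference held by the remaining task.
conn.execute("DELETE FROM EMPLOYEE WHERE EMPNUM = 10")
# Restrict: employee 11 is still referenced by MEMBER, so the delete is refused.
try:
    conn.execute("DELETE FROM EMPLOYEE WHERE EMPNUM = 11")
    restricted = False
except sqlite3.IntegrityError:
    restricted = True

tasks = conn.execute("SELECT PRONUM, TSKNAME, EMPNUM FROM TASK").fetchall()
employees = conn.execute("SELECT EMPNUM FROM EMPLOYEE").fetchall()
```

Cascade suits dependent (identifying) relationships such as Project and Task, while Set null is only possible when the foreign key column is optional.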
Generation of Oracle SQL DDL
-- ============================================================
-- Database name: PROJECT
-- DBMS name: ORACLE Version 8
-- Created on: 7/21/98 8:59 PM
-- ============================================================

-- ============================================================
-- Table: DIVISION
-- ============================================================
create table ADMIN.DIVISION
(
DIVNUM numeric(5) not null
constraint CKC_DIVNUM_DIVISION check (DIVNUM >= '1'),
DIVNAME char(30) not null,
DIVADDR char(80) null ,
constraint PK_DIVISION primary key (DIVNUM)
)
/








-- ============================================================
-- Table: CUSTOMER
-- ============================================================
create table PROJ.CUSTOMER
(
CUSNUM numeric(5) not null
constraint CKC_CUSNUM_CUSTOMER check (
CUSNUM >= '1'),
CUSNAME char(30) not null,
CUSADDR char(80) not null,
CUSACT char(80) null ,
CUSTEL char(12) null ,
CUSFAX char(12) null ,
constraint PK_CUSTOMER primary key (CUSNUM)
)
/






-- ============================================================
-- Table: TEAM
-- ============================================================
create table PROJ.TEAM
(
TEANUM numeric(5) not null
constraint CKC_TEANUM_TEAM check (TEANUM >= '1'),
TEASPE char(80) null ,
constraint PK_TEAM primary key (TEANUM)
)
/

-- ============================================================
-- Table: MATERIAL
-- ============================================================
create table PROJ.MATERIAL
(
MATNUM numeric(5) not null
constraint CKC_MATNUM_MATERIAL check (MATNUM >= '1'),
MATNAME char(30) not null,
MATTYPE char(30) not null,
constraint PK_MATERIAL primary key (MATNUM)
)
/

-- ============================================================
-- Table: EMPLOYEE
-- ============================================================
create table PROJ.EMPLOYEE
(
EMPNUM numeric(5) not null
constraint CKC_EMPNUM_EMPLOYEE check (
EMPNUM >= '1'),
EMP_EMPNUM numeric(5) null ,
DIVNUM numeric(5) not null,
EMPFNAM char(30) null ,
EMPLNAM char(30) not null,
EMPFUNC char(30) null ,
EMPSAL numeric(8,2) null ,
constraint PK_EMPLOYEE primary key (EMPNUM),
constraint AK_EMP_AK1_EMPLOYEE unique (EMPLNAM, EMPFNAM,
EMPFUNC)
)
/

-- ============================================================
-- Index: CHIEF_FK
-- ============================================================
create index PROJ.CHIEF_FK on PROJ.EMPLOYEE (EMP_EMPNUM asc)
/

-- ============================================================
-- Index: BELONGS_TO_FK2
-- ============================================================
create index PROJ.BELONGS_TO_FK2 on PROJ.EMPLOYEE (DIVNUM asc)
/

-- ============================================================
-- Table: PROJECT
-- ============================================================
create table PROJ.PROJECT
(
PRONUM numeric(5) not null
constraint CKC_PRONUM_PROJECT check (
PRONUM >= '1'),
CUSNUM numeric(5) not null,
EMPNUM numeric(5) null ,
ACTBEG timestamp null
constraint CKC_ACTBEG_PROJECT check (
ACTBEG is null or ((activity.begindate < activity.enddate))),
ACTEND timestamp null
constraint CKC_ACTEND_PROJECT check (
ACTEND is null or ((activity.begindate < activity.enddate))),
PRONAME char(30) not null,
PROLABL char(80) null ,
constraint PK_PROJECT primary key (PRONUM)
)
/

-- ============================================================
-- Index: SUBCONTRACT_FK
-- ============================================================
create index PROJ.SUBCONTRACT_FK on PROJ.PROJECT (CUSNUM asc)
/

-- ============================================================
-- Index: IS_RESPONSIBLE_FOR_FK
-- ============================================================
create index PROJ.IS_RESPONSIBLE_FOR_FK on PROJ.PROJECT (EMPNUM
asc)
/

-- ============================================================
-- Table: TASK
-- ============================================================
create table PROJ.TASK
(
PRONUM numeric(5) not null,
TSKNAME char(30) not null,
ACTBEG timestamp null
constraint CKC_ACTBEG_TASK check (ACTBEG is null or
((activity.begindate < activity.enddate))),
ACTEND timestamp null
constraint CKC_ACTEND_TASK check (ACTEND is null or
((activity.begindate < activity.enddate))),
TSKCOST numeric(8,2) not null,
constraint PK_TASK primary key (PRONUM, TSKNAME),
constraint CKT_TASK check (
(task.begindate < min(participate.begindate)
and
task.enddate < max(participate.enddate)))
)
/

-- ============================================================
-- Index: BELONGS_TO_FK
-- ============================================================
create index PROJ.BELONGS_TO_FK on PROJ.TASK (PRONUM asc)
/

-- ============================================================
-- Table: PARTICIPATE
-- ============================================================
create table PROJ.PARTICIPATE
(
PRONUM numeric(5) not null,
TSKNAME char(30) not null,
EMPNUM numeric(5) not null,
PARBEG timestamp null
constraint CKC_PARBEG_PARTICIP check (PARBEG is null or
(((task.begindate < min(participate.begindate)
and
task.enddate < max(participate.enddate)) and
(participate.begindate < participate.enddate)))),
PAREND timestamp null
constraint CKC_PAREND_PARTICIP check (PAREND is null or
(((task.begindate < min(participate.begindate)
and
task.enddate < max(participate.enddate)) and
(participate.begindate < participate.enddate)))),
constraint PK_PARTICIPATE primary key (PRONUM, TSKNAME, EMPNUM),
constraint CKT_PARTICIPATE check (
((task.begindate < min(participate.begindate)
and
task.enddate < max(participate.enddate)) and
(participate.begindate < participate.enddate)))
)
/

-- ============================================================
-- Index: WORKS_ON_FK
-- ============================================================
create index PROJ.WORKS_ON_FK on PROJ.PARTICIPATE (EMPNUM asc)
/

-- ============================================================
-- Index: IS_DONE_BY_FK
-- ============================================================
create index PROJ.IS_DONE_BY_FK on PROJ.PARTICIPATE (PRONUM asc,
TSKNAME asc)
/

-- ============================================================
-- Table: MEMBER
-- ============================================================
create table PROJ.MEMBER
(
TEANUM numeric(5) not null,
EMPNUM numeric(5) not null,
constraint PK_MEMBER primary key (TEANUM, EMPNUM)
)
/

-- ============================================================
-- Index: MEMBER_FK
-- ============================================================
create index PROJ.MEMBER_FK on PROJ.MEMBER (TEANUM asc)
/

-- ============================================================
-- Index: IS_MEMBER_OF_FK
-- ============================================================
create index PROJ.IS_MEMBER_OF_FK on PROJ.MEMBER (EMPNUM asc)
/

-- ============================================================
-- Table: USED
-- ============================================================
create table PROJ.USED
(
MATNUM numeric(5) not null,
EMPNUM numeric(5) not null,
constraint PK_USED primary key (MATNUM, EMPNUM)
)
/

-- ============================================================
-- Index: USED_FK
-- ============================================================
create index PROJ.USED_FK on PROJ.USED (MATNUM asc)
/

-- ============================================================
-- Index: USES_FK
-- ============================================================
create index PROJ.USES_FK on PROJ.USED (EMPNUM asc)
/

-- ============================================================
-- Table: COMPOSE
-- ============================================================
create table PROJ.COMPOSE
(
CPD_MATNUM numeric(5) not null,
CPN_MATNUM numeric(5) not null,
constraint PK_COMPOSE primary key (CPD_MATNUM, CPN_MATNUM)
)
/

-- ============================================================
-- Index: COMPOSES_FK
-- ============================================================
create index PROJ.COMPOSES_FK on PROJ.COMPOSE (CPD_MATNUM asc)
/

-- ============================================================
-- Index: COMPOSED_OF_FK
-- ============================================================
create index PROJ.COMPOSED_OF_FK on PROJ.COMPOSE (CPN_MATNUM
asc)
/

alter table PROJ.EMPLOYEE
add constraint FK_EMPLOYEE_CHIEF_EMPLOYEE foreign key
(EMP_EMPNUM)
references PROJ.EMPLOYEE (EMPNUM)
/

alter table PROJ.EMPLOYEE
add constraint FK_EMPLOYEE_BELONGS_T_DIVISION foreign key
(DIVNUM)
references ADMIN.DIVISION (DIVNUM)
/

alter table PROJ.PROJECT
add constraint FK_PROJECT_SUBCONTRA_CUSTOMER foreign key
(CUSNUM)
references PROJ.CUSTOMER (CUSNUM)
/

alter table PROJ.PROJECT
add constraint FK_PROJECT_IS_RESPON_EMPLOYEE foreign key
(EMPNUM)
references PROJ.EMPLOYEE (EMPNUM)
/

alter table PROJ.TASK
add constraint FK_TASK_BELONGS_T_PROJECT foreign key (PRONUM)
references PROJ.PROJECT (PRONUM)
/

alter table PROJ.PARTICIPATE
add constraint FK_PARTICIP_WORKS_ON_EMPLOYEE foreign key
(EMPNUM)
references PROJ.EMPLOYEE (EMPNUM)
/

alter table PROJ.PARTICIPATE
add constraint FK_PARTICIP_IS_DONE_B_TASK foreign key (PRONUM,
TSKNAME)
references PROJ.TASK (PRONUM, TSKNAME)
/

alter table PROJ.MEMBER
add constraint FK_MEMBER_MEMBER_TEAM foreign key (TEANUM)
references PROJ.TEAM (TEANUM)
/

alter table PROJ.MEMBER
add constraint FK_MEMBER_IS_MEMBER_EMPLOYEE foreign key
(EMPNUM)
references PROJ.EMPLOYEE (EMPNUM)
/

alter table PROJ.USED
add constraint FK_USED_USED_MATERIAL foreign key (MATNUM)
references PROJ.MATERIAL (MATNUM)
/

alter table PROJ.USED
add constraint FK_USED_USES_EMPLOYEE foreign key (EMPNUM)
references PROJ.EMPLOYEE (EMPNUM)
/

alter table PROJ.COMPOSE
add constraint FK_COMPOSE_COMPOSES_MATERIAL foreign key
(CPD_MATNUM)
references PROJ.MATERIAL (MATNUM)
/

alter table PROJ.COMPOSE
add constraint FK_COMPOSE_COMPOSED__MATERIAL foreign key
(CPN_MATNUM)
references PROJ.MATERIAL (MATNUM)
/


Referential Integrity
alter table PROJ.EMPLOYEE
add constraint FK_EMPLOYEE_CHIEF_EMPLOYEE foreign key (EMP_EMPNUM)
references PROJ.EMPLOYEE (EMPNUM)
/
alter table PROJ.EMPLOYEE
add constraint FK_EMPLOYEE_BELONGS_T_DIVISION foreign key (DIVNUM)
references ADMIN.DIVISION (DIVNUM)
/
alter table PROJ.PROJECT
add constraint FK_PROJECT_SUBCONTRA_CUSTOMER foreign key (CUSNUM)
references PROJ.CUSTOMER (CUSNUM)
/
alter table PROJ.PROJECT
add constraint FK_PROJECT_IS_RESPON_EMPLOYEE foreign key (EMPNUM)
references PROJ.EMPLOYEE (EMPNUM)
/
alter table PROJ.TASK
add constraint FK_TASK_BELONGS_T_PROJECT foreign key (PRONUM)
references PROJ.PROJECT (PRONUM)
/
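As a minimal sketch of what these constraints enforce, the following uses Python's built-in sqlite3 module rather than the DBMS the script targets. The trimmed-down DIVISION and EMPLOYEE tables (no schema prefixes, key columns only) and the helper try_insert are assumptions for illustration, not part of the generated script.

```python
import sqlite3

# In-memory database; SQLite enforces foreign keys only when this pragma is on.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Simplified stand-ins for the tables in the script (assumed column subset).
conn.executescript("""
create table DIVISION (
    DIVNUM integer not null primary key
);
create table EMPLOYEE (
    EMPNUM     integer not null primary key,
    EMP_EMPNUM integer,              -- chief of the employee (self-reference)
    DIVNUM     integer not null,
    constraint FK_EMPLOYEE_CHIEF_EMPLOYEE foreign key (EMP_EMPNUM)
        references EMPLOYEE (EMPNUM),
    constraint FK_EMPLOYEE_BELONGS_T_DIVISION foreign key (DIVNUM)
        references DIVISION (DIVNUM)
);
""")

conn.execute("insert into DIVISION (DIVNUM) values (1)")
conn.execute("insert into EMPLOYEE values (10, null, 1)")  # top-level chief
conn.execute("insert into EMPLOYEE values (11, 10, 1)")    # reports to 10


def try_insert(empnum, chief, divnum):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "insert into EMPLOYEE (EMPNUM, EMP_EMPNUM, DIVNUM) values (?, ?, ?)",
            (empnum, chief, divnum),
        )
        return True
    except sqlite3.IntegrityError:
        return False


bad_division = try_insert(12, 10, 99)  # division 99 does not exist
bad_chief = try_insert(13, 77, 1)      # employee 77 does not exist
good = try_insert(14, 11, 1)           # both parents exist
employee_count = conn.execute("select count(*) from EMPLOYEE").fetchone()[0]
```

Only the row whose chief and division both exist is accepted; the two rows with dangling references are rejected by the foreign keys.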
Physical Database Design Activities
Define Tables & Columns
Define Keys
Identify Critical Transactions
Add Columns:
    Redundant columns
    Derived data columns
Manipulate Tables:
    Collapse tables
    Supertypes & subtypes
Add Tables:
    Derived data tables
Handle Integrity Issues:
    Row uniqueness & domain restrictions
    Referential integrity & generate sequence numbers
    Derived and redundant data
Controlling Access
Manage Objects:
    Sizes
    Placement
Source: Gillette, Rob, et al., Physical Database Design for Sybase SQL Server, Prentice Hall, 1995.
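One of the activities above, adding a derived data column and keeping it consistent, can be sketched as follows. This is an illustrative sketch using Python's sqlite3 module, not the Sybase approach from the cited book. PRONUM and TSKNAME follow the course schema, while TSKCOST, TOTAL_COST, and the trigger names are assumed for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table PROJECT (
    PRONUM     integer not null primary key,
    TOTAL_COST numeric not null default 0   -- derived: sum of TASK.TSKCOST
);
create table TASK (
    PRONUM  integer not null references PROJECT (PRONUM),
    TSKNAME text    not null,
    TSKCOST numeric not null,
    primary key (PRONUM, TSKNAME)
);
-- Triggers keep the derived column in step with the detail rows.
create trigger TASK_AI after insert on TASK
begin
    update PROJECT set TOTAL_COST = TOTAL_COST + new.TSKCOST
    where PRONUM = new.PRONUM;
end;
create trigger TASK_AD after delete on TASK
begin
    update PROJECT set TOTAL_COST = TOTAL_COST - old.TSKCOST
    where PRONUM = old.PRONUM;
end;
-- Assumes a task is never moved to another project (PRONUM is not updated).
create trigger TASK_AU after update of TSKCOST on TASK
begin
    update PROJECT set TOTAL_COST = TOTAL_COST - old.TSKCOST + new.TSKCOST
    where PRONUM = new.PRONUM;
end;
""")

conn.execute("insert into PROJECT (PRONUM) values (1)")
conn.execute("insert into TASK values (1, 'design', 100)")
conn.execute("insert into TASK values (1, 'build', 250)")
conn.execute("update TASK set TSKCOST = 300 where TSKNAME = 'build'")
conn.execute("delete from TASK where TSKNAME = 'design'")

# The stored (derived) value should match a fresh aggregation.
derived = conn.execute(
    "select TOTAL_COST from PROJECT where PRONUM = 1").fetchone()[0]
recomputed = conn.execute(
    "select sum(TSKCOST) from TASK where PRONUM = 1").fetchone()[0]
```

The trade-off is the usual one for derived data: reads of TOTAL_COST avoid an aggregation, at the price of extra work on every write and an integrity rule that must be maintained.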
Architecture of Data Warehouse
[Figure: data warehouse architecture. Labels recoverable from the diagram:]
    Corporate operational database
    Data replication & cleansing
    Data bridging/transformation: data extraction, data filtering, table joining, translation, re-formatting
    Data warehouse (informational database): detailed and summarized data; past, current, projected, and derived data
    Metadata / info. directory
    End-user access and OLAP front-end tools: EIS, DSS, report writers, spreadsheets
Operational vs. Informational Databases

Characteristic        Operational Database                          Informational Database
Data content          Current value                                 Archival data, summarized data, calculated data
Data organization     Application by application                    Subject areas across enterprise
Data volatility       Dynamic                                       Static until refreshed
Data normalization    Fully normalized for transaction processing   Joined views suitable for business analysis
Access frequency      High                                          Low to medium
Data update           Updated on a record and field basis           Access only; no direct update
Usage                 Highly structured transaction processing      Highly unstructured, heuristic or analytical processing
Response time         Sub-second to 2-3 seconds                     Several seconds to minutes
Relational View and Multidimensional View
[Figure: Excel Pivot Table Wizard]
Dimensional Model
Dimension tables:
    Product: Key, Name, Description, Size, Price
    Promotion: Key, Description, Discount, Media
    Market Region: Key, Description, District, Region, Demographics
    Time: Key, Weekday, Holiday, Fiscal
Fact table:
    Sale: Product Key, Market Key, Promotion Key, Time Key, Dollars, Units, Price, Cost
Cube axes shown in the figure: Time, Region, Product
Modeling a Data Warehouse
MDM: Multidimensional Modeling
    A logical model of business information
    Easy to understand
    Applicable to relational and multidimensional databases
    Extremely useful for analysis
    A tried-and-tested technique
Why?
    An OLTP (On-Line Transaction Processing) design of an order processing system may have dozens or hundreds of tables, which makes it difficult for business managers to understand the design well enough to analyze the data.
Approach
Designed around numeric data:
    values
    counts
    weights
    occurrences
An example of an MDM problem statement:
    "What is my profitability by customer over time, by organization?"
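A problem statement of this shape maps naturally onto an aggregate query: the numeric measure is summed, and each "by" phrase becomes a grouping column. The sketch below uses Python's sqlite3 module; the PROFIT_FACT table, its columns, and the sample rows are invented for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical fact table: one row per customer, organization, and year.
create table PROFIT_FACT (
    CUSTOMER     text    not null,
    ORGANIZATION text    not null,
    YEAR         integer not null,
    REVENUE      integer not null,
    COST         integer not null
);
insert into PROFIT_FACT values
    ('Acme',   'East', 2001, 500, 300),
    ('Acme',   'East', 2002, 700, 400),
    ('Acme',   'West', 2002, 200, 250),
    ('Globex', 'East', 2002, 900, 600),
    ('Globex', 'East', 2002, 100,  50);
""")

# "Profitability by customer over time, by organization":
# the measure is aggregated, the dimensions become GROUP BY columns.
rows = conn.execute("""
    select CUSTOMER, YEAR, ORGANIZATION, sum(REVENUE - COST) as PROFIT
    from PROFIT_FACT
    group by CUSTOMER, YEAR, ORGANIZATION
    order by CUSTOMER, YEAR, ORGANIZATION
""").fetchall()

for row in rows:
    print(row)
```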

The Classic Star Schema
Market Dimension: Market ID, description, region, state, district, city
Product Dimension: Product ID, description, supplier ID, brand, color, size
Period Dimension: Period ID, description, year, quarter, month, current flag, resolution, sequence
Fact Table: Market ID, Product ID, Period ID, dollars, units, price
Each dimension is described by its own table, and the facts are arranged in a single large table whose concatenated primary key comprises the individual keys of each dimension.
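The structure described above can be sketched in a few lines of DDL. This is an illustrative sketch in Python's sqlite3 module. The table names MARKET, PRODUCT, PERIOD, and SALES_FACT, the reduced column lists, and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- One table per dimension (only a few of the slide's attributes are kept).
create table MARKET  (MARKET_ID  integer primary key, REGION text, CITY text);
create table PRODUCT (PRODUCT_ID integer primary key, BRAND text);
create table PERIOD  (PERIOD_ID  integer primary key,
                      YEAR integer, QUARTER integer, MONTH integer);

-- Single fact table; its primary key concatenates the dimension keys.
create table SALES_FACT (
    MARKET_ID  integer not null references MARKET (MARKET_ID),
    PRODUCT_ID integer not null references PRODUCT (PRODUCT_ID),
    PERIOD_ID  integer not null references PERIOD (PERIOD_ID),
    DOLLARS    integer not null,
    UNITS      integer not null,
    primary key (MARKET_ID, PRODUCT_ID, PERIOD_ID)
);

insert into MARKET  values (1, 'East', 'Boston'), (2, 'West', 'Seattle');
insert into PRODUCT values (1, 'BrandA'), (2, 'BrandB');
insert into PERIOD  values (1, 2002, 1, 1), (2, 2002, 1, 2), (3, 2002, 2, 4);
insert into SALES_FACT values
    (1, 1, 1, 100, 10),
    (1, 1, 2, 150, 15),
    (1, 2, 3,  80,  4),
    (2, 1, 1,  60,  6),
    (2, 2, 3,  40,  2);
""")

# A typical star join: constrain or group by dimension attributes,
# aggregate the facts.
rows = conn.execute("""
    select m.REGION, p.QUARTER, sum(f.DOLLARS)
    from SALES_FACT f
    join MARKET m on m.MARKET_ID = f.MARKET_ID
    join PERIOD p on p.PERIOD_ID = f.PERIOD_ID
    group by m.REGION, p.QUARTER
    order by m.REGION, p.QUARTER
""").fetchall()

# The concatenated key allows one fact row per market/product/period.
try:
    conn.execute("insert into SALES_FACT values (1, 1, 1, 999, 99)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Every query touches the fact table plus at most one join per dimension, which is what keeps the star form easy for analysts to read.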
Snowflake Structure

Sale (fact table)
    Time identifier                 <fk>  int
    Customer identifier             <fk>  int
    Store identifier                <fk>  int
    Product identifier              <fk>  int
    Sale total                            real
    Sale revenue                          real
Customer
    Customer identifier             <pk>  int
    Customer name                         char(30)
    Customer address                      char(80)
    Customer activity                     char(80)
    Customer phone number                 char(12)
    Customer fax number                   char(12)
Store
    Store identifier                <pk>  int
    Region identifier               <fk>  int
    Store name                            char(50)
    Store address                         char(80)
    Store manager                         char(30)
    Store phone number                    char(20)
    Store FAX number                      char(20)
    Store financial services type         char(10)
    Store photo services type             char(10)
Region
    Region identifier               <pk>  int
    Country identifier              <fk>  int
    Region name                           char(30)
Country
    Country identifier              <pk>  int
    Country name                          char(80)
Product
    Product identifier              <pk>  int
    Brand identifier                <fk>  int
    Product description                   char(80)
    Product category                      char(30)
    Product unit price                    int
Brand
    Brand identifier                <pk>  int
    Brand name                            char(30)
Day
    Time identifier                 <pk>  int
    Week identifier                 <fk>  int
    Date                                  datetime
    Day of week                           char(30)
    Day number in month                   int
Week
    Week identifier                 <pk>  int
    Month identifier                <fk>  int
    Week name                             char(30)
    Week number in year                   int
Month
    Month identifier                <pk>  int
    Quarter identifier              <fk>  int
    Month name                            char(10)
Quarter
    Quarter identifier              <pk>  int
    Year identifier                 <fk>  int
    Quarter name                          char(10)
Year
    Year identifier                 <pk>  int
    Year name                             char(30)

Join conditions shown on the diagram (foreign key = primary key):
    Time identifier = Time identifier
    Customer identifier = Customer identifier
    Store identifier = Store identifier
    Product identifier = Product identifier
    Brand identifier = Brand identifier
    Region identifier = Region identifier
    Country identifier = Country identifier
    Week identifier = Week identifier
    Month identifier = Month identifier
    Quarter identifier = Quarter identifier
    Year identifier = Year identifier
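In a snowflake structure the dimension hierarchies are normalized into separate tables, so rolling facts up to a higher level means walking the chain of foreign keys. The sketch below, in Python's sqlite3 module, covers only the time branch (day, week, month, quarter, year). The table names with the _DIM suffix, the shortened column names, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Time hierarchy, one table per level, each pointing at its parent level.
create table YEAR_DIM    (YEAR_ID integer primary key, YEAR_NAME text);
create table QUARTER_DIM (QUARTER_ID integer primary key,
                          YEAR_ID integer references YEAR_DIM (YEAR_ID),
                          QUARTER_NAME text);
create table MONTH_DIM   (MONTH_ID integer primary key,
                          QUARTER_ID integer references QUARTER_DIM (QUARTER_ID),
                          MONTH_NAME text);
create table WEEK_DIM    (WEEK_ID integer primary key,
                          MONTH_ID integer references MONTH_DIM (MONTH_ID));
create table DAY_DIM     (TIME_ID integer primary key,
                          WEEK_ID integer references WEEK_DIM (WEEK_ID));
-- Fact table reduced to the time key and one measure.
create table SALE        (TIME_ID integer references DAY_DIM (TIME_ID),
                          SALE_TOTAL real);

insert into YEAR_DIM    values (1, '2002');
insert into QUARTER_DIM values (1, 1, 'Q1'), (2, 1, 'Q2');
insert into MONTH_DIM   values (1, 1, 'January'), (4, 2, 'April');
insert into WEEK_DIM    values (1, 1), (14, 4);
insert into DAY_DIM     values (100, 1), (101, 1), (200, 14);
insert into SALE        values (100, 10.0), (101, 5.0), (200, 7.5);
""")

# Roll daily sales up to quarter level by following each foreign key in turn.
rows = conn.execute("""
    select y.YEAR_NAME, q.QUARTER_NAME, sum(s.SALE_TOTAL)
    from SALE s
    join DAY_DIM     d on d.TIME_ID    = s.TIME_ID
    join WEEK_DIM    w on w.WEEK_ID    = d.WEEK_ID
    join MONTH_DIM   m on m.MONTH_ID   = w.MONTH_ID
    join QUARTER_DIM q on q.QUARTER_ID = m.QUARTER_ID
    join YEAR_DIM    y on y.YEAR_ID    = q.YEAR_ID
    group by y.YEAR_NAME, q.QUARTER_NAME
    order by y.YEAR_NAME, q.QUARTER_NAME
""").fetchall()
```

Compared with a star schema, the snowflake form removes redundancy in the dimensions at the cost of more joins per query (five here just for the time branch).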
Steps to Build MDM
Pick a business subject area
    Weekly sales reports, monthly financial statements, insurance claim costs.
Ask six fundamental questions:
    What business process is being modeled?
    At what level of detail (granularity) is "active" analysis conducted?
    What do the measures have in common (the "dimensions")?
    What are the dimensions' attributes?
    Are the attributes stable or variable over time, and is their "cardinality" bounded or unbounded?

Issues
Active analysis
    Mechanical manipulation: pivoting, drilling down, graphing
    Agent-based manipulation: alert reporting, exception reporting
    Workflow manipulation: publishing, distributing documents
Cardinality means "how many"
    A relational database usually has "unbounded" cardinality.
    A multidimensional database usually has "bounded" cardinality. Complete reorganization is needed to change cardinality.

A Data Model for an Electronic Commerce Application

basket
    shopper_id            char(32)
    date_changed          datetime
    marshalled_order      image
dept
    dept_id               int
    parent_id             int
    name                  varchar(255)
    description           text
    date_changed          datetime
product_attribute
    pfid                  varchar(30)
    attribute_id          tinyint
    attribute_index       tinyint
    attribute_value       varchar(20)
product_family
    pfid                  varchar(30)
    dept_id               int
    manufacturer_id       int
    name                  varchar(255)
    short_description     varchar(255)
    long_description      text
    image_filename        varchar(255)
    intro_date            datetime
    date_changed          datetime
    list_price            int
    monogramable          tinyint
product_variant
    sku                   int
    pfid                  varchar(30)
    attribute0            tinyint
    attribute1            tinyint
    attribute2            tinyint
    attribute3            tinyint
    attribute4            tinyint
promo_cross
    pfid                  varchar(30)
    related_pfid          varchar(30)
    description           varchar(255)
promo_price
    promo_name            varchar(255)
    promo_type            int
    promo_description     text
    promo_rank            int
    active                int
    date_start            datetime
    date_end              datetime
    shopper_all           int
    shopper_column        varchar(64)
    shopper_op            varchar(2)
    shopper_value         varchar(64)
    cond_all              int
    cond_column           varchar(64)
    cond_op               varchar(2)
    cond_value            varchar(64)
    cond_basis            char(1)
    cond_min              int
    award_all             int
    award_column          varchar(64)
    award_op              varchar(2)
    award_value           varchar(64)
    award_max             int
    disjoint_cond_award   int
    disc_type             char(1)
    disc_value            real
promo_upsell
    pfid                  varchar(30)
    related_pfid          varchar(30)
    description           varchar(255)
receipt
    order_id              char(26)
    shopper_id            char(32)
    total                 int
    status                tinyint
    date_entered          datetime
    date_changed          datetime
    marshalled_receipt    image
receipt_item
    pfid                  varchar(30)
    sku                   int
    order_id              char(26)
    row_id                int
    quantity              int
    adjusted_price        int
shopper
    shopper_id            char(32)
    created               datetime
    name                  varchar(235)
    password              varchar(20)
    street                varchar(50)
    city                  varchar(50)
    state                 varchar(30)
    zip                   varchar(15)
    country               varchar(20)
    phone                 varchar(16)
    email                 varchar(50)

Join conditions shown on the diagram:
    dept_id = parent_id
    dept_id = dept_id
    pfid = pfid (five relationships)
    sku = sku
    shopper_id = shopper_id (two relationships)
    order_id = order_id
Attribute 0 of pfid 14 is size; attribute value 1 is Grande, 2 is Tall, and 3 is Short.
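One plausible reading of this note, stated here as an assumption rather than as the application's documented behavior, is that product_variant.attribute0 stores an index that is decoded through product_attribute rows with attribute_id = 0 for the same pfid. The sketch below uses Python's sqlite3 module. The SKU numbers are invented, and only the two tables involved are created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table product_attribute (
    pfid            varchar(30),
    attribute_id    tinyint,
    attribute_index tinyint,
    attribute_value varchar(20)
);
create table product_variant (
    sku        int,
    pfid       varchar(30),
    attribute0 tinyint,
    attribute1 tinyint,
    attribute2 tinyint,
    attribute3 tinyint,
    attribute4 tinyint
);

-- Attribute 0 (size) of product family 14, as described in the note.
insert into product_attribute values
    ('14', 0, 1, 'Grande'),
    ('14', 0, 2, 'Tall'),
    ('14', 0, 3, 'Short');

-- Hypothetical variants: SKU 1001 is Grande, SKU 1002 is Short.
insert into product_variant (sku, pfid, attribute0) values
    (1001, '14', 1),
    (1002, '14', 3);
""")

# Decode attribute0 (assumed to be an index) into its display value.
rows = conn.execute("""
    select v.sku, a.attribute_value
    from product_variant v
    join product_attribute a
      on a.pfid = v.pfid
     and a.attribute_id = 0
     and a.attribute_index = v.attribute0
    order by v.sku
""").fetchall()
```

Storing small integer codes in attribute0 through attribute4 keeps product_variant compact, while the per-family lookup rows let each product family define its own meaning for each attribute slot.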
Web-based Build-To-Order Application

Data Model for Build-To-Order Application

http://www.oracle.com/tools/jdeveloper/documents/jsptwp/index.html?content.html
Auction Web Site's Data Model
