
THEORY

DATA: The term data refers to known facts that can be recorded and stored on computer
media.
Data consist of facts, text, graphics, images, sound, and video segments that have
meaning in the user environment.

Kind of data   Examples
Facts          Cust name, Address, Phone no
Text           Documents
Numeric        Digits
Images         Photos
Sound          Audio files
Video          Video files
Graphics       Excel graphs

DATABASE: A database is an organized collection of logically related data.

Organized means the data are structured so that they can be easily stored, manipulated, and
retrieved by users.
Related means that the data describe a domain of interest to a group of users, who can use the
data to answer questions concerning that domain.

Sr. no   Database       Data
1        Time table     Day, Hour, Subject
2        Sales person   Cust name, Address, Phone no

INFORMATION: Data that have been processed in such a way as to increase the
knowledge of the person who uses the data.

EX:
Day   Hour   Subject
Mon   4th    SQL programming
Tue   1st    SQL programming
Thu   3rd    MDBMS
Fri   2nd    MDBMS

METADATA: Data that describe the properties or characteristics of other data.

Data item                         Value
Name      Type           Length   Min   Max   Description
Course    Alphanumeric   30                   Course id and name
Hour      Integer                 1     6     Hours in a day
Subject   Alphanumeric   15                   Book name

Here the data type, length, minimum value, and maximum value are considered metadata.
Metadata describe the properties of data but do not include the data themselves.
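
The distinction between metadata and data can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the table name timetable, the column names, and the 1-to-6 range for Hour are hypothetical choices modelled on the table above, not something the notes prescribe.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE timetable ("
    " course VARCHAR(30),"
    " hour INTEGER CHECK (hour BETWEEN 1 AND 6),"  # assumed min/max
    " subject VARCHAR(15))"
)
conn.execute("INSERT INTO timetable VALUES ('SQL programming', 4, 'SQL')")

# Metadata: column names and declared types, read from the catalog.
metadata = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(timetable)")]

# Data: the stored occurrences themselves.
data = conn.execute("SELECT * FROM timetable").fetchall()

print(metadata)
print(data)
```

The first list describes the columns without containing any stored value; the second list contains values without describing them.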

DISADVANTAGES OF FILE PROCESSING SYSTEMS

1. PROGRAM DATA DEPENDENCE: File descriptions are stored within each application
program that accesses a given file. As a result, any change to a file structure requires changes
to the file description in every program that accesses the file.

For example, suppose it is decided to change the customer address field length in the records
from 30 to 40 characters. The file description in each affected program would have to be
modified, and it is often difficult even to locate all the programs affected by such a change.
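
The address-length example can be sketched in a few lines of Python. This is a minimal illustration, not a real file processing system: the 20-character name field, the sample addresses, and the function names make_record and billing_program are all assumptions.

```python
# Hypothetical fixed-width layout: name (20 chars) followed by address.
NAME_LEN = 20
ADDRESS_LEN = 30  # the file description is hard-coded inside the program


def make_record(name, address, address_len):
    """Write one fixed-width record using the given address field length."""
    return name.ljust(NAME_LEN) + address.ljust(address_len)


def billing_program(record):
    """A program that still assumes a 30-character address field."""
    return record[NAME_LEN:NAME_LEN + ADDRESS_LEN].strip()


old_record = make_record("Smith", "452 Walnut Street", 30)
new_address = "452 Walnut Street, Apartment 12, Boulder"
new_record = make_record("Smith", new_address, 40)  # file changed to 40 chars

read_old = billing_program(old_record)
read_new = billing_program(new_record)  # silently truncated

print(read_old)
print(read_new)
```

Until ADDRESS_LEN is edited in every program that embeds it, each of those programs reads the new records incorrectly.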

2. DUPLICATION OF DATA: Applications are often developed independently in a file
processing system, so unplanned duplicate data files are the rule rather than the exception.
For example, the order filling system contains an inventory master file while the invoicing
system contains an inventory pricing file. Both files undoubtedly contain product descriptions,
unit prices, and quantities on hand. This duplication is wasteful: it requires additional storage
space and increased effort to keep all the files up to date.

3. LIMITED DATA SHARING: In the traditional file processing system each application has
its own private files, and users have little opportunity to share data outside their own application.

For example, users in the accounting department have access to the invoicing system and its
files, but they do not have access to the order filling system or the payroll system. It is very
difficult to produce a requested report from several incompatible files in separate systems.

4. LENGTHY DEVELOPMENT TIMES: In the traditional file processing system there is little
opportunity to reuse previous development efforts. Each new application requires new file formats
and descriptions, so writing the file access logic for each new program requires lengthy
development times.

5. EXCESSIVE PROGRAM MAINTENANCE: The preceding factors combine to create a heavy
program maintenance load. As much as 80 percent of the total information systems
development budget may be devoted to program maintenance in such organizations.
These problems are not limited to file processing. For example, if an organization develops many
separately managed databases with little or no coordination of the metadata, then all the
problems described above can still occur.

THE RANGE OF DATABASE APPLICATIONS: The range of database applications can be
divided into five categories.

1. PERSONAL DATABASES: Personal databases are designed to support one user.

Ex: 1. Personal computers
    2. Laptops
Personal digital assistants (PDAs) have incorporated personal databases into handheld
devices. These devices function not only as computing devices but also as cellular phones, fax
senders, and web browsers. A simple database application that stores customer information can
be used from a PC or a PDA, and the data can easily be transferred from one device to the other
for backup and work purposes.

Figure 1.7 from pg no 15.

Disadvantages: The data cannot be easily shared with other users. Personal databases are
therefore suited only to very small organizations.

2. WORKGROUP DATABASES: A workgroup is a relatively small team of people who collaborate
on the same project or application, or on a group of similar projects or applications. A workgroup
typically comprises fewer than 25 persons. A workgroup database lets the members of the group
easily share data and development work.

All the workgroup members are linked by a local area network (LAN). The database is
stored on a central device called the database server, which is also connected to the network.
Different types of group members (for example, developer or project manager) may have
different user views of the shared database.

Figure 1.8 page no 16

3. DEPARTMENT DATABASES: A department is a functional unit within an organization.
A department is generally larger than a workgroup, typically between 25 and 100 persons.
Department databases are designed to support the various functions and activities of a department.

4. ENTERPRISE DATABASES: An enterprise database is one whose scope is the entire
organization or enterprise. Such databases are intended to support organization-wide operations
and decision making. An organization may have several enterprise databases, because a single
operational enterprise database is impractical for many medium to large organizations. An
enterprise database needs information from many supporting departments.

Enterprise databases have evolved through two major developments:

1. Enterprise resource planning (ERP) systems
2. Data warehousing implementations

ENTERPRISE RESOURCE PLANNING SYSTEMS: A business management system that
integrates all functions of the enterprise, such as manufacturing, sales, finance, marketing,
inventory, accounting, and human resources. ERP systems are software applications that provide
the data necessary for the enterprise to examine and manage its activities.

All ERP systems depend heavily on databases to store the data required by the ERP
applications.

DATA WAREHOUSE: An integrated decision support database whose content is derived from
the various operational databases.
Data warehouses collect their content from various operational databases, including personal,
workgroup, and department databases. Data warehouses let users work with
historical data.

Fig 1.9 pg. no 19

INTERNET, INTRANET, EXTRANET DATABASES:

INTERNET: A global network of public computers that connects users of multiple platforms.
Telephone wire, cable, and satellite links connect millions of computers around the world to
each other.

WORLD WIDE WEB: A worldwide network that connects users of multiple platforms easily
through an interface known as a web browser.

INTERNET DATABASE: A database that users reach through a web browser is called an
Internet database.

INTRANET: Use of Internet protocols to establish access to company data and information that
is limited to the organization.
EXTRANET: Use of Internet protocols to establish limited access to company data and
information by the company's customers and suppliers.

ADVANTAGES OF DATABASE APPROACH:

PROGRAM DATA INDEPENDENCE:

The separation of data descriptions (metadata) from the application programs that use the data is
called data independence. With the database approach, data descriptions are stored in a
central location called the repository. This property of database systems allows an
organization's data to change without changing the application programs that process the
data.

MINIMAL DATA REDUNDANCY:

The database approach does not eliminate redundancy entirely, but it allows the designer
to control carefully the type and amount of redundancy. For example, each order in the order
table contains a customer-id to establish the relationship between orders and customers.

IMPROVED DATA CONSISTENCY: By eliminating or controlling data redundancy we greatly
reduce the opportunities for inconsistency. If a customer address is stored only once, the stored
values cannot disagree. Updating data values is greatly simplified when each value is stored in
only one place. We also avoid the wasted storage space that results from redundant storage.
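
The customer-id example and the point about storing each value once can be sketched with Python's sqlite3 module. This is an illustrative sketch only: the table names, column names, and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT,
    address     TEXT              -- stored once, here only
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customer(customer_id),  -- controlled redundancy
    order_date  TEXT
);
INSERT INTO customer VALUES (1, 'Smith', '452 Walnut Street');
INSERT INTO orders VALUES (100, 1, '2024-01-05'), (101, 1, '2024-02-10');
""")

# One update in one place; every order sees the new address through the join.
conn.execute("UPDATE customer SET address = '944 Maple Street' WHERE customer_id = 1")
rows = conn.execute(
    "SELECT o.order_id, c.address FROM orders o "
    "JOIN customer c ON c.customer_id = o.customer_id "
    "ORDER BY o.order_id"
).fetchall()
print(rows)
```

Only the customer_id is repeated in the orders table; the address itself lives in a single row, so the two orders cannot show different addresses.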

USER VIEW
A logical description of some portion of the database that is required by a user to perform some task.

IMPROVED DATA SHARING: A database is a shared corporate resource. Authorized
internal and external users are granted permission to use the database. Each user is provided
one or more user views to facilitate this use. A user view is often a form or report that comprises
data from more than one table.
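
A user view that draws on more than one table can be approximated with an SQL view. The sketch below uses Python's sqlite3 module; the table names, the view name customer_order_summary, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
INSERT INTO customer VALUES (1, 'Smith'), (2, 'Thomas');
INSERT INTO orders VALUES (100, 1, 250.0), (101, 1, 75.0), (102, 2, 40.0);

-- A report-like view that combines data from both tables.
CREATE VIEW customer_order_summary AS
    SELECT c.name AS name,
           COUNT(o.order_id) AS order_count,
           SUM(o.total) AS total_spent
    FROM customer c JOIN orders o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id, c.name;
""")

summary = conn.execute(
    "SELECT * FROM customer_order_summary ORDER BY name"
).fetchall()
print(summary)
```

A user who is granted access to the view alone sees the summary without needing to know how the underlying tables are organized.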

INCREASED PRODUCTIVITY OF APPLICATION DEVELOPMENT:

A major advantage of the database approach is that it greatly reduces the cost and time of
developing new business applications.

Assuming that the database and the related data capture and maintenance applications have
already been designed and implemented, the programmer can concentrate on the specific
functions required for the new application.

ENFORCEMENT OF STANDARDS
When the database approach is implemented with full management support, the database
administration function should be granted single-point authority and responsibility for
establishing and enforcing standards. These standards include naming conventions, data
quality standards, and uniform procedures for accessing, updating, and protecting data.

IMPROVED DATA QUALITY


Database designers can specify integrity constraints that are enforced by DBMS.
A constraint is a rule that cannot be violated by database users.
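
How a DBMS enforces an integrity constraint can be sketched with Python's sqlite3 module. The product table, its columns, and the rule that a unit price may not be negative are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product ("
    " product_id INTEGER PRIMARY KEY,"
    " product_name TEXT NOT NULL,"
    " unit_price REAL CHECK (unit_price >= 0))"  # hypothetical integrity constraint
)
conn.execute("INSERT INTO product VALUES (1, 'computer desk', 199.0)")

try:
    # Violates the CHECK constraint, so the DBMS refuses the row.
    conn.execute("INSERT INTO product VALUES (2, 'bookcase', -5.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

count = conn.execute("SELECT COUNT(*) FROM product").fetchone()[0]
print(rejected, count)
```

The rule is declared once in the table definition, so it applies to every program and user that writes to the table.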

IMPROVED DATA ACCESSIBILITY AND RESPONSIVENESS:
In a relational database, end users without programming experience can often
retrieve and display data.

SELECT * FROM product WHERE product_name = 'computer desk' is an SQL command that displays
the information about computer desks.
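
The query above can be tried against a small in-memory database with Python's sqlite3 module. The column layout and the two sample rows are assumptions; only the product table name and the product_name column come from the query itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (product_id INTEGER PRIMARY KEY,"
    " product_name TEXT, unit_price REAL)"  # hypothetical columns
)
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [(1, "computer desk", 199.0), (2, "bookcase", 120.0)],
)

# Same query as in the text, with the value passed as a parameter.
desks = conn.execute(
    "SELECT * FROM product WHERE product_name = ?", ("computer desk",)
).fetchall()
print(desks)
```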

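REDUCED PROGRAM MAINTENANCE:

Stored data must be changed for a variety of reasons: new data item types are added, data
formats are changed, and so on. For example, to rectify the year 2000 problem, the common
two-digit year fields were extended to four digits. Because data descriptions are kept apart
from the application programs, such changes require far fewer program modifications than in a
file processing system.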

COSTS AND RISKS OF THE DATABASE APPROACH

NEW SPECIALIZED PERSONNEL:

Organizations that adopt the database approach need to hire and train individuals to design and
implement databases, provide database administration services, and manage a staff of new
people. Because of the rapid changes in technology, these new people will have to be retrained
or upgraded on a regular basis.

INSTALLATION AND MANAGEMENT COST AND COMPLEXITY

A multiuser database management system is a large and complex suite of software that has high
initial costs and requires a staff of trained personnel to install and operate. Installing such a
system may also require upgrades to the hardware and data communication systems in the
organization. Substantial training is normally required on an ongoing basis to keep up with new
releases and upgrades.

CONVERSION COSTS:
The term legacy system is widely used to refer to older applications in an organization
that are based on file processing systems. The cost of converting these older systems to modern
database technology is measured in terms of dollars, time, and organizational commitment.

NEED FOR EXPLICIT BACKUP AND RECOVERY:

A shared corporate database must be accurate and available at all times. This requires that
comprehensive procedures be developed and used for providing backup copies of data and for
restoring a database when damage occurs. A modern database management system normally
automates many of the backup and recovery tasks.

ORGANIZATIONAL CONFLICT:
A shared database requires a consensus on data definitions and ownership, as well as on
responsibilities for accurate data maintenance. Experience has shown that conflicts over data
definitions, data formats and coding, and rights to update shared data are frequent. Handling
these issues requires organizational commitment and organizationally astute database
administrators.

COMPONENTS OF THE DATABASE ENVIRONMENT

1. COMPUTER AIDED SOFTWARE ENGINEERING (CASE) TOOLS
Automated tools used to design databases and application programs.

2. REPOSITORY: A centralized knowledge base of all data definitions, data relationships,
screen and report formats, and other system components. A repository contains an
extended set of metadata important for managing databases.

3. DATABASE MANAGEMENT SYSTEM (DBMS)
A commercial software application that is used to create, maintain, and provide
controlled access to user databases.

4. DATABASE: A database is an organized collection of logically related data.
The repository contains definitions of data, whereas the database contains occurrences
of the data.

5. APPLICATION PROGRAM
Computer programs that are used to create and maintain the database and provide
information to users

6. USER INTERFACE
Languages, menus, and other facilities by which users interact with various system
components, such as CASE tools, application programs, the DBMS and the repository

7. DATA ADMINISTRATORS
Persons who are responsible for the overall information resources of an organization.
Data administrators use CASE tools to improve the productivity of database planning and design.

8. SYSTEM DEVELOPERS:
Persons such as systems analysts and programmers who design new application
programs. System developers often use CASE tools for system requirements analysis and
program design

9. END USERS
Persons throughout the organization who add, delete, and modify data in the database and
who request or receive information from it. All user interactions with the database must
be routed through the DBMS.

THE DATABASE DEVELOPMENT PROCESS

DATABASE DEVELOPMENT WITHIN INFORMATION SYSTEMS DEVELOPMENT

In many organizations, database development begins with enterprise data modeling.

ENTERPRISE DATA MODELING: The first step in database development, in which the
scope and general contents of the organizational databases are specified.
Fig 2.1 page no 37

INFORMATION SYSTEMS ARCHITECTURE (ISA):

A conceptual blueprint or plan that expresses the desired future structure for the
information systems in an organization.
It consists of six key components:

1. DATA: Represented in fig 2.1.
2. PROCESSES: Which manipulate data. These can be represented by data flow diagrams.
3. NETWORK: Which transports data around the organization and between the
   organization and its business partners.
4. PEOPLE: Who perform processes and are the source and receiver of data and
   information.
5. EVENTS AND POINTS IN TIME: When processes are performed. These can
   be shown by state transition diagrams.
6. REASONS: For events and rules that govern the processing of data. Some
   diagrammatic tools, such as decision tables, exist for rules.

INFORMATION ENGINEERING
It is a data-oriented methodology for creating and maintaining information systems. Because of
its data orientation, information engineering helps in understanding how databases are
identified and defined. Information engineering follows top-down planning.

Information engineering includes four steps:

1. Planning 2. Analysis 3. Design 4. Implementation

TOP DOWN PLANNING:


A generic information systems planning methodology that attempts to gain a broad
understanding of the information system needs of the entire organization

INFORMATION SYSTEMS PLANNING:


The goal of information systems planning is to align information technology with the
business strategies of the organization. This planning phase includes three sections

1. IDENTIFYING STRATEGIC PLANNING FACTORS

Planning factors           Examples
Organizational goals       Maintain 10% per year growth rate
                           Maintain 15% before-tax return on investment
                           Avoid employee layoffs
                           Be a responsible corporate citizen
Critical success factors   High quality products
                           On-time deliveries of finished products
                           High productivity of employees
Problem areas              Inaccurate sales forecast
                           Increasing competition
                           Stockouts of finished products

For example, the problem area of inaccurate sales forecasts might cause information system
managers to place additional historical sales data, new market research data, or data concerning
results from test trials of new products in organizational databases.

2. IDENTIFYING CORPORATE PLANNING OBJECTS


The corporate planning objects define the business scope

1. Organizational units
The various departments of the organization
Ex: sales, orders, accounting, manufacturing

2. Organizational locations
The places where business operations occur
Ex: corporate headquarters, Durango plant, western regional sales office, lumber mill

3. Business functions
A related group of business processes that support some aspect of the mission of an
enterprise
Ex: business planning, product development, materials management, marketing and sales

4. Entity types
Major categories of data about the people, places, and things managed by the organization
Ex: customer, product, raw material, order, work center, invoice

5. Information systems
The application software and supporting procedures for handling sets of data
Ex:
Transaction processing systems   Management information systems
Order tracking                   Sales management
Order processing                 Inventory control
Plant scheduling                 Production scheduling

3. DEVELOPING AN ENTERPRISE MODEL

A comprehensive enterprise model consists of a functional breakdown (or
decomposition) model of each business function.
Functional decomposition: An iterative process of breaking down the description of a system
into finer and finer detail, in which one function is described in greater detail by a set of other,
supporting functions.
Fig 2.2 in pg. 39
The figure shows an example decomposition of an order fulfillment function. Many databases
are necessary to handle the full set of business functions and supporting functions. A particular
database may support only a subset of the supporting functions.

An enterprise data model shows not only the entity types but also the relationships
between data entities. A common format for showing the interrelationships between planning
objects is the matrix.

A wide variety of planning matrixes can be used, including the following:

LOCATION TO FUNCTION: Indicates which business functions are being performed at
which business locations

UNIT TO FUNCTION: Identifies which business functions are performed by or are the
responsibility of which business units

INFORMATION SYSTEM TO DATA ENTITY: Explains how each information system
interacts with each data entity
Ex: whether each system creates, retrieves, updates, or deletes data in each entity

SUPPORTING FUNCTIONS TO DATA ENTITY: Identifies which data are captured, used,
updated, or deleted within each function

INFORMATION SYSTEM TO OBJECTIVE: Shows which information systems support
each business objective

SYSTEMS DEVELOPMENT LIFE CYCLE (SDLC):

A traditional process for conducting an information systems development project is called the
systems development life cycle. The SDLC is a complete set of steps that a team of
information systems professionals, including database designers and programmers, follows
to specify, develop, maintain, and replace information systems.

THREE SCHEMA ARCHITECTURE DEVELOPMENT

1. CONCEPTUAL SCHEMA (during the analysis phase)
2. EXTERNAL SCHEMA OR USER VIEW (during the analysis and logical design phases)
3. PHYSICAL OR INTERNAL SCHEMA (during the physical design phase)

CONCEPTUAL SCHEMA:
A conceptual schema is a detailed specification of the overall structure of organizational
data that is independent of any database management technology.
A conceptual schema defines the whole database without reference to how data are stored in a
computer's secondary memory. Specifications for the conceptual schema are stored as metadata
in a repository or data dictionary.

EXTERNAL SCHEMA OR USER VIEW:

A user view was defined earlier as a logical description of some portion of the database that is
required by a user to perform some task.
A user view is defined both in logical (technology independent) terms and in
programming language terms (that is, consistent with the syntax of the programming language).
The original description of a user view is often a computer screen display of a business
transaction.
Ex: subscription renewal form

PHYSICAL SCHEMA:
A physical schema contains the specifications for how data from a conceptual schema are
stored in a computer's secondary memory.

THREE-TIERED DATABASE LOCATION ARCHITECTURE

Three tiers are commonly considered:

1. Data on a client
2. Data on an application server or web server
3. Data on a database server

1. CLIENT TIER: A desktop or laptop computer that concentrates on managing the user-system
interface and localized data; also called the presentation tier. Web scripting tasks may
be executed on this tier.

2. APPLICATION/WEB SERVER TIER:
Processes HTTP requests and scripting tasks, performs calculations, and provides access to
data; also called the process services tier.

3. ENTERPRISE SERVER (DATABASE SERVER) TIER:
Performs sophisticated calculations and manages the merging of data from multiple
sources across the organization; also called the data services tier.

The three-tiered architecture for databases and information systems is related to the concept of
client/server architecture.

CLIENT/SERVER ARCHITECTURE

A local area network based environment in which database software on a server (called a
database server or database engine) performs database commands sent to it from client
workstations, and application programs on each client concentrate on user interface functions.
It allows simultaneous processing on multiple processors for the same application.
It makes it possible to take advantage of the best data processing features of each computer
platform.
It lets you mix client technologies and still share common data.

MODELING DATA IN THE ORGANIZATION

MODELING THE RULES OF THE ORGANIZATION

Business rules and policies govern creating, updating, and removing data in an
information processing and storage system.
For example, the rule "every student in a university must have a faculty adviser" constrains the
data in a database: each student must be associated with an adviser.
Business rules and policies are not universal. Different universities may have different policies
for student advising. The policies of an organization may also change over time. A university
may decide that a student does not have to be assigned a faculty adviser until the student
chooses a major.
THE ROLE OF A DATABASE ANALYST

1. Identify and understand the rules that govern the data
2. Represent those rules so that they can be unambiguously understood by information systems
   developers and users
3. Implement those rules in database technology

OVERVIEW OF BUSINESS RULES

A business rule is a statement that defines or constrains some aspect of the business. It is
intended to assert business structure or to control or influence the behavior of the business.

Ex: A student may register for a section of a course only if he or she has successfully completed
the prerequisites for that course.

A preferred customer qualifies for a 10 percent discount unless he or she has an overdue account
balance.
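
The preferred-customer rule can be written down as a small Python function. This is a sketch of one possible encoding, not the way a DBMS would necessarily enforce it; the Customer class, its field names, and the function name discount_rate are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    # Hypothetical fields used only to express the rule.
    preferred: bool
    overdue_balance: float


def discount_rate(customer: Customer) -> float:
    """Preferred customers get 10 percent unless an account balance is overdue."""
    if customer.preferred and customer.overdue_balance <= 0:
        return 0.10
    return 0.0


print(discount_rate(Customer(preferred=True, overdue_balance=0.0)))
print(discount_rate(Customer(preferred=True, overdue_balance=50.0)))
```

The rule itself stays a statement of policy; the function is just one way an application might check it.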

CHARACTERISTICS OF GOOD BUSINESS RULES:

CHARACTERISTIC: EXPLANATION

DECLARATIVE: A business rule is a statement of policy, not of how the policy is enforced or
conducted. The rule does not describe a process or implementation, but rather describes what a
process validates.

PRECISE: Within the related organization, the rule must have only one interpretation
among all interested people, and its meaning must be clear.

ATOMIC: A business rule makes one statement, not several. No part of the rule can
stand on its own as a rule (the rule is indivisible).

CONSISTENT: A business rule must be internally consistent, that is, it must not contain
conflicting statements.

EXPRESSIBLE: A business rule must be able to be stated in natural language, but it will be
stated in a structured natural language so that there is no misinterpretation.

DISTINCT: Business rules are not redundant, but a business rule may refer to other rules.

BUSINESS ORIENTED: A business rule is stated in terms business people can understand, and
since it is a statement of business policy, only business people can modify or invalidate a rule.
Thus a business rule is owned by the business.

DATA NAMES AND DATA DEFINITIONS

Data objects must be named and defined before they can be used unambiguously.

DATA NAMES
RELATE TO BUSINESS, NOT TECHNICAL (HARDWARE OR SOFTWARE) CHARACTERISTICS:
Customer is a good name, but File10, Bit7, and PayrollReportSortKey are not good names.

BE MEANINGFUL: The data name should be meaningful for documentation purposes, so avoid
using generic words like "has", "is", "person", and "it".

BE UNIQUE: The name must differ from the name used for every other distinct data object.
Words that distinguish the object from similar objects should be included in the data
name. Ex: Home address, campus address

READABLE: The name is structured as the concept would most naturally be said.
Ex: Grade point average is a good name
Average grade point relative to A is an awkward name

COMPOSED OF WORDS TAKEN FROM AN APPROVED LIST:
Each organization chooses a vocabulary of significant words from which data names are built.
Ex: always maximum, never upper limit, ceiling, or highest
Alternative or alias names can also be included in the complete set of database
documentation. Words in the vocabulary may also have approved abbreviations.
Ex: CUST for customer

REPEATABLE:
Different people, or the same person at different times, should develop exactly or almost
the same name. This means there is a standard hierarchy for data names.
Ex: Birth date of a student would be StudentBirthDate
Birth date of an employee would be EmployeeBirthDate

DATA DEFINITIONS: A definition is considered a type of business rule. A definition is an
explanation of a term or a fact. A term is a word or phrase that has a specific meaning for the
business.
Ex: course, section, rental car, flight reservation, and passenger

Terms are often the keywords used to form data names.

FACT: An association between two or more terms

Ex: A course is a module of instruction in a particular subject area.
The sentence associates two terms: 1. Module of instruction, 2. Subject area

Ex: A customer may request a model of car from a rental branch on a particular date.
This fact, which defines a model rental request, associates four terms: customer, model of car,
rental branch, and date.

THE ER MODEL

ENTITY RELATIONSHIP MODEL (E-R MODEL): A logical representation of the data for an
organization or for a business area. The E-R model is expressed in terms of entities in the
business environment, the relationships among those entities, and their attributes.

ENTITY RELATIONSHIP DIAGRAM: A graphical representation of an E-R model. An E-R
model is normally expressed as an entity relationship diagram.

Entities are represented by rectangles.
Relationships between entities are represented by diamond symbols connected by lines to the
related entities.

CUSTOMER: A person or organization who has ordered or might order products
EX: VRS & YRN COLLEGE

PRODUCT: A type of furniture made by Pine Valley Furniture, which may be ordered by
customers

ORDER: The transaction associated with the sale of one or more products to a customer and
identified by a transaction number from sales or accounting

ITEM: A type of component that goes into making one or more products and can be supplied by
one or more suppliers
Ex: ball bearing

SUPPLIER: Another company that may provide items to Pine Valley Furniture

SHIPMENT: The transaction associated with items received in the same package by Pine Valley
Furniture from a supplier

A SUPPLIER may supply many ITEMS (by "may supply" we mean that the supplier may not supply
any items). Each ITEM is supplied by any number of SUPPLIERS (by "is supplied" we mean
that the item must be supplied by at least one supplier).

Each ITEM must be used in the assembly of at least one PRODUCT and may be used in many
products. Conversely, each PRODUCT must use one or more ITEMS.

A SUPPLIER may send many SHIPMENTS. On the other hand, each SHIPMENT must be sent by
exactly one SUPPLIER. A supplier may be able to supply an item, but may not yet have
sent any shipments of that item.

A SHIPMENT must include one or more ITEMS. An ITEM may be included on several
SHIPMENTS.

A CUSTOMER may submit any number of ORDERS. However, each ORDER must be submitted by
exactly one CUSTOMER.

An ORDER must request one (or more) PRODUCTS. A given PRODUCT may not be requested
on any order, or may be requested on one or more orders.
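
One of these rules, "each shipment must be sent by exactly one supplier, while a supplier may send many shipments or none", can be sketched as a mandatory foreign key using Python's sqlite3 module. The table and column names and the sample suppliers are assumptions, and the mapping from E-R rules to tables is only one possible design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- Exactly one supplier per shipment: the foreign key may not be NULL.
CREATE TABLE shipment (
    shipment_id INTEGER PRIMARY KEY,
    supplier_id INTEGER NOT NULL REFERENCES supplier(supplier_id)
);
INSERT INTO supplier VALUES (1, 'Acme Bearings'), (2, 'Idle Supplier');
INSERT INTO shipment VALUES (10, 1), (11, 1);
""")


def try_insert(sql):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


no_supplier = try_insert("INSERT INTO shipment VALUES (12, NULL)")
unknown_supplier = try_insert("INSERT INTO shipment VALUES (13, 99)")

# A supplier may have many shipments, or none at all.
shipments_per_supplier = conn.execute(
    "SELECT s.supplier_id, COUNT(sh.shipment_id) FROM supplier s "
    "LEFT JOIN shipment sh ON sh.supplier_id = s.supplier_id "
    "GROUP BY s.supplier_id ORDER BY s.supplier_id"
).fetchall()
print(no_supplier, unknown_supplier, shipments_per_supplier)
```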

ENTITIES: An entity is a person, place, object, event, or concept in the user environment
about which the organization wishes to maintain data.

Ex:
Person  : EMPLOYEE, STUDENT, PATIENT
Place   : STORE, WAREHOUSE, STATE
Object  : MACHINE, BUILDING, AUTOMOBILE
Event   : SALE, REGISTRATION, RENEWAL
Concept : ACCOUNT, COURSE, WORK CENTER

ENTITY TYPE: A collection of entities that share common properties or characteristics

We use capital letters for the names of entity types. In an E-R diagram the entity name is placed
inside the box representing the entity type.

ENTITY INSTANCE: A single occurrence of an entity type

Entity type: EMPLOYEE

Attributes and two instances of EMPLOYEE:

Attribute   Type       Instance 1      Instance 2
Empno       char(10)   642.12          534-34
Name        char(25)   q               p
Address     char(30)   100 pacific     450 red wood
City        char(10)   san francisco   redwood city

STRONG ENTITY TYPE: An entity type that exists independently of other entity types
Ex: employee, student, automobile, and course

WEAK ENTITY TYPE: An entity type whose existence depends on some other entity type
Ex: Class
A class cannot be uniquely identified without a course number.
Weak entity types are indicated by a double-lined rectangle.

IDENTIFYING OWNER: The entity type on which the weak entity type depends

Identifying owner => course
The class does not exist without the course.

IDENTIFYING RELATIONSHIP: The relationship between a weak entity type and its owner

Identifying relationship => instance-of
The relationship is indicated by a double-lined diamond symbol.
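
The dependence of the weak entity type on its identifying owner can be sketched as a table whose key includes the owner's key. The example uses Python's sqlite3 module; the names class_section, course_no, and section_no and the cascade-on-delete choice are assumptions, not part of the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for the cascade below
conn.executescript("""
CREATE TABLE course (course_no TEXT PRIMARY KEY, title TEXT);
-- Weak entity: identified only together with its owner's key.
CREATE TABLE class_section (
    course_no  TEXT NOT NULL REFERENCES course(course_no) ON DELETE CASCADE,
    section_no INTEGER NOT NULL,
    PRIMARY KEY (course_no, section_no)
);
INSERT INTO course VALUES ('DB101', 'MDBMS');
INSERT INTO class_section VALUES ('DB101', 1), ('DB101', 2);
""")

before = conn.execute("SELECT COUNT(*) FROM class_section").fetchone()[0]
conn.execute("DELETE FROM course WHERE course_no = 'DB101'")
after = conn.execute("SELECT COUNT(*) FROM class_section").fetchone()[0]
print(before, after)
```

Removing the course removes its classes as well, which mirrors the statement that the class does not exist without the course.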

NAMING AND DEFINING ENTITY TYPES:

1. SINGULAR NOUN: An entity type name is a singular noun (such as customer, student, or
automobile).

2. SPECIFIC TO THE ORGANIZATION: An entity type name should be specific to the
organization. One organization may use the entity type name CUSTOMER and another
organization may use the term CLIENT. The name should be descriptive for the organization
and distinct from all other entity type names within that organization.

3. CONCISE: An entity type name should be concise, using as few words as possible. For
example, in a university database the entity type name REGISTRATION is sufficient for the
event of a student registering for a class; STUDENT REGISTRATION FOR CLASS, although
precise, is too wordy.

4. ABBREVIATION OR SHORT NAME: An abbreviation or short name should be specified
for each entity type name, and the abbreviation may be sufficient to use in the E-R diagram.

5. EVENT ENTITY TYPES: Event entity types should be named for the result of the event. The
event of a project manager assigning an employee to work on a project results in an
ASSIGNMENT.

ATTRIBUTES: An attribute is a property or characteristic of an entity that is of interest to the
organization.

STUDENT: Student_Id, Student_Name, Home_Address
AUTOMOBILE: Vehicle_Id, Color, Weight, Horsepower
EMPLOYEE: Employee_Id, Employee_Name

In naming attributes we use an initial capital letter followed by lower case letters. If an attribute
name consists of two words, we use an underscore character to connect the words and we start
each word with a capital letter, for example Employee_Name.
In an E-R diagram we represent an attribute by placing its name in an ellipse with a line
connecting it to its associated entity.

Entity 1

Student_Id = 455
Student_Name = Smith
Home_Address = 452 walnut street
Phone = 303-839
Major = DBMS

Entity 2

Student_Id = 555
Student_Name = Thoms
Home_Address = 944 maple street
Phone = 631-391
Major = VCPP

COMPOSITE ATTRIBUTE: An attribute that can be broken down into component parts
Ex: Address
It can be broken down into Street_address, City, State, and Postal_code.

SIMPLE ATTRIBUTE: An attribute that cannot be broken down into smaller components
Ex: Color, Weight

SINGLE-VALUED ATTRIBUTE : An attribute that holds a single-value for a single entity.

Ex: Customer 1, Branch 11

MULTI-VALUED ATTRIBUTE: An attribute that may take on more than one value
for a given entity instance.

Ex: Tel_No: 234-5678 and 456-7839


Rollno and regno

DERIVED ATTRIBUTE: An attribute whose values can be calculated from related attribute values.

For ex: The EMPLOYEE entity has a Date_Employed attribute. If users need to know how many years
a person has been employed, that value can be calculated using Date_Employed and today’s date.
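
As a minimal sketch of this derivation, the function below computes whole years of service. The function name years_employed and the sample dates are assumptions chosen for illustration.

```python
from datetime import date


def years_employed(date_employed: date, today: date) -> int:
    """Derive whole years of service from Date_Employed and today's date."""
    years = today.year - date_employed.year
    # Subtract one year if this year's anniversary has not been reached yet.
    if (today.month, today.day) < (date_employed.month, date_employed.day):
        years -= 1
    return years


# Hypothetical dates: one day before the ninth anniversary.
print(years_employed(date(2015, 6, 1), date(2024, 5, 31)))
```

Because the value is computed on demand, it never has to be stored and cannot become out of date.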

IDENTIFIER: An attribute (or combination of attributes) that uniquely identifies individual
instances of an entity type.

Entity type Identifier

Student Student_id
Automobile Vehicle_Id

Student_Name is not an identifier because many students may have the same name or may change
their names. Student_Id is the identifier, and it is underlined in the E-R diagram.

Fig: STUDENT entity type with attributes Student_Id (underlined), Student_Name and Other_Attributes

COMPOSITE IDENTIFIER: An identifier that consists of a composite attribute.

Entity type: FLIGHT
Composite identifier: Flight_Id

Flight_Id in turn has component attributes Flight_number and Date. This combination is required
to uniquely identify individual occurrences of Flight.
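
One way to picture a composite identifier is as a two-part key. The sketch below is an assumption-laden illustration: the dictionary flights, the function add_flight, the flight number "AA100" and the gate attribute are all hypothetical.

```python
from datetime import date

# Maps the composite identifier (Flight_number, Date) to the other attributes.
flights = {}


def add_flight(flight_number: str, flight_date: date, attributes: dict) -> None:
    key = (flight_number, flight_date)  # plays the role of Flight_Id
    if key in flights:
        # The combination must be unique across all FLIGHT instances.
        raise ValueError(f"duplicate Flight_Id: {key}")
    flights[key] = attributes


add_flight("AA100", date(2024, 1, 5), {"gate": "B4"})
# Same flight number on another date is a different occurrence of FLIGHT.
add_flight("AA100", date(2024, 1, 6), {"gate": "C2"})
print(len(flights))
```

Neither component is unique on its own; only the pair identifies an occurrence.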

Naming and Defining Attributes:

1. NOUN: An attribute name is a noun (such as Customer_Id, Age, Product_Minimum_Price or
Major).

2. Unique: An attribute name should be unique. No two attributes of the same entity type may
have the same name. For clarity, it is desirable that no two attributes across all entity types
have the same name.

3. SHOULD FOLLOW A SPECIFIC FORMAT:

A common format is [Entity type name { [Qualifier] } ] Class

where […] indicates an optional clause and
{…} indicates that the clause may repeat.

Ex: Cust_Id

CLASS: class is a phrase from list of phrases defined by the organization that are permissible
characteristics of entities

Ex: Name Nm
Identifier ID
Date Dt /*entities*/
Amount Amt

Qualifier: A qualifier is a phrase from a list of phrases defined by the organization.

Ex: Maximum Max


Hourly Hrly /*attributes*/
State St

A relationship type: A relationship type is a meaningful association between (or among) entity
types.
Relationship instance: An association between (or among) entity instances, where each
relationship instance includes exactly one entity from each participating entity type.

Attributes on relationships: Attributes may be associated with a many-to-many (or one-to-one)
relationship as well as with an entity. For ex, suppose an organization wishes to record the date
(month and year) when an employee completes each course. This date is a property of the
relationship between the employee and the course, not of either entity alone.

Relationship instances:

Employee Course

Chen C++

Melton Java

Ritche COBOL

Celko Basic

Gosling SQL

Perl

Fig 3.10 page 96

Associative entity: An entity type that associates the instances of one or more entity types and
contains attributes that are peculiar to the relationship between those entity instances.
The associative entity is represented with the diamond relationship symbol enclosed within the
entity box. The purpose of this symbol is to preserve the information that the entity was initially
specified as a relationship on the E-R diagram.

Fig: EMPLOYEE and COURSE connected through the associative entity CERTIFICATE (relationships A and B)

Degree of a relationship: The degree of a relationship is the number of entity types that
participate in that relationship.

The three most common relationship degrees in E-R models are unary (degree 1), binary
(degree 2) and ternary (degree 3).

Unary Relationship: A relationship between the instances of a single entity type. Unary
relationships are also called recursive relationships. In the first example, Is_married_to is shown
as a one-to-one relationship between instances of the PERSON entity type.

In the second example, Manages is shown as a one-to-many relationship between instances of
the EMPLOYEE entity type. Using this relationship we could identify the employees who report to a
particular manager.

Binary Relationship: A relationship between the instances of two entity types. It is the most
common type of relationship encountered in data modeling.

Ex1: (One to one) indicates that an employee is assigned one parking place and each parking
place is assigned to only one employee.

Ex2: (One to many) indicates that a product line may contain several products and each product
belongs to only one product line.

Ex3: (Many to many) indicates that a student may register for more than one course and that each
course may have many student registrants.

Ternary relationship: A ternary relationship is a simultaneous relationship among the instances
of three entity types. In this example vendors can supply various parts to warehouses. The
relationship Supplies is used to record the specific parts that are supplied by a given vendor to a
particular warehouse.

Thus there are three entity types: VENDOR, PART and WAREHOUSE.
There are two attributes on the relationship Supplies: Shipping_Mode and Unit_Cost.
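
A ternary relationship with its own attributes can be sketched as a table keyed by all three participants. The vendor, part and warehouse names and the cost figures below are invented for illustration only.

```python
from typing import Dict, NamedTuple, Tuple


class Supplies(NamedTuple):
    # Attributes that belong to the relationship, not to any single entity.
    shipping_mode: str
    unit_cost: float


# One relationship instance per (vendor, part, warehouse) combination.
# All names and values here are hypothetical sample data.
supplies: Dict[Tuple[str, str, str], Supplies] = {
    ("Vendor A", "Part 10", "Warehouse X"): Supplies("Air", 5.00),
    ("Vendor A", "Part 10", "Warehouse Y"): Supplies("Ground", 4.50),
    ("Vendor B", "Part 10", "Warehouse X"): Supplies("Ground", 4.75),
}


def unit_cost(vendor: str, part: str, warehouse: str) -> float:
    """Look up the cost of a part supplied by one vendor to one warehouse."""
    return supplies[(vendor, part, warehouse)].unit_cost


print(unit_cost("Vendor A", "Part 10", "Warehouse Y"))
```

The sample data shows why three binary relationships would not be enough: the same vendor and part have a different Unit_Cost depending on the warehouse.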

CARDINALITY CONSTRAINTS: A cardinality constraint specifies the number of instances of
one entity that can (or must) be associated with each instance of another entity.

For ex, consider a video store that rents videotapes of movies. Since the store may
stock more than one videotape for each movie, this is a one-to-many relationship between MOVIE
and VIDEOTAPE. However, the store may not have any tapes of a given movie in stock at a
particular time.

MINIMUM CARDINALITY: The minimum number of instances of one entity that may be
associated with each instance of another entity.

In our videotape example the minimum number of videotapes for a movie is zero. When the
minimum number of participants is zero, we say that the entity type B (here VIDEOTAPE) is an
optional participant in the relationship.

MAXIMUM CARDINALITY: The maximum cardinality of a relationship is the maximum
number of instances of one entity that may be associated with each instance of another entity.

In our videotape example the maximum cardinality for the VIDEOTAPE entity is
“many”, that is, an unspecified number greater than one. This is indicated by the “crow’s foot”
symbol on the line next to the VIDEOTAPE entity.

A relationship is of course bi-directional

MANDATORY ONE: When the minimum and maximum cardinality are both one, this is called
mandatory one cardinality. In our example each videotape must be a copy of exactly one movie.
If the minimum cardinality is zero, participation is optional; if it is one, participation is
mandatory.

Ex of mandatory cardinality: Fig 3.17
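
Minimum and maximum cardinality can be checked mechanically once the number of associated instances is known. The sketch below assumes a hypothetical helper check_cardinality and invented sample counts; None is used here to stand for "many".

```python
from typing import Dict, List, Optional


def check_cardinality(counts: Dict[str, int], minimum: int,
                      maximum: Optional[int]) -> List[str]:
    """Return the instances whose number of associated instances is out of range.

    maximum=None stands for "many" (no upper bound).
    """
    violations = []
    for instance, count in counts.items():
        if count < minimum or (maximum is not None and count > maximum):
            violations.append(instance)
    return violations


# Number of VIDEOTAPE instances stocked for each MOVIE (optional, many).
tapes_per_movie = {"Movie 1": 3, "Movie 2": 0}
# Number of MOVIE instances each VIDEOTAPE is a copy of (mandatory one).
# "Tape 3" is deliberately invalid sample data.
movies_per_tape = {"Tape 1": 1, "Tape 2": 1, "Tape 3": 0}

print(check_cardinality(tapes_per_movie, minimum=0, maximum=None))
print(check_cardinality(movies_per_tape, minimum=1, maximum=1))
```

A movie with zero tapes passes because participation is optional on that side, while a tape with no movie violates the mandatory one rule.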

Naming and Defining Relationships:

1. Verb Phrase: A relationship name is a verb phrase (such as Assigned_to, Supplies or
Teaches). Relationships represent actions being taken, usually in the present tense.

A relationship name

States the action                          Not the result of action

Employee is Assigned_to a project          Employee is assigning a project

2. Avoid vague names: You should avoid vague names such as Has or Is_related_to. Use
descriptive verb phrases found in the definition of the relationship.

(Vague means not clearly expressed.)


3. A relationship definition explains what action is being taken and possibly why it is important.
It may be important to state who or what does the action, but it is not important to explain
how the action is taken.

4. It may also be important to give examples to clarify the action. For ex, for a relationship
Registered_for between STUDENT and COURSE, it may be useful to explain that this covers both
on-site and online registration and includes registrations made during the drop/add period.

5. Optional participation: The definition should explain any optional participation. You should
explain what conditions lead to zero associated instances.

Subtype: A subgrouping of the entities in an entity type that is meaningful to the organization and
that shares common attributes or relationships distinct from other subgroupings.
Ex: graduate student, undergraduate student
Supertype: A generic entity type that has a relationship with one or more subtypes.
Ex: student

Pg. 129 fig 4.1

Hourly employees: Employee_Number, Employee_Name, Address, Date_Hired, Hourly_Rate

Salaried employees: Employee_Number, Employee_Name, Address, Date_Hired, Annual_Salary

Contract consultants: Employee_Number, Employee_Name

Fig 4.2 page no 130

Attribute inheritance: Attribute inheritance is the property by which subtype entities inherit the
values of all attributes of the supertype. This property makes it unnecessary to include supertype
attributes redundantly with the subtypes.

For ex, Employee_Name is an attribute of EMPLOYEE but is not listed again on the subtypes of
EMPLOYEE, which inherit it.

When to use supertype/subtype relationships:

Whether or not to use supertype/subtype relationships is a decision that the data modeler
must make in each situation. You should consider using subtypes when either (or both) of the
following conditions is present:

1. There are attributes that apply to some (but not all) of the instances of an entity type.
2. The instances of a subtype participate in a relationship unique to that subtype.

Fig 4.3 page 131

The hospital entity type PATIENT has two subtypes, OUTPATIENT and RESIDENT
PATIENT. The primary key is Patient_Id. All patients have an Admit_Date attribute as well as
a Patient_Name. Every patient is cared for by a responsible physician who develops a treatment
plan for the patient.

Each subtype has an attribute that is unique to that subtype. Outpatients have a Checkback_Date,
while resident patients have a Date_Discharged. Resident patients also have a unique relationship
that assigns each patient to a bed. Each bed may or may not be assigned to a patient.

According to attribute inheritance, each outpatient and resident patient inherits the attributes of
the parent supertype PATIENT: Patient_Id, Patient_Name and Admit_Date.
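
Attribute inheritance in the PATIENT example maps naturally onto class inheritance. The following is a sketch only: the Python class names mirror the entity types, and the sample patient (id 101, name "Smith", admit date) is hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Patient:                      # supertype
    patient_id: int
    patient_name: str
    admit_date: date


@dataclass
class Outpatient(Patient):          # subtype: declares only its own attribute
    checkback_date: Optional[date] = None


@dataclass
class ResidentPatient(Patient):     # subtype: declares only its own attribute
    date_discharged: Optional[date] = None


# Hypothetical instance: supertype attributes are available on the subtype.
resident = ResidentPatient(101, "Smith", date(2024, 3, 1))
print(resident.patient_name, resident.date_discharged)
```

Patient_Id, Patient_Name and Admit_Date are declared once on the supertype and are not repeated in either subtype.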

REPRESENTING GENERALIZATION AND SPECIALIZATION:

Generalization: The process of defining a more general entity type from a set of more specialized
entity types. Thus generalization is a bottom-up process.

Fig 4.4 page 133

In the above example three entity types have been defined: CAR, TRUCK and MOTORCYCLE.
We observe that the three entity types have a number of attributes in common:
Vehicle_Id (identifier), Vehicle_Name (with components Make and Model), Price and
Engine_Displacement. This fact suggests that each of the three entity types is really a version of
a more general entity type.

This more general entity type is named VEHICLE. The entity CAR has the specific attribute
No_of_Passengers, while TRUCK has two specific attributes, Capacity and Cab_Type. Thus
generalization has allowed us to group entity types along with their common attributes.
The entity MOTORCYCLE is not included in the relationship because it does not satisfy the
conditions for a subtype: all attributes of MOTORCYCLE are common to all vehicles, and there are
no attributes specific to motorcycles. Further, MOTORCYCLE does not have a relationship to
another entity type. Thus there is no need to create a MOTORCYCLE subtype.

Specialization:
The process of defining one or more subtypes of the supertype and forming supertype/subtype
relationships. Specialization is a top-down process, the direct reverse of generalization.
Fig 4.5 page 134

Fig 4.5a shows an entity type named PART together with several of its attributes. The identifier
is Part_No, and other attributes include Description, Unit_Price, Location, Qty_on_Hand,
Routing_Number and Supplier (the last attribute is multivalued, since there may be more
than one supplier, with an associated unit price, for a part).
Here some parts are manufactured internally while others are purchased from outside suppliers.
Thus Routing_Number applies only to manufactured parts, while Supplier_Id and
Unit_Price apply only to purchased parts. This suggests specializing PART into the subtypes
MANUFACTURED PART and PURCHASED PART.

SPECIFYING CONSTRAINTS IN SUPERTYPE/SUBTYPE RELATIONSHIPS:

Specifying completeness constraints:

A completeness constraint addresses the question of whether an instance of a supertype must
also be a member of at least one subtype. The completeness constraint has two possible rules:
1. Total specialization 2. Partial specialization

Total Specialization Rule:

The total specialization rule specifies that each entity instance of the supertype must be a
member of some subtype in the relationship.
Fig 4.6a pg 136
In this example the business rule is the following: a patient must be either an outpatient
or a resident patient (there are no other types of patient in this hospital). Total specialization is
indicated by the double line extending from the PATIENT entity type to the circle.

In this example, every time a new instance of PATIENT is inserted into the supertype, a
corresponding instance is inserted into either OUTPATIENT or RESIDENT PATIENT. If the
instance is inserted into RESIDENT PATIENT, an instance of the relationship Is_assigned is
created to assign the patient to a hospital bed.

Partial Specialization Rule:

The partial specialization rule specifies that an entity instance of the supertype is allowed
not to belong to any subtype.

Fig 4.6b pg 136

A motorcycle is a type of vehicle, but it is not represented as a subtype in this model. Thus if a
vehicle is a car it must appear as an instance of CAR, and if it is a truck it must appear as an
instance of TRUCK. However, if the vehicle is a motorcycle it cannot appear as an instance of any
subtype. This is an example of partial specialization, and it is specified by the single line from
the VEHICLE supertype to the circle.

SPECIFYING DISJOINTNESS CONSTRAINTS:

A disjointness constraint addresses the question of whether an instance of a supertype may
simultaneously be a member of two or more subtypes. The disjointness constraint has two
possible rules:
1. The disjoint rule 2. The overlap rule

The disjoint rule:

The disjoint rule specifies that if an entity instance is a member of one subtype, it cannot
simultaneously be a member of any other subtype.
Fig 4.7a page 138

The business rule in this case is the following: at any given time a patient must be either an
outpatient or a resident patient, but cannot be both. This is the disjoint rule, as specified by the
letter ‘d’ in the circle joining the supertype and its subtypes.
Note: The subtype of a PATIENT may change over time, but at a given time a PATIENT is of only
one type.

Overlap Rule:
The overlap rule specifies that an entity instance can simultaneously be a member of
two or more subtypes.
Fig 4.7b pg 138

In this example an instance of PART is a particular part number, that is, a type of part and not an
individual part. This is indicated by the identifier, which is Part_No. For ex, consider part number
4000: some units of this part may be manufactured while others are purchased.
The overlap rule is specified by placing the letter ‘o’ in the circle. Thus any part must be either
a purchased part or a manufactured part, or it may simultaneously be both of these.
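
The completeness and disjointness rules can be combined into one small check on the set of subtypes an instance belongs to. This is a sketch under stated assumptions: the function check_subtypes is hypothetical, and treating VEHICLE as disjoint is an assumption made only for the second call.

```python
from typing import Set


def check_subtypes(memberships: Set[str], total: bool, disjoint: bool) -> bool:
    """Check one supertype instance against both constraints.

    memberships: names of the subtypes the instance belongs to.
    total:       True for total specialization, False for partial.
    disjoint:    True for the disjoint rule, False for the overlap rule.
    """
    if total and len(memberships) == 0:
        return False   # total specialization: at least one subtype required
    if disjoint and len(memberships) > 1:
        return False   # disjoint rule: at most one subtype allowed
    return True


# PATIENT: total specialization with the disjoint rule.
print(check_subtypes({"OUTPATIENT"}, total=True, disjoint=True))
# VEHICLE: partial specialization, so a motorcycle belongs to no subtype
# (disjointness is assumed here for the sketch).
print(check_subtypes(set(), total=False, disjoint=True))
# PART: overlap rule, so a part may be both manufactured and purchased.
print(check_subtypes({"MANUFACTURED PART", "PURCHASED PART"},
                     total=True, disjoint=False))
```

The two constraints are independent, which is why they are passed as separate flags.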

DEFINING SUBTYPE DISCRIMINATORS:

A subtype discriminator is an attribute of the supertype whose values determine the
target subtype or subtypes.
Disjoint Subtypes:
Fig 4.8 page 139
This example is for the EMPLOYEE supertype and its subtypes. Each employee must be either
hourly, salaried or a consultant.
A new attribute (Employee_Type) has been added to the supertype to serve as the subtype
discriminator. When a new employee is added to the supertype, this attribute is coded with one
of three values as follows:
“H” (for hourly), “S” (for salaried) or “C” (for consultant). Depending on this code, the instance
is then assigned to the appropriate subtype.

Thus, for example, the condition Employee_Type = “S” causes an entity instance to be inserted
into the SALARIED EMPLOYEE subtype.
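
For disjoint subtypes the discriminator behaves like a simple lookup from code to subtype. In this sketch the names SUBTYPE_BY_CODE and target_subtype are hypothetical, and the subtype labels are assumed from the codes described above.

```python
# Each discriminator code selects exactly one subtype (disjoint rule).
SUBTYPE_BY_CODE = {
    "H": "HOURLY EMPLOYEE",
    "S": "SALARIED EMPLOYEE",
    "C": "CONSULTANT",
}


def target_subtype(employee_type: str) -> str:
    """Return the subtype selected by the Employee_Type discriminator."""
    try:
        return SUBTYPE_BY_CODE[employee_type]
    except KeyError:
        raise ValueError(f"unknown Employee_Type: {employee_type!r}") from None


print(target_subtype("S"))
```

Because the discriminator is a single-valued attribute, an employee can never be routed to two subtypes at once.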

OVERLAPPING SUBTYPES:

When subtypes overlap, a slightly modified approach must be applied for the subtype
discriminator. The reason is that a given instance of the supertype may require that we create an
instance in more than one subtype.
Fig 4.9 page 140
A new attribute named Part_Type has been added to PART. Part_Type is a composite
attribute with components Manufactured? and Purchased?. Each of these components is a boolean
variable that takes only the values yes (“Y”) and no (“N”). When a new instance is added to PART,
these components are coded as follows:

Type of part                  Manufactured?   Purchased?

Manufactured only             “Y”             “N”
Purchased only                “N”             “Y”
Manufactured and purchased    “Y”             “Y”
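
With overlapping subtypes the discriminator must be able to select more than one subtype. The sketch below uses a hypothetical function target_subtypes; rejecting the “N”/“N” combination is an assumption based on the rule that every part is purchased, manufactured or both.

```python
from typing import List


def target_subtypes(manufactured: str, purchased: str) -> List[str]:
    """Return every subtype selected by the two Part_Type components."""
    subtypes = []
    if manufactured == "Y":
        subtypes.append("MANUFACTURED PART")
    if purchased == "Y":
        subtypes.append("PURCHASED PART")
    if not subtypes:
        # Assumption: every part belongs to at least one subtype.
        raise ValueError("a part must be manufactured, purchased or both")
    return subtypes


print(target_subtypes("Y", "N"))
print(target_subtypes("Y", "Y"))
```

One boolean component per subtype is what allows any combination of subtypes to be expressed.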

Defining supertype/subtype hierarchies:

A supertype/subtype hierarchy is a hierarchical arrangement of supertypes and subtypes,
where each subtype has only one supertype.
Ex: Suppose you are asked to model the human resources in a university. Using specialization,
you might proceed as follows.
Starting at the top of the hierarchy, model the most general entity type first. In this case it is
PERSON; the attributes shown in fig 4.10 page 141 are SSN (identifier), Name, Address, Gender and
Date_of_Birth. The entity type at the top of a hierarchy is sometimes called the root.
Next, define all major subtypes of the root. In this example there are three subtypes of
PERSON: EMPLOYEE (persons who work for the university), STUDENT (persons who attend
classes) and ALUMNUS (persons who have graduated).
Assuming there are no other types of persons of interest to the university, the total specialization
rule applies, as shown in the fig. A person might belong to more than one subtype (for ex
ALUMNUS and EMPLOYEE), so the overlap rule is used.
Note: Overlap allows for any combination (a person may simultaneously be in any pair or in all
three subtypes). If certain combinations are not allowed, then a more refined supertype/subtype
hierarchy would have to be developed to eliminate the prohibited combinations.
Page 141 fig 4.10

EER modelling diagram pine valley furniture pg 143

ENTITY CLUSTERING:
An entity cluster is a set of one or more entity types and associated relationships grouped
into a single abstract entity type.

Pg 146 fig 4.13

SELLING UNIT: represents the SALESPERSON and SALES TERRITORY entity types and the
serves relationship

CUSTOMER: represents the CUSTOMER entity supertype, its subtypes and the relationship
between supertype and subtypes

ITEM SALE: represents the ORDER entity type and ORDER LINE associative entity as well as
the relationship between them.

ITEM: represents the PRODUCT LINE and PRODUCT entity types and the includes relationship

MANUFACTURING: represents the WORK CENTER entity type and the EMPLOYEE supertype entity and
its subtypes, as well as the works in and supervises relationships and the relationship between the
supertype and its subtypes.

MATERIAL: represents the RAW MATERIAL and VENDOR entity types, the SUPPLIER
subtype, the supplies relationship, and the supertype/subtype relationship between VENDOR and
SUPPLIER.

FIG 4.13 PG 146,147