Unit I DBMS

Unit-I: Database Management Systems
What is a Database?
A database is any collection of data.
A DBMS is a software system
designed to maintain a database.
We use a DBMS when
there is a large amount of data
security and integrity of the data are
important
many users access the data
concurrently
Example Database Application
Consider a phone company, such as AT&T
Kinds of information they deal with:
customer records
employee records
billing information
management records
switching and wiring diagrams
customer service orders
Concerns of a Database User
With all that data, AT&T must be concerned with questions such as:
Where is the information kept?
How is the data structured?
How is the data kept consistent?
How is the data described?
How is the data kept secure?
How do different pieces of data interrelate?
Why Use a DBMS?

Without a DBMS, we'd have:
users of the data
access by a collection of ad hoc programs in C++, Java, PHP, etc.
data stored as bits on disks, organized as files
There is no control or coordination of what these programs do with the data.
Why Use a DBMS?

With a DBMS, we have:
users of the data
applications
DBMS
data stored as bits on disks, organized as files
DBMS provides control and coordination to protect the data.
Levels of Abstraction
Views describe how users see the data.
Conceptual schema defines logical structure.
Physical schema describes the files and indexes used.
(sometimes called the ANSI/SPARC model)
[Figure: Users see View 1, View 2, and View 3, which sit above the Conceptual Schema, the Physical Schema, and the DB]
Example: University Database
Conceptual schema:
Students(sid: string, name: string, login: string, age: integer, gpa: real)
Courses(cid: string, cname: string, credits: integer)
Enrolled(sid: string, cid: string, grade: string)
External Schema (View):
Course_info(cid: string, enrollment: integer)
Physical schema:
Relations stored as unordered files.
Index on first column of Students.
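The three levels of the university example can be tried out with a small sketch. This one uses Python's built-in sqlite3 module; the index name idx_students_sid, the SQLite column types, and the sample rows are assumptions made for illustration, not part of the slides.

```python
import sqlite3

# In-memory database; table and column names follow the university example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conceptual schema
    CREATE TABLE Students (sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL);
    CREATE TABLE Courses  (cid TEXT, cname TEXT, credits INTEGER);
    CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);

    -- Physical schema: an index on the first column of Students (name is hypothetical)
    CREATE INDEX idx_students_sid ON Students (sid);

    -- External schema: a view exposing only per-course enrollment counts
    CREATE VIEW Course_info AS
        SELECT cid, COUNT(*) AS enrollment
        FROM Enrolled
        GROUP BY cid;
""")

# Sample rows invented for illustration.
conn.executemany("INSERT INTO Enrolled VALUES (?, ?, ?)",
                 [("s1", "c1", "A"), ("s2", "c1", "B"), ("s1", "c2", "A")])

course_info = conn.execute(
    "SELECT cid, enrollment FROM Course_info ORDER BY cid").fetchall()
print(course_info)
```

A user of Course_info sees enrollment counts only; student ids and grades stay hidden behind the view.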
Data Independence
Applications insulated from how data is structured and stored.
Logical data independence: Protection from changes in logical structure of data.
Physical data independence: Protection from changes in physical structure of data.
[Figure: View 1, View 2, and View 3 above the Conceptual Schema, the Physical Schema, and the DB]
Queries, Query Plans, and Operators
SELECT eid, ename, title
FROM Emp E
WHERE E.sal > $50K

SELECT E.loc, AVG(E.sal)
FROM Emp E
GROUP BY E.loc
HAVING Count(*) > 5

SELECT COUNT DISTINCT (E.eid)
FROM Emp E, Proj P, Asgn A
WHERE E.eid = A.eid
AND P.pid = A.pid
AND E.loc <> P.loc

[Figure: query plan trees for the three queries, built from the operators Select, Group(agg), Having, Join, and Count distinct over the tables Emp (Employees), Proj (Projects), and Asgn (Assignments)]
System handles query plan generation & optimization; ensures correct execution.
Issues: view reconciliation, operator ordering, physical operator choice, memory management, access path (index) use, ...
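As a rough illustration of the system choosing a plan for a declarative query, the sketch below runs the three queries in SQLite through Python's sqlite3 module. The column types, the sample rows, reading $50K as 50000, and lowering the HAVING threshold to 1 (so the tiny sample returns a row) are all assumptions. The plan text SQLite prints differs between versions, so it is printed rather than relied upon.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Emp  (eid INTEGER, ename TEXT, title TEXT, sal REAL, loc TEXT);
    CREATE TABLE Proj (pid INTEGER, loc TEXT);
    CREATE TABLE Asgn (eid INTEGER, pid INTEGER);
""")
# Sample data invented for illustration.
conn.executemany("INSERT INTO Emp VALUES (?, ?, ?, ?, ?)", [
    (1, "Ann", "Engineer", 60000, "NYC"),
    (2, "Bob", "Analyst", 40000, "NYC"),
    (3, "Cy", "Manager", 90000, "SF"),
])
conn.executemany("INSERT INTO Proj VALUES (?, ?)", [(10, "NYC"), (20, "SF")])
conn.executemany("INSERT INTO Asgn VALUES (?, ?)", [(1, 10), (1, 20), (2, 10), (3, 10)])

# Selection: employees earning more than 50000.
high_paid = conn.execute(
    "SELECT eid, ename, title FROM Emp E WHERE E.sal > 50000 ORDER BY eid").fetchall()

# Grouping with HAVING (threshold lowered from 5 to 1 for the small sample).
avg_by_loc = conn.execute(
    "SELECT E.loc, AVG(E.sal) FROM Emp E GROUP BY E.loc HAVING COUNT(*) > 1").fetchall()

# Three-way join: employees assigned to a project at a different location.
JOIN_QUERY = """
    SELECT COUNT(DISTINCT E.eid)
    FROM Emp E, Proj P, Asgn A
    WHERE E.eid = A.eid AND P.pid = A.pid AND E.loc <> P.loc
"""
remote_count = conn.execute(JOIN_QUERY).fetchone()[0]

# Ask SQLite which plan it chose; the last column of each row describes a step.
plan_steps = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + JOIN_QUERY)]
for step in plan_steps:
    print(step)
```

The query text never says which table to scan first or how to join; that choice belongs to the system.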
Levels of Abstraction
Categories of data models
One fundamental characteristic of the database approach is

that it provides some level of data abstraction
High-level or Conceptual data models:

Provide concepts that are close to the way many users perceive data
Low-level or Physical data model:
Provide concepts that describe the details of how data is
stored in the computer
Conceptual data models

It uses concepts such as entities, attributes and
relationships.
Entity represents a real-world object or concept,
such as employee or project
Attribute represents some property of interest that further describes an entity, such as an employee's name or salary
Relationship represents an association among two or more entities
Example of a Relation
Schemas and Database State
In any data model, it is important to
distinguish between the description
of the data and database itself
The description of the database is
called the database schema
A displayed Schema is called a
schema diagram
University Database
Example of a Database Schema
Example of a Database Schema

State
The data in the database at a particular moment in time is called a database

state
The distinction between database schema and database state is very important
When we define a new database, we specify its database schema only to the
DBMS
At this point, the corresponding database state is the empty state with no data
We get the initial state of the database when the database is first loaded
From then on, every time an update operation is applied to the database, we get
another database state

State
Valid State: a state that satisfies the structure and constraints specified in the schema.
The database schema changes very
infrequently.
The database state changes every time the
database is updated
Schema is also called intension.
State is also called extension.
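The difference between schema (intension) and state (extension) can be seen in a short sketch using Python's sqlite3 module. The Student table, its columns, and the sample row are hypothetical and serve only to illustrate the idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Defining the schema (intension): only the description of the data.
conn.execute("CREATE TABLE Student (name TEXT, student_number INTEGER, major TEXT)")

def state(connection):
    """Return the current database state (extension) of the Student table."""
    return connection.execute(
        "SELECT * FROM Student ORDER BY student_number").fetchall()

empty_state = state(conn)        # right after definition: the empty state
conn.execute("INSERT INTO Student VALUES ('Smith', 17, 'CS')")
initial_state = state(conn)      # after the database is first loaded
conn.execute("UPDATE Student SET major = 'Math' WHERE student_number = 17")
next_state = state(conn)         # each update produces another state

# The schema stayed the same across all three states.
schema = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'Student'").fetchone()[0]
print(schema)
```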
Three-Schema Architecture
Defines DBMS schemas at three levels:
Internal schema at the internal level to describe physical storage structures and access paths (e.g., indexes).
Conceptual schema at the conceptual level
to describe the structure and constraints for
the whole database for a community of users.
External schemas at the external level to
describe the various user views.
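The three-schema architecture
[Figure: the three-schema architecture]
External level: user/application view, defined by the user or application programmer in consultation with the DBA
Conceptual level: defined by the DBA
Internal level: defined by the DBA for optimization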
DBMS Languages
The first step to create a database through DBMS
is to specify conceptual and internal schemas for
the database
Data Definition Language (DDL): used by database designers to define schemas
Data Manipulation Language (DML)
View Definition Language (VDL): used to specify user views
In current DBMS, the preceding types of
languages are usually not considered distinct
languages
Data Definition Language (DDL)

Specification notation for defining the
database schema
DDL compiler generates a set of tables
stored in a data dictionary
Data dictionary contains metadata (data
about data)
Data storage and definition language: a special type of DDL in which the storage structure and access methods used by the database system are specified
Data Manipulation Language

(DML)
Language for accessing and
manipulating the data organized by
the appropriate data model
Two classes of languages
Procedural: the user specifies what data is required and how to get those data
Nonprocedural: the user specifies what data is required without specifying how to get those data
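A small sketch of the two classes, using Python's sqlite3 module. The Employee table and its rows are hypothetical. The first version spells out how to find the rows; the second only states what is wanted and leaves the how to the DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (Name TEXT, Salary REAL, Dept_Name TEXT)")
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?)",
                 [("Ann", 60000, "Sales"), ("Bob", 40000, "Sales"),
                  ("Cy", 90000, "Research")])

# Procedural style: the program says HOW (scan every row, test it, collect it).
procedural = []
for name, salary, dept in conn.execute("SELECT Name, Salary, Dept_Name FROM Employee"):
    if dept == "Sales" and salary > 50000:
        procedural.append(name)

# Nonprocedural style: the query says WHAT is required; the DBMS decides how.
declarative = [row[0] for row in conn.execute(
    "SELECT Name FROM Employee WHERE Dept_Name = 'Sales' AND Salary > 50000")]

print(procedural, declarative)
```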
ANSI/SPARC Architecture
ANSI - American
National Standards
Institute
SPARC - Standards
Planning and
Requirements
Committee
1975 - proposed a
framework for DBs
A three-level
architecture
Internal level: For
systems designers
Conceptual level: For
database designers and
administrators
External level: For
database users
Internal Level
Deals with physical
storage of data
Structure of records on
disk - files, pages,
blocks
Indexes and ordering of
records
Used by database
system programmers
Internal Schema
RECORD EMP LENGTH=44
  HEADER: BYTE(5) OFFSET=0
  NAME: BYTE(25) OFFSET=5
  SALARY: FULLWORD OFFSET=30
  DEPT: BYTE(10) OFFSET=34
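The byte layout of the EMP record can be checked with Python's struct module. This sketch assumes that FULLWORD means a 4-byte integer and that fields are packed with no alignment padding; the sample values are invented.

```python
import struct

# "=" disables alignment padding, so fields sit at exactly the listed offsets.
# Assumption: FULLWORD is a 4-byte integer ("i").
EMP_RECORD = struct.Struct("=5s25si10s")   # HEADER, NAME, SALARY, DEPT

record = EMP_RECORD.pack(b"HDR01", b"Alice", 52000, b"Sales")
header, name, salary, dept = EMP_RECORD.unpack(record)

record_length = EMP_RECORD.size                          # 5 + 25 + 4 + 10 bytes
salary_at_offset = struct.unpack_from("=i", record, 30)[0]   # SALARY at OFFSET=30
dept_at_offset = record[34:44].rstrip(b"\x00")               # DEPT at OFFSET=34

print(record_length, name.rstrip(b"\x00"), salary_at_offset, dept_at_offset)
```

Fields declared with `Ns` are padded with null bytes up to N bytes, which is why the name and department are stripped before display.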
Conceptual Level
Deals with the
organisation of the
data as a whole
Abstractions are used to
remove unnecessary
details of the internal
level
Used by DBAs and
application
programmers
Conceptual Schema
CREATE TABLE Employee (
  Name VARCHAR(25),
  Salary REAL,
  Dept_Name VARCHAR(10))
External Level
Provides a view of the
database tailored to a
user
Parts of the data may
be hidden
Data is presented in a
useful form
Used by end users and
application
programmers
External Schemas
Payroll:
String Name
double Salary
Personnel:
char *Name
char *Department
Mappings
Mappings translate
information from one
level to the next
External/Conceptual
Conceptual/Internal
These mappings
provide data
independence
Physical data independence: changes to the internal level shouldn't affect the conceptual level
Logical data independence: conceptual level changes shouldn't affect external levels
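One way to picture the external/conceptual mapping is as a set of view definitions. The sketch below uses Python's sqlite3 module; the split into Emp_Dept and Emp_Pay and the sample row are assumptions chosen for illustration. It changes the conceptual schema, updates only the mapping, and shows that the same Payroll query still works.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (Name VARCHAR(25), Salary REAL, Dept_Name VARCHAR(10));
    INSERT INTO Employee VALUES ('Ann', 60000, 'Sales');

    -- External schemas, defined as views over the conceptual schema
    CREATE VIEW Payroll   AS SELECT Name, Salary FROM Employee;
    CREATE VIEW Personnel AS SELECT Name, Dept_Name AS Department FROM Employee;
""")

PAYROLL_QUERY = "SELECT Name, Salary FROM Payroll"
before = conn.execute(PAYROLL_QUERY).fetchall()

# Change the conceptual schema (split Employee into two tables), then update
# only the external/conceptual mapping, i.e. the view definitions.
conn.executescript("""
    DROP VIEW Payroll;
    DROP VIEW Personnel;
    CREATE TABLE Emp_Dept AS SELECT Name, Dept_Name FROM Employee;
    CREATE TABLE Emp_Pay  AS SELECT Name, Salary FROM Employee;
    DROP TABLE Employee;
    CREATE VIEW Payroll   AS SELECT Name, Salary FROM Emp_Pay;
    CREATE VIEW Personnel AS SELECT Name, Dept_Name AS Department FROM Emp_Dept;
""")

after = conn.execute(PAYROLL_QUERY).fetchall()
print(before, after)
```

The application's query text is unchanged; only the mapping was rewritten, which is the sense in which the mapping provides logical data independence.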
ANSI/SPARC Architecture
[Figure: Users 1, 2, and 3 access External View 1 and External View 2 (external schemas); external/conceptual mappings connect these to the Conceptual View (conceptual schema); a conceptual/internal mapping connects it to the internal schema and the stored data; the DBA is also shown]
Typical DBMS Component Modules
Interfacing Components of
DBMS
users
software
hardware
data
DBMS Roles
[Figure: DBMS roles (application developers, DBMS system developers, database designer, system administrator (and DB designer)) placed around the DBMS components: application program(s), users of the data, data definition processor, query processor, security manager, concurrency manager, index manager, data dictionary, data]
DBMS Roles
Actors On the Scene
(people interested in the actual
data):
database administrators
database designers
systems analysts and application
programmers
end users
Actors on the Scene

Database Administrators
acquiring a DBMS
managing the system
acquiring HW and SW to support the
DBMS
authorizing access (security policies)
managing staff, including DB designers
Actors on the Scene

Database Designers
identifying the information of interest in the Universe of Discourse (UoD)
designing the database conceptual
schema
designing views for particular users
designing the physical data layout and
logical schema
adjusting data parameters for
performance
Actors on the Scene

Systems Analysts and Application
Programmers
(generic database developers)
provide specialized knowledge to
optimize database usage
provide generic (canned) application
programs
Actors on the Scene

End Users
casual users: ad-hoc queries
naive or parametric users: canned queries such as menus for a phone company customer service agent
sophisticated users: people who understand the system and the data and use it in many novel ways
standalone users: people who use personal, easy-to-use databases for personal data
DBMS Roles
Actors Behind the Scene:
people who maintain the
environment
but aren't interested in the actual
data
DBMS designers and implementers
tools developers
operators and maintenance personnel
database researchers
Actors Behind the Scene

DBMS designers and implementers
work for the company that supplies the
DBMS
(e.g., Microsoft, Oracle, Sybase, MySQL)
programmers and engineers
design and implement the DBMS

Tools Developers
design and implement DBMS add-ons or
plug-ins
may work for DBMS supplier or be
independent
kinds of tools: database design aids,
performance monitoring tools, user and
designer interfaces

Operators and maintenance
personnel
run and maintain the computer
environment in which a DBMS operates
probably work for the database
administrator (DBA)

Database Researchers
academic or industrial researchers
develop new theory, new designs, new
data models and new algorithms to
improve future database management
systems
Software
Software controls the organization, storage, management, and retrieval of data in a database.
It includes the operating system, network software, and the application programs.
It runs on hardware, which encompasses the physical interconnections and devices required to store and execute (or run) the software.
At the lowest level, software consists of a machine language specific to an individual processor.
It is usually written in a high-level programming language, which is more efficient for humans to use.
Hardware
Hardware of a system can range from a PC to a network of
computers.
It also includes various storage devices like hard discs and
input and output devices like monitor, printer, etc.
DATA
Data stored in a database includes numerical data such as

whole numbers and floating point numbers and non numerical
data such as characters, date, or logical data.
More advanced systems may include more complicated data
entities such as pictures and images as data types.
Some other components of DBMS
User Interface
Data Manager
File Manager
Disk Manager
Physical Database
User Interface
The user interface is the aggregate of means by which people (the users) interact with the system: a particular machine, device, computer program or other complex tool.
The user interface provides the means of:
-Input, allowing the users to manipulate the system.
-Output, allowing the system to produce the effects of the users' manipulation.
It refers to the graphical, textual and auditory information the program presents to the user and the control sequences the user employs to control the program.
Data Manager
It is a program which allows you to process and manipulate your data in an easy and logical manner using a graphical interface.
Data Manager reads and writes delimited files such as comma-separated values (CSV) files and can also read data from ODBC data sources.
It allows you to construct a conceptual design on how you are going
to process your data and transform it into another form.
You form your design by adding functional nodes and linking them
such that the links form the data flow through nodes on a graphical
work area.
Each node performs a single function on your data; once it completes, it passes your data to the node it is linked to, and the process continues until the data encounters an output node.
You can form a simple design or a complicated design with hundreds
of nodes and multiple input and output nodes.
File Manager
A file manager or file browser is a computer program that provides a
user interface to work with file systems.
They are very useful for speeding up interaction with files
The most common operations on files are create, open, edit, view,
print, play, rename, move, copy, delete, attributes, properties,
search/find, and permissions.
File managers may contain features inspired by web browsers,
including forward and back navigational buttons.
File managers also provide the ability to extend operations using user-written scripts.
The file manager passes requests to the disk manager.
Disk Manager
Disk manager is a simple filesystem configurator that allows you to:
-Automatically detect new partitions at startup.
-Fully manage configuration of filesystem.

Disk Manager logs every change you make to the filesystem
configuration
explaining hardware concepts
documenting switches of many of the existing disks
putting into place custom software drivers, notably those related to
maximum disk or partition size
providing testing and informational utilities
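Interaction of DBMS components
[Figure: within the DBMS, the user interface passes requests to the Data Manager, then the File Manager, then the Disk Manager, which accesses the Physical Database; the Transaction Manager and Recovery Manager are shown alongside]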
Explanation of interactions
The user requests specific information through the user interface.
The data manager processes this request and then asks the file manager for the specific records.
The file manager then asks the disk manager for the specific block.
The disk manager retrieves the block and sends it to the file manager, which sends the required record to the data manager.
The transaction manager supervises the data transactions that are carried out between the data manager, file manager, and the disk manager.
The recovery manager keeps a check on the transacted data so that, in case of system failure, the data can be protected.
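The request path described above can be sketched as three small layers. This is a toy model only: the class names mirror the component names, but the methods, the block layout, and the sample records are all hypothetical, and the transaction and recovery managers are left out.

```python
class DiskManager:
    """Holds blocks of records; stands in for the physical database."""
    def __init__(self, blocks):
        self.blocks = blocks                # block number -> list of records

    def read_block(self, block_no):
        return self.blocks[block_no]


class FileManager:
    """Knows which block holds which record and asks the disk manager for it."""
    def __init__(self, disk, record_index):
        self.disk = disk
        self.record_index = record_index    # record key -> (block number, slot)

    def read_record(self, key):
        block_no, slot = self.record_index[key]
        return self.disk.read_block(block_no)[slot]


class DataManager:
    """Turns a user request into a record request for the file manager."""
    def __init__(self, files):
        self.files = files

    def lookup(self, key):
        return self.files.read_record(key)


disk = DiskManager({0: [("c1", "Hayes"), ("c2", "Jones")], 1: [("c3", "Smith")]})
files = FileManager(disk, {"c1": (0, 0), "c2": (0, 1), "c3": (1, 0)})
data_manager = DataManager(files)

result = data_manager.lookup("c3")   # request flows data -> file -> disk and back
print(result)
```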
Advantages of Using a DBMS
[Figure: application program(s) and users of the data on top of the DBMS components (query processor, security manager, concurrency manager, index manager, data definition processor), which sit above the data dictionary and data]
Software operating between the data and the applications can provide many capabilities in a generic way.
Persistence
A DBMS provides
persistent objects, types and data structures
persistent = having a lifetime longer than
the programs that use the data
any information that fits the data model
of a particular DBMS
can be made persistent with little effort
data model = concepts that can be used to
describe the data
Concurrency
A DBMS supports access by concurrent users
concurrent = happening at the same time
concurrent access, particularly writes (data changes),
can result in inconsistent states

(even when the individual operations are correct)
the DBMS can check the actual operations of
concurrent users, to prevent activity that will lead to
inconsistent states
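The sketch below simulates, on a single sqlite3 connection, how two individually correct deposits can still leave an inconsistent state when their steps interleave. The account table, the amounts, and the fixed interleaving are assumptions made for illustration; a real DBMS would involve separate sessions and locking or another concurrency control scheme.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('A-102', 100)")

def read_balance():
    return conn.execute(
        "SELECT balance FROM account WHERE account_number = 'A-102'").fetchone()[0]

def write_balance(value):
    conn.execute(
        "UPDATE account SET balance = ? WHERE account_number = 'A-102'", (value,))

# Uncontrolled interleaving: both users read before either writes.
seen_by_user1 = read_balance()         # 100
seen_by_user2 = read_balance()         # 100
write_balance(seen_by_user1 + 50)      # user 1 deposits 50
write_balance(seen_by_user2 + 30)      # user 2 deposits 30, overwriting user 1
lost_update_balance = read_balance()   # user 1's deposit has been lost

# Controlled version: each deposit is one read-modify-write done by the DBMS.
write_balance(100)
conn.execute("UPDATE account SET balance = balance + 50 WHERE account_number = 'A-102'")
conn.execute("UPDATE account SET balance = balance + 30 WHERE account_number = 'A-102'")
correct_balance = read_balance()

print(lost_update_balance, correct_balance)
```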
Access Control
A DBMS can restrict access to
authorized users
security policies often require control
that is more fine-grained than that
provided by a file system
since the DBMS understands the data
structure, it can enforce fairly
sophisticated and detailed security
policies
Redundancy Control
A DBMS can assist in controlling redundancy
redundancy = multiple copies of the same data
with file storage, it's often convenient to store
multiple copies of the same data, so that it's "local"
to other data and applications
this can cause many problems:
wasted disk space
inconsistencies
need to enter the data multiple times
Complex Semantics
A DBMS supports representation
of complex relationships and integrity
constraints
the semantics (meaning) of an application often
includes many relationships and rules
about the relative values of subsets of the data
these further restrict the possible instances of the
database
relationships and constraints can be defined as part of
the schema
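Relationships and rules declared in the schema are then enforced by the DBMS on every change. The sketch below uses Python's sqlite3 module; the customer and loan columns, the specific rules, and the sample values are assumptions. In SQLite, foreign keys are only enforced after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite-specific: enable FK enforcement
conn.executescript("""
    CREATE TABLE customer (customer_id TEXT PRIMARY KEY, customer_name TEXT NOT NULL);
    CREATE TABLE loan (
        loan_number TEXT PRIMARY KEY,
        amount      REAL CHECK (amount > 0),                        -- rule on values
        customer_id TEXT NOT NULL REFERENCES customer (customer_id) -- relationship
    );
    INSERT INTO customer VALUES ('c1', 'Hayes');
""")

def try_insert(sql):
    """Run an INSERT and report whether the DBMS accepted it."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

valid = try_insert("INSERT INTO loan VALUES ('L-1', 1000, 'c1')")
bad_amount = try_insert("INSERT INTO loan VALUES ('L-2', -5, 'c1')")     # violates CHECK
bad_customer = try_insert("INSERT INTO loan VALUES ('L-3', 500, 'c9')")  # no such customer

print(valid, bad_amount, bad_customer)
```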
Backup and Recovery

A DBMS can provide backup and recovery
backup = snapshots of the data at particular times
recovery = restoring the data to a consistent state
after a system crash
the higher level semantics (relationships and
constraints)
can make it difficult to restore a consistent state
transaction analysis can allow a DBMS to
reconstruct a consistent state from a number of
backups
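Full crash recovery is hard to show in a few lines, but the underlying idea of returning to a consistent state can be sketched with a transaction rollback. This is a simplified stand-in, not a backup mechanism; the account table, the CHECK rule, and the amounts are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE account (
    account_number TEXT PRIMARY KEY,
    balance INTEGER CHECK (balance >= 0))""")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A-101", 100), ("A-102", 50)])
conn.commit()

def transfer(src, dst, amount):
    """Move money between accounts as one all-or-nothing transaction."""
    try:
        with conn:   # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE account SET balance = balance + ? "
                         "WHERE account_number = ?", (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? "
                         "WHERE account_number = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer("A-101", "A-102", 30)        # both updates succeed
failed = transfer("A-101", "A-102", 500)   # second update breaks the CHECK,
                                           # so the first update is undone too
balances = dict(conn.execute("SELECT account_number, balance FROM account"))
print(ok, failed, balances)
```

After the failed transfer, the total across both accounts is unchanged, which is the consistent state the constraint is meant to protect.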
Views and Interfaces

A DBMS can support
multiple user interfaces and user views
since the DBMS provides a well-defined data model
and a persistent data dictionary, many different
interfaces can be developed to access the same data
data independence ensures that these UIs will not be
made invalid by most changes to the data
new user views can be supported as new schemas
defined against the conceptual schema
DBMS Structure
[Figure: users of the data and application program(s) (external/application view) above the DBMS software components (data definition processor, query processor, security manager, concurrency manager, index manager) and the data description: data dictionary and data (internal/implementation view)]
DBMS Languages
DML: data manipulation language
QL: query language
GPL: general purpose languages
DDL: data definition language
system configuration languages
[Figure: the DBMS component diagram (application program(s), users of the data, query processor, security manager, concurrency manager, index manager, data definition processor, data dictionary, data) annotated with these languages]
Data Independence
physical data independence
conceptual and external schema are defined
in terms of the data model,
rather than the actual data layout
ensures that conceptual and external schemas
are not affected by changes to the physical data
layout
logical data independence

ensures that changes to the conceptual schema
don't affect the external views
(this is not always achievable)
Disadvantages of DBMS
The disadvantages of the database approach are summarized as follows:
1. Complexity: The provision of the functionality that is expected of a good DBMS makes the DBMS an extremely complex piece of software. Database designers, developers, database administrators and end-users must understand this functionality to take full advantage of it. Failure to understand the system can lead to bad design decisions, which can have serious consequences for an organization.
2. Size: The complexity and breadth of functionality makes the DBMS an extremely large piece of software, occupying many megabytes of disk space and requiring substantial amounts of memory to run efficiently.
3. Performance: Typically, a file-based system is written for a specific application, such as invoicing. As a result,
4. Higher impact of a failure: The centralization of resources increases the vulnerability of the system. Since all users and applications rely on the availability of the DBMS, the failure of any component can bring operations to a halt.
5. Cost of DBMS: The cost of DBMS varies significantly, depending on the environment and functionality provided. There is also the recurrent annual maintenance cost.
6. Additional hardware costs: The disk storage requirements for the DBMS and the database may necessitate the purchase of additional storage space. Furthermore, to achieve the required performance it may be necessary to purchase a larger machine, perhaps even a machine dedicated to running the DBMS. The procurement of additional hardware results in further expenditure.
Data Associations
Entities, Attributes and Relations
A database can be modeled as:

a collection of entities,
relationships among entities.
An entity is an object that exists and is

distinguishable from other objects.
Example: specific person, company, event, plant
Entities have attributes

Example: people have
names and addresses
An entity set is a set of entities of the same type that

share the same properties.
Example: set of all persons, companies, trees, holidays
Copyright @ www.bcanotes.com
Entity Sets customer and loan
[Figure: entity set customer with attributes customer-id, customer-name, customer-street, customer-city; entity set loan with attributes loan-number, amount]
Attributes
An entity is represented by a set of attributes, that is, descriptive properties possessed by all members of an entity set.
Example:
customer = (customer-id, customer-name,
customer-street, customer-city)
loan = (loan-number, amount)
Domain - the set of permitted values for each

attribute
Attribute types:
Simple and composite attributes.

Single-valued and multi-valued attributes
E.g. multivalued attribute: phone-numbers

Derived attributes
Can be computed from other attributes
E.g. age, given date of birth
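The attribute types can be mirrored in a small Python class. This is only a sketch: the field names, the split of name into first and last, and the sample customer are assumptions. The point is that age is computed from date of birth rather than stored.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Customer:
    customer_id: str                  # simple, single-valued attribute
    first_name: str                   # components of a composite attribute "name"
    last_name: str
    date_of_birth: date
    phone_numbers: list = field(default_factory=list)   # multivalued attribute

    def age(self, today):
        """Derived attribute: computed from date_of_birth, never stored."""
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month, self.date_of_birth.day)
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)

hayes = Customer("c1", "Sam", "Hayes", date(1990, 6, 15), ["555-0100", "555-0101"])
age_before_birthday = hayes.age(date(2024, 6, 14))
age_on_birthday = hayes.age(date(2024, 6, 15))
print(age_before_birthday, age_on_birthday)
```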
Composite Attributes
Relationship Sets
A relationship is an association among several
entities
Example: Hayes (customer entity), depositor (relationship set), A-102 (account entity)
(Hayes, A-102) ∈ depositor
Relationship Set borrower
Relationship Sets (Cont.)

An attribute can also be property of a relationship set.
For instance, the depositor relationship set between entity sets
customer and account may have the attribute access-date
Degree of a Relationship Set

Refers to number of entity sets that participate in a
relationship set.
Relationship sets that involve two entity sets are
binary (or degree two). Generally, most relationship
sets in a database system are binary.
Relationship sets may involve more than two entity
sets. E.g. Suppose employees of a bank may have jobs
(responsibilities) at multiple branches, with different jobs at

different branches. Then there is a ternary relationship set
between entity sets employee, job and branch
Relationships between more than two entity sets are

rare. Most relationships are binary.
Mapping Cardinalities
Express the number of entities to which another
entity can be associated via a relationship set.
Most useful in describing binary relationship sets.
For a binary relationship set the mapping
cardinality must be one of the following types:
One to one
One to many
Many to one
Many to many
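The four types can be told apart by counting how many partners each entity has. The helper below is a hypothetical sketch that classifies one set of (a, b) pairs, read from A to B. A mapping cardinality is a constraint on every possible state, so a single sample can only show what that sample exhibits.

```python
from collections import Counter

def mapping_cardinality(pairs):
    """Classify a binary relationship set, given as (a, b) pairs, from A to B."""
    pairs = set(pairs)
    a_counts = Counter(a for a, _ in pairs)   # number of B entities per A entity
    b_counts = Counter(b for _, b in pairs)   # number of A entities per B entity
    a_to_many = any(n > 1 for n in a_counts.values())
    b_to_many = any(n > 1 for n in b_counts.values())
    if not a_to_many and not b_to_many:
        return "one to one"
    if a_to_many and not b_to_many:
        return "one to many"
    if not a_to_many and b_to_many:
        return "many to one"
    return "many to many"

# Sample (customer, loan) pairs invented for illustration.
one_to_one = mapping_cardinality([("Hayes", "L-1"), ("Jones", "L-2")])
one_to_many = mapping_cardinality([("Hayes", "L-1"), ("Hayes", "L-2")])
many_to_one = mapping_cardinality([("Hayes", "L-1"), ("Jones", "L-1")])
many_to_many = mapping_cardinality(
    [("Hayes", "L-1"), ("Hayes", "L-2"), ("Jones", "L-1")])
print(one_to_one, one_to_many, many_to_one, many_to_many)
```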
Mapping Cardinalities
One to one
One to many
Note: Some elements in A and B may not be mapped to any elements in the other set
Mapping Cardinalities ...
Many to one
Many to many
Note: Some elements in A and B may not be mapped to any elements in the other set
Mapping Cardinalities affect ER Design

Can make access-date an attribute of account, instead of a
relationship attribute, if each account can have only one customer

I.e., the relationship from account to customer is many to one,
or equivalently, customer to account is one to many
E-R Diagrams
Rectangles represent entity sets.

Diamonds represent relationship sets.
Lines link attributes to entity sets and entity sets to relationship sets.
Ellipses represent attributes

Double ellipses represent multivalued attributes.
Dashed ellipses denote derived attributes.

Underline indicates primary key attributes (will study later)
E-R Diagram With Composite, Multivalued, and Derived

Attributes
Relationship Sets with Attributes
Roles
Entity sets of a relationship need not be distinct
The labels "manager" and "worker" are called roles; they specify how
employee entities interact via the works-for relationship set.
Roles are indicated in E-R diagrams by labeling the lines that connect
diamonds to rectangles.
Role labels are optional, and are used to clarify semantics of the
relationship
Cardinality Constraints
We express cardinality constraints by drawing either a directed line (->), signifying "one," or an undirected line (-), signifying "many," between the relationship set and the entity set.
E.g.: One-to-one relationship:

A customer is associated with at most one loan via the
relationship borrower
A loan is associated with at most one customer via borrower
One-To-Many Relationship
In the one-to-many relationship a loan is associated
with at most one customer via borrower, a customer
is associated with several (including 0) loans via
borrower
Many-To-One Relationships
In a many-to-one relationship a loan is associated with several
(including 0) customers via borrower, a customer is

associated with at most one loan via borrower
Many-To-Many Relationship
A customer is associated with several (possibly

0) loans via borrower
A loan is associated with several (possibly 0)
customers via borrower
Participation of an Entity Set in a Relationship Set

Total participation (indicated by double line): every entity in the entity
set participates in at least one relationship in the relationship set
E.g. participation of loan in borrower is total
every loan must have a customer associated to it via
borrower
Partial participation: some entities may not participate in any relationship in the relationship set
E.g. participation of customer in borrower is partial
Keys
A super key of an entity set is a set of one or more
attributes whose values uniquely determine each entity.

A candidate key of an entity set is a minimal super key
Customer-id is candidate key of customer
account-number is candidate key of account
Although several candidate keys may exist, one of the
candidate keys is selected to be the primary key.
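The definitions of super key and candidate key can be written as two short checks. The functions and the sample customer rows are hypothetical. A key is a property of the schema, so testing one state can only rule a key out, never prove it.

```python
from itertools import combinations

def is_super_key(rows, attrs):
    """True if no two rows agree on all attributes in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def candidate_keys(rows, all_attrs):
    """Minimal super keys: super keys with no smaller super key inside them."""
    found = []
    for size in range(1, len(all_attrs) + 1):       # smallest sets first
        for attrs in combinations(all_attrs, size):
            contains_smaller_key = any(set(k) <= set(attrs) for k in found)
            if is_super_key(rows, attrs) and not contains_smaller_key:
                found.append(attrs)
    return found

# Sample rows invented for illustration.
customers = [
    {"customer-id": "c1", "customer-name": "Hayes", "customer-city": "Harrison"},
    {"customer-id": "c2", "customer-name": "Hayes", "customer-city": "Rye"},
    {"customer-id": "c3", "customer-name": "Jones", "customer-city": "Rye"},
    {"customer-id": "c4", "customer-name": "Hayes", "customer-city": "Rye"},
]
attributes = ["customer-id", "customer-name", "customer-city"]

id_is_super_key = is_super_key(customers, ["customer-id"])
name_is_super_key = is_super_key(customers, ["customer-name"])
keys = candidate_keys(customers, attributes)
print(id_is_super_key, name_is_super_key, keys)
```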
Keys for Relationship Sets

The combination of primary keys of the participating
entity sets forms a super key of a relationship set.
(customer-id, account-number) is the super key of depositor

NOTE: this means a pair of entity sets can have at most one
relationship in a particular relationship set.
E.g. if we wish to track all access-dates to each account by each

customer, we cannot assume a relationship for each access. We can use
a multivalued attribute though
Must consider the mapping cardinality of the relationship set when deciding what the candidate keys are
Need to consider semantics of relationship set in
selecting the primary key in case of more than one
candidate key
E-R Diagram with a Ternary Relationship
Cardinality Constraints on Ternary Relationship

We allow at most one arrow out of a ternary (or greater
degree) relationship to indicate a cardinality constraint
E.g. an arrow from works-on to job indicates each
employee works on at most one job at any branch.
If there is more than one arrow, there are two ways of
defining the meaning.
E.g a ternary relationship R between A, B and C with arrows to B
and C could mean

1. each A entity is associated with a unique entity from B and C or
2. each pair of entities from (A, B) is associated with a unique C
entity,
and each pair (A, C) is associated with a unique B
Each alternative has been used in different formalisms
To avoid confusion we outlaw more than one arrow
14.4 DATABASE MODELS
A database model defines the logical design of data. The model also
describes the relationships between different parts of the data. In the
history of database design, three models have been in use: the
hierarchical model, the network model and the relational model.
1. File-Based Systems or Primitive Data Models: Entities or objects of interest are represented by records that are stored together in files. Relationships between objects are represented by using directories of various kinds.
2. Traditional Data Models: The most commonly used traditional models are the hierarchical, network and relational data models.
3. Semantic Data Models: this type of model was influenced by the semantic networks developed by
Hierarchical database model
In the hierarchical model, data is organized as an inverted

tree. Each entity has only one parent but can have several
children. At the top of the hierarchy, there is one entity,
which is called the root.
Figure 14.3 An example of the hierarchical model representing a university
Network database model
In the network model, the entities are organized in a graph,

in which some entities can be accessed through several
paths (Figure 14.4).
Figure 14.4 An example of the network model representing a university
Relational database model
In the relational model, data is organized in two-dimensional tables called relations. The tables or relations are, however, related to each other.
Figure 14.5 An example of the relational model representing a university
14.5 THE RELATIONAL DATABASE MODEL
In the relational database management system (RDBMS), the

data is represented as a set of relations.
The entity-relationship model is a generalization of
other two commercial models (hierarchical and
network). It allows the representation of explicit
constraints as well as relationships. This model is
basically useful in the design and communication of
the logical database. In this model, the objects of
similar structures are collected into an entity set.
The relationship between entity sets is represented
by a named E-R relationship and is 1:1, 1:M or M:N,
mapping from one entity set to another. The
database structure, employing the E-R model is
usually shown pictorially using entity-relationship
Relations
A relation appears as a two-dimensional table. The RDBMS

organizes the data so that its external view is a set of
relations or tables. This does not mean that data is stored as
tables: the physical storage of the data is independent of the
way in which the data is logically organized.
Figure 14.6 An example of a relation
A relation in an RDBMS has the following features:

Name. Each relation in a relational database should have
a name that is unique among other relations.
Attributes. Each column in a relation is called an
attribute. The attributes are the column headings in the
table in Figure 14.6.
Tuples. Each row in a relation is called a tuple. A tuple
defines a collection of attribute values. The total number
of rows in a relation is called the cardinality of the
relation. Note that the cardinality of a relation changes
when tuples are added or deleted. This makes the
database dynamic.
14.6 OPERATIONS ON RELATIONS
In a relational database we can define several operations to

create new relations based on existing ones. We define nine
operations in this section: insert, delete, update, select, project,
join, union, intersection and difference. Instead of discussing
these operations in the abstract, we describe each operation as
defined in the database query language SQL (Structured
Query Language).
Structured Query Language

Structured Query Language (SQL) is the language
standardized by the American National Standards Institute
(ANSI) and the International Organization for
Standardization (ISO) for use on relational databases. It is
a declarative rather than procedural language, which
means that users declare what they want without having to
write a step-by-step procedure. SQL originated at IBM in the 1970s; its first commercial implementation was released in 1979 by the company now known as Oracle Corporation, with various versions of SQL being released since then.
Insert
The insert operation is a unary operation; that is, it is
applied to a single relation. The operation inserts a new
tuple into the relation. The insert operation uses the
following format:
Figure 14.7 An example of an insert operation
Delete
The delete operation is also a unary operation. The operation
deletes a tuple defined by a criterion from the relation. The
delete operation uses the following format:
Figure 14.8 An example of a delete operation
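In standard SQL a delete has the general shape DELETE FROM relation-name WHERE criterion. The following sketch, using Python's sqlite3 module, removes the tuples that match a criterion. The COURSES relation and its rows are hypothetical, not the data of Figure 14.8.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE COURSES (CourseNo TEXT, CourseName TEXT, Unit INTEGER)")
conn.executemany(
    "INSERT INTO COURSES VALUES (?, ?, ?)",
    [("CIS15", "Intro to C", 5), ("CIS17", "Intro to Java", 5), ("CIS19", "UNIX", 4)],
)

# Delete: remove every tuple whose CourseNo satisfies the criterion.
conn.execute("DELETE FROM COURSES WHERE CourseNo = 'CIS19'")

remaining = [
    row[0] for row in conn.execute("SELECT CourseNo FROM COURSES ORDER BY CourseNo")
]
print(remaining)
```

Without the WHERE clause the statement would delete every tuple in the relation, so the criterion is the part to check first.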
Update
The update operation is also a unary operation that is applied
to a single relation. The operation changes the value of some
attributes of a tuple. The update operation uses the following
format:
Figure 14.9 An example of an update operation
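In standard SQL an update has the general shape UPDATE relation-name SET attribute = value WHERE criterion. The sketch below changes one attribute of one tuple through Python's sqlite3 module. The COURSES relation and its rows are hypothetical, not the data of Figure 14.9.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE COURSES (CourseNo TEXT, CourseName TEXT, Unit INTEGER)")
conn.executemany(
    "INSERT INTO COURSES VALUES (?, ?, ?)",
    [("CIS15", "Intro to C", 5), ("CIS17", "Intro to Java", 5), ("CIS19", "UNIX", 4)],
)

# Update: change the Unit attribute of the tuple selected by the criterion.
conn.execute("UPDATE COURSES SET Unit = 6 WHERE CourseNo = 'CIS17'")

units = conn.execute("SELECT CourseNo, Unit FROM COURSES ORDER BY CourseNo").fetchall()
print(units)
```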
Select
The select operation is a unary operation. It creates a new
relation whose tuples (rows) are the subset of the tuples in the
original relation that satisfy a given criterion.
Figure 14.10 An example of a select operation
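In SQL the select operation corresponds to the WHERE clause: SELECT * FROM relation-name WHERE criterion keeps every attribute but only some of the tuples. A minimal sketch with Python's sqlite3 module follows. The COURSES relation and its rows are hypothetical, not the data of Figure 14.10.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE COURSES (CourseNo TEXT, CourseName TEXT, Unit INTEGER)")
conn.executemany(
    "INSERT INTO COURSES VALUES (?, ?, ?)",
    [("CIS15", "Intro to C", 5), ("CIS17", "Intro to Java", 5), ("CIS19", "UNIX", 4)],
)

# Select: keep all columns, but only the rows that satisfy the criterion.
five_unit_courses = conn.execute(
    "SELECT * FROM COURSES WHERE Unit = 5 ORDER BY CourseNo"
).fetchall()
print(five_unit_courses)
```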
Project
The project operation is also a unary operation and creates
another relation. The attributes (columns) in the resulting
relation are a subset of the attributes in the original relation.
Figure 14.11 An example of a project operation
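In SQL the project operation corresponds to the attribute list of a SELECT statement: SELECT attribute-list FROM relation-name keeps every tuple but only some of the columns. The sketch below uses Python's sqlite3 module with a hypothetical COURSES relation, not the data of Figure 14.11.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE COURSES (CourseNo TEXT, CourseName TEXT, Unit INTEGER)")
conn.executemany(
    "INSERT INTO COURSES VALUES (?, ?, ?)",
    [("CIS15", "Intro to C", 5), ("CIS17", "Intro to Java", 5), ("CIS19", "UNIX", 4)],
)

# Project: keep all rows, but only the CourseNo and Unit columns.
projected = conn.execute(
    "SELECT CourseNo, Unit FROM COURSES ORDER BY CourseNo"
).fetchall()

# SQL keeps duplicate rows after a projection unless DISTINCT is requested.
distinct_units = conn.execute(
    "SELECT DISTINCT Unit FROM COURSES ORDER BY Unit"
).fetchall()
print(projected, distinct_units)
```

The DISTINCT variant is shown because a relation in the strict sense has no duplicate tuples, whereas an SQL result may.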
Join
The join operation is a binary operation that combines two
relations on common attributes.
Figure 14.12 An example of a join operation
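A join on a common attribute can be sketched as follows with Python's sqlite3 module. The two relations, COURSES and TAUGHT_BY, and their rows are hypothetical and are not the data of Figure 14.12; the common attribute is assumed to be CourseNo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE COURSES (CourseNo TEXT, CourseName TEXT, Unit INTEGER)")
conn.execute("CREATE TABLE TAUGHT_BY (CourseNo TEXT, Professor TEXT)")
conn.executemany(
    "INSERT INTO COURSES VALUES (?, ?, ?)",
    [("CIS15", "Intro to C", 5), ("CIS17", "Intro to Java", 5), ("CIS19", "UNIX", 4)],
)
conn.executemany(
    "INSERT INTO TAUGHT_BY VALUES (?, ?)",
    [("CIS15", "Lee"), ("CIS17", "Walter"), ("CIS51", "Lu")],
)

# Join: pair up tuples from the two relations that agree on CourseNo.
joined = conn.execute(
    """
    SELECT COURSES.CourseNo, CourseName, Unit, Professor
    FROM COURSES
    JOIN TAUGHT_BY ON COURSES.CourseNo = TAUGHT_BY.CourseNo
    ORDER BY COURSES.CourseNo
    """
).fetchall()
print(joined)
```

This is an inner join, so CIS19 (no matching professor) and CIS51 (no matching course) do not appear in the result.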
Union
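The union operation is a binary operation that takes two
relations with the same set of attributes and creates a new
relation containing every tuple that is in the first relation,
in the second relation, or in both.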
Figure 14.13 An example of a union operation
Intersection
The intersection operation takes two relations with the same
set of attributes and creates a new relation containing only
the tuples that are in both.
Figure 14.14 An example of an intersection operation
Difference
The difference operation is applied to two relations with the
same attributes. The tuples in the resulting relation are those
that are in the first relation but not the second.
Figure 14.15 An example of a difference operation
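The three set operations (union, intersection and difference) map onto the SQL keywords UNION, INTERSECT and EXCEPT. The sketch below runs all three with Python's sqlite3 module on two hypothetical class rosters, ROSTER_A and ROSTER_B, which share the same attributes; the names and rows are assumptions, not the data of Figures 14.13 to 14.15.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ROSTER_A (StudentID INTEGER, Name TEXT)")
conn.execute("CREATE TABLE ROSTER_B (StudentID INTEGER, Name TEXT)")
conn.executemany(
    "INSERT INTO ROSTER_A VALUES (?, ?)", [(145, "John"), (216, "Mary"), (341, "Ann")]
)
conn.executemany(
    "INSERT INTO ROSTER_B VALUES (?, ?)", [(145, "John"), (232, "George"), (341, "Ann")]
)


def combine(keyword):
    """Apply one set operator to the two rosters and return the student IDs."""
    sql = (
        f"SELECT * FROM ROSTER_A {keyword} SELECT * FROM ROSTER_B "
        "ORDER BY StudentID"
    )
    return [row[0] for row in conn.execute(sql)]


union_ids = combine("UNION")             # in A, in B, or in both
intersection_ids = combine("INTERSECT")  # in both A and B
difference_ids = combine("EXCEPT")       # in A but not in B
print(union_ids, intersection_ids, difference_ids)
```

Difference is not symmetric: swapping the two relations would return the students who are only in ROSTER_B.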
14.7 DATABASE DESIGN
The design of any database is a lengthy and involved
task that can only be done through a step-by-step
process. The first step normally involves interviewing
potential users of the database. The second step is to
build an entity-relationship model (ERM) that defines
the entities, the attributes of those entities and the
relationships between those entities.
Entity-relationship models (ERM)
In this step, the database designer creates an
entity-relationship (E-R) diagram to show the entities for which
information needs to be stored and the relationships between
those entities. E-R diagrams use several geometric shapes,
but we use only a few of them here:
Rectangles represent entity sets
Ellipses represent attributes
Diamonds represent relationship sets
Lines link attributes to entity sets and link entity sets to
relationship sets
Example 14.1
Figure 14.16 shows a very simple E-R diagram with three entity
sets, their attributes and the relationships between the entity sets.
Figure 14.16 Entities, attributes and relationships in an E-R diagram

From E-R diagrams to relations
After the E-R diagram has been finalized, relations (tables)

in the relational database can be created.
Relations for entity sets

For each entity set in the E-R diagram, we create a relation
(table) in which there are n columns related to the n
attributes defined for that set.
Example 14.2
We can have three relations (tables), one for each entity set
defined in Figure 14.16, as shown in Figure 14.17.
Figure 14.17 Relations for entity sets in Figure 14.16

Relations for relationship sets

For each relationship set in the E-R diagram, we create a
relation (table). This relation has one column for the key of
each entity set involved in this relationship and also one
column for each attribute of the relationship itself if the
relationship has attributes (not in our case).
Example 14.3
There are two relationship sets in Figure 14.16, teaches and takes,
each connected to two entity sets. The relations for these
relationship sets are added to the previous relations for the entity
sets and are shown in Figure 14.18.
Figure 14.18 Relations for E-R diagram in Figure 14.16
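The two mapping rules (one relation per entity set, one relation per relationship set holding the keys of the entity sets it connects) can be sketched with Python's sqlite3 module. The relationship names teaches and takes come from the text; the entity set names STUDENT, COURSE and PROFESSOR, their attributes, and which entity sets each relationship connects are assumptions, because Figure 14.16 is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- One relation per entity set, one column per attribute (hypothetical).
    CREATE TABLE STUDENT   (SID TEXT PRIMARY KEY, Name TEXT);
    CREATE TABLE COURSE    (CourseNo TEXT PRIMARY KEY, CourseName TEXT, Unit INTEGER);
    CREATE TABLE PROFESSOR (PID TEXT PRIMARY KEY, Name TEXT);

    -- One relation per relationship set: one column for the key of each
    -- entity set involved. Neither relationship has attributes of its own.
    CREATE TABLE TEACHES (
        PID      TEXT REFERENCES PROFESSOR(PID),
        CourseNo TEXT REFERENCES COURSE(CourseNo),
        PRIMARY KEY (PID, CourseNo)
    );
    CREATE TABLE TAKES (
        SID      TEXT REFERENCES STUDENT(SID),
        CourseNo TEXT REFERENCES COURSE(CourseNo),
        PRIMARY KEY (SID, CourseNo)
    );
    """
)

# List the columns of every relation that was created.
tables = ("STUDENT", "COURSE", "PROFESSOR", "TEACHES", "TAKES")
columns = {
    table: [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    for table in tables
}
print(columns)
```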

Normalization
Normalization is the process by which a given set of
relations is transformed into a new set of relations with a
more solid structure. Normalization is needed to allow any
relation in the database to be represented, to allow a
language like SQL to use powerful retrieval operations
composed of atomic operations, to remove anomalies in
insertion, deletion, and updating, and to reduce the need for
restructuring the database as new data types are added.
The normalization process defines a set of hierarchical
normal forms (NFs). Several normal forms have been
proposed, including 1NF, 2NF, 3NF, BCNF (Boyce-Codd
Normal Form), 4NF, PJNF (Project-Join Normal Form, also
known as 5NF) and so on.
First normal form (1NF)

When we transform entities or relationships into tabular
relations, there may be some relations in which there is
more than one value at the intersection of a row and a column.
A relation is in 1NF only when every such intersection holds a
single value.
Figure 14.19 An example of 1NF
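One common way to bring such a relation into 1NF is to repeat the row once for each of the multiple values. The sketch below does this in plain Python; the relation (a professor ID, a name and a list of courses) is a hypothetical example and is not the data of Figure 14.19.

```python
# Not in 1NF: the third column holds several values in one cell.
unnormalized = [
    ("P1", "Smith", ["CIS15", "CIS17"]),
    ("P2", "Jones", ["CIS19"]),
]

# 1NF: repeat the row once per course so every cell holds a single value.
first_normal_form = [
    (professor_id, name, course)
    for professor_id, name, courses in unnormalized
    for course in courses
]

for row in first_normal_form:
    print(row)
```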
Second normal form (2NF)

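In each relation we need to have a key (called a primary key)
on which all other attributes (column values) depend.
For example, if the ID of a student is given, it should be
possible to find the student's name. When the key is made up of
several attributes, 2NF requires the other attributes to depend
on the whole key, not on just part of it.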
Figure 14.20 An example of 2NF
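2NF is usually stated as: every non-key attribute depends on the whole primary key. The plain-Python sketch below illustrates this under assumed data that is not taken from Figure 14.20. In the hypothetical relation the key is (student ID, course number), but the student's name depends on the student ID alone, so the relation is split in two.

```python
# Key is (student_id, course_no); name depends only on student_id,
# which is a dependency on part of the key and violates 2NF.
grades_with_names = [
    ("S1", "CIS15", "Ann", "A"),
    ("S1", "CIS17", "Ann", "B"),
    ("S2", "CIS15", "Bob", "C"),
]

# Relation 1: student_id -> name (the key is student_id).
students = sorted({(sid, name) for sid, _course, name, _grade in grades_with_names})

# Relation 2: (student_id, course_no) -> grade (depends on the whole key).
grades = [(sid, course, grade) for sid, course, _name, grade in grades_with_names]

# The original relation can be rebuilt by joining the two on student_id.
name_of = dict(students)
rebuilt = [(sid, course, name_of[sid], grade) for sid, course, grade in grades]

print(students)
print(grades)
```

After the split each student's name is stored once, so changing it cannot leave inconsistent copies behind.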
Other normal forms

Other normal forms use more complicated dependencies
among attributes. We leave these dependencies to books
dedicated to the discussion of database topics.
14.8 OTHER DATABASE MODELS
The relational database is not the only database model

in use today. Two other common models are distributed
databases and object-oriented databases. We briefly
discuss these here.
Distributed databases
The distributed database model is not a new model, but is

based on the relational model. However, the data is stored on
several computers that communicate through the Internet or
a private wide area network. Each computer (or site)
maintains either part of the database or the whole database.
Fragmented distributed databases
In a fragmented distributed database, data is localized:
locally used data is stored at the corresponding site.
However, this does not mean that a site cannot access data
stored at another site. Access is mostly local, but
occasionally global.
Replicated distributed databases

In a replicated distributed database, each site holds an exact
replica of the data held at the other sites. Any modification to
data stored at one site is repeated exactly at every site. The
reason for having such a database is security against failure.
If the system at one site fails, users at that site can access
the data at another site.
Object-oriented databases
An object-oriented database tries to keep the advantages of

the relational model and at the same time allows applications
to access structured data. In an object-oriented database,
objects and their relations are defined. In addition, each
object can have attributes that can be expressed as fields.
XML
The language often associated with object-oriented
databases is XML (Extensible Markup Language). As we
discussed in Chapter 6, XML was originally designed to add
markup information to text documents, but it has also found
application in databases as a data format. XML is not itself a
query language; XML data is queried with companion languages
such as XPath and XQuery. XML can represent data with nested
structures.
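To make the idea of nested structures concrete, the sketch below parses a small XML document with Python's standard xml.etree.ElementTree module and reads the nested elements with a simple path expression. The document, its element names and its attribute names are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: each course element nests its own list of students.
document = """
<courses>
  <course no="CIS15">
    <name>Intro to C</name>
    <students>
      <student id="S1"/>
      <student id="S2"/>
    </students>
  </course>
  <course no="CIS17">
    <name>Intro to Java</name>
    <students>
      <student id="S2"/>
    </students>
  </course>
</courses>
"""

root = ET.fromstring(document)

# For every course, collect the IDs of the students nested inside it.
enrollment = {
    course.get("no"): [s.get("id") for s in course.findall("./students/student")]
    for course in root.findall("course")
}
print(enrollment)
```

In a relational design the same information would need a separate relation for the student lists, whereas the XML document keeps each list inside its course.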