
PAN African e-Network Project

PGDIT

DBMS
Semester - II
Session - 1

By- Mr. Gaurav Dubey

Module 1.

Introduction to DBMS

What is Database and Database Management System?


Data Model :- Definition and Types
Role of Database Administrator
File System Vs. DBMS Approach

Module 1. Introduction to DBMS


Advantage of Using DBMS
Data Independence : Logical and Physical
Schema and Instances
Architecture of Database Management System
Levels of Database Management System

What is a database
A database is any organized collection of data. Some
examples of databases you may encounter in your daily life
are:
a telephone book
T.V. Guide
airline reservation system
motor vehicle registration records
papers in your filing cabinet
files on your computer hard drive.

Data vs. information:


What is the difference?
What is data?
Data can be defined in many
ways. Information science
defines data as unprocessed
information.

What is information?
Information is data that
have been organized and
communicated in a
coherent and meaningful
manner.
Data is converted into
information, and
information is converted
into knowledge.
Knowledge: information
evaluated and organized
so that it can be used
purposefully.

Why do we need a database?

Keep records of our:


Clients
Staff
Volunteers
To keep a record of
activities and interventions;
Keep sales records;
Develop reports;
Perform research
Longitudinal tracking

What is the ultimate purpose of a database management system?

It is to transform:
Data -> Information -> Knowledge -> Action

More about database definition

What is a database?
It is an organized collection of data. A database management
system (DBMS) such as Access, FileMaker, Lotus Notes,
Oracle or SQL Server provides you with the
software tools you need to organize that data in a flexible
manner. It includes tools to add, modify or delete data
from the database, ask questions (or queries) about the
data stored in the database and produce reports
summarizing selected contents.

Database Management System (DBMS)


Definition:
A software system that allows users to define the
structure of a database, store and maintain data in
the database, and that provides controlled access to the
database.
A database management system is a complex piece of
software that usually consists of a number of modules.

Database Administrator (DBA)


A Database administrator (DBA) performs all activities
related to maintaining a successful database
environment.
DBA is said to be the custodian of Database.
DBA is a person or group of persons responsible for
managing the Database.

Responsibility of DBA Includes:


Designing, implementing, and maintaining the database system.
Establishing policies and procedures pertaining to the
management, security, maintenance, and use of the database
management system.
Training employees in database management and use.

Database administrator (DBA)


A DBA is expected to be knowledgeable of emerging
technologies and new design approaches.
A DBA usually has either a degree in Computer Science or some on-the-job
training with a particular database product, or more
extensive experience with a range of database products.
A DBA is usually expected to have experience with SQL (Structured
Query Language) and with one or more of the major database
management products, such as SAP and Oracle-based databases.

File System Vs. DBMS Approach


File-Based Approach : Each program defines and manages
its own data.
Drawbacks of using file systems to store data:
Data redundancy and inconsistency
Multiple file formats, duplication of information in
different files.
Difficulty in accessing data
Need to write a new program to carry out each new task

File System Vs. DBMS Approach


Data isolation: data is scattered across multiple files and formats.
Integrity problems:
Integrity constraints (e.g. account balance > 0)
become part of program code.
Hard to add new constraints or change existing
ones.
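
As a rough illustration of the point above, the sketch below moves the "account balance > 0" rule out of program code and into the schema. It uses Python's built-in sqlite3 module only as a convenient stand-in for a DBMS; the table name account, its columns and the sample values are assumptions made for this example.

```python
import sqlite3


def try_negative_balance():
    """Return True if the DBMS rejects a row that violates the CHECK constraint."""
    conn = sqlite3.connect(":memory:")
    # The rule "balance > 0" is declared once, in the schema (hypothetical table).
    conn.execute(
        "CREATE TABLE account ("
        " account_no TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance > 0))"
    )
    conn.execute("INSERT INTO account VALUES ('A-101', 500)")
    try:
        # No application code checks the value; the DBMS itself refuses it.
        conn.execute("INSERT INTO account VALUES ('A-102', -50)")
    except sqlite3.IntegrityError:
        return True
    finally:
        conn.close()
    return False
```

Changing the rule later means changing one constraint in the schema rather than every program that writes to the file.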

Drawbacks of using file systems (cont.)

Atomicity of updates:
Failures may leave database in an inconsistent
state with partial updates carried out.
E.g. transfer of funds from one account to another
should either complete or not happen at all
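
The funds-transfer case can be sketched as follows, assuming a hypothetical account table and using Python's sqlite3 module as a stand-in DBMS. Both updates run inside one transaction, so a failure in the second update undoes the first.

```python
import sqlite3


def make_bank():
    """Create an in-memory database with two sample accounts (assumed data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        " account_no TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO account VALUES (?, ?)", [("A-101", 500), ("A-102", 100)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; either both updates happen or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE account_no = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE account_no = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit would make the balance negative; the credit is undone too.
        return False


def balances(conn):
    """Return the current balances as a dict keyed by account number."""
    return dict(conn.execute("SELECT account_no, balance FROM account"))
```

A transfer that would overdraw the source account returns False and leaves both balances exactly as they were.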

Drawbacks of using file systems (cont.)


Concurrent access by multiple users:
Concurrent access is needed for performance.
Uncontrolled concurrent accesses can lead to
inconsistencies.
E.g. two people reading a balance and
updating it at the same time.
Security problems.

Database Approach
Database Approach : A shared collection of logically
related data, designed to meet the information needs of
an organization.
Database systems offer solutions to all the above
problems.

Advantages of DBMS
Controlled data redundancy:
Data consistency:
More information from the same amount of data
Sharing of data:

Advantages of DBMS
Increased concurrency.
Improved data integrity:
Improved backup and recovery services

Disadvantages of DBMS

Complexity & Size,

Cost of Software & Additional H/W costs

Cost of conversion, Performance,

Higher impact of a failure.

Architecture of DBMS
Three-tier / three-level architecture suggested by ANSI/SPARC

Architecture of DBMS
A commonly used approach to views of data is the three-level
architecture suggested by ANSI/SPARC (American National
Standards Institute / Standards Planning and Requirements
Committee).

Architecture of DBMS
ANSI/SPARC produced a final report in 1977. The
report proposed an architectural framework for
databases. Under this approach, a database is
considered as containing data about an enterprise.
The three levels of the architecture are three different
views of the data.

Levels of Abstraction
Many external schemata, a
single conceptual (logical)
schema and a single physical
schema.
External schemata describe
how users see the data.
Conceptual schema defines
logical structure
Physical schema describes the
files and indexes used.

[Figure: External Schema 1, External Schema 2 and External Schema 3 above a single Conceptual Schema, which is above the Physical Schema.]

Database Design
Conceptual design
Logical design
Physical design

External level
The external level is the view that the individual user of
the database has.
This view is often a restricted view of the database and
the same database may provide a number of different
views for different classes of users.
In general, the end users and even the applications
programmers are only interested in a subset of the
database.

External Level
For example:
A department head may only be interested in the
departmental finances and student enrolments but not
the library information.
The librarian would not be expected to have any interest in
the information about academic staff.
The payroll office would have no interest in student
enrolments

Conceptual Level
The conceptual view is the overall community view of the
database and it includes all the information that is going
to be represented in the database.
The conceptual view is defined by the conceptual
schema which includes definitions of each of the various
types of data.

Internal Level
The internal view is the view about the actual physical
storage of data.
It tells us what data is stored in the database and how.
At least the following aspects are considered at this
level:

Data Independence
Applications insulated from how data is structured and stored.
Logical data independence: Protection from changes in
logical structure of data.
Physical data independence: Protection from changes in
physical structure of data.

* One of the most important benefits of using a DBMS!

Internal Level
Storage allocation.
Access paths e.g. specification of primary and secondary
keys, indexes and pointers and sequencing.
Miscellaneous e.g. data compression and encryption
techniques, optimization of the internal structures.

Architecture of DBMS

Architecture of DBMS
Physical level describes how a record (e.g., customer) is
stored.
Logical level: describes data stored in database, and the
relationships among the data.
type customer = record
name : string;
street : string;
city : string;
end;
View level: application programs hide details of data types.
Views can also hide information (e.g., salary) for security
purposes.
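
A minimal sketch of the view level, assuming a hypothetical employee table with a salary column and using Python's sqlite3 module: the view employee_public exposes only the non-sensitive columns, so a user who queries the view never sees salaries.

```python
import sqlite3


def view_columns():
    """Return the column names and rows visible through a restricted view."""
    conn = sqlite3.connect(":memory:")
    # Logical level: the full record, including the sensitive salary field.
    conn.execute(
        "CREATE TABLE employee ("
        " emp_id INTEGER PRIMARY KEY, name TEXT, city TEXT, salary INTEGER)"
    )
    conn.execute("INSERT INTO employee VALUES (1, 'John', 'Delhi', 50000)")
    # View level: a named query that leaves the salary column out.
    conn.execute(
        "CREATE VIEW employee_public AS SELECT emp_id, name, city FROM employee"
    )
    cur = conn.execute("SELECT * FROM employee_public")
    cols = [d[0] for d in cur.description]
    rows = cur.fetchall()
    conn.close()
    return cols, rows
```

In a multi-user DBMS the view would normally be combined with access privileges on the underlying table; SQLite has no user accounts, so that part is not shown.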

Level for a database system

Data Model
Information systems and computer sciences use data
modeling to manage and organize large quantities of
structured and unstructured data.
A data model describes the information to be stored in vast
database management systems like relational databases.
Data models do not include unstructured data such as
email messages, word processing documents.

Data Model
Data modeling establishes implicit and explicit constraints
and limitations on the structured data.
Data Modeling Analysts use data modeling functions to
supply an accurate representation of the enterprise.
Data modeling is used to accurately reflect the data of
the organization. Based on this information, a database
is created.

Data Model (Definition)


A generalized, user-defined view of data representing the real
world.
A description of the structure of data elements.
Collection of concepts allowing for the representation of an
environment according to arbitrary requirements.
A diagram that shows the various subjects about which
information is stored, and illustrates the relationships between
those subjects

Data Model (Definition)


A logical map that represents the inherent properties of the data
independent of software, hardware or machine performance
considerations.
The model shows data elements grouped into records, as well as
the association around those records.
A data model is a collection of descriptions of data structures and
their contained fields, together with the operations or functions
that manipulate them.

Types of Data Model


There are a number of data models that are used to describe
how a database is structured and used. These are:

Hierarchical Model.
Network Model.
Relational Model.
Object Relational Model.
Entity-Relationship Model

Hierarchical Model.
The hierarchical data model organizes data in a tree
structure.
There is a hierarchy of parent and child data segments.
This structure implies that a record can have
repeating information, generally in the child data
segments.
Data is stored as a series of records, each of which has a set of field
values attached to it.
It collects all the instances of a specific record
together as a record type.

Hierarchical Model.
These record types are the equivalent of tables in the
relational model, and with the individual records being
the equivalent of rows.
Today, the hierarchical model is rarely used in modern
databases.
It is, however, still used for storing information in areas ranging
from geographic data and file systems to the Windows registry and
XML documents.
IBM's Information Management System (IMS) is an example of a
popular hierarchical DBMS.

Network Model
The popularity of the network data model coincided with
the popularity of the hierarchical data model.
Some data were more naturally modeled with more than
one parent per child. So, the network model permitted
the modeling of many-to-many relationships in data.

In 1971, the Conference on Data Systems Languages
(CODASYL) formally defined the network model.

Network Model
The basic data modeling construct in the
network model is the set construct. A set
consists of an owner record type, a set name,
and a member record type.
The data model is a simple network, in which link
and intersection record types may also appear.

Relational Model.
RDBMS (relational database management system)
A database based on the relational model developed
by E.F. Codd.
A relational database allows the definition of data
structures, storage and retrieval operations and
integrity constraints.
In such a database the data and relations between
them are organized in tables.
A table is a collection of records and each record in a
table contains the same fields.

A Sample Relational Database

Object Relational Model


Object Relational Database Management Systems
(ORDBMS) add new object storage capabilities to the
relational systems at the core of modern information
systems.
These new facilities integrate management of traditional
fielded data, complex objects such as time-series and
geospatial data and diverse binary media such as audio,
video, images, and applets.
By encapsulating methods with data structures, an
ORDBMS server can execute complex analytical and
data manipulation operations to search and transform
multimedia and other complex objects.

Entity-Relationship Model
E-R model of real world
Entities : Real world objects (Includes both
living or Non-living)
E.g. customers, accounts, branch.
Entities are of two types: Strong Entity and
Weak Entity

Entity-Relationship Model
Relationships: Association between instances of
entities
E.g. Account A-101 is held by customer Johnson
Relationships are of three types:
One-to-One (1:1)
One-to-Many (1:M)
Many-to-Many (M:N)

Entity-Relationship Model
Most widely used for database design.
A database design in the E-R model is usually converted to a
design in the relational model, which is used for
storage and processing.

Entity-Relationship Model

Instances and Schemas


Schema: the logical structure of the database
e.g., the database consists of information about a
set of customers and accounts and the
relationship between them.
Physical schema: database design at the
physical level.
Logical schema: database design at the logical
level

Instances and Schemas


Instance: the actual content of the database at a
particular point in time.
Analogous to the value of a variable
Occurrences of an Entity.
For Example: Name : John , City : Delhi and Age: 23

Physical Data Independence

The ability to modify the physical schema without
changing the logical schema.
Applications depend on the logical schema
In general, the interfaces between the various levels
and components should be well defined so that
changes in some parts do not seriously influence
others.

Physical Data Independence


Alterations in the internal schema might
include:
* Using a new storage device.
* Switching from one access method to
another.
* Using different file organizations or storage
structures.
* Modifying indexes.

Logical Data Independence:


Logical data independence is the ability to modify the
conceptual schema without having to alter the external
schemas or application programs.
Alterations in the conceptual schema may include the
addition or deletion of entities, attributes or
relationships. These should be possible without
altering existing external schemas or having to
rewrite application programs.
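
One way to picture logical data independence, sketched with Python's sqlite3 module: an application reads through a view (its external schema), the conceptual schema then gains a new attribute, and the application's query and its result stay the same. The table customer, the view customer_names and the added email column are assumptions made for this example.

```python
import sqlite3


def query_before_and_after_schema_change():
    """Run the same application query before and after adding an attribute."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
    )
    conn.execute("INSERT INTO customer VALUES (1, 'Johnson', 'Delhi')")
    # External schema: the only thing the application knows about.
    conn.execute(
        "CREATE VIEW customer_names AS SELECT cust_id, name FROM customer"
    )
    app_query = "SELECT cust_id, name FROM customer_names"

    before = conn.execute(app_query).fetchall()
    # Conceptual schema change: a new attribute is added to the entity.
    conn.execute("ALTER TABLE customer ADD COLUMN email TEXT")
    after = conn.execute(app_query).fetchall()
    conn.close()
    return before, after
```

Deleting an attribute that a view depends on is harder to hide, which is one reason logical data independence is more difficult to achieve in full than physical data independence.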

Functions of a DBMS
1.Data storage, retrieval, and update:
Support of Query Language

2. A user-accessible catalog:
Data Dictionary

3. Transaction support:
Transaction Manager

4. Concurrency control services:


Lock Manager

5. Recovery services.

Functions of a DBMS
6. Authorization services
7. Support for data communication
8. Integrity services
9. Services to promote data independence

Components of the DBMS Environment


Hardware
Software
Data
Procedures
People

Components of a DBMS
[Figure: components of a DBMS. Labels recoverable from the diagram: programmers (application programs), users (queries), DBA (database schema); preprocessor, query processor, DDL compiler, program object code; database manager, dictionary manager, access methods, file manager, system buffers; database and system catalog.]

Category of Database User :

Database Designer or Administrator.


Application Programmer.
End User.

Database Language

DDL: Data Definition Language.


DML: Data Manipulation Language.
DQL: Data Query Language.
DCL: Data Control Language

DDL: Data Definition Language.


Example:
CREATE Statement.
ALTER ADD Statement.
ALTER DROP Statement.
ALTER MODIFY Statement

DML: Data Manipulation Language.


Example:
Insert Statement.
Update Statement.
Delete Statement.

DQL: Data Query Language


Example:
Select Statement.
With Where Clause.
With Order By Clause.
With group By Clause

DCL: Data Control Language


Example:
Grant Statement.
Revoke Statement.

Example: Creating Table:

CREATE TABLE STATION
(ID number(6) PRIMARY KEY,
CITY CHAR(20),
STATE CHAR(10),
zip_code number(6));

Example: Inserting Data into Table:


INSERT INTO STATION
VALUES (13, 'Phoenix', 'AZ', 33112);
INSERT INTO STATION
VALUES (44, 'Denver', 'CO', 40105);

Example: Retrieving Data from Table:


SELECT * FROM STATION;
SELECT ID, CITY FROM STATION;
SELECT STATE , CITY FROM STATION;
SELECT STATE , CITY FROM STATION
WHERE CITY = 'MUMBAI';
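
The STATION statements can be tried end to end without an Oracle installation. The sketch below runs them through Python's built-in sqlite3 module, which accepts the same column definitions; this is a stand-in chosen for convenience, not the environment the slides assume, and the lookup on 'Denver' is an added example.

```python
import sqlite3


def station_demo():
    """Create STATION, insert the two sample rows, and run two queries."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE STATION ("
        " ID NUMBER(6) PRIMARY KEY,"
        " CITY CHAR(20),"
        " STATE CHAR(10),"
        " zip_code NUMBER(6))"
    )
    conn.execute("INSERT INTO STATION VALUES (13, 'Phoenix', 'AZ', 33112)")
    conn.execute("INSERT INTO STATION VALUES (44, 'Denver', 'CO', 40105)")

    all_rows = conn.execute("SELECT * FROM STATION ORDER BY ID").fetchall()
    # String literals in a WHERE clause must be quoted; here the value is
    # passed as a parameter, which avoids quoting mistakes altogether.
    denver = conn.execute(
        "SELECT STATE, CITY FROM STATION WHERE CITY = ?", ("Denver",)
    ).fetchall()
    conn.close()
    return all_rows, denver
```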

DELETING Data from Table:

-- Deletes every row in the table:
DELETE FROM STATION;

-- Deletes only the rows that match the condition:
DELETE FROM STATION
WHERE zip_code = 53461;

ALTERING STRUCTURE OF TABLE

ALTER TABLE STATION
ADD EMAIL VARCHAR2(12);
ALTER TABLE STATION
MODIFY CITY VARCHAR2(20);

Question & Answer


1. Database is defined as:

Collection of similar type of entities.


Processed data
Collection of logically related data items
Raw data

2. Data independence allows:

sharing the same database by several applications


extensive modification of applications
no data sharing between applications
elimination of several application programs

Question & Answer


3. One to Many Relationship is represented by:
a) 1:M
b) M:M
c) N:N
d) M:N

4. Entities are defined as:
a) Real world object
b) Association among the instances
c) Property of attributes
d) Property of DBMS

Question & Answer


5. By data redundancy in a file based system we mean that
(a) Unnecessary data is stored
(b) Same data is duplicated in many files
(c) Data is unavailable
(d) Files have redundant data

6. Overall logical structure of a database can be expressed
graphically by
(A). ER diagram
(B). Records
(C). Relations
(D). Hierarchy

Question & Answer


7. A table can have how many unique keys?

A). 1
B). any number
C). 255
D). None of the above.

8. Entity is represented by the symbol.

A) Double Circle
B) Ellipse
C) Rectangle
D) Square

Question & Answer


9. Attributes are

i) Properties of relationship
ii) Degree to entities
iii) Properties of members of an entity set
(a) i
(b) i and ii
(c) i and iii
(d) iii

10. A relationship is
a) an item in an application
b) a meaningful dependency between entities
c) a collection of related entities
d) related data

Thank You
Please forward your query
To: skjha2@amity.edu
CC: manoj.amity@panafnet.com

PAN African e-Network Project


PGDIT

DBMS
Semester - II
Session - 2

By- Mr. Gaurav Dubey

Module 2. Relational Database & ER Model

Entity , Entity Set & Type


Attributes
Weak & Strong Entity
Relationship Types
E-R-Diagram

Module 2. Relational Database & ER Model

Relational system
Codd's Rules
Optimization
Table & View

Relational Model
RDBMS: Relational Database Management
System

A database based on the relational model
developed by E.F. Codd.
A relational database allows the definition
of data structures, storage and retrieval
operations and integrity constraints.

Relational Model
In such a database the data and relations
between them are organized in tables.
- List of all logically related data are placed into
one table or set of tables.
A table is a collection of records and each record
in a table contains the same fields.

Codd's rules
Codd's 12 rules are a set of twelve rules
proposed by Edgar F. Codd,
a pioneer of the relational model for databases.
They are designed to define what is required from a
database management system in order for it to
be considered a relational DBMS.

Codd's rules
Rule 1: The Information Rule.
All data should be presented to the user in table form.

Codd's rules
Rule 2: Guaranteed Access Rule.
All data should be accessible without ambiguity.
This can be accomplished through a
combination of the table name, primary key
and column name.

Codd's rules
Rule 3:
Systematic Treatment of Null Values.
A field should be allowed to remain empty.
This involves the support of a null value which is
distinct from an empty string or a number with a
value of zero.
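
The difference between a null, an empty string and zero can be checked directly. The sketch below uses Python's sqlite3 module and a hypothetical contact table; the names Asha and Ravi are made-up sample data.

```python
import sqlite3


def null_vs_empty():
    """Show that NULL matches neither the empty string nor zero."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE contact (name TEXT, phone TEXT, visits INTEGER)")
    conn.executemany(
        "INSERT INTO contact VALUES (?, ?, ?)",
        [
            ("Asha", None, None),  # phone and visits are unknown (NULL)
            ("Ravi", "", 0),       # phone is known to be empty, visits is zero
        ],
    )
    # NULL is tested with IS NULL, never with "=".
    missing = [r[0] for r in conn.execute(
        "SELECT name FROM contact WHERE phone IS NULL")]
    empty = [r[0] for r in conn.execute(
        "SELECT name FROM contact WHERE phone = ''")]
    zero = [r[0] for r in conn.execute(
        "SELECT name FROM contact WHERE visits = 0")]
    conn.close()
    return missing, empty, zero
```

Only the row with unknown values is returned by the IS NULL test, and it is returned by neither of the other two queries.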

Codd's rules
Rule 4:
Dynamic On-Line Catalog Based on the
Relational Model.
A relational database must provide access to its
structure through the same tools that are used to
access the data.
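
As an illustration of Rule 4, the sketch below reads a database's own structure with an ordinary SELECT. It relies on SQLite's catalog table sqlite_master, which is specific to SQLite; other products expose their catalogs under different names. The customer and account tables are assumptions made for this example.

```python
import sqlite3


def list_tables():
    """Query the catalog with the same SQL used for ordinary data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE account (account_no TEXT PRIMARY KEY, balance INTEGER)")
    # sqlite_master is SQLite's catalog: one row per table, index, view, etc.
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]
```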

Codd's rules
Rule 5: Comprehensive Data Sublanguage
Rule.
The database must support at least one clearly
defined language that includes functionality for
data definition, data manipulation, data integrity
and database transaction control.
All commercial relational databases use forms of
the standard SQL (Structured Query Language)

Codd's rules
Rule 6:
View Updating Rule.
Data can be presented to the user in different
logical combinations called views.
Views are in practice categorized as:
Updatable View & Non-Updatable View.

Rule 6:
Each view should support the same full range of
data manipulation that direct-access to a table
has available.
In practice providing update and delete access
to logical views is difficult and is not fully
supported by any current database.

Codd's rules
Rule 7:
High-level Insert, Update and Delete.
Data can be retrieved from a relational
database in sets constructed of data from
multiple rows and / or multiple tables.
This rule states that insert, update and delete
operations should be supported for any
retrievable set rather than just for a single row in
a single table

Codd's rules
Rule 8:
Physical Data Independence.
The user is isolated from the physical method of
storing and retrieving information from the
database.
Changes can be made to the underlying
architecture (hardware, disk storage methods)
without affecting how the user accesses it

Codd's rules
Rule 9:
Logical Data Independence.
How a user views data should not change when
the logical structure (table structure) of the
database changes. This rule is particularly
difficult to satisfy. Most databases rely on strong
ties between the user view of the data and the
actual structure of the underlying tables.

Codd's rules
Rule 10: Integrity Independence.
The Database Language (like SQL)
should support constraints on user input
that maintain database integrity.
Database System should not accept any
invalid input from user side.

Codd's rules
Rule 10: This rule is not fully implemented by
most major vendors.
At a minimum all databases do preserve two
constraints through SQL.
No component of a primary key can have a null
value.
If a foreign key is defined in one table any value
in it must exist as a primary key in another table.

Codd's rules
Rule 11:
Distribution Independence: A user should be
totally unaware of whether or not the database is
distributed (whether parts of the database exist
in multiple locations).
A variety of reasons make this rule difficult to
implement.

Codd's rules
Rule 12:
Nonsubversion Rule: There should be no way
to modify the database structure other than
through the multiple row database language.
Most databases today support administrative
tools that allow some direct manipulation of the
data structure.

Question & Answer


1. How many Integrity Rules are there, and what are they?

2. Distribution Independence means:

3. Differentiate between Table and View.

Question & Answer

4. Systematic Treatment of Null Values means:

5. Physical Data Independence means:

6. Logical Data Independence means:

Question & Answer

7. Updatable View:

8. Non-Updatable View:

Entity-Relationship Diagrams (ERD)


An entity-relationship ( ER ) diagram is a
specialized graphical method that illustrates the
interrelationships between entities in a
database.
ER diagrams often use symbols to represent
three different types of information in designing
database.

Entity-Relationship Diagrams (ERD)


Boxes are commonly used to represent entities.
Ovals are used to represent attributes.
Diamonds are normally used to represent
relationships.

Symbols for Drawing E-R-D

Entity
A person, place, object, event or concept in the user
environment about which the organization wishes to
maintain data
Represented by a rectangle in E-R diagrams
Entity Type / Set
A collection of entities that share common properties
or characteristics.
e.g. Student Entity Set, Customer Entity Set
Attribute
A named property or characteristic of an entity that is
of interest to an organization.
e.g. registration_no, customer_id, customer_emailid

Entities are of Two Types:


Strong Entities.
Weak Entities.

Strong and Weak Entities

Relationship
An association between the instances of one
or more entity types that is of interest to the
organization.
Relationships are always labeled with verb
phrases

Relationship
Avoid vague names
Guidelines for defining relationships
Definition explains what action is being taken
and why it is important
Give examples to clarify the action
Explain reasons for any maximum cardinality.

Few more Symbols used for drawing E-R-D

Example:( Types of Attributes)


Simple Attributes: Phone_No, Email_Id
Multi-Valued Attributes:
Employee skill set, Hobbies.
Derived Attributes:
Age from DOB, Gross salary from basic salary.
Composite Attributes:
Address comprises Name, Locality & House No.

Name comprises First Name, Middle Name, Last Name.

Question & Answer


WHAT IS DIFFERENCE BETWEEN
SIMPLE ATTRIBUTE AND COMPOSITE
ATTRIBUTE?
WHAT IS DIFFERENCE BETWEEN
DERIVED ATTRIBUTE AND MULTI VALUED
ATTRIBUTE?

Guidelines for Defining relationships


Explain any restrictions on participation in the
relationship
Explain extent of the history that is kept in the
relationship
Explain whether an entity instance involved in
a relationship instance can transfer
participation to another relationship instance.

Example: One to One Relationship Type:


Example:

DEPARTMENT

DIRECTOR

Example: One to Many Relationship Type:

PRODUCT

VENDOR

Example: One to Many Relationship Type:

course

STUDENT

Many to Many Relationship Type:

Example:
INSTRUCTOR

STUDENT

Many to Many Relationship Type:

INSTRUCTOR

COURSE

Resolving Many-to-Many Relationships

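Many-to-many relationships cannot be implemented directly as a
single link between two tables, so they should be resolved.

We can resolve a many-to-many relationship by
dividing it into two one-to-many relationships through an
intermediate entity.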
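
The resolution described above can be sketched with Python's sqlite3 module. The many-to-many link between INSTRUCTOR and STUDENT becomes two one-to-many links through a third table; the table name teaches, the column names and the sample rows are all assumptions made for this example.

```python
import sqlite3


def build_enrolment():
    """Create instructor, student and the intermediate table that links them."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE instructor (instructor_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
        -- One row per (instructor, student) pair: each side is now 1:M.
        CREATE TABLE teaches (
            instructor_id INTEGER NOT NULL REFERENCES instructor(instructor_id),
            student_id INTEGER NOT NULL REFERENCES student(student_id),
            PRIMARY KEY (instructor_id, student_id)
        );
        INSERT INTO instructor VALUES (1, 'Rao'), (2, 'Khan');
        INSERT INTO student VALUES (10, 'Asha'), (11, 'Ravi');
        INSERT INTO teaches VALUES (1, 10), (1, 11), (2, 10);
    """)
    return conn


def students_of(conn, instructor_name):
    """List the students taught by one instructor."""
    rows = conn.execute(
        "SELECT s.name FROM student s"
        " JOIN teaches t ON t.student_id = s.student_id"
        " JOIN instructor i ON i.instructor_id = t.instructor_id"
        " WHERE i.name = ? ORDER BY s.name",
        (instructor_name,),
    )
    return [r[0] for r in rows]


def instructors_of(conn, student_name):
    """List the instructors who teach one student."""
    rows = conn.execute(
        "SELECT i.name FROM instructor i"
        " JOIN teaches t ON t.instructor_id = i.instructor_id"
        " JOIN student s ON s.student_id = t.student_id"
        " WHERE s.name = ? ORDER BY i.name",
        (student_name,),
    )
    return [r[0] for r in rows]
```

The composite primary key on the intermediate table prevents the same pair from being recorded twice.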

Employees of a large company, e.g., IBM, where
an employee reports to a manager. The
manager is also an employee who reports to
another manager. This chain of command
continues to the very top where the CEO is the
only employee who is not reporting to a
manager. Draw the ER diagram for this
example.

[Figure: ER diagram with the entity Emp (attributes SS#, name, address) and the recursive relationship Works-for.]

Primary keys:
Emp: SS#
Works-for: (empSS#, mgrSS#)
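
One possible relational rendering of this diagram, sketched with Python's sqlite3 module: emp holds the employees and works_for pairs each employee with a manager, using the keys listed above. The column names ss_no, emp_ss and mgr_ss, the sample people, and the assumption that the reporting chain has no cycles are all made up for this example.

```python
import sqlite3


def build_company():
    """Create emp and works_for with a three-level reporting chain."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE emp (ss_no INTEGER PRIMARY KEY, name TEXT, address TEXT);
        -- Both columns refer back to emp: the relationship is recursive.
        CREATE TABLE works_for (
            emp_ss INTEGER NOT NULL REFERENCES emp(ss_no),
            mgr_ss INTEGER NOT NULL REFERENCES emp(ss_no),
            PRIMARY KEY (emp_ss, mgr_ss)
        );
        INSERT INTO emp VALUES (1, 'Meera', 'HQ'), (2, 'Arun', 'HQ'), (3, 'Lata', 'Branch');
        -- The CEO (Meera) has no works_for row because she reports to nobody.
        INSERT INTO works_for VALUES (2, 1), (3, 2);
    """)
    return conn


def chain_of_command(conn, emp_ss):
    """Return the names of the managers above an employee, nearest first."""
    chain = []
    current = emp_ss
    while True:
        row = conn.execute(
            "SELECT e.name, w.mgr_ss FROM works_for w"
            " JOIN emp e ON e.ss_no = w.mgr_ss WHERE w.emp_ss = ?",
            (current,),
        ).fetchone()
        if row is None:  # reached the employee with no manager
            break
        chain.append(row[0])
        current = row[1]
    return chain
```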

RELATIONSHIPS (Cont)
Example: A library database contains a listing of
authors that have written books on various
subjects (one author per book).
It also contains information about libraries that
carry books on various subjects.
Entity sets: authors, subjects, books, libraries
Relationship sets: wrote, carry, indexed.

RELATIONSHIPS (Cont)

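[Figure: ER diagram of the library database. Entity sets: authors, books, libraries, subject. Relationship sets: wrote, carry, index. Attribute labels: SS#, name, address, isbn, title, subject matter.]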

Keys
Entities and relationships are distinguishable
using various keys:
A key is a combination of one or more
attributes that uniquely identifies the instances of an
entity set or relationship,
e.g., social-security number, Member-id,
or the combination of order_id and Product_id.

Candidate key
A candidate key is a minimal set of attributes that uniquely
identifies either an entity or a relationship,
e.g., social-security number,
phone number,
employment_id,
email_id.

Alternate Key
Alternate Key: An entity set can have
several candidate keys. Among them only
one can be selected as the primary key.
The remaining ones are known as alternate keys.
Alternate Keys = Candidate Keys - Primary Key

Keys
A primary key is a candidate key that is chosen
by the database designer to identify the entities
of an entity set.
Simple Primary Key
Composite Primary Key

Criteria for Selecting Primary Key.


Only those columns should be selected as
primary key which are permanent in nature or
very unlikely to change.
Column values are mandatory (they cannot be null).
Example: employment_no, registration_no

Question and Answer

Example: Employee data has to be stored in
a table containing the following information,
i.e. emp_id, emp_name, emp_city, emp_age,
emp_emailid, emp_designation, emp_salary,
emp_phone_no.
Identify the possible primary keys, i.e. the candidate keys.

Question and Answer

Which column should be considered as the primary
key, and why?
Considering the previous example, mention all
the alternate keys.

Keys
A foreign key is a set of one or more attributes in one
table that refers to the primary key of another table.
In the E-R model, the primary key of a weak entity set is formed
by the primary key of the strong entity set on
which it is existence-dependent, plus the discriminator
of the weak entity set.

[Figure: EMPLOYEE (Emp_Id, Name, D_ID) and DEPARTMENT (D_ID, D_Name, D-Address); D_ID in EMPLOYEE refers to D_ID in DEPARTMENT.]
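
A minimal sketch of the EMPLOYEE / DEPARTMENT link, assuming D_ID in EMPLOYEE is meant as a foreign key to DEPARTMENT and using Python's sqlite3 module. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON, which is a SQLite-specific detail; the sample department and employees are made up.

```python
import sqlite3


def make_company_db():
    """Create DEPARTMENT and EMPLOYEE with one sample department."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
    conn.execute(
        "CREATE TABLE department ("
        " d_id INTEGER PRIMARY KEY, d_name TEXT, d_address TEXT)"
    )
    conn.execute(
        "CREATE TABLE employee ("
        " emp_id INTEGER PRIMARY KEY, name TEXT,"
        " d_id INTEGER REFERENCES department(d_id))"
    )
    conn.execute("INSERT INTO department VALUES (10, 'Accounts', 'Block A')")
    conn.commit()
    return conn


def insert_employee(conn, emp_id, name, d_id):
    """Insert an employee; return False if the department does not exist."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO employee VALUES (?, ?, ?)", (emp_id, name, d_id)
            )
        return True
    except sqlite3.IntegrityError:
        # The foreign key value has no matching primary key in department.
        return False
```

An employee assigned to department 10 is accepted, while one assigned to a department number that does not exist is rejected by the DBMS.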

Question & Answer


Differentiate between Foreign Key and Primary
Key.
Differentiate between Unique Key and Primary
Key.

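[Figure: ER diagram of the library database. Entity sets: authors, books, libraries, subject. Relationship sets: wrote, carry, index. Attribute labels: SS#, name, address, isbn, title, subject matter, quantity.]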

Example

Consider the example of a database that contains information on the
residents of a city. The ER diagram shown in the image below
contains two entities: people and cities. There is a single "Lives In"
relationship.

Cardinality:
The number of instances of entity B that can be
associated with each instance of entity A
Minimum Cardinality
The minimum number of instances of entity B that
may be associated with each instance of entity A
This is also called modality.

Maximum Cardinality
The maximum number of instances of entity B that
may be associated with each instance of entity A

How do we start an ERD?


Define Entities: These are usually nouns used
in descriptions of the system, in the discussion
of business rules, or in documentation.
Example: Customer, Supplier, Faculty, Student.
Add attributes to the relations; these are
determined by the queries, e.g. grade; or they
may suggest the need for keys or identifiers.

How do we start an ERD?

Define Relationships: these are usually verbs
used in descriptions of the system or in
discussion of the business rules.
Example: Registered, Supplied, Orders.
Add cardinality to the relations.
ERD, but they can be used with clients to
discuss business rules.

Goal
Capture as much of the meaning of the data as
possible
If you know the rules of normalization, referential
integrity, foreign keys, etc., this is good but not as
important now.
Much more important is to get the organizational
data model correct, i.e. to understand the actual
data requirements for the organization.
Result
A better design that is scalable and easier to
maintain

Database Modeling and Implementation


Process:
Ideas -> ER Design -> Relational Schema -> Relational DBMS Implementation

Database optimization
Database optimization aims to maximize the use of
system resources to perform work as efficiently and
rapidly as possible.
Put the most selective data element first in the index, i.e. the
element that has the biggest variety of values.
The index will then find the correct page faster.
Keep indexes small.

Database optimization
It's better to have an index on just zip code or postal
code. (Simple Attributes)
The smaller the index, the better the response time.
For high frequency functions (thousands of times per
day) it can be wise to have a very large index, so the
system does not even need the table for the read
function.

Database optimization
Indexes are used to find rows with specific column
values fast.
Without an index, MySQL has to start with the first
record and then read through the whole table to find the
relevant rows.
The larger the table, the more this costs.

Database optimization
If the table has an index for the columns in question,
MySQL can quickly determine the position to seek to in
the middle of the data file without having to look at all the
data.
If a table has 1,000 rows, this is at least 100 times faster
than reading sequentially.
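
The effect of an index on how a lookup is executed can be observed from the query plan. The slides describe MySQL; the sketch below swaps in SQLite (through Python's sqlite3 module) because it runs without a server, so the plan wording is SQLite's and differs between SQLite versions. The station table, the index name idx_station_zip and the generated rows are assumptions made for this example.

```python
import sqlite3


def plan_for_zip_lookup(with_index):
    """Return SQLite's query plan text for a lookup by zip_code."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE station (id INTEGER PRIMARY KEY, city TEXT, zip_code INTEGER)"
    )
    conn.executemany(
        "INSERT INTO station (city, zip_code) VALUES (?, ?)",
        [("city%d" % i, 10000 + i) for i in range(1000)],
    )
    if with_index:
        conn.execute("CREATE INDEX idx_station_zip ON station (zip_code)")
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT city FROM station WHERE zip_code = 10500"
    ).fetchall()
    conn.close()
    # The last column of each plan row is the human-readable description.
    return " ".join(r[-1] for r in rows)
```

Without the index the plan reports a scan of the whole table; with it, the plan reports a search that uses idx_station_zip.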

Database optimization
For small tables an index brings little benefit, since reading the whole table is already cheap.
An index slows down additions, modifications and deletes.
It's not just the table that needs an update, but the index as
well. So, preferably, add an index for values that are often
used for a search, but that do not change much. An index
on bank account number is better than one on balance.
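
The effect of an index on a lookup can be observed directly. The sketch below is only an illustration: it uses SQLite (from the Python standard library) as a stand-in for MySQL, and the table `customer`, its columns, and the index name `idx_customer_zip` are assumptions made up for this example.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, zip TEXT)")
conn.executemany(
    "INSERT INTO customer (name, zip) VALUES (?, ?)",
    [(f"name{i}", f"{10000 + i % 500}") for i in range(1000)],
)

def plan(sql):
    """Return the human-readable steps SQLite would use to run a query."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

query = "SELECT name FROM customer WHERE zip = '10042'"

plan_without_index = plan(query)   # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_customer_zip ON customer (zip)")
plan_with_index = plan(query)      # the small single-column index is searched

print(plan_without_index)
print(plan_with_index)
```

The exact wording of the plan differs between SQLite versions, but the first plan reports a scan of the table and the second reports a search using the index.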

Question & Answer


1. Primary key column of the Table can accept null
values
a) True
b) False.
2. Composite primary key consists of more than one column.
a) True
b) False

Question & Answer


3. Primary key is a possible candidate key
a) True
b) False

4. Relationships are defined as :
a) Diamond symbol
b) Rectangle
c) Double Rectangle
d) Oval

Question & Answer


5. Independence of data / Application program on physical
storage method of database is called.
(a) Logical Data Independence
(b) Null Value Treatment
(c) Physical Data Independence
(d) Table

6. Multi-Valued Attributes are defined as


(A). Double Rectangle
(B). Double ellipse
(C). Arrow symbol
(D). Diamond

Question & Answer


7. Independence of data/ Application program on
logical storage of database is called.
(a) Logical Data Independence
(b) View
(c) Physical Data Independence
(d) Table.

8. Foreign Key Means


a) Same as primary key
b) Column of a Table which is related to primary key column of another Table.
c) Can't accept null values
d) None of the above.

Question & Answer


9. Attributes are
i) Properties of relationship
ii) Degree to entities
iii) Properties of members of an entity set

(a) i
(b) i and ii
(c) i and iii
(d) iii

10. Candidate Key
a) Possible Primary key
b) Set of similar attributes
c) a collection of related entities
d) related data

Thank You
Please forward your query
To: gdubey@amity.edu
CC: manoj.amity@panafnet.com

PAN African e-Network Project


PGDIT

DBMS
Semester - II
Session - 3
Ms. Archana Singh

Module : Database design


What is Functional Dependency (FD)
Types of Functional Dependency
Full FD
Partial FD
Transitive FD

Definition of Normalization
Different Normal Forms (NF)
1 NF : First Normal Form
2 NF : Second Normal Form
3 NF : Third Normal Form

Module : Database design

BCNF: Boyce-Codd Normal Form


4 NF: Fourth Normal Form
5 NF : Fifth Normal Form
Examples
Question & Answer

Definition: Functional Dependency


In a given relation R with attributes A and B, B is said to be
functionally dependent on A if, for every valid occurrence, the
value of A determines the value of B.
Functional dependency is symbolically represented as:

A → B

It is read as: B is functionally dependent on A.

Definition: Functional Dependency


FD is an acronym for Functional Dependency.
It represents integrity constraints.
FDs are checked by the database management
system (DBMS) at every update.
So we are interested in finding the smallest set
of FDs that capture the intended meaning of
the data.

Definition: Functional Dependency


A more formal definition:
Given a relation R, and X and Y, arbitrary subsets of the
attributes of R, then Y is functionally dependent on X:
X → Y
if and only if each X-value in R is associated with
precisely one Y-value in R.
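
The definition above can be checked mechanically against a sample of rows. The following is a minimal sketch, not part of the original slides; the function name `fd_holds` and the sample student rows are assumptions for illustration. Note that sample data can only disprove a functional dependency; whether it really holds is a statement about the meaning of the data.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if, in these rows, each lhs-value maps to exactly one rhs-value."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen, then returns the stored value
        if seen.setdefault(x, y) != y:
            return False
    return True

# Hypothetical sample instance of the STUDENT relation.
students = [
    {"S_ID": 1, "S_Name": "Asha", "S_City": "Delhi"},
    {"S_ID": 2, "S_Name": "Ravi", "S_City": "Delhi"},
    {"S_ID": 3, "S_Name": "Asha", "S_City": "Noida"},
]

id_determines_name = fd_holds(students, ["S_ID"], ["S_Name"])      # S_ID → S_Name
name_determines_city = fd_holds(students, ["S_Name"], ["S_City"])  # two cities for "Asha"
print(id_determines_name, name_determines_city)
```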

Use functional dependencies:


We use functional dependencies to test relations to see if they
are legal under a given set of functional dependencies.
If a relation R is legal under a set S of functional dependencies, we
say that R satisfies S.
We also use them to specify constraints on the set of legal relations:
we say that S holds on R if all legal relations on R satisfy the set of
functional dependencies S.

Example
Let us consider a STUDENT Relation with following
set of attributes:
(S_ID , S_Name , S_Age , S_City ,S_Course )
How can we select the determinant of the above relation?
An attribute which can uniquely identify each tuple of
the relation can be considered as the determinant.

Example
From the above Relation we can say that
S_ID → S_NAME
S_ID → S_AGE
S_ID → S_CITY
S_ID → S_COURSE

S_ID is determinant

Example
In the previous example S_ID is called the
Determinant of the Relation.
It is also known as the prime attribute or primary
key.
The remaining attributes of the relation are known as
non-prime attributes.

Functional Dependency: Types


Functional Dependency are categorized as
follows:

1. Full Functional Dependency


2. Partial Functional Dependency
3. Transitive Functional Dependency

1. Full Functional Dependency:


Given a relation R and two sets of its attributes, X and Y,
Y is said to be fully functionally dependent on X if each
value of X is associated with exactly one value of Y, and
there is no Z, where Z is a proper subset of X, on which Y
is already dependent.

It is represented as:

X → Y

Example
Let us Consider a Relation ORDER_INFO
With list of attributes:
( ORDER_ID , ITEM_ID, ITEM_DESCRIPTION ,PRICE ,
Quantity_ORDERED , TOTAL_ITEM_PRICE )

Example
In previous example
Set X is represented by : ORDER_ID , ITEM_ID
Set Y is represented by : ITEM_DESCRIPTION ,
PRICE , Quantity_ORDERED , TOTAL_ITEM_PRICE.
There also exists a Z, where Z is a subset of X.
Set Z is represented by ITEM_ID, on which
PRICE and ITEM_DESCRIPTION are dependent.

Example
ORDER_INFO
( ORDER_ID , ITEM_ID, ITEM_DESCRIPTION ,PRICE ,
QNTY_ORDERED , TOTAL_ITEM_PRICE )
Therefore full functional dependency does not exist in this
relation.
There exists a partial dependency in this relation.

Example
Relation ORDER_INFO can be brought to full functional dependency
only by decomposing it into two relations:
R1 (ORDER_ID , ITEM_ID, QNTY_ORDERED ,
TOTAL_ITEM_PRICE )
R2 ( ITEM_ID, ITEM_DESCRIPTION ,PRICE )

Partial Dependency
2. Partial Functional Dependency:
Given a relation R and two sets of its attributes, X and Y,
Y is said to be partially functionally dependent on X if each
value of X is associated with exactly one value of Y, and
there exists a Z, where Z is a proper subset of X, on which
Y is already dependent.

Example
ORDER_INFO
( ORDER_ID , ITEM_ID, ITEM_DESCRIPTION ,PRICE ,
QNTY_ORDERED , TOTAL_ITEM_PRICE )
There exist Partial dependency in this relation.
ITEM_DESCRIPTION and PRICE are partially
dependent on Item_id.
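
A partial dependency can be searched for by testing every proper subset of the composite key. The sketch below is illustrative only: the helper names and the sample ORDER_INFO rows are assumptions, and sample rows can only suggest, not prove, that a dependency holds.

```python
from itertools import combinations

def fd_holds(rows, lhs, rhs):
    """True if each lhs-value in rows is paired with exactly one rhs-value."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False
    return True

def partial_dependencies(rows, key, non_key):
    """Map each non-key attribute to the proper subsets of the key that determine it."""
    found = {}
    for attr in non_key:
        for size in range(1, len(key)):          # proper subsets only
            for subset in combinations(key, size):
                if fd_holds(rows, subset, [attr]):
                    found.setdefault(attr, []).append(subset)
    return found

# Hypothetical sample rows for ORDER_INFO.
order_info = [
    {"ORDER_ID": 1, "ITEM_ID": "A", "ITEM_DESCRIPTION": "Pen", "PRICE": 10, "QNTY_ORDERED": 2},
    {"ORDER_ID": 1, "ITEM_ID": "B", "ITEM_DESCRIPTION": "Ink", "PRICE": 50, "QNTY_ORDERED": 1},
    {"ORDER_ID": 2, "ITEM_ID": "A", "ITEM_DESCRIPTION": "Pen", "PRICE": 10, "QNTY_ORDERED": 5},
    {"ORDER_ID": 2, "ITEM_ID": "B", "ITEM_DESCRIPTION": "Ink", "PRICE": 50, "QNTY_ORDERED": 1},
]

partial = partial_dependencies(
    order_info,
    key=["ORDER_ID", "ITEM_ID"],
    non_key=["ITEM_DESCRIPTION", "PRICE", "QNTY_ORDERED"],
)
print(partial)
```

In this sample, ITEM_DESCRIPTION and PRICE are determined by ITEM_ID alone, while QNTY_ORDERED needs the whole key.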

Example
Let us think of this relation and Types of
Dependency.
( S_ID , S_Name , S_City , Subject_ID ,GRADE )

Transitive Dependency
3. Transitive Functional Dependency:
A relation R is said to possess a transitive functional
dependency if a non-prime attribute is dependent on another
non-prime attribute.
A → B → C
It means C is transitively dependent on A, where B and C are
non-prime attributes.

Example
Let us think of the relation name Student
with list of attributes
( S_ID , S_Name , S_City , Dept_ID , HOD)
S_ID → Dept_ID → HOD

Example
Let us think of another relation name
Faculty with list of attributes
( F_ID , F_Name , F_Salary , Dept_ID , Dept_Location )

F_ID → Dept_ID → Dept_Location

Example
In the previous example HOD is transitively
dependent on S_ID,
because HOD is dependent on DEPT_ID and
DEPT_ID is dependent on S_ID.

Example
In the Faculty example Dept_Location is transitively
dependent on F_ID,
because Dept_Location is dependent on DEPT_ID
and DEPT_ID is dependent on F_ID.
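
Chains of dependencies like these can be followed automatically by computing an attribute closure, a standard textbook procedure that the slides do not name. The sketch below assumes the functional dependencies are given as (left side, right side) pairs; the function name `closure` is chosen for this example.

```python
def closure(attrs, fds):
    """Return every attribute that can be derived from attrs using the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know all of lhs, we also know all of rhs.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# FDs of the Student example: S_ID → S_Name, S_City, Dept_ID and Dept_ID → HOD.
student_fds = [
    (("S_ID",), ("S_Name", "S_City", "Dept_ID")),
    (("Dept_ID",), ("HOD",)),
]

from_s_id = closure({"S_ID"}, student_fds)
from_dept = closure({"Dept_ID"}, student_fds)
print(sorted(from_s_id))   # HOD is reached through Dept_ID
print(sorted(from_dept))
```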

In any relation there must exist a full functional
dependency.
Transitive and partial functional dependencies
must be avoided.
Such dependencies cause a lot of redundancy
and data inconsistency.

Normalization
Database normalization is the step-by-step process of
removing redundant data from the database in order to
improve storage efficiency, data integrity, and consistency.
Normalization generally involves splitting existing tables into
multiple ones, which must be re-joined or linked each time a
query is issued.

Normalization
Decomposition in normalization is of two types:
Lossless decomposition.
Lossy decomposition.
Decomposition should always be lossless.

Normalization
Edgar F. Codd originally established three normal
forms:
1NF, 2NF and 3NF.
There are now others that are generally accepted, but
3NF is widely considered to be sufficient for most
applications.
Most tables when reaching 3NF are also in BCNF
(Boyce-Codd Normal Form)

Normalization
[Figure: the normal forms 1NF, 2NF, 3NF, 4NF and 5NF in sequence, with labels marking redundancy]

Normalization
Normalization is based on the idea that an attribute may
depend on another attribute in some way.
There are 2 different kinds of dependencies involved up
to 5NF:
Functional dependency
Multivalued dependency

First Normal Form


First Normal Form: A relation R is said to be
in First Normal Form (1NF) if, for every row,
each column holds an atomic value.
An atomic value means the column holds one
and only one value.
Atomic values in the columns make the data
easier to access.

Example: Table 1
Title | Author1 | Author2 | ISBN | Subject | Pages | Publisher
Database System Concepts | Abraham Silberschatz | Henry F. Korth | 0072958863 | MySQL, Computers | 1168 | McGraw-Hill
Operating System Concepts | Abraham Silberschatz | Henry F. Korth | 0471694665 | Computers | 944 | McGraw-Hill

Limitations with Table1


This table is not very efficient with storage.
This design does not protect data integrity.
Third, this table does not scale well.

In Table 1, we have two violations of First Normal Form:
First: We have more than one author field.
Second: Our subject field contains more than one piece of
information.
With more than one value in a single field, it would be very
difficult to search for all books on a given subject.

First Normal Form Table (Table 2)

Title | Author | ISBN | Subject | Pages | Publisher
Database System Concepts | Abraham Silberschatz | 0072958863 | MySQL | 1168 | McGraw-Hill
Database System Concepts | Henry F. Korth | 0072958863 | Computers | 1168 | McGraw-Hill
Operating System Concepts | Henry F. Korth | 0471694665 | Computers | 944 | McGraw-Hill
Operating System Concepts | Abraham Silberschatz | 0471694665 | Computers | 944 | McGraw-Hill

First Normal Form Table (Table 2)


Table 2 is in 1NF.
Every column of the table has an atomic value.
Data integrity will be maintained in the table.
The problems in Table 1 have been removed.

Example: 1NF
Order (OrderNumber, OrderDate, {PartNumber, {Supplier}})
is decomposed into:
Order (OrderNumber, OrderDate)
Order-Part (OrderNumber, PartNumber)
Part (PartNumber, {Supplier})

Second Normal Form : 2 NF


A relation or table is said to be in 2NF if it is in 1NF and
no non-key attribute is dependent on only part of the
primary key.
A table or relation in 2NF should not possess any partial
dependency.

2nd Normal Form


No partial dependencies.
No attribute depends on only some of the attributes of
a concatenated key.
Order-Part
[OrderNumber | PartNumber | PartDescription]
Create a new table with Part Number key.

ENO | Name | Dno | DeptName | ProjNo | ProjName
E001 | Somchai | D01 | Physic | P01 | NMR
E001 | Somchai | D01 | Physic | P02 | Laser
E002 | Sompong | D01 | Physic | P03 | Medical Image processing
E003 | Somchay | D02 | Computer Science | P05 | Voice ordering
E003 | Somchay | D02 | Computer Science | P04 | Speech Coding
E004 | SomSiri | D02 | Computer Science | P04 | Voice ordering
E004 | SomSiri | D02 | Computer Science | P06 | Speech Synthesis

KEY = ENO + ProjNo

Answer is No (the table is not in 2NF), because
ProjName is dependent on ProjNo alone,
which is only part of the key.

Problem
ENO | Name | Dno | DeptName | ProjNo | ProjName
E001 | Somchai | D01 | Physic | P01 | NMR
E001 | Somchai | D01 | Physic | P02 | Laser
E002 | Sompong | D01 | Physic | P03 | Medical Image processing
E003 | Somchay | D02 | Computer Science | P05 | Voice ordering
E003 | Somchay | D02 | Computer Science | P04 | Speech Coding
E004 | SomSiri | D02 | Computer Science | P04 | Voice ordering
E004 | SomSiri | D02 | Computer Science | P06 | Speech Synthesis

We cannot insert a project if it has not yet
been assigned to any employee.

Result
PERSON
ENO | Name | Dno | DeptName
E001 | Somchai | D01 | Physic
E003 | Somchay | D02 | Computer Science
E004 | SomSiri | D02 | Computer Science

Project
ProjNo | ProjName
P01 | NMR
P02 | Laser
P03 | Medical Image processing
P04 | Speech Coding
P05 | Voice ordering
P06 | Speech Synthesis

PERSON_Proj
ENO | ProjNo
E001 | P01
E001 | P02
E002 | P03
E003 | P04
E004 | P05
E004 | P06

PERSON(ENO, Name, Dno, DeptName)
PROJECT(ProjNo, ProjName)
PERSON_PROJ(ENO, ProjNo)
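
One way to see that this decomposition removes the insertion problem is to create the three relations as tables. The sketch below uses SQLite from the Python standard library; the lower-case table and column names, and the project 'P07', are assumptions introduced only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE person  (eno TEXT PRIMARY KEY, name TEXT, dno TEXT, dept_name TEXT);
CREATE TABLE project (proj_no TEXT PRIMARY KEY, proj_name TEXT);
CREATE TABLE person_proj (
    eno     TEXT REFERENCES person(eno),
    proj_no TEXT REFERENCES project(proj_no),
    PRIMARY KEY (eno, proj_no)             -- composite key, as in KEY = ENO + ProjNo
);
""")

# A project can now be stored before any employee is assigned to it.
conn.execute("INSERT INTO project VALUES ('P07', 'Unassigned project')")
project_count = conn.execute("SELECT COUNT(*) FROM project").fetchone()[0]

# Employee data and assignments are stored once each and re-joined on demand.
conn.execute("INSERT INTO person VALUES ('E001', 'Somchai', 'D01', 'Physic')")
conn.execute("INSERT INTO project VALUES ('P01', 'NMR')")
conn.execute("INSERT INTO person_proj VALUES ('E001', 'P01')")
rows = conn.execute("""
    SELECT p.name, j.proj_name
    FROM person p
    JOIN person_proj pp ON pp.eno = p.eno
    JOIN project j ON j.proj_no = pp.proj_no
""").fetchall()
print(project_count, rows)
```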

Difference Between 1NF & 2NF


A relation in 1NF only removes non-atomicity
among the attributes.
It means attributes in the relation should have atomic
values only.
Whereas a relation in 2NF also takes care of partial
dependency among the attributes.
Non-key attributes should depend only on the whole
primary key.

Third Normal Form : 3NF


3rd Normal Form: No transitive dependencies
should exist.
The relation must be at least in 2NF.
Transitive dependency means that a non-key
attribute depends on another non-key attribute(s).

Transitive dependency
R(A,B,C,D) ; A is the key, the others are non-key

If A → B and B → C
we can say
A → B → C (C is transitively dependent on A)

Third Normal Form : 3NF


Definition:
A relation is said to be in 3NF if it is in
second normal form and no transitive
dependency exists.

3NF?
PERSON
ENO | Name | Dno | DeptName
E001 | Somchai | D01 | Physic
E003 | Somchay | D02 | Computer Science
E004 | SomSiri | D02 | Computer Science

Project
ProjNo | ProjName
P01 | NMR
P02 | Laser
P03 | Medical Image processing
P04 | Speech Coding
P05 | Voice ordering
P06 | Speech Synthesis

PERSON_Proj
ENO | ProjNo
E001 | P01
E001 | P02
E002 | P03
E003 | P04
E004 | P05
E004 | P06

Answer is No,
because DeptName is dependent on Dno
(it is transitively dependent on the key).

Result
PERSON
ENO | Name | Dno
E001 | Somchai | D01
E003 | Somchay | D02
E004 | SomSiri | D02

Project
ProjNo | ProjName
P01 | NMR
P02 | Laser
P03 | Medical Image processing
P04 | Speech Coding
P05 | Voice ordering
P06 | Speech Synthesis

PERSON_Proj
ENO | ProjNo
E001 | P01
E001 | P02
E002 | P03
E003 | P04
E004 | P05
E004 | P06

Department
Dno | DeptName
D01 | Physic
D02 | Computer Science

Example
Let us think of another relation name
Faculty with list of attributes
( F_ID , F_Name , F_Salary , Dept_ID , Dept_Location )

The above relation is not in 3NF because there
exists a transitive dependency.

Example
The relation can be converted into 3NF by
decomposing it into two relations
R1 and R2:
R1 ( F_ID , F_Name , F_Salary , Dept_ID )
R2 ( Dept_ID , Dept_Location )
The transitive dependency has been removed.

Difference Between 2NF & 3NF


Relation in 2 NF only removes the Partial
dependency among the attributes.
Where as a Relation in 3 NF also takes care of
Transitivity.

Note
The third normal form is often reached in practice by
inspection, in a single step.
Its meaning seems intuitively clear; it represents a
formalization of designers common sense.
This level of normalization is widely accepted as the
initial target for a design which eliminates redundancy.
However, there are higher normal forms which, although
less frequently invoked, highlight further redundancy
problems which may affect the designer

Boyce-Codd Normal Form : BCNF

BCNF: Every determinant is a candidate key.

Determinant: any attribute(s) that functionally
determine another attribute.
BCNF means that there are no transitive
dependencies involving key or non-key attributes.
BCNF is a refinement of third normal form, and
tightens its definition.
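
The rule "every determinant is a candidate key" can be tested with an attribute closure: a determinant is acceptable only if it determines every attribute of the relation. The sketch below is one possible check, written for this text; the function names are assumptions, and the Faculty relation and its decomposition R1, R2 are reused from the 3NF example.

```python
def closure(attrs, fds):
    """Return every attribute derivable from attrs using the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """List the non-trivial FDs whose determinant does not determine the whole relation."""
    all_attrs = set(attributes)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs)              # ignore trivial dependencies
        and closure(lhs, fds) != all_attrs       # determinant is not a (super)key
    ]

faculty = ["F_ID", "F_Name", "F_Salary", "Dept_ID", "Dept_Location"]
faculty_fds = [
    (("F_ID",), ("F_Name", "F_Salary", "Dept_ID")),
    (("Dept_ID",), ("Dept_Location",)),
]
before = bcnf_violations(faculty, faculty_fds)

# After decomposition into R1 and R2, each relation is checked with its own FDs.
r1 = bcnf_violations(["F_ID", "F_Name", "F_Salary", "Dept_ID"],
                     [(("F_ID",), ("F_Name", "F_Salary", "Dept_ID"))])
r2 = bcnf_violations(["Dept_ID", "Dept_Location"],
                     [(("Dept_ID",), ("Dept_Location",))])
print(before, r1, r2)
```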

Difference Between 3NF & BCNF


3NF only checks for transitivity.
If a relation is in 3NF then no transitive
dependency should exist.
However, in some complex situations there exists a
dependency among the determinants of the
relation.
BCNF removes such dependencies.

Fourth Normal Form: 4 NF


No multi-valued dependencies.
A multi-valued dependency of column B on
column A occurs when a table has a key with
three or more attributes (A, B, C),
each value of A is associated with a collection
of values of B, and
this collection of values is independent of C.

[Figure: one value of A associated with the values b1, b2, b3, b4 of B and, independently, with the values c1, c2, c3 of C]

Example: Multi valued dependence


One part has many suppliers. One part is used in multiple
projects. One supplier supplies multiple projects.

[Part_ID | Supplier_ID | Project_ID]

[Part_ID | Supplier_ID] [Part_ID | Project_ID]

Fifth Normal Form: 5 NF


A relation is said to be in 5NF if it is in 4NF and it
cannot be decomposed any further without loss of information.
Any decomposition must be lossless: the natural join of all
the decomposed relations should produce the original relation.

Example: 5 NF

[Part_ID | Supplier_ID | Project_ID]

[Part_ID | Supplier_ID] [Part_ID | Project_ID]

We can add one more relation [Supplier_ID | Project_ID]

5 NF
Thus if we take the natural join of all the
decomposed relations, it will produce the
original relation.
Such a decomposition is called lossless
decomposition.
Otherwise there will be loss of information.

5 NF
If joining all the decomposed relations does not
produce the original relation, then the
decomposition is called lossy decomposition.
Lossy decomposition of a relation or table
will cause loss of information.
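
The difference between a lossy and a lossless decomposition can be demonstrated by projecting a relation and joining the pieces back together. The sketch below is illustrative only: the helper functions and the two sample rows are assumptions. In this particular sample, the two-way split produces extra (spurious) rows, and adding the third relation [Supplier_ID | Project_ID] restores the original; that is a property of this data, not a general guarantee.

```python
def project(rows, attrs):
    """Projection: keep only attrs and drop duplicate rows."""
    unique = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in unique]

def natural_join(left, right):
    """Join rows that agree on all attributes the two relations share."""
    joined = []
    for l in left:
        for r in right:
            if all(l[a] == r[a] for a in set(l) & set(r)):
                joined.append({**l, **r})
    return joined

def as_set(rows):
    """Order-independent form of a relation, for comparison."""
    return {tuple(sorted(row.items())) for row in rows}

# Hypothetical sample data.
supply = [
    {"Part_ID": "P1", "Supplier_ID": "S1", "Project_ID": "J1"},
    {"Part_ID": "P1", "Supplier_ID": "S2", "Project_ID": "J2"},
]

part_supplier = project(supply, ["Part_ID", "Supplier_ID"])
part_project = project(supply, ["Part_ID", "Project_ID"])
supplier_project = project(supply, ["Supplier_ID", "Project_ID"])

two_way = natural_join(part_supplier, part_project)
three_way = natural_join(two_way, supplier_project)

two_way_is_lossy = as_set(two_way) != as_set(supply)
three_way_is_lossless = as_set(three_way) == as_set(supply)
print(len(two_way), two_way_is_lossy, three_way_is_lossless)
```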

De Normalization
Denormalization is the reverse of normalization.
Sometimes, to improve the performance of the
system, we need to de-normalize the relations.
Before denormalizing, the following must be
considered:
Use with caution
Normalize first, then de-normalize
Use only when you cannot optimize

Data Integrity
1. Data into the database must be as per predefined
set of rules, as determined by:
The DBA or Application developer.
2. When an integrity constraint applies to a table, all
data in the table must conform to the corresponding
rule.
3. When you issue a SQL statement that modifies data
in the table, Oracle Database ensures that the new
data satisfies the integrity constraint, without the
need to do any checking within your program.

Data Integrity
1. You can enforce rules by defining integrity constraints
more reliably than by adding logic to your application.
2. Oracle Database can check that all the data in a table
obeys an integrity constraint faster than an application
can.

Data Integrity
Example of data integrity:
Consider the tables employees and departments and
the business rules for the information in each of the
tables,
As illustrated in the figure, to ensure that each employee
works for a valid department, first create a rule that all
values in the primary key column of the departments table
are unique, and then a rule that each value in the foreign
key column must be the same as a value in that primary key
column, or NULL.

Types of Data Integrity


Primary Key Values
A rule defined on a column or set of columns that
specifies that each row in the table can be uniquely
identified by the values in the key.
Referential Integrity Rules
A referential integrity rule is a rule defined on a key (a
column or set of columns) in one table that guarantees
that the values in that key match the values in a key in a
related table (the referenced value).
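
Both kinds of rule can be tried out in a few lines. The sketch below uses SQLite from the Python standard library rather than Oracle Database, so it shows the idea and not Oracle's exact behaviour; the column names and sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # required for SQLite to enforce foreign keys
conn.execute("CREATE TABLE departments (department_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE employees (
        employee_id   INTEGER PRIMARY KEY,
        name          TEXT,
        department_id INTEGER REFERENCES departments(department_id)
    )
""")

conn.execute("INSERT INTO departments VALUES (10, 'Sales')")
conn.execute("INSERT INTO employees VALUES (1, 'Asha', 10)")     # matches a department
conn.execute("INSERT INTO employees VALUES (2, 'Ravi', NULL)")   # NULL foreign key is allowed

try:
    conn.execute("INSERT INTO employees VALUES (3, 'Meena', 99)")  # no department 99
    invalid_reference_rejected = False
except sqlite3.IntegrityError:
    invalid_reference_rejected = True

try:
    conn.execute("INSERT INTO departments VALUES (10, 'Duplicate')")  # key 10 already used
    duplicate_key_rejected = False
except sqlite3.IntegrityError:
    duplicate_key_rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
print(invalid_reference_rejected, duplicate_key_rejected, employee_count)
```

The application code contains no checks of its own; the database rejects the two invalid rows.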

Question & Answer


1. Normalization is step by step process of decomposing:
a) Table
b) Database
c) Group Data item
d) All of the above


2. The keys that can have NULL values are
A). Primary Key
B). Unique Key
C). Foreign Key
D). Both b and c

Question & Answer


3. Rows of a relation are called


a) Tuples
b) Column
c) a data structure
d) an entity
4. A relation is said to be in 2 NF if
i) It is in 1 NF
ii) Non-key attributes dependent on key attribute
iii) Non-key attributes are independent of one another
iv) If it has a composite key, no non-key attribute should
be dependent on part of the composite key.

(a) i, ii, iii
(b) i and ii
(c) i, ii, iv
(d) i, iv

Question & Answer


5. A relation is said to be in BCNF when
a) It has overlapping composite keys
b) It has no composite keys
c) It has no multi valued dependencies
d) It has no overlapping composite keys which have
related attributes
6. Fourth Normal form (4 NF) relations are needed when
a) there are multi valued dependencies between
attributes in composite key
b) there are more than one composite key
c) there are two or more overlapping composite keys
d) there are multi valued dependency between non-key
attributes

Question & Answer


7. Transitive Dependency explains :
a) Dependency of key attributes on other key attributes
b) Dependency of key attributes on non-key attributes
c) Dependency of non-key attributes on other non-key
attributes
d) All of the above
8. Partial Dependency exists in a relation when
a) A non-key attribute depends on part of the primary key
b) Primary key doesn't exist
c) A non-key attribute is dependent on another non-key attribute
d) None of the above

Question & Answer

9) Full functional dependency is always desirable in a
Relation
a) True
b) False
10) Process of Normalization increases data redundancy
and reduces data consistency
a) True
b) False

Question & Answer


11. Composite primary key consist of two or more than two
non key attributes
a) True
B) False
12.Partial dependency exist only when primary key is
composite in nature.
a) True
B) False

Question & Answer


13. Decomposition of Relation should always be
Lossless
a) True.
b) False.
14 Multi value Dependency is most desirable
Dependency
a) True.
b) False.

Case Study
An HR consultancy firm has hired a database designer to
store, manipulate and retrieve the various data related to the
day-to-day operations of its recruitment process. The data
includes information related to Applicants, Jobs, Companies
and Interviewers.
Database designer has created a single table to perform various
operations over the database.
a) What are the problems with existing design of the Database?
b) Suggest a suitable database design which can solve all the
problems.

Thank You
Please forward your query
To: gdubey@amity.edu
CC: manoj.amity@panafnet.com

PAN African e-Network Project


PGDIT

DBMS
Semester - II
Session - 4

By- Mr. Gaurav Dubey

Data Recovery & Protection

CASE STUDY
Data Recovery
Types of Recovery :-- Transaction Recovery
System Recovery
Media Recovery

Concurrency Control in DBMS


Concurrency Control Techniques
Locking
Types of Locking :
READ Lock & Write lock

Data Recovery & Protection

Definition: Serializability
Serial Schedule & Non- Serial Schedule.
Examples.
Database Security
Question & Answer

CASE STUDY

To maintain and operate on various Information
related to Student , Faculty , Course & Result the
management of a training Institute has hired a
Database Designer .
Data related to the training Institute includes:
( S_Id , S_Name , S_Age , S_Course , C_Id ,
C_Fee ,C_Duration ,Faculty_Name & S_Grade)

CASE STUDY
Database designer has created a single table to
perform various operations over the database.
a) What are the problems with existing design
of the Database?
b) Suggest a suitable database design which
can solve all the problems.

CASE STUDY
Data Duplicity or Data Redundancy
Anomalies: -- Insert ,
Update
Delete
Data Inconsistency

Limitation of Existing Design


If a student has joined multiple courses,
then the student's details will be repeated.
If multiple students have joined the same course,
information about the course will be repeated.
Course information can't be stored unless students
are registered for the course.

Limitation of Existing Design

Multiple updating of data is required.
There may be loss of data in some cases
if a deletion operation is performed.
Data may become inconsistent in case of
multiple updating.

Proposed Solution

The existing database design should be further
decomposed into more than one relation.
Data in the database should be
normalized.
Decomposition must be lossless.

R1 (S_Id , S_Name , S_Age )
S_ID (Primary Key)
R2 (C_Id , C_Fee , C_Duration , Faculty_Name )
C_ID (Primary Key)
R3 (S_ID , C_Id , Grade )
S_ID + C_ID (Primary Key)
S_ID (Foreign Key) & C_ID (Foreign Key)

Definition:-- Transaction
A transaction is the basic logical unit of execution in
an information system.
A transaction is a sequence of operations that must be
executed as a whole.
It is the process of taking the database from one
consistent (& correct) state into another consistent
(& correct) state.

Definition:-- Transaction
A collection of actions that make consistent
transformations of system states while preserving system
consistency
Example:
READ A
READ B
A=A-5000
WRITE A
B = B+ 5000
WRITE B

Schedules of Transactions

A schedule S of n transactions is a
sequential ordering of the operations
of the n transactions.
The transactions are interleaved

Schedules of Transactions

A schedule maintains the order of
operations within the individual
transaction.
For each transaction T, if operation a is
performed in T before operation b, then
operation a will be performed before operation
b in schedule S.
The operations are in the same order as they
were before the transactions were interleaved.

Schedules of Transactions

Two operations conflict if they belong to
different transactions, AND access the
same data item, AND at least one of them
is a write.
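
The three conditions translate directly into a small predicate. This is a minimal sketch; the `Op` record and its field names are assumptions made for the example.

```python
from typing import NamedTuple

class Op(NamedTuple):
    txn: str      # transaction name, e.g. "T1"
    action: str   # "read" or "write"
    item: str     # data item, e.g. "X"

def conflicts(a: Op, b: Op) -> bool:
    """Different transactions, same data item, and at least one write."""
    return a.txn != b.txn and a.item == b.item and "write" in (a.action, b.action)

read_write = conflicts(Op("T1", "read", "X"), Op("T2", "write", "X"))    # conflict
read_read = conflicts(Op("T1", "read", "X"), Op("T2", "read", "X"))      # two reads never conflict
other_item = conflicts(Op("T1", "write", "X"), Op("T2", "write", "Y"))   # different items
same_txn = conflicts(Op("T1", "read", "X"), Op("T1", "write", "X"))      # same transaction
print(read_write, read_read, other_item, same_txn)
```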

Example
T1                      T2
read_item(X);
X := X - N;
                        read_item(X);
                        X := X + M;
write_item(X);
read_item(Y);
                        write_item(X);
Y := Y + N;
write_item(Y);

Serial Schedules
Schedule S is said to be serial if:
For every transaction T participating in the
schedule, all of T's operations are
executed consecutively in the schedule.
Otherwise it is called non-serial.

Example: Serial Schedule


T1:
read_item(X);
X := X - N;
write_item(X);
read_item(Y);
Y := Y + N;
write_item(Y);

T2:
read_item(X);
X := X + M;
write_item(X);

Example: Serial Schedule


T1:
read_item(X);
X := X - N;
write_item(X);
read_item(Y);
Y := Y + N;
write_item(Y);

T2:
read_item(X);
X := X + M;
write_item(X);

Non-serial Schedules
Non-serial schedules mean that transactions are
interleaved.
There are many possible orders or schedules.
Conflicting operations must be taken care of

Example: Non-serial Schedules


T1
read_item(X);
X := X - 10;
write_item(X);

T2
read_item(Y);
Y := Y - 20;
write_item(Y);
read_item(Y);
Y := Y + 10;
write_item(Y);

Example: Non-serial Schedules


T1                      T2
read_item(X);
X := X - N;
                        read_item(X);
                        X := X + M;
write_item(X);
read_item(Y);
                        write_item(X);
Y := Y + N;
write_item(Y);

Theory of Serializability

Serial and Non-serial Schedules


Serializability theory attempts to determine
the 'correctness' of the schedules.
A schedule S of n transactions is
serialisable if it is equivalent to some serial
schedule of the same n transactions.
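
One common way to test this is conflict serializability, using a precedence graph: draw an edge Ti → Tj whenever an operation of Ti conflicts with a later operation of Tj, and check for cycles. The slides do not name this technique, so the sketch below is an assumption about how the check could be done; operations are written as (transaction, "r" or "w", item).

```python
def precedence_edges(schedule):
    """Edges Ti -> Tj for each pair of conflicting operations where Ti's comes first."""
    edges = set()
    for i, (t1, a1, x1) in enumerate(schedule):
        for t2, a2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and "w" in (a1, a2):
                edges.add((t1, t2))
    return edges

def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True               # came back to a node on the current path
        visiting.add(node)
        if any(visit(n) for n in graph.get(node, ())):
            return True
        visiting.discard(node)
        done.add(node)
        return False

    return any(visit(n) for n in list(graph))

def is_conflict_serializable(schedule):
    return not has_cycle(precedence_edges(schedule))

# Interleaved schedule: both transactions read X before either writes it.
interleaved = [
    ("T1", "r", "X"), ("T2", "r", "X"), ("T1", "w", "X"),
    ("T1", "r", "Y"), ("T2", "w", "X"), ("T1", "w", "Y"),
]
# Serial schedule: all of T1, then all of T2.
serial = [
    ("T1", "r", "X"), ("T1", "w", "X"), ("T1", "r", "Y"), ("T1", "w", "Y"),
    ("T2", "r", "X"), ("T2", "w", "X"),
]
print(is_conflict_serializable(interleaved), is_conflict_serializable(serial))
```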

Serializability:
Conflicting Serializability
Non-Conflicting Serializability

Example Serializability (Conflicting)


T1                      T2
read_item(X);
X := X - N;
                        read_item(X);
                        X := X + M;
write_item(X);
read_item(Y);
                        write_item(X);
Y := Y + N;
write_item(Y);

Example of Non Serial Schedules (Not Conflicting)

T1
read_item(X);
X := X - 10;
write_item(X);

T2
read_item(Y);
Y := Y - 20;
write_item(Y);
read_item(Y);
Y := Y + 10;
write_item(Y);

The Transaction Manager


The transaction manager enforces the ACID properties
It schedules the operations of transactions
COMMIT and ROLLBACK are used to ensure atomicity
Locks or timestamps are used to ensure consistency and
isolation for concurrent transactions (next lectures)
A log is kept to ensure durability in the event of
system failure (this lecture)

Properties of Transaction
A - Atomicity: a transaction is an atomic unit of processing and
it is either performed entirely or not at all
C - Consistency Preservation: a transaction's correct execution
must take the database from one correct state to another.
I - Isolation/Independence: the updates of a transaction must
not be made visible to other transactions until it is committed

Example
T1                      T2
read_item(X);
X := X - N;
                        read_item(X);
                        X := X + M;
write_item(X);
read_item(Y);
                        write_item(X);
Y := Y + N;
write_item(Y);

Properties of Transaction
D - Durability (or Permanency): if a transaction
changes the database and is committed, the changes
must never be lost because of subsequent failure
READ A
A = A - 1500
WRITE A
READ B
B = B + 1500
WRITE B

Concurrency Control
Most DBMS are multi-user systems.
The concurrent execution of many different transactions
submitted by various users must be organized such that
the transactions do not interfere with one another
in a way that produces incorrect results.
The concurrent execution of transactions must be such that
each transaction appears to execute in isolation.

Concurrency Problems
In order to run transactions concurrently we
interleave their operations
Each transaction gets a share of the computing time
This leads to several sorts of problems:
Lost updates
Uncommitted updates
Incorrect analysis
All arise because isolation is broken
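
A lost update can be reproduced with a tiny simulation of the T1/T2 schedule used in the examples. The sketch below is illustrative: the `run` helper, the starting values X = 100 and Y = 50, and the amounts N = 10 and M = 20 are all assumptions. Each transaction works on its own private copy of an item between reading and writing it.

```python
def run(schedule, db):
    """Execute a schedule; each transaction keeps private copies of the items it read."""
    local = {}
    for txn, action, item, *delta in schedule:
        if action == "read":
            local[(txn, item)] = db[item]
        elif action == "add":
            local[(txn, item)] += delta[0]
        elif action == "write":
            db[item] = local[(txn, item)]
    return db

N, M = 10, 20   # assumed amounts

t1 = [("T1", "read", "X"), ("T1", "add", "X", -N), ("T1", "write", "X"),
      ("T1", "read", "Y"), ("T1", "add", "Y", N), ("T1", "write", "Y")]
t2 = [("T2", "read", "X"), ("T2", "add", "X", M), ("T2", "write", "X")]

serial = t1 + t2
interleaved = [
    ("T1", "read", "X"), ("T1", "add", "X", -N),
    ("T2", "read", "X"), ("T2", "add", "X", M),      # T2 reads X before T1 has written it
    ("T1", "write", "X"), ("T1", "read", "Y"),
    ("T2", "write", "X"),                            # overwrites T1's update of X
    ("T1", "add", "Y", N), ("T1", "write", "Y"),
]

serial_result = run(serial, {"X": 100, "Y": 50})
interleaved_result = run(interleaved, {"X": 100, "Y": 50})
print(serial_result, interleaved_result)
```

In the serial run X ends at 100 - 10 + 20 = 110; in the interleaved run X ends at 120 because T1's subtraction is overwritten.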

Example

Locking Techniques for Concurrency Control


The concept of locking data items is one of the
main techniques used for controlling the
concurrent execution of transactions.
A lock is a variable associated with a data item
in the database. Generally there is a lock for
each data item in the database.

Locking Techniques for Concurrency Control


A lock describes the status of the data item with
respect to possible operations that can be
applied to that item.
It is used for synchronizing the access by
concurrent transactions to the database items.
A transaction locks an object before using it
When an object is locked by another transaction,
the requesting transaction must wait

Types of Locks

Binary locks have two possible states:
1. locked (lock_item(X) operation) and
2. unlocked (unlock_item(X) operation)

Types of Locks

Multiple-mode locks allow concurrent access
to the same item by several transactions.
Three possible states:
1. read locked or shared locked (other transactions are
allowed to read the item)
2. write locked or exclusive locked (a single
transaction exclusively holds the lock on the item)
and
3. unlocked.

Two-Phase Locking
Basic 2PL
When a transaction releases a lock, it may not
request another lock
Conservative 2PL or static 2PL
A transaction locks all the items it accesses
before the transaction begins execution
Pre-declaring read and write sets
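
The basic 2PL rule is easy to check for a single transaction: once any lock has been released, no further lock may be requested. The sketch below is a minimal illustration; the function name and the (action, item) representation are assumptions.

```python
def follows_two_phase_locking(operations):
    """operations: list of ("lock" | "unlock", item) pairs for one transaction."""
    shrinking = False                  # becomes True after the first unlock
    for action, _item in operations:
        if action == "unlock":
            shrinking = True
        elif action == "lock" and shrinking:
            return False               # a lock requested after a release breaks 2PL
    return True

# Growing phase (all locks), then shrinking phase (all unlocks).
two_phase = [("lock", "X"), ("lock", "Y"), ("unlock", "X"), ("unlock", "Y")]
# Releases X and then locks Y: not two-phase.
not_two_phase = [("lock", "X"), ("unlock", "X"), ("lock", "Y"), ("unlock", "Y")]

print(follows_two_phase_locking(two_phase), follows_two_phase_locking(not_two_phase))
```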

Two-Phase Locking
[Figure: number of locks held over time, from BEGIN to END. Phase 1: obtain locks, up to the lock point. Phase 2: release locks.]

Two-Phase Locking
Strict 2PL: a transaction does not release
any of its locks until after it commits or
aborts
leads to a strict schedule for recovery

Two-Phase Locking
[Figure: number of locks held over the transaction duration, from BEGIN to END, showing obtaining locks, the period of data item use, and releasing locks]

Deadlocks and Live locks


Deadlock prevention protocol:
conservative 2PL
transaction stamping (younger transactions
aborted)
no waiting
cautious waiting
time outs

Deadlocks and Live locks


Deadlock detection (if the transaction load
is light or transactions are short and lock
only a few items)
wait-for graph for deadlock detection
victim selection
cyclic restarts
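
Deadlock detection with a wait-for graph amounts to finding a cycle. The sketch below is one simple way to do it, assuming the graph is given as a mapping from each transaction to the set of transactions it is waiting for; the function name and the sample graphs are hypothetical. Choosing a victim from the cycle is a separate policy decision and is not shown.

```python
def find_deadlock(wait_for):
    """Return a list of transactions forming a cycle, or None if there is no deadlock."""
    def visit(node, path):
        if node in path:
            return path[path.index(node):]     # the part of the path that loops
        for nxt in wait_for.get(node, ()):
            cycle = visit(nxt, path + [node])
            if cycle:
                return cycle
        return None

    for start in wait_for:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None

# T1 waits for T2, T2 waits for T3, T3 waits for T1: a deadlock.
deadlocked = {"T1": {"T2"}, "T2": {"T3"}, "T3": {"T1"}}
# T1 waits for T2, which is not waiting for anyone: no deadlock.
waiting_only = {"T1": {"T2"}, "T2": set()}

cycle = find_deadlock(deadlocked)
no_cycle = find_deadlock(waiting_only)
print(cycle, no_cycle)
```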

Deadlocks and Live locks


Live lock: a transaction cannot proceed for
an indefinite period of time while other
transactions in the system continue
normally.
fair waiting schemes (i.e. first-come-first-served)

Locking Granularity
A database item could be
a database record
a field value of a database record
a disk block
the whole database

Locking Granularity
Trade-offs
Coarse granularity
the larger the data item size, the lower the degree
of concurrency

Fine granularity
the smaller the data item size, the more locks to be
managed and stored, and the more lock/unlock
operations needed.

Database Backup and Recovery Concepts


Backup of Database means to make single or multiple
copies of data files, control file, and archived redo logs

Restoring a Database means copying the physical files
that make up the database from a backup medium,
typically disk or tape, to their original or to new
locations

Database Backup and Recovery Concepts

A backup is either consistent or inconsistent.


To make a consistent backup, the database must have been shut
down cleanly and must remain closed for the duration of the backup.
All committed changes in the redo log are written to the data
files, so the data files are in a transaction-consistent state.
When restoring data files from a consistent backup, you can
open the database immediately.

COMMIT and ROLLBACK


COMMIT signals the
successful end of a
transaction
Any changes made by
the transaction should be
saved
These changes are now
visible to other
transactions

ROLLBACK signals the
unsuccessful end of a
transaction
Any changes made by
the transaction should
be undone
It is now as if the
transaction never
existed
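
The effect of COMMIT and ROLLBACK can be tried out with Python's built-in sqlite3 module. This is only a sketch: SQLite stands in for the DBMS, and the account table and its values are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.commit()  # COMMIT: the inserted row is now permanent

conn.execute("UPDATE account SET balance = balance - 30 WHERE id = 1")
conn.rollback()  # ROLLBACK: the update is undone, as if it never happened
balance_after_rollback = conn.execute(
    "SELECT balance FROM account WHERE id = 1"
).fetchone()[0]

conn.execute("UPDATE account SET balance = balance - 30 WHERE id = 1")
conn.commit()  # COMMIT: this time the update is kept
balance_after_commit = conn.execute(
    "SELECT balance FROM account WHERE id = 1"
).fetchone()[0]

print(balance_after_rollback, balance_after_commit)  # 100 70
```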

Recovery
Transactions should be
durable, but we cannot
prevent all sorts of
failures:

System crashes
Power failures
Disk crashes
User mistakes
Sabotage
Natural disasters

Prevention is better than
cure

Reliable OS
Security
UPS and surge protectors
RAID arrays

Can't protect against
everything though

Forwards and Backwards


Backwards recovery
We need to undo some
transactions
Working backwards
through the log we undo
any operation by a
transaction on the UNDO
list
This returns the database
to a consistent state

Forwards recovery
Some transactions need to
be redone
Working forwards through
the log we redo any
operation by a transaction
on the REDO list
This brings the database
up to date

The Transaction Log


The transaction log records
the details of all
transactions
Any changes the
transaction makes to the
database
How to undo these
changes
When transactions
complete and how

The log is stored on disk,
not in memory
If the system crashes it is
preserved
Write-ahead log rule
The entry in the log must
be made before
COMMIT processing
can complete

System Failures
A system failure means all
running transactions are
affected
Software crashes
Power failures
The physical media (disks)
are not damaged

At various times a DBMS
takes a checkpoint
All committed
transactions are written
to disk
A record is made (on
disk) of the transactions
that are currently running

Types of Transactions
[Figure: timeline of transactions T1 to T5 relative to the Last Checkpoint and the System Failure.]

Transaction Recovery
[Figure: timeline of transactions T1 to T5 with the Last Checkpoint and the Failure marked. Active transactions at the checkpoint: T2, T3. Lists at the checkpoint: UNDO: T2, T3; REDO: (empty).]

System Recovery
Any transaction that
was running at the time
of failure needs to be
undone and restarted
Any transactions that
committed since the last
checkpoint need to be
redone

Transactions of type T1 need
no recovery
Transactions of type T3 or
T5 need to be undone and
restarted
Transactions of type T2 or
T4 need to be redone
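
One common way to arrive at these lists is to start the UNDO list with the transactions that were active at the last checkpoint and then scan the log forwards. The Python sketch below illustrates this. The log record format, the function name build_recovery_lists, and the exact order of events after the checkpoint are assumptions chosen to match the five transaction types above.

```python
def build_recovery_lists(active_at_checkpoint, log_after_checkpoint):
    """Scan the log forwards from the checkpoint and return (undo, redo)."""
    undo = list(active_at_checkpoint)  # everything still running at the checkpoint
    redo = []
    for record, txn in log_after_checkpoint:
        if record == "start":
            undo.append(txn)   # a new transaction may need undoing
        elif record == "commit":
            undo.remove(txn)   # a committed transaction must be redone instead
            redo.append(txn)
    return undo, redo


# T1 committed before the checkpoint, so it never appears here.
log = [("commit", "T2"), ("start", "T4"), ("start", "T5"), ("commit", "T4")]
undo, redo = build_recovery_lists(["T2", "T3"], log)
print("UNDO:", undo)  # UNDO: ['T3', 'T5']
print("REDO:", redo)  # REDO: ['T2', 'T4']
```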

Transaction as a Recovery Unit


The database is restored to some state from the past
so that a correct state, close to the time of failure,
can be reconstructed from the past state.
A DBMS ensures that if a transaction executes some
updates and then a failure occurs before the
transaction reaches normal termination, then those
updates are undone.
The statements COMMIT and ROLLBACK (or their
equivalent) ensure Transaction Atomicity

Media Failures
System failures are
not too severe
Only information since
the last checkpoint is
affected
This can be recovered
from the transaction
log

Media failures (disk
crashes etc.) are more
serious
The data stored to disk
is damaged
The transaction log
itself may be damaged

Recovery from Media Failure


1. Restore the database from
the last backup
2. Use the transaction log to
redo any changes made
since the last backup

If the transaction log is
damaged you can't do
step 2
Store the log on a
separate physical
device to the database
The risk of losing both
is then reduced

Recovery Methods:
1. Mirroring
keep two copies of the database and maintain them
simultaneously

2. Backup
periodically dump the complete state of the database to some
form of tertiary storage

Recovery
3. System Logging
the log keeps track of all transaction operations affecting the
values of database items. The log is kept on disk so that it is not
affected by failures except for disk and catastrophic failures

Recovery from Transaction Failures


Catastrophic failure
Restore a previous copy of the database from archival backup.
Apply the transaction log to the copy to reconstruct a more current state by
redoing committed transaction operations up to the failure point.
Incremental dump + log each transaction

Non-catastrophic failure
Reverse the changes that caused the inconsistency by undoing the
operations and possibly redoing legitimate changes that were lost.
The entries kept in the system log are consulted during recovery.
No need to use the complete archival copy of the database.

Transaction States
For recovery purposes the system needs to keep
track of when a transaction :
Starts,
terminates and
commits.

Transaction States
Begin_Transaction: Marks the beginning of a transaction
execution.
End_Transaction: Specifies that the read and write
operations have ended and marks the end limit of
transaction execution (but may be aborted because of
concurrency control).
Commit_Transaction: Signals a successful end of the
transaction. Any updates executed by the transaction
can be safely committed to the database and will not be
undone.

Transaction States
Rollback (or Abort): signals that the transaction has ended
unsuccessfully. Any changes that the transaction may have
applied to the database must be undone.
Undo: similar to ROLLBACK but it applies to a single
operation rather than to a whole transaction.
Redo: specifies that certain transaction operations must be
redone to ensure that all the operations of a committed
transaction have been applied successfully to the database.

Transaction Execution
A transaction reaches its commit point when all operations accessing
the database are completed and the result has been recorded in the
log.
It then writes a [commit, transaction-id] record to the log.

If a system failure occurs, the system searches the log and rolls back the
transactions that have written into the log a
[start_transaction, transaction-id]
[write_item, transaction-id, X, old_value, new_value]
but have not recorded into the log a [commit, transaction-id]

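Transaction execution
[Figure: transaction state diagram. BEGIN TRANSACTION leads to the active state, where READ and WRITE operations take place; END TRANSACTION leads to partially committed; COMMIT leads to committed; ROLLBACK from active or from partially committed leads to failed; committed and failed transactions end as terminated.]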
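
The transaction states can be read as a small state machine. The Python sketch below encodes the transitions between active, partially committed, committed, and failed as a lookup table. The table layout and the function name run are illustrative assumptions, and the final move to the terminated state is left out for brevity.

```python
# (current state, event) -> next state
TRANSITIONS = {
    ("active", "READ"): "active",
    ("active", "WRITE"): "active",
    ("active", "END TRANSACTION"): "partially committed",
    ("active", "ROLLBACK"): "failed",
    ("partially committed", "COMMIT"): "committed",
    ("partially committed", "ROLLBACK"): "failed",
}


def run(events):
    """Follow the events from the state entered by BEGIN TRANSACTION."""
    state = "active"
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"{event!r} is not allowed in state {state!r}")
        state = TRANSITIONS[key]
    return state


print(run(["READ", "WRITE", "END TRANSACTION", "COMMIT"]))  # committed
print(run(["READ", "WRITE", "ROLLBACK"]))                   # failed
```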

Question & Answer

1. A transaction is defined as
(a) Single logical unit of work
(b) Repeating same data item
(c) Correcting database
(d) Removing data duplicity

2. In 2PL, when a transaction releases a lock, it may not
request another lock.
a) True
b) False

Question & Answer


3. The concept of locking data items is one of the main
techniques used for :
a) Controlling the concurrent execution of transactions
b) Controlling the data redundancy
c) Avoid Transitivity
d) None of these

4. A transaction's correct execution must take the database from
one correct state to another. This is known as:
a) Atomicity
b) Isolation
c) Both a and b
d) Consistency

Question & Answer


5. A schedule S is serial if, for every transaction T
participating in the schedule, all of T's operations are
executed consecutively in the schedule
a) True
b) False

6. A schedule S of n transactions is serializable if it is
equivalent to some serial schedule of the same n
transactions
a) True
b) False

Question & Answer


7. Which of the following types of lock is known
as an exclusive lock?
a) Read Lock
b) Write LOCK
c) Shared Lock
d) All of the above
8. A transaction reaches its commit point when all
operations accessing the database are completed and the
result has been recorded in the log
a) True
b) False

Question & Answer


9. Which among the following is a data recovery method?
a) System Logging
b) Mirroring
c) Both a and b
d) None

10. The situation in which a transaction cannot proceed for an indefinite
period of time while other transactions in the system continue
normally is called
a) Dead Lock
b) Live Lock
c) Binary Lock
d) Both a and b

Thank You
Please forward your query
To: skjha2@amity.edu
CC: manoj.amity@panafnet.com

PAN African e-Network Project


PGDIT
Data Base Management System

Semester - II
Session - 5
By Mr. Gaurav Dubey

SQL Environment
DDL Statements
Data Types
Constraints
VIEW
DML Statements
INSERT, UPDATE, DELETE

A simplified schematic of a typical SQL environment, as
described by the SQL-2003 standard

Writing SQL Statements

SQL statements are not case sensitive
(but criteria within quotation marks are for some RDBMS)
SQL statements can be on one or more lines
Clauses are usually placed on separate lines
Keywords cannot be split across lines
Tabs and spaces are allowed to enhance readability
Each SQL statement (not line) ends with a semicolon (;)

Components
[Figure: front-end applications (VB, Developer, Access) connect through an ODBC driver (Oracle, Sybase, Access) to an Oracle server, a Sybase server, or an Access DB.]

Core Database Engine


ORACLE RDBMS (Oracle Universal server)
Integrated Data Dictionary: manage tables
owned by all users in a system
SQL: language to access and manipulate
data
PL/SQL: a procedural extension to SQL
language

SQL*Plus
Command line tool that processes users' SQL statements
Requires Oracle account

SQL
DDL: Data Definition
DML: Data Manipulation
DCL: Data Control

Help command

Typing a SQL command

Saving SQL command in a file


Editing SQL command in a file


Structured Query Language features


DQL (Data Query Language)
SELECT
Used to get data from the database and impose ordering
upon it.
DML (Data Manipulation Language)
DELETE, INSERT, UPDATE
Used to change database data.
DDL (Data Definition Language)
DROP, TRUNCATE, CREATE, ALTER
Used to manipulate database structures and definitions.
RIGHTS
REVOKE, GRANT
Used to give and take access rights to database objects.

DDL, DML, DCL, and the database
development process

DDL (Data Definition Language)

SQL commands are divided into a number of categories,
of which the DDL commands are but one part:

Data Definition Language commands
Data Manipulation Language commands
Transaction Control commands
Session Control commands

DDL (Data Definition Language)

Data Definition Language commands allow you to
perform these tasks:

Create,
Alter, and
Drop objects

Data Definition examples


CREATE TABLE TABLE_NAME
( COLUMN_NAME DATA_TYPE [(SIZE)] COLUMN_CONSTRAINT
[, other column definitions,...]
[, primary key constraint]
)

Data Definition examples


ALTER TABLE TABLE_NAME
ADD | DROP | MODIFY
( COLUMN_NAME DATA_TYPE [(SIZE)] COLUMN_CONSTRAINT
[, other column definitions,...]
)

Create, modify, drop Tables, views, and sequences

Table emp
Name      Type
--------  ------------
EMPID     NUMBER(5)
FNAME     VARCHAR2(20)
LNAME     VARCHAR2(20)
SEX       VARCHAR2(1)
SSN       VARCHAR2(9)
SALARY    NUMBER(8)
DEPTNO    NUMBER(5)

Create, modify, drop Tables, views, and sequences

CREATE TABLE emp
(
empid NUMBER(5),
fname VARCHAR2(20),
lname VARCHAR2(20),
sex VARCHAR2(1),
ssn VARCHAR2(9),
salary NUMBER(8),
deptno NUMBER(5) );

Another table Dept

CREATE TABLE dept
(deptno NUMBER(5) NOT NULL,
name VARCHAR2(20) NOT NULL,
building VARCHAR2(20),
CONSTRAINT pk_deptno PRIMARY KEY (deptno)
);

Insert
INSERT INTO dept VALUES (4001, 'SHOES', 'BUILDING I');
INSERT INTO dept VALUES (4002, 'WOMAN CLOTHING', 'BUILDING II');
INSERT INTO dept VALUES (4003, 'MEN CLOTHING', 'BUILDING II');
INSERT INTO dept VALUES (4004, 'KITCHEN APPLIANCES', 'MAIN BUILDING');

Data Definition examples


CREATE VIEW VIEW_NAME AS QUERY_NAME
( Select col1 , col2 . From Table_Name
Where .. )
DROP TABLE TABLE_NAME
DROP INDEX INDEX_NAME ON TABLE_NAME

Create Table
This command allows the user to create a table, the
basic structure to hold user data, by specifying the
following information:

column definitions
integrity constraints
the table's tablespace
storage characteristics
data from an arbitrary query

Naming conventions for Table/Fields

Naming conventions for table names and
attribute names
Illegal
spaces
hyphens

Legal
letters [a-z A-Z] + digits
_, #, $
first character must be a letter
no reserved words in SQL [no attribute called 'by' or 'table']

Data Types
A table is made up of one or more columns
Each column is given a name and a data type that
reflects the kind of data it will store.
Oracle supports four basic data types
CHAR
NUMBER
DATE
RAW.
There are also a few additional variations on the RAW
and CHAR data types.

Data Types
VARCHAR2

Character data type.
Can contain letters, numbers and punctuation.
The syntax : VARCHAR2(size)
where size is the maximum number of alphanumeric
characters the column can hold.
In Oracle8, the maximum size of a VARCHAR2 column
is 4,000 bytes.

Data Types
NUMBER
Numeric data type.
Can contain integer or floating point numbers
only.
The syntax : NUMBER(precision, scale)
where precision is the total number of significant
digits (the decimal point is not counted) and scale is the
number of digits to the right of the decimal point.
For example, NUMBER(6,2) can hold a
number between -9999.99 and 9999.99.
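
In Oracle, precision counts the significant digits and scale counts the digits after the decimal point, so the range of a NUMBER(precision, scale) column follows directly from the two values. The Python sketch below works it out. The helper name number_range is an assumption for illustration and covers only the common case 0 <= scale <= precision.

```python
from decimal import Decimal


def number_range(precision, scale):
    """Smallest and largest values that fit NUMBER(precision, scale)."""
    step = Decimal(1).scaleb(-scale)  # smallest increment, e.g. 0.01 for scale 2
    largest = Decimal(10) ** (precision - scale) - step
    return -largest, largest


low, high = number_range(6, 2)
print(low, high)  # -9999.99 9999.99
```

NUMBER(6,2) leaves four digits before the decimal point and two after it, which gives the limits printed above.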

Data Types
DATE
Date and Time data type.
Can contain a date and time portion in the
format: DD-MON-YY HH:MI:SS.
No additional information is needed when
specifying the DATE data type.
If no time is given, the time of 00:00:00 is used as a default.
The output format of the date and time can be
modified.

Data Types
RAW
Free form binary data.
Can contain binary data up to 255 bytes.
Data type LONG RAW can contain up to 2
gigabytes of binary data.
LONG RAW data cannot be indexed, and RAW and LONG RAW
data cannot be displayed or queried in
SQL*Plus.
Only one LONG RAW column is allowed per table.

Data Types
LOB
Large Object data types.
These include BLOB (Binary Large OBject)
and CLOB (Character Large OBject).
More than one LOB column can appear in a
table.
These data types are the preferred method
for storing large objects such as text
documents (CLOB), images, or video (BLOB).

Create Table (cont)

Create Table (cont.)


schema - is the schema to contain the table.
If you omit schema, ORACLE creates the table in your own
schema.
table - is the name of the table to be created.
column - specifies the name of a column of the table. The
number of columns in a table can range from 1 to 254.
datatype - is the datatype of a column. Datatypes are
defined previously in this manual.

Create Table (cont.)


DEFAULT - specifies a value to be assigned to the
column if a subsequent INSERT statement omits a value
for the column.
The datatype of the expression must match the datatype
of the column.
A DEFAULT expression cannot contain references to
other columns.
column_constraint - defines an integrity constraint as
part of the column definition.
table_constraint - defines an integrity constraint as part
of the table definition.

Create Table (cont.)


PCTFREE - specifies the percentage of space in each of
the table's data blocks reserved for future updates to the
table's rows.
PCTFREE has the same function in the commands that
create and alter clusters, indexes, snapshots, and
snapshot logs.
The combination of PCTFREE and PCTUSED
determines whether inserted rows will go into existing
data blocks or into new blocks.
These parameters need not be set as the default values
will be sufficient for your purpose.

Create Table (cont.)


AS subquery - inserts the rows returned by the subquery
into the table upon its creation.
After creating a table, you can define additional columns
and integrity constraints with the ADD clause of the ALTER
TABLE command.
You can change the definition of an existing column with
the MODIFY clause of the ALTER TABLE command.
To modify an integrity constraint, you must drop the
constraint and redefine it.

Create View
This directive defines a view, a logical table based on one or
more tables or views.
To create a view in your own schema, you must have the
CREATE VIEW system privilege.
The owner of the schema containing the view must have the
privileges necessary to either select, insert, update or delete
rows from all the tables or views on which the view is based.

Create View

CREATE VIEW clerk (id_number, person, department, position)
AS SELECT empno, ename, deptno, job
FROM emp
WHERE job = 'CLERK'
WITH CHECK OPTION CONSTRAINT wco
Because of the CHECK OPTION, you cannot subsequently insert a new row into
CLERK if the new employee is not a clerk.

Create View (cont.)

schema - is the schema to contain the view. If you omit schema,
ORACLE creates the view in your own schema.
OR REPLACE - recreates the view if it already exists. You can use this
option to change the definition of an existing view without dropping,
recreating, and regranting object privileges previously granted to it.
FORCE - creates the view regardless of whether the view's base tables
exist or the owner of the schema containing the view has privileges on
them. Note that both of these conditions must be true before any
SELECT, INSERT, UPDATE, or DELETE statements can be issued
against the view.
NOFORCE - creates the view only if the base tables exist and the owner
of the schema containing the view has privileges on them. The default is
NOFORCE.

Create View (cont.)

view - is the name of the view.


alias - specifies names for the expressions selected by the view's
query. The number of aliases must match the number of
expressions selected by the view.
AS subquery - identifies columns and rows of the table(s) that the
view is based on. A view's query can be any SELECT statement
without the ORDER BY or FOR UPDATE clauses.
WITH CHECK OPTION - specifies that inserts and updates
performed through the view must result in rows that the view query
can select. The CHECK OPTION cannot make this guarantee if
there is a subquery in the query of this view or any view on which
this view is based.
CONSTRAINT - is the name assigned to the CHECK OPTION
constraint. If you omit this identifier, ORACLE automatically assigns
the constraint a name of the form SYS_Cn.

Create INDEX
This directive permits the user to create an index on one or
more columns of a table or a cluster.
An index is a database object that contains an entry for each
value that appears in the indexed column(s) of the table or
cluster and provides direct, fast access to rows.

To create an index in your own schema, you must have
either space quota on the tablespace to contain the index or
UNLIMITED TABLESPACE system privilege.
schema - is the schema to contain the index. If you omit
schema, ORACLE creates the index in your own schema.
index - is the name of the index to be created.

Create INDEX
table - is the name of the table for which the index is to
be created.
column - is the name of a column in the table. An index
can have as many as 16 columns. A column of an index
cannot be of datatype LONG or LONG RAW.
ASC DESC - are allowed for DB2 syntax compatibility,
although indexes are always created in ascending order.
NOSORT - indicates to ORACLE that the rows are
stored in the database in ascending order and therefore
ORACLE does not have to sort the rows when creating
the index.
CREATE INDEX i_emp_ename ON emp (ename)

Constraint clause
The CONSTRAINT command is used to define an integrity
constraint.
CONSTRAINT clauses can appear in either CREATE TABLE or
ALTER TABLE commands.
CONSTRAINT - identifies the integrity constraint by the name
constraint.
ORACLE stores this name in the data dictionary along with the
definition of the integrity constraint.
NULL - specifies that a column can contain null values.

Constraint clause
NOT NULL - specifies that a column cannot contain null
values.
UNIQUE - designates a column or combination of
columns as a unique key.
PRIMARY KEY - designates a column or combination of
columns as the table's primary key.

Constraint clause
FOREIGN KEY - designates a column or combination of
columns as the foreign key in a referential integrity constraint.
REFERENCES - identifies the primary or unique key that is
referenced by a foreign key in a referential integrity constraint.
ON DELETE CASCADE - specifies that ORACLE
maintains referential integrity by automatically removing
dependent foreign key values if you remove a referenced
primary or unique key value.

Constraint clause
CHECK - specifies a condition that each row in the table must
satisfy.
USING INDEX - specifies parameters for the index ORACLE
uses to enforce a UNIQUE or PRIMARY KEY constraint.
Only use this clause when enabling UNIQUE and PRIMARY
KEY constraints.
EXCEPTIONS INTO - identifies a table into which
ORACLE places information about rows that violate an
enabled integrity constraint. This table must exist before you
use this option.
DISABLE - disables the integrity constraint. If an integrity
constraint is disabled, ORACLE does not enforce it.

Constraints (cont)
Defining Integrity Constraints - To define an integrity
constraint, include a CONSTRAINT clause in a CREATE
TABLE or ALTER TABLE statement.
The CONSTRAINT clause has two syntactic forms:
table_constraint syntax - is part of the table definition.
An integrity constraint defined with this syntax can impose rules
on any columns in the table.
This syntax can define any type of integrity constraint except a
NOT NULL constraint.

Constraints (cont)
column_constraint syntax - is part of a column definition.
In most cases, an integrity constraint defined with this syntax
can only impose rules on the column in which it is defined.
Column_constraint syntax that appears in a CREATE TABLE
statement can define any type of integrity constraint.
Column_constraint syntax that appears in an ALTER TABLE
statement can only define or remove a NOT NULL constraint.

Constraints (cont)
The table_constraint syntax and the column_constraint
syntax are simply different syntactic means of defining
integrity constraints.
There is no functional difference between an integrity
constraint defined with table_constraint syntax and the
same constraint defined with column_constraint syntax.

Constraints (cont.)
NOT NULL constraint - specifies that a column cannot
contain nulls.
To satisfy this constraint, every row in the table must contain
a value for the column.
The NULL keyword indicates that a column can contain nulls
(this is the default).
It does not actually define an integrity constraint.

Constraints (cont.)
You can only specify NOT NULL or NULL with
column_constraint syntax in a CREATE TABLE or ALTER
TABLE statement, not with table_constraint syntax.

ALTER TABLE emp
MODIFY (sal NUMBER CONSTRAINT nn_sal NOT NULL);
NN_SAL ensures that no employee in the table has a null salary.

Constraints (cont.)
UNIQUE constraint - designates a column or combination of
columns as a unique key.
To satisfy a UNIQUE constraint, no two rows in the table can have
the same value for the unique key.
However, the unique key made up of a single column can contain
nulls.
A unique key column cannot be of data type LONG or LONG
RAW.
You cannot designate the same column or combination of columns
as both a unique key and a primary key.
However, you can designate the same column or combination of
columns as both a unique key and a foreign key.

Constraints (cont.)
You can define a unique key on a single column with
column_constraint syntax.
The constraint below ensures that no two departments in the
table have the same name.
However, the constraint does allow departments without
names.
CREATE TABLE dept
(deptno NUMBER,
dname VARCHAR2(9) CONSTRAINT
unq_dname UNIQUE,
loc VARCHAR2(10) )
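
The behaviour of this constraint can be tried out with Python's built-in sqlite3 module. This is a sketch only: SQLite stands in for Oracle (it accepts the Oracle type names but treats them loosely), and the inserted rows are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dept ("
    " deptno NUMBER,"
    " dname VARCHAR2(9) CONSTRAINT unq_dname UNIQUE,"
    " loc VARCHAR2(10))"
)
conn.execute("INSERT INTO dept VALUES (10, 'SALES', 'BOSTON')")
conn.execute("INSERT INTO dept VALUES (20, NULL, 'DALLAS')")   # no name: allowed
conn.execute("INSERT INTO dept VALUES (30, NULL, 'CHICAGO')")  # a second NULL is allowed too

try:
    conn.execute("INSERT INTO dept VALUES (40, 'SALES', 'NEW YORK')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True  # unq_dname rejects the repeated name

row_count = conn.execute("SELECT COUNT(*) FROM dept").fetchone()[0]
print(duplicate_rejected, row_count)  # True 3
```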

Constraints (cont.)
PRIMARY KEY constraint - designates a column or
combination of columns as a table's primary key.
To satisfy a PRIMARY KEY constraint, both of these
conditions must be true:
no primary key value can appear in more than one
row in the table.
no column that is part of the primary key can
contain null.

Constraints (cont.)
You can use the column_constraint syntax to define a primary
key on a single column.
The constraint below ensures that no two departments in the
table have the same department number and that no
department number is NULL.

CREATE TABLE dept
(deptno NUMBER CONSTRAINT pk_dept PRIMARY KEY,
dname VARCHAR2(9),
loc VARCHAR2(10) )

Constraints (cont.)
REFERENTIAL INTEGRITY constraint - designates a
column or combination of columns as a foreign key and
establishes a relationship between that foreign key and a specified
primary or unique key, called the referenced key.
In this relationship, the table containing the foreign key is called
the child table and the table containing the referenced key is
called the parent table.
The child and parent tables must be on the same database.
They cannot be on different nodes of a distributed database.
The foreign key and the referenced key can be in the same
table. In this case, the parent and child tables are the same.

Constraints (cont.)
To satisfy a referential integrity constraint, each row of the
child table must meet one of these conditions:
The value of the row's foreign key must appear as a
referenced key value in one of the parent table's rows.
The row in the child table is said to depend on the
referenced key in the parent table.
The value of one of the columns that makes up the foreign key
must be null.
A referential integrity constraint is defined in the child table. A
referential integrity constraint definition can include any of
these keywords:

Constraints (cont.)

FOREIGN KEY - identifies the column or combination of columns in
the child table that makes up the foreign key.
Only use this keyword when you define a foreign key with a table
constraint clause.
REFERENCES - identifies the parent table and the column or
combination of columns that make up the referenced key.
The referenced key columns must be of the same number and data
types as the foreign key columns.
ON DELETE CASCADE - allows deletion of referenced key values
in the parent table that have dependent rows in the child table and
causes ORACLE to automatically delete dependent rows from the
child table to maintain referential integrity.
If you omit this option, ORACLE forbids deletion of referenced key
values in the parent table that have dependent rows in the child
table.
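
A referential integrity constraint with ON DELETE CASCADE can be tried out with Python's built-in sqlite3 module. This is a sketch under stated assumptions: SQLite stands in for Oracle and enforces foreign keys only after the PRAGMA shown, and the constraint name fk_deptno and the sample rows are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE dept (deptno INTEGER PRIMARY KEY, dname TEXT)")  # parent table
conn.execute(
    "CREATE TABLE emp ("  # child table
    " empid INTEGER PRIMARY KEY,"
    " deptno INTEGER,"
    " CONSTRAINT fk_deptno FOREIGN KEY (deptno)"
    " REFERENCES dept (deptno) ON DELETE CASCADE)"
)
conn.execute("INSERT INTO dept VALUES (4001, 'SHOES')")
conn.execute("INSERT INTO emp VALUES (1, 4001)")  # matches a parent row
conn.execute("INSERT INTO emp VALUES (2, NULL)")  # a null foreign key also satisfies the constraint

try:
    conn.execute("INSERT INTO emp VALUES (3, 9999)")  # no such department
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

conn.execute("DELETE FROM dept WHERE deptno = 4001")  # cascades to employee 1
remaining = [row[0] for row in conn.execute("SELECT empid FROM emp ORDER BY empid")]
print(orphan_rejected, remaining)  # True [2]
```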

SQL statement processing order

Using and Defining Views

Views simplify query commands and provide users
controlled access to tables
Base table: table containing the raw data

Using and Defining Views


Dynamic View
A virtual table created dynamically upon request by a user
No data actually stored; instead data from base table made
available to user
Based on SQL SELECT statement on base tables or other views
You can think of views as stored queries

Materialized View
Copy or replication of data
Data actually stored
Must be refreshed periodically to match the corresponding base
tables

Sample CREATE VIEW


CREATE VIEW EXPENSIVE_STUFF_V AS
SELECT PRODUCT_ID, PRODUCT_NAME,
UNIT_PRICE
FROM PRODUCT_T
WHERE UNIT_PRICE >300
WITH CHECK OPTION;

Sample CREATE VIEW

View has a name
View is based on a SELECT statement
CHECK OPTION: applied to updatable view

Advantages of Views
Simplify query commands
Assist with data security (but don't rely on views for
security; there are more important security measures)
Enhance programming productivity
Contain most current base table data
Use little storage space
Provide customized view for user
Establish physical data independence

Disadvantages of Views

Use processing time each time view is
referenced
May or may not be directly updateable

DML--Insert Statement
Adds data to a table
Inserting into a table: every attribute is
supplied
INSERT INTO CUSTOMER_T
VALUES
(001, 'Contemporary Casuals', '1355 S. Himes Blvd.', 'Gainesville', 'FL', 32601);

DML--Insert Statement
Inserting a record that has some null attributes requires
entering null explicitly for the empty fields or identifying
the fields that actually get data
INSERT INTO PRODUCT_T
(PRODUCT_ID, PRODUCT_DESCRIPTION,
PRODUCT_FINISH, STANDARD_PRICE,
PRODUCT_ON_HAND)
VALUES (1, 'End Table', 'Cherry', 175, 8);

DML--Insert Statement
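Using SQL*Plus substitution variables (&name), which prompt for each value at run time:
INSERT INTO PRODUCT_T
(PRODUCT_ID, PRODUCT_DESCRIPTION,
PRODUCT_FINISH, STANDARD_PRICE,
PRODUCT_ON_HAND)
VALUES (&PRODUCT_ID, '&PRODUCT_DESCRIPTION', '&PRODUCT_FINISH', &STANDARD_PRICE, &PRODUCT_ON_HAND);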

DML--Insert Statement
Inserting from another table:
INSERT INTO CA_CUSTOMER_T
SELECT *
FROM CUSTOMER_T
WHERE STATE = 'CA';

DML--Delete Statement
Delete all rows
DELETE FROM CUSTOMER_T;

Be cautious: always use a SELECT statement
to display the records first to make sure
only desired records are to be deleted!
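
The advice to look before deleting can be sketched with Python's built-in sqlite3 module: run a SELECT with the same condition first, then run the DELETE. SQLite stands in for the DBMS here, and the simplified customer_t table and its rows are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_t (customer_id INTEGER, state TEXT)")
conn.executemany(
    "INSERT INTO customer_t VALUES (?, ?)",
    [(1, "FL"), (2, "CA"), (3, "CA")],
)
conn.commit()

# Step 1: display the rows that the condition selects.
preview = conn.execute(
    "SELECT customer_id FROM customer_t WHERE state = ? ORDER BY customer_id",
    ("CA",),
).fetchall()

# Step 2: delete with exactly the same condition.
cursor = conn.execute("DELETE FROM customer_t WHERE state = ?", ("CA",))
deleted = cursor.rowcount
conn.commit()

left = conn.execute("SELECT COUNT(*) FROM customer_t").fetchone()[0]
print(preview, deleted, left)  # [(2,), (3,)] 2 1
```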

DML--Update Statement

Modifies data in existing rows
UPDATE PRODUCT_T SET UNIT_PRICE = 775 WHERE PRODUCT_ID = 7;
You can use the ROLLBACK command to cancel
the update operation, or the COMMIT command
to make it permanent.

Question & Answer


1. SQL statements can be on one or more lines
a) True
b) False.
2. Each SQL statement (not line) ends with a semicolon
a) True
b) False.
3. Data Definition Language commands allow you to perform
create, alter, and drop objects
a) True
b) False.
4. Keywords cannot be split across lines
a) True
b) False.
5. Column_constraint - defines an integrity constraint as part of
the column definition.
a) True
b) False

Question & Answer


6. REFERENTIAL INTEGRITY constraint designates a column or
combination of columns as
a) Foreign Key
b) Primary Key
c) Unique Key
d) None

7. NOT NULL constraint specifies that a column has
a) Unique Values
b) Not Null Values
c) Null Values
d) Both a and c

Question & Answer


8. Update command is used to modify
a) Data from the Table.
b) Structure of the Table.
c) Remove column from the Table
d) None of the above
9. Insert command is used to
a) Store data into the Table
b) Save data permanently into the Table.
c) Modify data into the Table
d) All of the above

Question & Answer


10. DELETE FROM Table Name
a) Delete a particular row from the Table.
b) Delete all the rows from the Table.
c) Remove primary key from the table.
d) Remove the structure of the Table.
11. Unique Key column can accept null value
a) True
b) False

Question & Answer


12. Commit command makes all the changes permanent
a) True
b) False


PAN African e-Network Project


PGDIT
Data Base Management System
Semester II
Session - 6
Gaurav Dubey

Basic Select Statement :


Select Statement with Where Clause
Select statement with Distinct
Group By Statement

Group Functions :
COUNT( ) , SUM( ) , AVG( ) , MIN( ) , MAX( )
Having Clause
ORDER By Clause

General Structure
SELECT ...... FROM ......
WHERE ......
SELECT [ALL / DISTINCT] expr1 [AS col1], expr2 [AS
col2]
FROM tablename WHERE condition

Using the DISTINCT keyword


To select ALL values from the column named "Company" we use
a SELECT statement like this:
SELECT Company FROM Orders

Orders
Company      OrderNumber
Sega         3412
W3Schools    2312
Trio         4678
W3Schools    6798

Result
Company
Sega
W3Schools
Trio
W3Schools

Note that "W3Schools" is listed twice in the result-set.


To select only DIFFERENT values from the column named
"Company" we use a SELECT DISTINCT statement like this:
SELECT DISTINCT Company FROM Orders

Orders
Company      OrderNumber
Sega         3412
W3Schools    2312
Trio         4678
W3Schools    6798

Result
Company
Sega
W3Schools
Trio
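
The two queries can be run against the same Orders data with Python's built-in sqlite3 module. This is a sketch only: SQLite stands in for the DBMS, and an ORDER BY is added to the second query purely to make its output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (Company TEXT, OrderNumber INTEGER)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?)",
    [("Sega", 3412), ("W3Schools", 2312), ("Trio", 4678), ("W3Schools", 6798)],
)

# Without DISTINCT every row contributes a value, duplicates included.
all_rows = [r[0] for r in conn.execute("SELECT Company FROM Orders")]

# With DISTINCT each company appears once.
distinct_rows = [
    r[0]
    for r in conn.execute("SELECT DISTINCT Company FROM Orders ORDER BY Company")
]

print(len(all_rows), distinct_rows)  # 4 ['Sega', 'Trio', 'W3Schools']
```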

General Structure
SELECT [ALL / DISTINCT] expr1 [AS col1], expr2 [AS col2]
FROM tablename WHERE condition

The query will select rows from the source tablename
and output the result in table form.
Expressions expr1, expr2 can be :
(1) a column, or
(2) an expression of functions and fields.
And col1, col2 are their corresponding column
names in the output table.

General Structure
SELECT [ALL / DISTINCT] expr1 [AS col1], expr2 [AS col2]
FROM tablename WHERE condition
DISTINCT will eliminate duplication in the output
while ALL will keep all duplicated rows.
condition can be :
(1) an inequality, or
(2) a string comparison
using logical operators AND, OR, NOT.

Syntax for SELECT statement


Clauses must be written in the following order
SELECT
FROM
WHERE
GROUP BY
HAVING
ORDER BY

Order that the DBMS processes a SELECT statement
Step 1: FROM
Step 2: WHERE
Step 3: GROUP BY
Step 4: HAVING
Step 5: SELECT (this must be "written" first)
Step 6: ORDER BY
At each step the DBMS keeps track of the "interim result set",
which is then further refined by the next step

Step 1: FROM clause


SQL 92 Syntax (i.e. JOIN/ON in FROM clause)
generate Cartesian product (ie "cross join") of (1) the
first table and (2) the table in the first JOIN clause
filter out records from Cartesian product that don't
match the first ON clause
generate Cartesian product of (1) the results so far and
(2) the table in the next JOIN clause
filter out records from Cartesian product that don't
match the associated "ON" clause
keep proceeding in this way until all tables are joined
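
The "Cartesian product, then filter" description of a join can be mimicked in a few lines. This is a conceptual sketch under stated assumptions: the emp and dept tables and their rows are hypothetical, Python's sqlite3 module stands in for the DBMS, and a real query optimizer normally avoids building the full product.

```python
import sqlite3
from itertools import product

# Hypothetical sample tables (not from the slides).
emp = [(1, "Ann", 10), (2, "Bob", 20), (3, "Cy", 10)]   # (emp_id, name, dept_id)
dept = [(10, "Sales"), (20, "HR"), (30, "IT")]          # (dept_id, dept_name)

# Conceptual step: pair every emp row with every dept row.
cross = list(product(emp, dept))
# Conceptual step: keep only the pairs that satisfy the ON condition.
joined = [e + d for e, d in cross if e[2] == d[0]]

# The same join expressed with JOIN ... ON in SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (emp_id INTEGER, name TEXT, dept_id INTEGER)")
conn.execute("CREATE TABLE dept (dept_id INTEGER, dept_name TEXT)")
conn.executemany("INSERT INTO emp VALUES (?, ?, ?)", emp)
conn.executemany("INSERT INTO dept VALUES (?, ?)", dept)
sql_rows = conn.execute(
    "SELECT * FROM emp JOIN dept ON emp.dept_id = dept.dept_id"
).fetchall()

print(len(cross), len(joined))
print(sorted(joined) == sorted(sql_rows))
```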

Step 2: WHERE clause


Filter the "interim result set" generated in Step 1 by discarding
(i.e. "throwing out") records that don't match the conditions in the
WHERE clause.
Each record in the interim result set is examined separately and the
WHERE condition is evaluated for it.
If the condition evaluates to TRUE for a row, the row is kept. If it
evaluates to FALSE (or to unknown, because of a NULL), the row is
"thrown away".

Step 3: GROUP BY
Create separate groups of rows that match in
all of the values listed in the GROUP BY list.
There may be a single group for all records in
the interim result set or there may be many
groups.
There is ALWAYS at least one group.

Step 4 & 5
Step 4: HAVING
Filter out all groups that don't match the conditions in the HAVING clause.
Step 5: SELECT
Figure out what values will actually be included in the final result set by
processing the SELECT clause.

Step 6: ORDER BY
Sort the result set in the order specified in the ORDER BY clause.

WHERE vs. HAVING

Similarities:
The WHERE and HAVING clauses are both used to exclude records from the
result set.
Differences:
WHERE clause
The WHERE clause is processed before the groups are created.
Therefore, the WHERE clause can refer to any value in the original tables.

HAVING clause
The HAVING clause is processed after the groups are created.
Therefore, the HAVING clause can only refer to aggregate information for
the group (including fields that are part of the GROUP BY clause).
The HAVING clause CANNOT refer to individual columns from a table that
are not also part of the group.
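
A small runnable sketch of the difference, assuming Python's sqlite3 module and five sample rows taken from the student table used later in this session. WHERE discards individual rows before grouping, HAVING discards whole groups afterwards, and an aggregate placed in WHERE is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, sex TEXT, class TEXT, mtest INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [("Peter", "M", "1A", 70), ("Mary", "F", "1A", 92), ("Johnny", "M", "1A", 91),
     ("Wendy", "F", "1B", 84), ("Tobe", "M", "1B", 88)],
)

# WHERE removes the girls' rows first; HAVING then removes whole groups.
rows = conn.execute(
    "SELECT class, AVG(mtest) FROM student "
    "WHERE sex = 'M' GROUP BY class HAVING AVG(mtest) > 85 ORDER BY class"
).fetchall()
print(rows)

# An aggregate cannot appear in WHERE, because no groups exist at that step.
try:
    conn.execute("SELECT class FROM student WHERE AVG(mtest) > 85 GROUP BY class")
    aggregate_in_where_rejected = False
except sqlite3.Error:
    aggregate_in_where_rejected = True
print(aggregate_in_where_rejected)
```

With these rows the boys of 1A average 80.5 and the boys of 1B average 88, so only class 1B survives the HAVING filter.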

General Structure
eg. 1

List all the student records.

SELECT * FROM student;


Result

id    name    dob       sex  class  mtest  hcode  dcode  remission
9801  Peter   06/04/86  M    1A     70     R      SSP    .F.
9802  Mary    01/10/86  F    1A     92     Y      HHM    .F.
9803  Johnny  03/16/86  M    1A     91     G      SSP    .T.
9804  Wendy   07/09/86  F    1B     84     B      YMT    .F.
9805  Tobe    10/17/86  M    1B     88     R      YMT    .F.
:     :       :         :    :      :      :      :      :

General Structure
eg. 2

List the names and house codes of 1A students.

SELECT name, hcode, class FROM student
WHERE class="1A";

Result

name    hcode  class
Peter   R      1A
Mary    Y      1A
Johnny  G      1A
Luke    G      1A
Bobby   B      1A
Aaron   R      1A
:       :      :

General Structure
eg. 3

List the residential districts of the Red House members.

SELECT DISTINCT dcode FROM student
WHERE hcode="R";

Result

dcode
HHM
KWC
MKK
SSP
TST
YMT

General Structure
eg. 4

List the names and ages (1 d.p.) of 1B girls.

Condition for "1B Girls":
1) class = "1B"
2) sex = "F"
3) Both (AND operator)

What is "age"?
Functions:
# days  : DATE( ) - dob
# years : (DATE( ) - dob) / 365
1 d.p.  : ROUND(__ , 1)

SELECT name, ROUND((DATE( )-dob)/365,1) AS age
FROM student WHERE class="1B" AND sex="F";

Result

name   age
Wendy  12.1
Kitty  11.5
Janet  12.4
Sandy  12.3
Mimi   12.2
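
Date arithmetic is not portable across database products, so the DATE( ) - dob expression above will not run everywhere. The sketch below shows one way to get the same "days / 365, rounded to 1 d.p." figure. It assumes SQLite (through Python's sqlite3 module), dates stored as ISO text, and a fixed, hypothetical reference date in place of "today" so that the result is repeatable.

```python
import sqlite3
from datetime import date

REFERENCE_DATE = "1998-09-01"  # hypothetical "today", fixed for repeatability

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, dob TEXT, sex TEXT, class TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [("Wendy", "1986-07-09", "F", "1B"), ("Tobe", "1986-10-17", "M", "1B"),
     ("Mary", "1986-01-10", "F", "1A")],
)

# julianday() converts a date to a day count, so the difference is in days.
ages = conn.execute(
    "SELECT name, ROUND((julianday(?) - julianday(dob)) / 365, 1) AS age "
    "FROM student WHERE class = '1B' AND sex = 'F'",
    (REFERENCE_DATE,),
).fetchall()

# The same arithmetic in plain Python, as a cross-check.
expected_age = round((date(1998, 9, 1) - date(1986, 7, 9)).days / 365, 1)
print(ages, expected_age)
```

Dividing by 365 ignores leap years, exactly as the slide's formula does, so the age is an approximation.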

Comparison
eg. 5

List the students who were born on Wednesdays or Saturdays.

SELECT name, class, CDOW(dob) AS bdate
FROM student
WHERE DOW(dob) IN (4,7);

Result

name   class  bdate
Peter  1A     Wednesday
Wendy  1B     Wednesday
Kevin  1C     Saturday
Luke   1A     Wednesday
Aaron  1A     Saturday
:      :      :
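
Day-of-week numbering differs between products. In the query above, DOW( ) counts Sunday as 1, so Wednesday is 4 and Saturday is 7. As a hedged comparison, the sketch below runs the same filter in SQLite (via Python's sqlite3 module, an assumption of this example), where strftime('%w', ...) counts Sunday as 0. The five birthdays are taken from the student table shown earlier, rewritten in ISO format.

```python
import sqlite3
from datetime import date

students = [("Peter", "1986-06-04"), ("Mary", "1986-01-10"), ("Johnny", "1986-03-16"),
            ("Wendy", "1986-07-09"), ("Tobe", "1986-10-17")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, dob TEXT)")
conn.executemany("INSERT INTO student VALUES (?, ?)", students)

# SQLite: strftime('%w', d) gives '0' (Sunday) .. '6' (Saturday),
# so Wednesday is 3 and Saturday is 6 here.
born_wed_or_sat = [r[0] for r in conn.execute(
    "SELECT name FROM student "
    "WHERE CAST(strftime('%w', dob) AS INTEGER) IN (3, 6) ORDER BY name"
)]

# Cross-check in Python: isoweekday() is 3 for Wednesday and 6 for Saturday.
check = sorted(n for n, d in students if date.fromisoformat(d).isoweekday() in (3, 6))
print(born_wed_or_sat, check)
```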

Comparison
eg. 6

List the students who were not born in January, March, June or September.

SELECT name, class, dob FROM student
WHERE MONTH(dob) NOT IN (1,3,6,9);

Result

name   class  dob
Wendy  1B     07/09/86
Tobe   1B     10/17/86
Eric   1C     05/05/87
Patty  1C     08/13/87
Kevin  1C     11/21/87
Bobby  1A     02/16/86
Aaron  1A     08/02/86
:      :      :

Comparison
eg. 7

List the 1A students whose Math test score is between 80 and 90 (incl.)

SELECT name, mtest FROM student
WHERE class="1A" AND
mtest BETWEEN 80 AND 90;

Result

name   mtest
Luke   86
Aaron  83
Gigi   84

Comparison
eg. 8

List the students whose names start with "T".


SELECT name, class FROM student
WHERE name LIKE "T%"

Result

name   class
Tobe   1B
Teddy  1B
Tim    2A

Comparison
eg. 9

List the Red house members whose names contain "a" as the 2nd letter.

SELECT name, class, hcode FROM student
WHERE name LIKE "_a%" AND hcode="R";
Result

name   class  hcode
Aaron  1A     R
Janet  1B     R
Paula  2A     R
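
The two LIKE wildcards used in eg. 8 and eg. 9 are % (any run of characters, including none) and _ (exactly one character). A minimal sketch, assuming SQLite through Python's sqlite3 module and a handful of names and house codes taken from the result tables in this session:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, hcode TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [("Tobe", "R"), ("Aaron", "R"), ("Janet", "R"),
     ("Paula", "R"), ("Peter", "R"), ("Mary", "Y")],
)

# 'T%'  : names that start with T.
starts_with_t = [r[0] for r in conn.execute(
    "SELECT name FROM student WHERE name LIKE 'T%' ORDER BY name")]

# '_a%' : any single first character, then 'a', then anything.
second_letter_a_red = [r[0] for r in conn.execute(
    "SELECT name FROM student WHERE name LIKE '_a%' AND hcode = 'R' ORDER BY name")]

print(starts_with_t)
print(second_letter_a_red)
```

Mary also has "a" as her second letter, but she is in the Yellow house, so the AND condition excludes her. Note that SQLite's LIKE ignores case for ASCII letters by default; other products may differ.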

Grouping
SELECT ...... FROM ...... WHERE condition
GROUP BY groupexpr [HAVING requirement]

Group functions:
COUNT( ), SUM( ), AVG( ), MAX( ), MIN( )
groupexpr specifies the related rows to be
grouped as one entry. Usually it is a column.
WHERE condition specifies the condition of
individual rows before the rows are grouped.
HAVING requirement specifies the condition
involving the whole group.

Grouping
eg. 10 List the number of students of each class.

[Figure: the student table grouped by class (1A, 1B, 1C), with COUNT( ) applied to each group]

Grouping
eg. 11 List the number of students of each class.

SELECT class, COUNT(*) FROM student
GROUP BY class;

Result

class  cnt
1A     10
1B     9
1C     9
2A     8
2B     8
2C     6

Grouping
eg. 11 List the average Math test score of each class.

[Figure: the student table grouped by class (1A, 1B, 1C), with AVG( ) applied to each group]

Grouping
eg. 12 List the average Math test score of each class.

SELECT class, AVG(mtest) FROM student
GROUP BY class;

Result

class  avg_mtest
1A     85.90
1B     70.33
1C     37.89
2A     89.38
2B     53.13
2C     32.67

Grouping
eg. 13 List the number of girls of each district.

SELECT dcode, COUNT(*) FROM student
WHERE sex="F" GROUP BY dcode;

Result

dcode  cnt
HHM    6
KWC    1
MKK    1
SSP    5
TST    4
YMT    8

Grouping
eg. 14 List the max. and min. test score of Form 1
students of each district.

SELECT MAX(mtest), MIN(mtest), dcode FROM student
WHERE class LIKE "1_" GROUP BY dcode;

Result

max_mtest  min_mtest  dcode
92         36         HHM
91         19         MKK
91         31         SSP
92         36         TST
75         75         TSW
88         38         YMT

Display Order
eg. 15 List the boys of class 1A, ordered by their names.

SELECT name, id FROM student
WHERE sex="M" AND class="1A" ORDER BY name;

Before ORDER BY

name    id
Peter   9801
Johnny  9803
Luke    9810
Bobby   9811
Aaron   9812
Ron     9813

Result (after ORDER BY name)

name    id
Aaron   9812
Bobby   9811
Johnny  9803
Luke    9810
Peter   9801
Ron     9813

Contact Table
CREATE TABLE contacts (
ContactID  NUMBER(10) PRIMARY KEY,
Name       VARCHAR(40),
LOCALITY   VARCHAR(60),
CITY       VARCHAR(20),
Company    VARCHAR(60),
Phone      VARCHAR(11),
URL        VARCHAR(80),
Age        NUMBER(3),
Height     NUMBER(5,2),
Birthday   DATE
);

A few simple SELECT Statements


SELECT *
FROM contacts;
Display all records in the contacts table
* represents all the columns of the specified table

A few simple SELECT Statements


SELECT contactid , name
FROM contacts;
Display only the record number and names
Only specified column name will be displayed

A few simple SELECT Statements


SELECT DISTINCT url
FROM contacts;
Display only one entry for every value of URL

Refining selections with WHERE


The WHERE sub clause allows you to select records
based on a condition.
SELECT *
FROM contacts
WHERE age < 10;

Display records from contacts where age<10

Refining selections with WHERE


SELECT *
FROM contacts
WHERE age BETWEEN 18 AND 35;
Display records where age is 18-35

Additional selections
The LIKE condition
Allows you to match strings against a pattern

SELECT *
FROM contacts
WHERE name LIKE 'J%';
Display records where the name starts with J

Additional selections
SELECT *
FROM contacts
WHERE url LIKE '%.com';
Display records where url ends in .com

Other SELECT examples


SELECT * FROM contacts
WHERE name is NULL;
SELECT * FROM contacts
WHERE zip IN (14454,12345);

Other SELECT examples


SELECT *
FROM contacts
WHERE zip IN (
SELECT zip
FROM address
WHERE state='NY'
);

GROUP BY Function
The GROUP BY clause allows you to group results
together with aggregate functions

AVG(),
COUNT(),
MAX(),
MIN(),
SUM()
COUNT DISTINCT

HAVING Clause

HAVING allows you to filter the GROUP BY results

GROUP BY Examples
SELECT company, count(contactid)
FROM contacts
GROUP BY company;

GROUP BY Examples
SELECT company, Avg(Age)
FROM contacts
GROUP BY company;

GROUP BY Examples
SELECT company, MAX (Height)
FROM contacts
GROUP BY company;

Example: HAVING Clause

SELECT company, count(company)


FROM contacts
GROUP BY company
HAVING count(company) > 5;

Example: HAVING Clause


SELECT company, Avg(Age)
FROM contacts
GROUP BY company
HAVING Avg(Age) <=35 ;

ORDER BY
The ORDER BY clause allows you to sort the
results returned by SELECT.
SELECT * FROM contacts
ORDER BY company;
SELECT * FROM contacts
ORDER BY company, name;

HAVING Clause
Without WHERE clause
There is a HAVING clause but no WHERE clause:
The GROUP BY clause works to group several rows from the
original table together to get aggregate information about the
group.
The HAVING clause eliminates some of the resulting rows of
aggregate information.

HAVING Clause
Without WHERE clause

SELECT vendorId, avg(PaymentTotal) as avgPaymentTotal
FROM invoices
GROUP BY vendorId
HAVING avg(PaymentTotal) <=10
ORDER BY avgPaymentTotal;

Processing the select without WHERE

Step 1: Create the groups based on the GROUP BY
Step 2: Generate the aggregate information (e.g. avg) for each group.

Table: Invoices

VendorId  PaymentTotal  InvoiceTotal  Group
001       20                          group1
001       10            20            group1
001       15            1500          group1
002       10            10            group2
002       20            4000          group2
002       30            4000          group2

Final Results: one column, avgPaymentTotal

Adding a WHERE clause to the example

Same select statement with WHERE


We will now examine what happens when we add a
WHERE clause to the same SELECT statement we used
above.
SELECT vendorId, avg(PaymentTotal) as
avgPaymentTotal
FROM invoices
WHERE invoiceTotal < 1000
GROUP BY vendorId
HAVING avg(PaymentTotal) <=10
ORDER BY avgPaymentTotal;

Processing the select with where


Step 1: Process WHERE clause to eliminate some rows from
consideration
[Figure: the Invoices table (VendorId, PaymentTotal, InvoiceTotal), same rows as before]

Processing the select with where
Step 2: Process the GROUP BY to create groups from the remaining rows.
Interim result set
[Figure: the Invoices table divided into group1 and group2, next to an avgPaymentTotal column]

Processing the select with where
Step 3: Process the HAVING clause to possibly remove some rows from
the result set (in this example no rows need to be removed)
Final results
[Figure: the Invoices table divided into group1 and group2, next to an avgPaymentTotal column]
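
Both versions of the vendor query can be replayed on the six Invoices rows. The sketch below assumes SQLite through Python's sqlite3 module. One value is an assumption: the slide does not show the InvoiceTotal of the first row, so 2000 is used as a placeholder, chosen to be at least 1000 so that it agrees with the statement that HAVING removes no rows in the WHERE version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (vendorId TEXT, PaymentTotal REAL, InvoiceTotal REAL)")
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?, ?)",
    [("001", 20, 2000),   # InvoiceTotal assumed: not shown on the slide
     ("001", 10, 20), ("001", 15, 1500),
     ("002", 10, 10), ("002", 20, 4000), ("002", 30, 4000)],
)

QUERY = (
    "SELECT vendorId, AVG(PaymentTotal) AS avgPaymentTotal FROM invoices {where} "
    "GROUP BY vendorId HAVING AVG(PaymentTotal) <= 10 ORDER BY avgPaymentTotal"
)

# No WHERE: the group averages are 15 and 20, so HAVING removes both groups.
without_where = conn.execute(QUERY.format(where="")).fetchall()

# With WHERE: only the rows with InvoiceTotal < 1000 are grouped,
# each vendor keeps one row with PaymentTotal 10, and both groups pass HAVING.
with_where = conn.execute(QUERY.format(where="WHERE InvoiceTotal < 1000")).fetchall()

print(without_where)
print(with_where)
```

The same HAVING condition gives an empty result in one case and two rows in the other, purely because WHERE changed which rows reached the grouping step.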

Semicolon after SQL Statements?


Semicolon is the standard way to separate each SQL statement in
database systems that allow more than one SQL statement to be
executed in the same call to the server.
Some SQL tutorials end each SQL statement with a semicolon. Is
this necessary? We are using MS Access and SQL Server 2000
and we do not have to put a semicolon after each SQL statement,
but some database programs force you to use it.

Question & Answer


1. Group by clause with select statement is used to :
a) Create separate group of rows
b) Filter unwanted rows from the Table
c) Retrieves all the rows from the Table
d) All of the above

2. The ORDER BY clause allows you to sort the results returned by
SELECT in a particular order.
a) TRUE
b) FALSE

Question & Answer


3. To select only DIFFERENT values from a column of a
table, the DISTINCT keyword is used with the SELECT
statement.
a) TRUE
b) FALSE

4. While processing a SELECT statement, the DBMS executes the
WHERE clause first and then FROM.
a) TRUE
b) FALSE

5. SQL Statement : SELECT * FROM EMP_INFO will


a) Retrieve one and only one column from EMP_INFO Table
b) Retrieve all the column from EMP_INFO Table
c) Will delete all the rows of EMP_INFO Table.
d) None of the above

Question & Answer


6. Group by clause is executed before Having clause :
a) True
b) False

7. MIN( ) & SUM() are Example of Group Function


a) TRUE
b) FALSE

8. SUM() and COUNT( ) functions have the same meaning.


a) TRUE
b) FALSE

Question & Answer


9. LIKE operation is performed to search :
a) Numeric Data Type
b) String Data Type
c) Date Data Type
d) All of the above

10. SELECT * FROM contacts WHERE name LIKE 'J%';
a) Will display records where Name starts with J.
b) Will display records where Name doesn't start with J.
c) Will display records where last character of Name is J
d) None of the above

Thank You
Please forward your query
To: gdubey@amity.edu
CC: manoj.amity@panafnet.com

PAN African e-Network Project


PGDIT
Data Base Management System
Semester II
Session - 7
Gaurav Dubey

Database Administrator (DBA)


A Database administrator (DBA) performs all activities
related to maintaining a successful database
environment.
DBA is said to be the custodian of Database.
DBA is a person or group of persons responsible for
managing the Database.

Responsibility of DBA Includes:


Designing, implementing, and maintaining the database system.
Establishing policies and procedures pertaining to the management,
security, maintenance, and use of the database management system.
Training employees in database management and use.

Advantages of DBMS
Controlled data redundancy:
Data consistency:
More information from the same amount of data
Sharing of data:

Advantages of DBMS
Increased concurrency.
Improved data integrity:
Improved backup and recovery services

Disadvantages of DBMS

Complexity & Size,

Cost of Software & Additional H/W costs

Cost of conversion, Performance,

Higher impact of a failure.

Data Independence
Applications insulated from how data is structured and stored.
Logical data independence: Protection from changes in
logical structure of data.
Physical data independence: Protection from changes in
physical structure of data.

* One of the most important benefits of using a DBMS!

Functions of a DBMS
1.Data storage, retrieval, and update:
Support of Query Language

2. A user-accessible catalog:
Data Dictionary

3. Transaction support:
Transaction Manager

4. Concurrency control services:


Lock Manager

5. Recovery services.

Functions of a DBMS
6. Authorization services
7. Support for data communication
8. Integrity services
9. Services to promote data independence

Entity-Relationship Diagrams (ERD)


An entity-relationship ( ER ) diagram is a
specialized graphical method that illustrates the
interrelationships between entities in a
database.
ER diagrams often use symbols to represent
three different types of information in designing
database.

Entity-Relationship Diagrams (ERD)


Boxes are commonly used to represent entities.
Ovals are used to represent attributes.
Diamonds are normally used to represent
relationships.

Entity
A person, place, object, event or concept in the user
environment about which the organization wishes to
maintain data
Represented by a rectangle in E-R diagrams
Entity Type / Set
A collection of entities that share common properties
or characteristics.
e.g. Student Entity Set, Customer Entity Set
Attribute
A named property or characteristic of an entity that is
of interest to an organization.
e.g. registration_no, customer_id, customer_emailid

Question & Answer


WHAT IS DIFFERENCE BETWEEN
SIMPLE ATTRIBUTE AND COMPOSITE
ATTRIBUTE?
WHAT IS DIFFERENCE BETWEEN
DERIVED ATTRIBUTE AND MULTI VALUED
ATTRIBUTE?

Keys
Entities and relationships are distinguishable using various keys.
A key is a combination of one or more attributes that uniquely
identifies the instances of an entity set or relationship,
e.g., social-security number, Member-id, or the combination of
order_id and Product_id.
[Figure: ER diagram; labels: books, authors, subject, libraries, wrote, index, carry, title, isbn, SS#, name, subject matter, quantity, address]

Candidate key
A candidate key is a minimal set of attributes that uniquely
identifies either an entity or a relationship,
e.g., social-security number,
phone number,
employment_id,
email_id.

Alternate Key
Alternate Key: An entity set can have
several candidate keys. Only one of them
can be selected as the primary key.
The remaining ones are known as alternate keys.
Alternate Keys = Candidate Keys - Primary Key

Keys
A primary key is a candidate key that is chosen
by the database designer to identify the entities
of an entity set.
Simple Primary Key
Composite Primary Key

Criteria for Selecting Primary Key.


Only those columns should be selected as the
primary key which are permanent in nature or
very unlikely to change.
The columns must always have a value (no NULLs).
Example: employment_no, registration_no

Question and Answer

Example: Employee data has to be stored in a table containing the
following information:
emp_id, emp_name, emp_city, emp_age, emp_emailid, emp_designation,
emp_salary, emp_phone_no.
Explain the possible primary keys, i.e. the candidate keys.

Question and Answer

Which column should be considered as the primary key, and why?
Considering the previous example, mention all the alternate keys.

Keys
A foreign key is a set of one or more attributes in one
entity set (or table) that refers to the primary key of another.
For a weak entity set, the primary key of the strong entity set
plays this role: the primary key of a weak entity set is formed
by the primary key of the strong entity set on which it is
existence-dependent, plus the discriminator of the weak entity set.

Question & Answer


Differentiate between Foreign Key and Primary
Key.
Differentiate between Unique Key and Primary
Key.

Functional Dependency: Types


Functional Dependency are categorized as
follows:

1. Full Functional Dependency


2. Partial Functional Dependency
3. Transitive Functional Dependency

Normalization
Database normalization is the step-by-step process of
removing redundant data from the database in order to
improve storage efficiency, data integrity, and consistency.
Normalization generally involves splitting existing tables into
multiple ones, which must be re-joined or linked each time a
query is issued.

Normalization
Decomposition in normalization
is of two types:
Lossless Decomposition.
Lossy Decomposition.
A decomposition should always be a lossless
decomposition.

Difference Between 2NF & 3NF


A relation in 2NF only removes the partial
dependencies among the attributes,
whereas a relation in 3NF also removes
transitive dependencies.

Note
The third normal form is often reached in practice by
inspection, in a single step.
Its meaning seems intuitively clear; it represents a
formalization of the designer's common sense.
This level of normalization is widely accepted as the
initial target for a design which eliminates redundancy.
However, there are higher normal forms which, although
less frequently invoked, highlight further redundancy
problems which may affect the designer

De Normalization
De-normalization is the reverse of normalization.
Sometimes, to improve the performance of the
system, we need to de-normalize the relations.
Before de-normalization the following must be
considered:
Use with caution
Normalize first, then de-normalize
Use only when you cannot optimize

Data Integrity
1. Data entered into the database must conform to a predefined
set of rules, as determined by
the DBA or application developer.
2. When an integrity constraint applies to a table, all
data in the table must conform to the corresponding
rule.
3. When you issue a SQL statement that modifies data
in the table, Oracle Database ensures that the new
data satisfies the integrity constraint, without the
need to do any checking within your program.

Data Integrity
1. You can enforce rules by defining integrity constraints
more reliably than by adding logic to your application.
2. Oracle Database can check that all the data in a table
obeys an integrity constraint faster than an application
can.

Data Integrity
Example of data integrity:
Consider the tables employees and departments and
the business rules for the information in each of the
tables.
As illustrated in the figure, to ensure that each employee
works for a valid department, first create a rule that all
primary key values in the departments table are unique, and then
a rule that each value in the foreign key column of employees
must either match a value in that primary key column or be NULL.

Types of Data Integrity


Primary Key Values
A rule defined on a column or set of columns that
specifies that each row in the table can be uniquely
identified by the values in the key.
Referential Integrity Rules
A referential integrity rule is a rule defined on a key (a
column or set of columns) in one table that guarantees
that the values in that key match the values in a key in a
related table (the referenced value).
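
The two rules can be tried out on a scratch database. The sketch below is an illustration under assumptions: it uses SQLite through Python's sqlite3 module instead of Oracle, the table and column names are hypothetical, and SQLite only enforces foreign keys after PRAGMA foreign_keys = ON is issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
conn.execute(
    "CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, dept_name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, emp_name TEXT NOT NULL, "
    "dept_id INTEGER REFERENCES departments(dept_id))")

conn.execute("INSERT INTO departments VALUES (10, 'Sales')")
conn.execute("INSERT INTO employees VALUES (1, 'Ann', 10)")    # matches a department
conn.execute("INSERT INTO employees VALUES (2, 'Bob', NULL)")  # NULL is allowed

# Referential integrity: department 99 does not exist, so the row is refused.
try:
    conn.execute("INSERT INTO employees VALUES (3, 'Cy', 99)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

# Primary key rule: a second department with key 10 is refused.
try:
    conn.execute("INSERT INTO departments VALUES (10, 'HR')")
    pk_rejected = False
except sqlite3.IntegrityError:
    pk_rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
print(fk_rejected, pk_rejected, employee_count)
```

The application code contains no validation logic of its own; the database rejects the bad rows, which is the point made on the Data Integrity slides.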

Definition:-- Transaction
A transaction is the basic logical unit of execution in
an information system.
A transaction is a sequence of operations that must be
executed as a whole.
It is the process of taking the database from one consistent
(and correct) state into another consistent (and correct) state.

Schedules of Transactions

A schedule S of n transactions is a
sequential ordering of the operations
of the n transactions.
The transactions are interleaved

Schedules of Transactions

A schedule maintains the order of operations within each
individual transaction.
For each transaction T, if operation a is
performed in T before operation b, then
operation a will be performed before operation
b in schedule S.
The operations are in the same order as they
were before the transactions were interleaved.

Serial and Non-serial Schedules


Serializability theory attempts to determine
the 'correctness' of the schedules.
A schedule S of n transactions is
serialisable if it is equivalent to some serial
schedule of the same n transactions.

Properties of Transaction
A - Atomicity: a transaction is an atomic unit of processing and
it is either performed entirely or not at all.
C - Consistency Preservation: a transaction's correct execution
must take the database from one correct state to another.
I - Isolation/Independence: the updates of a transaction must
not be made visible to other transactions until it is committed.

Example
T1:                  T2:
read_item(X);        read_item(X);
X := X - N;          X := X + M;
write_item(X);       write_item(X);
read_item(Y);
Y := Y + N;
write_item(Y);
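
To see why isolation and serializability matter, the two transactions above can be replayed in different orders. The sketch below is a toy simulator, not a DBMS: the starting values (X = 80, Y = 20, N = 5, M = 4) and the particular interleaving are assumptions chosen for illustration. Each transaction works on a private copy of an item between read_item and write_item.

```python
N, M = 5, 4  # hypothetical amounts

def run(schedule, db):
    """Execute a schedule of (transaction, operation, item) steps."""
    db = dict(db)                 # shared database items
    local = {"T1": {}, "T2": {}}  # each transaction's private workspace
    for txn, op, item in schedule:
        if op == "read":
            local[txn][item] = db[item]
        elif op == "write":
            db[item] = local[txn][item]
        else:                     # op is a number: add it to the local copy
            local[txn][item] += op
    return db

T1 = [("T1", "read", "X"), ("T1", -N, "X"), ("T1", "write", "X"),
      ("T1", "read", "Y"), ("T1", +N, "Y"), ("T1", "write", "Y")]
T2 = [("T2", "read", "X"), ("T2", +M, "X"), ("T2", "write", "X")]

start = {"X": 80, "Y": 20}

# Serial schedule: all of T1, then all of T2.
serial = run(T1 + T2, start)

# An interleaved schedule that keeps each transaction's own order,
# but lets T2 read X before T1 has written it back.
interleaved_schedule = [T1[0], T1[1], T2[0], T2[1], T1[2], T1[3], T2[2], T1[4], T1[5]]
interleaved = run(interleaved_schedule, start)

print(serial)
print(interleaved)
```

In the interleaved run T2 overwrites X with a value computed from the old X, so T1's subtraction is lost. The final state matches neither serial order, which means this schedule is not serializable.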

Properties of Transaction
D - Durability (or Permanency): if a transaction
changes the database and is committed, the changes
must never be lost because of a subsequent failure.

Example
READ A
A = A - 1500
WRITE A
READ B
B = B + 1500
WRITE B
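
The transfer above is the classic case for atomicity: either both writes happen or neither does. A minimal sketch, assuming SQLite through Python's sqlite3 module; the account table, the starting balances and the CHECK constraint are hypothetical. The credit is deliberately executed first so that a failing debit shows the rollback undoing work already done.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 1000), ("B", 500)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; return True on commit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst))
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

# A holds only 1000, so debiting 1500 violates the CHECK and everything is undone.
failed_ok = transfer(conn, "A", "B", 1500)
after_failure = balances(conn)

# A smaller transfer succeeds and both updates become permanent together.
small_ok = transfer(conn, "A", "B", 300)
after_success = balances(conn)

print(failed_ok, after_failure)
print(small_ok, after_success)
```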

Locking Techniques for Concurrency Control


The concept of locking data items is one of the
main techniques used for controlling the
concurrent execution of transactions.
A lock is a variable associated with a data item
in the database. Generally there is a lock for
each data item in the database.

Types of Locks

Binary locks have two possible states:
1. locked (lock_item(X) operation) and
2. unlocked (unlock_item(X) operation)

Types of Locks

Multiple-mode locks allow concurrent access
to the same item by several transactions.
Three possible states:
1. read locked or shared locked (other transactions are
allowed to read the item)
2. write locked or exclusive locked (a single
transaction exclusively holds the lock on the item)
and
3. unlocked.
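
The compatibility rule behind these three states is small enough to sketch. The class below is a toy lock table for illustration only. It is an assumption of this example, not how any particular DBMS implements locking, and it leaves out waiting queues, lock upgrades and deadlock handling.

```python
class LockTable:
    """Toy shared/exclusive lock table: item -> (mode, set of holders)."""

    def __init__(self):
        self.locks = {}

    def request(self, txn, item, mode):
        """Return True if the lock is granted, False if txn would have to wait."""
        held = self.locks.get(item)
        if held is None:                       # item is unlocked
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = held
        if mode == "read" and held_mode == "read":
            holders.add(txn)                   # shared locks are compatible
            return True
        return False                           # any combination involving a write lock

    def release(self, txn, item):
        mode, holders = self.locks[item]
        holders.discard(txn)
        if not holders:                        # last holder gone: item is unlocked
            del self.locks[item]

table = LockTable()
r1 = table.request("T1", "X", "read")    # granted
r2 = table.request("T2", "X", "read")    # granted: two readers may share X
w3 = table.request("T3", "X", "write")   # refused: readers still hold X
table.release("T1", "X")
table.release("T2", "X")
w3_retry = table.request("T3", "X", "write")  # granted: X is unlocked again
r1_again = table.request("T1", "X", "read")   # refused: X is exclusively locked
print(r1, r2, w3, w3_retry, r1_again)
```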

Two-Phase Locking
Basic 2PL
Once a transaction releases a lock, it may not
request any further locks
Conservative 2PL or static 2PL
A transaction locks all the items it accesses
before the transaction begins execution
Pre-declaring read and write sets
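
The basic 2PL rule (a growing phase in which locks are acquired, followed by a shrinking phase in which they are only released) can be checked mechanically for a single transaction. The function and operation names below are hypothetical, chosen for this sketch.

```python
def follows_2pl(ops):
    """Return True if no lock is requested after the first unlock."""
    shrinking = False
    for action, item in ops:
        if action == "unlock":
            shrinking = True            # the shrinking phase has begun
        elif action in ("read_lock", "write_lock") and shrinking:
            return False                # a lock was requested after a release
    return True

# All locks are taken before any is released: obeys 2PL.
good = [("read_lock", "Y"), ("write_lock", "X"), ("unlock", "Y"), ("unlock", "X")]

# Y is released and X is locked afterwards: violates 2PL.
bad = [("read_lock", "Y"), ("unlock", "Y"), ("write_lock", "X"), ("unlock", "X")]

print(follows_2pl(good), follows_2pl(bad))
```

Under conservative 2PL every lock request would additionally have to appear before the transaction's first read or write, which this check does not model.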

Locking Techniques for Concurrency Control


A lock describes the status of the data item with
respect to possible operations that can be
applied to that item.
It is used for synchronizing the access by
concurrent transactions to the database items.
A transaction locks an object before using it
When an object is locked by another transaction,
the requesting transaction must wait

Locking Granularity
A database item could be
a database record
a field value of a database record
a disk block
the whole database

Locking Granularity
Trade-offs
Coarse granularity
the larger the data item size, the lower the degree
of concurrency

Fine granularity
the smaller the data item size, the more locks to be
managed and stored, and the more lock/unlock
operations needed.

Database Backup and Recovery Concepts


Backup of a database means making one or more
copies of the data files, control file, and archived redo logs.

Restoring a database means copying the physical files
that make up the database from a backup medium,
typically disk or tape, to their original or to new
locations.

Database Backup and Recovery Concepts

A backup is either consistent or inconsistent.


To make a consistent backup, database must have been shut
down cleanly and remain closed for the duration of the backup.
All committed changes in the redo log are written to the data
files, so the data files are in a transaction-consistent state.
When restoring data files from a consistent backup, you can
open the database immediately

Writing SQL Statements

SQL statements are not case sensitive
(but criteria within quotation marks are for some RDBMS)
SQL statements can be on one or more lines
Clauses are usually placed on separate lines
Keywords cannot be split across lines
Tabs and spaces are allowed to enhance readability
Each SQL statement (not line) ends with a semicolon (;)

SQL*Plus
Command-line tool that processes users' SQL statements
Requires an Oracle account

SQL
DDL - Data Definition
DML - Data Manipulation
DCL - Data Control

Help command

Typing a SQL command
Saving SQL command in a file
Editing SQL command in a file

Structured Query Language features


DQL (Data Query Language)
SELECT
Used to get data from the database and impose ordering
upon it.
DML (Data Manipulation Language)
DELETE, INSERT, UPDATE
Used to change database data.
DDL (Data Definition Language)
DROP, TRUNCATE, CREATE, ALTER
Used to manipulate database structures and definitions.
DCL (Data Control Language)
REVOKE, GRANT
Used to give and take access rights to database objects.

DDL, DML, DCL, and the database development process

Data Types
VARCHAR2
Character data type.
Can contain letters, numbers and punctuation.
The syntax : VARCHAR2(size)
where size is the maximum number of alphanumeric
characters the column can hold.
In Oracle8, the maximum size of a VARCHAR2 column
is 4,000 bytes.

Data Types
NUMBER
Numeric data type.
Can contain integer or floating point numbers
only.
The syntax : NUMBER(precision, scale)
where precision is the total number of significant
digits (the decimal point is not counted) and scale is the
number of digits to the right of the decimal point.
For example, NUMBER(6,2) can hold a
number between -9999.99 and 9999.99.
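
The range of a NUMBER(precision, scale) column follows from counting digits: precision is the total number of digits and scale of them sit to the right of the decimal point, so precision - scale digits remain on the left. The helper below is a hypothetical function written for this sketch; it assumes a positive precision and a scale between 0 and the precision.

```python
from decimal import Decimal

def number_range(precision, scale):
    """Smallest and largest values that fit in NUMBER(precision, scale)."""
    # Largest value: (precision - scale) nines, a point, then scale nines.
    largest = Decimal(10) ** (precision - scale) - Decimal(10) ** (-scale)
    return -largest, largest

low, high = number_range(6, 2)
print(low, high)            # six digits in total, two after the point
print(number_range(3, 0))   # three-digit integers, e.g. the Age column
```

For NUMBER(6,2) this leaves four digits before the decimal point and two after it.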

Data Types
DATE
Date and Time data type.
Can contain a date and time portion in the
format: DD-MON-YY HH:MI:SS.
No additional information needed when
specifying the DATE data type.
If no time is given, 00:00:00 is used as a default.
The output format of the date and time can be
modified.

Data Types
RAW
Free form binary data.
Can contain binary data up to 255 bytes.
Data type LONG RAW can contain up to 2
gigabytes of binary data.
LONG RAW data cannot be indexed. RAW and LONG RAW
data cannot be displayed or queried in
SQL*Plus.
Only one LONG RAW column is allowed per table.

Data Types
LOB
Large Object data types.
These include BLOB (Binary Large OBject)
and CLOB (Character Large OBject).
More than one LOB column can appear in a
table.
These data types are the preferred method
for storing large objects such as text
documents (CLOB), images, or video (BLOB).

[Figure: SQL statement processing order]

Advantages of Views
Simplify query commands
Assist with data security (but don't rely on views for security;
there are more important security measures)
Enhance programming productivity
Contain most current base table data
Use little storage space
Provide customized view for user
Establish physical data independence

Disadvantages of Views

Use processing time each time the view is referenced
May or may not be directly updateable
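
Two of the points above (a view contains the most current base table data, and it is re-evaluated each time it is referenced) can be seen in a short experiment. The sketch assumes SQLite through Python's sqlite3 module; the cut-down contacts table, the sample rows and the view name adult_contacts are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contacts (contactid INTEGER PRIMARY KEY, name TEXT, "
    "company TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?, ?)",
    [(1, "Ann", "Acme", 30), (2, "Bob", "Acme", 9)],
)

# The view stores only the query text, not a copy of the rows,
# and it hides the age and contactid columns from its users.
conn.execute(
    "CREATE VIEW adult_contacts AS "
    "SELECT name, company FROM contacts WHERE age >= 18")

before = conn.execute("SELECT name FROM adult_contacts ORDER BY name").fetchall()

# A new base-table row shows up in the view without any further work.
conn.execute("INSERT INTO contacts VALUES (3, 'Cy', 'Initech', 41)")
after = conn.execute("SELECT name FROM adult_contacts ORDER BY name").fetchall()

print(before)
print(after)
```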


Question & Answer


1. REFERENTIAL INTEGRITY constraint designates a column or
combination of columns as
a) Foreign key
b) Primary Key
c) Unique KEY
d) None

2. NOT NULL constraint specifies that a column has
a) Unique Values
b) Not Null Values
c) Null Values
d) Both a and c

Question & Answer


3 . Update Command is used to Modify
a) Data from the Table.
b) Structure of the Table.
c) Remove column from the Table
d) None of the above
4. . Insert command is used to
a) Store data into the Table
b) Save data permanently into the Table.
c) Modify data into the Table
d) All of the above

Question & Answer


5 . DELETE FROM Table Name
a) Delete a particular row from the Table.
b) Delete all the rows from the Table.
c) Remove primary key from the table.
d) Remove the structure of the Table.
6. Unique Key column can accept null value
a) True
b)False

Question & Answer

7 . DELETE FROM Table Name


a) Delete a particular row from the Table.
b) Delete all the rows from the Table.
c) Remove primary key from the table.
d) Remove the structure of the Table.
8 Commit command makes all the changes permanent
a) True
b) False

Question & Answer


9. To select only DIFFERENT values from a column of a
table, the DISTINCT keyword is used with the SELECT
statement.
a) TRUE
b) FALSE

10. While processing a SELECT statement, the DBMS executes the
WHERE clause first and then FROM.
a) TRUE
b) FALSE

11. SQL Statement : SELECT * FROM EMP_INFO will
a) Retrieve one and only one column from EMP_INFO Table
b) Retrieve all the columns from EMP_INFO Table
c) Delete all the rows of EMP_INFO Table.
d) None of the above

Question & Answer


12. LIKE operation is performed to search :
a) Numeric Data Type
b) String Data Type
c) Date Data Type
d) All of the above

13. SELECT * FROM contacts WHERE name LIKE 'J%';
a) Will display records where Name starts with J.
b) Will display records where Name doesn't start with J.
c) Will display records where last character of Name is J
d) None of the above

Question & Answer


14. . A table can have how many unique key

A). 1
B). any number
C). 255
D). None of the above.

15. Entity is represented by the symbol.

A) Double Circle
B) Ellipse
C) Rectangle
D) Square

Question & Answer


16. Attributes are

i) Properties of relationship
ii) Degree to entities
iii) Properties of members of an entity set
(a ) i (b) i and ii
(c) i and iii
(d) iii

17 A relationship is
a) an item in an application
b) a meaningful dependency between entities
c) a collection of related entities
d) related data

Question & Answer


18. A relation is said to be in BCNF when
a) It has overlapping composite keys
b) It has no composite keys
c) It has no multi valued dependencies
d) It has no overlapping composite keys which have
related attributes
19. Fourth Normal Form (4NF) relations are needed when
a) there are multi valued dependencies between
attributes in composite key
b) there are more than one composite key
c) there are two or more overlapping composite keys
d) there are multi valued dependency between non-key
attributes

Question & Answer


20. Transitive Dependency explains :
a) Dependency of Key attributes to another Key attributes
b) Dependency of Key attributes to another non Key attributes
c) Dependency of non Key attributes to another non Key
attributes
d) All of the above
21. Partial Dependency exists in a relation when
a) A non key attribute depends on part of the primary key
b) Primary Key doesn't exist
c) A non key attribute is dependent on another non key attribute
d) None of the above

Question & Answer


22. Which among the following types of lock is known
as an exclusive lock?
a) Read Lock
b) Write LOCK
c) Shared Lock
d) All of the above
23. A transaction reaches its commit point when all
operations accessing the database are completed and the
result has been recorded in the log
a) True
b) False

Question & Answer


24. Which among the following is data recovery method.
a) System Logging

b) Mirroring
c) Both a and b
d) None

25. The situation in which a transaction cannot proceed for an
indefinite period of time while other transactions in the system
continue normally is called
a) Dead Lock
b) Live Lock
c) Binary Lock
d) Both a and b

Thank You
Please forward your query
To :gdubey@amity.edu
