
Constraints

Integrity Constraints
Integrity constraints are used to ensure the accuracy and consistency of data
in a relational database, i.e. that the data is correct and consistent.
Relationships between tables are protected through the concept of
referential integrity.
The constraints available in SQL are Foreign Key,
Primary Key, Not Null, Unique and Check.
Constraints can be defined in two ways:
The constraints can be specified immediately after the
column definition. This is called column-level
definition.
The constraints can be specified after all the columns
are defined. This is called table-level definition.

1) SQL Primary key:


This constraint defines a column or combination of
columns which uniquely identifies each row in the
table.
Syntax to define a Primary key at column level:

column_name datatype [CONSTRAINT constraint_name] PRIMARY KEY

Syntax to define a Primary key at table level:

[CONSTRAINT constraint_name] PRIMARY KEY (column_name1, column_name2, ...)

column_name1, column_name2 are the names of
the columns which define the primary key.
The part within the square brackets, i.e. [CONSTRAINT
constraint_name], is optional.

2) SQL Foreign key or Referential Integrity

This constraint identifies any column referencing the
PRIMARY KEY in another table.
It establishes a relationship between two
columns in the same table or between
different tables.
For a column to be defined as a Foreign Key,
the column it refers to should be defined as a Primary Key
(or have a Unique constraint) in the referenced table.
One or more columns can be defined as a
Foreign key.

3) SQL Not Null Constraint

This constraint ensures that all rows in the
table contain a definite value for the
column which is specified as not null.
This means a null value is not allowed.
Syntax to define a Not Null constraint:

[CONSTRAINT constraint_name] NOT NULL

4) SQL Unique Key:


This constraint ensures that a column or a group
of columns has a distinct value in each row. The
column(s) can hold null values (how many depends on the
DBMS), but non-null values cannot be duplicated.
Syntax to define a Unique key at column level:
[CONSTRAINT constraint_name] UNIQUE
Syntax to define a Unique key at table level:
[CONSTRAINT constraint_name] UNIQUE (column_name)

5) SQL Check Constraint :


This constraint defines a business rule on a
column. All the rows must satisfy this rule.
The constraint can be applied for a single
column or a group of columns.
Syntax to define a Check constraint:
[CONSTRAINT constraint_name] CHECK (condition)
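The constraint syntax above can be tried out with a minimal sketch. It assumes Python's built-in sqlite3 module as the database, and the employee table, its column names, the constraint names and the sample rows are all hypothetical. Other database systems may differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        emp_id INTEGER CONSTRAINT pk_employee PRIMARY KEY,  -- column-level
        name   TEXT    CONSTRAINT nn_name NOT NULL,
        email  TEXT    CONSTRAINT uq_email UNIQUE,
        salary INTEGER,
        CONSTRAINT ck_salary CHECK (salary > 0)             -- table-level
    )
""")
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 'asha@example.com', 5000)")


def rejected(sql):
    """Return True if the database refuses the statement with an integrity error."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# Each statement breaks exactly one of the constraints declared above.
results = {
    "duplicate primary key": rejected(
        "INSERT INTO employee VALUES (1, 'Ben', 'ben@example.com', 4000)"),
    "null in NOT NULL column": rejected(
        "INSERT INTO employee VALUES (2, NULL, 'cy@example.com', 4000)"),
    "duplicate UNIQUE value": rejected(
        "INSERT INTO employee VALUES (3, 'Dev', 'asha@example.com', 4000)"),
    "CHECK rule violated": rejected(
        "INSERT INTO employee VALUES (4, 'Eve', 'eve@example.com', -10)"),
}
for rule, was_rejected in results.items():
    print(rule, "->", "rejected" if was_rejected else "accepted")
```

Every one of the four statements should be reported as rejected, and the table should still contain only the first row.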

Domain Integrity
Definition of a valid set of values for an attribute.
Domain rules are easy to test when data is
entered:
data type,
length or size,
whether a null value is allowed,
whether the value must be unique for an attribute.
For example,
a domain of date is the set of all possible valid dates,
a domain of integer is all possible whole numbers,
a domain of day-of-week is Monday, Tuesday ... Sunday.

Referential Integrity Constraint


Specified between two tables and it is used to
maintain the consistency among rows
between the two tables.
Examples:
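A small sketch of referential integrity between two tables, assuming Python's built-in sqlite3 module. The department and employee tables and their rows are hypothetical. Note that SQLite only enforces foreign keys after the PRAGMA shown below is issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute(
    "CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        dept_id INTEGER,
        CONSTRAINT fk_emp_dept FOREIGN KEY (dept_id) REFERENCES department (dept_id)
    )
""")
conn.execute("INSERT INTO department VALUES (1, 'CSE')")
conn.execute("INSERT INTO employee VALUES (10, 'Asha', 1)")  # parent row exists


def rejected(sql):
    """Return True if the statement violates referential integrity."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# A child row may not point at a department that does not exist ...
orphan_rejected = rejected("INSERT INTO employee VALUES (11, 'Ben', 99)")
# ... and a department may not be removed while employees still reference it.
delete_rejected = rejected("DELETE FROM department WHERE dept_id = 1")
print(orphan_rejected, delete_rejected)
```

Both operations should be refused, so the rows in the two tables stay consistent with each other.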

Feature of Relational DB
A bad design has the following
problems:
Data Redundancy
Update Anomalies
Insertion Anomalies
Deletion Anomalies

Good design
The goal of relational schema design is
to avoid anomalies and redundancy.
Eliminate undesirable data dependencies

What is an Anomaly?
Definition
Problems that can occur in poorly
planned, un-normalized databases
where all the data is stored in one
table (a flat-file database).
Types of anomalies:
Insert
Delete
Update

Insert Anomaly
occurs when certain attributes cannot be
inserted into the database without the
presence of other attributes.
Course_no  Tutor   Room  Room_size  En_limit
353        Smith   A532  45         40
351        Smith   C320  100        60
355        Clark   H940  400        300
456        Turner  H940  400        45

e.g. we have built a new room (e.g. B123) but it
has not yet been timetabled for any courses or
members of staff, so its details cannot be stored in the table.

Delete Anomaly
Exists when certain attributes are lost
because of the deletion of other
attributes.
Course_no  Tutor   Room  Room_size  En_limit
353        Smith   A532  45         40
351        Smith   C320  100        60
355        Clark   H940  400        300
456        Turner  H940  400        45

e.g. if we remove the row for course_no 351 from the
above table, the details of room C320 are deleted as well,
because that row is the only place where they are stored.

Update Anomaly
Exists when one or more instances of
duplicated data is updated, but not all.
Course_no  Tutor   Room  Room_size  En_limit
353        Smith   A532  45         40
351        Smith   C320  100        60
355        Clark   H940  400        300
456        Turner  H940  400        45

e.g. room H940 has been improved and is now of
Room_size = 500. To record this single change, we
have to update every row where Room = H940. If any
row is missed, the table becomes inconsistent.

How To Avoid Anomalies?

By using normalization.
Normalization of data can be defined as a process during
which redundant relation schemas are decomposed by
breaking up their attributes into smaller relation
schemas that possess desirable properties.
The goal of the normalization process is to define
relations so that each relation is about one kind of thing.
Not two. Not three. One.
This seems like a reasonable condition, given the
problems that it prevents.
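The three anomalies, and the way a decomposition avoids them, can be sketched in plain Python using the course table from the preceding slides. The split into a courses relation and a rooms relation is one possible decomposition, and the size given to the new room B123 is a made-up value, since the slides do not state it.

```python
# The single flat table from the slides: one row per course.
flat = [
    {"course_no": 353, "tutor": "Smith",  "room": "A532", "room_size": 45,  "en_limit": 40},
    {"course_no": 351, "tutor": "Smith",  "room": "C320", "room_size": 100, "en_limit": 60},
    {"course_no": 355, "tutor": "Clark",  "room": "H940", "room_size": 400, "en_limit": 300},
    {"course_no": 456, "tutor": "Turner", "room": "H940", "room_size": 400, "en_limit": 45},
]

# Update anomaly: changing H940 in only one of its rows leaves two sizes on record.
partial = [dict(row) for row in flat]
partial[2]["room_size"] = 500
h940_sizes = {row["room_size"] for row in partial if row["room"] == "H940"}

# Delete anomaly: removing course 351 also removes every trace of room C320.
after_delete = [row for row in flat if row["course_no"] != 351]
rooms_after_delete = {row["room"] for row in after_delete}

# Decomposition: each relation is now about one kind of thing.
rooms = {row["room"]: row["room_size"] for row in flat}
courses = [{k: row[k] for k in ("course_no", "tutor", "room", "en_limit")}
           for row in flat]

rooms["B123"] = 30    # insert: a room with no course yet (size is hypothetical)
rooms["H940"] = 500   # update: the size is stored once, so one change suffices
courses = [c for c in courses if c["course_no"] != 351]  # delete: C320 survives

print(sorted(h940_sizes), "C320" in rooms_after_delete, "C320" in rooms)
```

In the flat table the partial update leaves both 400 and 500 recorded for H940, and deleting course 351 loses room C320. In the decomposed version neither problem occurs.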

How Normalization Works


If you know a customer id, then you know
the person's name and address.
If you know a stock identifier, then you
know its current price and most recent
dividend.
Finally, for any pairing of a customer id and
a stock identifier, you know how many
shares that person owns of that stock.
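One way to read these three statements is as three relations, each keyed by exactly what "you know". The sketch below assumes Python's built-in sqlite3 module. The table names, column names and sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- customer id -> name, address
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, address TEXT);
    -- stock identifier -> current price, most recent dividend
    CREATE TABLE stock (stock_id TEXT PRIMARY KEY, current_price REAL, last_dividend REAL);
    -- (customer id, stock identifier) -> number of shares
    CREATE TABLE holding (
        customer_id INTEGER REFERENCES customer (customer_id),
        stock_id    TEXT    REFERENCES stock (stock_id),
        shares      INTEGER,
        PRIMARY KEY (customer_id, stock_id)
    );
    INSERT INTO customer VALUES (1, 'Asha', '12 Park Road');
    INSERT INTO stock VALUES ('ACME', 10.5, 0.25);
    INSERT INTO holding VALUES (1, 'ACME', 100);
""")

# Knowing the pair (customer id, stock identifier) is enough to find everything else.
row = conn.execute("""
    SELECT c.name, s.current_price, h.shares
    FROM holding h
    JOIN customer c ON c.customer_id = h.customer_id
    JOIN stock s ON s.stock_id = h.stock_id
    WHERE h.customer_id = 1 AND h.stock_id = 'ACME'
""").fetchone()
print(row)
```

Each fact is stored once: a price change touches only the stock table, and a change of address touches only the customer table.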

Dependencies
Normalization is based on the concept of
dependencies between the attributes of a
table.
Types of dependencies:
Functional dependency
Full Functional dependency
Partial dependency
Transitive dependency
Multi-Valued dependency
Join dependency

Functional Dependency
A functional dependency X → Y holds when each value of X
is associated with exactly one value of Y.
X is the determinant & Y is
determined.

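Full Functional Dependency (FFD)
An attribute is fully functionally dependent on a composite
determinant if it depends on the whole determinant and not on
any proper subset of it.
Example: {Rollno, Name, course_id, Course_name, Grade}
Grade is FFD on (Rollno, course_id).
Name and Course_name are not FFD on (Rollno, course_id),
because Rollno alone is enough to determine Name
and course_id alone is enough to determine
Course_name.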

Partial Dependency
A relationship between attributes such that the value of
one attribute is dependent on or determined by the value
of another attribute which is only part of the composite key.

Student: {rollno, name, c_id, c_title, grade}
The name attribute is partially dependent on the key (rollno, c_id),
because it is determined by rollno alone, which is only a part of the
composite key. (A composite key is a primary key of two or
more attributes that together uniquely identify the row.)
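The difference between full and partial dependency can be checked mechanically on sample data. The sketch below uses the Student schema from the slide. The rows themselves are made up, and the function names holds and fully_dependent are hypothetical. A check on sample rows can only show that a dependency is not violated by that data. Whether it is a real rule is a property of the schema.

```python
from itertools import combinations


def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs is not violated in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False  # same determinant value, different dependent value
    return True


def fully_dependent(rows, lhs, attr):
    """True if lhs -> attr holds and no proper, non-empty subset of lhs determines attr."""
    if not holds(rows, lhs, [attr]):
        return False
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if holds(rows, subset, [attr]):
                return False  # a part of the key is enough: partial dependency
    return True


students = [  # hypothetical sample rows
    {"rollno": 1, "name": "Preeti", "c_id": "C1", "c_title": "DBMS", "grade": "A"},
    {"rollno": 1, "name": "Preeti", "c_id": "C2", "c_title": "OS",   "grade": "B"},
    {"rollno": 2, "name": "Sweety", "c_id": "C1", "c_title": "DBMS", "grade": "B"},
    {"rollno": 2, "name": "Sweety", "c_id": "C2", "c_title": "OS",   "grade": "A"},
]
key = ("rollno", "c_id")
for attr in ("grade", "name", "c_title"):
    kind = "full" if fully_dependent(students, key, attr) else "partial"
    print(attr, "->", kind)
```

With these rows, grade comes out as fully dependent on (rollno, c_id), while name and c_title come out as partial, since rollno alone and c_id alone determine them.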

Transitive Dependency
Exists when the value of an attribute is dependent on the
value of another dependent (non-key) attribute.
For example:
A → B → C
is a transitive dependency which shows that attribute C
is FD on B, which is in turn dependent on attribute A.

Department
Dept_id    Dept_name  Cod_name
1          CSE        Balraj Singh
(missing)  IT         Kewal Krishan
3          ECE        R K Sharma

Transitive Dependency:
Dept_id → Dept_name → Cod_name
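A transitive dependency is one that is not stated directly but follows from a chain of stated dependencies. A common way to find what a set of attributes determines is the attribute closure. The sketch below applies it to the Department example, taking Dept_id → Dept_name and Dept_name → Cod_name as the given dependencies. The function name closure is an illustrative choice.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know all of lhs, we also know all of rhs.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


fds = [
    ({"Dept_id"}, {"Dept_name"}),
    ({"Dept_name"}, {"Cod_name"}),
]
determined = closure({"Dept_id"}, fds)
stated_directly = any(lhs == {"Dept_id"} and "Cod_name" in rhs for lhs, rhs in fds)
print(sorted(determined), stated_directly)
```

Cod_name is in the closure of Dept_id although no stated dependency links them directly, which is exactly the transitive dependency Dept_id → Cod_name.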

Multi-Valued Dependency
Attribute B has an MVD on attribute A if a single value of
attribute A can be associated with more than one value of attribute B.
For example:
Name →→ Mobile_No

Person
Name      Mobile_No
Preeti    9878793933
Sweety    9915656789
Akanksha  9813234343
Preeti    9912213343
Akanksha  9843432234
Supriti   9984348989

Both Preeti & Akanksha have two Mobile_No values, which means
there is more than one value of attribute
Mobile_No for a single value of attribute Name.

Join Dependency
A table T is subject to a join dependency
if T can always be reconstructed by
joining multiple tables each having a
subset of the attributes of T.
If one of the tables in the join has all the
attributes of the table T, then the join
dependency is trivial.
Join dependencies are used to define the Fifth Normal Form,
also called project-join normal form.

Normalization
Normal form represents a good DB design.
Used to eliminate:
Anomalies
Inconsistencies

Types of NF:
First Normal Form
Second Normal Form
Third Normal Form
Boyce Codd Normal Form
Fourth Normal Form
Fifth Normal Form

Normalization

Normal Forms
A relation is in a particular normal form if it satisfies
certain normalization properties.
There are several normal forms defined:

1NF - First Normal Form
2NF - Second Normal Form
3NF - Third Normal Form
BCNF - Boyce-Codd Normal Form
4NF - Fourth Normal Form
5NF - Fifth Normal Form

Each of these normal forms is stricter than the previous one.
For example, 3NF is better than 2NF because it removes more
redundancy/anomalies from the schema than 2NF.

Normal Forms

1NF
A relational schema is in 1NF if the values
in the domain of each attribute of the
relation are simple or atomic.
Only one value is associated with
each attribute & the value is not a
set of values.
A DB schema is in 1NF if all relation
schemas in the DB are in 1NF.

1NF
Take the following table.
StudentID is the primary key.

Is it 1NF?

1NF
No. There are repeating groups
(subject, subjectcost, grade)

How can you make it 1NF?

1NF
Create new rows so each cell
contains only one value

But now look: is the studentID
primary key still valid?

1NF
No, the studentID no longer
uniquely identifies each row.

You now need to declare studentID
and subject together to uniquely
identify each row.
So the composite key is StudentID and
Subject.

1NF
So, we now have 1NF.
Student
Subjects detail

Is it 2NF?

A non-1NF Relation
Two ways to convert a non-1NF relation to a 1NF relation:
1) Splitting Method - Divide the existing relation into two relations: nonrepeating attributes and repeating attributes.
2) Flattening Method - Create new tuples for the repeating data combined
with the data that does not repeat.

Another Example on First Normal Form
The following is not in 1NF:
EmpNum  EmpPhone  EmpDegrees
123     233-9876  BTech
333     233-1231  BA, BSc, PhD
679     233-1231  BSc, MSc

EmpDegrees is a multi-valued field:
employee 679 has two degrees: BSc and MSc
employee 333 has three degrees: BA, BSc, PhD

First Normal Form

Employee
EmpNum  EmpPhone
123     233-9876
333     233-1231
679     233-1231

EmployeeDegree
EmpNum  EmpDegree
123     BTech
333     BA
333     BSc
333     PhD
679     BSc
679     MSc

An outer join between Employee and EmployeeDegree will
produce the information we saw before.
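Both conversion methods (splitting and flattening) can be sketched in a few lines of plain Python on the employee data from the non-1NF table. The variable names are illustrative choices.

```python
# The non-1NF relation: EmpDegrees holds a list of values in one field.
employees = [
    {"EmpNum": 123, "EmpPhone": "233-9876", "EmpDegrees": ["BTech"]},
    {"EmpNum": 333, "EmpPhone": "233-1231", "EmpDegrees": ["BA", "BSc", "PhD"]},
    {"EmpNum": 679, "EmpPhone": "233-1231", "EmpDegrees": ["BSc", "MSc"]},
]

# Splitting method: non-repeating attributes in one relation,
# the repeating attribute (with the key) in another.
employee = [(e["EmpNum"], e["EmpPhone"]) for e in employees]
employee_degree = [(e["EmpNum"], d) for e in employees for d in e["EmpDegrees"]]

# Flattening method: one tuple per degree, repeating the non-repeating data.
flattened = [(e["EmpNum"], e["EmpPhone"], d)
             for e in employees for d in e["EmpDegrees"]]

print(len(employee), len(employee_degree), len(flattened))
```

Splitting stores each phone number once, while flattening repeats it for every degree. That repetition is the redundancy the later normal forms deal with.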

Converting a non-1NF Relation to 1NF Using Flattening

1. In the __________ normal form, a composite attribute is converted to
individual attributes.
A) First
B) Second
C) Third
D) Fourth
2. A table on the many side of a one to many or many to many relationship
must:
a) Be in Second Normal Form (2NF)
b) Be in Third Normal Form (3NF)
c) Have a single attribute key
d) Have a composite key
3. Functional dependencies are the types of constraints that are based on ______
a) Key
b) Key revisited
c) Superset key
d) None of these

2NF
A relation is in 2NF if:
It is in 1NF &
All its non-primary attributes are FFD on
the primary key.

Second Normal Form (2NF)

A relation is in second normal form (2NF) if it is in 1NF and
every non-primary key (non-prime) attribute is fully
functionally dependent on the primary key.

Every non-key column depends on the whole of every candidate key,
not on a subset of any candidate key. This eliminates
partial dependencies.
Note: By definition, any relation with a single primary key
attribute is always in 2NF.
If a relation is not in 2NF, we will divide it into separate
relations each in 2NF by ensuring that the primary key of each
new relation functionally determines all the attributes in the
relation.
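As a sketch of this division, the Student relation {rollno, name, c_id, c_title, grade} from the partial dependency slide can be projected onto three smaller relations, one per determinant. The sample rows and the helper name project are hypothetical.

```python
def project(rows, attrs):
    """Project rows (dicts) onto attrs and drop duplicate tuples."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in result:
            result.append(projected)
    return result


students = [  # hypothetical sample rows, key is (rollno, c_id)
    {"rollno": 1, "name": "Preeti", "c_id": "C1", "c_title": "DBMS", "grade": "A"},
    {"rollno": 1, "name": "Preeti", "c_id": "C2", "c_title": "OS",   "grade": "B"},
    {"rollno": 2, "name": "Sweety", "c_id": "C1", "c_title": "DBMS", "grade": "B"},
    {"rollno": 2, "name": "Sweety", "c_id": "C2", "c_title": "OS",   "grade": "A"},
]

# rollno -> name, c_id -> c_title, (rollno, c_id) -> grade
student = project(students, ["rollno", "name"])
course = project(students, ["c_id", "c_title"])
result = project(students, ["rollno", "c_id", "grade"])

print(len(student), len(course), len(result))
```

In each new relation the key determines every other attribute, and each name and course title is now stored once instead of once per enrolment.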

Third Normal Form (3NF)

Third normal form (3NF) is based on the notion of transitive
dependency. A transitive dependency A → C is a FD that can be
inferred from existing FDs A → B and B → C.
A relation is in third normal form (3NF) if it is in 2NF and there is no
non-primary key (non-prime) attribute that is transitively dependent
on the primary key.
Alternate definition from your text: A table is in 3NF if it is in 2NF and
each nonkey column depends only on candidate keys, not on other
nonkey columns.
Converting a relation to 3NF from 2NF involves the removal of
transitive dependencies. If a transitive dependency exists, we remove
the transitively dependent attributes from the relation and put them
in a new relation along with a copy of the determinant (LHS of FD).

Boyce-Codd Normal Form (BCNF)
A relation is in Boyce-Codd normal form (BCNF) if and
only if every determinant is a candidate key.
To test if a relation is in BCNF, we take the determinant of
each FD in the relation and determine if it is a candidate key.
The difference between 3NF and BCNF is that 3NF
allows a FD X → Y to remain in the relation if X is a
super key or Y is a prime attribute. BCNF only allows
this FD if X is a super key.
Thus, BCNF is more restrictive than 3NF. However, in
practice most relations in 3NF are also in BCNF.

Boyce-Codd Normal Form (BCNF)

Consider the WorksOn relation where we have the
added constraint that given the hours worked, we
know exactly the employee who performed the
work (i.e. each employee is FD on the hours that
they work on projects: hours → eno). Then:

Note that we lose the FD eno, pno → resp, hours.
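The BCNF test can be sketched with an attribute closure: for each non-trivial FD, check whether its determinant determines every attribute of the relation (that is, whether it is a super key). The attribute names and the two FDs come from the WorksOn example. The function names are hypothetical.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the non-trivial FDs whose determinant is not a super key."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not rhs <= lhs and closure(lhs, fds) != set(attributes)]


works_on = {"eno", "pno", "resp", "hours"}
fds = [
    ({"eno", "pno"}, {"resp", "hours"}),  # the key determines the rest
    ({"hours"}, {"eno"}),                 # the added constraint
]
violations = bcnf_violations(works_on, fds)
print(violations)
```

Only hours → eno is reported. It is acceptable under 3NF because eno is a prime attribute, but hours is not a super key, so the relation is not in BCNF.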

Multi-Valued Dependencies
A multi-valued dependency (MVD) occurs when two
independent, multi-valued attributes are present in the schema.
When these multi-valued attributes are flattened into a 1NF
relation, we must have a tuple for every combination of the
values in the two attributes.
It may seem strange why we would want to do this as it
obviously increases the number of tuples and redundancy.
The reason is that since the two attributes are independent it
does not make sense to store some combinations and not the
others because all combinations are equally valid. By leaving
out some combination, we are unintentionally favoring one
combination over the other which should not be the case.

Multi-Valued Dependencies
Example Employee may:
- work on many projects
- be in many departments
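A short sketch of why two independent multi-valued attributes force every combination into a single 1NF relation, and how keeping each attribute in its own relation (the usual fourth normal form decomposition) avoids that. The employee, project and department values are hypothetical.

```python
from itertools import product

# One employee with two independent multi-valued facts.
projects = {"E1": ["P1", "P2"]}
departments = {"E1": ["D1", "D2", "D3"]}

# Flattened into a single 1NF relation: every (project, department) pair.
flattened = [(emp, p, d)
             for emp in projects
             for p, d in product(projects[emp], departments[emp])]

# Decomposed: one relation per independent attribute.
emp_project = [(emp, p) for emp, ps in projects.items() for p in ps]
emp_department = [(emp, d) for emp, ds in departments.items() for d in ds]

print(len(flattened), len(emp_project) + len(emp_department))
```

The single relation needs 2 x 3 = 6 tuples, while the two separate relations need 2 + 3 = 5, and the gap widens as the lists grow.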

Fourth Normal Form (4NF) Example

Lossless-join Dependency
The lossless-join dependency refers to the
fact that whenever we decompose relations
using normalization we can rejoin the relations
to produce the original relation, such that no
spurious tuples are generated when the relations
are combined with a natural join.
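Whether a particular decomposition is lossless for a given set of rows can be checked by projecting, joining back, and comparing with the original. The sketch below reuses the course table from the anomaly slides. The helper names and the two decompositions are illustrative choices.

```python
def project(rows, attrs):
    """Project rows (dicts) onto attrs and drop duplicate tuples."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in result:
            result.append(projected)
    return result


def natural_join(left, right):
    """Join two lists of dict rows on all attribute names they share."""
    joined = []
    for l in left:
        for r in right:
            if all(l[a] == r[a] for a in l.keys() & r.keys()):
                row = {**l, **r}
                if row not in joined:
                    joined.append(row)
    return joined


def same_relation(a, b):
    """Compare two relations ignoring row order and attribute order."""
    canon = lambda rows: {tuple(sorted(row.items())) for row in rows}
    return canon(a) == canon(b)


flat = [
    {"course_no": 353, "tutor": "Smith",  "room": "A532", "room_size": 45,  "en_limit": 40},
    {"course_no": 351, "tutor": "Smith",  "room": "C320", "room_size": 100, "en_limit": 60},
    {"course_no": 355, "tutor": "Clark",  "room": "H940", "room_size": 400, "en_limit": 300},
    {"course_no": 456, "tutor": "Turner", "room": "H940", "room_size": 400, "en_limit": 45},
]

# Split on room, which determines room_size: the join gives back the original.
good = natural_join(project(flat, ["course_no", "tutor", "room", "en_limit"]),
                    project(flat, ["room", "room_size"]))

# Split on tutor, which determines nothing else: Smith's rows get mixed up.
bad = natural_join(project(flat, ["course_no", "tutor"]),
                   project(flat, ["tutor", "room", "room_size", "en_limit"]))

print(same_relation(good, flat), len(bad) - len(flat))
```

The first decomposition is lossless on this data, while the second produces two spurious tuples, pairing each of Smith's courses with the other course's room.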

Fifth Normal Form (5NF)

Fifth normal form (5NF) is based on join
dependencies.
A relation R is in fifth normal form (5NF) if and only
if every nontrivial join dependency is implied by the
super keys of R.
A join dependency (JD) denoted by JD(R1, R2, ..., Rn)
on relational schema R specifies a constraint on
the states r of R. The constraint states that every
legal state r of R is equal to the join of its projections
on R1, R2, ..., Rn. That is, for every such r we have:
$\pi_{R_1}(r) \bowtie \pi_{R_2}(r) \bowtie \cdots \bowtie \pi_{R_n}(r) = r$

Fifth Normal Form (5NF) Example

Let R be in BCNF and let R have no composite keys. Then R is in 5NF.

Note that only joining all three relations together will get you back to the original
relation. Joining any two will create spurious tuples!

THANK YOU!!!
ALL THE BEST FOR MTE
