Sei sulla pagina 1di 4

Relational algebra

Query Algebra

 Can use the relational algebra: an algebra of relations


 Not seen by the user. User sees SQL
 Makes it easier to manipulate relations, however:
o SQL uses “bags” (allows duplicate rows) not “sets”
 SQL adds more operations:
o Aggregation
o Grouping
o Sorting
 Different algebraic variations of a query are called plans

Relational Algebra

 The results of an operator are themselves relations


 The relational algebra forms a closed system, and operators can be composed:
 Op3(Op2(Op1(R)))

Relational Algebra operator

 Set operators: Union, intersection, difference; in SQL, UNION, INTERSECT and EXCEPT
 Selection; in SQL a select with optional where clause
 Projection: chooses columns; in SQL specify them in a select, rename them, and compute
them
 Product: in SQL a join with no conditions
 Join: in SQL several types of joins

Extension operators

 Elimination of duplicates: turns a bag into a set; in SQL use distinct in the select clause
 Grouping: in SQL use GROUP BY and aggregate functions
 Sorting: in SQL use ORDER BY

An algebra of bags

 R ⋃ S : a tuple is in the result the number of times it is in R plus the number of times in S
 R ∩ S : a tuple is in the result the minimum of times it is in R and in S
 R – S : a tuple is in the result the number of times it is in R minus the times it is in S, but not
fewer than 0 times
 R and S must have the same schema

If R and S are sets

 R ∩ S and R – S would give the same result as na algebra of sets


 R ⋃ S would not: an element in both sets would appear twice in the result

In SQL

 UNION, INTERSECT and EXCEPT eliminate duplicates from the result


 Unless we specify the keyword ALL
 select …. UNION ALL select ….
Selection operator

 σc (R) : applies condition c to relation R


 Result is the bag of rows that satisfies c
 Condition c may involve:
 Arithmetic,…., string operators, comparisons, and boolean operators (AND, …)
 In SQL we can have sub-queries and operators such as EXISTS
 Sub-queries should be expressed as joins in this relational algebra (cannot be expressed as
conditions)

Projection operator

 πL (R) : projects the relation R into the list L


 The list L can also rename attributes, and use string and numeric operations
 The schema of the resulting relation is L
 Note that the result may have duplicates even if R does not have

Product operator

 R × S : the schema of the resulting relation has all attributes from R plus all from S (qualified
if they have the same name)
 A tuple in the result is formed by a tuple of R concatenated by a tuple of S
 And this for every tuple of R

Join operator

 R ⋈ S : equivalent to πL ( sc (R x S))
 Natural join: compares common columns of both tables using a condition c
o Should make sure common columns exist
o Projects only one of each equated attribute
 Cartesian join: the product of two tables
 θ-join: or equi-join if θis the = operator
o The equi-join comparing two attributes is the most common form of join (also called
inner join)
o Equated columns appear in the result

Renaming

 ρ(R)
 Used to rename result relations or columns
 ρects ← credits(course)

Duplicate elimination

 δ (R) : the result has no duplicates, as with SQL DISTINCT keyword


 Converts a relation from a multiset (allows duplicates) to a set (no duplicates)

Sorting operator

 τL (R)
 Converts a bag to a list (implies ordering)
 The tuples in relation R are sorted according to the sorting attributes (in ascending order by
default)
 In SQL, ORDER BY the attributes in the list L
 To order by descending order use the keyword DESC
 If another operator is applied next, order is lost

Grouping

 γL (R)
 Aggregates tuples within a relation into subsets
 Aggregation uses a function f in a single attribute (min, max, count, sum, avg)
 The elements in L are either:
o A grouping attribute
o An aggregation function

Outer joins

 Outer joins allow dangling tuples (no matching pair) to appear in the result by padding them
with ꓕ(NULL) values
 Left outer joins keep dangling tuples from the left relation
 Right outer joins keep dangling tuples from the right relation

Semi-join

 R ⋉ S and R ⋊ S
 Join and keep only the schema at left (right)

Anti-joins

 R⊳S
 Returns tuples of R having no matching in S
 Also known as anti-semi-join, can be writen as
 R⊳S=R−R⋉S

Division

 R÷S
 Returns tuples of R with a match in all tuples of S There is no corresponding SQL operator,
but there are several alternatives
 Can answer questions with “give all tuples that match a given condition”
 Can be translated as:

Properties of the operators

 Natural join, multiplication, union and intersction are associative and commutative
o R ⋈ S ≡ R ⋈ S, (R ⋈ S) ⋈ T ≡ R ⋈ (S ⋈ T)
o R × S ≡ R × S, (R × S) × T ≡ R × (S × T)
o R ⋃ S ≡ R ⋃ S, (R ⋃ S) ⋃ T ≡ R ⋃ (S ⋃ T)
o R ∩ S ≡ R ∩ S, (R ∩ S) ∩ T ≡ R ∩ (S ∩ T)
 Selection is commutative:
o σ1 (σ 2 (R)) ≡ σ 2 (σ 1 (R))
 Difference is neither associative nor commutative
 Teta join may not be associative (natural join is)
Equivalence rules

Expression trees

 Operators can be chained, resulting in expression trees


 Leaves of the tree are names of relations
 Interior nodes are operators
 Execution proceeds bottom-up
 There are generally equivalent trees

Example tree

 A typical SPJ (select-project-join) expression


 Write the relational algebra expression and the SQL of:

Transformations

 Expression trees can be improved by the DBMS


 Improvements depend on heuristics and the information from the catalog (size of columns,
number of tuples)
 There is a special module in charge of optimizing queries: the query optimizer!

Potrebbero piacerti anche