
Database Theory

Introduction to Data Retrieval

21 Jan 2012

EKSWAcademy

Role of Databases
[Diagram: a Report, a Program, and an App, written in a native language (C/C++, Java etc.), reach the Database through SQL]


What is inside a Database


Database or a Schema
    Examples: Access, MS SQL, Oracle, MySQL, DB2 etc

Tables or Relations

Row, Tuple or Item


Column, Field, or Attribute


Structured Query Language

A standard language for interacting with databases
High level of abstraction
Declarative in nature
Interactive as well as programmatic

Vendor Specific Names:
    Transact-SQL (Microsoft)
    PL/SQL (Oracle)


Relational Algebra / Set Theory


[Venn diagrams: Universe, Two Sets, Union, Intersection, Difference, Complement]


Types of SQL

For Data Retrieval and Data Modification: DML
For Data Definition: DDL
For Data Control: DCL

We will focus entirely on DML today


Data Retrieval

Three Basic Groups
    Select to Retrieve Rows/Records
    Project to Retrieve Columns/Fields
    Join to Merge multiple Tables

We can mix and match


Basic Selection

Select * from Employees

Select * from Employees where BIRTHDATE = '1980-01-01'

Select top 10 * from Employees

Select EMPLID, NAME, CITY from Employees where BIRTHDATE = '1980-01-01' and CITY='New York'
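The selections above can be tried out with a small sketch like the following. It assumes Python's built-in sqlite3 module and a few made-up sample rows; SQLite has no "top N", so the row limit is written with LIMIT instead.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the slide.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees (EMPLID INTEGER, NAME TEXT, CITY TEXT, BIRTHDATE TEXT)"
)
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?, ?)",
    [
        (1, "Ann", "New York", "1980-01-01"),
        (2, "Bob", "Buffalo", "1980-01-01"),
        (3, "Cid", "New York", "1975-06-15"),
    ],
)

# Selection: every column, only the rows that match the condition.
born_1980 = conn.execute(
    "SELECT * FROM Employees WHERE BIRTHDATE = '1980-01-01' ORDER BY EMPLID"
).fetchall()

# Selection plus projection: chosen columns, two conditions.
ny_1980 = conn.execute(
    "SELECT EMPLID, NAME, CITY FROM Employees "
    "WHERE BIRTHDATE = '1980-01-01' AND CITY = 'New York'"
).fetchall()

# SQLite's spelling of "Select top 10 *".
first_rows = conn.execute("SELECT * FROM Employees LIMIT 10").fetchall()
```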


Simple Select Statement

You can Alias Tables and Columns

Select EMPLID SECURE_ID, NAME EMPLOYEE From EMPLOYEES E where E.COUNTRY='US'

Database Differences
    Some Databases are Case Sensitive (install option)
    Row limiting syntax differs: Top N, Fetch N Rows Only, LIMIT, ROWNUM etc
    The statement end character is different


Simple Data Types

String Types : Char, Varchar, Text etc
    NAME, CITY, ZIP etc

Number Types : Integer, Decimal etc
    Items, Salary, Rate, Price, Discount

Date Types : Date, Date and Time
    DOB, Hire Date, Purchase Date

Large Object Types : Blob (binary), Clob (character) etc
    Attachment, Picture, File

Principle of Closure
Type of the result of an operation should be the same as the types of inputs to the operation
In SQL, the Result of a Query is Always equivalent to another Table

You can mix columns of the same type
    ITEMS_SOLD + ITEMS_INSTOCK = TOTAL_ITEMS
    FIRST_NAME || LAST_NAME = NAME
    NUM_MILES || 'MILES' = may be invalid!
    TODAY - BIRTHDATE = Days (may be valid, but does not obey the closure rule)



Common Column Operations


Arithmetic (Most common)
    +, -, *, /, mod

String (Most useful)
    concat, trim, substring, replace, pad, left

Date (Most complex)
    Current date, Date Add, DateDiff, Format



Revisit Closure
Most column operations require that the types match
You can convert values from one type to another

Two Type Conversion Functions
    Cast and Convert

CAST (DISTANCE AS CHAR(6)) || 'MILES'
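As a rough illustration of the cast above, the sketch below converts a number to text before concatenating it. It assumes SQLite through Python's sqlite3 module and a hypothetical TRIPS table; SQLite would also convert implicitly, but the explicit CAST keeps the statement portable to stricter databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TRIPS (DISTANCE INTEGER)")  # hypothetical table
conn.execute("INSERT INTO TRIPS VALUES (120)")

# Cast the number to text first, so both operands of || are strings.
label, label_type = conn.execute(
    "SELECT CAST(DISTANCE AS TEXT) || ' MILES', "
    "       typeof(CAST(DISTANCE AS TEXT) || ' MILES') "
    "FROM TRIPS"
).fetchone()
```

Here `typeof` is SQLite's own function for inspecting the type of a value; other databases offer different tools for this.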


Selection with Sort


You can order the records retrieved
    Select * from Employees Order by BIRTHDATE
    Select * from Employees Order by CITY, BIRTHDATE

Limit the rows with a sort
    Select top 5 * from SALES order by ITEMS_SOLD

You can use simple or complex expressions for ordering
    Select * from Employees Order by CITY, (TODAY - BIRTHDATE)

Without an explicit ORDER BY the order of the retrieved rows is arbitrary
ORDER BY is expensive, so use only when needed
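A small sketch of limiting rows with a sort, assuming SQLite (LIMIT instead of "top N") and made-up SALES rows. Note that ORDER BY sorts ascending by default, so the slide's query returns the five smallest values; DESC is needed to get the best sellers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SALES (PRODUCT TEXT, ITEMS_SOLD INTEGER)")
conn.executemany(
    "INSERT INTO SALES VALUES (?, ?)",
    [("pen", 40), ("ink", 5), ("pad", 25), ("cap", 60)],  # hypothetical rows
)

# Ascending sort (the default): the lowest sellers come first.
bottom_two = conn.execute(
    "SELECT PRODUCT, ITEMS_SOLD FROM SALES ORDER BY ITEMS_SOLD LIMIT 2"
).fetchall()

# Descending sort: the best sellers come first.
top_two = conn.execute(
    "SELECT PRODUCT, ITEMS_SOLD FROM SALES ORDER BY ITEMS_SOLD DESC LIMIT 2"
).fetchall()
```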


Missing Data / NULL Value Columns

Oh! Noes!! I don't know her BIRTHDATE!
You simply put a NULL where the value is missing (if permitted)
You can ask the DB not to allow nulls in any column!!

Space is not a NULL
Zero is not a NULL
False is not a NULL
NULL is an Unknown, Unavailable, or Missing value

NULLs need to be handled carefully



Handling NULL columns


Select Name, coalesce(CITY,'Unknown') as CITY From Employees

The Coalesce function forces a default value when the value is missing
Vendor-specific relatives: NVL, IFNULL etc (NULLIF does the reverse: it turns a given value into NULL)

Select Name, coalesce(CITY,'Unknown') as CITY From Employees where BIRTHDATE is null

This gives the employees whose BIRTHDATE is missing

BIRTHDATE = null is never true! (the comparison evaluates to unknown)
BIRTHDATE != null is also never true!
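The NULL rules above can be checked with a short sketch. It assumes SQLite through Python's sqlite3 module and two made-up employees, one of them with a missing CITY and BIRTHDATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (NAME TEXT, CITY TEXT, BIRTHDATE TEXT)")
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?)",
    [("Ann", "Boston", "1980-01-01"), ("Bob", None, None)],  # None is stored as NULL
)

# coalesce supplies a default where CITY is NULL.
with_default = conn.execute(
    "SELECT NAME, coalesce(CITY, 'Unknown') FROM Employees ORDER BY NAME"
).fetchall()

# IS NULL is the only test that finds the missing values.
is_null = conn.execute(
    "SELECT NAME FROM Employees WHERE BIRTHDATE IS NULL"
).fetchall()

# Both comparisons evaluate to unknown for every row, so no row qualifies.
equals_null = conn.execute(
    "SELECT NAME FROM Employees WHERE BIRTHDATE = NULL"
).fetchall()
not_equals_null = conn.execute(
    "SELECT NAME FROM Employees WHERE BIRTHDATE != NULL"
).fetchall()
```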

Conditional Expression
Select Case
         When years(Today, BIRTHDATE) between 1 and 35 then 14
         When years(Today, BIRTHDATE) between 36 and 55 then 21
         Else 28
       end VACATION
From Employees
Where BIRTHDATE is not null

You can use Conditional Expressions in the Where clause also

Coalesce is shorthand for a simple Case expression!
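A runnable approximation of the Case expression, assuming SQLite. The slide's years() function is not available there, so the age is estimated from julianday() differences divided by 365.25; the fixed TODAY value and the sample rows are made up for the sketch.

```python
import sqlite3

TODAY = "2012-01-21"  # fixed "today" so the result is repeatable (assumption)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (NAME TEXT, BIRTHDATE TEXT)")
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?)",
    [
        ("Ann", "1990-01-01"),
        ("Bob", "1970-01-01"),
        ("Cid", "1950-01-01"),
        ("Dee", None),  # unknown birthdate, filtered out below
    ],
)

# Age in whole years, approximated as elapsed days / 365.25.
AGE = "CAST((julianday(:today) - julianday(BIRTHDATE)) / 365.25 AS INTEGER)"

query = f"""
SELECT NAME,
       CASE
         WHEN {AGE} BETWEEN 1 AND 35 THEN 14
         WHEN {AGE} BETWEEN 36 AND 55 THEN 21
         ELSE 28
       END AS VACATION
FROM Employees
WHERE BIRTHDATE IS NOT NULL
ORDER BY NAME
"""
vacation = conn.execute(query, {"today": TODAY}).fetchall()
```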


Boolean Logic
AND, OR, NOT, BETWEEN, IN, IS NULL

SELECT CustomerName, QuantityPurchased
FROM Orders
WHERE QuantityPurchased > 3
  AND PURCHASE_DATE > '2011-01-01'
  AND PURCHASE_DATE < '2011-12-31'

SELECT CustomerName, QuantityPurchased
FROM Orders
WHERE (QuantityPurchased > 3 or QuantityReserved > 3)
  AND (PURCHASE_DATE BETWEEN '2011-01-01' AND '2011-12-31')
  AND PURCHASE_DATE is not NULL
  AND CITY in ('New York', 'Buffalo', 'Rochester')
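The combined conditions can be exercised with a sketch like this one, assuming SQLite and five made-up orders, each chosen to pass or fail exactly one of the conditions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders (CustomerName TEXT, QuantityPurchased INTEGER, "
    "QuantityReserved INTEGER, PURCHASE_DATE TEXT, CITY TEXT)"
)
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?, ?, ?)",
    [
        ("Ann", 5, 0, "2011-03-10", "New York"),   # passes every condition
        ("Bob", 1, 4, "2011-07-01", "Buffalo"),    # passes through the OR branch
        ("Cid", 9, 0, "2010-12-31", "Rochester"),  # date outside the range
        ("Dee", 9, 0, "2011-05-05", "Boston"),     # city not in the list
        ("Eve", 9, 0, None, "New York"),           # unknown purchase date
    ],
)

matches = conn.execute(
    """
    SELECT CustomerName, QuantityPurchased
    FROM Orders
    WHERE (QuantityPurchased > 3 OR QuantityReserved > 3)
      AND (PURCHASE_DATE BETWEEN '2011-01-01' AND '2011-12-31')
      AND PURCHASE_DATE IS NOT NULL
      AND CITY IN ('New York', 'Buffalo', 'Rochester')
    ORDER BY CustomerName
    """
).fetchall()
```

Eve's row would be rejected even without the IS NOT NULL test, because BETWEEN on a NULL value is never true; the explicit test just makes the intent visible.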


Wild Cards

You can search by incomplete data

Select * from Movies Where Title like '%Indiana Jones%'

Soundex method to find words that sound alike
Levenshtein distance to find words that are similar but not the same
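A minimal sketch of the wildcard search, assuming SQLite and three made-up titles. The % sign matches any run of characters, so the pattern finds the phrase anywhere in the title, while a pattern without a leading % only matches titles that start with the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movies (Title TEXT)")
conn.executemany(
    "INSERT INTO Movies VALUES (?)",
    [
        ("Indiana Jones and the Last Crusade",),
        ("Raiders of the Lost Ark",),
        ("Young Indiana Jones",),
    ],
)

# Phrase anywhere in the title.
anywhere = conn.execute(
    "SELECT Title FROM Movies WHERE Title LIKE '%Indiana Jones%' ORDER BY Title"
).fetchall()

# Phrase at the start of the title only.
prefix_only = conn.execute(
    "SELECT Title FROM Movies WHERE Title LIKE 'Indiana Jones%'"
).fetchall()
```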


Avoiding Duplicates

Select Distinct ARTIST from ALBUMS Where YEAR_RECORDED Between 2000 and 2012

Distinct is an expensive operation. Use of distinct can be minimized with more stringent data design


Grouping the Data

Grouping accompanies Summarization
Summarization Functions are
    Sum, Avg, Count, Max, Min


Group By
Select CITY, MONTH,
       Sum (coalesce(Products_Shipped,0)),
       Sum (coalesce(Products_Reserved,0)),
       Sum (coalesce(Products_Returned,0)),
       Count (distinct CUSTOMER_ID),
       Avg (days(Product_Ordered,Product_Shipped))
From ORDERS
Group By CITY, MONTH

All columns in the Select List must either be in the Group By or use a Summarization Function
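A trimmed-down version of the grouping query, assuming SQLite and a hypothetical ORDERS table with only the columns needed here. Every selected column is either in the Group By list or wrapped in a summarization function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ORDERS (CITY TEXT, MONTH INTEGER, "
    "CUSTOMER_ID INTEGER, Products_Shipped INTEGER)"
)
conn.executemany(
    "INSERT INTO ORDERS VALUES (?, ?, ?, ?)",
    [
        ("Buffalo", 1, 10, 3),
        ("Buffalo", 1, 10, None),  # nothing shipped yet: counted as 0
        ("Buffalo", 1, 11, 2),
        ("Albany", 1, 12, 4),
        ("Albany", 2, 12, 1),
    ],
)

summary = conn.execute(
    """
    SELECT CITY, MONTH,
           SUM(coalesce(Products_Shipped, 0)),
           COUNT(DISTINCT CUSTOMER_ID)
    FROM ORDERS
    GROUP BY CITY, MONTH
    ORDER BY CITY, MONTH
    """
).fetchall()
```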


Using Summary Functions in where


Summarization functions cannot appear in Where; filter the groups with Having instead

Select CITY, MONTH,
       Sum (coalesce(Products_Shipped,0)),
       Sum (coalesce(Products_Reserved,0)),
       Sum (coalesce(Products_Returned,0)),
       Count (distinct CUSTOMER_ID)
From ORDERS
Group By CITY, MONTH
Having Avg (days(Product_Ordered,Product_Shipped)) > 3

Make sure that the Group function in Having does not evaluate to null!!
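The Having filter and the warning about nulls can be sketched as follows. This assumes SQLite, where the slide's days() function is replaced by a julianday() difference, and a made-up ORDERS table in which one city has no shipped orders at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ORDERS (CITY TEXT, Product_Ordered TEXT, Product_Shipped TEXT)"
)
conn.executemany(
    "INSERT INTO ORDERS VALUES (?, ?, ?)",
    [
        ("Albany", "2011-01-01", "2011-01-02"),   # 1 day
        ("Albany", "2011-01-05", "2011-01-08"),   # 3 days, average 2
        ("Buffalo", "2011-01-01", "2011-01-06"),  # 5 days
        ("Buffalo", "2011-01-10", "2011-01-15"),  # 5 days, average 5
        ("Utica", "2011-01-01", None),            # never shipped, average is NULL
    ],
)

# Days between order and shipment; NULL when the order has not shipped.
DAYS = "julianday(Product_Shipped) - julianday(Product_Ordered)"

slow_cities = conn.execute(
    f"""
    SELECT CITY, AVG({DAYS}) AS AVG_DAYS
    FROM ORDERS
    GROUP BY CITY
    HAVING AVG({DAYS}) > 3
    ORDER BY CITY
    """
).fetchall()
```

Utica disappears from the result without any error: its average is NULL, the comparison is unknown, and the group is dropped. That silent loss is the reason to guard the Having expression.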


Orthogonality
The Components of a Language should be independent of each other
They should interact with each other in a predictable manner

TABLE  : Nothing but a set of Rows (A Vector)
COLUMN : Nothing but a single value (A Scalar)
QUERY  : Nothing but a Vector

A Query can use another query that returns a set of rows in place of a Table
A Query can use another query that returns a single value in place of a Column/Scalar


Revisiting Closure
The final Output of ANY Query is equivalent to a Table with rows and columns

As such, any Query can become Part of another Query, since every Query needs a Table!!


Sub-Queries
Several Types:
    Returning Scalars
    Returning Rows
    Correlated
    Non-Correlated


Simple Scalar SubQuery


select EMPLID, NAME, ZIP,
       (select max(NUM_DAYS) from VACATION where VACATION_TYPE='Public') PUBLIC_HOLIDAYS
from EMPLOYEES

The PUBLIC_HOLIDAYS are available in the Vacation table, so we used a subquery to get them.

It returns a single value (it should, rather)
It is independent, and so is called non-correlated
Quite efficient: usually run only once for the entire query, irrespective of the number of rows returned
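A runnable sketch of the non-correlated scalar subquery, assuming SQLite and made-up rows for both tables. The inner query does not look at the outer row, so every employee receives the same value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEES (EMPLID INTEGER, NAME TEXT, ZIP TEXT)")
conn.execute("CREATE TABLE VACATION (VACATION_TYPE TEXT, ZIP TEXT, NUM_DAYS INTEGER)")
conn.executemany(
    "INSERT INTO EMPLOYEES VALUES (?, ?, ?)",
    [(1, "Ann", "10001"), (2, "Bob", "14201")],
)
conn.executemany(
    "INSERT INTO VACATION VALUES (?, ?, ?)",
    [("Public", "10001", 12), ("Public", "14201", 9), ("Sick", "10001", 5)],
)

# max() guarantees that the subquery yields exactly one value.
rows = conn.execute(
    """
    SELECT EMPLID, NAME, ZIP,
           (SELECT max(NUM_DAYS) FROM VACATION
            WHERE VACATION_TYPE = 'Public') AS PUBLIC_HOLIDAYS
    FROM EMPLOYEES
    ORDER BY EMPLID
    """
).fetchall()
```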


Correlated Sub Query


It is run once for every row!

select EMPLID, NAME, ZIP,
       (select NUM_DAYS from VACATION where VACATION_TYPE='Public' and VACATION.ZIP = EMPLOYEES.ZIP) PUBLIC_HOLIDAYS
from EMPLOYEES

The subquery needs the ZIP from the current row to find the value for that row, hence it is called correlated, and it runs for every row. Expensive, but more meaningful.
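The correlated form can be sketched the same way, again assuming SQLite and made-up rows. Each employee now gets the value for their own ZIP, and an employee whose ZIP has no matching vacation row gets NULL (None in Python).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEES (EMPLID INTEGER, NAME TEXT, ZIP TEXT)")
conn.execute("CREATE TABLE VACATION (VACATION_TYPE TEXT, ZIP TEXT, NUM_DAYS INTEGER)")
conn.executemany(
    "INSERT INTO EMPLOYEES VALUES (?, ?, ?)",
    [(1, "Ann", "10001"), (2, "Bob", "14201"), (3, "Cid", "99999")],
)
conn.executemany(
    "INSERT INTO VACATION VALUES (?, ?, ?)",
    [("Public", "10001", 12), ("Public", "14201", 9)],
)

# The inner WHERE refers to EMPLOYEES.ZIP, which comes from the outer row.
rows = conn.execute(
    """
    SELECT EMPLID, NAME, ZIP,
           (SELECT NUM_DAYS FROM VACATION
            WHERE VACATION_TYPE = 'Public'
              AND VACATION.ZIP = EMPLOYEES.ZIP) AS PUBLIC_HOLIDAYS
    FROM EMPLOYEES
    ORDER BY EMPLID
    """
).fetchall()
```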


Scalar Sub Query in where clause


select EMPLID, NAME, ZIP
from EMPLOYEES
where (select NUM_DAYS from VACATION where VACATION_TYPE='Public' and VACATION.ZIP = EMPLOYEES.ZIP) < 10

In Essence:
    Follow Closure: To mix it with other scalars of the same Type
    Follow Orthogonality: To Place it wherever a Scalar is expected
    Make Sure: That the subquery produces a scalar always! That means only one Value, one Row
        Use Coalesce() to prevent nulls
        Use Max() to prevent mishaps
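The two safeguards can be seen side by side in the sketch below, which assumes SQLite and made-up rows: one ZIP has two vacation rows and one ZIP has none. Treating a missing value as 0 is an assumption made for the example, not a rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEES (EMPLID INTEGER, NAME TEXT, ZIP TEXT)")
conn.execute("CREATE TABLE VACATION (VACATION_TYPE TEXT, ZIP TEXT, NUM_DAYS INTEGER)")
conn.executemany(
    "INSERT INTO EMPLOYEES VALUES (?, ?, ?)",
    [(1, "Ann", "10001"), (2, "Bob", "14201"), (3, "Cid", "99999")],
)
conn.executemany(
    "INSERT INTO VACATION VALUES (?, ?, ?)",
    [
        ("Public", "10001", 12),
        ("Public", "10001", 8),   # duplicate ZIP: max() picks one value
        ("Public", "14201", 9),
    ],
)

# max() collapses several matching rows into exactly one value.
SUBQUERY = """
    (SELECT max(NUM_DAYS) FROM VACATION
     WHERE VACATION_TYPE = 'Public' AND VACATION.ZIP = EMPLOYEES.ZIP)
"""

# Without coalesce, Cid's NULL makes the comparison unknown and drops the row.
unguarded = conn.execute(
    f"SELECT EMPLID, NAME, ZIP FROM EMPLOYEES WHERE {SUBQUERY} < 10 ORDER BY EMPLID"
).fetchall()

# With coalesce, the missing value counts as 0 and the row is kept.
guarded = conn.execute(
    f"SELECT EMPLID, NAME, ZIP FROM EMPLOYEES "
    f"WHERE coalesce({SUBQUERY}, 0) < 10 ORDER BY EMPLID"
).fetchall()
```

SQLite quietly uses the first row when a scalar subquery returns several, while many other databases raise an error, which is one more reason to wrap the column in max().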


Vector Sub Query


It has to produce rows and columns
It can take the place of any table in the query

Simplest Form:

Select NAME, ZIP, EMPLID from (select * from EMPLOYEES) A Where NAME like '%Jim%'
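A quick sketch of a subquery standing in for a table, assuming SQLite and three made-up employees. The inner query is given the alias A and is then filtered like any ordinary table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEES (EMPLID INTEGER, NAME TEXT, ZIP TEXT)")
conn.executemany(
    "INSERT INTO EMPLOYEES VALUES (?, ?, ?)",
    [(1, "Jim Hall", "10001"), (2, "Ann Lee", "14201"), (3, "Jimmy Page", "60601")],
)

# The parenthesised query produces rows and columns, so it can replace a table.
jims = conn.execute(
    """
    SELECT NAME, ZIP, EMPLID
    FROM (SELECT * FROM EMPLOYEES) A
    WHERE NAME LIKE '%Jim%'
    ORDER BY EMPLID
    """
).fetchall()
```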
