Introduction
You can use SQL SELECT statements to retrieve data. The actual combination of clauses you employ in your statement depends on the kind of data you need. In most situations, the data you require would be spread across several tables. SQL provides joins, unions, and subqueries to retrieve data located across tables. To obtain summarized data, you can use aggregate functions combined with the GROUP BY ... HAVING, COMPUTE, ROLLUP, and CUBE clauses.
We will examine unions and subqueries in this session and joins in a later session.
Unions
The UNION operator combines the results of two or more SELECT statements into a single result set. Each SELECT statement must have the same structure: the same number of columns, with compatible data types in corresponding positions. The column names can differ between the SELECT statements; the result set displays the column names of the first SELECT statement only.
Syntax:
SELECT statement UNION [ALL] SELECT statement
By default, the UNION operator removes duplicate rows from the result set. If you use the ALL keyword with UNION, all the rows are returned, including duplicates.
Let us consider a banking application. The Saving_Account and the Current_Account tables store information on customers with savings and current accounts respectively. To see the Account_No and Name of all the bank customers, you have to retrieve the records from both tables.
Saving_Account table
Account_No    Name
S001          James
S002          Rita
S003          Mary
S004          Valentina
Current_Account table
Account_No    Name
C001          Michael
C002          Robin
The following query retrieves the details of all the customers of the bank.
SELECT Account_No, Name FROM Saving_Account
UNION
SELECT Account_No, Name FROM Current_Account
The above query will display the following result:
Account_No    Name
S001          James
S002          Rita
S003          Mary
S004          Valentina
C001          Michael
C002          Robin
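The difference between UNION and UNION ALL can be tried out with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, so details may differ on a real server. The row C003 is a hypothetical addition (not part of the tables above), included only so that one name occurs in both tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Saving_Account  (Account_No TEXT PRIMARY KEY, Name TEXT);
    CREATE TABLE Current_Account (Account_No TEXT PRIMARY KEY, Name TEXT);
    INSERT INTO Saving_Account VALUES
        ('S001', 'James'), ('S002', 'Rita'), ('S003', 'Mary'), ('S004', 'Valentina');
    -- C003 is a hypothetical extra row so that 'Mary' occurs in both tables.
    INSERT INTO Current_Account VALUES
        ('C001', 'Michael'), ('C002', 'Robin'), ('C003', 'Mary');
""")

# UNION removes duplicate rows from the combined result.
union_names = conn.execute(
    "SELECT Name FROM Saving_Account UNION SELECT Name FROM Current_Account"
).fetchall()

# UNION ALL keeps every row, including duplicates.
union_all_names = conn.execute(
    "SELECT Name FROM Saving_Account UNION ALL SELECT Name FROM Current_Account"
).fetchall()

print(len(union_names))      # 6: 'Mary' appears once
print(len(union_all_names))  # 7: 'Mary' appears twice
```

Only the Name column is selected here because rows are compared as a whole: with Account_No included, no two rows would be identical and both queries would return seven rows.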
Subqueries
You can use one SELECT statement to return records that another SELECT statement will use. The encapsulating query is called the parent query and the inner query is called the subquery.
Parent query:
SELECT <Column Name> FROM <Table>
WHERE <Column Name> = (Subquery)
Prepared: Alexander Montealegre Ramirez
The subquery should return the column used in the criteria of the parent SELECT statement. Conceptually, the innermost SELECT statement is executed first, and its result is passed to the enclosing statement. This saves you steps that would otherwise require two separate queries to locate the information. Many SQL statements that include subqueries can also be formulated as joins.
For example, let us execute a subquery for the Fly Safe Airways case study. Consider that you want to know the start point (source city name) for the available Air China flights. The Flight table holds the city_code, but it is not useful in this case, as you want the city name. The city information is stored in the City_master table.
You can use a subquery as a replacement for a value in the SELECT clause, or as part of the WHERE clause. When a subquery is introduced with an operator, there are restrictions on the columns and rows that the subquery returns. These restrictions are as follows:
One row, one column: use =, >, < and the other comparison operators.
One row, many columns: use EXISTS.
Many rows, one column: use ANY, ALL, IN, or EXISTS.
Many rows, many columns: use EXISTS.
If a comparison operator is used with a subquery without ANY or ALL, and the subquery returns more than one row, SQL Server will return an error. Many subqueries can be rewritten as joins.
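To illustrate a subquery that returns many rows of one column, and its rewrite as a join, here is a minimal sketch. It runs on Python's built-in sqlite3 module rather than SQL Server, and the column layout and sample rows of city_master and flight are assumptions made for illustration (the case study's real schema is not shown in this session).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema and rows, for illustration only.
    CREATE TABLE city_master (city_code TEXT PRIMARY KEY, city_name TEXT);
    CREATE TABLE flight (aircraft_code TEXT PRIMARY KEY, source TEXT, destination TEXT);
    INSERT INTO city_master VALUES
        ('SHA', 'Shanghai'), ('PEK', 'Beijing'), ('NY', 'New York'), ('LON', 'London');
    INSERT INTO flight VALUES
        ('AC01', 'SHA', 'NY'), ('AC02', 'SHA', 'PEK'), ('AC03', 'PEK', 'NY');
""")

# Subquery form: IN accepts a subquery that returns many rows of one column.
with_subquery = conn.execute("""
    SELECT city_name FROM city_master
    WHERE city_code IN (SELECT source FROM flight)
    ORDER BY city_name
""").fetchall()

# Join form: DISTINCT is needed because Shanghai is the source of two flights.
with_join = conn.execute("""
    SELECT DISTINCT c.city_name
    FROM city_master AS c
    JOIN flight AS f ON f.source = c.city_code
    ORDER BY c.city_name
""").fetchall()

print(with_subquery)  # [('Beijing',), ('Shanghai',)]
print(with_join)      # the same rows
```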
You can use the EXISTS keyword, for example, to display the names of cities that are source cities for the stored flights. The EXISTS keyword ensures that a city is displayed only if the subquery returns TRUE for the condition that checks the city code's existence in the flight table.
SELECT * FROM city_master
WHERE EXISTS (SELECT source FROM flight
WHERE flight.source = city_master.city_code)
Similarly, you can use the NOT EXISTS keyword to determine the cities that are not destination points of the available flights.
SELECT * FROM city_master
WHERE NOT EXISTS (SELECT destination FROM flight
WHERE flight.destination = city_master.city_code)
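The two queries above can be tried out as follows. This is a sketch on Python's built-in sqlite3 module, not on SQL Server, and the table layout and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema and rows, for illustration only.
    CREATE TABLE city_master (city_code TEXT PRIMARY KEY, city_name TEXT);
    CREATE TABLE flight (aircraft_code TEXT PRIMARY KEY, source TEXT, destination TEXT);
    INSERT INTO city_master VALUES
        ('SHA', 'Shanghai'), ('PEK', 'Beijing'), ('NY', 'New York'), ('LON', 'London');
    INSERT INTO flight VALUES
        ('AC01', 'SHA', 'NY'), ('AC02', 'SHA', 'PEK'), ('AC03', 'PEK', 'NY');
""")

# Cities that are the source of at least one flight.
source_cities = conn.execute("""
    SELECT city_name FROM city_master
    WHERE EXISTS (SELECT source FROM flight
                  WHERE flight.source = city_master.city_code)
    ORDER BY city_name
""").fetchall()

# Cities that are not the destination of any flight.
never_destination = conn.execute("""
    SELECT city_name FROM city_master
    WHERE NOT EXISTS (SELECT destination FROM flight
                      WHERE flight.destination = city_master.city_code)
    ORDER BY city_name
""").fetchall()

print(source_cities)      # [('Beijing',), ('Shanghai',)]
print(never_destination)  # [('London',), ('Shanghai',)]
```

EXISTS only tests whether the subquery returns any row, so the column list inside the subquery does not affect the result.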
Nested Subqueries
Subqueries can contain other subqueries; these are called nested subqueries. However, a deep level of nesting can lower the performance of the query. For example, we want to display the destination city for passengers with PNR_no 1. This query involves the following three tables:
Reservation table: to determine the aircraft code for PNR_no 1
Flight table: to find the destination city code for the aircraft
City_master table: to obtain the name of the destination city
SELECT city_name FROM city_master
WHERE city_code = (SELECT destination FROM flight
WHERE aircraft_code = (SELECT aircraft_code FROM reservation
WHERE pnr_no = 1))
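The nested query above can be followed step by step in a small sketch. It uses Python's built-in sqlite3 module in place of SQL Server; the three table layouts and their rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema and rows, for illustration only.
    CREATE TABLE city_master (city_code TEXT PRIMARY KEY, city_name TEXT);
    CREATE TABLE flight (aircraft_code TEXT PRIMARY KEY, source TEXT, destination TEXT);
    CREATE TABLE reservation (pnr_no INTEGER PRIMARY KEY, aircraft_code TEXT);
    INSERT INTO city_master VALUES
        ('SHA', 'Shanghai'), ('PEK', 'Beijing'), ('NY', 'New York');
    INSERT INTO flight VALUES
        ('AC01', 'SHA', 'NY'), ('AC02', 'SHA', 'PEK'), ('AC03', 'PEK', 'NY');
    INSERT INTO reservation VALUES (1, 'AC01'), (2, 'AC02');
""")

# Innermost level: which aircraft does PNR 1 travel on?
aircraft = conn.execute(
    "SELECT aircraft_code FROM reservation WHERE pnr_no = 1"
).fetchone()[0]

# All three levels in one statement.
destination_city = conn.execute("""
    SELECT city_name FROM city_master
    WHERE city_code = (SELECT destination FROM flight
                       WHERE aircraft_code = (SELECT aircraft_code
                                              FROM reservation
                                              WHERE pnr_no = 1))
""").fetchall()

print(aircraft)          # AC01
print(destination_city)  # [('New York',)]
```

The = operator is safe here only because each inner query returns a single row: pnr_no and aircraft_code are declared as primary keys in this sketch.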
Correlated Queries
In most queries that contain subqueries, the subquery needs to be evaluated only once to provide the values the parent query needs. In these queries, the value of the subquery remains constant, since the subquery makes no reference to the parent query. In some queries, however, the subquery must be evaluated once for each row of the parent query. This is because the search criterion in the subquery depends on the value of a particular record in the parent query. When a subquery takes parameters from its parent query, it is known as a correlated subquery.
Consider that you want to determine the cities from which Air China flights are available. This query involves the following two tables:
City_master table: provides details of various cities
Flight table: provides details of various flights
SELECT city_name FROM city_master
WHERE city_code IN (SELECT source FROM flight
WHERE city_master.city_code = flight.source)
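A correlated subquery can also stand in for a value in the SELECT clause. The following sketch is a variation on the query above, not the case study's own query: for every city it counts the flights that start there. It runs on Python's built-in sqlite3 module, and the schema and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema and rows, for illustration only.
    CREATE TABLE city_master (city_code TEXT PRIMARY KEY, city_name TEXT);
    CREATE TABLE flight (aircraft_code TEXT PRIMARY KEY, source TEXT, destination TEXT);
    INSERT INTO city_master VALUES
        ('SHA', 'Shanghai'), ('PEK', 'Beijing'), ('NY', 'New York'), ('LON', 'London');
    INSERT INTO flight VALUES
        ('AC01', 'SHA', 'NY'), ('AC02', 'SHA', 'PEK'), ('AC03', 'PEK', 'NY');
""")

# The inner query refers to city_master.city_code, a column of the parent
# query, so it is evaluated again for each city row.
departures = conn.execute("""
    SELECT city_name,
           (SELECT COUNT(*) FROM flight
            WHERE flight.source = city_master.city_code) AS departures
    FROM city_master
    ORDER BY city_name
""").fetchall()

print(departures)
# [('Beijing', 1), ('London', 0), ('New York', 0), ('Shanghai', 2)]
```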
Exercises: Subqueries
Construct a SELECT statement using subqueries to show the City and Address of the passengers for whom reservations have been made.
Aggregate Functions
SQL Server provides several aggregate functions to generate summary values. They can be applied to all the rows in a table, or to the rows that a WHERE clause specifies. Aggregate functions generate a single value from each set of rows. Unless the statement has a GROUP BY clause, each column in the SELECT list should use an aggregate function.
SUM
SUM returns the total of all the values in an expression. SUM supports the use of DISTINCT to total only the unique values in an expression, and it ignores NULL values. SUM can be used only with numeric columns.
Syntax: SUM([ALL | DISTINCT] Expression)
SELECT SUM(No_of_seats) AS [Seats Cancelled] FROM reservation WHERE Status = 'C'
AVG
The AVG function returns the average of all the values in an expression. This function can be used only with numeric columns. It automatically ignores NULL values.
Syntax: AVG([ALL | DISTINCT] Expression)
SELECT AVG(journey_hrs) AS [Average time to New York] FROM flight WHERE destination = 'NY'
COUNT
COUNT returns the number of non-NULL values in the provided expression. If used with DISTINCT, COUNT finds the number of unique values. COUNT supports both numeric and character columns. PRIMARY KEY columns can be safely used with COUNT, as they cannot contain NULL values; FOREIGN KEY columns can be used in the same way only if they are declared NOT NULL. You can also use an asterisk (*) as the COUNT expression. If you use an asterisk, you do not need to specify a particular column name, and all rows will be counted.
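The three forms of COUNT can be compared in a short sketch. It uses Python's built-in sqlite3 module instead of SQL Server, and the reservation columns and rows shown are assumptions chosen so that one class_code is NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema and rows, for illustration only.
    CREATE TABLE reservation (
        pnr_no INTEGER PRIMARY KEY,
        aircraft_code TEXT,
        class_code TEXT
    );
    INSERT INTO reservation VALUES
        (1, 'AC01', 'EX'),
        (2, 'AC01', 'EC'),
        (3, 'AC02', 'EC'),
        (4, 'AC02', NULL);
""")

counts = conn.execute("""
    SELECT COUNT(*),                    -- every row
           COUNT(class_code),           -- rows where class_code is not NULL
           COUNT(DISTINCT class_code)   -- unique non-NULL class codes
    FROM reservation
""").fetchone()

print(counts)  # (4, 3, 2)
```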
Exercises: Functions
Construct a SELECT statement to display the total number of flights from Shanghai...
MAX
MAX returns the maximum value in an expression. MAX can be used with numeric, character, and datetime data types. It ignores NULL values.
Syntax: MAX(Expression)
SELECT MAX(Fare) AS [Maximum Fare] FROM flight_details
MIN
MIN returns the minimum value in an expression. MIN can be used with numeric, character, and datetime columns. When MIN is used with character columns, it returns the lowest value in the collating order. This function ignores NULL values.
Syntax: MIN(Expression)
SELECT MIN(Fare) AS [Minimum Fare] FROM flight_details
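The way SUM, AVG, MAX, and MIN treat NULL values can be checked with a small sketch. It runs on Python's built-in sqlite3 module rather than SQL Server; the flight_details columns and the fares are hypothetical, with one fare deliberately left NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema and rows, for illustration only.
    CREATE TABLE flight_details (aircraft_code TEXT PRIMARY KEY, Fare INTEGER);
    INSERT INTO flight_details VALUES
        ('AC01', 500), ('AC02', 300), ('AC03', NULL), ('AC04', 400);
""")

total, average, highest, lowest = conn.execute(
    "SELECT SUM(Fare), AVG(Fare), MAX(Fare), MIN(Fare) FROM flight_details"
).fetchone()

# The NULL fare is skipped, so AVG divides by 3 rows, not 4.
print(total, average, highest, lowest)  # 1200 400.0 500 300

# MIN on a character column returns the lowest value in collating order.
first_code = conn.execute(
    "SELECT MIN(aircraft_code) FROM flight_details"
).fetchone()[0]
print(first_code)  # AC01
```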
GROUP BY Clause
The GROUP BY clause partitions the result set into one or more subsets. Each subset has values and expressions in common. If an aggregate function is used in a SELECT statement, the GROUP BY clause produces a single value per group for each aggregate.
Syntax: GROUP BY <Column Name>
Exercises: Group By
Identify the error in the following SELECT statement:
SELECT aircraft_code, class_code, COUNT(pnr_no) GROUP BY aircraft_code
GROUP BY Clause
The GROUP BY clause can have more than one grouping column. For example, consider that you want to find out the number of tickets for each class, on every flight.
SELECT aircraft_code, class_code, SUM(no_of_seats) AS [Seats Booked]
FROM reservation
GROUP BY aircraft_code, class_code
This query first groups the results by aircraft_code and then, within each aircraft, by class_code.
Similarly, the following query finds the earliest departure time for each aircode among the flights whose source is 'Sha':
SELECT MIN(Dep_time) AS [Earliest Flight], aircode
FROM flight
WHERE source = 'Sha'
GROUP BY aircode
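Grouping by more than one column can be tried out as follows. This sketch uses Python's built-in sqlite3 module instead of SQL Server; the reservation rows are hypothetical, and the alias is written as Seats_Booked to avoid dialect-specific quoting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema and rows, for illustration only.
    CREATE TABLE reservation (
        pnr_no INTEGER PRIMARY KEY,
        aircraft_code TEXT,
        class_code TEXT,
        no_of_seats INTEGER
    );
    INSERT INTO reservation VALUES
        (1, 'AC01', 'EX', 2),
        (2, 'AC01', 'EC', 3),
        (3, 'AC01', 'EC', 1),
        (4, 'AC02', 'EC', 4);
""")

# One output row per distinct (aircraft_code, class_code) pair.
seats_by_class = conn.execute("""
    SELECT aircraft_code, class_code, SUM(no_of_seats) AS Seats_Booked
    FROM reservation
    GROUP BY aircraft_code, class_code
    ORDER BY aircraft_code, class_code
""").fetchall()

print(seats_by_class)
# [('AC01', 'EC', 4), ('AC01', 'EX', 2), ('AC02', 'EC', 4)]
```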
Exercises: Group By
Identify the error (if any) in the following SELECT statement:
SELECT aircraft_code, SUM(no_of_seats) AS Total Seats FROM reservation WHERE aircraft_code IN (SELECT aircraft_code FROM flight) GROUP BY aircraft_code HAVING journey_date > 03/14/2001
GROUP BY Clause
WHERE filters the rows that result from the operations specified in the FROM clause. GROUP BY groups the output of the WHERE clause. HAVING filters rows from the grouped result.
Order of processing: WHERE, then GROUP BY, then HAVING
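The order WHERE, GROUP BY, HAVING can be observed in a small sketch. It runs on Python's built-in sqlite3 module rather than SQL Server, and the reservation rows and status codes ('R' for reserved, 'C' for cancelled) are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema and rows, for illustration only.
    CREATE TABLE reservation (
        pnr_no INTEGER PRIMARY KEY,
        aircraft_code TEXT,
        no_of_seats INTEGER,
        status TEXT
    );
    INSERT INTO reservation VALUES
        (1, 'AC01', 2, 'R'),
        (2, 'AC01', 3, 'R'),
        (3, 'AC01', 5, 'C'),
        (4, 'AC02', 1, 'R'),
        (5, 'AC02', 6, 'C'),
        (6, 'AC03', 4, 'R');
""")

busy_aircraft = conn.execute("""
    SELECT aircraft_code, SUM(no_of_seats) AS seats
    FROM reservation
    WHERE status = 'R'            -- 1. drop cancelled rows before grouping
    GROUP BY aircraft_code        -- 2. one group per aircraft
    HAVING SUM(no_of_seats) > 3   -- 3. keep only groups above the threshold
    ORDER BY aircraft_code
""").fetchall()

print(busy_aircraft)  # [('AC01', 5), ('AC03', 4)]
```

AC02 is dropped by HAVING because only one of its seats survives the WHERE filter; without the WHERE clause its total would be 7 and it would be kept.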
Summarizing Data
You can use the GROUP BY clause to return summarized data. However, you cannot include detail rows in the result set. SQL Server provides some operators and clauses that give the summary and detailed information together in a single result set. For example, in the marks table, consider that you have the columns Roll_No, Subject, and Marks. If you use the GROUP BY clause to display the total marks for each roll number, you will not be able to see the marks that the student with that roll number obtained for each subject. For this, you need to use the following operators:
CUBE
ROLLUP
CUBE
CUBE is an aggregate operator that produces super-aggregate rows, which summarize the rows that the GROUP BY clause generates. The super-aggregate rows are generated by creating all possible combinations of groupings from the list of columns in the GROUP BY clause.
Syntax:
SELECT ...
GROUP BY ... WITH CUBE
For example, consider three columns of the sales table in the pubs sample database: Stor_Id, Payterms, and Qty. They hold the following records:
Stor_Id    Payterms      Qty
6380       Net 60        5
7066       Net 30        50
7067       Net 30        40
7067       Net 30        20
7067       Net 30        20
7067       Net 60        10
7131       Net 30        20
7131       Net 30        25
7131       Net 60        20
7131       Net 60        25
7131       Net 60        15
7131       Net 60        25
7896       ON invoice    35
8042       Net 30        30
8042       ON invoice    15
8042       ON invoice    10
The following query, which does not include the CUBE operator, is executed:
SELECT Stor_Id, Payterms, SUM(Qty) AS Total_Quantity
FROM Sales
GROUP BY Stor_Id, Payterms
ORDER BY Stor_Id, Payterms
Stor_Id    Payterms      Total_Quantity
6380       Net 60        5
7066       Net 30        50
7067       Net 30        80
7067       Net 60        10
7131       Net 30        45
7131       Net 60        85
7896       ON invoice    35
8042       Net 30        30
8042       ON invoice    25
Let us now include the CUBE operator. The query will be as follows:
SELECT Stor_Id, Payterms, SUM(Qty) AS Total_Quantity
FROM Sales
GROUP BY Stor_Id, Payterms WITH CUBE
ROLLUP
The ROLLUP operator is useful when generating reports that contain subtotals and totals. The ROLLUP operator generates a result set that is similar to the result set that the CUBE operator generates. CUBE generates a result set showing aggregates for all combinations of values in the selected columns. ROLLUP generates a result set showing aggregates for a hierarchy of values in the selected columns.
Syntax:
SELECT ...
GROUP BY ... WITH ROLLUP
In the above query, if you replace the CUBE operator with the ROLLUP operator, you will notice that the result contains a subtotal for each Stor_Id and a grand total, but no totals for each Payterms value across stores.
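To see which extra rows CUBE and ROLLUP contribute, the following sketch reproduces them by hand on the sales records listed earlier. It runs on Python's built-in sqlite3 module, which does not support WITH CUBE or WITH ROLLUP, so the super-aggregate rows are emulated with UNION ALL; this is an approximation of the operators' output under that assumption, not SQL Server's own implementation. NULL (None in Python) marks a column that has been summed over.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (Stor_Id TEXT, Payterms TEXT, Qty INTEGER)")
conn.executemany("INSERT INTO Sales VALUES (?, ?, ?)", [
    ("6380", "Net 60", 5), ("7066", "Net 30", 50),
    ("7067", "Net 30", 40), ("7067", "Net 30", 20),
    ("7067", "Net 30", 20), ("7067", "Net 60", 10),
    ("7131", "Net 30", 20), ("7131", "Net 30", 25),
    ("7131", "Net 60", 20), ("7131", "Net 60", 25),
    ("7131", "Net 60", 15), ("7131", "Net 60", 25),
    ("7896", "ON invoice", 35), ("8042", "Net 30", 30),
    ("8042", "ON invoice", 15), ("8042", "ON invoice", 10),
])

# ROLLUP emulation: detail groups, one subtotal per store, one grand total.
ROLLUP_SQL = """
    SELECT Stor_Id, Payterms, SUM(Qty) FROM Sales GROUP BY Stor_Id, Payterms
    UNION ALL
    SELECT Stor_Id, NULL, SUM(Qty) FROM Sales GROUP BY Stor_Id
    UNION ALL
    SELECT NULL, NULL, SUM(Qty) FROM Sales
"""

# CUBE emulation: everything ROLLUP gives, plus one total per Payterms value.
CUBE_SQL = ROLLUP_SQL + """
    UNION ALL
    SELECT NULL, Payterms, SUM(Qty) FROM Sales GROUP BY Payterms
"""

rollup_rows = conn.execute(ROLLUP_SQL).fetchall()
cube_rows = conn.execute(CUBE_SQL).fetchall()

print(len(rollup_rows))  # 16 = 9 detail groups + 6 store subtotals + 1 grand total
print(len(cube_rows))    # 19 = the 16 rows above + 3 Payterms totals
```

The three rows that appear only in the CUBE emulation are the Payterms totals, for example (None, 'Net 30', 205).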