Agenda
1. Basics of SQL
2. Best Practices for Writing SQL Queries
3. Best Practices for PL/SQL
SQL processing uses the following main components to execute a SQL query:
The Parser performs both syntax and semantic checks on the statement.
The Optimizer uses either costing methods (the cost-based optimizer, CBO) or internal rules (the rule-based optimizer, RBO) to determine the most efficient way of producing the result of the query.
The Row Source Generator receives the optimal plan from the optimizer and outputs the execution plan for the SQL statement.
The SQL Execution Engine operates on the execution plan associated with a SQL statement and then produces the results of the query.
Joining Tables
The syntax of the Oracle SQL proprietary join format is as shown:
  SELECT { [ [schema.]table. | [alias.] ] { column | expression [, ... ] } | * }
  FROM [schema.]table [alias] [, ]
  [ WHERE [ [schema.]table. | alias. ] { column | expression [(+)] }
      comparison condition
      [ [schema.]table. | alias. ] { column | expression [(+)] }
      [ { AND | OR } [ NOT ] ] ]
  [ GROUP BY ]
  [ ORDER BY ]
ANSI JOIN
  SELECT { { [ [schema.]table. | alias. ] { column | expression } [, ] } | * }
  FROM [schema.]table [alias]
  [
    CROSS JOIN [schema.]table [alias]
    | NATURAL [ INNER | [ LEFT | RIGHT | FULL ] OUTER ] JOIN [schema.]table [alias]
    | { [ INNER | [ LEFT | RIGHT | FULL ] OUTER ] JOIN [schema.]table [alias]
        { ON (column = column [ { AND | OR } [ NOT ] column = column ])
          | USING (column [, column ]) } }
  ]
  [ WHERE ]
  [ GROUP BY ]
  [ ORDER BY ];
Examples Of Joins
Oracle proprietary format:
  SELECT di.name, de.name
  FROM division di, department de
  WHERE di.division_id = de.division_id(+);
ANSI format:
  SELECT di.name, de.name
  FROM division di LEFT OUTER JOIN department de USING (division_id);
Types of Joins
Cross-Join. A cross-join (or Cartesian product) merges all rows in both tables: each row in one table is matched with every row in the other table:
  SELECT * FROM division, managerof;
Inner or Natural Join. An inner join is an intersection between two tables, joining on one or more columns. A natural join matches on the columns that have the same name in both tables:
  SELECT * FROM division NATURAL JOIN managerof;
Outer Join. An outer join returns the rows in the intersection plus the rows in either or both tables that have no match in the other table.
Left Outer Join. A left outer join returns all intersecting rows plus the rows found only in the left table:
  SELECT * FROM division NATURAL LEFT OUTER JOIN managerof;
Right Outer Join. A right outer join is the opposite of a left outer join: the intersection plus the rows found only in the right table:
  SELECT * FROM division NATURAL RIGHT OUTER JOIN managerof;
Full Outer Join. A full outer join retrieves all rows from both tables:
  SELECT * FROM division NATURAL FULL OUTER JOIN managerof;
Self-Join. A self-join joins a table to itself:
  SELECT manager.name, employee.name
  FROM employee manager JOIN employee employee
    ON (employee.manager_id = manager.employee_id);
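Equi-/Anti-/Range Joins. These joins are named for the comparison condition used to join rows: equality (=) for an equi-join, inequality (!= or <>) for an anti-join, and a range comparison (<, >, BETWEEN) for a range join. For example, an equi-join:
  SELECT * FROM division di JOIN department de
    ON (di.division_id = de.division_id);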
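The difference between the join types is easiest to see on a few rows. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for Oracle, and the table contents are invented for the example. It compares the row counts of a cross join, an inner join, and a left outer join.

```python
import sqlite3

# SQLite stands in for Oracle here; the rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE division (division_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE department (department_id INTEGER PRIMARY KEY,
                             division_id INTEGER, name TEXT);
    INSERT INTO division VALUES (1, 'Sales'), (2, 'Research'), (3, 'Legal');
    INSERT INTO department VALUES (10, 1, 'Retail'), (11, 1, 'Wholesale'),
                                  (12, 2, 'Labs');
""")

def run(sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

# Cross join: every division row paired with every department row.
cross_rows = run("SELECT * FROM division CROSS JOIN department")

# Inner join: only divisions that have a matching department.
inner_rows = run("""
    SELECT di.name, de.name
    FROM division di JOIN department de ON (di.division_id = de.division_id)
    ORDER BY di.division_id, de.department_id
""")

# Left outer join: the intersection plus divisions without any department.
left_rows = run("""
    SELECT di.name, de.name
    FROM division di LEFT OUTER JOIN department de
      ON (di.division_id = de.division_id)
    ORDER BY di.division_id, de.department_id
""")

print(len(cross_rows), len(inner_rows), len(left_rows))
print(left_rows[-1])
```

The division without departments appears only in the left outer join, with NULL (None in Python) in the department column.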
Subqueries
There are specific types of subqueries:
Single-Row Subquery. A single-row subquery returns a single row from the subquery. Some comparison conditions require a single row with a single column:
  SELECT * FROM project
  WHERE projecttype_id = (SELECT projecttype_id FROM projecttype
                          WHERE projecttype_id = 1);
Multiple-Row Subquery. A multiple-row subquery returns one or more rows from the subquery back to the calling query:
  SELECT project_id FROM project
  WHERE projecttype_id IN (SELECT projecttype_id FROM projecttype);
Multiple-Column Subquery. A multiple-column subquery returns more than one column:
  SELECT COUNT(*) FROM (SELECT * FROM project);
Regular Subquery. A regular subquery is executed in its entirety, with no communication between the calling query and the subquery:
  SELECT * FROM department
  WHERE division_id IN (SELECT division_id FROM division);
Correlated Subquery. A correlated subquery can use a value passed from the calling query as a parameter to filter specific rows in the subquery. Values can only be passed from the calling query to the subquery, not the other way around:
  SELECT * FROM division
  WHERE EXISTS (SELECT division_id FROM department
                WHERE division_id = division.division_id);
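As a rough illustration of three of these subquery types, the following sketch runs a single-row, a multiple-row, and a correlated subquery against a tiny invented data set. SQLite (through Python's sqlite3 module) is used purely as a stand-in for Oracle, and the rows are assumptions made for the example.

```python
import sqlite3

# SQLite stands in for Oracle; table contents are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE projecttype (projecttype_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE project (project_id INTEGER PRIMARY KEY,
                          projecttype_id INTEGER, name TEXT);
    INSERT INTO projecttype VALUES (1, 'Internal'), (2, 'Client');
    INSERT INTO project VALUES (100, 1, 'Intranet'), (101, 2, 'Website'),
                               (102, NULL, 'Unclassified');
""")

# Single-row subquery: the subquery must return exactly one value for "=".
single_row = conn.execute("""
    SELECT project_id FROM project
    WHERE projecttype_id = (SELECT projecttype_id FROM projecttype
                            WHERE projecttype_id = 1)
""").fetchall()

# Multiple-row subquery: IN accepts any number of rows from the subquery.
multiple_row = conn.execute("""
    SELECT project_id FROM project
    WHERE projecttype_id IN (SELECT projecttype_id FROM projecttype)
    ORDER BY project_id
""").fetchall()

# Correlated subquery: the subquery references p, a row of the calling query.
correlated = conn.execute("""
    SELECT project_id FROM project p
    WHERE EXISTS (SELECT 1 FROM projecttype pt
                  WHERE pt.projecttype_id = p.projecttype_id)
    ORDER BY project_id
""").fetchall()

print(single_row, multiple_row, correlated)
```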
The Basics of Efficient SQL
Name the Columns in a Query: There are three good reasons why it is better to name the columns in a query rather than to use "select * from ...".
1. Network traffic is reduced. This can have a significant impact on performance if the table has a large number of columns, or if the table has a LONG or LONG RAW column (both of which can be up to 2 GB in length). Columns of these types take a long time to transfer over the network, so they should not be fetched from the database unless they are specifically required.
2. The code is easier to understand.
3. It can save the need for changes in the future. If a column is added to or removed from the base table or view, a select * statement can produce a wrong result set or fail outright.
For example:
  SELECT division_id, name, city, state, country FROM division;
Is faster than:
  SELECT * FROM division;
Also, since there is a primary key index on the Division table, the following query can be satisfied from the index alone, without reading the table:
  SELECT division_id FROM division;
EXPLAIN PLAN SET statement_id='TEST' FOR
  SELECT * FROM stock;

Query                           Cost   Rows   Bytes
------------------------------  -----  -----  ------
SELECT STATEMENT on                 1    118    9322
 TABLE ACCESS FULL on STOCK         1    118    9322

EXPLAIN PLAN SET statement_id='TEST' FOR
  SELECT stock_id FROM stock;

Query                           Cost   Rows   Bytes
------------------------------  -----  -----  ------
SELECT STATEMENT on                 1    118     472
 INDEX FULL SCAN on XPKSTOCK        1    118     472
Counting the rows in several ranges can be written as separate queries, each of which reads the table:
  SELECT COUNT (*) FROM emp WHERE sal < 2000;
  SELECT COUNT (*) FROM emp WHERE sal BETWEEN 2000 AND 4000;
  SELECT COUNT (*) FROM emp WHERE sal > 4000;
However, it is more efficient to run the entire query in a single statement. Each number is calculated as one column. The count uses a filter with the CASE expression to count only the rows where the condition is valid. For example:
  SELECT COUNT (CASE WHEN sal < 2000 THEN 1 ELSE null END) count1,
         COUNT (CASE WHEN sal BETWEEN 2000 AND 4000 THEN 1 ELSE null END) count2,
         COUNT (CASE WHEN sal > 4000 THEN 1 ELSE null END) count3
  FROM emp;
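To check that the single-statement form really returns the same numbers as the three separate counts, here is a small sketch. It assumes an invented emp table with six salaries and uses SQLite through Python's sqlite3 module as a stand-in for Oracle.

```python
import sqlite3

# SQLite stands in for Oracle; the salaries are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, sal INTEGER)")
conn.executemany("INSERT INTO emp (sal) VALUES (?)",
                 [(1000,), (1500,), (2000,), (3000,), (4000,), (5000,)])

# Three statements, three passes over the table.
separate = tuple(
    conn.execute(f"SELECT COUNT(*) FROM emp WHERE {condition}").fetchone()[0]
    for condition in ("sal < 2000", "sal BETWEEN 2000 AND 4000", "sal > 4000")
)

# One statement, one pass: COUNT ignores the NULLs produced by ELSE NULL.
combined = conn.execute("""
    SELECT COUNT(CASE WHEN sal < 2000 THEN 1 ELSE NULL END) AS count1,
           COUNT(CASE WHEN sal BETWEEN 2000 AND 4000 THEN 1 ELSE NULL END) AS count2,
           COUNT(CASE WHEN sal > 4000 THEN 1 ELSE NULL END) AS count3
    FROM emp
""").fetchone()

print(separate, combined)
```

The technique works because COUNT(expression) counts only non-NULL values, so each CASE acts as a per-column filter.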
IN vs. EXISTS
IN should be used to test against literal values, and EXISTS to create a correlation between a calling query and a subquery. IN causes a subquery to be executed in its entirety before the result is passed back to the calling query. EXISTS stops once a result is found. IN is best used with a preconstructed set of literal values.
There are two advantages to using EXISTS over using IN. The first advantage is the ability to pass values from a calling query to a subquery (never the other way around), creating a correlated query. The correlation allows EXISTS to use indexes between the calling query and the subquery, particularly in the subquery.
The second advantage of EXISTS is that, unlike IN, which completes a subquery regardless, EXISTS halts searching when a value is found. The benefit of using EXISTS rather than IN for a subquery comparison is that EXISTS can potentially read far fewer rows than IN. In short, IN is best used with literal values, and EXISTS is best used to apply a fast-access correlation between a calling query and a subquery.
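The two recommended uses can be sketched side by side: IN against a list of literal values, and EXISTS as a correlated test. The example below is illustrative only. It runs on SQLite through Python's sqlite3 module rather than on Oracle, with invented rows, and it shows results rather than timings.

```python
import sqlite3

# SQLite stands in for Oracle; table contents are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE division (division_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE department (department_id INTEGER PRIMARY KEY,
                             division_id INTEGER, name TEXT);
    CREATE INDEX department_division_idx ON department (division_id);
    INSERT INTO division VALUES (1, 'Sales'), (2, 'Research'), (3, 'Legal');
    INSERT INTO department VALUES (10, 1, 'Retail'), (11, 1, 'Wholesale'),
                                  (12, 2, 'Labs');
""")

# IN with a preconstructed set of literal values.
in_literals = conn.execute("""
    SELECT name FROM division
    WHERE division_id IN (1, 3)
    ORDER BY division_id
""").fetchall()

# EXISTS with a correlation: di.division_id is passed into the subquery,
# where the index on department.division_id can be used for the lookup.
exists_correlated = conn.execute("""
    SELECT name FROM division di
    WHERE EXISTS (SELECT 1 FROM department de
                  WHERE de.division_id = di.division_id)
    ORDER BY di.division_id
""").fetchall()

print(in_literals)
print(exists_correlated)
```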
Joins
A join is a combination of rows extracted from two or more tables. Joins can be very specific, for instance, an intersection between two tables, or they can be less specific, such as an outer join. An outer join returns the intersection plus the rows in either or both tables that have no match in the other table.
Efficient Joins
An efficient join is a join SQL query that can be tuned to an acceptable level of performance. Certain types of join queries are inherently easily tuned and can give good performance. In general, a join is efficient when it can use indexes on large tables or is reading only very small tables.
JOINS
Intersections
An inner or natural join is an intersection between two tables. In mathematical set parlance, an intersection contains all elements occurring in both of the sets (elements common to both sets). An intersection is efficient when indexed columns are matched together in join clauses. Intersection matching that does not use indexed columns will be inefficient.
Nested Subqueries
Subqueries can be nested, so that a subquery calls another subquery. The following example using the Employees schema shows a query calling a subquery, which in turn calls another subquery:
  EXPLAIN PLAN SET statement_id='TEST' FOR
    SELECT * FROM division
    WHERE division_id IN
      (SELECT division_id FROM department
       WHERE department_id IN
         (SELECT department_id FROM project));
4. When testing against subqueries, retrieve, filter, and aggregate on indexes, not tables.
5. Do not be too concerned about full table scans on very small static tables.
EXPLAIN PLAN SET statement_id='TEST' FOR
  SELECT c.name
  FROM customer c
    JOIN orders o USING (customer_id)
    JOIN ordersline ol USING (order_id)
    JOIN transactions t USING (customer_id)
    JOIN transactionsline tl USING (transaction_id)
  WHERE c.balance > 0;
EXPLAIN PLAN SET statement_id='TEST' FOR
  SELECT c.name
  FROM customer c
  WHERE c.balance > 0
    AND EXISTS (
      SELECT o.order_id FROM orders o
      WHERE o.customer_id = c.customer_id
        AND EXISTS (
          SELECT order_id FROM ordersline
          WHERE order_id = o.order_id))
    AND EXISTS (
      SELECT t.transaction_id FROM transactions t
      WHERE t.customer_id = c.customer_id
        AND EXISTS (
          SELECT transaction_id FROM transactionsline
          WHERE transaction_id = t.transaction_id));
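When only columns from the first table are selected, the join form and the EXISTS form differ in more than cost: the join returns one row per matching child row, while EXISTS returns each parent row at most once. The sketch below illustrates this on a reduced, invented version of the customer and orders tables, using SQLite through Python's sqlite3 module as a stand-in for Oracle.

```python
import sqlite3

# SQLite stands in for Oracle; this is a hypothetical two-table reduction
# of the customer/orders schema with made-up rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY,
                           name TEXT, balance INTEGER);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customer VALUES (1, 'Ann', 10), (2, 'Bob', 5), (3, 'Cy', 0);
    INSERT INTO orders VALUES (100, 1), (101, 1), (102, 3);
""")

# Join form: Ann has two orders, so her name is returned twice.
join_rows = conn.execute("""
    SELECT c.name
    FROM customer c JOIN orders o ON (o.customer_id = c.customer_id)
    WHERE c.balance > 0
""").fetchall()

# EXISTS form: each customer is tested once and returned at most once.
exists_rows = conn.execute("""
    SELECT c.name
    FROM customer c
    WHERE c.balance > 0
      AND EXISTS (SELECT o.order_id FROM orders o
                  WHERE o.customer_id = c.customer_id)
""").fetchall()

print(join_rows)
print(exists_rows)
```

Whether the duplicates matter depends on what the query is for, so the two forms should be compared on results as well as on their execution plans.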
EXPLAIN PLAN SET statement_id='TEST' FOR
  SELECT c.name, tl.amount
  FROM customer c
    JOIN orders o USING (customer_id)
    JOIN ordersline ol USING (order_id)
    JOIN transactions t USING (customer_id)
    JOIN transactionsline tl USING (transaction_id)
  WHERE tl.amount > 3170
    AND c.balance > 0;
  SELECT c.name, b.amount
  FROM customer c,
    (SELECT t.customer_id, a.amount
     FROM transactions t,
       (SELECT transaction_id, amount
        FROM transactionsline
        WHERE amount > 3170) a
     WHERE t.transaction_id = a.transaction_id) b
  WHERE c.balance > 0
    AND EXISTS (
      SELECT o.order_id FROM orders o
      WHERE o.customer_id = c.customer_id
        AND EXISTS (
          SELECT order_id FROM ordersline
          WHERE order_id = o.order_id));
Best code
Uses resources optimally
Provides a quicker solution
Modularized Design
Bad design
Dumping all your logic into a single procedure
Having lots of selects, inserts, updates, deletes, etc. in one place
Good design
Break your logic into small blocks
Group related logic into a single block or program
Modularize
Modularizing will:
reduce complexity
make your tasks manageable
make your resulting code maintainable
Use Packages
For each major functionality
With repeated DML as procedures
With repeated selects as functions
Naming convention
Follow a standard throughout your code
Easy to understand
Easy to maintain and change
Example
Local variable        l_var_name
Procedure parameter   p_var_name
Global variable       g_var_name
CREATE OR REPLACE PROCEDURE GEN_SWIP (
    an_document_number   IN asap.serv_req_si.document_number%TYPE,
    an_serv_item_id      IN asap.serv_req_si.serv_item_id%TYPE,
    an_srsi_ip_addr_seq  IN asap.srsi_ip_addr.srsi_ip_addr_seq%TYPE
) AS
BEGIN
    NULL;
END;
Bind Variables
Oracle performs a CPU-intensive hard parse for all new statements.
Statements already present in the shared pool only require a soft parse.
Statement matching uses an exact text match, so differences in literals, case and whitespace prevent a match.
bind_variable_usage.sql
Unnecessary parses waste CPU and memory.
bind_performance.sql
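The exact-text-match point can be illustrated from the client side. The sketch below is not one of the demo scripts named above; it is a hypothetical example that uses Python's sqlite3 module (which accepts Oracle-style :name placeholders) to show that literals produce a new statement text for every value, while a bind variable keeps one shared text. SQLite has no shared pool, so only the statement texts are compared, not parse costs.

```python
import sqlite3

# SQLite stands in for Oracle; the employee rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO employee VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
""")
ids = [1, 2, 3]

# Literals: each value yields a different statement text, so an exact
# text match against previously parsed statements would fail every time.
literal_texts = {
    f"SELECT name FROM employee WHERE employee_id = {emp_id}" for emp_id in ids
}

# Bind variable: one statement text, with the value supplied separately.
bound_text = "SELECT name FROM employee WHERE employee_id = :emp_id"
names = [
    conn.execute(bound_text, {"emp_id": emp_id}).fetchone()[0] for emp_id in ids
]

print(len(literal_texts), "distinct statement texts with literals")
print("1 statement text with a bind variable:", names)
```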
[Figure: Oracle server, executor, SQL statement]
PL/SQL contains procedural and SQL code. Each type of code is processed separately. Switching between code types causes an overhead. The overhead is very noticeable during batch operations. Bulk binds minimize this overhead.
[Figure: Oracle server processing a PL/SQL block through the procedural statement executor, one row at a time]
FOR rec IN emp_cur LOOP
  UPDATE employee
     SET salary = ...
   WHERE employee_id = rec.employee_id;
END LOOP;
FORALL indx IN list_of_emps.FIRST .. list_of_emps.LAST
  UPDATE employee
     SET salary = ...
   WHERE employee_id = list_of_emps(indx);
FORALL
  Use with inserts, updates, deletes and merges.
  Move data from collections to tables.
BULK COLLECT
  Use with implicit and explicit queries.
  Move data from tables into collections.
In both cases, the back-end processing in the SQL engine is unchanged:
  Same transaction and rollback segment management.
  Same number of individual SQL statements will be executed.
However, BEFORE and AFTER statement-level triggers fire only once per FORALL INSERT statement.
Bulk-Binds FORALL
Bind data in collections into DML using FORALL:
  FORALL i IN l_tab.FIRST .. l_tab.LAST
    INSERT INTO tab2 VALUES l_tab(i);
Insert_all.sql
Use INDICES OF and VALUES OF for sparse collections.
Use SQL%BULK_ROWCOUNT to return the number of rows affected by each statement.
The SAVE EXCEPTIONS clause allows bulk operations to complete even when individual statements fail. The exceptions are captured in SQL%BULK_EXCEPTIONS.
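FORALL is a PL/SQL construct and cannot be run outside Oracle. As a loose client-side analogy only, the sketch below uses the executemany call of Python's sqlite3 module, which likewise binds a whole collection to a single INSERT statement in one call instead of issuing one call per row. The table definition and the contents of l_tab are invented for the illustration.

```python
import sqlite3

# SQLite stands in for Oracle; tab2's columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab2 (id INTEGER PRIMARY KEY, description TEXT)")

# An in-memory collection playing the role of l_tab.
l_tab = [(i, f"row {i}") for i in range(1, 1001)]

# One call binds every element of the collection to the same statement.
# The "with" block commits on success and rolls back on error.
with conn:
    conn.executemany(
        "INSERT INTO tab2 (id, description) VALUES (?, ?)", l_tab
    )

inserted = conn.execute("SELECT COUNT(*) FROM tab2").fetchone()[0]
first_row = conn.execute("SELECT id, description FROM tab2 ORDER BY id").fetchone()
print(inserted, first_row)
```

As with FORALL, the database still performs one insert per element; what is saved is the per-row round trip between the calling code and the SQL engine.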
Declarations in Loops
Code within loops gets run multiple times. Variable declarations and procedure/function calls in loops impact on performance. Simplify code within loops to improve performance.
-- Bad idea.
FOR i IN 1 .. 100 LOOP
  DECLARE
    l_str VARCHAR2(200);
  BEGIN
    -- Do Something.
  END;
END LOOP;
-- Better idea.
DECLARE
  l_str VARCHAR2(200);
BEGIN
  FOR i IN 1 .. 100 LOOP
    -- Do Something.
  END LOOP;
END;
INTEGER Types
NUMBER and its subtypes use an Oracle internal format, rather than machine arithmetic.
INTEGER and other constrained types need additional runtime checks compared to NUMBER.
PLS_INTEGER uses machine arithmetic to reduce overhead.
BINARY_INTEGER is slow in 8i and 9i, but fast in 10g, where it uses machine arithmetic.
integer_test.sql
11g includes SIMPLE_INTEGER, which is quick in natively compiled code.
Use the appropriate datatype for the job.