Teradata Objects
There are five fundamental objects which may be found in a Teradata database:
Tables - rows and columns of data
Views - predefined subsets of existing tables
Macros - predefined, stored SQL statements
Triggers - SQL statements associated with a table
Stored Procedures - programs stored within Teradata
These objects are created, maintained, and deleted using Structured Query Language (SQL). Object definitions are stored in the Data Dictionary/Directory (DD/D).
[Figure: a DATABASE or USER containing TABLE 1-3, VIEW 1-3, MACRO 1-3, TRIGGER 1-3, and Stored Procedure 1-3; the DD/D holds the definitions of all database objects.]
Databases
A Teradata database is a defined logical repository for tables, views, macros, triggers, and stored procedures. A database is empty until objects are created within it. Teradata has the concept of parent and child databases. A database has one and only one creator. The owner can be different from the creator if the database is given to another user.
Users
A Teradata user is a database with an assigned password. A user may log on to Teradata and access objects within itself and within other databases for which it has access rights. A user is an active repository while a database is a passive repository. A user is empty until objects are created within it.
Examples of views:
DBC.Tables - info about all tables
DBC.Users - info about all users
DBC.AllRights - info about access rights
DBC.AllSpace - info about space utilization
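As a hedged illustration, dictionary views such as DBC.Tables can be queried like ordinary tables. The sketch below assumes the commonly documented column names DatabaseName, TableName, and TableKind, and the customer_service database used elsewhere in this material; newer releases may expose the same information as DBC.TablesV.

```sql
-- List the objects in one database together with their kind
-- (for example T = table, V = view, M = macro).
SELECT TableName
      ,TableKind
FROM DBC.Tables
WHERE DatabaseName = 'customer_service'
ORDER BY TableName;
```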
Indexes
An index is a mechanism that the SQL query optimizer can use to make table access faster.
Teradata provides four different index types.
Primary index
  All Teradata tables require a primary index because the system distributes table rows to the AMPs based on their primary index values. Primary index types are:
  - Unique primary index (UPI)
  - Nonunique primary index (NUPI)
  - Nonpartitioned primary index (NPPI)
  - Partitioned primary index (PPI)
Secondary index
  - Unique secondary index (USI)
  - Nonunique secondary index (NUSI)
Join index
  - Multitable join index
  - Single-table join index
Hash index
Primary Index
- Controls data distribution and retrieval using the Teradata hashing algorithm.
- Defined with the CREATE TABLE data definition statement. If no explicit primary index is defined, then CREATE TABLE assigns one automatically.
- Can be unique or non-unique, and partitioned or non-partitioned. If the primary index is not defined explicitly as unique, then the definition defaults to non-unique.
- Can be composed of as many as 64 columns.
- Can be generated automatically if defined on an identity column.
- Exactly one must be defined per table.
- Improves performance when used correctly in the WHERE clause of an SQL data manipulation statement to perform the following actions:
  - Single-AMP retrievals
  - Joins between tables with identical primary indexes, the optimal scenario
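The points above can be sketched with a small table definition. The table employee_demo and its columns are hypothetical names chosen for illustration; only the UNIQUE PRIMARY INDEX clause and the equality predicate on it are the point of the example.

```sql
-- Hypothetical table: employee_number is declared as a UPI,
-- so each row is hashed on that value to decide which AMP stores it.
CREATE TABLE employee_demo
( employee_number   INTEGER NOT NULL
 ,department_number INTEGER
 ,last_name         CHAR(20)
)
UNIQUE PRIMARY INDEX (employee_number);

-- An equality condition on the complete primary index lets the
-- system hash the value and go to one AMP (single-AMP retrieval).
SELECT last_name
FROM employee_demo
WHERE employee_number = 1005;
```

Omitting the UNIQUE keyword would give a non-unique primary index (NUPI) on the same column.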
Secondary Index
- Can enhance the speed of data retrieval.
- Can be unique (USI) or non-unique (NUSI).
- NUSIs can be hash-ordered or value-ordered.
- Do not affect base table data distribution.
- Maximum of 32 secondary and join indexes defined per table.
- Can be composed of as many as 64 concatenated columns.
- Can be created or dropped dynamically as data usage changes or if they are found not to be useful for optimizing data retrieval performance.
- Require additional disk space to store subtables.
- Require additional I/Os on INSERTs, DELETEs, and possibly on UPDATEs.
- Should not be defined on columns whose values change frequently.
- A composite secondary index is useful if it reduces the number of rows that must be accessed.
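Because secondary indexes can be created and dropped dynamically, they are usually managed with standalone statements rather than in CREATE TABLE. The following is a minimal sketch against the employee table shown later in this material; the index names dept_idx and hire_idx are assumptions.

```sql
-- NUSI on a column that is frequently used in WHERE clauses.
CREATE INDEX dept_idx (department_number) ON employee;

-- Value-ordered NUSI, which can help range conditions on hire_date.
CREATE INDEX hire_idx (hire_date) ORDER BY VALUES (hire_date) ON employee;

-- Drop an index that turned out not to help the workload.
DROP INDEX dept_idx ON employee;
```

A USI is created the same way with CREATE UNIQUE INDEX on a column set whose values are unique.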
Join Index
Join indexes are file structures designed to permit queries (join queries in the case of multitable join indexes) to be resolved by accessing the index instead of having to access and join their underlying base tables. A join index can do any of the following:
- Join multiple tables (optionally with aggregation) in a prejoin table.
- Replicate all, or a vertical subset, of a single base table and partition its rows using a different primary index than the base table, such as a foreign key column, to facilitate joins of very large tables by hashing them to the same AMP.
- Aggregate one or more columns of a single table as a summary table.
Join indexes are useful for:
- queries where the index table contains all the columns referenced by one or more joins, thereby allowing the Optimizer to cover all or part of the query by planning to access the index rather than its underlying base tables.
- queries that aggregate columns from tables with large cardinalities.
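A multitable join index of the kind described above might look like the following sketch. The index name emp_dept_ji is hypothetical, and the department table with its department_name column is assumed from the DEPARTMENT figure in this material rather than from a shown definition.

```sql
-- Prejoin employee and department so that queries needing only
-- these columns can be covered by the index.
CREATE JOIN INDEX emp_dept_ji AS
SELECT e.employee_number
      ,e.last_name
      ,e.department_number
      ,d.department_name
FROM employee e
INNER JOIN department d
  ON e.department_number = d.department_number
PRIMARY INDEX (department_number);
```

The Optimizer decides whether to use the index; queries continue to reference the base tables.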
Hash Index
Hash indexes are file structures that share properties with both single-table join indexes and secondary indexes. Hash indexes are not indexes in the usual sense of the word. They are base tables that cannot be accessed directly by a query. A hash index always has at least one of the following functions:
- Replicates all, or a vertical subset, of a single base table and partitions its rows with a user-specified partition key column set, such as a foreign key column, to facilitate joins of very large tables by hashing them to the same AMP.
- Provides an access path to base table rows to complete partial covers.
Hash indexes are useful for queries where the index table contains the columns referenced by a query, thereby allowing the Optimizer to cover it by planning to access the index rather than its underlying base table.
Teradata RDBMS has the ability to execute all SQL in either Teradata mode or in ANSI mode.
Teradata mode: Each SQL request is implicitly a complete transaction. Therefore, once a change is made, it is committed and becomes permanent. To treat several requests as one transaction, enclose them in an explicit BEGIN TRANSACTION (BT) and END TRANSACTION (ET).
ANSI mode: All SQL commands are considered to be part of the same logical transaction. A transaction is not complete until an explicit COMMIT is executed.
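The difference between the two modes can be sketched with a pair of updates against the employee table. The new department and job code values are made up for illustration.

```sql
-- Teradata mode: BT/ET make the two updates one explicit transaction.
-- Without BT/ET, each UPDATE would commit on its own.
BT;
UPDATE employee SET department_number = 401 WHERE employee_number = 1005;
UPDATE employee SET job_code = 412101 WHERE employee_number = 1005;
ET;

-- ANSI mode: the same updates stay uncommitted until COMMIT is issued.
UPDATE employee SET department_number = 401 WHERE employee_number = 1005;
UPDATE employee SET job_code = 412101 WHERE employee_number = 1005;
COMMIT;
```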
SELECT Last_Name ,First_Name FROM Employee WHERE Hire_Date = 861015 ;
Answer
LAST NAME   FIRST NAME
Stein       John
Ryan        Loretta
Johnson     Darlene
SELECT statement
Basic SELECT command:
SELECT * FROM Student_Table ;
Compound Comparisons:
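SELECT * FROM Student_Table WHERE Grade_Pt = 3.0 OR Grade_Pt = 4.0 AND Class_Code = 'FR' ;
Because AND is evaluated before OR, this returns every student with a 3.0 plus only those 4.0 students whose class code is 'FR'. To apply the class code test to both grade points, write WHERE (Grade_Pt = 3.0 OR Grade_Pt = 4.0) AND Class_Code = 'FR'.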
Using Quantifiers vs IN
SELECT Last_Name ,Class_Code ,Grade_Pt FROM Student_Table WHERE Grade_Pt = ANY ( 2.0, 3.0, 4.0 ) ;
The quantified comparison = ANY ( 2.0, 3.0, 4.0 ) returns the same rows as IN ( 2.0, 3.0, 4.0 ).
Derived Columns:
SELECT salary (format 'ZZZ,ZZ9.99') ,salary/12 (format 'Z,ZZ9.99') FROM Pay_Table ;
Order By:
SELECT * FROM Student_Table WHERE Grade_Pt > 3 ORDER BY Grade_Pt DESC;
DISTINCT:
SELECT DISTINCT Class_code FROM student_table ORDER BY class_code;
NAMED
SELECT salary (NAMED Annual_salary) ,salary/12 (NAMED Monthly_salary) FROM Pay_Table ;
Naming Conventions
When creating an alias only valid Teradata naming characters are allowed. The alias becomes the name of the column for the life of the SQL statement. The only difference is that it is not stored in the Data Dictionary.
Breaking Conventions:
When it is necessary or desirable to use non-standard characters in a name, double quotes (") are used around the name. This technique tells the PE that the word is not a reserved word and makes it a valid name. This is the only place that Teradata uses a double quote instead of a single quote (').
SELECT salary "Annual salary" ,salary/12 "Monthly_salary" FROM Pay_Table ORDER BY "Annual salary" ;
HELP Commands
Databases and Users:
HELP DATABASE customer_service ;
HELP USER Dave_Jones ;
Tables, Views, and Macros:
HELP TABLE employee ;
HELP VIEW emp ;
HELP MACRO payroll_3 ;
HELP COLUMN employee.* ;
HELP COLUMN employee.last_name ;
HELP COLUMN emp.* ;
HELP COLUMN emp.last ;
HELP INDEX employee ;
HELP STATISTICS employee ;
HELP CONSTRAINT employee.over_21 ;
*** Help information returned. 10 rows.
*** Total elapsed time was 1 second.

Table/View/Macro name   Kind   Comment
contact                 T      ?
customer                T      ?
department              T      ?
employee                T      ?
employee_phone          T      ?
job                     T      ?
location                T      ?
location_employee       T      ?
location_phone          T      ?
SHOW Command
SHOW commands display how an object was created.
Command                  Returns
SHOW TABLE tablename;    CREATE TABLE statement
SHOW VIEW viewname;      CREATE VIEW statement
SHOW MACRO macroname;    CREATE MACRO statement

SHOW TABLE employee;

CREATE SET TABLE CUSTOMER_SERVICE.employee ,FALLBACK ,
     NO BEFORE JOURNAL,
     NO AFTER JOURNAL
     (
      employee_number INTEGER,
      manager_employee_number INTEGER,
      department_number INTEGER,
      job_code INTEGER,
      last_name CHAR(20) NOT CASESPECIFIC NOT NULL,
      first_name VARCHAR(30) NOT CASESPECIFIC NOT NULL,
      hire_date DATE NOT NULL,
      birthdate DATE NOT NULL,
      salary_amount DECIMAL(10,2) NOT NULL)
UNIQUE PRIMARY INDEX ( employee_number );
EXPLAIN
Placing EXPLAIN in front of a request returns the Optimizer's plan for that request without executing it. This information is useful for:
- predicting row counts
- predicting performance
- testing queries before production
- analyzing various approaches to a problem

EXPLAIN SELECT last_name, department_number FROM employee;

Explanation (partial):
3) We do an all-AMPs RETRIEVE step from CUSTOMER_SERVICE.employee by way of an all-rows scan with no residual conditions into Spool 1, which is built locally on the AMPs. The size of Spool 1 is estimated to be 24 rows. The estimated time for this step is 0.15 seconds.
Data Conversion
CAST
Data can be converted from one type to another by using the CAST function.
SELECT CAST('ABCDE' AS CHAR(1)) AS Trunc
      ,CAST(128 AS CHAR(3)) AS OK
      ,CAST(127 AS INTEGER ) AS Bigger
      ,CAST(121.53 AS SMALLINT) AS Whole
      ,CAST(121.53 AS DECIMAL(3,0)) AS Rounder ;

Trunc   OK    Bigger   Whole   Rounder
A       128   127      121     122
Implied CAST
Prior to CAST, conversion was requested by placing the "implied" data type conversion in parentheses after the column name.
SELECT 'ABCDE' (CHAR(1)) AS Shortened
      ,128 (CHAR(3)) AS OK
      ,-128 (CHAR(3)) AS N_OK
      ,128 (INTEGER) AS Bigger
      ,121.13 (SMALLINT) AS Whole ;

Shortened   OK   N_OK   Bigger   Whole
A           _    _      128      121
Subquery Processing
Using IN
SELECT Order_number ,Order_total FROM Order_Table WHERE Customer_number IN ( SELECT Customer_number FROM Customer_table WHERE Customer_name LIKE 'Bill%');
Using NOT IN
SELECT Customer_name ,Phone_number FROM Customer_Table WHERE Customer_number NOT IN ( SELECT Customer_number FROM Order_table) ;
Using ANY
SELECT Customer_name ,Phone_number FROM Customer_Table WHERE customer_number = ANY (SELECT customer_number FROM Order_Table WHERE Order_total > ( SELECT AVG(Order_total) FROM Order_Table ) );
Using EXISTS
SELECT Customer_name FROM Customer_table AS CUST WHERE EXISTS ( SELECT * FROM Order_table AS OT WHERE CUST.Customer_number = OT.Customer_number ) ;
Join Processing
A join is the combination of two or more tables in the same FROM clause of a single SELECT statement.
The different types of joins provided are:
- Inner Join
- Outer Join
  - Left Outer Join
  - Right Outer Join
  - Full Outer Join
- Cross Join
- Self Join
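A few of these join types can be sketched against the employee and department tables used in this material. The department_name column is assumed from the DEPARTMENT figure; the other column names come from the SHOW TABLE employee output.

```sql
-- Inner join: only employees whose department exists in department.
SELECT e.last_name, d.department_name
FROM employee e
INNER JOIN department d
  ON e.department_number = d.department_number;

-- Left outer join: every employee, with NULL for the department
-- name when no matching department row is found.
SELECT e.last_name, d.department_name
FROM employee e
LEFT OUTER JOIN department d
  ON e.department_number = d.department_number;

-- Self join: pair each employee with his or her manager.
SELECT emp.last_name AS employee_name
      ,mgr.last_name AS manager_name
FROM employee emp
INNER JOIN employee mgr
  ON emp.manager_employee_number = mgr.employee_number;
```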
The stored data for the date January 1, 1999 is converted to: ((1999 - 1900) * 10000) + (1 * 100) + 1 = 990101
INTEGERDATE, in the form YY/MM/DD, is the default display format for most Teradata database client utilities. The output date format can be changed by using DATEFORMAT.
System Level Definition
Since Teradata stores the date as an INTEGER, it allows simple and complex mathematics to calculate new dates from dates. Other functions provided are ADD_MONTHS, EXTRACT, OVERLAPS, etc.
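The integer storage and the date functions mentioned above can be tried with a short sketch. The column aliases are illustrative, and hire_date comes from the employee table shown in this material.

```sql
-- Show the integer form in which a date is stored.
SELECT CAST(DATE '1999-01-01' AS INTEGER) AS stored_value;

-- Date arithmetic and date functions on a DATE column.
SELECT hire_date
      ,hire_date + 30                AS thirty_days_later
      ,ADD_MONTHS(hire_date, 6)      AS six_months_later
      ,EXTRACT(YEAR FROM hire_date)  AS hire_year
      ,CURRENT_DATE - hire_date      AS days_employed
FROM employee;
```

Adding an integer to a date adds that many days, and subtracting two dates returns the number of days between them.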
Teradata provides character string processing functions like:
CHARACTERS - used to count the number of characters stored in a data column.
TRIM - used to eliminate space characters from fixed length data values.
SUBSTRING - used to retrieve a portion of the data stored in a column.
SUBSTR - the original Teradata substring operation.
POSITION - used to return a number that represents the starting location of a specified character string within character data.
INDEX - used to return a number that represents the starting position of a specified character string within character data.
ANSI mode is case sensitive and Teradata mode is not. Therefore, the output from most of the string processing functions will differ accordingly.
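The string functions listed above can be compared side by side in one query. This is a sketch against the last_name column of the employee table; the aliases and the search character 'a' are arbitrary choices.

```sql
SELECT last_name
      ,CHARACTERS(TRIM(last_name))        AS name_length     -- length without padding
      ,SUBSTRING(last_name FROM 1 FOR 3)  AS first_three     -- ANSI form
      ,SUBSTR(last_name, 1, 3)            AS first_three_td  -- Teradata form
      ,POSITION('a' IN last_name)         AS pos_ansi        -- ANSI form
      ,INDEX(last_name, 'a')              AS pos_td          -- Teradata form
FROM employee;
```

POSITION and INDEX both return 0 when the search string is not found.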
OLAP Functions
OLAP (On-Line Analytical Processing) functions provide data mining capabilities for discovering knowledge in the data.
OLAP functions, combined with standard SQL within the data warehouse, provide the ability to analyze large amounts of historical business transactions from the past through the present.
Like traditional aggregates, OLAP functions operate on groups of rows and permit qualification and filtering of the group result.
Unlike aggregates, OLAP functions also return the individual row detail data and not just the final aggregated value.
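The contrast with ordinary aggregates can be sketched using ANSI window syntax, which Teradata supports. The query below keeps every employee row while adding a running total and a rank per department, then filters on the rank with QUALIFY; the cutoff of 3 is arbitrary.

```sql
SELECT department_number
      ,last_name
      ,salary_amount
      ,SUM(salary_amount) OVER (PARTITION BY department_number
                                ORDER BY salary_amount DESC
                                ROWS UNBOUNDED PRECEDING) AS running_total
      ,RANK() OVER (PARTITION BY department_number
                    ORDER BY salary_amount DESC)          AS salary_rank
FROM employee
-- QUALIFY filters on the OLAP result, much as HAVING filters aggregates.
QUALIFY RANK() OVER (PARTITION BY department_number
                     ORDER BY salary_amount DESC) <= 3;
```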
SET Operators
INTERSECT - used to match or join the common domain values from two or more sets.
UNION - used to merge the rows from two or more sets. The join performed for a UNION is more similar to an OUTER JOIN.
EXCEPT - used to eliminate common domain values from the answer set by throwing away the matching values. This is the primary SET operator that provides a capability not available using either an INNER or OUTER JOIN.
MINUS - is exactly the same as EXCEPT. It was the original SET operator in Teradata before EXCEPT became the standard.
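The set operators can be illustrated with the Customer_table and Order_table used in the subquery examples. This is a sketch; it assumes Customer_number has a compatible data type in both tables.

```sql
-- Customers that have placed at least one order.
SELECT Customer_number FROM Customer_table
INTERSECT
SELECT Customer_number FROM Order_table;

-- Customers that have never placed an order
-- (MINUS could be written in place of EXCEPT).
SELECT Customer_number FROM Customer_table
EXCEPT
SELECT Customer_number FROM Order_table;

-- Every customer number found in either table, duplicates removed;
-- UNION ALL would keep the duplicates.
SELECT Customer_number FROM Customer_table
UNION
SELECT Customer_number FROM Order_table;
```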
NULLIFZERO - converts only a zero to a NULL.
NULLIF - compares two values and returns a NULL when they are equal, so it can convert anything to a NULL.
ZEROIFNULL - compares the data value in a column and, when it contains a NULL, transforms it, for the life of the SQL statement, to a zero.
COALESCE - searches a value list, ranging from one to many values, and returns the first non-NULL value it finds. It returns a NULL if all values in the list are NULL.
CASE - provides an additional test that allows for multiple comparisons on multiple columns with multiple outcomes. It also incorporates logic to handle a situation in which none of the values compares equal.
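A single query can show how these functions behave. The sketch below uses columns from the employee table; the sentinel value 999 and the salary thresholds are made up for illustration.

```sql
SELECT last_name
      ,NULLIFZERO(job_code)                      AS job_or_null
      ,NULLIF(department_number, 999)            AS dept_or_null   -- 999 becomes NULL
      ,ZEROIFNULL(manager_employee_number)       AS mgr_or_zero
      ,COALESCE(manager_employee_number
               ,department_number
               ,0)                               AS first_non_null
      ,CASE
         WHEN salary_amount >= 50000 THEN 'High'
         WHEN salary_amount >= 30000 THEN 'Medium'
         ELSE 'Low'                               -- no earlier test matched
       END                                       AS salary_band
FROM employee;
```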
View Processing
Views are pre-defined subsets of existing tables consisting of specified columns and/or rows from the table(s).
A single table view:
- is a window into an underlying table
- allows users to read and update a subset of the underlying table
- has no data of its own
EMPLOYEE (Table)
Columns: EMPLOYEE NUMBER (PK), MANAGER EMPLOYEE NUMBER (FK), DEPT NUMBER (FK), JOB CODE (FK), LAST NAME, FIRST NAME, HIRE DATE, BIRTH DATE, SALARY AMOUNT

Emp_403 (View)
EMP NO   DEPT NO   LAST NAME   FIRST NAME   HIRE DATE
1005     403       Villegas    Arnando      870102
801      403       Ryan        Loretta      861015
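A view like Emp_403 might be defined as follows. This is a sketch: the base column names are taken from the SHOW TABLE employee output in this material, while the aliases emp_no and dept_no are assumptions based on the view's column headings.

```sql
-- Restrict both columns (five of nine) and rows (department 403 only).
CREATE VIEW emp_403 AS
SELECT employee_number   AS emp_no
      ,department_number AS dept_no
      ,last_name
      ,first_name
      ,hire_date
FROM employee
WHERE department_number = 403;

-- Users query the view exactly as they would a table.
SELECT * FROM emp_403;
```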
Multi-Table Views
A multi-table view allows users to access data from multiple tables as if it were in a single table. Multi-table views are also called join views. Join views are used for reading only, not updating.
EMPLOYEE (Table)
Columns: EMPLOYEE NUMBER (PK), MANAGER EMPLOYEE NUMBER (FK), DEPT NUMBER (FK), JOB CODE (FK), LAST NAME, FIRST NAME, HIRE DATE, BIRTH DATE, SALARY AMOUNT
EMPLOYEE NUMBER   MANAGER EMPLOYEE NUMBER   DEPT NUMBER
1006              1019                      301
1008              1019                      301
1005              0801                      403
1004              1003                      401
1007              1005                      403
1003              0801                      401

DEPARTMENT (Table)
DEPT NUMBER (PK)   DEPARTMENT NAME
501                marketing sales
301                research and development
302                product planning
403                education
402                software support
401                customer support
201                technical operations

EmpDept (View)
LAST NAME   DEPARTMENT NAME
Stein       research & development
Kanieski    research & development
Ryan        education
Johnson     customer support
Villegas    education
Trader      customer support
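A join view such as EmpDept could be defined along these lines. The sketch assumes the department table has columns named department_number and department_name, which match the figure headings but are not shown in a table definition here.

```sql
-- Present employee names with their department names as one "table".
CREATE VIEW EmpDept AS
SELECT e.last_name
      ,d.department_name
FROM employee e
INNER JOIN department d
  ON e.department_number = d.department_number;

-- The join is hidden from the user of the view.
SELECT last_name
FROM EmpDept
WHERE department_name = 'education';
```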
MACRO Processing
Macros are SQL statements stored as an object in the Data Dictionary (DD). Unlike a view, a macro can store one or multiple SQL statements. Additionally, the SQL is not restricted to only SELECT operations. INSERT, UPDATE, and DELETE commands are valid within a macro. When using BTEQ, conditional logic and BTEQ commands may also be incorporated into the macro. You can only have one DDL statement within a macro. If a macro contains DDL, it must be the last statement in the macro. Macro commands:
CREATE MACRO - initially builds a new macro
REPLACE MACRO - used to modify an existing macro
EXECUTE (EXEC) - used to run a macro
DROP MACRO - deletes a macro from the DD
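The macro commands can be sketched with a small parameterized macro. The macro name dept_list and its parameter dept are hypothetical; the columns come from the employee table in this material.

```sql
-- Build a macro that lists the employees of one department.
CREATE MACRO dept_list (dept INTEGER) AS
( SELECT last_name
        ,first_name
  FROM employee
  WHERE department_number = :dept ;
);

-- Run it, supplying a value for the parameter.
EXEC dept_list (403);

-- Change the definition without dropping the macro first.
REPLACE MACRO dept_list (dept INTEGER) AS
( SELECT last_name
        ,first_name
        ,hire_date
  FROM employee
  WHERE department_number = :dept ;
);

-- Remove the macro from the Data Dictionary.
DROP MACRO dept_list;
```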
Totals (WITH)
SELECT Last_Name ,First_Name ,Dept_no ,Salary FROM Employee_table WITH SUM(Salary) ;
Subtotals (WITH..BY)
SELECT Last_Name ,First_Name ,Dept_no ,Salary FROM Employee_table WITH SUM(Salary) (TITLE 'Departmental Salaries:') BY Dept_no ;
Temporary Tables
Why Temporary Tables?
- You can usually use simpler SQL statements.
- The system doesn't have to do aggregation.
- The system may access Accounts based on the Primary Index value, which results in a fast response.
Temporary Table types
DERIVED TABLES: Tables which are created in spool and dropped when the query is completed.
VOLATILE TEMPORARY TABLES: Tables that do not survive a system restart.
GLOBAL TEMPORARY TABLES: Require a base definition which is stored in the Data Dictionary (DD). A global temporary table remains materialized until it is dropped or the session terminates.
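The three temporary table types might be used as in the sketch below. The names dept_totals_vt, dept_totals_gt, and avg_sal are hypothetical; the source columns come from the employee table in this material.

```sql
-- Derived table: exists only in spool for the life of this query.
SELECT e.last_name
      ,e.salary_amount
      ,avg_sal.dept_avg
FROM employee e
INNER JOIN ( SELECT department_number, AVG(salary_amount)
             FROM employee
             GROUP BY department_number
           ) AS avg_sal (department_number, dept_avg)
  ON e.department_number = avg_sal.department_number;

-- Volatile table: the definition lives only in the session.
CREATE VOLATILE TABLE dept_totals_vt
( department_number INTEGER
 ,total_salary      DECIMAL(12,2)
)
PRIMARY INDEX (department_number)
ON COMMIT PRESERVE ROWS;

INSERT INTO dept_totals_vt
SELECT department_number, SUM(salary_amount)
FROM employee
GROUP BY department_number;

-- Global temporary table: the definition is stored in the DD,
-- and each session materializes its own private copy on first use.
CREATE GLOBAL TEMPORARY TABLE dept_totals_gt
( department_number INTEGER
 ,total_salary      DECIMAL(12,2)
)
PRIMARY INDEX (department_number)
ON COMMIT PRESERVE ROWS;
```

ON COMMIT PRESERVE ROWS keeps the inserted rows after the transaction ends; the default, ON COMMIT DELETE ROWS, would empty the table at each commit.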
Trigger Processing
A trigger is an event-driven maintenance operation. The event is caused by a modification to one or more columns of a row in a table.
Triggering Statement
The user's initial SQL maintenance request that causes a row to change in a table and then causes a trigger to fire (execute).
It can be: INSERT, UPDATE, DELETE, INSERT/SELECT
It cannot be: SELECT
Triggered Statement
The SQL that is automatically executed as a result of a triggering statement.
It can be: INSERT, UPDATE, DELETE, INSERT/SELECT, ABORT/ROLLBACK, EXEC
It cannot be: BEGIN/END TRANSACTION, COMMIT, CHECKPOINT, SELECT
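The relationship between a triggering and a triggered statement can be sketched as follows. The trigger name raise_trig and the salary_log table are hypothetical, and the 10 percent threshold is arbitrary; exact trigger syntax varies somewhat between Teradata releases.

```sql
-- Hypothetical audit table for salary changes.
CREATE TABLE salary_log
( employee_number INTEGER
 ,old_salary      DECIMAL(10,2)
 ,new_salary      DECIMAL(10,2)
 ,change_date     DATE
);

-- Triggering statement: any UPDATE of salary_amount on employee.
-- Triggered statement: the INSERT into salary_log.
CREATE TRIGGER raise_trig
AFTER UPDATE OF (salary_amount) ON employee
REFERENCING OLD AS oldrow NEW AS newrow
FOR EACH ROW
WHEN (newrow.salary_amount > oldrow.salary_amount * 1.10)
INSERT INTO salary_log
VALUES (newrow.employee_number
       ,oldrow.salary_amount
       ,newrow.salary_amount
       ,CURRENT_DATE);
```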
Stored Procedures
Teradata provides Stored Procedural Language (SPL) to create Stored Procedures. These procedures allow the combination of both SQL and SPL control statements to manage the delivery and execution of the SQL.
The processing flow of a procedure is more like that of a program. It is a procedural set of commands, whereas SQL is a non-procedural language.
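The mix of SPL control statements and SQL can be sketched with a short procedure. The procedure name give_raise, its parameters, and the 10 percent limit are hypothetical; the UPDATE targets the employee table shown in this material.

```sql
CREATE PROCEDURE give_raise
( IN  dept INTEGER
 ,IN  pct  DECIMAL(5,2)
 ,OUT msg  VARCHAR(40)
)
BEGIN
  -- SPL control statement decides whether the SQL runs at all.
  IF pct > 10 THEN
    SET msg = 'Raise too large, nothing updated';
  ELSE
    -- Ordinary SQL; parameters are referenced with a leading colon.
    UPDATE employee
    SET salary_amount = salary_amount * (1 + :pct / 100)
    WHERE department_number = :dept;
    SET msg = 'Salaries updated';
  END IF;
END;

-- Run the procedure; msg receives the OUT parameter value.
CALL give_raise (403, 5.00, msg);
```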