Teradata Overview
[Figure: Teradata Node. Labels: Communication Interfaces (LAN Gateway, Channel), Teradata RDBMS, BYNET Switch, eight AMP V-Procs, each with its own DATA.]
Primary Indexes
> Primary physical access path
> Mechanism used to assign a row to an AMP
> A table must have one and only one Primary Index
> The Primary Index cannot be changed without recreating the table
> Unique Primary Indexes (UPIs) result in even distribution of the rows of the table across all AMPs
> UPIs ensure no duplicate rows
> PI access is always a one-AMP operation
> With Non-Unique Primary Indexes (NUPIs), how evenly the table rows are distributed depends on the degree of uniqueness of the index and the number of AMPs
> Primary Indexes may or may not be the same as Primary Keys
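The distribution behaviour described above can be illustrated with a small simulation. This is a sketch only: it assumes a generic MD5-based hash, a hypothetical `amp_for` helper and eight AMPs. It does not reproduce Teradata's actual hashing algorithm or hash maps.

```python
import hashlib
from collections import Counter

NUM_AMPS = 8  # assumption: a small system with eight AMPs


def amp_for(pi_value, num_amps=NUM_AMPS):
    """Map a primary index value to an AMP (illustrative hash, not Teradata's)."""
    digest = hashlib.md5(str(pi_value).encode("utf-8")).digest()
    row_hash = int.from_bytes(digest[:4], "big")
    return row_hash % num_amps


def rows_per_amp(pi_values, num_amps=NUM_AMPS):
    """Count how many rows land on each AMP for the given PI values."""
    counts = Counter(amp_for(v, num_amps) for v in pi_values)
    return [counts.get(amp, 0) for amp in range(num_amps)]


# UPI: every row has a distinct PI value, so rows spread across all AMPs.
upi_counts = rows_per_amp(range(80_000))

# NUPI with only two distinct values: rows pile up on at most two AMPs.
nupi_counts = rows_per_amp(["M" if i % 2 else "F" for i in range(80_000)])

print("UPI rows per AMP: ", upi_counts)
print("NUPI rows per AMP:", nupi_counts)
```

The same PI value always maps to the same AMP, which is why a PI lookup touches one AMP, and why a NUPI with few distinct values leaves most AMPs without rows for that table.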
[Figure: Table A spread across the AMPs. On each AMP, rows are ordered by row hash (RH).]
Agenda
Your Benefit
> Significantly improve performance for range-constrained queries
  > Strategic queries still see all data in one table, but tactical queries look only at the subset they need
  > Performance improvements for other functions such as deletes and updates
  > Read only a subset of the table
> Easy to manage
  > None of the pain of re-partitioning
  > All the self-management you expect from Teradata
  > Reduce high-volume batch insert times by 90%
  > Delete large volumes of rows nearly instantaneously
  > Drop unneeded secondary indexes or value-ordered join indexes
> Maximum of 65,535 partitions, numbered from one
> One or more columns in the partitioning expression
Performance Awareness
> Possible degradation of PI Access
  If the partitioning column is not qualified, all partitions will be read
  Joins on PI columns from a non-PPI table to a PPI table will result in comparing the PI column in every partition of the PPI table
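The effect of qualifying, or not qualifying, the partitioning column can be sketched as follows. The monthly partitioning scheme, the 2024 start date and the helper names `partition_number` and `partitions_to_read` are assumptions made for illustration; this is not how Teradata implements partition elimination internally.

```python
from datetime import date


def partition_number(d, start=date(2024, 1, 1)):
    """Monthly partition number, numbered from one (hypothetical scheme)."""
    return (d.year - start.year) * 12 + (d.month - start.month) + 1


def partitions_to_read(all_partitions, date_from=None, date_to=None):
    """Return the partitions a query must read.

    With no condition on the partitioning column, every partition is read.
    """
    if date_from is None or date_to is None:
        return sorted(all_partitions)
    lo, hi = partition_number(date_from), partition_number(date_to)
    return [p for p in sorted(all_partitions) if lo <= p <= hi]


partitions = set(range(1, 13))  # twelve monthly partitions for 2024

# Range-constrained query: only the March partition is read.
march_only = partitions_to_read(partitions, date(2024, 3, 1), date(2024, 3, 31))
print(march_only)

# Partitioning column not qualified: all twelve partitions are read.
unqualified = partitions_to_read(partitions)
print(len(unqualified))
```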
[Figure: Table A as a PPI table spread across the AMPs. Labels: Partition, Data Columns, Partition 1, Partition 2, RH.]
> Conclusions
  PPI can offer dramatic improvements in query response time and in high-volume data load and maintenance operations
  There may be degradations in PI access and in join steps due to PPI
  The DBA should understand the trade-off considerations
  Testing of various alternatives will usually be necessary to get the maximum benefit from PPI
Secondary Indexes
> Unique Secondary Index (USI)
> Non-unique Secondary Index (NUSI)
> A secondary index is an alternate path to the rows of a table.
> Secondary indexes:
  Do not affect table distribution.
  Add overhead, both in terms of disk space and maintenance.
  May be added or dropped dynamically as needed.
  Are chosen to improve access performance.
[Figure: USI access. The PE applies the hashing algorithm to the USI value (Table ID 100, Row Hash 602, USI Value 54). The request travels over the BYNET to the AMP holding the matching USI subtable row (columns: RowID, Cust, base-table RowID), and from there over the BYNET to the AMP holding the base-table row (columns: RowID, Cust, Name, Phone; Cust is the USI and Phone is the NUPI). AMPs 1 to 4 each hold one USI subtable and one portion of the base table.]
[Figure: NUSI access. The PE applies the hashing algorithm to the NUSI value (Customer table ID = 100, Row Hash 567, NUSI Value = Adams). AMPs 1 to 4 each hold their own NUSI subtable (columns: RowID, Name, base-table RowIDs) and their own portion of the base table (columns: RowID, Cust, Name, Phone; Name is the NUSI and Phone is the NUPI). Each AMP looks up Adams in its local NUSI subtable and reads the qualifying rows from its local base table.]
Example 2: IF < 1 row per block qualifies, THEN NUSI access is faster than full table scan
> If 100 rows/block and 1 in 1000 rows qualify, then 1 in every 10 blocks would be read. NUSI will be used.
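The arithmetic behind this example can be checked with a short calculation. The helper names below are hypothetical, and the rule encoded is only the slide's rule of thumb (fewer than one qualifying row per block), not the optimizer's actual costing. The second case (1 in 50 rows) is an invented contrast, not a figure from the slide.

```python
from fractions import Fraction


def qualifying_rows_per_block(rows_per_block, qualifying_fraction):
    """Expected number of qualifying rows in one data block."""
    return rows_per_block * qualifying_fraction


def nusi_likely_faster(rows_per_block, qualifying_fraction):
    """Rule of thumb from the slide: fewer than 1 qualifying row per block."""
    return qualifying_rows_per_block(rows_per_block, qualifying_fraction) < 1


# Figures from the example: 100 rows per block, 1 in 1000 rows qualifies.
per_block = qualifying_rows_per_block(100, Fraction(1, 1000))
blocks_per_hit = 1 / per_block

print(per_block)        # expected qualifying rows per block
print(blocks_per_hit)   # on average, one block in this many holds a hit
print(nusi_likely_faster(100, Fraction(1, 1000)))

# Hypothetical contrast (not from the slide): 1 in 50 rows qualifies.
print(nusi_likely_faster(100, Fraction(1, 50)))
```

With 100 rows per block and 1 qualifying row in 1000, the expected count is one tenth of a row per block, which matches the slide's "1 in every 10 blocks".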
Example:*
> Large corporation with 100,000 calls / month would do a Full Table Scan
> Residential phone customer with 20 calls / month would use the NUSI
*(Candidate
A sparse index is a special case of a Single Table Join Index (STJI): create a join index and qualify it with a WHERE clause. You cannot simply put a WHERE clause on a secondary index.
[Figure: Table Data and its NUSI.]
Sparse Index
> Any join index, whether simple or aggregate, multi-table or single-table, can be sparse.
> Uses a constant expression in the WHERE clause of its definition to narrowly filter its row population.
Value-Ordered NUSI
> Very efficient for range conditions and conditions with an inequality on the secondary index column set.
Hash Index
> Used for the same purposes as single-table join indexes.
> Creates a full or partial replication of a base table, with a primary index on a foreign key column, to facilitate joins of very large tables by hashing them to the same AMP.
> Limited to one table only.
[Figure: Table Data and its STJI.]
Different structure; the query is satisfied with STJI access
The index is maintained automatically
> Cannot join to a NUSI, but can join to an STJI
> NUSI is supported by MultiLoad, but STJI is not
> All columns of a NUSI must be accessed for the index to be considered
SELECT item, COUNT(DISTINCT(Store_no))
FROM Sales_History
WHERE On_hand_qty > 0
  AND qty_sold > 0
  AND item IN (x, y, z)
  AND date_sold IN (a, b, c)
  AND store IN (d, e, f)
GROUP BY 1
ORDER BY 1;
Single Table Join Index (STJI): Same Primary Index, Partial Covering
Use the same Primary Index for the STJI so that each STJI row is on the same AMP as its table data row
  Similar to a NUSI: the STJI row is on the same AMP as the data row
Useful for partial covering
  Partial covering means qualification is done on the index columns before accessing the primary data table for non-covered columns
  This can reduce the number of rows to retrieve from the base table
Acts like a NUSI
> AMP-local and no BYNET traffic
The query scans LIKETAB (a very narrow table), qualifies rows with car_license LIKE 'ABC%', then uses the ROWID to get data from table Customer_Info, where the row is on the same AMP because the tables have the same primary index
* The optimizer will NOT scan a NUSI for LIKE; instead, it will scan the base table
Build an STJI with the column for LIKE plus the ROWID of the base table
> Use the LIKE clause on the narrower table
> Took 4 seconds
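The two-step access pattern (qualify on the narrow structure, then fetch base rows by ROWID) can be sketched in a few lines. The names LIKETAB, Customer_Info and car_license come from the slides. The sample rows, the other column names and the `lookup_by_license_prefix` helper are invented for illustration, and plain Python containers stand in for the AMP-local tables.

```python
# Base table on one AMP: ROWID -> full (wide) customer row. Hypothetical data.
customer_info = {
    1: {"cust_id": 101, "name": "Adams", "car_license": "ABC123", "phone": "555-0101"},
    2: {"cust_id": 102, "name": "Brown", "car_license": "XYZ789", "phone": "555-0102"},
    3: {"cust_id": 103, "name": "Smith", "car_license": "ABC999", "phone": "555-0103"},
    4: {"cust_id": 104, "name": "Jones", "car_license": "DEF456", "phone": "555-0104"},
}

# Narrow structure (the LIKETAB role): only the LIKE column plus the base ROWID.
liketab = [(row["car_license"], rowid) for rowid, row in customer_info.items()]


def lookup_by_license_prefix(prefix):
    """Qualify on the narrow structure first, then fetch only matching base rows."""
    # Step 1: scan the narrow structure, the equivalent of LIKE 'prefix%'.
    rowids = [rowid for license_plate, rowid in liketab
              if license_plate.startswith(prefix)]
    # Step 2: use the ROWIDs to read just the qualifying rows from the base table.
    return [customer_info[rowid] for rowid in rowids]


matches = lookup_by_license_prefix("ABC")
print([row["name"] for row in matches])
```

Only two of the four wide rows are read in step 2; the scan in step 1 touches the narrow structure only, which is the point of partial covering.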
Value Ordered Non-Unique Secondary Index (VONUSI) and Value Ordered Single Table Join Index (VOSTJI) Example
> Invoice Table - 60 Million rows
1,500 days × 100 outlets × 400 sales per day
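The row count can be verified directly, and a scaled-down sorted list shows why value ordering helps range conditions. The sorted list and `range_scan` helper are an analogy only, assuming entries kept in day order; they are not Teradata's on-disk index structure, and the 4 rows per day is a scaled-down stand-in for the real volume.

```python
from bisect import bisect_left, bisect_right

DAYS = 1500
OUTLETS = 100
SALES_PER_OUTLET_PER_DAY = 400

total_rows = DAYS * OUTLETS * SALES_PER_OUTLET_PER_DAY
rows_per_day = OUTLETS * SALES_PER_OUTLET_PER_DAY
print(total_rows, rows_per_day)

# Scaled-down value-ordered index: one entry per row, kept sorted by day number.
SCALED_ROWS_PER_DAY = 4  # assumption: 4 rows per day instead of the real volume
index_days = [day for day in range(1, DAYS + 1) for _ in range(SCALED_ROWS_PER_DAY)]


def range_scan(sorted_keys, low, high):
    """Return (start, end) positions of entries with low <= key <= high."""
    start = bisect_left(sorted_keys, low)
    end = bisect_right(sorted_keys, high)
    return start, end


# A one-week range condition touches only the entries for those seven days.
start, end = range_scan(index_days, 1001, 1007)
entries_touched = end - start
print(entries_touched, "of", len(index_days))
```

Because the entries are in value order, the qualifying range is one contiguous slice found by two binary searches rather than a pass over every entry.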
Your Benefit
> A sparse index can focus on the portion of the table(s) that are most frequently used:
  Reduces the storage requirements for a join index and the maintenance costs for updates
  Makes access faster since the size of the join index is smaller
  No change for the user: the optimizer will evaluate all join indexes and choose one if it's appropriate for the specific query
[Figure: Index Column, showing indexed values and Null values.]
[Figure: chart with y-axis ticks from 0 to 800 and x-axis ticks from 1 to 96.]
Application needs quick response for Residential customer calls
Use a Sparse STJI instead of a NUSI
> Saves space and maintenance time
A Hash Index can be value ordered
It is created with different DDL syntax than an STJI
Example - Marketing Campaign
Reduce the list of prospects with successive qualifying queries using a Hash Index until the final number is achieved, then get the detailed data.
Alison Torres
alison.torres@teradata.com
Additional Considerations
> Some queries access the PI
> No other tables have the same primary index (and will not join to it using the PI)
> Proposal: Convert to PPI, partitioned by transaction date with daily granularity
> Could also consider partitioning by product_code or agent_id
  Would improve some queries
  Would not improve batch insert or delete operations
> Many queries can benefit from partition elimination
> PI access is not degraded much, if at all
> Joins should not be degraded (but check EXPLAINs, and do comparison testing if they change)
  If joins are slower, could consider weekly or monthly partitions instead of daily
PI access will use secondary indexes and will take two or three times as long as non-PPI PI access
Joins to other tables with the same PI will probably be degraded