
7/31/2019 Chapter 1 - The Cold, Hard Teradata Facts - Teradata Database Administration – Teradata Internals

 Teradata Database Administration – Teradata Internals


Chapter 1 - The Cold, Hard Teradata Facts

“Get your facts first, and then you can distort them as you please.”

- Mark Twain


https://learning.oreilly.com/library/view/teradata-database-administration/9781940540184/chapter01.xhtml#sub1 1/46

What is Parallel Processing?

“After enlightenment, the laundry”

- Zen Proverb

“After parallel processing the laundry, enlightenment!”

-Teradata Zen Proverb

Two guys were having fun on a Saturday night when one said, “I've got to
go and do my laundry.” The other said, “What?!” The man explained that
if he went to the laundromat the next morning, he would be lucky to get
one machine and would be there all day. But if he went on Saturday night,
he could get all the machines and do all his washing and drying in two
hours. Now that's parallel processing mixed in with a little dry humor!


The Basics of a Single Computer

Data on disk does absolutely nothing. When data is requested, the
computer moves the data one block at a time from disk into memory.
Once the data is in memory it is processed by the CPU at lightning speed.
All computers work this way. The “Achilles Heel” of every computer is the
slow process of moving data from disk to memory. That is all you need to
know to be a computer expert!


Teradata Parallel Processes Data

Teradata has been the pioneer in parallel processing since 1988, when
Wells Fargo bought the first Teradata system. In the picture above you see
that we have 16 orders, with four orders placed on each disk. It appears to
be four separate computers, but this is one system. Teradata systems work
just like a basic computer as they still need to move data from disk into
memory, but Teradata divides and conquers.


Parallel Architecture

The rows of a Teradata table are spread across the AMPs, so each AMP
can then process an equal amount of the rows when a USER queries the
table.


The Teradata Architecture

The Parsing Engine (PE) takes the User's SQL and builds a Plan for each
AMP to follow to retrieve the data. Parallel Processing is all about each
AMP doing an equal amount of the work. If the AMPs start at the same
time and end at the same time, they are performing true Parallel
Processing. All communication is done over the BYNET.


All Teradata Tables are spread across ALL AMPS

Each table dreams of spreading its rows equally across the AMPs.
Above are three tables, each holding 9 rows (3 rows per AMP).


Teradata Systems can Add AMPs for Linear Scalability

If you double your AMPs, the system is twice as fast, because each AMP
then holds and processes half as many rows. System number one has only
4 AMPs, but system two has 8 AMPs and is twice as fast. When a customer
buys more hardware, they are adding AMPs to the system. Once the
hardware is configured, the AMPs will redistribute the data to include
the new AMPs.


Understand that Teradata can scale to incredible size

“If you do what you've always done, you'll get what you've always got.”

-Anonymous

The largest systems in the world have used Teradata for market
dominance for the past 20 years. Its Massively Parallel Processing (MPP)
technology analyzes data on such a large scale that companies can run
queries they have never been able to run before. Recognize that you now
have something very powerful that can analyze every aspect of your
business. So do what you've never done, and get something that you've
never got.


AMPs and Parsing Engines (PEs) live inside SMP Nodes

AMPs and PEs are called Virtual Processors because each is a process that
lives inside a node's memory. Think of a node as a very powerful personal
computer. SMP stands for symmetric multiprocessing, which means each
CPU performs equally, and all CPUs share a pool of memory and operate
under one operating system. Each node is designed to operate at
maximum performance.


Each Node is attached via a Network to a Disk Farm

A Teradata AMP will be assigned a Virtual disk to store its tables and the
rows assigned to it. Only the AMP assigned to the virtual disk can read or
write to that disk. A node holds many AMPs. In the early days, each node
held around 8-10 AMPs, but with more power in a node due to CPU
advances, 64-bit architecture, and a ton more memory, many nodes today
will hold up to 40-50 AMPs. Each AMP is still attached to its virtual disk.
Think of a single node attached to a cable which then attaches to a single
disk farm. Now, each AMP in the node knows where its virtual disk
resides.


Two SMP Nodes Connected Become One MPP System

When nodes are connected to the BYNETs, then they become part of one
large Teradata system. In the picture above, there are two nodes. Each
node is connected to the BYNETs so now our system has 8 Parsing
Engines and 80 AMPs, but physically they are separate hardware nodes.
When a customer wants to grow their system, they add additional nodes,
which in turn add additional Parsing Engines, AMPs and disks. Two SMP
nodes connected via the BYNETs are now one Massively Parallel
Processing (MPP) system.


There are Many Nodes in a Teradata Cabinet

Teradata has many different configurations, but I want you to understand
that nodes are kept in cabinets. Sometimes the disks are within the
cabinet, but sometimes they are not. The same goes for the BYNET
boards.


Inside a Teradata Node

Gateway and Channel Driver software run as processes. Users connecting
via the mainframe access Teradata through the Channel, and all other
users use the LAN Gateway. The Parallel Database Extension (PDE) controls
the Access Module Processors (AMPs) and Parsing Engines (PEs), which
are referred to as Virtual Processors (Vprocs) and reside in the node's
memory. The operating system running the node is Linux.


The Boardless BYNET and the Physical BYNET

Each node has an internal BYNET communication system within the
node, so the PEs and AMPs can communicate. A single node is a
Symmetric Multiprocessing (SMP) node, and if the Teradata system is a
single-node system, it won't have a physical BYNET. Once multiple SMP
nodes are connected to produce a Massively Parallel Processing (MPP)
system, two physical BYNET boards connect the nodes together.


The Parsing Engine

Each Parsing Engine (PE) can manage up to 120 individual sessions.

When a user logs into Teradata, a PE will log them in and be
responsible for their session.

The PE checks the SQL syntax, creates the EXPLAIN, checks
security, and builds a plan for the AMPs to follow.

The PE uses the COLLECTED STATISTICS to build the best plan.

The PE is responsible for converting EBCDIC (from mainframe
queries) to ASCII on the way in, and the AMPs are responsible for
converting from ASCII to EBCDIC on the way out.

The PE always delivers the final answer set to the user.

The Parsing Engine's biggest responsibility is building a parallel-aware,
cost-based plan for the AMPs to follow to retrieve the data.


The AMPs' Responsibilities

AMPs are responsible for storing and retrieving rows from their
assigned disk (Vdisk).

AMPs lock the tables and rows.

AMPs sort rows and do all aggregation.

AMPs handle all the join processing.

AMPs handle all space management and space accounting.

AMPs convert ASCII to EBCDIC when returning answer sets to the
mainframe.

The AMPs' biggest responsibility is to listen to the PE and follow the plan.


Teradata Parallel Processing

Each AMP holds a portion of the rows for every table in the system.

Teradata was born to be parallel. When a user queries a system, they
log on to a Parsing Engine for the entire session. The Parsing Engine
checks the User's SQL syntax, checks their Access Rights for security
purposes, and then comes up with a plan for the AMPs to retrieve the
data. The Parsing Engine passes the plan to the AMPs over one of the two
BYNET networks, and the AMPs work simultaneously in parallel to
retrieve the data.
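
The flow above can be sketched in a few lines of Python. This is only a toy model, not how Teradata is implemented: Python threads stand in for the AMPs, `run_query` stands in for the Parsing Engine, and the 16 orders, `AMP_ROWS`, `amp_step`, and the `order_number % 4` placement are all hypothetical names and data made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_AMPS = 4

# Hypothetical data: 16 orders as (order_number, order_total), four per AMP.
AMP_ROWS = {
    amp: [(n, n * 100) for n in range(1, 17) if n % NUM_AMPS == amp]
    for amp in range(NUM_AMPS)
}

def amp_step(amp_no, min_total):
    """One AMP's part of the plan: scan only its own rows, return the matches."""
    return [row for row in AMP_ROWS[amp_no] if row[1] >= min_total]

def run_query(min_total):
    """Play the Parsing Engine: give every AMP the same step, merge the answers."""
    with ThreadPoolExecutor(max_workers=NUM_AMPS) as pool:
        partial_sets = list(pool.map(lambda amp: amp_step(amp, min_total), AMP_ROWS))
    # The PE delivers one final answer set to the user.
    return sorted(row for part in partial_sets for row in part)

print(run_query(1200))
```

Each worker touches only the rows it owns, which is the point of the design: no AMP waits on another AMP's disk.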


Each Table has a Primary Index that is Unique or Non-Unique

CREATE TABLE Department_Table
( Dept_No INTEGER
, Dept_Name CHAR(20)
, Budget DECIMAL(10,2) )
UNIQUE PRIMARY INDEX ( Dept_No ) ;

UPI

CREATE TABLE Department_Table
( Dept_No INTEGER
, Dept_Name CHAR(20)
, Budget DECIMAL(10,2) )
PRIMARY INDEX ( Budget ) ;

NUPI

CREATE TABLE Department_Table
( Dept_No INTEGER
, Dept_Name CHAR(20)
, Budget DECIMAL(10,2) )
PRIMARY INDEX ( Dept_Name, Budget ) ;

Multi-Column NUPI

CREATE TABLE Department_Table
( Dept_No INTEGER
, Dept_Name CHAR(20)
, Budget DECIMAL(10,2) ) ;

If no Primary Index is defined, then the 1st column is selected as a
NUPI.

Each Teradata table has only one Primary Index, and it is either a Unique
Primary Index (UPI) or a Non-Unique Primary Index (NUPI). The
Primary Index is established when the table is created, inside the CREATE
TABLE statement.


The Hash Map Determines which AMP will own the Row

Teradata uses a single proprietary “Hash Formula” that runs a
mathematical calculation on the Primary Index value of each row. The
result is called the “Row Hash”, and the Row Hash, looked up in the
system's hash map, determines which AMP holds the row. The Parsing
Engine can rerun the Hash Formula on the same value to quickly find the row.
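
Here is a rough sketch of that lookup in Python. The real Hash Formula is proprietary, so `zlib.crc32` is used purely as a stand-in. The 65,536-bucket map, the choice of the high-order 16 bits as the bucket number, and the names `row_hash`, `owning_amp`, and `HASH_MAP` are assumptions made for illustration.

```python
import zlib

NUM_AMPS = 4
NUM_BUCKETS = 65536  # assumed bucket count, for illustration only

# Stand-in hash map: every bucket number is pre-assigned to exactly one AMP.
HASH_MAP = [bucket % NUM_AMPS for bucket in range(NUM_BUCKETS)]

def row_hash(pi_value):
    """Stand-in for the Hash Formula: a 32-bit hash of the Primary Index value."""
    return zlib.crc32(repr(pi_value).encode("utf-8")) & 0xFFFFFFFF

def owning_amp(pi_value):
    """Hash the value, take the bucket bits, and look the bucket up in the map."""
    bucket = row_hash(pi_value) >> 16  # high-order 16 bits pick the bucket
    return HASH_MAP[bucket]

# Hashing the same Dept_No always gives the same AMP, on insert and on lookup.
placement = {dept_no: owning_amp(dept_no) for dept_no in (100, 200, 300, 400)}
print(placement)
```

Because the formula is deterministic, nothing needs to remember where a row went: rerunning it on the same Primary Index value leads back to the same AMP.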


A Unique Primary Index Spreads the Data Evenly

The Row Hash is the result of the Primary Index value going through the
Hash Formula. It will stay with the row forever.

The Department_Table has a Unique Primary Index (UPI) on Dept_No.
The rows are spread evenly across the AMPs. Take notice of the Row Hash
in front of each row. When the row is hashed, the resulting Row Hash
sticks with the row like glue forever.


The AMP Adds a Uniqueness Value to Create the Row-ID

The Row Hash is the result of the Primary Index value going through the
Hash Formula.
The AMP adds a Uniqueness Value and the combination of the Row Hash
and the Uniqueness Value form a Row-ID.

Notice that each AMP adds a Uniqueness Value to the Row Hash and this
forms the Row-ID. The Row-ID is unique within a table. Why are all the
Uniqueness Values equal to one (1)? Each Dept_No is unique, so each
“Row Hash” is unique, so a Uniqueness Value of one (1) is entered.
Teradata AMPs sort their rows by the Row-ID.


Each AMP Sorts Their Tables by the Row-ID

How does each AMP sort the rows they own? By the Row-ID.

Each AMP sorts its rows by the Row-ID. Now, an AMP can look up a
specific row like a person looks up a name in a phone book. People can
easily use a phone book because it is in alphabetic order. An AMP can
quickly find rows because its rows are in binary (Row-ID) order. A
person uses a phone book by looking in the middle and then moving up or
down a group of pages until the search is satisfied. An AMP searches the
same way, but using binary numbers to find a specific Row Hash quickly.


A Non-Unique Primary Index Skews the Data

Because Budget was chosen as the Primary Index, and the Teradata hash
formula is consistent, all like values go to the same AMP. Notice that all of
the budgets of 50,000.00 went to AMP 1. The 40,000.00 budgets all went
to AMP 2, and the only row with a 30,000.00 budget went to AMP 3. The
data is skewed. Also, notice the Row Hash values and the increasing
Uniqueness ID for budgets with the same value.
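
The skew can be reproduced with a small Python sketch. Again `zlib.crc32` only stands in for the proprietary hash formula, the `row_hash % NUM_AMPS` placement stands in for the hash map, and the six department rows are made-up sample data, so the AMP numbers printed will not match the picture.

```python
import zlib
from collections import defaultdict

NUM_AMPS = 3

def row_hash(pi_value):
    """Stand-in for the hash formula: a 32-bit hash of the Primary Index value."""
    return zlib.crc32(repr(pi_value).encode("utf-8")) & 0xFFFFFFFF

def distribute(rows, pi_column):
    """Return {amp: [(row_hash, uniqueness_value, row), ...]} for the chosen PI."""
    amps = defaultdict(list)
    next_uniqueness = defaultdict(int)  # one counter per Row Hash
    for row in rows:
        h = row_hash(row[pi_column])
        next_uniqueness[h] += 1         # duplicates get 1, 2, 3, ...
        amps[h % NUM_AMPS].append((h, next_uniqueness[h], row))
    return amps

# Hypothetical sample rows.
rows = [
    {"dept_no": 100, "budget": 50000},
    {"dept_no": 200, "budget": 50000},
    {"dept_no": 300, "budget": 50000},
    {"dept_no": 400, "budget": 40000},
    {"dept_no": 500, "budget": 40000},
    {"dept_no": 600, "budget": 30000},
]

by_budget = distribute(rows, "budget")    # NUPI on Budget
by_dept = distribute(rows, "dept_no")     # UPI on Dept_No
for amp, stored in sorted(by_budget.items()):
    print(amp, [(hex(h), u, r["dept_no"]) for h, u, r in stored])
```

With Budget as the Primary Index, every row sharing a budget carries the same Row Hash, lands on the same AMP, and is told apart only by its Uniqueness Value. With Dept_No as the Primary Index, every Uniqueness Value stays at 1.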


Comparing the Same Table with Different Primary Indexes

The Department_Table is laid out twice, each time with a different
Primary Index. The red colors denote the table's Primary Index, and the
blue colors denote the Row-ID. The top example uses a Unique Primary
Index (UPI) to distribute data evenly, but the bottom example uses Budget
as a Non-Unique Primary Index (NUPI), and the data is skewed.


Unique Primary Index Queries are a Single AMP Retrieve

SELECT *
FROM Department_Table
WHERE Dept_No = 100;

The above query uses the Unique Primary Index (UPI) on Dept_No in the
WHERE clause of the SQL. This results in a “Single AMP retrieve”. Only
AMP 1 is contacted to retrieve the row. The data was originally
distributed by hashing the Primary Index value with the “Hash Formula”,
so the Parsing Engine's plan is to repeat that process: it hashes the value
100 again to learn which AMP owns the row. The BYNET contacts only
AMP 1. Give Teradata any Primary Index value for a table, and it knows
which AMP has that row by rerunning the “Hash Formula”.


A Non-Unique Primary Index is also a Single AMP Retrieve

SELECT * FROM Department_Table
WHERE Budget = 40000.00;

The above query uses a Non-Unique Primary Index (NUPI) on the column
Budget in the WHERE clause of the SQL. This also results in a Single
AMP retrieve. Only AMP two is contacted to retrieve the rows. The data
was distributed by hashing the Primary Index value, so the Parsing
Engine's plan is to repeat that hashing and contact only the proper AMP.
AMP two brings back all qualifying rows.


Teradata has a No Primary Index Table called a NoPI Table

CREATE TABLE Department_Table
( Dept_No INTEGER
, Dept_Name CHAR(20)
, Budget DECIMAL(10,2) )
NO PRIMARY INDEX ;

NoPI

Each AMP is assigned a Row Hash and then it merely increments its
Uniqueness Value.

A NoPI Table has no primary index. A NoPI table guarantees even
distribution. It still has a Row-ID. How? Each AMP is assigned a Row
Hash, and then for every row the AMP receives it increments the
Uniqueness Value. A NoPI table is most often used as a loading staging
table or with a columnar designed table. Each row is appended quickly.
Distribution is always random, but even, so as a staging table it is quicker
to load.


There are Normal Tables and then there are Partitioned Tables

The table above is a partitioned table, which means the AMPs are ordered
NOT to sort their rows by Row-ID, but instead to sort them by the
partition. This is referred to as a Partitioned Primary Index table or PPI
table. The only difference between a PPI table and a normal table (a
Non-Partitioned Primary Index, or NPPI, table) is how the AMPs sort
their rows. The example above has each AMP sort its rows by month of
Order_Date.


A Visual of One Year of Data with Range_N Per Month

Each AMP above sorts its rows by Month (of Order_Date), which really
means the rows are first sorted by the Partition Number and then by
Row-ID within the partition. The combination of Partition Number and
Row-ID is called the Row Key. A normal table has AMPs sort their rows by
Row-ID, but a PPI table has AMPs sort by Row Key.
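
The two sort orders are easy to compare side by side. In this sketch each row is a hypothetical `(row_hash, uniqueness_value, order_date)` tuple with made-up hash values, and `partition_number` assumes one partition per month of a single year.

```python
from datetime import date

def partition_number(order_date):
    """One partition per month of a single year: January is 1, December is 12."""
    return order_date.month

def row_key(row):
    """Row Key = Partition Number followed by the Row-ID (row hash, uniqueness)."""
    row_hash, uniqueness, order_date = row
    return (partition_number(order_date), row_hash, uniqueness)

# Hypothetical rows on one AMP: (row_hash, uniqueness_value, order_date).
rows = [
    (0x9A3F, 1, date(2013, 2, 14)),
    (0x1C07, 1, date(2013, 1, 30)),
    (0x5D22, 1, date(2013, 1, 5)),
    (0x0B11, 1, date(2013, 2, 2)),
]

nppi_order = sorted(rows, key=lambda r: (r[0], r[1]))  # normal table: Row-ID
ppi_order = sorted(rows, key=row_key)                  # PPI table: Row Key

print([hex(r[0]) for r in nppi_order])
print([hex(r[0]) for r in ppi_order])
```

In Row-ID order the January and February rows are interleaved. In Row Key order all January rows come first, so each month sits together on disk.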


Partitioning is designed to eliminate the Full Table Scan

SELECT * FROM Order_Table
WHERE EXTRACT (MONTH FROM Order_Date) = 10 ;

The query above performs an All-AMP retrieve, but from a single partition,
so it has NOT done a Full Table Scan. This is a major concept in
Physical Database Design!
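
A minimal sketch of why this is not a Full Table Scan, assuming a hypothetical in-memory layout in which each AMP keeps its rows grouped by partition number. The 48 sample orders, the `order_number % num_amps` placement, and the function names are invented for illustration.

```python
from datetime import date

def build_amps(orders, num_amps):
    """Place each order on an AMP, grouped by monthly partition number."""
    amps = [dict() for _ in range(num_amps)]
    for order_number, order_date in orders:
        amp = amps[order_number % num_amps]  # stand-in for hashing the PI
        amp.setdefault(order_date.month, []).append((order_number, order_date))
    return amps

def scan_partition(amps, partition_no):
    """All-AMP retrieve that touches only one partition on every AMP."""
    rows_read = 0
    result = []
    for amp in amps:                         # every AMP takes part
        partition = amp.get(partition_no, [])
        rows_read += len(partition)          # other partitions are never read
        result.extend(partition)
    return sorted(result), rows_read

# Hypothetical data: 48 orders spread over the 12 months of 2013.
orders = [(n, date(2013, (n % 12) + 1, 15)) for n in range(1, 49)]
amps = build_amps(orders, 4)
october, rows_read = scan_partition(amps, 10)
print(f"read {rows_read} of {len(orders)} rows")
```

Every AMP is contacted, yet together they read only the October rows instead of all 48.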


A Partition # and Row-ID = Row Key

CREATE TABLE Order_Table
( Order_Number INTEGER
, Customer_Number INTEGER
, Order_Date DATE
, Order_Total DECIMAL(10,2)
) PRIMARY INDEX ( Order_Number )
PARTITION BY RANGE_N
( Order_Date BETWEEN
  DATE '2013-01-01' AND DATE '2013-12-31'
  EACH INTERVAL '1' MONTH ) ;

A Partitioned Table (PPI) is sorted on each AMP by the Row Key.

Above is Partition # 1 (January) and Partition # 2 (February). Inside each
partition, you can see the Row-ID. The Partition # combined with the
Row-ID is called the Row Key.


An AMP Stores its Rows Sorted in only Two Different Ways

An AMP stores its rows sorted by the Row-ID or the Row Key. A normal
table sorts by the Row-ID, and a Partitioned Table (PPI table) sorts on the
Row Key (Partition # + Row-ID). You will soon find out exactly why this is
done and the true meaning of Physical Database Design! These are the
two ways an AMP sorts its data on disk.


AMPs Move Their Data Blocks into Memory to Read/Write

Rows of a table can't be read on disk. Rows are merely stored on disk in
data blocks. When a Full Table Scan (for example) needs to read all rows
of a table, the AMPs will simultaneously move their data blocks into their
dedicated memory.


The Most Taxing thing for an AMP is Moving Blocks into Memory

Once the Table Header and the Data Block are inside an AMP's Memory,
the AMP can read. The most taxing thing for an AMP to do is to move
blocks from disk to memory.


Rows are Stored in Data Blocks which are stored in Cylinders

Above, you see many rows of the same table stored in a data block. You
can have many data blocks stored in a cylinder, and an AMP owns
thousands of cylinders.


Rows for an AMP Stored Inside a Data Block in a Cylinder

This is a real data block that contains the overhead in front of each row.
This data block among other data blocks will reside inside a cylinder. The
rows are sorted by Row-ID, so the AMP can read the Row Reference Array
like a phone book to quickly find a row.


An AMP's Master Index is Used to Find the Right Cylinder

An AMP has thousands of cylinders. Inside each cylinder are data blocks
where the AMP stores the rows of a table. The AMP's Master Index is
always in memory so it can easily locate the cylinder containing the data.
Once the data block and table header are located and moved into memory,
rows can then be read, updated, inserted, or deleted.


The Row Reference Array (RRA) Does the Binary Search

There is one Row Reference Array (RRA) inside each data block. This
array shows the starting position of each data row. Now, the AMP can
start in the middle of the RRA and look for a row's Row-ID and either find
it, or know to move up or down like you might use a phone book. The Row
Reference Array is always in Row-ID order.
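
The phone-book search can be sketched as an ordinary binary search. This is an illustration only: the block is modelled as a Python list of `(row_id, data)` pairs in arbitrary physical order, the array holds list positions rather than byte offsets, and the Row-IDs and department names are made up.

```python
def build_rra(block_rows):
    """Row Reference Array: row positions, kept in Row-ID order."""
    return sorted(range(len(block_rows)), key=lambda pos: block_rows[pos][0])

def find_row(block_rows, rra, row_id):
    """Binary search the array: start in the middle, then move up or down."""
    low, high = 0, len(rra) - 1
    while low <= high:
        mid = (low + high) // 2
        candidate = block_rows[rra[mid]]
        if candidate[0] == row_id:
            return candidate
        if candidate[0] < row_id:
            low = mid + 1
        else:
            high = mid - 1
    return None  # no row with that Row-ID in this block

# Hypothetical block: ((row_hash, uniqueness_value), data) in physical order.
block_rows = [
    ((0x7F10, 1), "Sales"),
    ((0x0A22, 1), "Marketing"),
    ((0xC3D4, 1), "Research"),
    ((0x41B9, 1), "Support"),
]
rra = build_rra(block_rows)
print(rra)
print(find_row(block_rows, rra, (0x41B9, 1)))
```

Only the small array has to stay in Row-ID order. The rows themselves can sit anywhere in the block, because the array records where each one starts.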


A Block Splits into Two Blocks at Maximum Block Size

Once a block reaches the maximum block size (on an AMP), the AMP
splits the block from one 255 sector block to two separate 127.5 sector
blocks. Notice that we have the same total number of rows, but now half
the rows are in the first data block and the other half of the rows are in the
second data block. Notice that both blocks have the same TableID, and
notice that there is now a Block Header, Row Reference Array and Block
Trailer in both of the blocks. An AMP might have many blocks for a single
table.
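
A block split can be pictured with a toy model. The sketch below splits by row count, which is a simplifying assumption (the text describes the split in sectors), and the dictionary layout, `split_block`, and the sample Row-IDs are hypothetical.

```python
def split_block(block):
    """Split one full block into two blocks that share the same TableID."""
    rows = sorted(block["rows"])             # rows stay in Row-ID order
    middle = len(rows) // 2
    first = {"table_id": block["table_id"], "rows": rows[:middle]}
    second = {"table_id": block["table_id"], "rows": rows[middle:]}
    return first, second

# Hypothetical full block holding ten (row_hash, uniqueness_value) Row-IDs.
full_block = {"table_id": 1234, "rows": [(h, 1) for h in range(10, 0, -1)]}
first, second = split_block(full_block)
print(len(first["rows"]), len(second["rows"]))
```

No rows are lost or duplicated, and every Row-ID in the first block sorts before every Row-ID in the second, so the AMP's lookups keep working after the split.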


Data Blocks Maximum Block Size has Changed (V14.10)

The maximum block size (before a split) was 255 sectors (127.5 KB), but
now it is 2047 sectors (approximately 1 MB).

Prior to Teradata 14.10, the maximum block size was 255 sectors or 127.5
KB. Starting with Teradata 14.10, the maximum block size has been
increased from 127.5 KB to approximately 1 MB. The maximum
user-specifiable size is 2047 sectors.
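
The arithmetic behind those figures is easy to check, assuming the 512-byte sectors this chapter uses. `block_size` is just a helper name chosen for this sketch.

```python
SECTOR_BYTES = 512  # sector size assumed throughout this chapter

def block_size(sectors):
    """Return (bytes, kilobytes) for a data block of the given sector count."""
    size_bytes = sectors * SECTOR_BYTES
    return size_bytes, size_bytes / 1024

for sectors in (255, 2047):
    size_bytes, size_kb = block_size(sectors)
    print(f"{sectors} sectors = {size_bytes} bytes = {size_kb} KB")
```

So 255 sectors is exactly 127.5 KB, while 2047 sectors is 1,048,064 bytes, or 1023.5 KB, which is one sector short of a full megabyte. That is why the new limit is described as approximately 1 MB.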


The New Block Split with Teradata V14.10

In Teradata V14.10, once this block reaches the maximum block size (on
this AMP), the AMP splits the block from one 2047 sector block to two
separate 1023.5 sector blocks. Blocks can vary in size from 1 sector to the
maximum size (255 or 2047). The maximum row size is 64,255 bytes. Rows
can vary in size within the block.


The Block Split with Even More Detail in Teradata V14.10

As data is added, Teradata continues to grow the block one 512 byte sector
at a time. Once the block grows to the maximum size, then Teradata will
split the single block into two separate blocks. The default for the split is
2047 sectors or 1 MB.


Teradata V14.10 Block Split Defaults

The maximum for all Teradata V14.10 data block splits is 2047 sectors or
approximately 1 MB, but the defaults differ based on whether the system
is an enterprise class system or an appliance.


There is One Master Index and Thousands of Cylinder Indexes

There is only one Master Index per AMP, and its purpose is to locate the
cylinder that holds a particular data block or table header. Then, each
cylinder has a Cylinder Index so the AMP can locate the exact location of a
particular data block or table header.
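
The two-level lookup can be sketched as two sorted lists searched in turn. The layout below is invented for illustration and is not the real on-disk format: each index entry is a hypothetical `(first key, target)` pair, a key is `(table_id, row_hash)`, and the cylinder numbers and block names are made up.

```python
from bisect import bisect_right

# Master Index (one per AMP): first key on each cylinder -> cylinder number.
MASTER_INDEX = [
    ((1234, 0x0000), 7),
    ((1234, 0x8000), 12),
    ((5678, 0x0000), 3),
]

# Cylinder Indexes (one per cylinder): first key in each block -> data block.
CYLINDER_INDEXES = {
    7: [((1234, 0x0000), "block A"), ((1234, 0x4000), "block B")],
    12: [((1234, 0x8000), "block C"), ((1234, 0xC000), "block D")],
    3: [((5678, 0x0000), "block E")],
}

def last_entry_at_or_before(entries, key):
    """Return the target of the last entry whose first key is <= key."""
    first_keys = [first_key for first_key, _ in entries]
    position = bisect_right(first_keys, key) - 1
    if position < 0:
        raise KeyError(key)
    return entries[position][1]

def locate_block(table_id, row_hash):
    """Master Index finds the cylinder, then its Cylinder Index finds the block."""
    key = (table_id, row_hash)
    cylinder = last_entry_at_or_before(MASTER_INDEX, key)
    return cylinder, last_entry_at_or_before(CYLINDER_INDEXES[cylinder], key)

print(locate_block(1234, 0x9ABC))
```

The Master Index stays small because it holds one entry per cylinder, and the finer detail is pushed down into each cylinder's own index.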


Each Table has a 48-bit TableID

Both blocks in Cylinder 1 above are from the same table. The first 32 bits
hold the same value, 1234, in both blocks, while the remaining 16 bits
identify whether the block is a Table Header or a Data Block.
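
Packing a 32-bit part and a 16-bit part into one 48-bit value takes only a shift and a mask. In this sketch the names `pack_table_id` and `unpack_table_id` are hypothetical, and the values 0 for a Table Header and 1024 for a Data Block are illustrative assumptions rather than something stated in this chapter.

```python
TABLE_HEADER = 0   # assumed 16-bit value marking a Table Header
DATA_BLOCK = 1024  # assumed 16-bit value marking a Data Block

def pack_table_id(table_number, kind):
    """Combine a 32-bit table number and a 16-bit kind into a 48-bit TableID."""
    if not 0 <= table_number < 2**32:
        raise ValueError("table_number must fit in 32 bits")
    if not 0 <= kind < 2**16:
        raise ValueError("kind must fit in 16 bits")
    return (table_number << 16) | kind

def unpack_table_id(table_id):
    """Split a 48-bit TableID back into (table_number, kind)."""
    return table_id >> 16, table_id & 0xFFFF

header_id = pack_table_id(1234, TABLE_HEADER)
data_id = pack_table_id(1234, DATA_BLOCK)
print(unpack_table_id(header_id), unpack_table_id(data_id))
```

Both IDs share the same first 32 bits, so the header and the data blocks of one table sort next to each other while remaining distinguishable.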
