Performance Tuning

Advanced DataStage Workshop
Module 05 Performance Tuning
Information Management
2010 IBM Corporation
Module Objectives
After completing this module, you should be able to:
Explain the performance tuning methodology
Selectively disable operator combination
Understand configuration file guidelines
Understand the impact of partitioning
Understand the impact of sorting
Understand the impact of the Transformer
Use the performance analyzer
Optimizing Performance
Processing large volumes of data in a short period of time requires optimizing all aspects of the job flow and environment for maximum throughput and performance:
Job design
Stage properties
DataStage parameters
Configuration file
Disk subsystems: RAID / SAN
Source and target databases
Network, etc.
Parallel Job Performance
Within DataStage examine (in order):

End-to-end process flow
  Intermediate results, sources / targets, disk usage
Configuration file(s) for each job
  Degree of parallelism
  Impact to overall system resources
  File systems mapping, scratch disk
Individual job design including shared containers
  Stages chosen, overall design approach
  Partitioning strategy
  Operator combination
  Buffering (as a last resort)
Parallel Job Performance
Ultimate job performance may be constrained by external sources / targets

  Disk subsystems, network, database, etc.
It may be appropriate to scale back the degree of parallelism to conserve resources
Performance Tuning Methodology

An iterative process
Test in isolation (nothing else should be running)
  DataStage server
  Source and target databases
Change ONE item at a time, then examine the impact
Use the job score to determine
  Number of processes generated
  Operator combination
  Framework-inserted sorting and partitioning
Use the DataStage performance monitor to verify
  Data distributions (partitioning)
  Throughput and bottlenecks
Use the performance analyzer to check other metrics

Using DataStage Director Job Monitor
Enable Show Instances to show data distribution (skew) across partitions

Best performance with even distribution
Be cautious with Rows/sec numbers calculated by Director (elapsed time of entire job, not per stage)
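The skew check described above can be made concrete with a little arithmetic. The following is a minimal sketch, assuming you copy the per-partition row counts for one stage from the monitor by hand; the function name and the sample counts are hypothetical and not part of DataStage.

```python
def partition_skew(row_counts):
    """Return (share of rows per partition, skew ratio).

    Skew ratio = largest partition / mean partition size.
    A ratio of 1.0 means the rows are spread perfectly evenly.
    """
    total = sum(row_counts)
    if total == 0:
        return [0.0] * len(row_counts), 1.0
    mean = total / len(row_counts)
    shares = [count / total for count in row_counts]
    return shares, max(row_counts) / mean


# Hypothetical counts for one stage on a 4-partition run.
even = [250_000, 250_000, 250_000, 250_000]
skewed = [700_000, 100_000, 100_000, 100_000]

for label, counts in (("even", even), ("skewed", skewed)):
    shares, ratio = partition_skew(counts)
    print(label, [f"{share:.0%}" for share in shares], f"skew ratio {ratio:.2f}")
```

A ratio well above 1.0 suggests that one partition is doing most of the work, so the other partitions sit idle while it finishes.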
Operator Combination
At run time, the DataStage parallel framework attempts to combine stages (operators) into a single process
Operator combination is intended to improve overall performance and lower resource usage
Combination only occurs between stages (operators) that:
Use the same partitioning method
  Repartitioning prevents operator combination between the producer and consumer stages
  Implicit repartitioning (sequential operators) prevents combination
Are combinable
  Set automatically within the stage / operator definition
  Can also be set within the stage's advanced properties
Tuning Operator Combination
To make debugging easier, selectively disable combination through the stage properties in Designer, so that you can tell which stage produced a warning or error log message
In some situations, disable all combination by setting $APT_DISABLE_COMBINATION = TRUE
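The two combination rules can be illustrated with a toy model. This is only a sketch under stated assumptions: it treats a job as a linear chain of stages, and the `Stage` class, its two flags and the stage names are hypothetical. It does not reproduce the framework's actual algorithm; the job score remains the authoritative source for what was combined.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    combinable: bool = True           # hypothetical flag: stage permits combination
    repartitions_input: bool = False  # hypothetical flag: data is repartitioned on the input link


def group_into_processes(stages, disable_combination=False):
    """Group a linear chain of stages into processes (toy model)."""
    groups = []
    for stage in stages:
        can_join = (
            bool(groups)
            and not disable_combination
            and stage.combinable
            and groups[-1][-1].combinable
            and not stage.repartitions_input
        )
        if can_join:
            groups[-1].append(stage)   # runs in the same process as its producer
        else:
            groups.append([stage])     # starts a new process
    return [[stage.name for stage in group] for group in groups]


flow = [
    Stage("read"),
    Stage("derive"),
    Stage("aggregate", repartitions_input=True),
    Stage("audit", combinable=False),
    Stage("write"),
]

print(group_into_processes(flow))
print(group_into_processes(flow, disable_combination=True))
```

With combination disabled, every stage gets its own process, which costs resources but makes each log message attributable to a single stage.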
Configuration File Guidelines
Minimize I/O overlaps across nodes

  If multiple file systems are shared across nodes, alter the order of the file systems within each node definition
  Pay particular attention to mapping file systems to physical controllers / drives with RAID / SAN
  Use local disks for scratch storage if possible
Named pools can be used to further separate I/O

  A file system in the "buffer" pool is only used for buffer overflow
  A file system in the "sort" pool is only used for sorting
In cluster / grid / MPP environments, named pools can be used to further control resources
Minimize data shipping, direct database connection, etc.
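To illustrate the first two guidelines, the sketch below generates the text of a parallel configuration file in which each node lists the shared file systems in a different (rotated) order and the scratch disks are assigned to named pools. The host name, node names and paths are hypothetical, and the layout follows the usual node / fastname / pools / resource form; check the generated text against your own installation before using it.

```python
def rotate(items, offset):
    """Return items shifted left by offset positions."""
    offset %= len(items)
    return items[offset:] + items[:offset]


def build_config(host, node_count, disks, scratch_by_pool):
    """Build configuration file text.

    disks:           shared file systems, listed in a different order per node
    scratch_by_pool: (pool name, path) pairs, e.g. ("sort", "/scratch/sort")
    """
    lines = ["{"]
    for index in range(node_count):
        lines.append(f'  node "node{index + 1}"')
        lines.append("  {")
        lines.append(f'    fastname "{host}"')
        lines.append('    pools ""')
        # Rotating the list means each node starts on a different file system.
        for path in rotate(disks, index):
            lines.append(f'    resource disk "{path}" {{pools ""}}')
        for pool, path in scratch_by_pool:
            lines.append(f'    resource scratchdisk "{path}" {{pools "{pool}"}}')
        lines.append("  }")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical host and paths.
print(build_config(
    host="etlhost",
    node_count=2,
    disks=["/data/fs1", "/data/fs2"],
    scratch_by_pool=[("", "/scratch/local"), ("sort", "/scratch/sort"), ("buffer", "/scratch/buffer")],
))
```

Generating the file rather than editing it by hand makes it easier to keep the rotation consistent when the number of nodes changes.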
Use Parallel Data Set
Use parallel Data Set to land intermediate results between parallel jobs
No conversion overhead, stored in native internal format
Retains data partitioning and sort order
Maximum performance through parallel I/O
Impact of Partitioning
Ensure data is as close to evenly distributed as possible

When business rules dictate otherwise, re-partition to a more balanced distribution as soon as possible to improve the performance of downstream stages
Minimize re-partitions by optimizing the flow to reuse upstream partitioning

Especially in grid / MPP / cluster environments
Know your data

Choose hash key columns that generate sufficient unique key combinations while meeting business requirements
Use SAME partitioning carefully

Maintains the same degree of parallelism
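The effect of hash key choice on distribution can be sketched outside DataStage. The example below is an assumption-laden illustration: it uses CRC32 as a stand-in hash function (not the hash the parallel framework uses), and the column names and values are hypothetical. It compares a key with two distinct values against a key with many distinct values on eight partitions.

```python
import zlib
from collections import Counter


def hash_partition(key, partitions):
    """Map a key to a partition number with a deterministic stand-in hash."""
    return zlib.crc32(str(key).encode("utf-8")) % partitions


def distribution(keys, partitions):
    """Count how many rows land in each partition."""
    counts = Counter(hash_partition(key, partitions) for key in keys)
    return [counts.get(p, 0) for p in range(partitions)]


rows = 10_000
partitions = 8

# Hypothetical low-cardinality key: only two distinct values.
country = ["US" if i % 2 == 0 else "DE" for i in range(rows)]
# Hypothetical high-cardinality key: one distinct value per row.
customer_id = list(range(rows))

print("country     ", distribution(country, partitions))
print("customer_id ", distribution(customer_id, partitions))
```

With only two distinct key values, at most two of the eight partitions can receive rows no matter how good the hash is, which is why the key columns need enough unique combinations.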
Impact of Sorting
Use parallel sort if possible (sort by key-column groups)

Where global sort is required, using parallel sort and sort merge collector is generally much faster than sequential sort
Complete sort is expensive

Rows cannot be output to the next stage until all rows are read and sorted
The pipeline is interrupted
Must use scratch disk for intermediate storage
Use the Restrict Memory Usage option to increase the amount of memory available for sorting per partition
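The idea behind a parallel sort followed by a sort merge collector can be sketched in a few lines. This is a conceptual illustration only, with hypothetical data: each partition is sorted independently (in a real job these sorts run at the same time), and the sorted partitions are then merged into one globally ordered stream without re-sorting everything.

```python
import heapq


def parallel_sort_then_merge(partitions):
    """Sort each partition on its own, then merge the sorted partitions."""
    # Step 1: independent per-partition sorts (these could run concurrently).
    sorted_partitions = [sorted(partition) for partition in partitions]
    # Step 2: merge step, which repeatedly takes the smallest head row.
    return list(heapq.merge(*sorted_partitions))


# Hypothetical key values spread over three partitions.
partitions = [[42, 7, 19], [3, 88, 15], [61, 1, 30]]
print(parallel_sort_then_merge(partitions))
```

The merge only compares the current head row of each partition, so its cost grows with the number of partitions rather than with the total number of rows held in memory.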
More Impact of Sorting
Minimize and combine sorts where possible

Use the Don't Sort, Previously Sorted option to leverage previous sort groupings
  Uses much less memory
  Outputs rows after each key-column group
Parallel Data Set maintains partitions and sort order across jobs
Stable sort is slower

Use only when needed to satisfy business requirement
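Why reusing a previous sort grouping needs less memory and can output rows after each key-column group is easier to see in code. The sketch below is a conceptual analogy, not DataStage behavior: it assumes the rows already arrive sorted on the first key and only sorts the second key within each group. The row layout and key names are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def subsort_within_groups(rows, group_key, sort_key):
    """Yield rows sorted by sort_key within each group.

    Assumes rows already arrive sorted (or at least grouped) by group_key.
    """
    for _, group in groupby(rows, key=group_key):
        # Only one key group is held in memory at a time, and its rows
        # are released before the next group is read.
        yield from sorted(group, key=sort_key)


# Hypothetical (customer, sequence) rows, already sorted by customer.
rows = [("A", 3), ("A", 1), ("B", 2), ("B", 1), ("C", 5)]
for row in subsort_within_groups(rows, itemgetter(0), itemgetter(1)):
    print(row)
```

A complete sort of the same data would have to read every row before emitting the first one, which is the pipeline interruption the slides warn about.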
Impact of Transformer
Minimize the number of Transformers

If possible, combine derivations from multiple Transformers
Use stage variables to perform calculations used by constraints and multiple derivations
Never use the BASIC Transformer
  Doesn't show up in the standard palette by default
  Intended to provide a migration path for existing DataStage Server applications that use DataStage BASIC routines
  Runs sequentially
  Invokes the DataStage server engine
  Extremely expensive (slow)!
Impact of Transformer vs. Other Stages
For optimum performance, consider more appropriate stages instead of a Transformer in parallel job flows:
Use a non-Transformer stage (e.g., Copy stage) to:
  Rename columns
  Drop columns
  Perform default type conversions
  Split output
Transformer constraints are FASTER than the Filter or Switch stages

Filter and Switch expressions are interpreted at runtime
Transformer constraints are compiled
Other Performance Tips
Remove unnecessary columns as early as possible within the flow

  Minimizes memory usage; use VARCHAR with a specified length
  Use SELECT with explicit column names, not SELECT *, when reading from a database
  Disable RCP (runtime column propagation) if not required
Sequential File stage file pattern reads start with a single cat process
  Setting $APT_IMPORT_PATTERN_USES_FILESET allows parallel I/O
  Dynamically builds a File Set header file for the list of files that match the pattern
Performance Analysis
Performance Analysis In the Past
Use the Director monitor to watch the throughput (rows/sec) during a job run
Compare job run durations
Turn on APT_PM_PLAYER_TIMING and APT_PM_PLAYER_MEMORY to report player calls and memory allocation
How This Fails You...
Long-running jobs couldn't be watched for record throughput changes throughout the job run
The job monitor didn't allow recording for playback
Job monitor throughput rates included time waiting for data
Couldn't determine what was happening on the machines
Performance Analyzer

Visualization tool that provides deeper insight into job runtime behavior
Part of the DataStage engine
Offers several categories of visualizations:
  Record throughput (rows/sec)
  CPU utilization
  Job timing & memory utilization
  Physical machine utilization
Performance data to be visualized can be:

Filtered in selected ways, including
  Hide startup processes
  Hide license operators
  Hide inserted operators
Isolated to selected stages (operators), partitions, and phases
Charts can be saved and printed

Enabling Performance Data Recording

Open the job in Designer
Select Record job performance data in Job Properties
Run your job. Performance collection has little impact on overall job performance
To view the results, click the Performance Analysis icon in Designer
Example Job
Job Timeline Chart

Figure: job timeline chart, with callouts for the stages in the job and the lengths of time. Chart categories listed: job timeline chart, job timing, throughput, CPU utilization, memory utilization, machine utilization.
Expanding the Job Timeline Chart

Figure: expanded job timeline chart, with callouts for click to expand, view by partition, and process phases.
Another Job Timeline Chart
Record Throughput Chart
Figure: record throughput chart, rows per second against the job timeline. In the example, the blue line is reading 350,000 records per second.
Run the mouse over a line to identify the stage port it represents
Displaying Selected Stages

Select the stages to display
Select stages in a partition to display
Select partitions to display
CPU Utilization Totals Chart
Figure: CPU utilization totals chart showing the amount of CPU time per stage. Blue shows CPU time; red shows system time.

Machine Utilization - CPU
Filters
Module Summary
After completing this module, you should be able to:
Explain the performance tuning methodology
Selectively disable operator combination
Understand configuration file guidelines
Understand the impact of partitioning
Understand the impact of sorting
Understand the impact of the Transformer
Use the performance analyzer