PowerCenter 7 Basics
Education Services
PC6B-20030512
Extract, Transform, and Load
Diagram: data is extracted from Operational Systems (RDBMS, Mainframe, Other), passed through ETL, and loaded into the Decision Support Data Warehouse
PowerCenter Architecture
PowerCenter 7 Architecture
Diagram: the PowerCenter Server reads heterogeneous Sources and writes heterogeneous Targets over native connections. It reaches the Repository over TCP/IP through the Repository Server and the Repository Agent, which uses a native connection to the Repository.
Client tools shown: Repository Manager, Designer, Workflow Manager, Workflow Monitor, Repository Server Administrative Console
Not shown: client ODBC connections for Source and Target metadata
PowerCenter 7 Components
PowerCenter Repository
PowerCenter Repository Server
PowerCenter Client
• Designer
• Repository Manager
• Repository Server Administration Console
• Workflow Manager
• Workflow Monitor
PowerCenter Server
External Components
• Sources
• Targets
Repository Topics
Diagram: Repository Server and Repository Agent
The Repository Server runs on the same system as the Repository Agent
Repository Server Administration Console
Repository Server Administration Console
Console Tree
Hypertext links to Repository maintenance tasks
Repository Management
Perform all Repository maintenance tasks through the Repository Server from the Repository Server Administration Console
Create the Repository Configuration
Select Repository Configuration and perform maintenance tasks:
• Create
• Delete
• Backup
• Copy from
• Disable
• Export Connection
• Make Global
• Notify Users
• Propagate
• Register
• Restore
• Un-Register
• Upgrade
Repository Manager
Repository Manager Interface
Navigator Window
Main Window
Dependency Window
Output Window
Users, Groups and Repository Privileges
Steps:
1. Create groups
2. Create users
3. Assign users to groups
4. Assign privileges to groups
5. Assign additional privileges to users (optional)
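The steps above suggest that a user ends up with the privileges of their groups plus any privileges assigned directly. A minimal Python sketch of that idea follows; the group names, user names, privilege strings, and the `effective_privileges` helper are all illustrative assumptions, not the Repository Manager's actual data model.

```python
# Hypothetical model: effective privileges are the union of the privileges
# of every group a user belongs to, plus privileges assigned to the user.
group_privileges = {
    "Developers": {"Use Designer", "Use Workflow Manager"},
    "Operators": {"Workflow Operator"},
}
user_groups = {"alice": ["Developers"], "bob": ["Operators"]}
user_privileges = {"alice": {"Browse Repository"}}  # optional per-user extras


def effective_privileges(user):
    """Return group privileges plus user-level additions for `user`."""
    result = set(user_privileges.get(user, set()))
    for group in user_groups.get(user, []):
        result |= group_privileges.get(group, set())
    return result


print(sorted(effective_privileges("alice")))
```

Assigning privileges at the group level keeps the per-user list short, which is why the per-user step is marked optional.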
Managing Privileges
Folder Permissions
Object Searching
(Menu: Analyze – Search)
Keyword search
• Limited to keywords previously defined in the Repository (via Warehouse Designer)
Search all
• Filter and search objects
Object Sharing
Reuse existing objects
Enforces consistency
Decreases development time
Share objects by using copies and shortcuts
COPY
• Copy object to another folder
• Changes to original object not captured
• Duplicates space
• Copy from shared or unshared folder
SHORTCUT
• Link to an object in another folder
• Dynamically reflects changes to original object
• Preserves space
• Created from a shared folder
Sample Metadata Extensions
Source Object Definitions
Source Analyzer
Designer Tools
Analyzer Window
Navigation Window
Methods of Analyzing Sources
Import from Database
Import from File
Import from Cobol File
Import from XML file
Create manually
Diagram: definitions created in the Source Analyzer are stored in the Repository
Analyzing Relational Sources
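Diagram: the Source Analyzer imports the definition (DEF) of a relational source (table, view, or synonym) over ODBC and stores it in the Repository via the Repository Server (TCP/IP) and the Repository Agent (native connection)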
Analyzing Relational Sources
Editing Source Definition Properties
Analyzing Flat File Sources
Diagram: the Source Analyzer reads a fixed-width or delimited flat file from a mapped drive, NFS mount, or local directory and stores its definition (DEF) in the Repository via the Repository Server (TCP/IP) and the Repository Agent (native connection)
Flat File Wizard
Three-step wizard
Columns can be renamed within wizard
Text, Numeric and Datetime datatypes are supported
Wizard ‘guesses’ datatype
XML Source Analysis
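Diagram: the Source Analyzer reads a .DTD file from a mapped drive, NFS mount, or local directory and stores the definition (DEF) in the Repository via the Repository Server (TCP/IP) and the Repository Agent (native connection); the DATA is separate from the definition
In addition to the DTD file, an XML Schema or XML file can be used as a Source Definition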
Analyzing VSAM Sources
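Diagram: the Source Analyzer reads a .CBL file from a mapped drive, NFS mount, or local directory and stores the definition (DEF) in the Repository via the Repository Server (TCP/IP) and the Repository Agent (native connection); the DATA is separate from the definition
Supported Numeric Storage Options: COMP, COMP-3, COMP-6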
VSAM Source Properties
Target Object Definitions
Creating Target Definitions
Automatic Target Creation
Drag-and-drop a Source Definition into the Warehouse Designer Workspace
Import Definition from Database
Can “reverse engineer” existing object definitions from a database system catalog or data dictionary
Diagram: the Warehouse Designer imports the definition (DEF) of a database table, view, or synonym over ODBC and stores it in the Repository via the Repository Server (TCP/IP) and the Repository Agent (native connection)
Manual Target Creation
1. Create empty definition
2. Add desired columns
Target Definition Properties
Creating Physical Tables
LOGICAL: Repository target table definitions
PHYSICAL: Target database tables
Creating Physical Tables
Create tables that do not already exist in target database
Connect - connect to the target database
Generate SQL file - create DDL in a script file
Edit SQL file - modify DDL script as needed
Execute SQL file - create physical tables in target database
Transformation Types
Transformation Views
A transformation has three views:
Iconized - shows the transformation in relation to the rest of the mapping
Normal - shows the flow of data through the transformation
Edit - shows transformation ports and properties; allows editing
Edit Mode
Allows users with folder “write” permissions to change or create transformation ports and properties
Define transformation level properties
Define port level handling
Enter comments
Make reusable
Switch between transformations
Expression Transformation
Passive Transformation
Connected
Ports
• Mixed
• Variables allowed
Click here to invoke the Expression Editor
Create expression in an output or variable port
Usage
• Perform majority of data manipulation
Expression Editor
An expression formula is a calculation or conditional statement
Used in Expression, Aggregator, Rank, Filter, Router, Update Strategy
Performs calculation based on ports, functions, operators, variables, literals, constants and return values from other transformations
Informatica Functions - Samples
Character Functions
Used to manipulate character data
ASCII
CHR
CHRCODE - returns the numeric value (ASCII or Unicode) of the first character of the string passed to this function
CONCAT - for backwards compatibility only; use || instead
INITCAP
INSTR
LENGTH
LOWER
LPAD
LTRIM
RPAD
RTRIM
SUBSTR
UPPER
REPLACESTR
REPLACECHR
Informatica Functions
Date Functions
Used to round, truncate, or compare dates; extract one part of a date; or perform arithmetic on a date
To pass a string to a date function, first use the TO_DATE function to convert it to a date/time datatype
ADD_TO_DATE
DATE_COMPARE
DATE_DIFF
GET_DATE_PART
LAST_DAY
ROUND (date)
SET_DATE_PART
TO_CHAR (date)
TRUNC (date)
Informatica Functions
Numerical Functions
Used to perform mathematical operations on numeric data
ABS
CEIL
CUME
EXP
FLOOR
LN
LOG
MOD
MOVINGAVG
MOVINGSUM
POWER
ROUND
SIGN
SQRT
TRUNC
Scientific Functions
Used to calculate geometric values of numeric data
COS
COSH
SIN
SINH
TAN
TANH
Informatica Functions
Special Functions
Used to handle specific conditions within a session; search for certain values; test conditional statements
ERROR
ABORT
DECODE
IIF - IIF(Condition,True,False)
Encoding Functions
SOUNDEX
Expression Validation
Variable Ports
Use to simplify complex expressions
• e.g. - create and store a depreciation formula to be referenced more than once
Use in another variable port or an output port expression
Local to the transformation (a variable port cannot also be an input or output port)
Available in the Expression, Aggregator and Rank transformations
Informatica Data Types
NATIVE DATATYPES
• Specific to the source and target database types
• Display in source and target tables within Mapping Designer
TRANSFORMATION DATATYPES
• PowerMart / PowerCenter internal datatypes based on ANSI SQL-92
• Display in transformations within Mapping Designer
Mapping Designer
Transformation Toolbar
Mapping List
Iconized Mapping
Pre-SQL and Post-SQL Rules
Data Flow Rules
Each Source Qualifier starts a single data stream (a dataflow)
Transformations can send rows to more than one transformation (split one data flow into multiple pipelines)
Two or more data flows can meet together if (and only if) they originate from a common active transformation
• Cannot add an active transformation into the mix
Diagram: merging flows that pass through Passive transformations is ALLOWED; merging flows after an Active transformation is DISALLOWED
Example holds true with Normalizer in lieu of Source Qualifier. Exceptions are: Mapplet Input and Joiner transformations
Connection Validation
Mapping Validation
Mappings must:
• Be valid for a Session to run
• Be end-to-end complete and contain valid expressions
• Pass all data flow rules
Mappings are always validated when saved; can be validated without being saved
Output Window will always display reason for invalidity
Workflows
By the end of this section, you will be familiar with:
The Workflow Manager GUI
Workflow Schedules
Setting up Server Connections
Relational, FTP and External Loader
Task Tool Bar
Workflow Designer Tools
Workspace
Navigator Window
Output Window
Status Bar
Workflow Manager Tools
Workflow Designer
• Maps the execution order and dependencies of Sessions, Tasks and Worklets, for the Informatica Server
Task Developer
• Create Session, Shell Command and Email tasks
• Tasks created in the Task Developer are reusable
Worklet Designer
• Creates objects that represent a set of tasks
• Worklet objects are reusable
Workflow Structure
A Workflow is a set of instructions for the Informatica Server to perform data transformation and load
Combines the logic of Session Tasks, other types of Tasks and Worklets
The simplest Workflow is composed of a Start Task, a Link and one other Task
Diagram: Start Task -> Link -> Session Task
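One way to picture the structure is as a small directed graph in which Links connect Tasks. The Python sketch below is only an analogy under that assumption: the task names and the `run_order` helper are hypothetical, and the real Workflow Manager stores workflows in the Repository rather than in code.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_order(links):
    """Order tasks so every task comes after the tasks that link into it.

    `links` is a list of (from_task, to_task) pairs, one per Link.
    """
    predecessors = {}
    for src, dst in links:
        predecessors.setdefault(src, set())
        predecessors.setdefault(dst, set()).add(src)
    return list(TopologicalSorter(predecessors).static_order())


# The simplest Workflow: a Start Task, a Link and one other Task.
simplest = [("Start", "s_load_customers")]
print(run_order(simplest))  # ['Start', 's_load_customers']
```

Adding more Links to the list extends the same structure to multi-task workflows.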
Workflow Scheduler Objects
Server Connections
Configure Server data access connections
− Used in Session Tasks
Configure:
1. Relational
2. MQ Series
3. FTP
4. Custom
5. External Loader
Relational Connections (Native)
Create a relational (database) connection
− Instructions to the Server to locate relational tables
− Used in Session Tasks
Relational Connection Properties
Define native relational (database) connection
User Name/Password
Database connectivity information
Rollback Segment assignment (optional)
FTP Connection
Create an FTP connection
− Instructions to the Server to ftp flat files
− Used in Session Tasks
External Loader Connection
Create an External Loader connection
− Instructions to the Server to invoke database bulk loaders
− Used in Session Tasks
Task Developer
Create basic Reusable “building blocks” – to use in any Workflow
Reusable Tasks
• Session - Set of instructions to execute Mapping logic
• Command - Specify OS shell / script command(s) to run during the Workflow
• Email - Send email at any point in the Workflow
Session Task
Server instructions to run the logic of ONE specific Mapping
• e.g. - source and target data location specifications, memory allocation, optional Mapping overrides, scheduling, processing and load instructions
Becomes a component of a Workflow (or Worklet)
If configured in the Task Developer, the Session Task is reusable (optional)
Command Task
Specify one (or more) Unix shell or DOS (NT, Win2000) commands to run at a specific point in the Workflow
Becomes a component of a Workflow (or Worklet)
If configured in the Task Developer, the Command Task is reusable (optional)
Additional Workflow Components
Developing Workflows
Create a new Workflow in the Workflow Designer
Customize Workflow name
Select a Server
Workflow Properties
Customize Workflow Properties
Select a Workflow Schedule (optional)
May be reusable or non-reusable
Workflow Properties
Building Workflow Components
Add Sessions and other Tasks to the Workflow
Connect all Workflow components with Links
Save the Workflow
Start the Workflow
Session Tasks
Session Task
Session Task - General
Session Task - Properties
Session Task – Config Object
Session Task - Sources
Session Task - Targets
Session Task - Transformations
Allows overrides of some transformation properties
Does not change the properties in the Mapping
Session Task - Partitions
Monitor Workflows
Monitor Workflows
The Workflow Monitor is the tool for monitoring Workflows and Tasks
Review details about a Workflow or Task in two views
• Gantt Chart view
• Task view
Monitoring Workflows
Perform operations in the Workflow Monitor
• Restart -- restart a Task, Workflow or Worklet
• Stop -- stop a Task, Workflow, or Worklet
• Abort -- abort a Task, Workflow, or Worklet
• Resume -- resume a suspended Workflow after a failed Task is corrected
Monitoring filters can be set using drop down menus
• Minimizes items displayed in Task View
Debugger Features
Debugger Interface
Debugger windows & indicators
• Debugger Mode indicator
• Solid yellow arrow: Current Transformation indicator
• Flashing yellow SQL indicator
• Debugger Log tab
• Transformation Instance Data window
Filter Transformation
Active Transformation
Connected
Ports
• All input / output
Usage
• Filter rows from flat file sources
• Single pass source(s) into multiple targets
Aggregator Transformation
Active Transformation
Connected
Ports
• Mixed
• Variables allowed
• Group By allowed
Create expressions in output or variable ports
Usage
• Standard aggregations
Informatica Functions
Aggregate Functions
Return summary values for non-null data in selected ports
• Use only in Aggregator transformations
• Use in output ports only
• Calculate a single value (and row) for all records in a group
• Only one aggregate function can be nested within an aggregate function
• Conditional statements can be used with these functions
AVG
COUNT
FIRST
LAST
MAX
MEDIAN
MIN
PERCENTILE
STDDEV
SUM
VARIANCE
Aggregate Expressions
Aggregate functions are supported only in the Aggregator Transformation
Conditional Aggregate expressions are supported
Conditional SUM format: SUM(value, condition)
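To illustrate what a conditional aggregate such as SUM(value, condition) computes, here is a small Python analogy. It assumes that only rows meeting the condition contribute and that NULL (None) values are skipped; the `conditional_sum` helper and the sample rows are hypothetical, not Informatica code.

```python
def conditional_sum(rows, value, condition):
    """Sum value(row) over the rows for which condition(row) is true.

    None values are skipped, mirroring "summary values for non-null data".
    """
    total = 0
    for row in rows:
        v = value(row)
        if v is not None and condition(row):
            total += v
    return total


rows = [
    {"region": "EAST", "amount": 100},
    {"region": "WEST", "amount": 250},
    {"region": "EAST", "amount": None},
    {"region": "EAST", "amount": 40},
]

east_total = conditional_sum(
    rows,
    value=lambda r: r["amount"],
    condition=lambda r: r["region"] == "EAST",
)
print(east_total)  # 140
```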
Aggregator Properties
Instructs the Aggregator to expect the data to be sorted
Set Aggregator cache sizes (on Informatica Server machine)
Sorted Data
Incremental Aggregation
Trigger in Session Properties, Performance Tab
Diagram: MTD calculation
Best Practice is to copy these files in case a rerun of data is ever required. Reinitialize when no longer needed, e.g. at the beginning of new month processing
Joiner Transformation
Homogeneous Joins
Joins that can be performed with a SQL SELECT statement:
Source Qualifier contains a SQL join
Heterogeneous Joins
Joiner Transformation
Active Transformation
Connected
Ports
• All input or input / output
• “M” denotes port comes from master source
Specify the Join condition
Usage
• Join two flat files
• Join two tables from different databases
• Join a flat file with a relational table
Joiner Conditions
Multiple join conditions are supported
Joiner Properties
Join types:
• “Normal” (inner)
• Master outer
• Detail outer
• Full outer
Set Joiner Cache
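The four join types can be sketched in Python as follows. The slides do not define them, so the semantics in the docstring reflect my understanding of the Joiner (master outer keeps every detail row, detail outer keeps every master row) and should be treated as an assumption; the `joiner` function and sample rows are hypothetical.

```python
def joiner(master, detail, key, join_type="normal"):
    """Join two lists of dicts on `key`.

    normal       - only rows that match in both sources
    master outer - all detail rows plus matching master rows
    detail outer - all master rows plus matching detail rows
    full outer   - all rows from both sources
    Unmatched rows simply lack the other side's columns (NULL in a database).
    """
    out = []
    matched_master = set()
    for d in detail:
        hits = [i for i, m in enumerate(master) if m[key] == d[key]]
        for i in hits:
            matched_master.add(i)
            out.append({**master[i], **d})
        if not hits and join_type in ("master outer", "full outer"):
            out.append(dict(d))
    if join_type in ("detail outer", "full outer"):
        for i, m in enumerate(master):
            if i not in matched_master:
                out.append(dict(m))
    return out


master = [{"id": 1, "name": "A"}, {"id": 2, "name": "B"}]
detail = [{"id": 2, "qty": 5}, {"id": 3, "qty": 7}]
for jt in ("normal", "master outer", "detail outer", "full outer"):
    print(jt, len(joiner(master, detail, "id", jt)))
```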
Sorter Transformation
Lookup Transformation
How a Lookup Transformation Works
For each Mapping row, one or more port values are looked up in a database table
If a match is found, one or more table values are returned to the Mapping. If no match is found, NULL is returned
Lookup Transformation
Looks up values in a database table and provides data to other components in a Mapping
Passive Transformation
Connected / Unconnected
Ports
• Mixed
• “L” denotes Lookup port
• “R” denotes port used as a return value (unconnected Lookup only)
Specify the Lookup Condition
Usage
• Get related values
• Verify if records exist or if data has changed
Lookup Properties
Override Lookup SQL option
Toggle caching
Native Database Connection Object name
Additional Lookup Properties
Set cache directory
Make cache persistent
Set Lookup cache sizes
Lookup Conditions
Multiple conditions are supported
To Cache or not to Cache?
Caching can significantly impact performance
Cached
• Lookup table data is cached locally on the Server
• Mapping rows are looked up against the cache
• Only one SQL SELECT is needed
Uncached
• Each Mapping row needs one SQL SELECT
Rule Of Thumb: Cache if the number (and size) of records in the Lookup table is small relative to the number of mapping rows requiring lookup
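The trade-off can be made concrete by counting SELECT statements. The Python sketch below uses a hypothetical `LookupTable` stand-in that only counts queries; it is not how the Server implements its cache, just an illustration of "one SELECT" versus "one SELECT per row".

```python
class LookupTable:
    """Stand-in for a database table that counts the SELECTs it receives."""

    def __init__(self, rows):
        self._rows = rows
        self.selects = 0

    def select_all(self):
        self.selects += 1
        return dict(self._rows)

    def select_one(self, key):
        self.selects += 1
        return self._rows.get(key)  # None stands for NULL when no match


def lookup_uncached(table, keys):
    """One SELECT per mapping row."""
    return [table.select_one(k) for k in keys]


def lookup_cached(table, keys):
    """One SELECT to build the cache, then in-memory lookups."""
    cache = table.select_all()
    return [cache.get(k) for k in keys]


data = {1: "gold", 2: "silver", 3: "bronze"}
mapping_keys = [1, 2, 2, 3, 99] * 1000  # 5,000 mapping rows

uncached_table, cached_table = LookupTable(data), LookupTable(data)
uncached_result = lookup_uncached(uncached_table, mapping_keys)
cached_result = lookup_cached(cached_table, mapping_keys)
print(uncached_table.selects, cached_table.selects)  # 5000 1
```

Both strategies return the same values; the cached one pays for reading the whole table once, which is why a small lookup table relative to the number of mapping rows favors caching.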
Target Options
Target Properties
Session Task
Select target instance
Row loading operations
Error handling
Properties Tab
Constraint-based Loading
Maintains referential integrity in the Targets
Target keys in both examples: Target 1 (pk1), Target 2 (fk1, pk2), Target 3 (fk2)
Example 1
With only One Active source, rows for Targets 1-3 will be loaded properly and maintain referential integrity
Example 2
With Two Active sources, it is not possible to control whether rows for Target 3 will be loaded before or after those for Target 2
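The ordering idea behind Example 1 (parents before children) can be sketched as a topological sort over foreign-key references. This is only an analogy in Python: the target names are hypothetical, and it does not model the Server's row-level behaviour or the two-active-source case in Example 2.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each target maps to the set of targets its foreign keys reference.
references = {
    "TARGET_1": set(),         # pk1
    "TARGET_2": {"TARGET_1"},  # fk1, pk2
    "TARGET_3": {"TARGET_2"},  # fk2
}

# A referenced (parent) target always comes before the target that references it.
load_order = list(TopologicalSorter(references).static_order())
print(load_order)  # ['TARGET_1', 'TARGET_2', 'TARGET_3']
```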
Update Strategy Transformation
Used to specify how each individual row will be used to update target tables (insert, update, delete, reject)
Active Transformation
Connected
Ports
• All input / output
Usage
• Updating Slowly Changing Dimensions
• IIF or DECODE logic determines how to handle the record
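The flagging logic can be sketched in Python as a function that returns one of four row flags. The numeric values follow the DD_INSERT, DD_UPDATE, DD_DELETE and DD_REJECT constants as I recall them from the product (the slides do not list them), and the `customer_id` and `deleted` fields and the rules themselves are hypothetical.

```python
# Row flags, assumed to match the product's DD_* constants.
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3


def flag_row(row, existing_keys):
    """Decide how a row should be applied to the target.

    Equivalent in spirit to nested IIF logic in an Update Strategy expression.
    """
    if row["customer_id"] is None:
        return DD_REJECT  # no key, nothing to match on
    if row.get("deleted"):
        return DD_DELETE
    if row["customer_id"] in existing_keys:
        return DD_UPDATE
    return DD_INSERT


existing = {101, 102}
incoming = [
    {"customer_id": 101},
    {"customer_id": 200},
    {"customer_id": 102, "deleted": True},
    {"customer_id": None},
]
flags = [flag_row(r, existing) for r in incoming]
print(flags)  # [1, 0, 2, 3]
```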
Target Refresh Strategies
Router Transformation
Active Transformation
Connected
Ports
• All input/output
• Specify filter conditions for each Group
Usage
• Link source data in one pass to multiple filter conditions
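A rough Python analogy for routing in one pass: each row is tested against every group's condition and copied to each group it satisfies, with unmatched rows going to a default group. The default-group behaviour reflects my understanding of the Router rather than anything stated on the slide, and the `route` function and group names are hypothetical.

```python
def route(rows, groups):
    """Send each row to every group whose condition it meets, in one pass.

    Rows that meet no condition go to the DEFAULT group.
    """
    routed = {name: [] for name in groups}
    routed["DEFAULT"] = []
    for row in rows:
        matched = False
        for name, condition in groups.items():
            if condition(row):
                routed[name].append(row)
                matched = True
        if not matched:
            routed["DEFAULT"].append(row)
    return routed


orders = [{"id": 1, "amount": 50}, {"id": 2, "amount": 500}, {"id": 3, "amount": 5000}]
groups = {
    "LARGE": lambda r: r["amount"] >= 1000,
    "MEDIUM": lambda r: 100 <= r["amount"] < 1000,
}
result = route(orders, groups)
print({name: [r["id"] for r in rows] for name, rows in result.items()})
```

Reading the source once and fanning out is the advantage over several Filter transformations, each of which would need its own copy of the data flow.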
Router Transformation in a Mapping
Parameters and Variables
System Variables
SYSDATE - Provides current datetime on the Informatica Server machine
• Not a static value
Mapping Parameters and Variables
Sample declarations
• User-defined names
• Set the appropriate aggregation type
• Set optional Initial Value
Unconnected Lookup
Will be physically “unconnected” from other transformations
• There can be NO data flow arrows leading to or from an unconnected Lookup
Lookup function can be set within any transformation that supports expressions
Lookup data is called from the point in the Mapping that needs it
Diagram: a function in the Aggregator calls the unconnected Lookup
Conditional Lookup Technique
Two requirements:
• Must be Unconnected (or “function mode”) Lookup
• Lookup function used within a conditional statement
IIF ( ISNULL(customer_id),:lkp.MYLOOKUP(order_no))
• Condition: ISNULL(customer_id) (true for 2 percent of all rows)
• Lookup function: :lkp.MYLOOKUP (called only when condition is true)
• Row keys: order_no (passed to Lookup)
Net savings = 490,000 lookups
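The arithmetic behind the savings can be checked with a short Python simulation. The total of 500,000 rows is an assumption inferred from the figures above (a 2 percent hit rate and 490,000 lookups saved); `my_lookup`, `resolve_customer`, and the generated rows are hypothetical stand-ins, and the else-branch is added only so the function returns something for the remaining rows.

```python
TOTAL_ROWS = 500_000  # assumption: implied by 2 percent and 490,000 saved

calls = {"lookup": 0}


def my_lookup(order_no):
    """Stand-in for :lkp.MYLOOKUP(order_no); counts how often it is called."""
    calls["lookup"] += 1
    return f"CUST-{order_no}"


def resolve_customer(row):
    """Call the lookup only when customer_id is NULL (None)."""
    if row["customer_id"] is None:
        return my_lookup(row["order_no"])
    return row["customer_id"]  # hypothetical else-branch


# Every 50th row has a NULL customer_id, i.e. 2 percent of all rows.
rows = ({"order_no": n, "customer_id": None if n % 50 == 0 else n}
        for n in range(TOTAL_ROWS))
for row in rows:
    resolve_customer(row)

savings = TOTAL_ROWS - calls["lookup"]
print(calls["lookup"], savings)  # 10000 490000
```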
Connected vs. Unconnected Lookups
CONNECTED LOOKUP
• Part of the mapping data flow
• Returns multiple values (by linking output ports to another transformation)
• Executed for every record passing through the transformation
• More visible, shows where the lookup values are used
• Default values are used
UNCONNECTED LOOKUP
• Separate from the mapping data flow
• Returns one value (by checking the Return (R) port option for the output port that provides the return value)
• Only executed when the lookup function is called
• Less visible, as the lookup is called from an expression within another transformation
• Default values are ignored
Heterogeneous Targets
Definition: Heterogeneous Targets
Step One: Identify Different Target Types
Oracle table
Oracle table
• Tables are EITHER in two different databases, or require different (schema-specific) connect strings
Flat file
• One target is a flatfile load
Step Two: Different Database Connections
Flatfile requires separate location information
Target Type Override (Conversion)
CAUTION: If target definition datatypes are not compatible with datatypes in newly selected database type, modify the target definition
Mapplet Designer
Mapplet Output Transformation
Mapplet Advantages
Active and Passive Mapplets
Using Active and Passive Mapplets
Multiple Passive Mapplets can populate the same target instance
Reusable Transformations
Reusable Transformations
Define once - reuse many times
Reusable Transformations
• Can be a copy or a shortcut
• Edit Ports only in Transformation Developer
• Can edit Properties in the mapping
• Instances dynamically inherit changes
• Be careful: It is possible to invalidate mappings by changing reusable transformations
Transformations that cannot be made reusable
• Source Qualifier
• ERP Source Qualifier
• Normalizer used to read a Cobol data source
Promoting a Transformation to Reusable
Place a check in the “Make reusable” box
This action is not reversible
Sequence Generator Transformation
Passive Transformation
Connected
Ports
• Two predefined output ports, NEXTVAL and CURRVAL
• No input ports allowed
Usage
• Generate sequence numbers
• Shareable across mappings
Sequence Generator Properties
Number of Cached Values
Dynamic Lookup
Additional Lookup Cache Options
Make cache persistent
Dynamic Lookup Cache Advantages
Update Dynamic Lookup Cache
NewLookupRow port values
• 0 – static lookup, cache is not changed
• 1 – insert row to Lookup cache
• 2 – update row in Lookup cache
Does NOT change row type
Use the Update Strategy transformation before or after Lookup, to flag rows for insert or update to the target
Ignore NULL Property
• Per port
• Ignore NULL values from input row and update the cache using only non-NULL values from input
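A minimal Python sketch of how the NewLookupRow values and the Ignore NULL property could interact is shown below. It assumes the cache is a dictionary keyed by the lookup condition column; the `dynamic_lookup` function and the column names are hypothetical, and the Server's real cache comparison may differ in detail.

```python
def dynamic_lookup(cache, key, row, ignore_null=True):
    """Apply one input row to the cache and return a NewLookupRow value.

    0 - cache is not changed
    1 - row inserted into the cache
    2 - row updated in the cache
    """
    if key not in cache:
        cache[key] = dict(row)
        return 1
    current = cache[key]
    changes = {
        column: value
        for column, value in row.items()
        if current.get(column) != value and not (ignore_null and value is None)
    }
    if not changes:
        return 0
    current.update(changes)
    return 2


cache = {}
r1 = dynamic_lookup(cache, 1, {"name": "Ann", "city": "Oslo"})  # new key
r2 = dynamic_lookup(cache, 1, {"name": "Ann", "city": "Oslo"})  # identical
r3 = dynamic_lookup(cache, 1, {"name": "Ann", "city": None})    # NULL ignored
r4 = dynamic_lookup(cache, 1, {"name": "Ann", "city": "Rome"})  # changed
print(r1, r2, r3, r4)  # 1 0 0 2
```

The returned value only describes what happened to the cache; flagging the row for the target is still left to an Update Strategy transformation.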
Example: Dynamic Lookup Configuration
Multi-Task Workflows - Sequential
Multi-Task Workflows - Concurrent
Multi-Task Workflows - Combined
Rank Transformation
Filters the top or bottom range of records
Active Transformation
Connected
Ports
• Mixed
• One pre-defined output port RANKINDEX
• Variables allowed
• Group By allowed
Usage
• Select top/bottom
• Number of records
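As an illustration of top/bottom selection with Group By, the Python sketch below keeps the first N rows per group and attaches a RANKINDEX value. The `rank` function, its parameters, and the sample data are hypothetical; only the RANKINDEX port name comes from the slide.

```python
from collections import defaultdict


def rank(rows, group_by, rank_port, top=True, number_of_ranks=2):
    """Keep the top (or bottom) N rows per group and add a RANKINDEX."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_by]].append(row)
    out = []
    for key in groups:  # dicts preserve insertion order
        ordered = sorted(groups[key], key=lambda r: r[rank_port], reverse=top)
        for index, row in enumerate(ordered[:number_of_ranks], start=1):
            out.append({**row, "RANKINDEX": index})
    return out


sales = [
    {"region": "EAST", "rep": "A", "total": 300},
    {"region": "EAST", "rep": "B", "total": 900},
    {"region": "EAST", "rep": "C", "total": 500},
    {"region": "WEST", "rep": "D", "total": 100},
]
top_two = rank(sales, group_by="region", rank_port="total")
print([(r["region"], r["rep"], r["RANKINDEX"]) for r in top_two])
```

Because fewer rows come out than went in, this behaviour is consistent with the transformation being listed as Active.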
Normalizer Transformation
Active Transformation
Connected
Ports
• Input / output or output
Usage
• Required for VSAM Source definitions
• Normalize flat file or relational source definitions
• Generate multiple records from one record
Normalizer Transformation
Turn one row
YEAR,ACCOUNT,MONTH1,MONTH2,MONTH3, … MONTH12
1997,Salaries,21000,21000,22000,19000,23000,26000,29000,29000,34000,34000,40000,45000
1997,Benefits,4200,4200,4400,3800,4600,5200,5800,5800,6800,6800,8000,9000
1997,Expenses,10500,4000,5000,6500,3000,7000,9000,4500,7500,8000,8500,8250
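A short Python sketch of the reshaping, using the three sample rows above: each input row with twelve month columns becomes twelve output rows. The output layout (year, account, month number, amount) and the `normalize` helper are assumptions made for illustration, not the Normalizer's actual port names.

```python
import csv
import io

SOURCE = """1997,Salaries,21000,21000,22000,19000,23000,26000,29000,29000,34000,34000,40000,45000
1997,Benefits,4200,4200,4400,3800,4600,5200,5800,5800,6800,6800,8000,9000
1997,Expenses,10500,4000,5000,6500,3000,7000,9000,4500,7500,8000,8500,8250
"""


def normalize(text):
    """Turn each YEAR,ACCOUNT,MONTH1..MONTH12 row into one row per month."""
    out = []
    for record in csv.reader(io.StringIO(text)):
        year, account, *months = record
        for month_number, amount in enumerate(months, start=1):
            out.append((int(year), account, month_number, int(amount)))
    return out


rows = normalize(SOURCE)
print(len(rows))  # 36
print(rows[0])    # (1997, 'Salaries', 1, 21000)
print(rows[-1])   # (1997, 'Expenses', 12, 8250)
```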
Stored Procedure Transformation
Passive Transformation
Connected/Unconnected
Ports
• Mixed
• “R” denotes port will return a value from the stored function to the next transformation
Usage
• Perform transformation logic outside PowerMart / PowerCenter
External Procedure Transformation (TX)
Calls a passive procedure defined in a dynamic linked library (DLL) or shared library
Passive Transformation
Connected/Unconnected
Ports
• Mixed
• “R” designates return value port of an unconnected transformation
Usage
• Perform transformation logic outside PowerMart / PowerCenter
Option to allow partitioning
Advanced TX Transformation
Calls an active procedure defined in a dynamic linked library (DLL) or shared library
Active Transformation
Connected Mode only
Ports
• Mixed
Usage
• Perform transformation logic outside PowerMart / PowerCenter
• Sorting, Aggregation
Transaction Control Transformation
Passive Transformation
Connected Mode Only
Ports
• Input and Output
Properties
• Continue
• Commit Before
• Commit After
• Rollback Before
• Rollback After
Transaction Control Functionality
Commit Types
• Target Based Commit - Commit based on “approximate” number of records written to target
• Source Based Commit - Ensures that a source record is committed in all targets
• User Defined Commit - Uses Transaction Control Transform to specify commits and rollbacks in the mapping based on conditions
Set the Commit Type (and other specifications) in the Transaction Control Condition
Versioning
Informatica Business Analytics Suite
Modular Plug-&-Play Approach
• Packaged Analytic Solutions
• Custom Built Analytic Solutions
Informatica Warehouses / Marts
Informatica Warehouse™
Customer Relationship: Sales, Marketing, Service, Web
Finance: G/L, Receivables, Payables, Profitability
Human Resources: Compensation, Scorecard
Supply Chain: Planning, Sourcing, Inventory, Quality
Common Dimensions: Customer, Product, Supplier, Geography, Organization, Time, Employee
Inside the Informatica Warehouse
Diagram: Business Intelligence sits on top of the Informatica Warehouse™ (Analytic Data Model, Advanced Calculation Engine), which is fed through the Analytic Bus™ and Business Adapters™ from SAP, ORCL, i2, SEBL, PSFT and Custom sources
Business Adapters™ (Extract)
• Data Source Connectivity with Minimal Load
• Structural/Functional Knowledge of Sources
Analytic Bus™ (Transform)
• Transaction consolidation and standardization
• Source independent interface
Load
dimensions
Advanced Calculation Engine
• Pre-aggregations for rapid query response
• Complex calculation metrics (e.g. statistical)
PowerConnect Products