Unauthorized reproduction or distribution prohibited. Copyright © 2012, Informatica and/or its affiliates.
Informatica PowerCenter 9x
Level One Developer
Student Guide
Version: PowerCenter 9x Level One Developer 201201
Preface
This course introduces attendees to the PowerCenter Designer, Workflow Manager,
and Workflow Monitor tools. Attendees create transformations, mappings, reusable
objects, sessions, and workflows to extract, transform, and load data. They develop
cleansing, formatting, sorting, and aggregating procedures, and learn how to use
routers, update strategies, parameters and variables, and overrides. The course covers
many types of lookups, such as cached, persistent, dynamic, and multiple-row-return
lookups. Attendees also create workflow tasks that define a set of instructions for
executing the ETL.
Prerequisites:
Prerequisites include basic familiarity with the Windows GUI, at least two years’ work
experience, and some knowledge of SQL.
Course Objectives:
After successfully completing this course, students should be able to:
• Use Informatica Support to resolve questions and problems with PowerCenter 9.x.
• Use PowerCenter 9.x Designer to build mappings that extract data from a source
to a target, transforming it as necessary.
• Use PowerCenter transformations to cleanse, format, join, aggregate, and route
data to the appropriate targets.
• Perform error handling/trapping using PowerCenter mappings.
• Use PowerCenter 9.x Workflow Manager to build and run a workflow that
executes a session associated with a mapping.
• Design and build simple mappings and workflows based on essential business
needs.
• Perform basic troubleshooting using PowerCenter logs and the debugger.
Audience:
This course is designed for database developers with little or no experience of
PowerCenter.
Document Conventions
This guide uses the following formatting conventions:
If you see: >
    It means: Indicates a submenu to navigate to.
    Example: Click Repository > Connect. In this example, you should click the
    Repository menu or button and choose Connect.
If you see: boldfaced text
    It means: Indicates text you need to type or enter.
    Example: Click the Rename button and name the new source definition S_EMPLOYEE.
If you see: UPPERCASE
    It means: Database tables and column names are shown in all UPPERCASE.
    Example: T_ITEM_SUMMARY
If you see: italicized text
    It means: Indicates a variable you must replace with specific information.
    Example: Connect to the Repository using the assigned login_id.
If you see: Note:
    It means: The following paragraph provides additional facts.
    Example: Note: You can select multiple objects to import by using the Ctrl key.
If you see: Tip:
    It means: The following paragraph provides suggested uses or a Velocity best
    practice.
    Example: Tip: The m_ prefix for a mapping name is…
Other Informatica Resources
In addition to the student guides, Informatica provides these other resources:
Informatica Documentation
Informatica Customer Portal
Informatica web site
Informatica Developer Network
Informatica Knowledge Base
Informatica Professional Certification
Informatica Technical Support
Obtaining Informatica Documentation
You can access Informatica documentation from the product CD or online help.
Visiting Informatica Customer Portal
As an Informatica customer, you can access the Informatica Customer Portal site at
http://communities.informatica.com
The site contains product information, user group information, newsletters, access to the
Informatica customer support case management system (ATLAS), the Informatica Knowledge
Base, and access to the Informatica user community.
Visiting the Informatica Web Site
You can access Informatica’s corporate web site at:
http://www.informatica.com
The site contains information about Informatica, its background, upcoming events, and sales office
locations. You will also find product information, literature, and partner information. The services
area of the site includes important information on technical support, training and education, and
implementation services.
Visiting the Informatica Technology Network
The Informatica Developer Network is a growing online community and interactive web-based
forum for data integration and data quality professionals around the globe. You can access
the Informatica Developer Network at the following URL:
http://technet.informatica.com/
The site contains information on how to create, market, and support customer-oriented add-on
solutions based on interoperability interfaces for Informatica products.
Visiting the Informatica Knowledge Base
As an Informatica customer, you can access the Informatica Knowledge Base at:
http://communities.informatica.com
The Knowledge Base lets you search for documented solutions to known technical issues about
Informatica products. It also includes frequently asked questions, technical white papers, and
technical tips.
Obtaining Informatica Professional Certification
You can obtain Informatica Professional Certification by passing exams provided by
Informatica. For more information, go to:
http://www.informatica.com/products_services/education_services/certification/Pages/index.aspx
Providing Feedback
Email any comments on this guide to education@informatica.com.
WebSupport requires a user name and password. You can request a user name and password at:
http://communities.informatica.com
Module 0: Intro
Integration Service The engine which performs all the ETL logic
Repository Contains all the metadata needed to run the ETL process
Folder Management Folders are created and managed in the Repository Manager
application.
Shortcut Folders Do not confuse repository folders with the directories visible
in Windows Explorer. The folders are PowerCenter repository
objects and are not related to Windows directories.
Technically, all folders are “shared” with all users who have
the appropriate folder permissions, regardless of the “blue
arm” icon. The “blue arm” icon indicates that the folder
permits shortcuts: dynamic links to objects in that folder
that mappings in other folders can use.
Note: Two sources from different systems may use the same name.
Placing each source in a folder based on its connection type
avoids confusion when this is the case.
Type Active
Description Mandatory for all flat file and relational sources in a mapping.
Selects records from flat file and relational table sources. For
relational tables, creates a SQL SELECT statement.
Converts native source datatypes to PowerCenter
transformation datatypes.
A workflow is a set of ordered tasks that describe runtime ETL processes. Tasks can be
sequenced serially, in parallel and conditionally. Each linked icon represents a task.
Note: In the labs for this course, we are simulating part of the
creation of a (very simple) Dimensional Data Warehouse. In
these labs, you will begin with data in OLTP tables and flat
files, bring data to Staging, and from Staging (STG) to the
Operational Data Store (ODS).
Because creation of Staging tables is fairly trivial, you will do
more work on moving data from STG to ODS. This will
provide more realistic uses of the capabilities of
PowerCenter.
Velocity Phases Velocity covers the entire data integration project lifecycle:
Phase 1: Manage
Phase 2: Architect
Phase 3: Design
Phase 4: Build
Phase 5: Deploy
Phase 6: Operate
Module 3: Troubleshooting
When the Integration Service reads non-numeric data in a numeric column from a flat file, it drops
the row and writes a message in the session log. Also, when the Integration Service reads non-
datetime data in a datetime column from a flat file, it drops the row and writes a message in the
session log.
Such a target and/or source reject would put the record in the .bad file as well.
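The drop-and-log behavior described above can be sketched in Python. This is only an illustration of the idea, not how the Integration Service is implemented; the function name `read_numeric_rows`, the sample columns, and the log message text are hypothetical.

```python
import csv
import io

def read_numeric_rows(text, numeric_column):
    """Split delimited rows into accepted rows and rejected rows.

    A row is rejected when the value in `numeric_column` cannot be parsed
    as a number, mimicking a reader that drops the row and logs a message.
    """
    accepted, rejected, log = [], [], []
    # start=2 because line 1 of the file is the header row
    for line_no, row in enumerate(csv.DictReader(io.StringIO(text)), start=2):
        try:
            row[numeric_column] = float(row[numeric_column])
            accepted.append(row)
        except ValueError:
            rejected.append(row)  # stand-in for the .bad file
            log.append(f"line {line_no}: non-numeric data in {numeric_column}")
    return accepted, rejected, log

sample = "ITEM_ID,PRICE\n1,9.99\n2,abc\n3,12\n"
accepted, rejected, log = read_numeric_rows(sample, "PRICE")
```

Here the second data row is kept out of `accepted`, recorded in `rejected`, and a message is added to `log`.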
Module 4: PowerCenter Transformations, Tasks and Reusability
Type Passive
Description Modifies individual ports (columns) within a single row. Can add
and suppress ports. Cannot perform aggregation across multiple
rows.
Business Purpose Use the logical and arithmetic operators and built-in functions for:
• Character manipulation (concatenate, truncate, etc.)
• Datatype conversion (to char, to date, etc.)
• Data cleansing (check nulls, replace strings, etc.)
• Data manipulation (round, truncate, etc.)
• Numerical calculations
• Scientific calculations
• Special functions (lookup, decode, etc.)
• Testing (for spaces, number, etc.)
Type Active
Description Passes rows that meet the filter condition through
to the next transformation. Rows that do not meet the filter
condition are skipped.
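A minimal Python sketch of the filter behavior described above, assuming rows are plain dictionaries. The function `filter_rows` and the ORDER_ID and AMOUNT columns are hypothetical and only illustrate why the row count can change.

```python
def filter_rows(rows, condition):
    """Yield only the rows for which the filter condition is true."""
    for row in rows:
        if condition(row):
            yield row  # passed to the next step
        # rows that fail the condition are simply skipped

orders = [
    {"ORDER_ID": 1, "AMOUNT": 250.0},
    {"ORDER_ID": 2, "AMOUNT": 40.0},
    {"ORDER_ID": 3, "AMOUNT": 900.0},
]
large_orders = list(filter_rows(orders, lambda r: r["AMOUNT"] > 100))
```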
Example:
A business sells a high volume of products and updates the Product Dimension
table on a regular basis. To update the dimension table, a join of the PRODUCT
and PRODUCT_COST tables is required. Since the source tables are from the
same database and have a key relationship, only a single Source Qualifier
transformation is needed.
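The kind of single-database join this example relies on can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the column names PRODUCT_ID, PRODUCT_NAME, and COST and the sample rows are invented for illustration, and the SQL shown is not the statement PowerCenter itself would generate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PRODUCT (PRODUCT_ID INTEGER PRIMARY KEY, PRODUCT_NAME TEXT);
    CREATE TABLE PRODUCT_COST (
        PRODUCT_ID INTEGER REFERENCES PRODUCT (PRODUCT_ID),
        COST REAL
    );
    INSERT INTO PRODUCT VALUES (1, 'Widget'), (2, 'Gadget');
    INSERT INTO PRODUCT_COST VALUES (1, 2.50), (2, 7.25);
""")

# Both tables live in the same database and share a key,
# so one SELECT can read and join them in a single pass.
joined = conn.execute("""
    SELECT p.PRODUCT_ID, p.PRODUCT_NAME, c.COST
    FROM PRODUCT p
    JOIN PRODUCT_COST c ON c.PRODUCT_ID = p.PRODUCT_ID
    ORDER BY p.PRODUCT_ID
""").fetchall()
conn.close()
```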
Type Active
Module 7: Using the Debugger
Module 8: Sequence Generators, Lookups and Additional Workflow Tasks
Type Passive
Business Purpose Allows data from external sources such as product codes,
dates, names, etc., to be brought into the row being processed.
• Report Error. The Integration Service reports an error and does not return a
row. If you do not enable the Output Old Value On Update option, the Lookup
Policy On Multiple Match option is set to Report Error for dynamic lookups.
• Use First Value. Returns the first row that matches the lookup condition.
• Use Last Value. Returns the last row that matches the lookup condition.
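The three multiple-match policies listed above can be compared with a small Python sketch. The function `lookup`, the policy strings, and the sample data are hypothetical, and "first" and "last" here simply follow list order, which is an assumption of this example.

```python
def lookup(rows, key, value, policy="report_error"):
    """Return one matching row according to the multiple-match policy."""
    matches = [r for r in rows if r[key] == value]
    if not matches:
        return None
    if len(matches) > 1 and policy == "report_error":
        # Report Error: raise instead of returning a row
        raise LookupError(f"multiple matches for {key}={value!r}")
    if policy == "use_last":
        return matches[-1]
    return matches[0]  # use_first, or a single unambiguous match

codes = [
    {"CODE": "A", "DESC": "old"},
    {"CODE": "B", "DESC": "only"},
    {"CODE": "A", "DESC": "new"},
]
first = lookup(codes, "CODE", "A", policy="use_first")
last = lookup(codes, "CODE", "A", policy="use_last")
```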
Description Event Wait tasks wait for either the presence of a named flat
file (a pre-defined event) or some other user-defined event
to occur in the workflow processing. Note that the Workflow
must be running in order to recognize a pre-defined event.
Business Purpose An Event Wait task watching for a flat file by name is placed
in a workflow because some subsequent processing is
dependent on the presence of the file.
An Event Wait task waiting for the occurrence of a
user-defined event is strategically placed so that the
workflow does not proceed further until some other set of
tasks and conditions has occurred. It always works in
concert with an Event Raise task.
Note: The Control task can fail, stop, or abort either the parent
Workflow or the top-level Workflow. However, stopping or
aborting the parent Workflow means that no further progress
takes place along that branch in the top-level Workflow. This
can cause the top-level Workflow to stop if there is no other
branch.
Module 9: Update Strategies, Routers and Overrides
Type Active
Business Purpose A target table may require historical information dealing with
existing entries. Rows written to a target table, based on one
or more criteria, may need to be inserted, updated, or
deleted. The Update Strategy transformation meets this
requirement.
Note: For the row tags DD_DELETE and DD_UPDATE, the table
definition in the mapping must have a key identified.
Otherwise, the session created from the mapping will fail.
If the “Forward Rejected Rows” attribute is checked
(default), then rows tagged with DD_REJECT will be passed
on to the next transformation or the Target, and
subsequently placed in the appropriate “bad file”. If the
attribute is unchecked, then the reject rows will be skipped.
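The row tagging described in this note can be pictured with a short Python sketch. It is a simplified model, not PowerCenter's implementation: the numeric values given to the DD_ constants, the CUSTOMER_ID and CLOSED columns, and the function `tag_rows` are assumptions made for illustration.

```python
# Row tags; the numeric values are assumed for this sketch
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3

def tag_rows(rows, existing_keys, forward_rejected=True):
    """Attach an insert/update/delete/reject tag to each row."""
    tagged = []
    for row in rows:
        key = row.get("CUSTOMER_ID")
        if key is None:
            flag = DD_REJECT          # no key, cannot be written
        elif row.get("CLOSED"):
            flag = DD_DELETE
        elif key in existing_keys:
            flag = DD_UPDATE
        else:
            flag = DD_INSERT
        if flag == DD_REJECT and not forward_rejected:
            continue                  # rejected rows are skipped
        tagged.append((flag, row))
    return tagged

incoming = [
    {"CUSTOMER_ID": 1},
    {"CUSTOMER_ID": 2},
    {"CUSTOMER_ID": None},
    {"CUSTOMER_ID": 3, "CLOSED": True},
]
tagged = tag_rows(incoming, existing_keys={1, 3})
```

With `forward_rejected=False`, the row without a key is dropped instead of being passed on with a reject tag.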
Type Active
Business Purpose Allows you to write records from a single source into multiple
targets based on user-defined criteria.
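Routing one stream of rows to several outputs by user-defined criteria might look like the following Python sketch. The `route` function, the group names, and the sample columns are hypothetical. Sending a row to every group it matches and collecting unmatched rows in a DEFAULT group are modeling choices of this example.

```python
def route(rows, groups):
    """Distribute rows into named output groups.

    `groups` maps a group name to a condition function. A row is added to
    every group whose condition it meets, or to DEFAULT if it meets none.
    """
    outputs = {name: [] for name in groups}
    outputs["DEFAULT"] = []
    for row in rows:
        matched = False
        for name, condition in groups.items():
            if condition(row):
                outputs[name].append(row)
                matched = True
        if not matched:
            outputs["DEFAULT"].append(row)
    return outputs

sales = [
    {"ID": 1, "AMOUNT": 2000, "REGION": "E"},
    {"ID": 2, "AMOUNT": 50, "REGION": "E"},
    {"ID": 3, "AMOUNT": 5, "REGION": "W"},
]
routed = route(sales, {
    "HIGH_VALUE": lambda r: r["AMOUNT"] > 1000,
    "EAST": lambda r: r["REGION"] == "E",
})
```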
Module 10: Sorting and Aggregating Data Using PowerCenter
Type Active
Description Sorts incoming data based on one or more key values. Sort
order may be ascending, descending, or mixed.
Property Description
Distinct Treats output rows as distinct. If this is selected, all ports are
considered as part of the sort key.
Null Treated Low If selected, treat nulls as lower values than any other.
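The Distinct and Null Treated Low properties can be illustrated with a small Python sketch. The function `sort_rows` and its parameters are hypothetical, and the sketch sorts on a single key for brevity.

```python
def sort_rows(rows, key, descending=False, distinct=False, null_treated_low=True):
    """Sort dictionaries on one key, with optional distinct and null handling."""
    if distinct:
        # Every column takes part in the comparison, so a row is a
        # duplicate only when all of its values match an earlier row.
        seen, unique = set(), []
        for row in rows:
            fingerprint = tuple(sorted(row.items()))
            if fingerprint not in seen:
                seen.add(fingerprint)
                unique.append(row)
        rows = unique

    def sort_key(row):
        value = row[key]
        if value is None:
            # Nulls rank below or above every real value
            return (0,) if null_treated_low else (2,)
        return (1, value)

    return sorted(rows, key=sort_key, reverse=descending)

items = [{"ID": 3}, {"ID": None}, {"ID": 1}, {"ID": 3}]
distinct_sorted = sort_rows(items, "ID", distinct=True)
nulls_high = sort_rows(items, "ID", null_treated_low=False)
```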
Type Active
Attribute Description
Cache Directory Local directory for the index and data cache file
Tracing Level Amount of detail displayed in the session log for this
transformation
Sorted Input Indicates input data is presorted by group. Use only if the
mapping passes sorted data to the Aggregator.
Aggregator Data Cache Size Data cache size for the transformation. Default size is set to
Auto.
Aggregator Index Cache Size Index cache size for the transformation. Default cache size
is set to Auto.
Transformation Scope Transaction: applies the transformation logic to all rows in a
transaction.
All Input: applies the transformation logic to all incoming
data.
Key Points • If there is not enough memory specified in the index and
data cache properties, overflow is written to disk
• No rows are returned until all rows are aggregated
• Checking the “sorted input” attribute bypasses caching, as
well as the sort operation that occurs implicitly in an
Aggregator
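Why presorted input removes the need to hold all rows can be seen in a Python sketch. It is only an analogy for the key points above, not the Aggregator's actual implementation; both function names and the sample columns are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

def aggregate_sorted(rows, group_key, value_key):
    """Sum per group, assuming rows arrive presorted by group_key.

    Each total can be emitted as soon as the group key changes, so only
    one group is held in memory at a time.
    """
    for key, group in groupby(rows, key=itemgetter(group_key)):
        yield key, sum(r[value_key] for r in group)

def aggregate_unsorted(rows, group_key, value_key):
    """Sum per group for input in any order.

    Every group total is kept (cached) until the last row has been read.
    """
    totals = {}
    for r in rows:
        totals[r[group_key]] = totals.get(r[group_key], 0) + r[value_key]
    return totals

presorted = [
    {"STORE": "A", "QTY": 10},
    {"STORE": "A", "QTY": 20},
    {"STORE": "B", "QTY": 5},
]
streamed = list(aggregate_sorted(presorted, "STORE", "QTY"))
cached = aggregate_unsorted(reversed(presorted), "STORE", "QTY")
```

If `aggregate_sorted` is given input that is not actually sorted by the group key, it produces split groups, which mirrors the warning to use the sorted input option only when the data really is presorted.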
Type Passive
Business Purpose A source table may have a small percentage of records with
incomplete data. These “holes” in the data can be filled by
performing a lookup to another table or tables, on an as-
needed basis.
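Filling "holes" with a lookup that runs only when needed can be sketched in Python as follows. The function `fill_missing`, the in-memory lookup table, and the column names are hypothetical.

```python
def fill_missing(rows, lookup_table, key, column):
    """Fill null values in `column` by looking up `key`, only when needed.

    Returns the completed rows and the number of lookups performed.
    """
    filled, lookups = [], 0
    for row in rows:
        if row[column] is None:
            lookups += 1  # the lookup runs only for incomplete rows
            row = {**row, column: lookup_table.get(row[key])}
        filled.append(row)
    return filled, lookups

cities_by_zip = {"10001": "New York", "60601": "Chicago"}
customers = [
    {"NAME": "Ann", "ZIP": "10001", "CITY": "New York"},
    {"NAME": "Bob", "ZIP": "60601", "CITY": None},
]
completed, lookups = fill_missing(customers, cities_by_zip, "ZIP", "CITY")
```

Only one of the two rows triggers a lookup, which is the point of performing it on an as-needed basis.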
If a port is not selected as the R port, the mapping will not be invalidated but the
session will fail at runtime.
Scope Parameters and variables can be used only inside the object
in which they are created. A Mapping variable created for
Mapping_1 is available only within that Mapping and cannot
be used by another Mapping or Mapplet in the same
workflow. A parameter or variable’s scope is the object in
which it was created.
Repository Saved Value Values for variables that were saved in the Repository after
successful completion of a Session
Declared Initial Value The initial value, as set by the user when creating the
variable or parameter
From WhatIs.com –
In data warehousing and business intelligence, a star schema is the simplest form
of a dimensional model, in which data is organized into facts and dimensions.
A dimension contains reference information about the fact, such as date, product,
or customer.
Type Passive
Type Passive
Note: Mapplets cannot be nested – that is, you cannot use a Mapplet
inside another Mapplet.
PowerCenter 9.x Level 1 Developer Copyright © 2012 Informatica Corp
Module 11: Workflow Variables, Incremental Aggregation and Tasks
Business Purpose A workflow can contain multiple tasks and multiple pipelines.
One or more tasks or pipelines may be dependent on the
status of previous tasks. Workflow variables convey that
information from one task to another.
Business Purpose Running a workflow task may depend on the results of other
tasks or calculations in the workflow. An Assignment task
can do certain calculations to establish the value for a
workflow variable. This value may determine whether other
tasks or pipelines are run.
Business Purpose Commonly, workflows have multiple paths. Some are simply
concurrent tasks. Others are pipelines of tasks that should
only run if the previous tasks are successful. Still others
should be run only if those tasks are not successful.
What determines the success or failure of a task or group of
tasks is user-defined, depending on the business-defined
rules and operational rules of processing.
The criteria are set as the decision condition in a Decision
task, and subsequently tested for a True or False condition.
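The idea of a user-defined decision condition that selects the next path can be sketched in Python. The task names, status strings, and the `decide` function are hypothetical and only illustrate the True or False test described above.

```python
def decide(statuses, condition):
    """Evaluate a user-defined decision condition to True or False."""
    return bool(condition(statuses))

# Status of earlier tasks in the workflow (invented names and values)
statuses = {"s_load_customers": "SUCCEEDED", "s_load_orders": "FAILED"}

# Business rule for this sketch: proceed only if every earlier task succeeded
all_succeeded = decide(
    statuses, lambda s: all(v == "SUCCEEDED" for v in s.values())
)
next_task = "s_load_facts" if all_succeeded else "email_failure_notice"
```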
Service variables. Define general properties for the Integration Service such as email addresses,
log file counts, and error thresholds.
Service process variables. Define the directories for Integration Service files for each Integration
Service process. $PMRootDir, $PMSessionLogDir, and $PMBadFileDir are examples of service
process variables.
Module 13: Dynamic Lookup and Error Logging/Handling
Module 14: More Lookups
Module 15: Mapping Workshop
In this lab, we will build a fact table that tracks promotions by day for each
dealership for each product being sold.
Module 16: Workflow Workshop
An Employee flat file is being used as a source to load an Employee target system
at regular intervals.
The Employee load session is processed based on when an indicator file is
created as part of the nightly batch load. Sometimes the latest Employee file may
not be available for loading, perhaps due to source dependency.
If the source does not trigger the script that creates the indicator file within a specified
period of time, a notification should be sent and the wait for the file should be
stopped.
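The control flow of this scenario (wait for an indicator file, give up after a time limit, then notify) can be sketched in Python. It illustrates the logic only and is not a PowerCenter workflow. The function names, the polling interval, and the returned action strings are hypothetical.

```python
import os
import time

def wait_for_file(path, timeout_seconds, poll_seconds=0.05):
    """Poll until `path` exists or the timeout elapses.

    Returns True if the file appeared, False if the wait timed out.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_seconds)

def run_employee_load(indicator_path, timeout_seconds):
    """Choose the next action based on whether the indicator file arrived."""
    if wait_for_file(indicator_path, timeout_seconds):
        return "run employee load session"
    return "send notification and stop waiting"
```

Calling `run_employee_load("employee.ind", 3600)` would wait up to an hour for a file named `employee.ind`; both the file name and the time limit are placeholders.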