Agenda
Introduction
Architecture
Best Practices
Introduction
Presenter Contact
Lingaraju Ramasamy (Raju) lramasamy@informatica.com
408-368-2475 (Mobile)
Technical Architecture Manager, Informatica Professional Services
Modularity
Develop according to a modular design
Common Error Handling
Reprocessing
Mapping Assistants
Reusability
Focus on reuse to make quick and universal modifications
Mapplets, Worklets, Transformations, reusable functions
Simplicity
Multiple simple processes are often better than a few complex processes
Multiple mappings
Simple queries
Staging tables
Advantages: easy to develop, debug, and maintain
[Figure: PowerCenter mapping diagrams with panels labeled Staging 1, Staging 2, and Staging 3. Transformations shown include SEQ_OUTLET, EXP_OUTL_BOOKEND, JAVA_GENERATE_MSGID, JNR_OUTL_HRS, UNI_OUTLET, SRT_INCM_RECS, and RTR_HDR_DET_DATA.]
Mapping Tips
Sources and Targets
Use shortcuts from shared folders
Extract only what is necessary
Mapping Tricks
Parameters & Variables
Reduce the overhead of creating multiple mappings
Replace hard-coded values
Use to incrementally extract data
Example
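A minimal sketch of the incremental-extract idea, written in Python rather than PowerCenter syntax. A persisted high-water mark plays the role of a mapping variable, and each run picks up only rows changed after it. The function name `extract_incremental`, the `updated_at` field, and the row layout are hypothetical.

```python
from datetime import datetime

def extract_incremental(rows, last_extract_ts):
    """Return rows changed after last_extract_ts and the new high-water mark.

    last_extract_ts stands in for a mapping variable whose value is saved
    at the end of a successful run (hypothetical layout, for illustration).
    """
    new_rows = [r for r in rows if r["updated_at"] > last_extract_ts]
    # Keep the old mark when nothing new arrived, so no rows are skipped later.
    new_mark = max((r["updated_at"] for r in new_rows), default=last_extract_ts)
    return new_rows, new_mark

source = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 5)},
    {"id": 3, "updated_at": datetime(2024, 1, 9)},
]
batch, mark = extract_incremental(source, datetime(2024, 1, 3))
```

Feeding `mark` back in as the starting point of the next run yields an empty batch until the source changes again.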
Mapping Tricks
Parameters & Variables
Assign Parameter/Variable values in a Session
Pass values from one session to a subsequent session in the same workflow/worklet
Configured on the Components tab in Session Properties
Use user-defined workflow/worklet variables
Non-reusable sessions only
Mapping Tricks
Built-in Mapping Variables
Mapping Name
Workflow Name
Session Name
Integration Service Name
Repository Service Name
Repository User Name
Folder Name
Session Run Mode
Source Table Names
Target Table Names
Mapping Tricks
Group Expression (Anchor transformation)
Add an Expression transformation after a Source Qualifier and before a target
Mass Update
pmrep massupdate
Session properties
Session config attributes
Mapping Assistants
Preview Data
View Data
Accommodate anomalies early
Verify extraction/loading strategies
Type of Data
Sources/Targets: Relational, Flat File, XML Files
Mapping Assistants
Mapping Wizard
Pass-Through
Slowly Changing Dimension
Type 1 Dimension (No History)
Type 2 Dimension (All History): Version Data, Flag Current, Effective Date Range
Type 3 Dimension (Previous Versions)
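To make the Type 2 "Effective Date Range" option concrete, here is a minimal Python sketch. It is an illustration only, not what the Mapping Wizard generates: an in-memory list stands in for the dimension table, and the names `apply_scd2`, `start_date`, and `end_date` are assumptions.

```python
from datetime import date

def apply_scd2(dimension, key, attrs, load_date):
    """Close the current version of `key` if its attributes changed, then add a new one."""
    current = next(
        (r for r in dimension if r["key"] == key and r["end_date"] is None), None
    )
    if current is not None:
        if current["attrs"] == attrs:
            return dimension  # No change: keep the current version open.
        current["end_date"] = load_date  # Expire the old version.
    dimension.append(
        {"key": key, "attrs": attrs, "start_date": load_date, "end_date": None}
    )
    return dimension

dim = []
apply_scd2(dim, "C1", {"city": "Austin"}, date(2024, 1, 1))
apply_scd2(dim, "C1", {"city": "Austin"}, date(2024, 2, 1))  # unchanged
apply_scd2(dim, "C1", {"city": "Dallas"}, date(2024, 3, 1))  # new version
```

All history is kept: the first version is closed with an end date and the open-ended row is the current one. A Type 1 dimension would overwrite the attributes in place instead.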
Mapping Assistants
Mapping Analyst for Excel (MAE)
Standardize specifications
Enhance collaboration between analyst and developer
Improve documentation and auditability of business logic
Data Analyst: defines business terms, specifies transformation rules, standardizes the Excel format
DI Developer: augments and tunes mappings generated from specifications
Generate Specification
Generate Mapping
Mapping Assistants
Mapping Architect for Visio (MAV)
Define consistent methodology & structure for data integration projects
Build a custom wizard based on a pattern, without coding
Generate multiple mappings at one time
Document data flow
DI Architect: creates and publishes the mapping template
Template File
Informatica Toolbar
Informatica Stencil
Publish Template
Drawing Window
Generate Mappings
Parameter File
Mapping Assistants
Mapping Architect for Visio (MAV)
Case Study #1
7 templates were used across 2 projects to generate 600 mappings
97% of mappings were generated automatically and required no additional changes
3% needed to be manually modified or custom developed
Case Study #2
1 template was used to create 150 mappings for a data migration project, along with PowerCenter sessions and workflows
Total effort was less than one day
Creating the 150 mappings manually would have taken 2 weeks (10x the effort)
Transformation Techniques
Transformation Tips
Source Qualifier
Apply Default Query when possible
Use SQ attributes (e.g., User Defined Join, Source Filter)
Minimize complicated queries
Add the SQL Override query to the Description
Transformation Tips
Expressions
Understand port processing order:
1. Input and input/output ports
2. Variable ports
3. Output ports
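The processing order matters most when variable ports carry state between rows. The Python sketch below emulates it for a common pattern, flagging the first row of each key group. The port names (`v_prev_key`, `v_is_new`, `o_is_new_group`) are hypothetical, and the loop only imitates the evaluation order; it is not PowerCenter code.

```python
def run_expression(rows):
    """Emulate one Expression transformation over a stream of rows."""
    v_prev_key = None  # variable port: keeps its value between rows
    out = []
    for row in rows:
        key = row["key"]                      # 1. input port
        v_is_new = key != v_prev_key          # 2. variable port, still sees the previous row's key
        v_prev_key = key                      # 2. variable port, evaluated after v_is_new
        out.append({"key": key, "o_is_new_group": v_is_new})  # 3. output port
    return out

result = run_expression([{"key": "A"}, {"key": "A"}, {"key": "B"}])
```

Swapping the two variable assignments would make every row compare against itself, which is why the order of variable ports deserves attention.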
Optimize Expressions
Numeric operations are faster than string operations
Operators are faster than functions
Un-nest complicated logic (use IIF or DECODE)
Transformation Tips
User-Defined Functions
Build complex expressions and reuse them within repository
Two Types:
Public: callable from any transformation expression
Private: callable only from another user-defined function
Transformation Tips
Filters/Routers
Consider a Source Qualifier with a filter to limit rows from relational sources
Filter as close to the source as possible
Replace multiple filters with a router
In a router, a row goes to every path whose condition is TRUE
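The router behavior described above (a row is sent to every group whose condition holds, not just the first) can be sketched in Python. The `route` helper, the group names, and the sample rows are hypothetical, and the `DEFAULT` group for unmatched rows is part of the illustration.

```python
def route(rows, groups):
    """Send each row to every group whose condition is true.

    Rows that match no group go to the DEFAULT group.
    """
    outputs = {name: [] for name in groups}
    outputs["DEFAULT"] = []
    for row in rows:
        matched = False
        for name, condition in groups.items():
            if condition(row):
                outputs[name].append(row)
                matched = True
        if not matched:
            outputs["DEFAULT"].append(row)
    return outputs

groups = {
    "LARGE": lambda r: r["amount"] >= 1000,
    "EXPORT": lambda r: r["country"] != "US",
}
rows = [
    {"id": 1, "amount": 1500, "country": "DE"},
    {"id": 2, "amount": 200, "country": "US"},
    {"id": 3, "amount": 50, "country": "FR"},
]
routed = route(rows, groups)
```

Row 1 lands in both `LARGE` and `EXPORT`. If the paths must be mutually exclusive, the group conditions themselves have to be written that way.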
Transformation Tips
Aggregators
Use sorted input to decrease use of aggregate caches
Limit connected input/output and output ports
Filter data before aggregating
Use as early as possible
Joiners
Perform joins in Source Qualifier when possible
Limit use to heterogeneous and flat file sources
Perform normal joins when possible
Join sorted input when possible
Designate the source with fewer rows as the master source
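Why the smaller source should be the master can be seen in a minimal Python sketch of a cached join: the master side is held in memory and the detail side is streamed past it. This is an illustration of the general technique under that assumption, not the Integration Service's implementation; `joiner_normal` and the sample tables are hypothetical.

```python
from collections import defaultdict

def joiner_normal(master, detail, key):
    """Normal (inner) join: cache the master rows, then stream the detail rows."""
    cache = defaultdict(list)
    for m in master:          # The cache grows with the master row count.
        cache[m[key]].append(m)
    for d in detail:          # Detail rows are probed one at a time, never cached.
        for m in cache.get(d[key], []):
            yield {**m, **d}

customers = [{"cust_id": 1, "name": "Acme"}, {"cust_id": 2, "name": "Globex"}]
orders = [
    {"cust_id": 1, "order_id": 10},
    {"cust_id": 1, "order_id": 11},
    {"cust_id": 3, "order_id": 12},
]
joined = list(joiner_normal(customers, orders, "cust_id"))
```

Memory use tracks the cached side only, so choosing the source with fewer rows as master keeps the cache small.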
Transformation Tips
Lookups
Using SQL Override in Lookup
As with the Source Qualifier, avoid when possible
Can apply Parameters and Variables
Can query against multiple tables in the same database
Suppress the generated ORDER BY clause by ending the override with two dashes (--)
Add indexes to database columns
Replace large lookup tables with joins in the Source Qualifier when possible
Relational Lookups should only return ports that meet the condition
Remove all ports not used downstream or in the SQL Override
Transformation Tips
Lookup Caches
Lookup Cache Types
Persistent Caches: save lookup cache files for reuse
Dynamic Caches: retain the latest changes to data as rows pass through the mapping
Dynamic cache use cases: updating a master table, real-time sessions, slowly changing dimensions
Cache Sizes
Eliminate paging
Condition values are stored in index (.idx) files
Output values are stored in data (.dat) files
Apply the Cache Calculator in the Session
Transformation Tips
Lookups
Cache Updates
Update the dynamic lookup cache with the results of an expression
Use Case: Update QTY on hand for a new timestamp
Add WHERE incoming row timestamp > cached timestamp
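The timestamp condition in this use case can be sketched in Python with a dictionary standing in for the dynamic cache. The helper `upsert_qty`, the field names, and the returned labels are hypothetical; they only illustrate the rule "update when the incoming row is newer".

```python
def upsert_qty(cache, row):
    """Update the cached quantity only when the incoming row is newer.

    Returns "insert", "update", or "no_change" (labels chosen for this sketch).
    """
    cached = cache.get(row["item"])
    if cached is None:
        cache[row["item"]] = {"qty": row["qty"], "ts": row["ts"]}
        return "insert"
    if row["ts"] > cached["ts"]:
        cached.update(qty=row["qty"], ts=row["ts"])
        return "update"
    return "no_change"  # Late-arriving row: keep the newer cached value.

cache = {}
actions = [
    upsert_qty(cache, {"item": "A", "qty": 5, "ts": 100}),
    upsert_qty(cache, {"item": "A", "qty": 7, "ts": 200}),
    upsert_qty(cache, {"item": "A", "qty": 3, "ts": 150}),
]
```

Without the timestamp check, the late row with `ts` 150 would overwrite the more recent quantity.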
SQL Overrides for Uncached Lookups
You must choose Use Any Value for the Lookup Policy on Multiple Match
Multiple Rows Return
Use Case: Aggregate customer orders and store the total value
Transformation Tricks
Pipeline Lookup
Perform a lookup on an application source that is not a relational table or file
The partial pipeline contains a Source and Source Qualifier but no target
The Integration Service reads the source data and passes it to the Lookup transformation to create the cache
Create partitions to improve performance
Transformation Tips
Transaction Control Transformation
A transaction in PowerCenter is a set of rows bound by a commit or rollback
Control commit and rollback transactions based on a row or set of rows that pass through the transformation
Use Case: Each invoice number is committed to the target database as a single transaction
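A minimal Python sketch of this use case: a commit marker is emitted whenever the invoice number changes. In PowerCenter the boundary would come from the Transaction Control expression (for example, returning `TC_COMMIT_BEFORE` on the first row of a new invoice); the generator below only imitates that behavior. The name `with_commit_points` and the row layout are hypothetical, and the rows are assumed to arrive grouped by invoice number.

```python
def with_commit_points(rows, key="invoice_no"):
    """Yield ("COMMIT", None) whenever the invoice number changes, plus a final commit.

    Assumes rows arrive sorted (or at least grouped) by invoice number.
    """
    previous = None
    for row in rows:
        if previous is not None and row[key] != previous:
            yield ("COMMIT", None)   # Close the previous invoice's transaction.
        yield ("ROW", row)
        previous = row[key]
    if previous is not None:
        yield ("COMMIT", None)       # Commit the last invoice.

rows = [
    {"invoice_no": "INV-1", "line": 1},
    {"invoice_no": "INV-1", "line": 2},
    {"invoice_no": "INV-2", "line": 1},
]
events = [kind for kind, _ in with_commit_points(rows)]
```

If the input is not grouped, the same invoice would be split across several transactions, so a sort upstream is part of the design.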
Transformation Tips
Associated Source Qualifier
Use an ASQ when the MQ data is flat file or COBOL
The ASQ is specific to the format of the MQ data
Transformation Tips
Sequence
Non-reusable: the number of cached values defaults to 0
Performance suffers when the number of cached values is low
Increasing the number of cached values improves performance
Caching does not involve any database operation
Caching reserves a block of sequence values in memory
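The effect of the cache size can be illustrated with a small Python sketch in which a generator reserves blocks of values from a shared counter. This is a generic block-reservation model under stated assumptions, not the Sequence Generator's internals; the class `BlockSequence` and the dictionary used as the shared counter are hypothetical.

```python
class BlockSequence:
    """Hand out sequence values from blocks reserved in a shared counter."""

    def __init__(self, store, block_size):
        self.store = store            # shared dict standing in for the saved current value
        self.block_size = block_size
        self.next_value = 0
        self.block_end = 0            # exclusive end of the reserved block
        self.reservations = 0         # how many times the shared counter was touched

    def nextval(self):
        if self.next_value >= self.block_end:
            # Reserve a new block in one round trip to the shared counter.
            self.next_value = self.store["current"]
            self.block_end = self.next_value + self.block_size
            self.store["current"] = self.block_end
            self.reservations += 1
        value = self.next_value
        self.next_value += 1
        return value

small = BlockSequence({"current": 1}, block_size=10)
large = BlockSequence({"current": 1}, block_size=1000)
for _ in range(5000):
    small.nextval()
    large.nextval()
```

For the same 5,000 values, the small block size touches the shared counter far more often than the large one, which is the overhead a larger cache avoids.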
Input Type: Command (default: File)
Command Type: Command Generating File List
The command writes a list of file names to stdout
PowerCenter interprets this output as a file list
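A command of this kind can be any program that prints one file name per line. The Python script below is a minimal sketch; the `*.csv` pattern and the directory argument are assumptions chosen for the example.

```python
import glob
import os
import sys

def list_source_files(directory, pattern="*.csv"):
    """Return matching file paths, sorted for a repeatable read order."""
    return sorted(glob.glob(os.path.join(directory, pattern)))

def main(argv):
    directory = argv[1] if len(argv) > 1 else "."
    for path in list_source_files(directory):
        print(path)  # One file name per line on stdout.

if __name__ == "__main__":
    main(sys.argv)
```

Saved under a hypothetical name such as `list_files.py`, it would be run with the source directory as its only argument.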
Input Type: Command (default: File)
Command Type: Command Generating Data
The command writes rows to stdout
The flat file reader reads directly from stdout
Removes the need to stage data
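As an illustration of a data-generating command, the Python sketch below decompresses a gzip file and streams its rows to stdout, so no uncompressed staging file is written. The function name `stream_rows` and the choice of gzip input are assumptions for the example.

```python
import gzip
import sys

def stream_rows(gz_path, out=sys.stdout):
    """Decompress gz_path and write its lines to `out` without a staging file."""
    count = 0
    with gzip.open(gz_path, "rt", encoding="utf-8") as src:
        for line in src:
            out.write(line)
            count += 1
    return count

if __name__ == "__main__" and len(sys.argv) == 2:
    stream_rows(sys.argv[1])
```

The rows are written as they are read, so memory use stays flat regardless of file size.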
Filename Port
Source Filename
The input file name can be processed and passed to the target
Filename Port
Target Filename
Write records to a dynamically named flat file
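The idea of a per-row target file name can be sketched in Python as follows. The field names `FileName` and `data` and the helper `write_by_filename` are hypothetical; the sketch shows only the routing of records to files named by the data itself.

```python
import os

def write_by_filename(rows, out_dir):
    """Append each record to the flat file named in its own "FileName" field."""
    written = {}
    for row in rows:
        path = os.path.join(out_dir, row["FileName"])
        # Open in append mode so rows for the same file accumulate.
        with open(path, "a", encoding="utf-8") as fh:
            fh.write(row["data"] + "\n")
        written[row["FileName"]] = written.get(row["FileName"], 0) + 1
    return written
```

Called with rows whose `FileName` values are, say, `east.txt` and `west.txt`, it produces one file per distinct name and returns the row count written to each.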
Use of Metadata
Repository Maintenance
Repository Maintenance
Purge repository versions
Define a version strategy for Dev, QA, and Prod
Archive if required for future analysis
Purge unwanted versions
Run the purge at a regular interval: daily, weekly, or monthly
pmrep connect -r $REPOSITORY_NAME -d $DOMAIN_NAME -n $ADMIN_USER -X INFA_ENCRYPTED_PASSWD
Repository Maintenance
Purge repository logs
Define a log strategy for Dev, QA, and Prod
Archive if required for future analysis
Purge unwanted logs
Run the purge at a regular interval: daily, weekly, or monthly
pmrep connect -r $REPOSITORY_NAME -d $DOMAIN_NAME -n $ADMIN_USER -X INFA_ENCRYPTED_PASSWD
pmrep truncatelog -t $DAYS_TO_KEEP
Questions?