
Diff between Primary key and unique key

Scenario:
Diff between Primary key and unique key
Solution:
A unique key enforces unique values but allows NULL values, whereas a primary key enforces unique values and does not allow NULLs.
Whenever you create a primary key constraint, Oracle by default creates a unique index and enforces NOT NULL on the key column(s).
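A minimal Oracle SQL sketch illustrating the difference (table and constraint names are illustrative):

CREATE TABLE dept_test
(
  dept_id   NUMBER       CONSTRAINT dept_test_pk PRIMARY KEY,  -- unique index + NOT NULL
  dept_code VARCHAR2(10) CONSTRAINT dept_test_uk UNIQUE        -- unique, but NULLs allowed
);

INSERT INTO dept_test (dept_id, dept_code) VALUES (1, NULL);    -- succeeds
INSERT INTO dept_test (dept_id, dept_code) VALUES (NULL, 'HR'); -- fails: primary key column cannot be NULL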

Posted 3rd December 2012 by Prafull Dangore



SQL Transformation with examples


=============================================================================

SQL Transformation with examples


Use: SQL Transformation is a connected transformation used to process SQL queries in the midstream
of a pipeline. We can insert, update, delete and retrieve rows from the database at run time using the
SQL transformation. Use SQL transformation in script mode to run DDL (data definition language)
statements like creating or dropping the tables.
The following SQL statements can be used in the SQL transformation.

Data Definition Statements (CREATE, ALTER, DROP, TRUNCATE, RENAME)

DATA MANIPULATION statements (INSERT, UPDATE, DELETE, MERGE)

DATA Retrieval Statement (SELECT)

DATA Control Language Statements (GRANT, REVOKE)

Transaction Control Statements (COMMIT, ROLLBACK)

Scenario: Let's say we want to create a temporary table in a mapping, while the workflow is running, for
some intermediate calculation. We can use the SQL transformation in script mode to achieve this.
Below we will see how to create an SQL transformation in script mode, with an example in which we
create a table from the mapping and insert some rows into it.
Solution:
Step 1:
Create two text files in the $PMSourceFileDir directory containing some SQL queries.

1. sql_script.txt
The file contains the SQL below (you can have multiple SQL queries in a file, separated by semicolons):
create table create_emp_table
(emp_id number, emp_name varchar2(100))

2. sql_script2.txt
The file contains the SQL below:
insert into create_emp_table values (1,'abc')

These are the script files to be executed by the SQL transformation on the database server.
Step 2:
We need a source which contains the above script file names with a complete path.
So, I created another file in the $PMSourceFileDir directory to store these script file names as
Sql_script_input.txt.
File contains the list of files with their complete path:
E:\softs\Informatica\server\infa_shared\SrcFiles\sql_script.txt
E:\softs\Informatica\server\infa_shared\SrcFiles\sql_script2.txt
Step 3:
Now we will create a mapping to execute the script files using the SQL transformation.
Go to the Mapping Designer tool, open the Source Analyzer and use Import from File
=> then create the source definition by selecting the file Sql_script_input.txt located at
E:\softs\Informatica\server\infa_shared\SrcFiles.
Source definition will look like

Similarly create a target definition, go to target designer and create a target flat file with result and
error ports. This is shown in the below image

Step 4:

Go to the mapping designer and create a new mapping.

Drag the flat file into the mapping designer.

Go to the Transformation in the toolbar, Create, select the SQL transformation, enter a name
and click on create.

Now select the SQL transformation options as script mode and DB type as Oracle and click ok.

The SQL transformation is created with the default ports.

Now connect the source qualifier transformation ports to the SQL transformation input port.

Drag the target flat file into the mapping and connect the SQL transformation output ports to
the target.

Save the mapping. The mapping flow image is shown in the below picture.

Go to the workflow manager; create a new workflow and session.

Edit the session. For source, enter the source & target file directory.

For the SQL transformation, enter the oracle database relational connection as shown below.

Save the workflow and run it.

Open the target file, you will find the below data.
"PASSED";
"PASSED";
"PASSED"; -: for sql_script.txt, where it will crate the table and
"PASSED"; -: For sql_scriptw.txt, where it will insert rows in to the table

Fire a select query on the database to check whether the table was created.
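For example, a quick check using the table created by sql_script.txt:

select * from create_emp_table;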

=============================================================================

Posted 15th June 2012 by Prafull Dangore



Efficient SQL Statements : SQL Tuning


Tips
Efficient SQL Statements
This is an extremely brief look at some of the factors that may affect the efficiency
of your SQL and PL/SQL code. It is not intended as a thorough discussion of the
area and should not be used as such.

Check Your Stats

Why Indexes Aren't Used

Caching Tables

EXISTS vs. IN

Presence Checking

Inequalities

When Things Look Bad!

Driving Tables (RBO Only)

Improving Parse Speed

Packages Procedures and Functions

Check Your Stats


The Cost Based Optimizer (CBO) uses statistics to decide which execution plan to
use. If these statistics are incorrect the decision made by the CBO may be
incorrect. For this reason it is important to make sure that these statistics are
refreshed regularly. The following article will help you achieve this aim.

Cost Based Optimizer (CBO) and Database Statistics
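For example, a minimal sketch of refreshing table statistics with DBMS_STATS (the schema and table names are illustrative):

EXEC DBMS_STATS.gather_table_stats(ownname => 'SCOTT', tabname => 'EMP');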

Why Indexes Aren't Used


The presence of an index on a column does not guarantee it will be used. The
following is a small list of factors that will prevent an index from being used.

The optimizer decides it would be more efficient not to use the index. If your
query is returning the majority of the data in a table, then a full table scan is
probably going to be the most efficient way to access the table.

You perform a function on the indexed column, i.e. WHERE UPPER(name) = 'JONES'. The solution
to this is to use a Function-Based Index (see the sketch after this list).

You perform a mathematical operation on the indexed column, i.e. WHERE salary + 1 = 10001.

You concatenate a column, i.e. WHERE firstname || ' ' || lastname = 'JOHN JONES'.

You do not include the first column of a concatenated index in the WHERE
clause of your statement. For the index to be used in a partial match, the
first column (leading edge) must be used. Index Skip Scanning in Oracle 9i
and above allows indexes to be used even when the leading edge is not
referenced.

The use of 'OR' statements confuses the Cost Based Optimizer (CBO). It will
rarely choose to use an index on a column referenced using an OR statement.
It will even ignore optimizer hints in this situation. The only way of
guaranteeing the use of indexes in these situations is to use an INDEX hint.
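As referenced in the list above, a minimal sketch of a function-based index for the UPPER(name) case (the table and index names are illustrative):

CREATE INDEX emp_upper_name_idx ON emp (UPPER(name));
-- A predicate such as WHERE UPPER(name) = 'JONES' can now use this index.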
EXISTS vs. IN
The EXISTS function searches for the presence of a single row meeting the stated
criteria as opposed to the IN statement which looks for all occurrences.
TABLE1 - 1000 rows
TABLE2 - 1000 rows
(A)
SELECT t1.id
FROM table1 t1
WHERE t1.code IN (SELECT t2.code
FROM table2 t2);
(B)
SELECT t1.id
FROM table1 t1
WHERE EXISTS (SELECT '1'
FROM table2 t2
WHERE t2.code = t1.code)
For query A, all rows in TABLE2 will be read for every row in TABLE1. The effect will
be 1,000,000 rows read from TABLE2. In the case of query B, a maximum of one row
from TABLE2 will be read for each row of TABLE1, thus reducing the processing
overhead of the statement.
Rule of thumb:

If the majority of the filtering criteria are in the subquery then the IN
variation may be more performant.

If the majority of the filtering criteria are in the top query then the EXISTS
variation may be more performant.

I would suggest that you try both variants and see which works best.
Note: in later versions of Oracle there is little difference between EXISTS and IN
operations.
Presence Checking
The first question you should ask yourself is, "Do I need to check for the presence
of a record?" Alternatives to presence checking include:

Use the MERGE statement if you are not sure if data is already present.

Perform an insert and trap failure because a row is already present using the
DUP_VAL_ON_INDEX exception handler.

Perform an update and test for no rows updated using SQL%ROWCOUNT.

If none of these options are right for you and processing is conditional on the
presence of certain records in a table, you may decide to code something like the
following.
SELECT Count(*)
INTO v_count
FROM items
WHERE item_size = 'SMALL';
IF v_count = 0 THEN
-- Do processing related to no small items present
END IF;
If there are many small items, time and processing will be lost retrieving multiple
records which are not needed. This would be better written like one of the
following.
SELECT COUNT(*)
INTO v_count
FROM items
WHERE item_size = 'SMALL'
AND rownum = 1;
IF v_count = 0 THEN
-- Do processing related to no small items present
END IF;
OR
SELECT COUNT(*)
INTO v_count
FROM dual
WHERE EXISTS (SELECT 1
FROM items
WHERE item_size = 'SMALL');
IF v_count = 0 THEN
-- Do processing related to no small items present
END IF;
In these examples only a single record is retrieved in the presence/absence check.
Inequalities

If a query uses inequalities (item_no > 100) the optimizer must estimate the
number of rows returned before it can decide the best way to retrieve the data.
This estimation is prone to errors. If you are aware of the data and its distribution
you can use optimizer hints to encourage or discourage full table scans to improve
performance.
If an index is being used for a range scan on the column in question, the
performance can be improved by substituting >= for >. In this case, item_no >
100 becomes item_no >= 101. In the first case, a full scan of the index will occur.
In the second case, Oracle jumps straight to the first index entry with an item_no
of 101 and range scans from this point. For large indexes this may significantly
reduce the number of blocks read.
When Things Look Bad!
If you have a process/script that shows poor performance you should do the
following:

Write sensible queries in the first place!

Identify the specific statement(s) that are causing a problem. The simplest
way to do this is to use SQL Trace, but you can try running the individual
statements using SQL*Plus and timing them (SET TIMING ON)

Use EXPLAIN PLAN to look at the execution plan of the statement (see the sketch
after this list). Look for any full table accesses that look dubious. Remember, a full
table scan of a small table is often more efficient than access by index.

Check to see if there are any indexes that may help performance.

Try adding new indexes to the system to reduce excessive full table scans.
Typically, foreign key columns should be indexed as these are regularly used
in join conditions. On occasion it may be necessary to add composite
(concatenated) indexes that will only aid individual queries. Remember,
excessive indexing can reduce INSERT, UPDATE and DELETE performance.
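A minimal sketch of checking an execution plan with EXPLAIN PLAN and DBMS_XPLAN (the query itself is illustrative):

EXPLAIN PLAN FOR
SELECT * FROM emp WHERE deptno = 10;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);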
Driving Tables (RBO Only)
The structure of the FROM and WHERE clauses of DML statements can be tailored to
improve the performance of the statement. The rules vary depending on whether
the database engine is using the Rule or Cost based optimizer. The situation is
further complicated by the fact that the engine may perform a Merge Join or a
Nested Loop join to retrieve the data. Despite this, there are a few rules you can
use to improve the performance of your SQL.
Oracle processes result sets a table at a time. It starts by retrieving all the data for
the first (driving) table. Once this data is retrieved it is used to limit the number of
rows processed for subsequent (driven) tables. In the case of multiple table joins,
the driving table limits the rows processed for the first driven table. Once
processed, this combined set of data is the driving set for the second driven table
etc. Roughly translated into English, this means that it is best to process tables that
will retrieve a small number of rows first. The optimizer will do this to the best of
its ability regardless of the structure of the DML, but the following factors may
help.
Both the Rule and Cost based optimizers select a driving table for each query. If a
decision cannot be made, the order of processing is from the end of the FROM

clause to the start. Therefore, you should always place your driving table at the end
of the FROM clause. Subsequent driven tables should be placed in order so that
those retrieving the most rows are nearer to the start of the FROM clause.
Confusingly, the WHERE clause should be written in the opposite order, with the
driving table's conditions first and the final driven table last, i.e.
FROM d, c, b, a
WHERE a.join_column = 12345
AND a.join_column = b.join_column
AND b.join_column = c.join_column
AND c.join_column = d.join_column;
If we now want to limit the rows brought back from the "D" table we may write the
following.
FROM d, c, b, a
WHERE a.join_column = 12345
AND a.join_column = b.join_column
AND b.join_column = c.join_column
AND c.join_column = d.join_column
AND d.name = 'JONES';
Depending on the number of rows and the presence of indexes, Oracle may now pick
"D" as the driving table. Since "D" now has two limiting factors (join_column and
name), it may be a better candidate as a driving table, so the statement may be
better written as follows.
FROM c, b, a, d
WHERE d.name = 'JONES'
AND d.join_column = 12345
AND d.join_column = a.join_column
AND a.join_column = b.join_column
AND b.join_column = c.join_column
This grouping of limiting factors will guide the optimizer more efficiently, making
table "D" return relatively few rows and so making it a more efficient driving table.
Remember, the order of the items in both the FROM and WHERE clauses will not
force the optimizer to pick a specific table as a driving table, but it may influence
its decision. The grouping of limiting conditions onto a single table will reduce the
number of rows returned from that table, and will therefore make it a stronger
candidate for becoming the driving table.
Caching Tables
Queries will execute much faster if the data they reference is already cached. For
small frequently used tables performance may be improved by caching tables.
Normally, when full table scans occur, the cached data is placed on the Least
Recently Used (LRU) end of the buffer cache. This means that it is the first data to
be paged out when more buffer space is required. If the table is cached (ALTER
TABLE employees CACHE;) the data is placed on the Most Recently Used (MRU) end
of the buffer, and so is less likely to be paged out before it is re-queried. Caching
tables may alter the CBO's path through the data and should not be used without
careful consideration.
Improving Parse Speed

Execution plans for SELECT statements are cached by the server, but unless the
exact same statement is repeated the stored execution plan details will not be
reused. Even differing spaces in the statement will cause this lookup to fail. Use of
bind variables allows you to repeatedly use the same statements whilst changing
the WHERE clause criteria. Assuming the statement does not have a cached
execution plan it must be parsed before execution. The parse phase for statements
can be decreased by efficient use of aliasing. If an alias is not present, the engine
must resolve which tables own the specified columns. The following is an example.
Bad Statement

SELECT first_name,
       last_name,
       country
FROM   employee,
       countries
WHERE  country_id = id
AND    last_name = 'HALL';

Good Statement

SELECT e.first_name,
       e.last_name,
       c.country
FROM   employee e,
       countries c
WHERE  e.country_id = c.id
AND    e.last_name = 'HALL';
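A minimal SQL*Plus sketch of the bind variable point made above (the variable name is an assumption):

VARIABLE v_last_name VARCHAR2(30)
EXEC :v_last_name := 'HALL'

SELECT e.first_name,
       e.last_name
FROM   employee e
WHERE  e.last_name = :v_last_name;
-- Re-running the statement with a different :v_last_name value reuses the cached execution plan.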
Packages Procedures and Functions
When an SQL statement, or anonymous block, is passed to the server it is
processed in three phases.
Phase      Actions
---------  -------------------------------------------------------------
Parse      Syntax check and object resolution
Execution  Necessary reads and writes performed
Fetch      Resultant rows are retrieved, assembled, sorted and returned
The Parse phase is the most time and resource intensive. This phase can be
avoided if all anonymous blocks are stored as database procedures, functions,
packages or views. Being database objects, their SQL text and compiled code are
stored in the Data Dictionary and the executable copies reside in the Shared Pool.

Posted 7th June 2012 by Prafull Dangore



Function : NVL2 and COALESCE


NVL2
The NVL2 function accepts three parameters. If the first parameter value is not null, it returns
the value of the second parameter. If the first parameter value is null, it returns the third
parameter.
The following query shows NVL2 in action.
SQL> SELECT * FROM null_test_tab ORDER BY id;

        ID COL1       COL2       COL3       COL4
---------- ---------- ---------- ---------- ----------
         1 ONE        TWO        THREE      FOUR
         2            TWO        THREE      FOUR
         3                       THREE      FOUR
         4                       THREE      THREE

4 rows selected.

SQL> SELECT id, NVL2(col1, col2, col3) AS output FROM null_test_tab ORDER BY id;

        ID OUTPUT
---------- ----------
         1 TWO
         2 THREE
         3 THREE
         4 THREE

4 rows selected.

SQL>
COALESCE
The COALESCE function was introduced in Oracle 9i. It accepts two or more parameters and
returns the first non-null value in a list. If all parameters contain null values, it returns null.
SQL> SELECT id, COALESCE(col1, col2, col3) AS output FROM null_test_tab ORDER BY id;

        ID OUTPUT
---------- ----------
         1 ONE
         2 TWO
         3 THREE
         4 THREE

4 rows selected.

SQL>

Posted 7th June 2012 by Prafull Dangore



Load the session statistics such as Session Start & End Time, Success Rows, Failed
Rows and Rejected Rows etc. into a database table for audit/log purpose.
Scenario:
Load the session statistics such as Session Start & End Time, Success Rows, Failed
Rows and Rejected Rows etc. into a database table for audit/log purpose.
Solution:

After performing the below solution steps your end workflow will look as follows:

START => SESSION1 => ASSIGNMENT TASK => SESSION2


SOLUTION STEPS
SESSION1
This session implements your actual business logic, meaning it performs your actual
data load. It can be anything: file to table, table to table, table to file, etc.

WORKFLOW VARIABLES
Create the following workflow variables:
=> $$Workflowname
=> $$SessionStartTime
=> $$SessionEndTime
=> $$TargetSuccessrows
=> $$TargetFailedRows

ASSIGNMENT TASK
Use the Expression tab in the Assignment Task and assign as follows:
$$Workflowname = $PMWorkflowName
$$SessionStartTime = $SESSION1.StartTime
$$SessionEndTime = $SESSION1.EndTime
$$TargetSuccessrows = $SESSION1.TgtSuccessRows
$$TargetFailedRows = $SESSION1.TgtFailedRows
SESSION2
This session is used to load the session statistics into a database table.
=> It should call a mapping, say m_sessionLog.
=> The mapping m_sessionLog should have mapping variables for the above
defined workflow variables, such as $$wfname, $$Stime, $$Etime, $$TSRows and $$TFRows.
=> The mapping m_sessionLog should use a dummy source, and it must have an
expression transformation and a target (the database audit table).
=> Inside the expression you must assign the mapping variables to the output ports:
workflowname=$$wfname
starttime=$$Stime
endtime=$$Etime
SucessRows=$$TSRows
FailedRows=$$TFRows
=> Create a target database table with the following columns: workflow name, start
time, end time, success rows and failed rows (a DDL sketch follows this list).
=> Connect all the required output ports to the target, which is nothing but your
audit table.
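A minimal DDL sketch for such an audit table (the table and column names/datatypes are assumptions):

CREATE TABLE wf_session_audit
(
  workflow_name VARCHAR2(100),
  start_time    DATE,
  end_time      DATE,
  success_rows  NUMBER,
  failed_rows   NUMBER
);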
PRE-SESSION VARIABLE ASSIGNMENT
=> Session 2: in the Pre-session variable assignment tab, assign
mapping variable = workflow variable.
=> In our case:
$$wfname=$$Workflowname
$$Stime=$$SessionStartTime
$$Etime=$$SessionEndTime
$$TSRows=$$TargetSuccessrows
$$TFRows=$$TargetFailedRows
Workflow Execution

Posted 8th February 2012 by Prafull Dangore



Use Target File Path in Parameter File


Scenario:

I want to use a mapping parameter to store the target file path. My question is: can I define
the file path in the parameter file? If possible, can anyone explain how to assign the target file
path as a parameter?
Solution:
You can define the file path in the parameter file.
$OutputFileName=your file path here
Give the above mentioned parameter in your parameter file.
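A minimal sketch of a parameter file entry (the folder, workflow and session names are illustrative):

[MyFolder.WF:wf_file_load.ST:s_m_file_load]
$OutputFileName=E:\softs\Informatica\server\infa_shared\TgtFiles\output_file.dat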

Posted 30th December 2011 by Prafull Dangore



Insert and reject records using update strategy.

Scenario:
Insert and reject records using update strategy.
There is an emp table, and from that table insert the data into the target where sal<3000
and reject the other rows.
Solution:
1. Connect the outputs from the Source Qualifier to an Update Strategy transformation.
2. In the properties of the Update Strategy, write the condition:

IIF(SAL<3000,DD_INSERT,DD_REJECT)
3. Connect the Update Strategy to the target.

Posted 27th December 2011 by Prafull Dangore



Convert Numeric Value to Date Format


Scenario:
Suppose you are importing a flat file emp.csv and the hire_date column is in numeric
format, like 20101111. Our objective is to convert it to a date, with the format 'YYYYMMDD'.
Source
EMPNO   HIRE_DATE (numeric)
-----   -------------------
1       20101111
2       20090909
Target
EMPNO   HIRE_DATE (date)
-----   ----------------
1       11/11/2010
2       09/09/2009
Solution:
1. Connect the Source Qualifier to an Expression transformation.
2. In the Expression, make hire_date an input-only port and create another port hire_date1
as an output port with the date data type.
3. In the output port hire_date1, write the expression below:
TO_DATE(TO_CHAR(hire_date), 'YYYYMMDD')

Posted 27th December 2011 by Prafull Dangore



How to change a string to decimal with 2 decimal places in Informatica?
Scenario:
How to change a string to a decimal with 2 decimal places in Informatica?
Eg: input data 12345678
I want the output as 123456.78
Solution:
output = TO_DECIMAL(TO_INTEGER(input)/100, 2)
OR
SUBSTR(INPUT_FIELD, 1, LENGTH(INPUT_FIELD) - 2) || '.' || SUBSTR(INPUT_FIELD, -2)

Posted 26th December 2011 by Prafull Dangore



Append the data in a flat file for a daily run


Scenario:
I have a flat file in our server location; I want to append data to this flat file on each
daily run.
Solution:
We have an option in Informatica, "Append if exists", in the target session properties.

Posted 26th December 2011 by Prafull Dangore



Convert Day No. to corresponding month and date of year

Scenario:
Suppose your source is like this:
Source
E_NO   YEAR        DAYNO
----   ---------   -----
1      01-JAN-07   301
2      01-JAN-08   200
The YEAR column is a date and DAYNO is a numeric value that represents a day of the year
(as in 365 for 31-Dec). Convert DAYNO to the corresponding year's month and date and then
send it to the target.
Target
E_NO   YEAR_MONTH_DAY
----   --------------
1      29-OCT-07
2      19-JUL-08
Solution:
Use the expression below in an Expression transformation:
ADD_TO_DATE(YEAR, 'DD', DAYNO)

Posted 22nd December 2011 by Prafull Dangore



How to delete duplicate rows in a table?


Scenario:
How to delete duplicate rows in a table?
Solution:
delete from emp a where rowid != (select max(rowid) from emp b where
a.empno=b.empno);
OR
delete from emp a where rowid != (select min(rowid) from emp b where
a.empno=b.empno);

Posted 22nd December 2011 by Prafull Dangore


How to get nth max salaries ?


Scenario:
How to get nth max salaries ?
Solution:
select distinct sal from emp a where &n = (select count(distinct sal) from emp b
where a.sal <= b.sal);

Posted 22nd December 2011 by Prafull Dangore



How to get 3 Max & Min salaries?


Scenario:
How to get 3 Max & Min salaries?
Solution:
Max - select distinct sal from emp a where 3 >= (select count(distinct sal) from emp
b where a.sal <= b.sal) order by a.sal desc;
Min - select distinct sal from emp a where 3 >= (select count(distinct sal) from emp
b where a.sal >= b.sal);

Posted 22nd December 2011 by Prafull Dangore



Find FIRST & LAST n records from a table.


Scenario:
Find FIRST & LAST n records from a table.
Solution:
First - select * from emp where rownum <= &n;
Last - select * from emp minus select * from emp where rownum <= (select
count(*) - &n from emp);

Posted 22nd December 2011 by Prafull Dangore



Find the 3rd MAX & MIN salary in the emp table
Scenario:
Find the 3rd MAX & MIN salary in the emp table
Solution:
Max - select distinct sal from emp e1 where 3 =
(select count(distinct sal) from emp e2 where e1.sal <= e2.sal);
Min - select distinct sal from emp e1 where 3 =
(select count(distinct sal) from emp e2 where e1.sal >= e2.sal);

Posted 22nd December 2011 by Prafull Dangore



SQL query to find EVEN & ODD NUMBERED records from a table.
Scenario:
Sql query to find EVEN & ODD NUMBERED records from a table.
Solution:
Even - select * from emp where rowid in (select
decode(mod(rownum,2),0,rowid, null) from emp);
Odd - select * from emp where rowid in (select decode(mod(rownum,2),0,null
,rowid) from emp);

Posted 22nd December 2011 by Prafull Dangore



SQL questions which are the most frequently asked in interviews.
Complex Queries in SQL ( Oracle )

To fetch ALTERNATE records from a table. (EVEN NUMBERED)


select * from emp where rowid in (select decode(mod(rownum,2),0,rowid, null) from
emp);
To select ALTERNATE records from a table. (ODD NUMBERED)
select * from emp where rowid in (select decode(mod(rownum,2),0,null ,rowid) from
emp);
Find the 3rd MAX salary in the emp table.
select distinct sal from emp e1 where 3 = (select count(distinct sal) from emp e2
where e1.sal <= e2.sal);
Find the 3rd MIN salary in the emp table.
select distinct sal from emp e1 where 3 = (select count(distinct sal) from emp
e2where e1.sal >= e2.sal);
Select FIRST n records from a table.
select * from emp where rownum <= &n;
Select LAST n records from a table
select * from emp minus select * from emp where rownum <= (select count(*) - &n
from emp);
List dept no., Dept name for all the departments in which there are no
employees in the department.
select * from dept where deptno not in (select deptno from emp);
alternate solution: select * from dept a where not exists (select * from emp b where
a.deptno = b.deptno);
alternate solution: select empno,ename,b.deptno,dname from emp a, dept b
where a.deptno(+) = b.deptno and empno is null;
How to get 3 Max salaries ?
select distinct sal from emp a where 3 >= (select count(distinct sal) from emp b
where a.sal <= b.sal) order by a.sal desc;
How to get 3 Min salaries ?
select distinct sal from emp a where 3 >= (select count(distinct sal) from emp b
where a.sal >= b.sal);
How to get nth max salaries ?
select distinct sal from emp a where &n = (select count(distinct sal) from emp b
where a.sal <= b.sal);
Select DISTINCT RECORDS from emp table.
select * from emp a where rowid = (select max(rowid) from emp b where
a.empno=b.empno);
How to delete duplicate rows in a table?
delete from emp a where rowid != (select max(rowid) from emp b where
a.empno=b.empno);
Count of number of employees in department wise.
select count(EMPNO), b.deptno, dname from emp a, dept b where
a.deptno(+)=b.deptno group by b.deptno,dname;
Suppose there is annual salary information provided by emp table. How
to fetch monthly salary of each and every employee?
select ename,sal/12 as monthlysal from emp;
Select all records from emp table where deptno = 10 or 40.
select * from emp where deptno=10 or deptno=40;

Select all record from emp table where deptno=30 and sal>1500.
select * from emp where deptno=30 and sal>1500;
Select all record from emp where job not in SALESMAN or CLERK.
select * from emp where job not in ('SALESMAN','CLERK');
Select all records from emp where ename in 'BLAKE','SCOTT','KING' and 'FORD'.
select * from emp where ename in ('BLAKE','SCOTT','KING','FORD');
Select all records where ename starts with S and its length is 6 characters.
select * from emp where ename like 'S_____';
Select all records where ename may be any no of character but it should
end with R.
select * from emp where ename like'%R';
Count MGR and their salary in emp table.
select count(MGR),count(sal) from emp;
In emp table add comm+sal as total sal .
select ename,(sal+nvl(comm,0)) as totalsal from emp;
Select any salary <3000 from emp table.
select * from emp where sal> any(select sal from emp where sal<3000);
Select all salary <3000 from emp table.
select * from emp where sal> all(select sal from emp where sal<3000);
Select all the employee group by deptno and sal in descending order.
select ename,deptno,sal from emp order by deptno,sal desc;
How can I create an empty table emp1 with same structure as emp?
Create table emp1 as select * from emp where 1=2;
How to retrieve records where sal is between 1000 and 2000?
Select * from emp where sal >= 1000 and sal <= 2000;
Select all records where dept no of both emp and dept table matches.
select * from emp where exists(select * from dept where emp.deptno=dept.deptno)
If there are two tables emp1 and emp2, and both have common records.
How can I fetch all the records but the common records only once?
(Select * from emp) Union (Select * from emp1)
How to fetch only common records from two tables emp and emp1?
(Select * from emp) Intersect (Select * from emp1)
How can I retrieve all records of emp1 that are not present in emp2?
(Select * from emp) Minus (Select * from emp1)
Count the total salary deptno wise where more than 2 employees exist.
SELECT deptno, sum(sal) As totalsal
FROM emp
GROUP BY deptno
HAVING COUNT(empno) > 2

Posted 22nd December 2011 by Prafull Dangore



Complex Queries in SQL ( Oracle )


Complex Queries in SQL ( Oracle )
To fetch ALTERNATE records from a table. (EVEN NUMBERED)
select * from emp where rowid in (select decode(mod(rownum,2),0,rowid, null) from
emp);
To select ALTERNATE records from a table. (ODD NUMBERED)
select * from emp where rowid in (select decode(mod(rownum,2),0,null ,rowid) from
emp);
Find the 3rd MAX salary in the emp table.
select distinct sal from emp e1 where 3 = (select count(distinct sal) from emp e2
where e1.sal <= e2.sal);
Find the 3rd MIN salary in the emp table.
select distinct sal from emp e1 where 3 = (select count(distinct sal) from emp
e2where e1.sal >= e2.sal);
Select FIRST n records from a table.
select * from emp where rownum <= &n;
Select LAST n records from a table
select * from emp minus select * from emp where rownum <= (select count(*) - &n
from emp);
List dept no., Dept name for all the departments in which there are no
employees in the department.
select * from dept where deptno not in (select deptno from emp);
alternate solution: select * from dept a where not exists (select * from emp b where
a.deptno = b.deptno);
alternate solution: select empno,ename,b.deptno,dname from emp a, dept b
where a.deptno(+) = b.deptno and empno is null;
How to get 3 Max salaries ?
select distinct sal from emp a where 3 >= (select count(distinct sal) from emp b
where a.sal <= b.sal) order by a.sal desc;
How to get 3 Min salaries ?
select distinct sal from emp a where 3 >= (select count(distinct sal) from emp b
where a.sal >= b.sal);
How to get nth max salaries ?
select distinct sal from emp a where &n = (select count(distinct sal) from emp b
where a.sal <= b.sal);
Select DISTINCT RECORDS from emp table.
select * from emp a where rowid = (select max(rowid) from emp b where
a.empno=b.empno);
How to delete duplicate rows in a table?
delete from emp a where rowid != (select max(rowid) from emp b where
a.empno=b.empno);

Count of number of employees in department wise.


select count(EMPNO), b.deptno, dname from emp a, dept b where
a.deptno(+)=b.deptno group by b.deptno,dname;
Suppose there is annual salary information provided by emp table. How
to fetch monthly salary of each and every employee?
select ename,sal/12 as monthlysal from emp;
Select all records from emp table where deptno = 10 or 40.
select * from emp where deptno=10 or deptno=40;
Select all record from emp table where deptno=30 and sal>1500.
select * from emp where deptno=30 and sal>1500;
Select all record from emp where job not in SALESMAN or CLERK.
select * from emp where job not in ('SALESMAN','CLERK');
Select all records from emp where ename in 'BLAKE','SCOTT','KING' and 'FORD'.
select * from emp where ename in ('BLAKE','SCOTT','KING','FORD');
Select all records where ename starts with S and its length is 6 characters.
select * from emp where ename like 'S_____';
Select all records where ename may be any no of character but it should
end with R.
select * from emp where ename like'%R';
Count MGR and their salary in emp table.
select count(MGR),count(sal) from emp;
In emp table add comm+sal as total sal .
select ename,(sal+nvl(comm,0)) as totalsal from emp;
Select any salary <3000 from emp table.
select * from emp where sal> any(select sal from emp where sal<3000);
Select all salary <3000 from emp table.
select * from emp where sal> all(select sal from emp where sal<3000);
Select all the employee group by deptno and sal in descending order.
select ename,deptno,sal from emp order by deptno,sal desc;
How can I create an empty table emp1 with same structure as emp?
Create table emp1 as select * from emp where 1=2;
How to retrieve records where sal is between 1000 and 2000?
Select * from emp where sal >= 1000 and sal <= 2000;
Select all records where dept no of both emp and dept table matches.
select * from emp where exists(select * from dept where emp.deptno=dept.deptno)
If there are two tables emp1 and emp2, and both have common records.
How can I fetch all the records but the common records only once?
(Select * from emp) Union (Select * from emp1)
How to fetch only common records from two tables emp and emp1?
(Select * from emp) Intersect (Select * from emp1)
How can I retrieve all records of emp1 that are not present in emp2?
(Select * from emp) Minus (Select * from emp1)
Count the total salary deptno wise where more than 2 employees exist.
SELECT deptno, sum(sal) As totalsal
FROM emp
GROUP BY deptno
HAVING COUNT(empno) > 2

Posted 22nd December 2011 by Prafull Dangore



Informatica Quiz: Set 2


Quiz: Informatica Set 2

A lookup transformation is used to look up data in


Explanation:
flat file

Relational table

view

synonyms

All of the above (correct)

Which value returned by the NewLookupRow port says that the Integration Service does not update or
insert the row in the cache?
Explanation:
3 (wrong)

Which one needs a common key to join?


Explanation:
source qualifier

joiner (correct)

look up

Which one supports a heterogeneous join?

Explanation:
source qualifier

joiner (correct)

look up

What is the use of target load order?


Explanation:
Target load order: first the data is loaded into the dimension tables and then into the fact table.

Target load order: first the data is loaded into the fact table and then into the dimension tables.

Load the data into different targets at the same time. (wrong)

Which one is not a tracing level?


Explanation:
terse

verbose

initialization

verbose initialization

terse initialization (correct)

Which output file is not created during session running?


Explanation:
Session log

workflow log

Error log

Bad files

cache files (correct)

Is the Fact table normalised?


Explanation:
yes

no (correct)

Which value returned by the NewLookupRow port says that the Integration Service inserts the row into
the cache?
Explanation:

0 (wrong)

Which transformation only works on relational source?


Explanation:
lookup

Union

joiner

Sql (correct)

Which are both connected and unconnected?


Explanation:
External Stored Procedure (omitted)

Stored Procedure (correct)

Lookup (correct)

Advanced External Procedure Transformation

Can we generate alpha-numeric value in sequence generator?


Explanation:
yes

no (correct)

Which transformation is used by cobol source?


Explanation:
Advanced External Procedure Transformation

Cobol Transformation

Unstructured Data Transformation

Normalizer (correct)

What is VSAM normalizer transformation?


Explanation:
The VSAM normalizer transformation is the source qualifier transformation for a
COBOL source definition.

The VSAM normalizer transformation is the source qualifier transformation for a flat file
source definition.
The VSAM normalizer transformation is the source qualifier transformation for an XML
source definition. (wrong)
None of these


Posted 22nd December 2011 by Prafull Dangore



Informatica Quiz: Set 1


Quiz: Informatica Set 1

Which one is not correct about the Filter transformation?


Explanation: A Filter generally evaluates a single condition; for multiple conditions we can use a Router.
Acts as a 'where' condition

Can't pass multiple conditions

Acts like 'Case' in PL/SQL (wrong)

If a record does not match the condition, the record is blocked

Can we calculate in an Aggregator?


Explanation:

No

Yes (correct)

Which one is not a type of fact?


Explanation:
Semi-additive

Additive

Conformed fact

Non-additive (wrong)

Which one is not a type of dimension ?


Explanation:
Conformed dimension

Rapidly changing dimension (correct)

Junk dimension

Degenerated dimension

Which of these is not correct about a Code Page?


Explanation:
A code page contains encoding to specify characters in a set of one or more languages

A code page contains decoding to specify characters in a set of one or more languages

In this way application stores, receives, and sends character data.

None of these (wrong)

What is a mapplet?
Explanation:
Combination of reusable transformation.

Combination of reusable mapping

Set of transformations and it allows us to reuse (correct)

None of these

What does reusable transformation mean?


Explanation:
It can be re-used across repositories

It can only be used in a mapplet.

It can be used in multiple mappings only once

It can be used in multiple mappings multiple times (correct)

Which one is not an option in update strategy?

Explanation:
dd_reject

4 (correct)

dd_delete

Can we update records without using update strategy?


Explanation:
Yes (correct)

No

How to select distinct records from the Source Qualifier?


Explanation:
Choose 'non duplicate' option

Choose 'select distinct' option (correct)

Choose 'Select non duplicate'

What type of repository is not available in Informatica Repository Manager?


Explanation:
Standalone Repository

Local Repository

User Defined

Versioned Repository

Manual Repository (wrong)

Joiner does not support flat file.


Explanation:
False (correct)

True

How to execute a PL/SQL script from an Informatica mapping?


Explanation:
Lookup

Stored Procedure (correct)

Expression

None of these

NetSal = basic + hra. In which transformation can we achieve this?


Explanation:

Aggregator

Lookup

Filter

Expression (correct)

Which one is not an active transformation?


Explanation:
Sequence generator

Normalizer

Sql

Stored Procedure (wrong)

Posted 22nd December 2011 by Prafull Dangore



How large is the database, used and free space?
Scenario:
How large is the database, and how much space is used and free?
Solution:
select round(sum(used.bytes) / 1024 / 1024 / 1024) || ' GB' "Database Size"
,      round(sum(used.bytes) / 1024 / 1024 / 1024) -
       round(free.p / 1024 / 1024 / 1024) || ' GB' "Used space"
,      round(free.p / 1024 / 1024 / 1024) || ' GB' "Free space"
from   (select bytes
        from   v$datafile
        union all
        select bytes
        from   v$tempfile
        union all
        select bytes
        from   v$log) used
,      (select sum(bytes) as p
        from   dba_free_space) free
group by free.p;

Posted 22nd December 2011 by Prafull Dangore



Batch File to Append Date to file name


Scenario:
Batch File to Append Date to file name
Solution:
@echo off
REM Create a log file with the current date and time in the filename.
REM The ~4 in the date skips the first four characters of the echoed date stamp
REM and writes the remainder, and so on.
set LOG_FILE_NAME=Example_File_Name.%date:~4,2%%date:~7,2%%date:~10,4%.%time:~0,2%%time:~3,2%%time:~6,2%.txt
Echo This is much easier in UNIX > c:\temp\%LOG_FILE_NAME%
:exit
OR
@echo off
for /F "tokens=2,3,4 delims=/ " %%i in ('date/t') do set y=%%k
for /F "tokens=2,3,4 delims=/ " %%i in ('date/t') do set d=%%k%%i%%j
for /F "tokens=5-8 delims=:. " %%i in ('echo.^| time ^| find "current" ') do set t=%%i%%j
set t=%t%_
if "%t:~3,1%"=="_" set t=0%t%
set t=%t:~0,4%
set "theFilename=%d%%t%"
echo %theFilename%

Posted 20th December 2011 by Prafull Dangore



PL/SQL Interview Questions


1. What is PL/SQL ?
PL/SQL is Oracle's Procedural Language extension to SQL. PL/SQL's language syntax, structure and
datatypes are similar to those of ADA. The language includes object-oriented programming techniques
such as encapsulation, function overloading, information hiding (all but inheritance), and so brings
state-of-the-art programming to the Oracle database server and a variety of Oracle tools.
PL/SQL is a block-structured programming language. It combines data manipulation and data processing
power. It supports all SQL data types and also has its own data types, i.e. BOOLEAN and BINARY_INTEGER.
2. What is the basic structure of PL/SQL ?
A PL/SQL block has three parts:
a declarative part,
an executable part,
and an exception-handling part.

First comes the declarative part, in which items can be declared. Once declared, items can be
manipulated in the executable part. Exceptions raised during execution can be dealt with in the
exception-handling part.
3. What are the components of a PL/SQL block ?

A PL/SQL block contains:
Declare : Optional
   Variable declarations
Begin : Mandatory
   Procedural statements
Exception : Optional
   Any errors to be trapped
End : Mandatory
5. What are the datatypes available in PL/SQL ?
The following datatypes are supported in Oracle PL/SQL:
Scalar Types
BINARY_INTEGER
DEC
DECIMAL
DOUBLE PRECISION
FLOAT
INT
INTEGER
NATURAL
NATURALN

NUMBER
NUMERIC
PLS_INTEGER
POSITIVE
POSITIVEN
REAL
SIGNTYPE
SMALLINT
CHAR
CHARACTER
LONG
LONG RAW
NCHAR
NVARCHAR2
RAW
ROWID
STRING
UROWID
VARCHAR
VARCHAR2
DATE
INTERVAL DAY TO SECOND
INTERVAL YEAR TO MONTH
TIMESTAMP
TIMESTAMP WITH LOCAL TIME ZONE
TIMESTAMP WITH TIME ZONE
BOOLEAN
Composite Types
RECORD
TABLE
VARRAY
LOB Types
BFILE
BLOB
CLOB
NCLOB
Reference Types
REF CURSOR
REF object_type
6. What are %TYPE and %ROWTYPE ? What are the advantages of using these over datatypes?
%TYPE provides the data type of a variable or a database column to that variable.
%ROWTYPE provides the record type that represents an entire row of a table or view, or the columns
selected in the cursor.

The advantages are:
i. You need not know the variable's data type.
ii. If the database definition of a column in a table changes, the data type of the variable changes
accordingly.

The advantage is that if one changes the type or size of the column in the table, it will be reflected in our
program unit without making any change.
%TYPE is used to refer to the column's datatype, whereas %ROWTYPE is used to refer to the whole
record in a table.
7. What is the difference between %ROWTYPE and TYPE RECORD ?
%ROWTYPE is to be used whenever the query returns an entire row of a table or view.
TYPE rec RECORD is to be used whenever the query returns columns of different tables or views and
variables.
E.g. TYPE r_emp is RECORD (eno emp.empno%TYPE, ename emp.ename%TYPE);
e_rec emp%ROWTYPE;
CURSOR c1 IS SELECT empno, deptno FROM emp;
e_rec c1%ROWTYPE;
8. What is PL/SQL table ?
A PL/SQL table is a one-dimensional, unbounded, sparse collection of homogenous elements, indexed
by integers
One-dimensional
A PL/SQL table can have only one column. It is, in this way, similar to a one-dimensional array.
Unbounded or Unconstrained
There is no predefined limit to the number of rows in a PL/SQL table. The PL/SQL table grows
dynamically as you add more rows to the table. The PL/SQL table is, in this way, very different from
an array.
Related to this definition, no rows for PL/SQL tables are allocated for this structure when it is defined.
Sparse
In a PL/SQL table, a row exists in the table only when a value is assigned to that row. Rows do not
have to be defined sequentially. Instead you can assign a value to any row in the table. So row 15
could have a value of `Fox' and row 15446 a value of `Red', with no other rows defined in between.
Homogeneous elements
Because a PL/SQL table can have only a single column, all rows in a PL/SQL table contain values of the
same datatype. It is, therefore, homogeneous.
With PL/SQL Release 2.3, you can have PL/SQL tables of records. The resulting table is still, however,
homogeneous. Each row simply contains the same set of columns.
Indexed by integers
PL/SQL tables currently support a single indexing mode: by BINARY_INTEGER. This number acts as
the "primary key" of the PL/SQL table. The range of a BINARY_INTEGER is from -2^31-1 to 2^31-1, so you
have an awful lot of rows with which to work.
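A minimal sketch of a PL/SQL (index-by) table matching the description above:

DECLARE
   TYPE name_tab_t IS TABLE OF VARCHAR2(30) INDEX BY BINARY_INTEGER;
   names name_tab_t;
BEGIN
   names(15)    := 'Fox';
   names(15446) := 'Red';   -- rows need not be sequential (sparse)
   DBMS_OUTPUT.put_line(names(15));
END;
/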
9. What is a cursor ? Why is a cursor required ?
A cursor is a named private SQL area from which information can be accessed. Cursors are required
to process rows individually for queries returning multiple rows.
10. Explain the two types of Cursors ?

Implicit cursor: an implicit cursor is a type of cursor which is automatically maintained by the
Oracle server itself. An implicit cursor returns only one row.
Explicit cursor: an explicit cursor is defined by the programmer, and it has four
phases: declare, open, fetch and close. An explicit cursor can return more than one row.
11. What are the PL/SQL Statements used in cursor processing ?
DECLARE CURSOR cursor name, OPEN cursor name, FETCH cursor name INTO or Record types,
CLOSE cursor name.
12. What are the cursor attributes used in PL/SQL ?
%ISOPEN - to check whether the cursor is open or not.
%ROWCOUNT - number of rows fetched/updated/deleted.
%FOUND - to check whether the cursor has fetched any row. True if rows are fetched.
%NOTFOUND - to check whether the cursor has fetched any row. True if no rows are fetched.

These attributes are prefixed with SQL for implicit cursors and with the cursor name for explicit
cursors.
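A minimal sketch using the implicit cursor attribute SQL%ROWCOUNT (the table and column names are illustrative):

BEGIN
   UPDATE emp SET sal = sal * 1.1 WHERE deptno = 10;
   DBMS_OUTPUT.put_line(SQL%ROWCOUNT || ' rows updated');
END;
/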
13. What is a cursor FOR loop ?
A cursor FOR loop implicitly declares its loop index as a %ROWTYPE record, opens the cursor, fetches
rows of values from the active set into the fields in the record, and closes the cursor when all the
records have been processed.
e.g. FOR emp_rec IN c1 LOOP
        salary_total := salary_total + emp_rec.sal;
     END LOOP;
A cursor FOR loop automatically performs the open, fetch and close.
15. Explain the usage of WHERE CURRENT OF clause in cursors ?
PL/SQL provides the WHERE CURRENT OF clause for both UPDATE and DELETE statements inside a
cursor in order to allow you to easily make changes to the most recently fetched row of data.
The general format for the WHERE CURRENT OF clause is as follows:

UPDATE table_name
   SET set_clause
 WHERE CURRENT OF cursor_name;

DELETE FROM table_name
 WHERE CURRENT OF cursor_name;

Notice that the WHERE CURRENT OF clause references the cursor and not the record into which the
next fetched row is deposited.
The most important advantage to using WHERE CURRENT OF where you need to change the row
fetched last is that you do not have to code in two (or more) places the criteria used to uniquely
identify a row in a table. Without WHERE CURRENT OF, you would need to repeat the WHERE clause of
your cursor in the WHERE clause of the associated UPDATEs and DELETEs. As a result, if the table
structure changes in a way that affects the construction of the primary key, you have to make sure
that each SQL statement is upgraded to support this change. If you use WHERE CURRENT OF, on the
other hand, you only have to modify the WHERE clause of the SELECT statement.
This might seem like a relatively minor issue, but it is one of many areas in your code where you can
leverage subtle features in PL/SQL to minimize code redundancies. Utilization of WHERE CURRENT OF,
%TYPE, and %ROWTYPE declaration attributes, cursor FOR loops, local modularization, and other
PL/SQL language constructs can have a big impact on reducing the pain you may experience when you
maintain your Oracle-based applications.
Let's see how this clause would improve the previous example. In the jobs cursor FOR loop above, I
want to UPDATE the record that was currently FETCHed by the cursor. I do this in the UPDATE
statement by repeating the same WHERE used in the cursor because (task, year) makes up the
primary key of this table:

WHERE task = job_rec.task

AND year = TO_CHAR (SYSDATE, 'YYYY');

This is a less than ideal situation, as explained above: I have coded the same logic in two places, and
this code must be kept synchronized. It would be so much more convenient and natural to be able to
code the equivalent of the following statements:
Delete the record I just fetched.
or:
Update these columns in that row I just fetched.
A perfect fit for WHERE CURRENT OF! The next version of my winterization program below uses this
clause. I have also switched to a simple loop from FOR loop because I want to exit conditionally from
the loop:

DECLARE
   CURSOR fall_jobs_cur IS SELECT ... same as before ... ;
   job_rec fall_jobs_cur%ROWTYPE;
BEGIN
   OPEN fall_jobs_cur;
   LOOP
      FETCH fall_jobs_cur INTO job_rec;
      IF fall_jobs_cur%NOTFOUND
      THEN
         EXIT;
      ELSIF job_rec.do_it_yourself_flag = 'YOUCANDOIT'
      THEN
         UPDATE winterize
            SET responsible = 'STEVEN'
          WHERE CURRENT OF fall_jobs_cur;
         COMMIT;
         EXIT;
      END IF;
   END LOOP;
   CLOSE fall_jobs_cur;
END;

16. What is a database trigger ? Name some usages of database trigger ?


A database trigger is a stored procedure that is invoked automatically when a predefined event
occurs.
Database triggers enable DBA's (Data Base Administrators) to create additional relationships
between separate databases.
For example, the modification of a record in one database could trigger the modification of a
record in a second database.
17. How many types of database triggers can be specified on a table ? What are they ?

                    Insert    Update    Delete
Before Row          o.k.      o.k.      o.k.
After Row           o.k.      o.k.      o.k.
Before Statement    o.k.      o.k.      o.k.
After Statement     o.k.      o.k.      o.k.

If the FOR EACH ROW clause is specified, then the trigger fires once for each row affected by the statement.
If a WHEN clause is specified, the trigger fires according to the returned Boolean value.
The different types of triggers: * Row Triggers and Statement Triggers * BEFORE and AFTER Triggers *
INSTEAD OF Triggers * Triggers on System Events and User Events
18. What are the two virtual tables available during database trigger execution ?
The table columns are referred to as OLD.column_name and NEW.column_name.
For triggers related to INSERT, only the NEW.column_name values are available.
For triggers related to UPDATE, both the OLD.column_name and NEW.column_name values are available.
For triggers related to DELETE, only the OLD.column_name values are available.
The two virtual tables available are OLD and NEW.
19.What happens if a procedure that updates a column of table X is called in a database trigger of the same table ?

To avoid the mutating table error, the procedure should be declared as an AUTONOMOUS
TRANSACTION.
By this the procedure will be treated as a separate identity.
20. Write the order of precedence for validation of a column in a table ?
i. Done using database triggers.
ii. Done using integrity constraints.

21. What is an Exception ? What are the types of Exceptions ?

Predefined
Do not declare them; the Oracle server raises them implicitly, e.g.:
NO_DATA_FOUND
TOO_MANY_ROWS
INVALID_CURSOR
ZERO_DIVIDE
Handled with: WHEN exception_name THEN ...

Non-predefined
Declare within the declarative section and allow the Oracle server to raise them implicitly.
SQLCODE returns the numeric value of the error code.
SQLERRM returns the message associated with the error number.
DECLARE -- PRAGMA EXCEPTION_INIT (exception, error_number)
Handled with: WHEN exception_name THEN ...

User defined
Declare within the declarative section and raise explicitly:
IF condition THEN
   RAISE exception_name; -- or RAISE_APPLICATION_ERROR

22. What is Pragma EXCEPTION_INIT ? Explain the usage ?

Pragma EXCEPTION_INIT allows you to handle an Oracle predefined message with your own
message, meaning you can instruct the compiler to associate a specific Oracle error number with your
named exception at compile time. This way you improve the readability of your program and handle it
according to your own way.
It should be declared in the DECLARE section.
Example:
declare
   salary number;
   FOUND_NOTHING exception;
   Pragma exception_init(FOUND_NOTHING, 100);
begin
   select sal into salary from emp where ename = 'ANURAG';
   dbms_output.put_line(salary);
exception
   WHEN FOUND_NOTHING THEN
      dbms_output.put_line(SQLERRM);
end;
23. What is RAISE_APPLICATION_ERROR ?
RAISE_APPLICATION_ERROR is used to create your own error messages, which can be more
descriptive than named exceptions.
The syntax is: RAISE_APPLICATION_ERROR(error_number, error_message);
where error_number is between -20000 and -20999.
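A minimal sketch (the variable and error number are illustrative):

IF v_salary < 0 THEN
   RAISE_APPLICATION_ERROR(-20001, 'Salary cannot be negative');
END IF;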
24. What are the return values of functions SQLCODE and SQLERRM ?
PL/SQL provides error information via two built-in functions, SQLCODE and SQLERRM.
SQLCODE returns the current error code; for a user-defined exception it returns 1.
SQLERRM returns the current error message text; for a user-defined exception it returns
"User Defined Exception".
25. Where are the predefined exceptions stored ?
PL/SQL declares predefined exceptions in the STANDARD package.
26. What is a stored procedure ?
A stored procedure is a PL/SQL subprogram stored in the database.
Stored Procedure
A program running in the database that can take complex actions based on the inputs you send it. Using
a stored procedure is faster than doing the same work on a client, because the program runs right
inside the database server. Stored procedures are normally written in PL/SQL or Java.

Advantages of stored procedures:
extensibility, modularity, reusability, maintainability and one-time compilation.
28. What are the modes of parameters that can be passed to a procedure ?
1. IN: the IN parameter mode is used to pass values to the subprogram when it is invoked.
2. OUT: OUT is used to return values to the callers of subprograms.
3. IN OUT: it is used to both pass a value in and return a value out.
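A minimal sketch showing the three modes (the procedure name and emp table/columns are assumptions, based on the emp examples used elsewhere in this post):

CREATE OR REPLACE PROCEDURE adjust_salary
(
   p_empno   IN     NUMBER,   -- IN: value passed to the procedure
   p_new_sal OUT    NUMBER,   -- OUT: value returned to the caller
   p_raise   IN OUT NUMBER    -- IN OUT: passed in, possibly modified, and returned
)
IS
BEGIN
   p_raise := NVL(p_raise, 0) * 2;
   UPDATE emp
      SET sal = sal + p_raise
    WHERE empno = p_empno
   RETURNING sal INTO p_new_sal;
END adjust_salary;
/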
29. What are the two parts of a procedure ?
PROCEDURE name (parameter list.....)
is
local variable declarations
BEGIN
Executable statements.
Exception.
exception handlers
end;
31. Give the structure of the function ?
FUNCTION name (argument list .....) Return datatype is
local variable declarations
Begin
executable statements
Exception
execution handlers
End;
32. Explain how procedures and functions are called in a PL/SQL block ?

Procedures can be called in the following ways:
a) CALL <procedure name> directly
b) EXECUTE <procedure name> from the calling environment
c) <procedure name> from other procedures, functions or packages
Functions can be called in the following ways:
a) EXECUTE <function name> from the calling environment. Always use a variable to get the
return value.
b) As part of an SQL/PL SQL expression
33. What are two parts of package ?
The two parts of package are PACKAGE SPECIFICATION & PACKAGE BODY.
Package Specification contains declarations that are global to the packages and local to the schema.
Package Body contains actual procedures and local declaration of the procedures and cursor
declarations.
33.What is difference between a Cursor declared in a procedure and Cursor declared in a package specification ?
A cursor declared in a package specification is global and can be accessed by other procedures or
procedures in a package.

A cursor declared in a procedure is local to the procedure that can not be accessed by other
procedures.

The scope of A cursor declared in a procedure is limited to that procedure only.


The Scope of cursor declared in a package specification is global .
Example:
create or replace package curpack is
cursor c1 is select * from emp;
end curpack;
This will create a package Now You can use this cursor any where.
Like:
set serveroutput on
begin
for r1 in curpack.c1 loop
dbms_output.put_line(r1.empno||' '||r1.ename);
end loop;
end;
This will display all empnos and enames.
It will be better to use a ref cursor in packages.
35. How are packaged procedures and functions called from the following?
a. Stored procedure or anonymous block
b. An application program such as PRO*C, PRO*COBOL
c. SQL*PLUS

a. PACKAGE NAME.PROCEDURE NAME (parameters);
   variable := PACKAGE NAME.FUNCTION NAME (arguments);
b. EXEC SQL EXECUTE
   BEGIN
      PACKAGE NAME.PROCEDURE NAME (parameters);
      variable := PACKAGE NAME.FUNCTION NAME (arguments);
   END;
   END-EXEC;
c. EXECUTE PACKAGE NAME.PROCEDURE if the procedure does not have any
   out/in-out parameters. A function cannot be called this way.
36.Name the tables where characteristics of Package, procedure and functions are stored ?
The Data dictionary tables/ Views where the characteristics of subprograms and Packages are stored
are mentioned below
a) USER_OBJECTS, ALL_OBJECTS, DBA_OBJECTS
b) USER_SOURCE, ALL_SOURCE, DBA_SOURCE
c) USER_DEPENDENCIES
d) USER_ERRORS, ALL_ERRORS, DBA_ERRORS
37. What is Overloading of procedures ?

Overloaded procedures are two or more procedures with the same name but different argument lists.
The arguments must differ by type family (class), not just by subtype; for example, CHAR and VARCHAR2 belong to the same family, so they do not distinguish overloads.
Packages
The main advantages of packages are:
1. Since a package has its specification and body stored separately, whenever DDL is run on an object
that a packaged procedure/function depends on, only the package body is invalidated, not the
specification. So other procedures/functions that depend on the package do not get invalidated.
2. Whenever any function/procedure from a package is called, the whole package is loaded into memory,
so all objects of the package are available in memory, which means faster execution when any of them
is called. Since we put all related procedures/functions in one package, this is useful because we
typically need to run most of its objects.
3. We can declare global variables in the package.
38. Is it possible to use transaction control statements such as ROLLBACK or COMMIT in a database trigger? Why?
Autonomous Transaction is a feature introduced in Oracle 8i. An autonomous transaction maintains its
own transaction state and commits or rolls back independently, without being affected by the COMMIT
or ROLLBACK of the surrounding transaction.
Here is a simple example to understand this (first without the pragma): ora816 SamSQL :> declare
2 Procedure InsertInTest_Table_B
3 is
4 BEGIN
5 INSERT into Test_Table_B(x) values (1);
6 Commit;
7 END ;
8 BEGIN
9 INSERT INTO Test_Table_A(x) values (123);
10 InsertInTest_Table_B;
11 Rollback;
12 END;
13 / PL/SQL procedure successfully completed.
ora816 SamSQL :> Select * from Test_Table_A; X---------- 123
ora816 SamSQL :> Select * from Test_Table_B; X---------- 1
Notice in the above PL/SQL that the COMMIT at line no 6 commits the transactions at
line no 5 and line no 9. The ROLLBACK at line no 11 actually did nothing.
A COMMIT/ROLLBACK issued in a nested call commits/rolls back all other DML
done in the transaction before it. PRAGMA AUTONOMOUS_TRANSACTION overrides this behavior.
Let us see the same example with PRAGMA AUTONOMOUS_TRANSACTION.
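Below is a sketch of that example with the pragma added (the original post did not include the code, so this is an illustrative reconstruction assuming the same Test_Table_A and Test_Table_B tables as above):
declare
   procedure InsertInTest_Table_B
   is
      pragma autonomous_transaction;          -- runs in its own transaction
   begin
      insert into Test_Table_B(x) values (1);
      commit;                                 -- commits only the autonomous insert
   end;
begin
   insert into Test_Table_A(x) values (123);
   InsertInTest_Table_B;
   rollback;                                  -- now rolls back only the insert into Test_Table_A
end;
/
-- Test_Table_B keeps its row, while the insert into Test_Table_A is undone.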
39. What is difference between a PROCEDURE & FUNCTION ?

A function always returns a value through RETURN (and may additionally return values through
parameters), while a procedure returns values only through its parameters.
A function can be called directly from a SQL statement, e.g. SELECT func_name FROM dual, while
a procedure cannot.
40. What is Data Concurrency and Consistency?

Concurrency
How well can multiple sessions access the same data simultaneously

Consistency
How consistent is the view of the data between and within multiple sessions, transactions or
statements
41. Talk about "Exception Handling" in PL/SQL?

The exception section is written to handle the exceptions thrown by the program.

We have user-defined and system exceptions.
User-defined exceptions are exception names given by the user (explicitly declared and raised) to
handle specific behaviour of the program.
System exceptions are raised automatically due to invalid data (you don't have to declare these); a few
examples are NO_DATA_FOUND, OTHERS, etc.
44. Can we use commit or rollback command in the exception part of PL/SQL block?
Yes, we can use the TCL commands(commit/rollback) in the exception block of a stored
procedure/function. The code in this part of the program gets executed like those in the body without
any restriction. You can include any business functionality whenever a condition in main block(body of
a proc/func) fails and requires a follow-thru process to terminate the execution gracefully!
DECLARE
   ...
BEGIN
   ...
EXCEPTION
   WHEN NO_DATA_FOUND THEN
      INSERT INTO err_log (err_code, code_desc)
      VALUES (1403, 'No data found');
      COMMIT;
      RAISE;
END;
46. What is bulk binding please explain me in brief ?

Bulk binds (BULK COLLECT, FORALL) are a PL/SQL technique where, instead of executing multiple
individual SELECT, INSERT, UPDATE or DELETE statements to retrieve data from, or store data in, a
table, all of the operations are carried out at once, in bulk.
This avoids the context switching you get when the PL/SQL engine has to pass control over to the SQL
engine, then back to the PL/SQL engine, and so on, when you access rows one at a time. To do bulk
binds with INSERT, UPDATE and DELETE statements, you enclose the SQL statement within a PL/SQL
FORALL statement.
To do bulk binds with SELECT statements, you include the BULK COLLECT INTO a collection clause
in the SELECT statement instead of a simple INTO.
Collections, BULK COLLECT and FORALL are features of Oracle 8i, 9i and 10g PL/SQL that can really
make a difference to your PL/SQL performance.
In short, bulk binding is used to avoid the context switching between the SQL engine and the PL/SQL
engine. If we use a simple FOR loop in a PL/SQL block, it context-switches between the SQL and
PL/SQL engines for each row processed, which degrades the performance of the PL/SQL block.
To avoid this switching between the two engines we use the FORALL keyword together with PL/SQL
collection tables for the DML; FORALL is a PL/SQL keyword.
It gives good results and a performance increase.
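A minimal sketch of both techniques against the emp table (the 10% raise for department 10 is purely illustrative):
DECLARE
   TYPE num_tab IS TABLE OF emp.empno%TYPE;
   v_ids num_tab;
BEGIN
   -- one context switch to fetch all matching rows
   SELECT empno BULK COLLECT INTO v_ids
   FROM   emp
   WHERE  deptno = 10;

   -- one context switch to run the UPDATE for every element
   FORALL i IN 1 .. v_ids.COUNT
      UPDATE emp SET sal = sal * 1.1 WHERE empno = v_ids(i);
END;
/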

47. Why are functions used in Oracle? Can functions return more than one value? Why are procedures used in Oracle?
What are the disadvantages of packages? What are the global variables in packages?

Functions are used where we cannot use a procedure, i.e. we can use a function in SELECT
statements and in the WHERE clause of DELETE/UPDATE statements, but a procedure cannot be used like that.
It is true that a function returns only one value through RETURN, but a function can be made to return
more than one value by using OUT parameters or ref cursors.
There is no harm in using OUT parameters, but when functions are used inside DML statements we
cannot use OUT parameters (as per the rules).
49. What are the restrictions on Functions ?

A function called from SQL cannot contain DML statements, though we can use SELECT statements in a function.
If you create a function with DML statements, you get the message that the function is created,
but if you then use it in a SELECT statement you get an error.
50. What happens when a package is initialized ?
When a package is initialized, i.e. called for the first time in a session, the entire package is loaded into the SGA and
any variables declared in the package are initialized.
52. What is PL/SQL table?
A PL/SQL table is a datatype in the procedural language extension. It has two columns: one for the
index (a binary integer) and another for the data, and it can grow to any number of
rows (not columns) at run time.
A PL/SQL table is essentially a one-dimensional array. It is used to hold similar types of data in temporary
storage, and it is indexed by BINARY_INTEGER.
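A minimal sketch of such a table (the element type borrows emp.ename%TYPE only for illustration):
DECLARE
   TYPE ename_tab IS TABLE OF emp.ename%TYPE INDEX BY BINARY_INTEGER;  -- one-dimensional, integer-indexed
   v_names ename_tab;
BEGIN
   v_names(1) := 'SMITH';
   v_names(2) := 'ALLEN';
   dbms_output.put_line(v_names.COUNT);   -- 2
END;
/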
3. Can I write a PL/SQL block inside the exception section?
Yes, you can write a PL/SQL block inside the exception section. Suppose you want to insert the exception
detail into your error log table; you can write an INSERT INTO statement in the exception part. To
handle any exception that may be raised in the exception part itself, you can write a nested PL/SQL block
inside the exception part.
54. Can we truncate some of the rows from a table instead of truncating the full table?
You can truncate a subset of rows only if the table is partitioned: you can truncate a single partition
and keep the remaining partitions.

CREATE TABLE parttab (


state VARCHAR2(2),
sales NUMBER(10,2))
PARTITION BY LIST (state) (
PARTITION northwest VALUES ('OR', 'WA')
TABLESPACE uwdata,
PARTITION southwest VALUES ('AZ', 'CA')
TABLESPACE uwdata);
INSERT INTO parttab VALUES ('OR', 100000);
INSERT INTO parttab VALUES ('WA', 200000);
INSERT INTO parttab VALUES ('AZ', 300000);
INSERT INTO parttab VALUES ('CA', 400000);
COMMIT;
SELECT * FROM parttab;
ALTER TABLE parttab
TRUNCATE PARTITION southwest;
SELECT * FROM parttab;
56. What is the difference between a reference cursor and normal cursor ?

REF cursors are different from typical, standard cursors. With standard cursors, you know
the cursor's query ahead of time. With REF cursors, you do not have to know the query ahead of
time; you can build the cursor on the fly.
A normal cursor is a static cursor.
A reference cursor is used to create a dynamic cursor.
There are two types of ref cursors:
1. weak cursors and 2. strong cursors
TYPE ref_name IS REF CURSOR [RETURN return_type]
[return_type] is typically a %ROWTYPE;
if the return type is mentioned it is a strong cursor, otherwise it is a weak cursor.
A reference cursor does not support the FOR UPDATE clause.
A normal cursor is used to process more than one record in PL/SQL.
A ref cursor is a type that can hold a set of records which can be sent out through procedure or
function OUT variables;
we can also use a ref cursor as an IN OUT parameter. A small sketch follows.
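A small sketch of a ref cursor passed out through an OUT parameter (the procedure name get_dept_emps is made up; SYS_REFCURSOR is Oracle's built-in weak ref cursor type):
CREATE OR REPLACE PROCEDURE get_dept_emps (
   p_deptno IN  emp.deptno%TYPE,
   p_rc     OUT SYS_REFCURSOR)
IS
BEGIN
   OPEN p_rc FOR
      SELECT empno, ename FROM emp WHERE deptno = p_deptno;
END;
/
-- the caller fetches from p_rc and closes it when done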
58. Based on what conditions can we decide whether to use a table or a view or a materialized view ?
A table is the basic entity in any RDBMS, so for storing data you need a table.
For a view: if you have a complex query from which you want to extract data again and again, and it is
standard data that is also required by many other users (e.g. for report generation), then create a view.
Avoid insert/update/delete through a view unless it is essential; keep the view read-only (for showing reports).
For a materialized view: this is mainly used in data warehousing. If you have two databases
and you want the same view in both, remember that in data warehousing we deal in GB or TB of
data, so create a summary table in one database and make the replica (materialized view) in the
other database.
When to create a materialized view:
[1] if the data is bulky and you need the same data in more than one database, create a summary
table in one database and replicas in the others;
[2] if you have summary columns in the projection list of the query.
The main advantages of a materialized view over a simple view are:
[1] it stores the data in the database, whereas only a simple view's definition is saved in the database;
[2] you can create partitions or indexes on a materialized view to enhance its performance, but you
cannot on a simple view.
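A small sketch of a summary materialized view on emp (the name and refresh options are illustrative only):
CREATE MATERIALIZED VIEW dept_sal_mv
BUILD IMMEDIATE
REFRESH FORCE ON DEMAND
AS
SELECT deptno, SUM(sal) AS total_sal, COUNT(*) AS emp_cnt
FROM   emp
GROUP  BY deptno;

-- refreshed on demand with:
-- EXEC DBMS_MVIEW.REFRESH('DEPT_SAL_MV');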
59. What is the difference between all_ and user_ tables ?

An ALL_ view displays all the information accessible to the current user, including information from
the current user's schema as well as information from objects in other schemas, if the current user
has access to those objects by way of grants of privileges or roles.
While

A USER_ view displays all the information from the schema of the current user. No special
privileges are required to query these views.
The USER_TABLES data dictionary view contains all the tables created by the user in that schema,
whereas ALL_TABLES also lists tables created in other schemas; if the user has been granted access
to a table in a different schema, that table is visible through this view.
61. what is p-code and sourcecode ?
P-code is pre-compiled code stored in the shared pool of the System Global Area after the
Oracle instance is started, whereas source code is the plain code of a stored procedure, package,
trigger, function, etc., which is stored in the Oracle data dictionary. Every Oracle session that has
EXECUTE permission on those objects accesses the p-code.
Source code is stored in the USER_SOURCE data dictionary view for user-defined stored procedures,
triggers, packages and functions; DBA_SOURCE stores the source of all such objects in the database
and ALL_SOURCE stores the source of the objects accessible to the user.
Source code: the code, say a PL/SQL block, that the user types for execution. P-code: the source code
after syntax checking, parse-tree generation, semantic checking and further compilation of the parse
tree, giving the final p-code ready for data fetch or manipulation.
63. Is there any limitation on no. of triggers that can be created on a table?

There is no limit on the number of triggers on one table.

You can write as many as you want, for insert, update or delete, under different names.
For example, if a table has n columns, we can create n triggers, one based on each column.
64. What happens when a DML statement fails? A. User-level rollback B. Statement-level rollback C. System-level rollback
When a DML statement fails, Oracle performs a statement-level rollback: only the changes made by that
statement are undone, and the rest of the transaction is left intact. (By contrast, a DDL statement such as
CREATE TABLE issues an implicit commit; e.g. create a table t1, insert a record into t1, then try to create
the same object t1 again - the insert is already committed even though the second CREATE fails.)
65. What steps should a programmer follow for better tuning of PL/SQL blocks?
SQL Queries Best Practices
1.

Always use a WHERE clause in your SELECT statement to narrow the number of rows returned.
If we don't use a WHERE clause, Oracle performs a full table scan on our table and returns all of the
rows.

2.

Use EXISTS clause instead of IN clause as it is more efficient than IN and performs faster.
Ex:
Replace
SELECT * FROM DEPT WHERE DEPTNO IN
(SELECT DEPTNO FROM EMP E)
With
SELECT * FROM DEPT D WHERE EXISTS
(SELECT 1 FROM EMP E WHERE D.DEPTNO = E.DEPTNO)
Note: IN checks all rows. Only use IN if the table in the sub-query is extremely small.

3.

When you have a choice of using the IN or the BETWEEN clauses in your SQL, use the BETWEEN
clause as it is much more efficient than IN.
Depending on the range of numbers in a BETWEEN, the optimizer will choose to do a full table scan or
use the index.

4.

Avoid WHERE clauses that are non-sargable. Non-sargable search arguments in the WHERE clause,
such as "IS NULL", "OR", "<>", "!=", "!>", "!<", "NOT", "NOT EXISTS", "NOT IN", "NOT LIKE", and
"LIKE %500" can prevent the query optimizer from using an index to perform a search. In addition,
expressions that include a function on a column, or expressions that have the same column on both
sides of the operator, are not sargable.
Convert multiple OR clauses to UNION ALL.

5.

Use equijoins. It is better if you use with indexed column joins. For maximum performance when
joining two or more tables, the indexes on the columns to be joined should have the same data type.

6.

Avoid a full-table scan if it is more efficient to get the required rows through an index. It decides full
table scan if it has to read more than 5% of the table data (for large tables).

7.

Avoid using an index that fetches 10,000 rows from the driving table if you could instead use another
index that fetches 100 rows and choose selective indexes.

8.

Indexes can't be used when Oracle is forced to perform implicit datatype conversion.

9.
Choose the join order so you will join fewer rows to tables later in the join order:
use the smaller table as the driving table
have the first join discard the most rows

10.

Set up the driving table to be the one containing the filter condition that eliminates the highest
percentage of the table.

11.

In a where clause (or having clause), constants or bind variables should always be on the right hand
side of the operator.

12.

Do not use SQL functions on indexed columns in predicate/WHERE clauses (e.g.
concatenation, SUBSTR, DECODE, RTRIM, LTRIM etc.) as this prevents the use of the index. Use
function-based indexes where necessary.
SELECT * FROM EMP WHERE SUBSTR (ENAME, 1, 3) = 'KES'
Use a LIKE predicate instead of SUBSTR() here (see the sketch after this list).

13.

If you want the index used, dont perform an operation on the field.
Replace
SELECT * FROM EMPLOYEE WHERE SALARY + 1000 = :NEWSALARY
With
SELECT * FROM EMPLOYEE WHERE SALARY = :NEWSALARY - 1000

14.

All SQL statements will be written in a consistent mixed case: all reserved words capitalized and all
user-supplied object names in lower case. (Standard)
15.

Minimize the use of DISTINCT because it forces a sort operation.

16.

Try joins rather than sub-queries which result in implicit joins

Replace
SELECT * FROM A WHERE A.CITY IN (SELECT B.CITY FROM B)
With
SELECT A.* FROM A, B WHERE A.CITY = B.CITY

17.

Replace an outer join with a UNION if both join columns have a unique index:
Replace
SELECT A.CITY, B.CITY FROM A, B WHERE A.STATE=B.STATE (+)
With
SELECT A.CITY, B.CITY FROM A, B

WHERE A.STATE=B.STATE

UNION
SELECT NULL, B.CITY FROM B WHERE NOT EXISTS
(SELECT 'X' FROM A WHERE A.STATE=B.STATE)
18.
Use bind variables in queries passed from the application (PL/SQL) so that the same query can
be reused. This avoids parsing.
19.
Use Parallel Query and Parallel DML if your system has more than 1 CPU.
20.
Match SQL where possible. Applications should use the same SQL statements wherever
possible to take advantage of Oracle's Shared SQL Area. The SQL must match exactly to take
advantage of this.
21.
No matter how many indexes are created, how much optimization is done to queries or how
many caches and buffers are tweaked and tuned if the design of a database is faulty, the performance
of the overall system suffers. A good application starts with a good design.
22.

The following operations always require a sort:


SELECT DISTINCT
SELECT UNIQUE
SELECT ....ORDER BY...
SELECT....GROUP BY...
CREATE INDEX
CREATE TABLE.... AS SELECT with primary key specification
Use of INTERSECT, MINUS, and UNION set operators
Unindexed table joins
Some correlated sub-queries

Also, the order in which the conditions are given in the WHERE clause can matter when running a
SELECT query; the performance difference goes unnoticed unless the query is run against a massive database.
For example, for a select statement,
SELECT Emp_id FROM Emp_table WHERE Last_Name = 'Smith' AND Middle_Initial = 'K' AND Gender = 'Female';
the lookup for matches is performed by taking the conditions in the WHERE clause in reverse order,
i.e. first all the rows that match the criterion Gender = 'Female' are returned, and within those rows
the condition Last_Name = 'Smith' is then evaluated.
Therefore, the conditions in the WHERE clause should be ordered so that the last condition yields the
smallest set of potential matching rows, the next condition passes on even fewer, and so on. So, if we
fine-tune the above query, it should look like:
SELECT Emp_id FROM Emp_table WHERE Gender = 'Female' AND Middle_Initial = 'K' AND Last_Name = 'Smith';
as Last_Name = 'Smith' returns far fewer rows than Gender = 'Female'.
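A short sketch illustrating two of the guidelines above against the emp and dept tables (the function-based index name is made up):
-- guideline 2: IN rewritten as EXISTS
SELECT *
FROM   dept d
WHERE  EXISTS (SELECT 1 FROM emp e WHERE e.deptno = d.deptno);

-- guideline 12: avoid a function on the indexed column ...
SELECT * FROM emp WHERE ename LIKE 'KES%';

-- ... or create a function-based index so the original predicate can still use an index
CREATE INDEX emp_ename_fbi ON emp (SUBSTR(ename, 1, 3));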
66. What is the difference between a varray and a nested table? Can you explain in brief and clear up these
concepts, and also give a small example of each?

Varrays and nested tables both belong to collections. The main difference is that a varray has an upper
bound, whereas a nested table does not; its size is unconstrained, like any other database table. A nested
table can also be stored in the database.
Syntax of a nested table:
TYPE nes_tabtype IS TABLE OF emp.empno%TYPE;
nes_tab nes_tabtype;
Syntax of a varray:
TYPE list_ints_t IS VARRAY(8) OF NUMBER(2);
alist list_ints_t := list_ints_t(2,3,5,1,5,4);
A nested table can be indexed, whereas a varray cannot.

69. What are PL/SQL tables? Can a cursor variable be stored in a PL/SQL table?


A PL/SQL table is an in-memory structure used to store records temporarily within a PL/SQL block;
when the block completes execution, the table ceases to exist.
71. What is the datatype of the primary key (index) of a PL/SQL table?
BINARY_INTEGER
72.What is the difference between User-level, Statement-level and System-level Rollback? Can you please give me example of
each?

1. System - level or transaction level


Rollback the current transaction entirely on errors. This was the only
behavior of old drivers because PG had no savepoint functionality until
8.0.
2. Statement
Rollback the current (ODBC) statement on errors (in case of 8.0 or later
version servers). The driver calls a SAVEPOINT command just before starting
each (ODBC) statement and automatically ROLLBACK to the savepoint on errors
or RELEASE it on success. If you expect Oracle-like automatic per statement
rollback, please use this level.
3. User Level
You can (have to) call SAVEPOINT commands and roll back to a savepoint
on errors yourself. Please note you have to roll back the current
transaction, or ROLLBACK to a savepoint, on errors (by yourself) to continue
the application.
74. Details about FORCE VIEW: why and where we can use it
Generally we cannot create a view whose base table does not exist. If you want to create a view
without the base table existing, that is called a force view or invalid view.
Syntax: CREATE FORCE VIEW <view_name> AS <select statement>;
The view will be created with the message
'View created with compilation errors'
Once you create the base table, that invalid view becomes valid.
75. 1) Why is it recommended to use IN OUT instead of the OUT parameter type in a
procedure?
2) What happens if we do not assign anything to an OUT parameter in a procedure?
Hi. An OUT parameter is useful for returning a value from a subprogram; it carries no meaningful value
into the subprogram (it starts as NULL) and exists only to send a result back to the caller. An IN OUT
parameter is used both to pass a value to the subprogram and to return a value to the caller of the
subprogram; it acts like an explicitly declared variable, so it can be assigned values and its value can be
assigned to other variables. So IN OUT is often more flexible than OUT.
1) The choice between IN OUT and OUT depends on what the program needs: if you want to retain the value
that is being passed in, use separate IN and OUT parameters; otherwise you can go for IN OUT.
2) If nothing is assigned to an OUT parameter in a procedure, then NULL is returned for that parameter.
78. What is autonomous Transaction? Where are they used?
An autonomous transaction is a transaction which acts independently of the calling transaction and can
commit the work it has done on its own.
Example: using PRAGMA AUTONOMOUS_TRANSACTION, e.g. in case a mutating-table problem happens in a trigger.

79. How can I speed up the execution of a query when the number of rows in the tables has increased?
Standard practice is:
1. Index the columns (primary key).
2. Use the indexed / primary key columns in the WHERE clause.
3. Check the explain plan for the query and avoid nested loops / full table scans (depending
on the size of the data retrieved and/or a master table with few rows).
80. What is the purpose of FORCE while creating a VIEW
Usually views are created from a base table only if the base table exists.
The purpose of the FORCE keyword is to create a view even if the underlying base table does not exist.
ex: CREATE OR REPLACE FORCE VIEW <viewname> AS <query>
While using the above syntax to create a view, the table used in the query statement does not
need to exist in the database.
83. What is mutation of a trigger? Why and when does it occur?
A table is said to be a mutating table under the following three circumstances:
1) When you try to delete, update or insert into a table through a trigger and at the same time you are
trying to select from the same table.
2) The same applies to a view.
3) Apart from that, if you are deleting (delete cascade), updating or inserting on the parent table and
doing a select on the child table.
All of these happen only in a row-level trigger. A small example follows.
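A small sketch that reproduces the mutating-table error (the trigger name and test insert are made up; it uses the usual emp table):
CREATE OR REPLACE TRIGGER emp_count_trg
AFTER INSERT ON emp
FOR EACH ROW
DECLARE
   v_cnt NUMBER;
BEGIN
   -- row-level trigger selecting from the table it is defined on
   SELECT COUNT(*) INTO v_cnt FROM emp;   -- raises ORA-04091 at run time
END;
/
INSERT INTO emp (empno, ename) VALUES (9999, 'TEST');
-- ORA-04091: table ... EMP is mutating, trigger/function may not see it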
90. How to handle exception in Bulk collector?
During a bulk operation you can save the exceptions and then process them afterwards.
Look at the example given below:
DECLARE
  TYPE NumList IS TABLE OF NUMBER;
  num_tab NumList := NumList(10,0,11,12,30,0,20,199,2,0,9,1);
  errors  NUMBER;
BEGIN
  FORALL i IN num_tab.FIRST..num_tab.LAST SAVE EXCEPTIONS
    DELETE FROM emp WHERE sal > 500000/num_tab(i);
EXCEPTION
  WHEN OTHERS THEN
    -- this is not in the doco, thanks to JL for pointing this out
    errors := SQL%BULK_EXCEPTIONS.COUNT;
    dbms_output.put_line('Number of errors is ' || errors);
    FOR i IN 1..errors LOOP
      -- the failing iteration is SQL%BULK_EXCEPTIONS(i).ERROR_INDEX
      -- the error code is SQL%BULK_EXCEPTIONS(i).ERROR_CODE
      dbms_output.put_line(SQL%BULK_EXCEPTIONS(i).ERROR_INDEX || ': ' ||
                           SQL%BULK_EXCEPTIONS(i).ERROR_CODE);
    END LOOP;
END;
91.#1 What are the advantages and disadvantages of using PL/SQL or JAVA as the primary programming tool for
database automation.
#2 Will JAVA replace PL/SQL?
Internally the Oracle database supports two procedural languages, namely PL/SQL and Java. This
leads to questions like "Which of the two is the best?" and "Will Oracle ever desupport PL/SQL in
favour of Java?".
Many Oracle applications are based on PL/SQL and it would be difficult for Oracle to ever desupport
PL/SQL. In fact, all indications are that PL/SQL still has a bright future ahead of it. Many
enhancements are still being made to PL/SQL. For example, Oracle 9iDB supports native compilation
of Pl/SQL code to binaries.
PL/SQL and Java appeal to different people in different job roles. The following table briefly describes
the difference between these two language environments:
PL/SQL:
Data centric and tightly integrated into the database

Proprietary to Oracle and difficult to port to other database systems


Data manipulation is slightly faster in PL/SQL than in Java
Easier to use than Java (depending on your background)
Java:
Open standard, not proprietary to Oracle
Incurs some data conversion overhead between the Database and Java type systems
Java is more difficult to use (depending on your background)
110.
1. What is bulk collect?
2. What is an INSTEAD OF trigger?
3. What is the difference between an Oracle table & a PL/SQL table?
4. What are built-in packages in Oracle?
5. What is the difference between row migration & row chaining?

1.What is bulk collect?


Bulk collect is part of PL/SQL collections: the rows returned by a query are fetched into a collection variable in one shot.
example:
declare
type sal_rec is table of number;
v_sal sal_rec;
begin
select sal bulk collect into v_sal from emp;
for r in 1.. v_sal.count loop
dbms_output.put_line(v_sal(r));
end loop;
end;
2. What is an INSTEAD OF trigger?
INSTEAD OF triggers are used on views.
3.What is the difference between Oracle table & PL/SQL table?
An Oracle table is a logical entity which holds its data in data files permanently, whereas the scope of a
PL/SQL table is limited to the particular block/procedure. Referring to the example above, the sal_rec
table holds data only until the program reaches its end.
4. What are built-in packages in Oracle?
There are a large number of Oracle built-in packages, for example:
DBMS_OUTPUT, DBMS_UTILITY, DBMS_PIPE, ...
5. What is the difference between row migration & row chaining?
Migration: data is stored in blocks, which reserve free space via PCTFREE (PCTUSED controls when a
block becomes available for inserts again). The PCTFREE space is kept for updates. When an update
makes a row too large to fit in the free space of its current block, the whole row is moved to another
block and a pointer is left behind; this is called row migration.
Row chaining: while inserting data, if a single row is too large to fit in one block, the row is stored
across two or more blocks and the pieces are chained together.

INSTEAD OF triggers: they provide a transparent way of modifying views that cannot be modified
directly through SQL DML statements.
111.Can anyone tell me the difference between instead of trigger, database trigger, and schema trigger?

INSTEAD OF triggers control operations on a view, not a table. They can be used to make non-updateable views updateable and to override the behaviour of views that are updateable.
Database triggers fire whenever the database starts up or is shut down, whenever a user logs on or
logs off, and whenever an Oracle error occurs. These triggers provide a means of tracking activity in
the database.
If we have created a view based on a join condition, it is not possible to apply DML
operations like insert, update and delete on that view directly. What we can do is create an INSTEAD
OF trigger and perform the DML operations on the view through it.
131. Hi, what is a Flashback query in Oracle 9i?
Flashback is used to view your database as of an earlier point in time, like a system restore in Windows.
No DDL or DML is allowed while the session is in flashback mode.
The user should have execute permission on the DBMS_FLASHBACK package.
For example:
at 10:30 am
from the scott user: delete from emp;
commit;
at 10:40 am I want all my data from the emp table back; then:
declare
cursor c1 is select * from emp;
emp_cur emp%rowtype;
begin
dbms_flashback.enable_at_time(sysdate - 15/1440);
open c1;
dbms_flashback.disable;
loop
fetch c1 into emp_cur;
exit when c1%notfound;
insert into emp values(emp_cur.empno, emp_cur.ename, emp_cur.job,
emp_cur.mgr,emp_cur.hiredate, emp_cur.sal, emp_cur.comm,
emp_cur.deptno);
end loop;
commit;
end;
/
select * from emp;
14 rows selected
132. what is the difference between database server and data dictionary
The database server is the collection of all Oracle objects together with the instance/software that manages them.
The data dictionary contains information about all those objects, such as when they were created and
who created them.
In other words, the database server is the server on which the Oracle instance runs, whereas the data
dictionary is the collection of information about all the objects (tables, indexes, views, triggers, etc.) in a
database.

134. Mention the differences between aggregate functions and analytical functions clearly with examples?
Aggregate functions are SUM(), COUNT(), AVG(), MAX(), MIN(),
e.g.:
select sum(sal), count(*), avg(sal), max(sal), min(sal) from emp;
Analytic functions differ from aggregate functions in that they return a value for every row, calculated over a window of rows.
Some examples:
SELECT ename "Ename", deptno "Deptno", sal "Sal",
SUM(sal)
OVER (ORDER BY deptno, ename) "Running Total",
SUM(SAL)
OVER (PARTITION BY deptno
ORDER BY ename) "Dept Total",
ROW_NUMBER()
OVER (PARTITION BY deptno
ORDER BY ENAME) "Seq"
FROM emp
ORDER BY deptno, ename
SELECT * FROM (
SELECT deptno, ename, sal, ROW_NUMBER()
OVER (
PARTITION BY deptno ORDER BY sal DESC
) Top3 FROM emp
)
WHERE Top3 <= 3
136. what are the advantages & disadvantages of packages ?
Advantages: modularity, easier application design, information hiding, added functionality, better
performance.
Disadvantages of packages: more memory may be required on the Oracle database server when
using PL/SQL packages, as the whole package is loaded into memory as soon as any
object in the package is accessed.
Also, updating one of the functions/procedures invalidates other objects that use
the package, since the whole package needs to be recompiled;
and we cannot pass parameters to a package itself (only to its subprograms).
137.

What is a NOCOPY parameter? Where is it used?


NOCOPY Parameter Option
Prior to Oracle 8i there were three types of parameter-passing options to procedures and functions:

IN: parameters are passed by reference

OUT: parameters are implemented as copy-out

IN OUT: parameters are implemented as copy-in/copy-out


The technique of OUT and IN OUT parameters was designed to protect the original values of the parameters in case
exceptions were raised, so that changes could be rolled back. Because a copy of the parameter set
was made, this rollback could be done. However, this method imposed significant CPU and memory
overhead when the parameters were large data collections, for example PL/SQL table or VARRAY
types.

With the NOCOPY option, OUT and IN OUT parameters are passed by reference, which avoids the
copy overhead. However, because no copy of the parameter set is created, in case of an exception this
rollback of parameter values cannot be performed and their original values cannot be restored.
Here is an example of using the NOCOPY parameter option:
Here is an example of using the NOCOPY parameter option:

TYPE Note IS RECORD (
   Title        VARCHAR2(15),
   Created_By   VARCHAR2(20),
   Created_When DATE,
   Memo         VARCHAR2(2000));

TYPE Notebook IS VARRAY(2000) OF Note;

CREATE OR REPLACE PROCEDURE Update_Notes (Customer_Notes IN OUT NOCOPY Notebook) IS
BEGIN
   ...
END;

NOCOPY is a hint given to the compiler, indicating that the parameter is passed by reference, so the
actual value should not be copied into the block and back out again; the processing is done directly
on the original variable. (Otherwise, Oracle copies the data from the parameter variable into the block
and then copies it back to the variable after processing, which puts an extra burden on the server if
the parameters are large collections.)
For a better understanding of the NOCOPY parameter, run the following code and
observe the result.
DECLARE
n NUMBER := 10;
PROCEDURE do_something (
n1 IN NUMBER,
n2 IN OUT NUMBER,
n3 IN OUT NOCOPY NUMBER) IS
BEGIN
n2 := 20;
DBMS_OUTPUT.PUT_LINE(n1); -- prints 10
n3 := 30;
DBMS_OUTPUT.PUT_LINE(n1); -- prints 30
END;
BEGIN
do_something(n, n, n);
DBMS_OUTPUT.PUT_LINE(n); -- prints 20
END;
138. How to get the 25th row of a table.
select * from Emp where rownum < 26
minus
select * from Emp where rownum < 25

SELECT * FROM EMP A WHERE 24 = (SELECT COUNT(*) FROM EMP B WHERE A.EMPNO > B.EMPNO);
(24 rows must have a smaller EMPNO for A to be the 25th row.)
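An alternative sketch using the ROW_NUMBER analytic function, which makes the row ordering explicit (here ordered by empno):
SELECT *
FROM  (SELECT e.*, ROW_NUMBER() OVER (ORDER BY empno) rn
       FROM   emp e)
WHERE rn = 25;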
139. What is Atomic transaction?
An atomic transaction is a database transaction or a hardware transaction which either
completely occurs, or completely fails to occur. A prosaic example is pregnancy - you can't be
"halfway pregnant"; you either are or you aren't
140. What is materialized view?
A materialized view is a database object that contains the results of a query. Materialized views are local
copies of data located remotely, or are used to create summary tables based on aggregations of a table's
data. Materialized views that store data based on remote tables are also known as snapshots.
A materialized view can query tables, views, and other materialized views. Collectively these are
called master tables (a replication term) or detail tables (a data warehouse term).
141. How to change owner of a table?
The owner of a table is the schema that holds the table. To change the owner, recreate the
table in the new schema and drop the previous table.
142. How can I see the execution time of a SQL statement?
In SQL*Plus: SET TIMING ON (shows the elapsed time of each statement; SET TIME ON only shows the current time at the prompt).
144. what happens when commit is given in executable section and an error occurs ?please tell me what ha
Any work committed by the COMMIT in the executable section before the error stays committed and cannot
be rolled back; only the changes made after that COMMIT are rolled back when the exception propagates unhandled.
145. Is a cursor a pointer or a reference?
A cursor is basically a pointer: it is like the address of a private work area in memory used for the SQL
query, and that memory is freed after the values fetched from it have been used.
146. What will happen to an anonymous block if there is no statement inside the block? e.g.: declare begin end
We cannot have
declare
begin
end;
We must have at least one statement between the BEGIN and END keywords,
otherwise a compilation error will be raised.
147.Can we have same trigger with different names for a table?
eg: create trigger trig1
after insert on tab1;
and
eg: create trigger trig2
after insert on tab1;
If yes,which trigger executes first.

The triggers will be fired on the basis of the timestamp of their creation in the data dictionary. The
trigger with the latest timestamp will be fired last.
148. When creating a table, what is the difference between VARCHAR2(80) and VARCHAR2(80 BYTE)?

Historically, database columns which hold alphanumeric data have been defined using the
number of bytes they store. This approach was fine while the number of bytes equated to the number
of characters when using single-byte character sets. With the increasing use of multibyte
character sets to support globalized databases comes the problem of bytes no longer equating to
characters.
Suppose we had a requirement for a table with an id and a description column, where
the description must hold up to a maximum of 20 characters. We then decide to make a
multilingual version of our application and use the same table definition in a new instance with a
multibyte character set. Everything works fine until we try to fill the column with 20 two-byte
characters. All of a sudden the column is trying to store twice the data it was before and we have
a problem.
Oracle9i solved this problem with the introduction of character and byte length
semantics. When defining an alphanumeric column it is now possible to specify the length in three
different ways:
1. VARCHAR2(20)
2. VARCHAR2(20 BYTE)
3. VARCHAR2(20 CHAR)
Option 1 uses the default length semantics defined by the NLS_LENGTH_SEMANTICS
parameter, which defaults to BYTE. Option 2 allows only the specified number of bytes to be
stored in the column, regardless of how many characters this represents. Option 3 allows the
specified number of characters to be stored in the column regardless of the number of bytes this
equates to.
151. how to insert a music file into the database
LOB datatypes can be used to store blocks of unstructured data like graphic images, video,
audio, etc
152. what is diff between strong and weak ref cursors
A strong REF CURSOR type definition specifies a return type, but a weak definition does not.
DECLARE
TYPE EmpCurTyp IS REF CURSOR RETURN emp%ROWTYPE; -- strong
TYPE GenericCurTyp IS REF CURSOR; -- weak
In a strong cursor the structure is predetermined, so we cannot open it for a query whose structure
differs from emp%ROWTYPE; in a weak cursor the structure is not predetermined, so we can open it
for a query with any structure.
A strong ref cursor type is less error prone, because Oracle already knows what type you are going
to return, compared to a weak ref type.
154. Explain, Is it possible to have same name for package and the procedure in that package.
Yes, its possible to have same name for package and the procedure in that package.
159. If you open a cursor that is already open, without closing it first, what will happen? If it is an error, what error do you get?
If you reopen a cursor without closing it first,PL/SQL raises the predefined exception
CURSOR_ALREADY_OPEN.
161. What is PRAGMA RESTRICT_REFERENCES:
By using PRAGMA RESTRICT_REFERENCES we can assert purity levels for functions, such as
WNDS (writes no database state), RNDS (reads no database state), WNPS (writes no package state)
and RNPS (reads no package state).
164. What is difference between PL/SQL tables and arrays?
An array (varray) is a set of values of the same datatype with a fixed upper bound, whereas a PL/SQL
table has no upper limit on its size and can be indexed sparsely.
168. How do you set table for read only access ?
If you select rows with SELECT ... FOR UPDATE, nobody else can update or delete those rows until you
commit or roll back, because Oracle locks the selected rows.
SELECT ... FOR UPDATE
(From Oracle 11g onwards a whole table can also be made read-only with ALTER TABLE ... READ ONLY.)
169.

What are the disadvantages of Packages and triggers??


Disadvantages of Packages:
1. You cannot reference remote packaged variables directly or indirectly.
2. Inside a package you cannot reference host variables.
3. We cannot grant execute on an individual procedure inside a package, only on the whole package.
Disadvantages of Triggers:
1. More code has to be written and maintained.
170. How to disable a trigger for a particular table ?
alter trigger <trigger_name> disable
172. How can we avoid duplicate rows without using the DISTINCT command?

Using a self join, like:
select a.dup_column from tab a, tab b where a.dup_column = b.dup_column and a.rowid <> b.rowid

The following query returns the first row for each unique ID in the table. It could be used as part of
a DELETE statement to remove duplicates if needed.
SELECT id
FROM   func t1
WHERE  rowid = (SELECT MIN(rowid)
                FROM   func
                WHERE  id = t1.id)

Also: you can use a GROUP BY without a summary function:
SELECT id
FROM   func t1
GROUP  BY id
173. Why we use instead of trigger. what is the basic structure of the instead of trigger. Explain speci
Conceptually, INSTEAD OF triggers are very simple. You write code that the Oracle server will execute
when a program performs a DML operation on the view. Unlike a conventional BEFORE or AFTER
trigger, an INSTEAD OF trigger takes the place of, rather than supplements, Oracle's usual DML
behavior. (And in case you're wondering, you cannot use BEFORE/AFTER triggers on any type of view,
even if you have defined an INSTEAD OF trigger on the view.)
CREATE OR REPLACE TRIGGER images_v_insert
INSTEAD OF INSERT ON images_v
FOR EACH ROW
BEGIN
/* This will fail with DUP_VAL_ON_INDEX if the images table
|| already contains a record with the new image_id.
*/
INSERT INTO images
VALUES (:NEW.image_id, :NEW.file_name, :NEW.file_type,
:NEW.bytes);
IF :NEW.keywords IS NOT NULL THEN
DECLARE
/* Note: apparent bug prevents use of :NEW.keywords.LAST.
|| The workaround is to store :NEW.keywords as a local
|| variable (in this case keywords_holder.)
*/
keywords_holder Keyword_tab_t := :NEW.keywords;
BEGIN
FOR the_keyword IN 1..keywords_holder.LAST
LOOP
INSERT INTO keywords
VALUES (:NEW.image_id, keywords_holder(the_keyword));
END LOOP;
END;
END IF;
END;
Once we've created this INSTEAD OF trigger, we can insert a record into this object view (and hence
into both underlying tables) quite easily using:

INSERT INTO images_v VALUES (Image_t(41265, 'pigpic.jpg', 'JPG', 824,
   Keyword_tab_t('PIG', 'BOVINE', 'FARM ANIMAL')));

This statement causes the INSTEAD OF trigger to fire, and as long as the primary key value (image_id
= 41265) does not already exist, the trigger will insert the data into the appropriate tables.
Similarly, we can write additional triggers that handle updates and deletes. These triggers use the
predictable clauses INSTEAD OF UPDATE and INSTEAD OF DELETE.

180. what is the difference between database trigger and application trigger?
Database triggers are back-end triggers and fire when an event occurs at the database level (e.g.
insert, update, delete), whereas application triggers are front-end triggers and fire when an event occurs
at the application level (e.g. a button is pressed, a new form instance is created).
185. Compare EXISTS and IN Usage with advantages and disadvantages.
EXISTS is generally faster than the IN command.
EXISTS only checks the existence of matching records (TRUE/FALSE) and can stop at the first match,
whereas in the case of IN each and every record of the sub-query result is checked. Performance-wise
EXISTS is better, so use EXISTS whenever possible; IN is acceptable only when the sub-query returns a
very small result set.
189. Which type of binding does PL/SQL use?
PL/SQL uses early binding for its static SQL, which is why DDL statements cannot be used directly in a
PL/SQL block; they require dynamic SQL (e.g. EXECUTE IMMEDIATE), which is bound at run time.
191. Why DUAL table is not visible?
Because DUAL is a dummy table owned by SYS; users see it through a public synonym, so it does not appear among their own tables.

Posted 19th December 2011 by Prafull Dangore



Dec
19

How do you identify existing rows of data in the target table using a Lookup transformation?
Scenario:
How do you identify existing rows of data in the target table using lookup
transformation?
Solution:
There are two ways to lookup the target table to verify a row exists or not:

1. Use a connected dynamic cache lookup and then check the value of the NewLookupRow
output port to decide whether the incoming record already exists in the table/cache or not.
2. Use an unconnected lookup, call it from an Expression transformation, and check the
returned lookup port value (NULL / not NULL) to decide whether the incoming
record already exists in the table or not.

Posted 19th December 2011 by Prafull Dangore



Dec
19

What is Load Manager?


Scenario:
What is Load Manager?
Solution:
While running a Workflow, the PowerCenter Server uses the Load Manager
process and the Data Transformation Manager Process (DTM) to run the workflow
and carry out workflow tasks. When the PowerCenter Server runs a workflow,
The Load Manager performs the following tasks:
1. Locks the workflow and reads workflow properties.
2. Reads the parameter file and expands workflow variables.
3. Creates the workflow log file.
4. Runs workflow tasks.
5. Distributes sessions to worker servers.
6. Starts the DTM to run sessions.
7. Runs sessions from master servers.
8. Sends post-session email if the DTM terminates abnormally.
When the PowerCenter Server runs a session, the DTM performs the following tasks:
1. Fetches session and mapping metadata from the repository.
2. Creates and expands session variables.
3. Creates the session log file.
4. Validates session code pages if data code page validation is enabled. Checks
query
Conversions if data code page validation is disabled.
5. Verifies connection object permissions.

6. Runs pre-session shell commands.


7. Runs pre-session stored procedures and SQL.
8. Creates and runs mapping, reader, writer, and transformation threads to extract,
transform, and load data.
9. Runs post-session stored procedures and SQL.
10. Runs post-session shell commands.
11. Sends post-session email.

Posted 19th December 2011 by Prafull Dangore



Dec
19

In which conditions can we not use the Joiner transformation (limitations of the Joiner transformation)?
Scenario:
In which conditions we can not use joiner transformation (Limitations of joiner
transformation)?
Solution:
1. When our data comes through an Update Strategy transformation; in other words, after an Update
Strategy we cannot add a Joiner transformation.
2. We cannot connect a Sequence Generator transformation directly before the
Joiner transformation.
The Joiner transformation does not match null values. For example, if the join columns EMP_ID1
and EMP_ID2 both contain a row with a null value, the PowerCenter Server does not consider them a
match and does not join the two rows. To join rows with null values, you can replace null input with
default values and then join on the default values.

Posted 19th December 2011 by Prafull Dangore



Dec
19

How can you improve session performance in the Aggregator transformation?
Scenario:
How can you improve session performance in the Aggregator transformation?
Solution:

You can use the following guidelines to optimize the performance of an Aggregator
transformation.
Use sorted input to decrease the use of aggregate caches.
Sorted input reduces the amount of data cached during the session and improves
session performance. Use this option with the Sorter transformation to pass sorted
data to the Aggregator transformation.
Limit connected input/output or output ports.
Limit the number of connected input/output or output ports to reduce the amount of
data the Aggregator transformation stores in the data cache.
Filter before aggregating.
If you use a Filter transformation in the mapping place the transformation before the
Aggregator transformation to reduce unnecessary aggregation.

Posted 19th December 2011 by Prafull Dangore



Dec
19

How can you recognize whether or not the newly added rows in the source get inserted in the target?
Scenario:
How can you recognize whether or not the newly added rows in the source get inserted in the target?
Solution:
In a Type 2 mapping we have three options to recognize the newly added rows:
1. Version number
2. Flag value
3. Effective date range
If it is a Type 2 dimension the above answer is fine, but if you want the details of all the insert and
update statements you need to use the session log file, with tracing configured to verbose data.
You will then get the complete picture of which records were inserted and which were not.

Posted 19th December 2011 by Prafull Dangore



Dec
19

What are Slowly Changing Dimensions (SCD)?
Slowly Changing Dimensions
Dimensions that change over time are called Slowly Changing Dimensions.
For instance, a product price changes over time; People change their names for

some reason; Country and State names may change over time. These are a few
examples of Slowly Changing Dimensions since some changes are happening to
them over a period of time.
Slowly Changing Dimensions are often categorized into three types namely Type1,
Type2 and Type3. The following section deals with how to capture and handling
these changes over time.
The "Product" table mentioned below contains a product named, Product1 with
Product ID being the primary key. In the year 2004, the price of Product1 was $150
and over the time, Product1's price changes from $150 to $350. With this
information, let us explain the three types of Slowly Changing Dimensions.
Product Price in 2004:
Product ID(PK) | Year | Product Name | Product Price
1              | 2004 | Product1     | $150
1.SCD TYPE1(Slowly Changing Dimension) : contains current data.
2.SCD TYPE2(Slowly Changing Dimension) : contains current data + complete historical data.
3.SCD TYPE3(Slowly Changing Dimension) : contains current data + one type historical data.

Type 1: Overwriting the old values.


In the year 2005, if the price of the product changes to $250, then the old values of
the columns "Year" and "Product Price" have to be updated and replaced with the
new values. In this Type 1, there is no way to find out the old value of the product
"Product1" in year 2004 since the table now contains only the new price and year
information.
Product
Product ID(PK) | Year | Product Name | Product Price
1              | 2005 | Product1     | $250
Type 2: Creating an additional record.
In this Type 2, the old values will not be replaced but a new row containing the new
values will be added to the product table. So at any point of time, the difference
between the old values and new values can be retrieved and easily be compared.
This would be very useful for reporting purposes.
Product
Product ID(PK) | Year | Product Name | Product Price
1              | 2004 | Product1     | $150
1              | 2005 | Product1     | $250
The problem with the above mentioned data structure is "Product ID" cannot store
duplicate values of "Product1" since "Product ID" is the primary key. Also, the
current data structure doesn't clearly specify the effective date and expiry date of
Product1 like when the change to its price happened. So, it would be better to
change the current data structure to overcome the above primary key violation.
Product
Product ID(PK) | Effective DateTime(PK) | Year | Product Name | Product Price | Expiry DateTime
1              | 01-01-2004 12.00AM     | 2004 | Product1     | $150          | 12-31-2004 11.59PM
1              | 01-01-2005 12.00AM     | 2005 | Product1     | $250          |
In the changed Product table's Data structure, "Product ID" and "Effective DateTime"
are composite primary keys. So there would be no violation of primary key
constraint. Addition of new columns, "Effective DateTime" and "Expiry DateTime"
provides the information about the product's effective date and expiry date which
adds more clarity and enhances the scope of this table. Type2 approach may need
additional space in the data base, since for every changed record, an additional row
has to be stored. Since dimensions are not that big in the real world, additional
space is negligible.
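Outside the ETL tool, the same Type 2 logic can be sketched in plain SQL; the table and column names below follow the Product example above and are otherwise assumptions:
-- expire the current version of the changed product
UPDATE product
SET    expiry_datetime = SYSDATE
WHERE  product_id = 1
AND    expiry_datetime IS NULL
AND    product_price <> 250;

-- insert the new version with an open-ended expiry
INSERT INTO product
   (product_id, effective_datetime, year, product_name, product_price, expiry_datetime)
VALUES
   (1, SYSDATE, 2005, 'Product1', 250, NULL);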
Type 3: Creating new fields.
In this Type 3, the latest update to the changed values can be seen. Example
mentioned below illustrates how to add new columns and keep track of the
changes. From that, we are able to see the current price and the previous price of
the product, Product1.
Product
Product ID(PK) | Current Year | Product Name | Current Product Price | Old Product Price | Old Year
1              | 2005         | Product1     | $250                  | $150              | 2004
The problem with the Type 3 approach, is over years, if the product price
continuously changes, then the complete history may not be stored, only the latest
change will be stored. For example, in year 2006, if the product1's price changes to
$350, then we would not be able to see the complete history of 2004 prices, since
the old values would have been updated with 2005 product information.
Product
Product ID(PK) | Year | Product Name | Product Price | Old Product Price | Old Year
1              | 2006 | Product1     | $350          | $250              | 2005

Example: In order to store data, over the years, many application designers in
each branch have made their individual decisions as to how an application and
database should be built. So source systems will be different in naming conventions,
variable measurements, encoding structures, and physical attributes of data.
Consider a bank that has got several branches in several countries, has millions of
customers and the lines of business of the enterprise are savings, and loans. The
following example explains how the data is integrated from source systems to
target systems.
Example of Source Data
System Name     | Attribute Name            | Column Name               | Datatype     | Values
Source System 1 | Customer Application Date | CUSTOMER_APPLICATION_DATE | NUMERIC(8,0) | 11012005
Source System 2 | Customer Application Date | CUST_APPLICATION_DATE     | DATE         | 11012005
Source System 3 | Application Date          | APPLICATION_DATE          | DATE         | 01NOV2005


In the aforementioned example, attribute name, column name, datatype and values
are entirely different from one source system to another. This inconsistency in data
can be avoided by integrating the data into a data warehouse with good standards.
Example of Target Data (Data Warehouse)
Target System | Attribute Name            | Column Name               | Datatype | Values
Record #1     | Customer Application Date | CUSTOMER_APPLICATION_DATE | DATE     | 01112005
Record #2     | Customer Application Date | CUSTOMER_APPLICATION_DATE | DATE     | 01112005
Record #3     | Customer Application Date | CUSTOMER_APPLICATION_DATE | DATE     | 01112005
In the above example of target data, attribute names, column names, and
datatypes are consistent throughout the target system. This is how data from
various source systems is integrated and accurately stored into the data warehouse.

Posted 19th December 2011 by Prafull Dangore



Dec
19

What is Dimension Table?


Dimension Table
Dimension table is one that describes the business entities of an enterprise,
represented as hierarchical, categorical information such as time, departments,
locations, and products. Dimension tables are sometimes called lookup or reference
tables.
Location Dimension
In a relational data modeling, for normalization purposes, country lookup,
state lookup, county lookup, and city lookups are not merged as a single table. In a
dimensional data modeling (star schema), these tables would be merged as a single
table called LOCATION DIMENSION for performance and slicing data
requirements. This location dimension helps to compare the sales in one region with
another region. We may see good sales profit in one region and loss in another
region. If it is a loss, the reasons for that may be a new competitor in that area, or
failure of our marketing strategy etc.
Example of Location Dimension:
Country Lookup
Country Code Country Name DateTimeStamp

USA United States Of America 1/1/2005 11:23:31 AM


State Lookup
State Code State Name DateTimeStamp
NY New York 1/1/2005 11:23:31 AM
FL Florida 1/1/2005 11:23:31 AM
CA California 1/1/2005 11:23:31 AM
NJ New Jersey 1/1/2005 11:23:31 AM
County Lookup
County Code County Name DateTimeStamp
NYSH Shelby 1/1/2005 11:23:31 AM
FLJE Jefferson 1/1/2005 11:23:31 AM
CAMO Montgomery 1/1/2005 11:23:31 AM
NJHU Hudson 1/1/2005 11:23:31 AM
City Lookup
City Code City Name DateTimeStamp
NYSHMA Manhattan 1/1/2005 11:23:31 AM
FLJEPC Panama City 1/1/2005 11:23:31 AM
CAMOSH San Hose 1/1/2005 11:23:31 AM
NJHUJC Jersey City 1/1/2005 11:23:31 AM
Location Dimension
Location Dimension Id | Country Name | State Name | County Name | City Name | DateTime Stamp
1 | USA | New York   | Shelby     | Manhattan   | 1/1/2005 11:23:31 AM
2 | USA | Florida    | Jefferson  | Panama City | 1/1/2005 11:23:31 AM
3 | USA | California | Montgomery | San Hose    | 1/1/2005 11:23:31 AM
4 | USA | New Jersey | Hudson     | Jersey City | 1/1/2005 11:23:31 AM
Product Dimension
In a relational data model, for normalization purposes, product category lookup, product sub-category
lookup, product lookup, and product feature lookups are not merged as a single table. In a dimensional
data modeling (star schema), these tables would be merged as a single table called PRODUCT DIMENSION
for performance and slicing data requirements.
Example of Product Dimension: Figure 1.9
Product Category Lookup
Product Category Code Product Category Name DateTimeStamp
1 Apparel 1/1/2005 11:23:31 AM
2 Shoe 1/1/2005 11:23:31 AM
Product Sub-Category Lookup
Product Sub-Category Code | Product Sub-Category Name | DateTime Stamp
11 Shirt 1/1/2005 11:23:31 AM

12 Trouser 1/1/2005 11:23:31 AM


13 Casual 1/1/2005 11:23:31 AM
14 Formal 1/1/2005 11:23:31 AM
Product Lookup
Product Code Product Name DateTimeStamp
1001 Van Heusen 1/1/2005 11:23:31 AM
1002 Arrow 1/1/2005 11:23:31 AM
1003 Nike 1/1/2005 11:23:31 AM
1004 Adidas 1/1/2005 11:23:31 AM
Product Feature Lookup
Product Feature Code Product Feature Description DateTimeStamp
10001 Van-M 1/1/2005 11:23:31 AM
10002 Van-L 1/1/2005 11:23:31 AM
10003 Arr-XL 1/1/2005 11:23:31 AM
10004 Arr-XXL 1/1/2005 11:23:31 AM
10005 Nike-8 1/1/2005 11:23:31 AM
10006 Nike-9 1/1/2005 11:23:31 AM
10007 Adidas-10 1/1/2005 11:23:31 AM
10008 Adidas-11 1/1/2005 11:23:31 AM
Product Dimension
Product Dimension Id | Product Category Name | Product Sub-Category Name | Product Name | Product Feature Desc | DateTime Stamp
100001 Apparel Shirt Van Heusen Van-M 1/1/2005 11:23:31 AM
100002 Apparel Shirt Van Heusen Van-L 1/1/2005 11:23:31 AM
100003 Apparel Shirt Arrow Arr-XL 1/1/2005 11:23:31 AM
100004 Apparel Shirt Arrow Arr-XXL 1/1/2005 11:23:31 AM
100005 Shoe Casual Nike Nike-8 1/1/2005 11:23:31 AM
100006 Shoe Casual Nike Nike-9 1/1/2005 11:23:31 AM
100007 Shoe Casual Adidas Adidas-10 1/1/2005 11:23:31 AM
100008 Shoe Casual Adidas Adidas-11 1/1/2005 11:23:31 AM

Organization Dimension
In a relational data model, for normalization purposes, corporate office lookup,
region lookup, branch lookup, and employee lookups are not merged as a single
table. In a dimensional data modeling(star schema), these tables would be merged
as a single table called ORGANIZATION DIMENSION for performance and slicing
data.
This dimension helps us to find the products sold or serviced within the organization
by the employees. In any industry, we can calculate the sales on region basis,
branch basis and employee basis. Based on the performance, an organization can
provide incentives to employees and subsidies to the branches to increase further
sales.
Example of Organization Dimension: Figure 1.10
Corporate Lookup
Corporate Code Corporate Name DateTimeStamp
CO American Bank 1/1/2005 11:23:31 AM
Region Lookup

Region Code Region Name DateTimeStamp


SE South East 1/1/2005 11:23:31 AM
MW Mid West 1/1/2005 11:23:31 AM
Branch Lookup
Branch Code Branch Name DateTimeStamp
FLTM Florida-Tampa 1/1/2005 11:23:31 AM
ILCH Illinois-Chicago 1/1/2005 11:23:31 AM
Employee Lookup
Employee Code Employee Name DateTimeStamp
E1 Paul Young 1/1/2005 11:23:31 AM
E2 Chris Davis 1/1/2005 11:23:31 AM
Organization Dimension
Organization Dimension Id | Corporate Name | Region Name | Branch Name | Employee Name | DateTime Stamp
1 American Bank South East Florida-Tampa Paul Young 1/1/2005 11:23:31 AM
2 American Bank Mid West Illinois-Chicago Chris Davis 1/1/2005 11:23:31 AM
Time Dimension
In a relational data model, for normalization purposes, the year lookup, quarter lookup,
month lookup, and week lookup are not merged into a single table. In dimensional
data modeling (star schema), these tables would be merged into a single table called
TIME DIMENSION for performance and data-slicing requirements.
This dimension helps to find the sales done on a daily, weekly, monthly and yearly
basis. We can do trend analysis by comparing this year's sales with the previous
year's, or this week's sales with the previous week's.
Example of Time Dimension: Figure 1.11
Year Lookup
Year Id Year Number DateTimeStamp
1 2004 1/1/2005 11:23:31 AM
2 2005 1/1/2005 11:23:31 AM
Quarter Lookup
Quarter Number Quarter Name DateTimeStamp
1 Q1 1/1/2005 11:23:31 AM
2 Q2 1/1/2005 11:23:31 AM
3 Q3 1/1/2005 11:23:31 AM
4 Q4 1/1/2005 11:23:31 AM
Month Lookup
Month Number Month Name DateTimeStamp
1 January 1/1/2005 11:23:31 AM
2 February 1/1/2005 11:23:31 AM
3 March 1/1/2005 11:23:31 AM
4 April 1/1/2005 11:23:31 AM
5 May 1/1/2005 11:23:31 AM
6 June 1/1/2005 11:23:31 AM
7 July 1/1/2005 11:23:31 AM
8 August 1/1/2005 11:23:31 AM
9 September 1/1/2005 11:23:31 AM
10 October 1/1/2005 11:23:31 AM
11 November 1/1/2005 11:23:31 AM
12 December 1/1/2005 11:23:31 AM
Week Lookup
Week Number Day of Week DateTimeStamp
1 Sunday 1/1/2005 11:23:31 AM
1 Monday 1/1/2005 11:23:31 AM
1 Tuesday 1/1/2005 11:23:31 AM
1 Wednesday 1/1/2005 11:23:31 AM
1 Thursday 1/1/2005 11:23:31 AM
1 Friday 1/1/2005 11:23:31 AM
1 Saturday 1/1/2005 11:23:31 AM
2 Sunday 1/1/2005 11:23:31 AM
2 Monday 1/1/2005 11:23:31 AM
2 Tuesday 1/1/2005 11:23:31 AM
2 Wednesday 1/1/2005 11:23:31 AM
2 Thursday 1/1/2005 11:23:31 AM
2 Friday 1/1/2005 11:23:31 AM
2 Saturday 1/1/2005 11:23:31 AM
Time Dimension
Time Dim Id Year No Day of Year Quarter No Month No Month Name Month Day No Week No Day of Week Cal Date DateTimeStamp
1 2004 1 Q1 1 January 1 1 5 1/1/2004 1/1/2005 11:23:31 AM
2 2004 32 Q1 2 February 1 5 1 2/1/2004 1/1/2005 11:23:31 AM
3 2005 1 Q1 1 January 1 1 7 1/1/2005 1/1/2005 11:23:31 AM
4 2005 32 Q1 2 February 1 5 3 2/1/2005 1/1/2005 11:23:31 AM
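As a sketch only (the sequence and source-table names are assumed, using Oracle date-format masks), each TIME DIMENSION row can be derived from a single calendar date instead of being joined from the lookups:
SELECT time_dim_seq.NEXTVAL                  AS time_dim_id,
       TO_NUMBER(TO_CHAR(cal_date, 'YYYY'))  AS year_no,
       TO_NUMBER(TO_CHAR(cal_date, 'DDD'))   AS day_of_year,
       'Q' || TO_CHAR(cal_date, 'Q')         AS quarter_no,
       TO_NUMBER(TO_CHAR(cal_date, 'MM'))    AS month_no,
       TO_CHAR(cal_date, 'Month')            AS month_name,
       TO_NUMBER(TO_CHAR(cal_date, 'DD'))    AS month_day_no,
       TO_NUMBER(TO_CHAR(cal_date, 'WW'))    AS week_no,
       TO_NUMBER(TO_CHAR(cal_date, 'D'))     AS day_of_week,
       cal_date,
       SYSDATE                               AS datetimestamp
FROM   calendar_dates;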

Posted 19th December 2011 by Prafull Dangore
0

Add a comment
Dec
19

What is Fact Table?


Fact Table
The centralized table in a star schema is called the FACT table. A fact table typically has two types
of columns: those that contain facts and those that are foreign keys to dimension tables. The primary key
of a fact table is usually a composite key made up of all of its foreign keys.
In the example in fig 1.6, "Sales Dollar" is a fact (measure) and it can be added across several dimensions.
Fact tables store different types of measures: additive, non-additive and semi-additive.
Measure Types
Additive - Measures that can be added across all dimensions.
Non Additive - Measures that cannot be added across all dimensions.
Semi Additive - Measures that can be added across a few dimensions and not with others.
A fact table might contain either detail level facts or facts that have been aggregated (fact tables that
contain aggregated facts are often instead called summary tables).
In the real world, it is possible to have a fact table that contains no measures or facts. These tables are
called Factless Fact tables.
Steps in designing Fact Table
Identify a business process for analysis (like sales).
Identify measures or facts (sales dollar).
Identify dimensions for facts (product dimension, location dimension, time dimension, organization dimension).
List the columns that describe each dimension (region name, branch name, employee name).
Determine the lowest level of summary in a fact table (sales dollar for a product in a year within a location, sold or serviced by an employee).
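As an illustrative sketch only (table and column names assumed from the dimension examples in this document), a sales fact table at this grain could be declared with foreign keys to each dimension and the additive measure:
CREATE TABLE sales_fact
( product_dimension_id      NUMBER NOT NULL,   -- FK to PRODUCT DIMENSION
  location_dimension_id     NUMBER NOT NULL,   -- FK to LOCATION DIMENSION
  time_dim_id               NUMBER NOT NULL,   -- FK to TIME DIMENSION
  organization_dimension_id NUMBER NOT NULL,   -- FK to ORGANIZATION DIMENSION
  sales_dollar              NUMBER(15,2),      -- additive measure
  CONSTRAINT sales_fact_pk PRIMARY KEY
    (product_dimension_id, location_dimension_id, time_dim_id, organization_dimension_id)
);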

Posted 19th December 2011 by Prafull Dangore
0

Add a comment
Dec
19

Steps in designing Star Schema


Steps in designing Star Schema
Identify a business process for analysis (like sales).
Identify measures or facts (sales dollar).
Identify dimensions for facts (product dimension, location dimension, time dimension, organization
dimension).
List the columns that describe each dimension (region name, branch name, employee name).
Determine the lowest level of summary in a fact table(sales dollar).
Important aspects of Star Schema & Snow Flake Schema
In a star schema every dimension will have a primary key.
In a star schema, a dimension table will not have any parent table.
Whereas in a snow flake schema, a dimension table will have one or more parent tables.
Hierarchies for the dimensions are stored in the dimensional table itself in star schema.
Whereas hierarchies are broken into separate tables in a snow flake schema. These hierarchies help to
drill down the data from the topmost hierarchy to the lowermost hierarchy.

Posted 19th December 2011 by Prafull Dangore
0

Add a comment
Dec
19

What is update strategy and what are the options for update strategy?
Scenario:
What is update strategy and what are the options for update strategy?
Solution:
Informatica processes the source data row by row. By default, every row is
marked to be inserted into the target table. If the row has to be updated or inserted
based on some logic, the Update Strategy transformation is used. The condition can be
specified in the Update Strategy to mark the processed row for update or insert.
The following options are available for the update strategy (an expression sketch follows this list):
DD_INSERT : Flags the row for insertion. The equivalent numeric value of DD_INSERT is 0.
DD_UPDATE : Flags the row for update. The equivalent numeric value of DD_UPDATE is 1.
DD_DELETE : Flags the row for deletion. The equivalent numeric value of DD_DELETE is 2.
DD_REJECT : Flags the row for rejection. The equivalent numeric value of DD_REJECT is 3.
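A minimal sketch of an Update Strategy expression (the lookup return port LKP_CUST_ID is a hypothetical name used only for illustration):
-- insert the row if it does not exist in the target, otherwise update it
IIF(ISNULL(LKP_CUST_ID), DD_INSERT, DD_UPDATE)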

Posted 19th December 2011 by Prafull Dangore
0

Add a comment
Dec
19

What is aggregator transformation? & what is Incremental Aggregation?
Scenario:
What is aggregator transformation? & what is Incremental Aggregation?
Solution:
The Aggregator transformation allows performing aggregate calculations, such as
averages and sums. Unlike Expression Transformation, the Aggregator
transformation can only be used to perform calculations on groups. The Expression
transformation permits calculations on a row-by-row basis only.
Aggregator Transformation contains group by ports that indicate how to group the
data. While grouping the data, the aggregator transformation outputs the last row of
each group unless otherwise specified in the transformation properties.
Various group by functions available in Informatica are : AVG, COUNT, FIRST, LAST,
MAX, MEDIAN, MIN, PERCENTILE, STDDEV, SUM, VARIANCE.
Whenever a session is created for a mapping with an Aggregator transformation, the session
option for Incremental Aggregation can be enabled. When PowerCenter performs
incremental aggregation, it passes new source data through the mapping and uses
historical cache data to perform the new aggregation calculations incrementally.
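For intuition only, an Aggregator with DEPT_ID as the group-by port and a SUM(SALARY) output port behaves roughly like the following SQL (table and column names are assumed):
SELECT dept_id,
       SUM(salary) AS total_salary
FROM   employees
GROUP  BY dept_id;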

Posted 19th December 2011 by Prafull Dangore
0

Add a comment
Dec
19

What are the different types of locks?


Scenario:
What are the different types of locks?
Solution:
There are five kinds of locks on repository objects:
1. Read lock => Created when you open a repository object in a folder for which
you do not have write permission. Also
created when you open an object with an existing write lock.
2. Write lock => Created when you create or edit a repository object in a folder for
which you have write permission.
3. Execute lock => Created when you start a session or batch, or when the
Informatica Server starts a scheduled session or batch.

4. Fetch lock => Created when the repository reads information about repository
objects from the database.
5. Save lock => Created when you save information to the repository.

Posted 19th December 2011 by Prafull Dangore
0

Add a comment
Dec
19

What is Shortcuts, Sessions, Batches, mapplets, mappings, Worklet & workflow?
Scenario: What is Shortcuts, Sessions, Batches, mapplets, mappings, Worklet &
workflow?
Solution:
Shortcuts?
We can create shortcuts to objects in shared folders. Shortcuts provide the easiest
way to reuse objects. We use a shortcut as if it were the actual object, and when we
make a change to the original object, all shortcuts inherit the change.
Shortcuts to folders in the same repository are known as local shortcuts. Shortcuts
to the global repository are called global shortcuts.
We use the Designer to create shortcuts.
Sessions and Batches
Sessions and batches store information about how and when the Informatica Server
moves data through mappings. You create a session for each mapping you want to
run. You can group several sessions together in a batch. Use the Server Manager to
create sessions and batches.

Mapplets
You can design a mapplet to contain sets of transformation logic to be reused in
multiple mappings within a folder, a repository, or a domain. Rather than recreate
the same set of transformations each time, you can create a mapplet containing the
transformations, then add instances of the mapplet to individual mappings. Use the
Mapplet Designer tool in the Designer to create mapplets.
Mappings
A mapping specifies how to move and transform data from sources to targets.
Mappings include source and
target definitions and transformations.
Transformations describe how the Informatica Server transforms data. Mappings can
also include shortcuts, reusable transformations, and mapplets. Use the Mapping
Designer tool in the Designer to create mappings.

session
A session is a set of instructions to move data from sources to targets.
workflow
A workflow is a set of instructions that tells the Informatica server how to execute
the tasks.
Worklet
Worklet is an object that represents a set of tasks.

Posted 19th December 2011 by Prafull Dangore
0

Add a comment
Dec
19

Informatica PowerCenter 8.x Architecture

Informatica PowerCenter 8.x Architecture
The PowerCenter domain is the fundamental administrative unit in
PowerCenter. The domain supports the administration of the distributed
services. A domain is a collection of nodes and services that you can group in
folders based on administration ownership.
A node is the logical representation of a machine in a domain. One node in
the domain acts as a gateway to receive service requests from clients and
route them to the appropriate service and node. Services and processes run
on nodes in a domain. The availability of a service or process on a node
depends on how you configure the service and the node.
Services for the domain include the Service Manager and a set of application
services:
Service Manager. A service that manages all domain operations. It runs the
application services and performs domain functions on each node in the
domain. Some domain functions include authentication, authorization, and
logging. For more information about the Service Manager, see Service
Manager.
Application services. Services that represent PowerCenter server-based
functionality, such as the Repository Service and the Integration Service. The
application services that run on a node depend on the way you configure
the services.

Posted 19th December 2011 by Prafull Dangore
0

Add a comment
Dec
16

What is a Star Schema? Which Schema is preferable in a performance-oriented way? Why?
Scenario:
What is a Star Schema? Which Schema is preferable in performance oriented way?
Why?
Solution:
A Star Schema is composed of 2 kinds of tables, one Fact Table and multiple
Dimension Tables.
It is called a star schema because the entity-relationship diagram
between dimensions and fact tables resembles a star where one fact table is
connected to multiple dimensions. The center of the star schema consists of a large
fact table and it points towards the dimension tables. The advantage of star schema
are slicing down, performance increase and easy understanding of data.

Fact Table contains the actual transactions or values that are being analyzed.
Dimension Tables contain descriptive information about those transactions or
values.
In Star Schemas, Dimension Tables are denormalized tables and Fact Tables are
highly normalized.
Star Schema

Star Schema is preferable because the smaller number of joins results in better performance.
Because Dimension Tables are denormalized, there is no need for additional joins all
the time (see the query sketch below).
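A rough sketch of a typical star-schema query (table and column names assumed from the dimension and fact examples in this document); each dimension is reached through a single join from the fact table:
SELECT t.year_no,
       p.product_category_name,
       SUM(f.sales_dollar) AS total_sales
FROM   sales_fact f,
       time_dimension t,
       product_dimension p
WHERE  f.time_dim_id          = t.time_dim_id
AND    f.product_dimension_id = p.product_dimension_id
GROUP  BY t.year_no, p.product_category_name;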
Steps in designing Star Schema
Identify a business process for analysis(like sales).
Identify measures or facts (sales dollar).
Identify dimensions for facts(product dimension, location dimension, time dimension, organization
dimension).
List the columns that describe each dimension (region name, branch name, employee name).
Determine the lowest level of summary in a fact table(sales dollar).
Important aspects of Star Schema & Snow Flake Schema
In a star schema every dimension will have a primary key.
In a star schema, a dimension table will not have any parent table.
Whereas in a snow flake schema, a dimension table will have one or more parent tables.
Hierarchies for the dimensions are stored in the dimensional table itself in star schema.
Whereas hierarchies are broken into separate tables in a snow flake schema. These hierarchies help to
drill down the data from the topmost hierarchy to the lowermost hierarchy.

Posted 16th December 2011 by Prafull Dangore
0

Add a comment
Dec
16

Informatica Case study


Informatica Case study
Scenario:
Data has to be moved from a legacy application to Siebel staging tables (EIM). The
client will provide the data in a delimited flat file. This file contains Contact records
which need to be loaded into the EIM_CONTACT table.
Some facts
A contact can be uniquely identified by concatenating the First name with the Last
name and Zip code.
Known issues
A potential problem with the load could be the telephone number, which is currently
stored as a string (XXX-YYY-ZZZZ format). We need to convert this into (XXX)
YYYYYYY format, where XXX is the area code in brackets followed by a space and
the 7-digit telephone number. Any extensions should be dropped.
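One way to do this in an Expression transformation is sketched below (assuming the input port is named PHONE and always begins with the XXX-YYY-ZZZZ pattern; anything after the 12th character, such as an extension, is dropped):
-- '123-456-7890 x22' becomes '(123) 4567890'
'(' || SUBSTR(PHONE, 1, 3) || ') ' || SUBSTR(PHONE, 5, 3) || SUBSTR(PHONE, 9, 4)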
Requirements
The load should have a batch number of 100. If the record count exceeds 500,
increment the batch number by 5
Since the flat file may have duplicates due to alphabet case issues, it's been decided
that all user keys on the table should be stored in uppercase. For uniformity's sake,
the first name and last name should also be loaded in uppercase.
Error logging
As per the client's IT standards, it's expected that any data migration run would provide
an automated high-level report (a flat file report is acceptable) which will give
information on how many records were read, loaded successfully and failed due to
errors.
Output expected from case study:
1. Informatica mapping from flat file to EIM_CONTACT table
2. Log file created for error logging

Posted 16th December 2011 by Prafull Dangore
0

Add a comment
Dec
16

Informatica Training Effectiveness Assessment

Informatica Training Effectiveness Assessment
Trainee:
Trainer:
Date of training:

1. Informatica is an ETL tool where ETL stands for
a. Extract Transform Load
b. Evaluate Transform Load
c. Extract Test Load
d. Evaluate Test Load

2. Informatica allows for the following:
a. One source multiple targets to be loaded within the same mapping
b. Multiple sources multiple targets to be loaded within the same mapping
c. Multiple sources single target to be loaded within the same mapping
d. Multiple sources multiple targets to be loaded provided mapplets are used within the mapping

3. The ____ manages the connections to the repository from the Informatica client application
a. Repository Server
b. Informatica Server
c. Informatica Repository Manager
d. Both a & b

4. During the development phase, it's best to use what type of tracing level to debug errors
a. Terse tracing
b. Verbose tracing
c. Verbose data tracing
d. Normal tracing

5. During Informatica installation, what is the installation sequence?
a. Informatica Client, Repository Server, Informatica Server
b. Informatica Server, Repository Server, Informatica Client
c. Repository Server, Informatica Server, Informatica Client
d. Either of the above is fine, however, to create the repository we need the Informatica client installed and the repository server process should be running

6. There is a requirement to concatenate the first name and last name from a flat file and use this concatenated value at 2 locations in the target table. The best way to achieve this functionality is by using the
a. Expression transformation
b. Filter transformation
c. Aggregator transformation
d. Using the character transformation

7. The workflow monitor does not allow the user to edit workflows.
a. True
b. False

8. There is a requirement to increment a batch number by one for every 5000 records that are loaded. The best way to achieve this is:
a. Use Mapping parameter in the session
b. Use Mapping variable in the session
c. Store the batch information in the workflow manager
d. Write code in a transformation to update values as required

9. There is a requirement to reuse some complex logic across 3 mappings. The best way to achieve this is:
a. Create a mapplet to encapsulate the reusable functionality and call this in the 3 mappings
b. Create a worklet and reuse this at the session level during execution of the mapping
c. Cut and paste the code across the 3 mappings
d. Keep this functionality as a separate mapping and call this mapping in 3 different mappings - this would make the code modular and reusable

10. You imported a delimited flat file ABC.TXT from your workstation into the Source
qualifier in Informatica client. You then proceeded with developing a mapping and
validated it for correctness using the Validate function. You then set it up for
execution in the workflow manager. When you execute the mapping, you get an
error stating that the file was not found. The most probable cause of this error is:
a. Your mapping is not correct and the file is not being parsed correctly by the source
qualifier
b. The file cannot be loaded from your workstation, it has to be on the server
c. Informatica did not have access to the NT directory on your workstation where the
file is stored
d. You forgot to mention the location of the file in the workflow properties and hence
the error
11. Various administrative functions such as folder creation and user access control are
done using:
a. Informatica Administration console
b. Repository Manager
c. Informatica Server
d. Repository Server

12. You created a mapping a few months back which is now invalid because the
database schema underwent updates in the form of new column extensions. In
order to fix the problem, you would:
a. Re-import the table definitions from the database
b. Make the updates to the table structure manually in the mapping
c. Informatica detects updates to table structures automatically. All you have to do is
click on Validate option for the mapping
d. None of the above. The mapping has to be scrapped and a new one needs to be
created
13. The parameter file is used to store the following information
a. Workflow parameters, session parameters, mapping parameters and variables
b. Workflow variables, session variables, mapping variables
c. Mapping parameters, session constants, workflow constants

14. The Gantt chart view in Informatica is useful for:
a. Tracking dependencies for sessions and mappings
b. Schedule workflows
c. View progress of workflows and view overall schedule
d. Plan start and end dates / times for each workflow run

15. When using the debugger function, you can stop execution at the following:
a. Errors or breakpoints
b. Errors only
c. Breakpoints only
d. First breakpoint after the error occurs

16. There is a requirement to selectively update or insert values in the target table
based on the value of a field in the source table. This can be achieved using:
a. Update Strategy transformation
b. Aggregator transformation
c. Router transformation
d. Use the Expression transformation to write code for this logic
17. A mapping can contain more than one source qualifier one for each source that is
imported.
a. True
b. False
18. Which of the following sentences are accurate
a. Power Channels are used to improve data migration across WAN / LAN networks
b. Power Channels are adapters that Informatica provides for various ERP / CRM
packages
c. Power Connect are used to improve data migration across WAN / LAN networks
d. None of the above
19. To create a valid mapping in Informatica, at a minimum, the following entities are
required:
a. Source, Source Qualifier, Transformation, Target
b. Source Qualifier, Transformation, Target
c. Source and Target
d. Source, Transformation, Target

20. When one imports a relational database table using the Source Analyzer, it always
creates the following in the mapping:
a. An instance of the table with a source qualifier with a one to one mapping for each
field
b. Source sorter with one to one mapping for each field
c. None of the above

Name:
Score:
Pass / Fail:

Ans:
1. a 2. b 3. a 4. c 5. a 6. a 7. a 8. b 9. a 10. b 11. b 12. a,b 13. a 14. c 15. a 16. a 17. b 18. a 19. a 20. a

Posted 16th December 2011 by Prafull Dangore
0

Add a comment
Dec
16

What is difference between mapping Parameter, SESSION Parameter, Database connection session parameters? Is it possible to create 3 parameters at a time? If possible, which one will fire FIRST?
Scenario:
What is difference between mapping Parameter, SESSION Parameter, Database connection session parameters? Is it possible to create 3 parameters at a time? If possible, which one will fire FIRST?
Solution:
We can pass all three types of parameters by using a parameter file; we
can declare all of them in one parameter file.
A mapping parameter is set at the mapping level for values that do not change from
session to session, for example tax rates.
A session parameter is set at the session level for values that can change from session
to session, such as database connections for the DEV, QA and PRD environments.
Database connection session parameters can be created for all input fields of
connection objects, for example username, password, etc.
It is possible to have all three types of parameters at a time.
The order of precedence is workflow, then session, then mapping (wf/s/m).

Posted 16th December 2011 by Prafull Dangore
0

Add a comment
Dec
16

What is parameter file? & what is the difference between mapping level and session level variables?
Scenario:
What is parameter file? & what is the difference between mapping level and session
level variables?
Solution:
A parameter file supplies the values to session-level variables and mapping-level
variables.
Variables are of two types:
Session level variables
Mapping level variables
Session level variables are of four types:
$DBConnection_Source
$DBConnection_Target
$InputFile
$OutputFile
Mapping level variables are of two types:
Variable
Parameter
What is the difference between mapping level and session level variables?
A mapping level variable always starts with $$.
A session level variable always starts with $.
A sample parameter file is sketched below.
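A minimal parameter file sketch (the folder, workflow and session names are hypothetical); the bracketed header names the scope, and each following line is a name=value pair:
[MyFolder.WF:wf_load_contacts.ST:s_m_load_contacts]
$DBConnection_Source=DEV_SRC_CONN
$DBConnection_Target=DEV_TGT_CONN
$InputFile=/infa_shared/SrcFiles/contacts.txt
$OutputFile=/infa_shared/TgtFiles/contacts_out.txt
$$BATCH_NUMBER=100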

Posted 16th December 2011 by Prafull Dangore
0

Add a comment
Dec
16

What is Worklet?
Scenario:
What is Worklet?
Solution:
A worklet is a set of reusable sessions. We cannot run a worklet without a workflow.
If we want to run 2 workflows one after another:
1. If both workflows exist in the same folder, we can create 2 worklets rather
than creating 2 workflows.
2. Then we can call these 2 worklets in one workflow.
3. There we can set the dependency.
4. If both workflows exist in different folders or repositories, then
we cannot create a worklet.
5. In that case we can set the dependency between the two workflows using a shell
script, which is one approach; the other approach is event wait and event raise.
Posted 16th December 2011 by Prafull Dangore
0

Add a comment
Dec
16

Differences between dynamic lookup cache and static lookup cache?
Scenario:
Differences between dynamic lookup cache and static lookup cache?
Solution:
Dynamic Lookup Cache:
In a dynamic lookup, the cache memory gets refreshed as soon as a record is
inserted, updated or deleted in the lookup table.
When we configure a Lookup transformation to use a dynamic lookup cache, we can
only use the equality operator in the lookup condition.
The NewLookupRow port is enabled automatically.
The classic case for a dynamic cache: the first and the last source records are for the
same key but there is a change in the address. The mapping has to insert the first
record and update the last record in the target table; the dynamic cache makes the
inserted first record visible when the last record is looked up.

Static Lookup Cache:
In a static lookup, the cache memory does not get refreshed even though records are
inserted or updated in the lookup table; it refreshes only in the next session run.
It is the default cache.
With a static lookup, the first record goes to the lookup and is checked against the
cache; no match is found, so the lookup returns a null value and the router sends the
record to the insert flow. This record is still not available in the cache memory, so when
the last record comes to the lookup it again finds no match, returns a null value and
goes to the insert flow through the router, although it was supposed to go to the update
flow, because the cache didn't get refreshed when the first record got inserted into the
target table.

Posted 16th December 2011 by Prafull Dangore
0

Add a comment
Dec
16

What is the difference between joiner and lookup?
Scenario:
What is the difference between joiner and lookup?
Solution:
Joiner:
In a joiner, on multiple matches it will return all matching records.
In a joiner we cannot configure it to use a persistent cache, shared cache, uncached
mode or dynamic cache.
We cannot override the query in a joiner.
We can perform an outer join in a Joiner transformation.
We cannot use relational operators in a Joiner transformation (i.e. <, >, <= and so on).

Lookup:
In a lookup, on multiple matches it will return either the first record or the last record,
any value, or an error value.
In a lookup we can configure it to use a persistent cache, shared cache, uncached
mode or dynamic cache.
We can override the query in a lookup to fetch the data from multiple tables.
We cannot perform an outer join in a Lookup transformation.
In a lookup we can use relational operators (i.e. <, >, <= and so on).

Posted 16th December 2011 by Prafull Dangore

Add a comment
Dec
15

Syllabus for Informatica Professional Certification

Skill Set Inventory
Informatica Professional Certification Examination S
PowerCenter 8 Mapping Design
Includes Informatica PowerCenter 8.1.1
The PowerCenter 8 Mapping Design examination is composed of the fourteen
sections listed below. In order to ensure that you are prepared for the test, review the
subtopics associated with each section. The Informatica documentation is an excellent
source of information on the material that will be covered in the examination. If you are
thoroughly knowledgeable in the areas mentioned in this Skill Set Inventory, you will do well
on the examination.
The examination is designed to test for expert level knowledge. Informatica strongly
urges you to attain a complete understanding of these topics before you attempt to take
the examination. Hands-on experience with the software is the best way to gain this
understanding.
1. Designer configuration
A. Be familiar with the rules for using shared and non-shared folders.
B. Understand the meaning of each of the Designer configuration options.

C. Know what Designer options can be configured separately for each client machine.
D. Be familiar with the Designer toolbar functions, such as Find.
2. Transformation ports
A. Know the rules for linking transformation ports together.
B. Know the rules for using and converting the PowerCenter datatypes.
C. Know what types of transformation ports are supported and the uses for each.
D. Be familiar with the types of data operations that can be performed at the port level.
3. Source and target definitions
A. Understand how editing source and target definitions affects associated objects such as
mappings and mapplets.
B. Understand how the repository stores referential integrity.
C. Know how to edit flat file definitions at any time.
D. Know the types of source and target definitions PowerCenter supports.
E. Know how to determine if a session is considered to have heterogeneous targets.
F. Understand the rules and guidelines of overriding target types.
4. Validation
A. Know all the possible reasons why an expression may be invalid.
B. Understand how to use strings correctly in PowerCenter expressions.
C. Know the rules regarding connecting transformations to other transformations.
D. Know the rules for mapping and mapplet validation.
5. Transformation language
A. Be familiar with all transformation language functions and key words.
B. Know how the Integration Service evaluates expressions.
C. Be able to predict the output or result of a given expression.

6. Source Qualifier transformation


A. Understand how the Source Qualifier transformation handles datatypes.
B. Know how the default query is generated and the rules for modifying it.
C. Understand how to use the Source Qualifier transformation to perform various types of
joins.
7. Aggregator transformation
A. Know how to use PowerCenter aggregate functions.
B. Understand how to be able to use a variable port in an Aggregator transformation.
C. Be able to predict the output of a given Aggregator transformation.
D. Know the rules associated with defining and using aggregate caches.
8. Sorter and Sequence Generator transformations
A. Know the rules and guidelines for using the Sorter transformation.
B. Know how the Integration Service processes data at a Sorter transformation.
C. Understand how the Sorter transformation uses hardware resources.
D. Understand the meaning and use of the Distinct Output Rows Sorter transformation
property.
E. Understand the difference in the ports used in the Sequence Generator
transformation
and how each can be used.
9. Lookup transformation
A. Know the rules and guidelines for using connected and unconnected Lookup
transformations.
B. Know the ways a Lookup transformation may cause a session to fail.
C. Be familiar with the meaning of all Lookup transformation properties.
D. Know how the Integration Service processes a dynamic lookup cache.
E. Know what types of Lookup transformations are supported under various configurations.
10. Joiner transformation
A. Know how to create and use Joiner transformations.
B. Know how to configure a Joiner transformation for sorted input.
C. Understand the supported join types and options available for controlling the join.
11. Update Strategy transformation
A. Know how to use an Update Strategy transformation in conjunction with the session
properties.
B. Understand how an Update Strategy transformation affects downstream transformations.
C. Be familiar with the Update Strategy transformation properties and options.
D. Know what can happen to a given row for each different type of row operation.

12. Filter and Router transformations

A. Understand how to create and use Router and Filter Transformations.

13. Mapplets and reusable logic


A. Be familiar with the rules and guidelines regarding mapplets.
B. Know how to use mapplet Output transformations and output groups.
C. Know the rules regarding active and passive mapplets.
D. Know the rules and guidelines for copying parts of a mapping.
14. Data preview
A. Know the connectivity requirements and options for previewing data using the
PowerCenter Client.
B. Know the rules and guidelines for previewing data using the PowerCenter Client.

Skill Set Inventory

Informatica Professional Certification Examination R


PowerCenter 8 Architecture and Administration
Includes Informatica PowerCenter 8.1.1
The PowerCenter 8 Architecture and Administration examination is composed of the
twelve
sections listed below. In order to ensure that you are prepared for the test, review the
subtopics
associated with each section. The Informatica documentation is an excellent source of
information on the material that will be covered in the examination. If you are thoroughly
knowledgeable in the areas mentioned in this Skill Set Inventory, you will do well on the
examination.
The examination is designed to test for expert level knowledge. Informatica
strongly
urges you to attain a complete understanding of these topics before you attempt to take
the examination. Hands-on experience with the software is the best way to gain this
understanding.
1. Platform components and Service Architecture
A. Know what operations can be performed with each client tool (Administration
Console,
Repository Manager, Designer, Workflow Manager, Workflow Monitor).
B. Know the purpose and uses for each of the windows in the client tools (Output
window,
Details window, Navigator window, Task View, Gantt Chart View, etc).
C. Be able to specify which components are necessary to perform various
development
and maintenance operations.
D. Know the purpose and uses for each of the tabs and folders in the PowerCenter
Administration Console.
2. Nomenclature

A. Be able to define all object types and properties used by the client and service tools.
B. Be familiar with the properties of the Repository Service and the Integration Service.
C. Know the meaning of the terms used to describe development and maintenance
operations.
D. Know how to work with repository variables.
E. Understand the relationships between all PowerCenter object types.
F. Know which tools are used to create and modify all objects.

3. Repository Service
A. Know how each client and service component communicates with relational databases.
B. Be familiar with the connectivity options that are available for the different tools.
C. Understand how the client and service tools access flat files, COBOL files, and XML
Files.
D. Know the requirements for using various types of ODBC drivers with the client tools.
E. Know the meaning of all database connection properties.
F. Be familiar with the sequence of events involving starting the Repository Service.
G. Know which repository operations can be performed from the command line.
H. Know how local and global repositories interact.
4. Installation
A. Understand the basic procedure for installing the client and service software.
B. Know what non-Informatica hardware and software is required for installation.
C. Be familiar with network related requirements and limitations.
D. Know which components are needed to perform a repository upgrade.
E. Be familiar with the data movement mode options.
5. Security
A. Be familiar with the security permissions for application users.
B. Be familiar with the meaning of the various user types for an Informatica system.
C. Know the basic steps for creating and configuring application users.
D. Understand how user security affects folder operations.
E. Know which passwords and other key information are needed to install and connect new
client software to a service environment.
6. Object sharing
A. Understand the differences between copies and shortcuts.
B. Know which object properties are inherited in shortcuts.
C. Know the rules associated with transferring and sharing objects between folders.
D. Know the rules associated with transferring and sharing objects between repositories.
7. Repository organization and migration
A. Understand the various options for organizing a repository.
B. Be familiar with how a repository stores information about its own properties.
C. Be familiar with metadata extensions.

D. Know the capabilities and limitations of folders and other repository objects.
E. Know what type of information is stored in the repository.
8. Database connections
A. Understand the purpose and relationships between the different types of code
pages.
B. Know the differences between using native and ODBC database connections in the
Integration Service.
C. Understand how and why the client tools use database connectivity.
D. Know the differences between client and service connectivity.

9. Workflow Manager configuration
A. Know what privileges and permissions are needed to perform various operations with
the Workflow Manager.
B. Be able to identify which interface features in the Workflow Manager are user
configurable.
C. Be familiar with database, external loader, and FTP configuration using the Workflow
Manager.
10. Workflow properties
A. Be familiar with all user-configurable workflow properties.
B. Know what permissions are required to make all possible changes to workflow
properties.
C. Know the reasons why a workflow may fail and how these reasons relate to the workflow
properties.
D. Know the rules for linking tasks within a workflow.
E. Be familiar with the properties and rules of all types of workflow tasks.
F. Know how to use a workflow to read a parameter file.
11. Running and monitoring workflows
A. Know what types of privileges and permissions are needed to run and schedule
workflows.
B. Understand how and why target rows may be rejected for loading.
C. Be familiar with the rules associated with workflow links.
D. Understand how tasks behave when run outside of a workflow.
E. Know which mapping properties can be overridden at the session level.
F. Know how to work with reusable workflow schedules.
12. Workflow and task errors
A. Know how to abort or stop a workflow or task.
B. Understand how to work with workflow and session log files.
C. Understand how to work with reject files.

D. Know how to use the Workflow Monitor to quickly determine the status of any workflow or
task

Skill Set Inventory


Informatica Professional Certification Examination U
PowerCenter 8 Advanced Mapping Design
Includes Informatica PowerCenter 8.1.1

The PowerCenter 8 Advanced Mapping Design examination is composed of the


sections listed
below. All candidates should be aware that Informatica doesn't recommend preparing for
certification exams only by studying the software documentation. The exams are intended
for
candidates who have acquired their knowledge through hands-on experience. This Skill Set
Inventory is intended to help you ensure that there are no gaps in your knowledge. If you are
thoroughly knowledgeable in the areas mentioned in this Skill Set Inventory, there will not
be
any surprises when you take the examination.
The examination is designed to test for expert level knowledge. Informatica
strongly
urges you to attain a complete understanding of these topics before you attempt to take
the examination. Hands-on experience with the software is the best way to gain this
understanding.
1. XML sources and targets
A.Be familiar with the procedures and methods involved in defining an XML
source definition.
B. Know how to define and use an XML target definition.
C. Know the limitations associated with using XML targets.
D. Understand how XML definitions are related to code pages.
E. Know how to edit existing XML definitions.
F. Understand how the Designer validates XML sources.
2. Datatype formats and conversions
A.Understand the date/time formats available in the transformation language.
B. Be familiar with how transformation functions handle null values.
C.Know the valid input datatypes for the various conversion functions.
D. Know how transformation functions behave when given incorrectly formatted
arguments.
E.Know how to extract a desired subset of data from a given input (the hour
from a date/time value, for example).
3. The Debugger

A. Be familiar with the procedure to run a debug session.


B. Know the rules for working with breakpoints.
C. Know how to test expressions in a debug session.
D.Be familiar with the options available while using the Debugger.
E.Know how Debugger session properties and breakpoints can be saved.
4. Custom and Union transformations
A. Understand how the Custom transformation works.
B.Understand the rules and guidelines of the Custom and Union
transformations; modes, input and output groups, scope, etc.
C. Know the purpose of the Union transformation.

5. User-defined functions
A. Know how to create user-defined functions.
B. Know the scope of user-defined functions.
C.Know how to use and manage user-defined functions.
D. Understand the different properties for user-defined functions.
E. Know how to create expressions with user-defined functions.
6. Normalizer transformation
A. Be familiar with the possible uses of the Normalizer transformation.
B. Understand how to read a COBOL data source in a mapping.
C. Be familiar with the rules regarding reusable Normalizer transformations.
D.Know how the OCCURS and REDEFINES COBOL keywords affect the
Normalizer transformation.
7.Lookup transformation caching and performance
A. Know the difference between static and dynamic lookup caches.
B.Know the advantages and disadvantages of dynamic lookup caches.
C. Be familiar with the rules regarding Lookup SQL overrides.
D.Know how to improve Lookup transformation performance.
E.Know how to use a dynamic Lookup transformation in a mapping.
8. Mapping parameters and variables
A.Understand the differences between mapping parameters and variables.
B. Know how to create and use mapping parameters and variables.
C. Understand what affects the value of a mapping variable.
D. Know the parameter order of precedence.
E. Understand how to use the property mapping variable aggregation type.
F.Be familiar with the rules affecting parameters used with mapplets and reusable
transformations.
9. Transaction control
A. Understand how the Transaction Control transformation works and the
purpose of each related variable.
B.Know how to create and use a Transaction Control transformation in a
mapping.
C.Know the difference between effective and ineffective Transaction Control
transformations and what makes it effective or ineffective.
D.Know the rules and guidelines for using Transaction Control transformations
in a mapping.
E.Understand the meaning and purpose of a transaction control unit.

10. Advanced expressions


A. Be familiar with all special functions, such as ABORT and ERROR.
B.Know the allowable input datatypes for the Informatica transformation
language functions.
C.Know how to use expressions to set variables.
D. Know the details behind the meaning and use of expression time stamps.
E. Understand how to use the system variables.
11. Mapping optimization
A.Know how to collect and view performance details for transformations in a
mapping.
B.Know how to use local variables to improve transformation performance.
C. Know when is the best time in the development cycle to optimize mappings.
D.Be familiar with specific mapping optimization techniques described in the
PowerCenter documentation.
12. Version control
A. Know the difference between a deleted object and a purged object.
B. Understand parent/child relationships in a versioned repository.
C.Understand how to view object history and how/when objects get versioned.
13. Incremental Aggregation
A.Understand how incremental aggregation works.
B.Know how to use incremental aggregation
C. Know how to work with the session Sort Order property.
D.Know which files are created when a session using incremental aggregation
runs.
14. Global SDK
15. Installation and Support

Posted 15th December 2011 by Prafull Dangore
0

Add a comment
Dec
15

Informatica Performance Tuning


Scenario:
Informatica Performance Tuning
Solution:
Identifying Target Bottlenecks
-----------------------------
The most common performance bottleneck occurs when the Informatica Server writes to a target
database. You can identify target bottlenecks by configuring the session to write to a flat file target. If the
session performance increases significantly when you write to a flat file, you have
a target bottleneck.
Consider performing the following tasks to increase performance:
* Drop indexes and key constraints.
* Increase checkpoint intervals.
* Use bulk loading.
* Use external loading.
* Increase database network packet size.
* Optimize target databases.
Identifying Source Bottlenecks
-----------------------------
If the session reads from a relational source, you can use a filter transformation, a read test mapping, or a
database query to identify source bottlenecks:
* Filter Transformation - measure the time taken to process a given amount of data, then add an always
false filter transformation in the mapping after each source qualifier so that no data is processed past the
filter transformation. You have a source bottleneck if the new session runs in about the same time.
* Read Test Session - compare the time taken to process a given set of data using the session with that
for a session based on a copy of the mapping with all transformations after the source qualifier removed
with the source qualifiers connected to file targets. You have a source bottleneck if the new session runs
in about the same time.
* Extract the query from the session log and run it in a query tool. Measure the time taken to return the
first row and the time to return all rows. If there is a significant difference in time, you can use an optimizer
hint to eliminate the source bottleneck
Consider performing the following tasks to increase performance:
* Optimize the query.
* Use conditional filters.
* Increase database network packet size.
* Connect to Oracle databases using IPC protocol.
Identifying Mapping Bottlenecks
------------------------------
If you determine that you do not have a source bottleneck, add an Always False filter transformation in the

mapping before each target definition so that no data is loaded into the target tables. If the time it takes to
run the new session is the same as the original session, you have a mapping bottleneck.
You can also identify mapping bottlenecks by examining performance counters.
Readfromdisk and Writetodisk Counters: If a session contains Aggregator, Rank, or Joiner
transformations, examine each Transformation_readfromdisk and Transformation_writetodisk counter. If
these counters display any number other than zero, you can improve session performance by increasing
the index and data cache sizes. Note that if the session uses Incremental Aggregation, the counters must
be examined during the run, because the Informatica Server writes to disk when saving historical data at
the end of the run.
Rowsinlookupcache Counter: A high value indicates a larger lookup, which is more likely to be a
bottleneck
Errorrows Counters: If a session has large numbers in any of the Transformation_errorrows counters, you
might improve performance by eliminating the errors.
BufferInput_efficiency and BufferOutput_efficiency counters: Any dramatic difference in a given set of
BufferInput_efficiency and BufferOutput_efficiency counters indicates inefficiencies that may benefit from
tuning.
To enable collection of performance data:
1. Set session property Collect Performance Data (on Performance tab)
2. Increase the size of the Load Manager Shared Memory by 200kb for each session in shared memory
that you configure to create performance details. If you create performance details for all sessions,
multiply the MaxSessions parameter by 200kb to calculate the additional shared memory requirements.
To view performance details in the Workflow Monitor:
1. While the session is running, right-click the session in the Workflow Monitor and choose Properties.
2. Click the Performance tab in the Properties dialog box.
To view the performance details file:
1. Locate the performance details file. The Informatica Server names the file session_name.perf, and
stores it in the same directory as the session log.
2. Open the file in any text editor.
General Optimizations
---------------------
Single-pass reading - instead of reading the same data several times, combine mappings that use the
same set of source data and use a single source qualifier
Avoid unnecessary data conversions: For example, if your mapping moves data from an Integer column
to a Decimal column, then back to an Integer column, the unnecessary data type conversion slows
performance.
Factor out common expressions/transformations and perform them before data pipelines split
Optimize Char-Char and Char-Varchar Comparisons by using the Treat CHAR as CHAR On Read option
in the Informatica Server setup so that the Informatica Server does not trim trailing spaces from the end of
Char source fields.
Eliminate Transformation Errors (conversion errors, conflicting mapping logic, and any condition set up as
an error, such as null input). In large numbers they restrict performance because for each one, the
Informatica Server pauses to determine its cause, remove the row from the data flow and write it to the
session log or bad file.

As a short term fix, reduce the tracing level on sessions that must generate large numbers of errors.
Optimize lookups
----------------
Cache lookups if
o the number of rows in the lookup table is significantly less than the typical number of source rows
o un-cached lookups perform poorly (e.g. they are based on a complex view or an unindexed table)
Optimize Cached lookups
o Use a persistent cache if the lookup data is static
o Share caches if several lookups are based on the same data set
o Reduce the number of cached rows using a SQL override with a restriction
o Index the columns in the lookup ORDER BY
o Reduce the number of co

Posted 15th December 2011 by Prafull Dangore
0

Add a comment
Dec
15

Informatica OPB tables which give the source table and the mappings and folders using an SQL query
Scenario:
Informatica OPB tables which give the source table and the mappings and folders using an SQL query.
Solution:
SQL query:
select OPB_SUBJECT.SUBJ_NAME,
OPB_MAPPING.MAPPING_NAME,
OPB_SRC.source_name
from opb_mapping, opb_subject, opb_src, opb_widget_inst
where opb_subject.SUBJ_ID = opb_mapping.SUBJECT_ID
and OPB_MAPPING.MAPPING_ID = OPB_WIDGET_INST.MAPPING_ID
and OPB_WIDGET_INST.WIDGET_ID = OPB_SRC.SRC_ID
and OPB_WIDGET_INST.WIDGET_TYPE = 1;

Posted 15th December 2011 by Prafull Dangore
0

Add a comment
Dec
15

What is Pushdown Optimization and things to consider
Scenario: What is Pushdown Optimization and things to consider
Solution:
The process of pushing transformation logic to the source or target database by Informatica
Integration service is known as Pushdown Optimization. When a session is configured to run
for Pushdown Optimization, the Integration Service translates the transformation logic into
SQL queries and sends the SQL queries to the database. The Source or Target Database
executes the SQL queries to process the transformations.

How does Pushdown Optimization (PO) work?
The Integration Service generates SQL statements when native database driver is used. In
case of ODBC drivers, the Integration Service cannot detect the database type and
generates ANSI SQL. The Integration Service can usually push more transformation logic to
a database if a native driver is used, instead of an ODBC driver.
For any SQL Override, Integration service creates a view (PM_*) in the database while
executing the session task and drops the view after the task gets completed. Similarly, it also
creates sequences (PM_*) in the database.
Database schema (SQ Connection, LKP connection), should have the Create View / Create
Sequence Privilege, else the session will fail.

Few Benefits in using PO

There is no memory or disk space required to manage the cache in the Informatica
server for Aggregator, Lookup, Sorter and Joiner Transformation, as the
transformation logic is pushed to database.

SQL generated by the Informatica Integration Service can be viewed before running the
session through the Optimizer viewer, making it easier to debug.

When inserting into Targets, the Integration Service does row-by-row processing using bind
variables (soft parse, so only processing time, no parsing time). But in case of
Pushdown Optimization, the statement will be executed only once.

Without Using Pushdown Optimization:
INSERT INTO EMPLOYEES (ID_EMPLOYEE, EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL,
PHONE_NUMBER, HIRE_DATE, JOB_ID, SALARY, COMMISSION_PCT, MANAGER_ID, MANAGER_NAME,
DEPARTMENT_ID)
VALUES (:1, :2, :3, :4, :5, :6, :7, :8, :9, :10, :11, :12, :13)
executes 7012352 times
With Using Pushdown Optimization:
INSERT INTO EMPLOYEES (ID_EMPLOYEE, EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL,
PHONE_NUMBER, HIRE_DATE, JOB_ID, SALARY, COMMISSION_PCT, MANAGER_ID, MANAGER_NAME,
DEPARTMENT_ID)
SELECT CAST(PM_SJEAIJTJRNWT45X3OO5ZZLJYJRY.NEXTVAL AS NUMBER(15, 2)),
       EMPLOYEES_SRC.EMPLOYEE_ID,
       EMPLOYEES_SRC.FIRST_NAME,
       EMPLOYEES_SRC.LAST_NAME,
       CAST((EMPLOYEES_SRC.EMAIL || '@gmail.com') AS VARCHAR2(25)),
       EMPLOYEES_SRC.PHONE_NUMBER,
       CAST(EMPLOYEES_SRC.HIRE_DATE AS date),
       EMPLOYEES_SRC.JOB_ID,
       EMPLOYEES_SRC.SALARY,
       EMPLOYEES_SRC.COMMISSION_PCT,
       EMPLOYEES_SRC.MANAGER_ID,
       NULL,
       EMPLOYEES_SRC.DEPARTMENT_ID
FROM   (EMPLOYEES_SRC LEFT OUTER JOIN EMPLOYEES PM_Alkp_emp_mgr_1
        ON (PM_Alkp_emp_mgr_1.EMPLOYEE_ID = EMPLOYEES_SRC.MANAGER_ID))
WHERE  ((EMPLOYEES_SRC.MANAGER_ID = (SELECT PM_Alkp_emp_mgr_1.EMPLOYEE_ID
                                     FROM EMPLOYEES PM_Alkp_emp_mgr_1
                                     WHERE (PM_Alkp_emp_mgr_1.EMPLOYEE_ID = EMPLOYEES_SRC.MANAGER_ID)))
        OR (0=0))
executes 1 time

Things to note when using PO


There are cases where the Integration Service and Pushdown Optimization can produce
different result sets for the same transformation logic. This can happen during data type
conversion, handling null values, case sensitivity, sequence generation, and sorting of data.
The database and Integration Service produce different output when the following settings
and conversions are different:

Nulls treated as the highest or lowest value: While sorting the data,
the Integration Service can treat null values as lowest, but database treats null
values as the highest value in the sort order.

SYSDATE built-in variable: Built-in Variable SYSDATE in the Integration


Service returns the current date and time for the node running the service process.
However, in the database, the SYSDATE returns the current date and time for the
machine hosting the database. If the time zone of the machine hosting the database
is not the same as the time zone of the machine running the Integration Service
process, the results can vary.

Date Conversion: The Integration Service converts all dates before pushing
transformations to the database and if the format is not supported by the database,
the session fails.

Logging: When the Integration Service pushes transformation logic to the


database, it cannot trace all the events that occur inside the database server. The
statistics the Integration Service can trace depend on the type of pushdown
optimization. When the Integration Service runs a session configured for full
pushdown optimization and an error occurs, the database handles the errors. When
the database handles errors, the Integration Service does not write reject rows to the
reject file.

Posted 15th December 2011 by Prafull Dangore

Add a comment
Dec
14

Informatica Interview Questionnaire


Informatica Interview Questionnaire
1.

What are the components of Informatica? And what is the purpose of each?
Ans: Informatica Designer, Server Manager & Repository Manager. Designer for Creating Source & Target
definitions, Creating Mapplets and Mappings etc. Server Manager for creating sessions & batches, Scheduling the
sessions & batches, Monitoring the triggered sessions and batches, giving post and pre session commands, creating
database connections to various instances etc. Repository Manage for Creating and Adding repositories, Creating &
editing folders within a repository, Establishing users, groups, privileges & folder permissions, Copy, delete, backup a
repository, Viewing the history of sessions, Viewing the locks on various objects and removing those locks etc.

2.

What is a repository? And how to add it in an informatica client?


Ans: Its a location where all the mappings and sessions related information is stored. Basically its a database where
the metadata resides. We can add a repository through the Repository manager.

3. Name at least 5 different types of transformations used in mapping design and state the use of each.
Ans: Source Qualifier - represents all data queries from the source,
Expression - performs simple calculations,
Filter - serves as a conditional filter,
Lookup - looks up values and passes them to other objects,
Aggregator - performs aggregate calculations.

4.

How can a transformation be made reusable?


Ans: In the edit properties of any transformation there is a check box to make it reusable, by checking that it becomes
reusable. You can even create reusable transformations in Transformation developer.

5.

How are the sources and targets definitions imported in informatica designer? How to create Target
definition for flat files?
Ans: When you are in source analyzer there is a option in main menu to Import the source from Database, Flat File,
Cobol File & XML file, by selecting any one of them you can import a source definition. When you are in Warehouse
Designer there is an option in main menu to import the target from Database, XML from File and XML from sources
you can select any one of these.
There is no way to import target definition as file in Informatica designer. So while creating the target definition for a
file in the warehouse designer it is created considering it as a table, and then in the session properties of that
mapping it is specified as file.

6.

Explain what is sql override for a source table in a mapping.


Ans: The Source Qualifier provides the SQL Query option to override the default query. You can enter any SQL
statement supported by your source database. You might enter your own SELECT statement, or have the database
perform aggregate calculations, or call a stored procedure or stored function to read the data and perform some
tasks.

7.

What is lookup override?


Ans: This feature is similar to entering a custom query in a Source Qualifier transformation. When entering a Lookup
SQL Override, you can enter the entire override, or generate and edit the default SQL statement.
The lookup query override can include WHERE clause.

8.

9.

What are mapplets? How is it different from a Reusable Transformation?


Ans: A mapplet is a reusable object that represents a set of transformations. It allows you to reuse transformation
logic and can contain as many transformations as you need. You create mapplets in the Mapplet Designer.
Its different than a reusable transformation as it may contain a set of transformations, while a reusable transformation
is a single one.
How to use an oracle sequence generator in a mapping?
Ans: We have to write a stored procedure, which can take the sequence name as input and dynamically generates a
nextval from that sequence. Then in the mapping we can use that stored procedure through a procedure
transformation.
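As an illustration, a minimal sketch of such a stored function (the function and parameter names are assumptions; the sequence name passed in must exist in the schema) could be:
CREATE OR REPLACE FUNCTION get_next_val (p_seq_name IN VARCHAR2)
RETURN NUMBER
IS
  v_next NUMBER;
BEGIN
  -- build and run the SELECT dynamically so any sequence name can be passed in
  EXECUTE IMMEDIATE 'SELECT ' || p_seq_name || '.NEXTVAL FROM dual' INTO v_next;
  RETURN v_next;
END;
/
The function can then be called from a Stored Procedure transformation in the mapping to supply surrogate key values.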

10. What is a session and how to create it?


Ans: A session is a set of instructions that tells the Informatica Server how and when to move data from sources to
targets. You create and maintain sessions in the Server Manager.
11. How to create the source and target database connections in server manager?
Ans: In the main menu of server manager there is menu Server Configuration, in that there is the menu Database
connections. From here you can create the Source and Target database connections.
12. Where are the source flat files kept before running the session?
Ans: The source flat files can be kept in some folder on the Informatica server or any other machine, which is in its
domain.
13. What are the oracle DML commands possible through an update strategy?
Ans: dd_insert, dd_update, dd_delete & dd_reject.
14. How to update or delete the rows in a target, which do not have key fields?
Ans: To Update a table that does not have any Keys we can do a SQL Override of the Target Transformation by
specifying the WHERE conditions explicitly. Delete cannot be done this way. In this case you have to specifically
mention the Key for Target table definition on the Target transformation in the Warehouse Designer and delete the
row using the Update Strategy transformation.
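For the update case, the target Update Override property can carry the WHERE clause explicitly; a hedged sketch (assuming a target T_EMP identified by the non-key column EMP_NAME) could be:
UPDATE T_EMP
SET    SALARY   = :TU.SALARY,
       DEPT_ID  = :TU.DEPT_ID
WHERE  EMP_NAME = :TU.EMP_NAME
Here :TU refers to the ports of the target definition; the table and column names above are illustrative only.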
15. What is option by which we can run all the sessions in a batch simultaneously?
Ans: In the batch edit box there is an option called concurrent. By checking that all the sessions in that Batch will run
concurrently.
16. Informatica settings are available in which file?
Ans: Informatica settings are available in a file pmdesign.ini in Windows folder.
17. How can we join the records from two heterogeneous sources in a mapping?
Ans: By using a joiner.
18. Difference between Connected & Unconnected look-up.
Ans: An unconnected Lookup transformation exists separate from the pipeline in the mapping. You write an
expression using the :LKP reference qualifier to call the lookup within another transformation. While the connected
lookup forms a part of the whole flow of mapping.

19. Difference between Lookup Transformation & Unconnected Stored Procedure Transformation Which one is
faster ?

20. Compare Router Vs Filter & Source Qualifier Vs Joiner.


Ans: A Router transformation has input ports and output ports. Input ports reside in the input group, and output ports reside in the
output groups. Here you can test data based on one or more group filter conditions.
But in filter you can filter data based on one or more conditions before writing it to targets.
A source qualifier can join data coming from same source database. While a joiner is used to combine data from
heterogeneous sources. It can even join data from two tables from same database.
A source qualifier can join more than two sources. But a joiner can join only two sources.
21. How to Join 2 tables connected to a Source Qualifier w/o having any relationship defined ?
Ans: By writing an sql override.
22. In a mapping there are 2 targets to load header and detail, how to ensure that header loads first then detail
table.
Ans: Constraint Based Loading (if no relationship at oracle level) OR Target Load Plan (if only 1 source qualifier for
both tables) OR select first the header target table and then the detail table while dragging them in mapping.
23. A mapping just take 10 seconds to run, it takes a source file and insert into target, but before that there is a
Stored Procedure transformation which takes around 5 minutes to run and gives output Y or N. If Y then
continue feed or else stop the feed. (Hint: since SP transformation takes more time compared to the
mapping, it shouldnt run row wise).
Ans: There is an option to run the stored procedure before starting to load the rows.
Data warehousing concepts
1.What is difference between view and materialized view?
A view contains only a query; whenever you execute the view it reads from the base tables.
A materialized view, on the other hand, is loaded/replicated only once (and then refreshed), which gives better query performance.
Materialized views can be refreshed 1. on commit or 2. on demand
(complete, never, fast, force)
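For example, a hedged sketch of an aggregate materialized view with an on-demand complete refresh (table and column names are illustrative):
CREATE MATERIALIZED VIEW mv_sales_summary
BUILD IMMEDIATE
REFRESH COMPLETE ON DEMAND
AS
SELECT product_id, SUM(amount) AS total_amount
FROM   sales
GROUP BY product_id;
-- refresh it later with:
EXEC DBMS_MVIEW.REFRESH('MV_SALES_SUMMARY', 'C');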
2.What is bitmap index why its used for DWH?
A bitmap for each key value replaces a list of rowids. A bitmap index is more efficient for data warehousing because of low cardinality and low update activity, and it is very efficient for WHERE-clause filtering.
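A small hedged example on a low-cardinality column (the table and column names are illustrative):
CREATE BITMAP INDEX idx_cust_gender
ON customer_dim (gender);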
3.What is star schema? And what is snowflake schema?
The center of the star consists of a large fact table and the points of the star are the dimension tables.
Snowflake schemas normalized dimension tables to eliminate redundancy. That is, the
Dimension data has been grouped into multiple tables instead of one large table.

A star schema contains denormalized dimension tables and a fact table; each primary key value in a dimension table is associated with a foreign key of the fact table.
Here a fact table contains all business measures (normally numeric data) and foreign key values, and dimension
tables has details about the subject area.
Snowflake schema basically a normalized dimension tables to reduce redundancy in the dimension tables
4.Why need staging area database for DWH?
A staging area is needed to clean operational data before loading it into the data warehouse.
Cleaning in the sense of merging data that comes from different sources.
5.What are the steps to create a database in manually?
Create the OS service, create the init file, start the database in NOMOUNT stage, then give the CREATE DATABASE command.
6.Difference between OLTP and DWH?
An OLTP system is basically application oriented (e.g. purchase order processing is a function of an application),
whereas the concern in a DWH is subject orientation (subject in the sense of customer, product, item, time).
OLTP

Application Oriented

Used to run business

Detailed data

Current up to date

Isolated Data

Repetitive access

Clerical User

Performance Sensitive

Few Records accessed at a time (tens)

Read/Update Access

No data redundancy

Database Size 100MB-100 GB


DWH

Subject Oriented

Used to analyze business

Summarized and refined

Snapshot data

Integrated Data

Ad-hoc access

Knowledge User

Performance relaxed

Large volumes accessed at a time(millions)

Mostly Read (Batch Update)

Redundancy present

Database Size 100 GB - few terabytes


7.Why need data warehouse?
A single, complete and consistent store of data obtained from a variety of different sources, made available to end users in a form they can understand and use in a business context.
A process of transforming data into information and making it available to users in a timely enough manner to make a difference.
A technique for assembling and managing data from various sources for the purpose of answering business questions, thus making decisions that were not previously possible.

8.What is difference between data mart and data warehouse?


A data mart designed for a particular line of business, such as sales, marketing, or finance.
Where as data warehouse is enterprise-wide/organizational
The data flow of data warehouse depending on the approach
9.What is the significance of surrogate key?
Surrogate key used in slowly changing dimension table to track old and new values and its derived from primary key.
10.What is slowly changing dimension. What kind of scd used in your project?
Dimension attribute values may change constantly over the time. (Say for example customer dimension has
customer_id,name, and address) customer address may change over time.
How will you handle this situation?
There are 3 types, one is we can overwrite the existing record, second one is create additional new record at the time
of change with the new attribute values.
Third one is create new field to keep new values in the original dimension table.
11.What is difference between primary key and unique key constraints?
Primary key maintains uniqueness and not null values
Where as unique constrains maintain unique values and null values
12.What are the types of index? And is the type of index used in your project?
Bitmap index, B-tree index, Function based index, reverse key and composite index.
We used Bitmap index in our project for better performance.
13.How is your DWH data modeling(Details about star schema)?
14.A table have 3 partitions but I want to update in 3rd partitions how will you do?
Specify the partition name in the update statement, for example:
UPDATE employee PARTITION (partition_name) a SET a.empno = 10 WHERE a.ename = 'Ashok';
15.When you give an update statement how memory flow will happen and how oracles allocate memory for
that?
Oracle first checks in Shared sql area whether same Sql statement is available if it is there it uses. Otherwise allocate memory in
shared sql area and then create run time memory in Private sql area to create parse tree and execution plan. Once it completed
stored in the shared sql area wherein previously allocated memory
16.Write a query to find out 5th max salary? In Oracle, DB2, SQL Server
In Oracle, one classic way is to take the top 5 distinct salaries and pick the smallest of them:
SELECT MIN(salary) FROM (SELECT DISTINCT salary FROM employee ORDER BY salary DESC) WHERE ROWNUM <= 5;
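A more portable sketch that works in Oracle, DB2 and SQL Server (assuming an employee table with a salary column) uses an analytic function:
SELECT salary
FROM (
  SELECT salary,
         DENSE_RANK() OVER (ORDER BY salary DESC) AS rnk
  FROM   employee
) t
WHERE rnk = 5;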

17.When you give an update statement how undo/rollback segment will work/what are the steps?
Oracle keep old values in undo segment and new values in redo entries. When you say rollback it replace old values
from undo segment. When you say commit erase the undo segment values and keep new vales in permanent.
Informatica Administration
18.What is DTM? How will you configure it?
DTM transform data received from reader buffer and its moves transformation to transformation on row by row basis and it uses
transformation caches when necessary.
19.You transfer 100000 rows to target but some rows get discard how will you trace them? And where its get
loaded?
Rejected records are loaded into bad files. It has record indicator and column indicator.
The record indicator is identified by (0-insert, 1-update, 2-delete, 3-reject) and the column indicator by (D-valid, O-overflow, N-null, T-truncated).
Normally data may get rejected in different reason due to transformation logic
20.What are the different uses of a repository manager?
Repository manager used to create repository which contains metadata the informatica uses to transform data from source to
target. And also it use to create informatica users and folders and copy, backup and restore the repository
21.How do you take care of security using a repository manager?
Using repository privileges, folder permission and locking.
Repository privileges(Session operator, Use designer, Browse repository, Create session and batches, Administer
repository, administer server, super user)
Folder permission(owner, groups, users)
Locking(Read, Write, Execute, Fetch, Save)

22.What is a folder?
Folder contains repository objects such as sources, targets, mappings, transformation which are helps logically organize our data
warehouse.
23.Can you create a folder within designer?
Not possible
24.What are shortcuts? Where it can be used? What are the advantages?

There are two types of shortcuts (local and global): local shortcuts are used in a local repository and global shortcuts in a global repository. The advantage is reusing an object without creating multiple copies. Say, for example, a source definition needs to be used in 10 mappings in 10 different folders; instead of creating 10 copies of the source you create 10 shortcuts.
25.How do you increase the performance of mappings?
Use single pass read(use one source qualifier instead of multiple SQ for same table)
Minimize data type conversion (Integer to Decimal again back to Integer)
Optimize transformation(when you use Lookup, aggregator, filter, rank and joiner)
Use caches for lookup
Aggregator use presorted port, increase cache size, minimize input/out port as much as possible
Use Filter wherever possible to avoid unnecessary data flow
26.Explain Informatica Architecture?
Informatica consists of client and server. Client tools: Repository Manager, Designer, Server Manager. The repository database contains metadata, which the Informatica server reads in order to extract data from the source, transform it and load it into the target.
27.How will you do sessions partitions?
It is not available in PowerMart 4.7.
Transformation
28.What are the constants used in update strategy?
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT
29.What is difference between connected and unconnected lookup transformation?
A connected lookup can return multiple values to another transformation,
whereas an unconnected lookup returns one value.
If the lookup condition does not match, a connected lookup returns the user-defined default value,
whereas an unconnected lookup returns NULL.
A connected lookup supports dynamic caches, whereas an unconnected lookup supports only static caches.
30.What you will do in session level for update strategy transformation?
In session property sheet set Treat rows as Data Driven

31.What are the port available for update strategy , sequence generator, Lookup, stored procedure transformation?
Transformations
Update strategy
Sequence Generator
Lookup
Stored Procedure

Port
Input, Output
Output only
Input, Output, Lookup, Return
Input, Output

32.Why did you used connected stored procedure why dont use unconnected stored procedure?
33.What is active and passive transformations?
An active transformation can change the number of records passed to the target (example: Filter),
whereas a passive transformation does not change the number of records (example: Expression).

34.What are the tracing level?


Normal - contains session initialization details and transformation details such as the number of records rejected and applied
Terse - only initialization details are logged
Verbose Initialization - Normal setting information plus detailed information about the transformations
Verbose Data - Verbose Initialization settings plus all data passing through the session
35.How will you make records in groups?
Using group by port in aggregator
36.Need to store value like 145 into target when you use aggregator, how will you do that?
Use Round() function
37.How will you move mappings from development to production database?
Copy all the mappings from the development repository and paste them into the production repository; while pasting it will prompt whether you want to replace or rename. If you say replace, Informatica replaces the existing objects in the production repository.
38.What is difference between aggregator and expression?
Aggregator is active transformation and expression is passive transformation
Aggregator transformation used to perform aggregate calculation on group of records really
Where as expression used perform calculation with single record
39.Can you use mapping without source qualifier?
Not possible. If the source is RDBMS/DBMS/flat file, use a Source Qualifier; use a Normalizer if the source is a COBOL feed.
40.When do you use a normalizer?

A Normalizer can be used with relational sources to normalize (pivot) denormalized data, and it is required for COBOL sources.


41.What are stored procedure transformations. Purpose of sp transformation. How did you go about using
your project?
Connected and unconnected stored procedures.
Unconnected stored procedure used for data base level activities such as pre and post load
Connected stored procedure used in informatica level for example passing one parameter as input and capturing
return value from the stored procedure.
Normal - row wise check
Pre-Load Source - (Capture source incremental data for incremental aggregation)
Post-Load Source - (Delete Temporary tables)
Pre-Load Target - (Check disk space available)
Post-Load Target (Drop and recreate index)
42.What is lookup and difference between types of lookup. What exactly happens when a lookup is cached.
How does a dynamic lookup cache work.
Lookup transformation used for check values in the source and target tables(primary key values).

There are 2 type connected and unconnected transformation


A connected lookup returns multiple values if the condition is true,
whereas an unconnected lookup returns a single value through the return port.
A connected lookup returns the user-defined default value if the condition does not match,
whereas an unconnected lookup returns NULL.
Lookup cache does:
Read the source/target table and stored in the lookup cache
43.What is a joiner transformation?
Used for heterogeneous sources(A relational source and a flat file)
Type of joins:
Assume 2 tables has values(Master - 1, 2, 3 and Detail - 1, 3, 4)
Normal (if the condition matches in both master and detail tables, the records are returned. Result set: 1, 3)
Master Outer (takes all the rows from the detail table and the matching rows from the master table. Result set: 1, 3, 4)
Detail Outer (takes all the values from the master source and the matching values from the detail table. Result set: 1, 2, 3)
Full Outer(It takes all values from both tables)
44.What is aggregator transformation how will you use in your project?
Used perform aggregate calculation on group of records and we can use conditional clause to filter data
45.Can you use one mapping to populate two tables in different schemas?
Yes we can use
46.Explain lookup cache, various caches?
Lookup transformation used for check values in the source and target tables(primary key values).
Various Caches:
Persistent cache (we can save the lookup cache files and reuse them the next time process the lookup transformation)
Re-cache from database (If the persistent cache not synchronized with lookup table you can configure the lookup
transformation to rebuild the lookup cache)
Static cache (When the lookup condition is true, Informatica server return a value from lookup cache and its does not update the
cache while it processes the lookup transformation)
Dynamic cache (Informatica server dynamically inserts new rows or update existing rows in the cache and the target. Suppose if
we want lookup a target table we can use dynamic cache)
Shared cache (we can share lookup transformation between multiple transformations in a mapping. 2 lookup in a mapping can
share single lookup cache)
47.Which path will the cache be created?
User specified directory. If we say c:\ all the cache files created in this directory.
48.Where do you specify all the parameters for lookup caches?
Lookup property sheet/tab.
49.How do you remove the cache files after the transformation?

After session complete, DTM remove cache memory and deletes caches files.
In case using persistent cache and Incremental aggregation then caches files will be saved.
50.What is the use of aggregator transformation?
To perform Aggregate calculation
Use conditional clause to filter data in the expression Sum(commission, Commission >2000)
Use non-aggregate function iif (max(quantity) > 0, Max(quantitiy), 0))
51.What are the contents of index and cache files?
Index caches files hold unique group values as determined by group by port in the transformation.
Data caches files hold row data until it performs necessary calculation.
52.How do you call a store procedure within a transformation?
In the Expression transformation, create a new output port and in its expression write :SP.stored_procedure_name(arguments)
53.Is there any performance issue in connected & unconnected lookup? If yes, How?
Yes
An unconnected lookup can be faster than a connected lookup because it is not wired into the pipeline; it is called from another transformation only when needed, so it minimizes lookup cache usage,
whereas a connected lookup is connected to other transformations and keeps values in the lookup cache for every row.
54.What is dynamic lookup?
When we use target lookup table, Informatica server dynamically insert new values or it updates if the values exist and passes to
target table.
55.How Informatica read data if source have one relational and flat file?
Use joiner transformation after source qualifier before other transformation.
56.How you will load unique record into target flat file from source flat files has duplicate data?
There are two ways to do this: either use a Rank transformation or an Oracle external table.
In the Rank transformation, group the records using the group-by port and set the number of ranks to 1. The Rank transformation then returns one value from each group, so the values will be unique.
57.Can you use flat file for repository?
No, We cant
58.Can you use flat file for lookup table?
No, We cant

59.Without Source Qualifier and joiner how will you join tables?
In session level we have option user defined join. Where we can write join condition.
60.Update strategy set DD_Update but in session level have insert. What will happens?
Insert take place. Because this option override the mapping level option
Sessions and batches
61.What are the commit intervals?
Source based commit (based on the number of active source records the Source Qualifier reads. If the commit interval is set to 10000 rows and the Source Qualifier reads 10000 but 3000 rows get rejected by transformation logic, the commit fires when the remaining 7000 rows reach the target; rows are not held in the writer buffer waiting for the full interval.)
Target based commit (based on the rows in the writer buffer and the commit interval. If the target based commit is set to 10000 but the writer buffer fills every 7500 rows, the commit fires at 15000, then 22500, and so on.)
62.When we use router transformation?
When we want to apply multiple conditions to filter data, we go for a Router. (Say, for example, out of 50 source records the filter condition matches 10 records and the remaining 40 get filtered out, but we still want to apply a few more filter conditions to those remaining 40 records.)
63.How did you schedule sessions in your project?
Run once (set 2 parameter date and time when session should start)
Run Every (Informatica server run session at regular interval as we configured, parameter Days, hour, minutes, end
on, end after, forever)
Customized repeat (Repeat every 2 days, daily frequency hr, min, every week, every month)
Run only on demand(Manually run) this not session scheduling.
64.How do you use the pre-sessions and post-sessions in sessions wizard, what for they used?
Post-session used for email option when the session success/failure send email. For that we should configure
Step1. Should have a informatica startup account and create outlook profile for that user
Step2. Configure Microsoft exchange server in mail box applet(control panel)
Step3. Configure informatica server miscellaneous tab have one option called MS exchange profile where we have specify the
outlook profile name.
Pre-session is used for event-based scheduling (say, for example, we do not know whether the source file is available in a particular directory; for that we write a DOS command to move the file from its directory to the destination and set the event-based scheduling option 'Indicator file to wait for' in the session property sheet).
65.What are different types of batches. What are the advantages and dis-advantages of a concurrent batch?
Sequential(Run the sessions one by one)
Concurrent (Run the sessions simultaneously)

Advantage of concurrent batch:


Its takes informatica server resource and reduce time it takes run session separately.
Use this feature when we have multiple sources that process large amount of data in one session. Split sessions and
put into one concurrent batches to complete quickly.
Disadvantage
Require more shared memory otherwise session may get failed
66.How do you handle a session if some of the records fail. How do you stop the session in case of errors. Can it be achieved in
mapping level or session level?
It can be achieved at session level only. In the session property sheet, Log Files tab, there is an error-handling option 'Stop on ___ errors'. Based on the error count we set, the Informatica server stops the session.
67.How you do improve the performance of session.
If we use Aggregator transformation use sorted port, Increase aggregate cache size, Use filter before aggregation so that it
minimize unnecessary aggregation.
Lookup transformation use lookup caches
Increase DTM shared memory allocation
Eliminating transformation errors using lower tracing level(Say for example a mapping has 50 transformation when
transformation error occur informatica server has to write in session log file it affect session performance)
68.Explain incremental aggregation. Will that increase the performance? How?
Incremental aggregation capture whatever changes made in source used for aggregate calculation in a session, rather than
processing the entire source and recalculating the same calculation each time session run. Therefore it improve session
performance.
Only use incremental aggregation following situation:
Mapping have aggregate calculation
Source table changes incrementally
Filtering source incremental data by time stamp
Before Aggregation have to do following steps:
Use filter transformation to remove pre-existing records
Reinitialize aggregate cache when source table completely changes for example incremental changes happing daily and
complete changes happenings monthly once. So when the source table completely change we have reinitialize the aggregate
cache and truncate target table use new source table. Choose Reinitialize cache in the aggregation behavior in transformation tab
69.Concurrent batches have 3 sessions and set each session run if previous complete but 2nd fail then what will happen the
batch?
Batch will fail

General Project
70. How many mapping, dimension tables, Fact tables and any complex mapping you did? And what is your database size, how
frequently loading to DWH?
I did 22 Mapping, 4 dimension table and one fact table. One complex mapping I did for slowly changing dimension table.
Database size is 9GB. Loading data every day
71. What are the different transformations used in your project?
Aggregator, Expression, Filter, Sequence generator, Update Strategy, Lookup, Stored Procedure, Joiner, Rank,
Source Qualifier.
72. How did you populate the dimensions tables?
73. What are the sources you worked on?
Oracle
74. How many mappings have you developed on your whole dwh project?
45 mappings
75. What is OS used your project?
Windows NT
76. Explain your project (Fact table, dimensions, and database size)
Fact table contains all business measures (numeric values) and foreign key values, Dimension table contains details about subject
area like customer, product
77.What is difference between Informatica power mart and power center?
Using power center we can create global repository
Power mart used to create local repository
Global repository configure multiple server to balance session load
Local repository configure only single server
78.Have you done any complex mapping?
Developed one mapping to handle slowly changing dimension table.
79.Explain details about DTM?
Once we session start, load manager start DTM and it allocate session shared memory and contains reader and writer. Reader will
read source data from source qualifier using SQL statement and move data to DTM then DTM transform data to transformation
to transformation and row by row basis finally move data to writer then writer write data into target using SQL statement.

I-Flex Interview (14th May 2003)


80.What are the key you used other than primary key and foreign key?
Used surrogate key to maintain uniqueness to overcome duplicate value in the primary key.

81.Data flow of your Data warehouse(Architecture)


DWH has a basic architecture: data flows from OLTP systems into the data warehouse, and from the DWH to OLAP for analysis and report building.
82.Difference between Power part and power center?
Using power center we can create global repository
Power mart used to create local repository
Global repository configure multiple server to balance session load
Local repository configure only single server
83.What are the batches and its details?
Sequential(Run the sessions one by one)
Concurrent (Run the sessions simultaneously)
Advantage of concurrent batch:
Its takes informatica server resource and reduce time it takes run session separately.
Use this feature when we have multiple sources that process large amount of data in one session. Split sessions and
put into one concurrent batches to complete quickly.
Disadvantage
Require more shared memory otherwise session may get failed
84.What is external table in oracle. How oracle read the flat file
Used to read flat files. Oracle internally uses SQL*Loader-style access parameters (similar to a control file) defined in the external table DDL.
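A minimal sketch of an external table definition (the directory object, file name and columns are illustrative) could be:
CREATE OR REPLACE DIRECTORY ext_dir AS '/data/feeds';
CREATE TABLE emp_ext (
  empno NUMBER,
  ename VARCHAR2(50)
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY ext_dir
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY ','
  )
  LOCATION ('emp.csv')
)
REJECT LIMIT UNLIMITED;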
85.What are the index you used? Bitmap join index?
Bitmap index used in data warehouse environment to increase query response time, since DWH has low cardinality,
low updates, very efficient for where clause.
Bitmap join index used to join dimension and fact table instead reading 2 different index.
86.What are the partitions in 8i/9i? Where you will use hash partition?
In oracle8i there are 3 partition (Range, Hash, Composite)
In Oracle9i List partition is additional one
Range (used for date values, for example in a DWH the date values are Quarter 1, Quarter 2, Quarter 3, Quarter 4)
Hash (used for unpredictable values; say, for example, we cannot predict which value should go to which partition, then we go for hash partitioning. If we set 5 partitions for a column, Oracle distributes the values into the 5 partitions accordingly).
List (used for literal values; say, for example, a country has 24 states, so we create 24 partitions, one for each state)
Composite (Combination of range and hash)
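A hedged sketch of a range-partitioned table (table, column and partition names are illustrative):
CREATE TABLE sales_fact (
  sale_date DATE,
  amount    NUMBER
)
PARTITION BY RANGE (sale_date) (
  PARTITION q1 VALUES LESS THAN (TO_DATE('01-APR-2011', 'DD-MON-YYYY')),
  PARTITION q2 VALUES LESS THAN (TO_DATE('01-JUL-2011', 'DD-MON-YYYY')),
  PARTITION q3 VALUES LESS THAN (TO_DATE('01-OCT-2011', 'DD-MON-YYYY')),
  PARTITION q4 VALUES LESS THAN (MAXVALUE)
);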
91.What is main difference mapplets and mapping?

A mapplet lets you reuse a set of transformations in several mappings, whereas a mapping cannot be reused that way.
If any change is made to a mapplet, it is automatically inherited by all instances of that mapplet.
92. What is difference between the source qualifier filter and filter transformation?
The source qualifier filter can be used only for relational sources, whereas the Filter transformation can be used with any kind of source.
The source qualifier filters data while reading from the source, whereas the Filter transformation filters data in the pipeline before it is loaded into the target.
93. What is the maximum no. of return value when we use unconnected
transformation?
Only one.
94. What are the environments in which informatica server can run on?
Informatica client runs on Windows 95 / 98 / NT, Unix Solaris, Unix AIX(IBM)
Informatica Server runs on Windows NT / Unix
Minimum Hardware requirements
Informatica Client Hard disk 40MB, RAM 64MB
Informatica Server Hard Disk 60MB, RAM 64MB
95. Can unconnected lookup do everything a connected lookup transformation can do?
No, We cant call connected lookup in other transformation. Rest of things its possible
96. In 5.x can we copy part of mapping and paste it in other mapping?
I think its possible
97. What option do you select for a sessions in batch, so that the sessions run one
after the other?
We have select an option called Run if previous completed
98. How do you really know that paging to disk is happening while you are using a lookup transformation?
Assume you have access to server?
We have collect performance data first then see the counters parameter lookup_readtodisk if its greater than 0 then
its read from disk
Step1. Choose the option Collect Performance data in the general tab session property
sheet.
Step2. Monitor server then click server-request session performance details
Step3. Locate the performance details file named called session_name.perf file in the session
log file directory
Step4. Find out counter parameter lookup_readtodisk if its greater than 0 then informatica
read lookup table values from the disk. Find out how many rows in the cache see
Lookup_rowsincache
99. List three option available in informatica to tune aggregator transformation?

Use Sorted Input to sort data before aggregation


Use Filter transformation before aggregator
Increase Aggregator cache size

100.Assume there is text file as source having a binary field to, to source qualifier What native data type
informatica will convert this binary field to in source qualifier?
Binary data type for relational source for flat file ?
101.Variable v1 has values set as 5 in designer(default), 10 in parameter file, 15 in
repository. While running session which value informatica will read?
Informatica read value 15 from repository
102. Joiner transformation is joining two tables s1 and s2. s1 has 10,000 rows and s2 has 1000 rows . Which
table you will set master for better performance of joiner
transformation? Why?
Set table S2 as Master table because informatica server has to keep master table in the cache so if it is 1000 in cache will get
performance instead of having 10000 rows in cache
103. Source table has 5 rows. Rank in rank transformation is set to 10. How many rows the rank
transformation will output?
5 Rank
104. How to capture performance statistics of individual transformation in the mapping and explain some
important statistics that can be captured?
Use tracing level Verbose data
105. Give a way in which you can implement a real time scenario where data in a table is changing and you
need to look up data from it. How will you configure the lookup transformation for this purpose?
In slowly changing dimension table use type 2 and model 1
106. What is DTM process? How many threads it creates to process data, explain each
thread in brief?
DTM receive data from reader and move data to transformation to transformation on row by row basis. Its create 2 thread one is
reader and another one is writer

107. Suppose session is configured with commit interval of 10,000 rows and source has 50,000 rows explain
the commit points for source based commit & target based commit. Assume appropriate value wherever
required?
Target Based commit (First time Buffer size full 7500 next time 15000)
Commit Every 15000, 22500, 30000, 40000, 50000
Source Based commit(Does not affect rows held in buffer)

Commit Every 10000, 20000, 30000, 40000, 50000


108.What does first column of bad file (rejected rows) indicates?
First Column - Row indicator (0, 1, 2, 3)
Second Column Column Indicator (D, O, N, T)
109. What is the formula for calculation rank data caches? And also Aggregator, data, index caches?
Index cache size = Total no. of rows * size of the column in the lookup condition (50 * 4)
Aggregator/Rank transformation Data Cache size = (Total no. of rows * size of the column in the lookup condition) +
(Total no. of rows * size of the connected output ports)
110. Can unconnected lookup return more than 1 value? No
INFORMATICA TRANSFORMATIONS

Aggregator
Expression
External Procedure
Advanced External Procedure
Filter
Joiner
Lookup
Normalizer
Rank
Router
Sequence Generator
Stored Procedure
Source Qualifier
Update Strategy
XML source qualifier

Expression Transformation
-

You can use ET to calculate values in a single row before you write to the target
You can use ET, to perform any non-aggregate calculation
To perform calculations involving multiple rows, such as sums or averages, use the Aggregator. Unlike the ET, the Aggregator Transformation allows you to group and sort data.

Calculation
-

To use the Expression Transformation to calculate values for a single row, you must include the following ports.
Input port for each value used in the calculation
Output port for the expression

NOTE
You can enter multiple expressions in a single ET. As long as you enter only one expression for each port, you can
create any number of output ports in the Expression Transformation. In this way, you can use one expression
transformation rather than creating separate transformations for each calculation that requires the same set of data.
Sequence Generator Transformation
-

Create keys
Replace missing values
This contains two output ports that you can connect to one or more transformations. The server generates a value
each time a row enters a connected transformation, even if that value is not used.
There are two parameters NEXTVAL, CURRVAL
The SGT can be reusable
You can not edit any default ports (NEXTVAL, CURRVAL)
SGT Properties
Start value
Increment By
End value
Current value
Cycle
(If selected, server cycles through sequence range. Otherwise,
Stops with configured end value)
Reset
No of cached values
NOTE
Reset is disabled for Reusable SGT
Unlike other transformations, you cannot override SGT properties at session level. This protects the integrity of
sequence values generated.
Aggregator Transformation

Difference between Aggregator and


Expression Transformation
We can use the Aggregator to perform calculations on groups, whereas the Expression transformation permits you to perform calculations on a row-by-row basis only.
The server performs aggregate calculations as it reads and stores necessary data group and row data in an
aggregator cache.

When Incremental aggregation occurs, the server passes new source data through the mapping and uses historical
cache data to perform new calculation incrementally.

Components
Aggregate Expression
Group by port
Aggregate cache
When a session is being run using aggregator transformation, the server creates Index and data caches in memory to
process the transformation. If the server requires more space, it stores overflow values in cache files.
NOTE
The performance of aggregator transformation can be improved by using Sorted Input option. When this is selected,
the server assumes all data is sorted by group.

Incremental Aggregation
(1)

Using this, you apply captured changes in the source to aggregate calculation in a session. If the source changes
only incrementally and you can capture changes, you can configure the session to process only those changes
This allows the server to update the target incrementally, rather than forcing it to process the entire source and
recalculate the same calculations each time you run the session.
Steps:
The first time you run a session with incremental aggregation enabled, the server process the entire source.
At the end of the session, the server stores aggregate data from that session ran in two files, the index file and data file. The
server creates the file in local directory.
The second time you run the session, use only changes in the source as source data for the session. The server then performs the
following actions:
For each input record, the session checks the historical information in the index file for a corresponding group, then:
If it finds a corresponding group
The server performs the aggregate operation incrementally, using the aggregate data for that group, and saves the
incremental changes.
Else
Server create a new group and saves the record data

Posted 14th December 2011 by Prafull Dangore


0

Add a comment

Dec
13

Router T/R is active or passive, what is


reason behind that?

Scenario:
Router T/R is active but some people are saying some times passive, what is reason
behind that?

Solution:
The Router transformation is classified as active because the number of rows reaching the targets can differ from the number of input rows: a row that satisfies no group condition is routed only to the default group, and if the default group is not connected those rows are dropped.
If every input row is routed to exactly one connected group (including the default group), the row count is preserved, which is why some people describe the Router as behaving passively in that case.

Posted 13th December 2011 by Prafull Dangore


0

Add a comment

Dec
13

If the source has duplicate records as id and


name columns, values: 1 a, 1 b, 1 c, 2 a, 2 b,
the target should be loaded as 1 a+b+c or 1 a||
b||c, what transformations should be used for
this?
Scenario:
If the source has duplicate records as id and name columns, values: 1 a, 1 b, 1 c, 2
a, 2 b, the target should be loaded as 1 a+b+c or 1 a||b||c, what transformations
should be used for this?
Solution:
Follow the below steps (similar expression logic as in the related post):
1. Use a Sorter transformation and sort the data by emp_id.
2. Use an Expression transformation and create the ports below (variable ports are evaluated top to bottom, so V_previous_emp_id still holds the previous row's id when the concatenation is evaluated):
V_emp_full_name = IIF(emp_id = V_previous_emp_id, V_emp_full_name || ' ' || emp_name, emp_name)
V_counter = V_counter + 1
V_previous_emp_id = emp_id
O_emp_full_name = V_emp_full_name
O_counter = V_counter
3. The output will look like:
emp_id  emp_name             Counter
101     soha                 1
101     soha ali             2
101     soha ali kahn        3
102     Siva                 4
102     Siva shanker         5
102     Siva shanker Reddy   6
4. Send emp_id and Counter to an Aggregator and take the max counter for each id; the output will be:
Emp_id  Counter
101     3
102     6
5. Join the output of step 3 and step 4 (on emp_id and Counter) to get the desired output:
Emp_id  Emp_name
101     soha ali kahn
102     Siva shanker Reddy

Posted 13th December 2011 by Prafull Dangore


0

Add a comment

Dec
13

I have a flat file, in which I have two fields


emp_id, emp_name. The data is like thisemp_id emp_name 101 soha 101 ali 101 kahn
102 Siva 102 shanker 102 Reddy How to
merge the names so that my output is like
this Emp_id Emp_name 101 Soha ali Kahn
102 Siva shanker Reddy

Scenario:
I have a flat file, in which I have two fields emp_id, emp_name. The data is like thisemp_id emp_name
101
soha
101
ali
101
kahn
102
Siva
102
shanker
102
Reddy
How to merge the names so that my output is like this
101
102

Emp_id Emp_name
Soha ali Kahn
Siva shanker Reddy

Solution:
Follow the below steps:
1. Use a Sorter transformation and sort the data by emp_id.
2. Use an Expression transformation and create the ports below (variable ports are evaluated top to bottom, so V_previous_emp_id still holds the previous row's id when the concatenation is evaluated):
V_emp_full_name = IIF(emp_id = V_previous_emp_id, V_emp_full_name || ' ' || emp_name, emp_name)
V_counter = V_counter + 1
V_previous_emp_id = emp_id
O_emp_full_name = V_emp_full_name
O_counter = V_counter
3. The output will look like:
emp_id  emp_name             Counter
101     soha                 1
101     soha ali             2
101     soha ali kahn        3
102     Siva                 4
102     Siva shanker         5
102     Siva shanker Reddy   6
4. Send emp_id and Counter to an Aggregator and take the max counter for each id; the output will be:
Emp_id  Counter
101     3
102     6
5. Join the output of step 3 and step 4 (on emp_id and Counter) to get the desired output:
Emp_id  Emp_name
101     Soha ali Kahn
102     Siva shanker Reddy

Posted 13th December 2011 by Prafull Dangore


0

Add a comment

Dec
12

Difference between data mart and data


warehouse
Scenario:
Difference between data mart and data warehouse
Solution:
Data Mart
A data mart is usually sponsored at the department level and developed with a specific issue or subject in mind; it is a data warehouse with a focused objective.
A data mart is used at a business division/department level.
A data mart is a subset of data from a data warehouse; data marts are built for specific user groups.
By providing decision makers with only a subset of data from the data warehouse, privacy, performance and clarity objectives can be attained.

Data Warehouse
A data warehouse is a subject-oriented, integrated, time-variant, nonvolatile collection of data in support of decision making.
A data warehouse is used at an enterprise level.
A data warehouse is an integrated consolidation of data from a variety of sources that is specially designed to support strategic and tactical decision making.
The main objective of a data warehouse is to provide an integrated environment and a coherent picture of the business at a point in time.

Posted 12th December 2011 by Prafull Dangore


0

Add a comment


Dec
12

What is the difference between snow flake


and star schema
Scenario:
What is the difference between snow flake and star schema

Solution:
Star Schema
The star schema is the simplest data warehouse schema.
In a star schema each of the dimensions is represented in a single table; there should not be any hierarchies between dimensions.
It contains a fact table surrounded by dimension tables. If the dimensions are de-normalized, we say it is a star schema design.
In a star schema only one join establishes the relationship between the fact table and any one of the dimension tables.
A star schema optimizes performance by keeping queries simple and providing fast response time; all the information about each level is stored in one row.
It is called a star schema because the diagram resembles a star.

Snow Flake Schema
A snowflake schema is a more complex data warehouse model than a star schema.
In a snowflake schema at least one hierarchy exists between dimension tables.
It contains a fact table surrounded by dimension tables. If a dimension is normalized, we say it is a snowflaked design.
In a snowflake schema, since there are relationships between the dimension tables, many joins are needed to fetch the data.
Snowflake schemas normalize dimensions to eliminate redundancy; the result is more complex queries and reduced query performance.
It is called a snowflake schema because the diagram resembles a snowflake.

Posted 12th December 2011 by Prafull Dangore

0

Add a comment


Dec
12

Difference between OLTP and


DWH/DS/OLAP
Scenario:
Difference between OLTP and DWH/DS/OLAP
Solution:
OLTP
OLTP maintains only current information.
It is a normalized structure.
It is a volatile system.
It cannot be used for reporting purposes.
Since it is a normalized structure, it requires multiple joins to fetch the data.
It is not time variant.
It is a pure relational model.

DWH/DSS/OLAP
OLAP contains full history.
It is a de-normalized structure.
It is a non-volatile system.
It is a pure reporting system.
It does not require many joins to fetch the data.
It is time variant.
It is a dimensional model.

Posted 12th December 2011 by Prafull Dangore


0

Add a comment

Dec
12

Differences between rowid and rownum


Scenario:
Differences between rowid and rownum
Solution:

Rowid
Rowid is an Oracle internal ID that is allocated every time a new record is inserted in a table. This ID is unique and cannot be changed by the user.
Rowid is permanent.
Rowid is a globally unique identifier for a row in a database. It is created at the time the row is inserted into the table, and destroyed when it is removed from the table.

Rownum
Rownum is a row number returned by a select statement.
Rownum is temporary.
The rownum pseudocolumn returns a number indicating the order in which Oracle selects the row from a table or set of joined rows.
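As a small illustration (assuming an emp table with an ename column), both can be selected directly:
SELECT ROWID, ROWNUM, ename
FROM   emp
WHERE  ROWNUM <= 5;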

Posted 12th December 2011 by Prafull Dangore


0

Add a comment

Dec
12

Differences between stored procedure and


functions
Scenario:
Differences between stored procedure and functions
Solution:
Stored Procedure
A stored procedure may or may not return values.
Stored procedures can be used to implement business logic.
A stored procedure is a pre-compiled statement.
A stored procedure can accept any number of arguments, including IN, OUT and IN OUT parameters.
Stored procedures are mainly used to process tasks.
A stored procedure cannot be invoked from SQL statements, e.g. SELECT.
It can affect the state of the database, e.g. by using COMMIT.
It is stored as pseudo-code in the database, i.e. in compiled form.

Functions
A function should return at least one value, and it can return more than one through OUT arguments.
Functions can be used for calculations.
A function is not a pre-compiled statement.
A function typically accepts IN arguments and returns its result through RETURN.
Functions are mainly used to compute values.
A function can be invoked from SQL statements, e.g. SELECT.
It cannot affect the state of the database.
It is parsed and compiled at runtime.
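A hedged sketch showing the difference (the object, table and column names are illustrative): the function can be called from a SELECT, while the procedure is executed on its own and can commit.
CREATE OR REPLACE FUNCTION get_bonus (p_salary IN NUMBER)
RETURN NUMBER
IS
BEGIN
  RETURN p_salary * 0.10;   -- computes and returns a value
END;
/
CREATE OR REPLACE PROCEDURE raise_salary (p_empno IN NUMBER, p_pct IN NUMBER)
IS
BEGIN
  UPDATE emp SET sal = sal * (1 + p_pct / 100) WHERE empno = p_empno;
  COMMIT;                   -- a procedure can change the database state
END;
/
SELECT ename, get_bonus(sal) FROM emp;   -- function used inside a query
EXEC raise_salary(7369, 5);              -- procedure executed on its own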

Posted 12th December 2011 by Prafull Dangore


0

Add a comment

Dec
12

Differences between where clause and having


clause
Scenario:
Differences between where clause and having clause
Solution:
Both the WHERE clause and the HAVING clause can be used to filter data.
In the WHERE clause a GROUP BY is not mandatory, but the HAVING clause must be used with GROUP BY.
The WHERE clause applies to individual rows, whereas the HAVING clause tests a condition on the group rather than on individual rows.
The WHERE clause is used to restrict rows, whereas the HAVING clause is used to restrict groups.
WHERE restricts a normal query; HAVING restricts the result of GROUP BY functions.
In the WHERE clause every record is filtered individually; in the HAVING clause the filtering is done on aggregate records (group by functions).
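A small illustration (assuming an emp table with deptno and sal columns):
SELECT   deptno, AVG(sal) AS avg_sal
FROM     emp
WHERE    sal > 1000          -- filters individual rows before grouping
GROUP BY deptno
HAVING   AVG(sal) > 2500;    -- filters the groups after aggregation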

Posted 12th December 2011 by Prafull Dangore


0

Add a comment

Dec
12

What is the difference between view and


materialized view?
Scenario:
What is the difference between view and materialized view?
Solution:

View
A view has a logical existence; it does not contain data.
Only the query definition is stored, not the data.
We cannot perform DML operations on a view.
When we do SELECT * FROM a view, it fetches the data from the base table.
A view cannot be scheduled to refresh.

Materialized view
A materialized view has a physical existence.
It is a database object.
We can perform DML operations on a materialized view.
When we do SELECT * FROM a materialized view, it fetches the data from the materialized view itself.
A materialized view can be scheduled to refresh.
We can keep aggregated data in a materialized view, and a materialized view can be created based on multiple tables.

Materialized View
A materialized view is very useful for reporting. Without it, a report fetches data directly from the dimension and fact tables, which is slow because it involves multiple joins. If we put the same report logic in a materialized view, the report can fetch data directly from the materialized view and avoid the multiple joins at report run time.
The materialized view must be refreshed regularly; after that the report can simply run a select statement against it.

Posted 12th December 2011 by Prafull Dangore


0

Add a comment

Dec
12

SQL command to kill a session/sid


Scenario:
SQL command to kill a session/sid
Solution:
ALTER SYSTEM KILL SESSION 'sid,serial#';
Query to find SID :
select module, a.sid,machine, b.SQL_TEXT,piece
from v$session a,v$sqltext b
where status='ACTIVE'
and a.SQL_ADDRESS=b.ADDRESS
--and a.USERNAME='NAME'
and sid=95
order by sid,piece;
Query to find serial#
select * from v$session where type = 'USER' and status = 'ACTIVE';--t0 get serial no

Posted 12th December 2011 by Prafull Dangore


0

Add a comment

Dec
12

SQL command to find execution timing of a


query Like total execution time and so far
time spent

Scenario:
SQL command to find execution timing of a query
Like total execution time and so far time spent
Solution:
select target, sofar, totalwork, round((sofar/totalwork)*100,2) pct_done
from v$session_longops
where SID=95 and serial#=2020;
Query to find SID :
select module, a.sid,machine, b.SQL_TEXT,piece
from v$session a,v$sqltext b
where status='ACTIVE'
and a.SQL_ADDRESS=b.ADDRESS
--and a.USERNAME='NAME'
and sid=95
order by sid,piece;
Query to find serial#
select * from v$session where type = 'USER' and status = 'ACTIVE';--t0 get serial no

Posted 12th December 2011 by Prafull Dangore


0

Add a comment

Dec
12

Query to find the SQL text of running


procedure
Scenario:
How to find which query part/query of Procedure is running?

Solution:
select module, a.sid,machine, b.SQL_TEXT,piece
from v$session a,v$sqltext b
where status='ACTIVE'
and a.SQL_ADDRESS=b.ADDRESS
--and a.USERNAME='NAME'
and sid=95
order by sid,piece;

Posted 12th December 2011 by Prafull Dangore


0

Add a comment

Dec
7

Design a mapping to load the first record


from a flat file into one table A, the last
record from a flat file into table B and the
remaining records into table C?
Scenario:
Design a mapping to load the first record from a flat file into one table A, the last
record from a flat file into table B and the remaining records into table C?

Solution:
Please follow the below steps:
1. From the source qualifier, pass data to an Expression transformation (exp1) and add a variable port V_row_number.
2. We can assign a value to V_row_number in two ways:
a. by using a Sequence Generator, or
b. by using the below logic in the Expression transformation:
V_row_number = V_row_number + 1
O_row_number = V_row_number

Input, O_row_number
a, 1
b, 2
c, 3
d, 4
e, 5

3. Table A - in one pipeline, send data from exp1 to a Filter with the condition O_row_number = 1 and load it to table A.
4. Table B - there are two ways to identify the last record:
a. pass all rows from exp1 to an Aggregator and do not select any column as a group-by port; it will send only the last record to table B, or
b. use MAX in the Aggregator.
5. Table C - send the output of step 4 to another Expression transformation (exp2), where you will get O_row_number = 5; add a dummy port with value 1 to both exp2 and exp1 and join them on it, so that you get output like below:

Input, O_row_number, O_last_row_number
a, 1, 5
b, 2, 5
c, 3, 5
d, 4, 5
e, 5, 5

Now pass the data to a Filter with the condition O_row_number <> 1 AND O_row_number <> O_last_row_number, and load it to table C.

Posted 7th December 2011 by Prafull Dangore


0

Add a comment

Dec
7

Separating duplicate and non-duplicate rows


to separate tables
Scenario:
Separating duplicate and non-duplicate rows to separate tables
Solution:

Please follow the below steps


1. After the source qualifier, send the data to an Aggregator transformation, group by the key columns and use the COUNT aggregate function.
2. After the Aggregator, pass the data to a Router with two groups: duplicate where count > 1 and non-duplicate where count = 1.
3. Then send the data to the respective target tables (a SQL equivalent is sketched below).
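For comparison, a hedged SQL equivalent of the same idea (assuming a source table src keyed on id) would be:
-- duplicate keys
SELECT id, COUNT(*) FROM src GROUP BY id HAVING COUNT(*) > 1;
-- non-duplicate keys
SELECT id, COUNT(*) FROM src GROUP BY id HAVING COUNT(*) = 1;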

Posted 7th December 2011 by Prafull Dangore


0

Add a comment

Dec
7

Conversion of columns to rows without using


Normaliser transformation
Scenario:
We have a source table containing 3 columns : Col1, Col2 and Col3. There is only 1
row in the table as follows:
Col1  Col2  Col3
a     b     c
There is target table contain only 1 column Col. Design a mapping so that the target
table contains 3 rows as follows:
Col
a
b
c
Without using Normaliser transformation.

Solution:
Please follow the below steps
1. After source qualifier, send data to three different Exp transformation like
pass Col1 to Exp1,Col2 to Exp2 and Col3 to Exp3

2. Then pass data from Exp1, Exp2 & Exp3 to three instances of the same target table (a SQL equivalent is sketched below).
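For reference, the same pivot can be expressed in plain SQL (assuming the source table is named src), as an alternative rather than the mapping-based method above:
SELECT col1 AS col FROM src
UNION ALL
SELECT col2 FROM src
UNION ALL
SELECT col3 FROM src;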
Posted 7th December 2011 by Prafull Dangore
0

Add a comment

Dec
7

What are the limitations of Informatica


scheduler?
Scenario:
What are the limitations of Informatica scheduler?
Solution:

1. It will unschedule the workflow if there is a failure in the previous run (the IS removes the workflow from the schedule).
2. If you want to run the workflow based on the success or failure of another application job, such as a mainframe job, you need the help of a third-party tool (like Control-M, Maestro, Autosys, Tivoli).

Note: You can depend on operating system native schedulers like [Windows
Scheduler - Windows or crontab - Unix] else any third party scheduling tool to run
which gives more flexibility in setting time and more control over running the job.

Posted 7th December 2011 by Prafull Dangore


0

Add a comment

Dec
6

Combining Sessions which Read Data from


Different Schemas
Scenario:
I have to combine 32 sessions which read data from different (32) schemas and load target to
same table.
Can you please tell me how to read parameter file which contain schema connections?

Solution:
If you want to parameterize connection, setup the session to use connection variables instead of
specific connections e.g. $DBConnection<Name>. Then specify the value for the connection
variable in the parameter file.
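A hedged sketch of such a parameter file (the folder, workflow, session and connection names are illustrative):
[MyFolder.WF:wf_load_target.ST:s_m_load_schema01]
$DBConnection_Src=Oracle_Schema01
[MyFolder.WF:wf_load_target.ST:s_m_load_schema02]
$DBConnection_Src=Oracle_Schema02
Each of the 32 sessions can point $DBConnection_Src to a different relational connection, so all of them can share one mapping and load the same target table.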
Posted 6th December 2011 by Prafull Dangore
0

Add a comment

Dec
6

Command to Run Workflow with Conditions


Scenario:
I have to see whether a file has been dropped in a particular location. If the file
dropped , then the workflow should run
Solution:
To run from window server:
IF exist E:\softs\Informatica\server\infa_shared\SrcFiles\FILE_NAME*.csv pmcmd
startworkflow -sv service -d Dom -u userid -p password wf_workflow_name
Posted 6th December 2011 by Prafull Dangore
0

Add a comment

Dec
6

Informatica Logic Building - select all the distinct regions and apply it to 'ALL'
Scenario:
I have a task for which I am not able to find the logic. It is exception handling.
I have a column 'region' in table 'user'. One user can belong to more than one region. In total I have 10
regions. The exception is that one user has 'ALL' in the region column. I have to select all the distinct
regions and apply them to 'ALL'; the output should have 10 records of that user, one corresponding to
each region.
How can I equate 'ALL' to the 10 regions and get 10 records into the target?

Solution:
Please follow the below steps
1. Use two flows in your mapping; in the first flow pass all data with Region != 'ALL'.
2. In the second flow, pass the data with Region = 'ALL' to an Expression where you create 10 output ports
with the values of the 10 regions.
3. Then pass all columns to a Normalizer; in the Normalizer create the output port and set the occurrence
of the Region port to 10.
4. Pass the data to the target table.
Posted 6th December 2011 by Prafull Dangore
0

Add a comment

Dec
2

Concatenate the Data of Just the First Column of a Table in One Single Row
Scenario:
Concatenate the Data of Just the First Column of a Table in One Single Row
Solution:
Step 1: pass Emp_Number to an Expression transformation.
Step 2: in the Expression transformation use variable ports
var1 : var2
var2 : Emp_Number
var3 : IIF(ISNULL(var1), Emp_Number, var3 || ' ' || Emp_Number)
Step 3: In the output port
out_Emp_Number : var3
Step 4: Pass this port through an Aggregator transformation. Don't do any
group by or aggregation, so only the last (fully concatenated) row is returned. A SQL equivalent
is sketched below.
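
A minimal SQL sketch of the same concatenation, assuming an EMP table with an EMP_NUMBER column
(names are illustrative; LISTAGG is available from Oracle 11gR2 onwards):

SELECT LISTAGG(EMP_NUMBER, ' ') WITHIN GROUP (ORDER BY EMP_NUMBER) AS ALL_EMP_NUMBERS
FROM   EMP;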

Posted 2nd December 2011 by Prafull Dangore


1

View comments

Nov
29

How to pass parameters to Procedures and Functions in PL/SQL?
Parameters in Procedure and Functions:
In PL/SQL, we can pass parameters to procedures and functions in three ways.
1) IN type parameter: These types of parameters are used to send values to stored procedures.
2) OUT type parameter: These types of parameters are used to get values from stored
procedures. This is similar to a return type in functions.
3) IN OUT parameter: These types of parameters are used to send values and get values from
stored procedures.
NOTE: If a parameter type is not explicitly specified, then by default it is an IN type
parameter.
1) IN parameter: This is similar to passing parameters in programming languages. We can pass
values to the stored procedure through these parameters or variables. This type of parameter is a
read only parameter. We can assign the value of IN type parameter to a variable or use it in a
query, but we cannot change its value inside the procedure.
The General syntax to pass a IN parameter is
CREATE [OR REPLACE] PROCEDURE procedure_name (
param_name1 IN datatype, param_name2 IN datatype ... )

param_name1, param_name2... are unique parameter names.

datatype - defines the datatype of the variable.

IN - is optional; by default it is an IN type parameter.

2) OUT Parameter: The OUT parameters are used to send the OUTPUT from a procedure or a
function. This is a write-only parameter, i.e., we cannot pass values to OUT parameters while
executing the stored procedure, but we can assign values to an OUT parameter inside the stored
procedure and the calling program can receive this output value.
The General syntax to create an OUT parameter is
CREATE [OR REPLACE] PROCEDURE proc2 (param_name OUT datatype)

The parameter should be explicitly declared as an OUT parameter.


3) IN OUT Parameter:
The IN OUT parameter allows us to pass values into a procedure and get output values from the
procedure. This parameter is used if the value of the IN parameter can be changed in the calling
program.
By using an IN OUT parameter we can pass a value into a parameter and return a value to the calling
program using the same parameter. But this is possible only if the value passed to the procedure
and the output value have the same datatype. This parameter is used if the value of the parameter will
be changed in the procedure.
The General syntax to create an IN OUT parameter is
CREATE [OR REPLACE] PROCEDURE proc3 (param_name IN OUT datatype)

The below examples show how to create stored procedures using the above three types of
parameters.
Example1:
Using IN and OUT parameter:
Let's create a procedure which gets the name of the employee when the employee id is passed.

CREATE OR REPLACE PROCEDURE emp_name (id IN NUMBER, emp_name OUT VARCHAR2)
IS
BEGIN
   SELECT first_name INTO emp_name
   FROM emp_tbl WHERE empID = id;
END;
/

We can call the procedure emp_name in this way from a PL/SQL Block.

1> DECLARE
2>    empName varchar2(20);
3>    CURSOR id_cur IS SELECT id FROM emp_ids;
4> BEGIN
5>    FOR emp_rec IN id_cur
6>    LOOP
7>       emp_name(emp_rec.id, empName);
8>       dbms_output.put_line('The employee ' || empName || ' has id ' || emp_rec.id);
9>    END LOOP;
10> END;
11> /

In the above PL/SQL Block:

In line no 3, we are creating a cursor id_cur which contains the employee ids.
In line no 7, we are calling the procedure emp_name, passing the id as the IN parameter
and empName as the OUT parameter.
In line no 8, we are displaying the id and the employee name which we got from the procedure
emp_name.
Example 2:
Using IN OUT parameter in procedures:
CREATE OR REPLACE PROCEDURE emp_salary_increase
   (emp_id IN emp_tbl.empID%type, salary_inc IN OUT emp_tbl.salary%type)
IS
   tmp_sal number;
BEGIN
   SELECT salary
   INTO   tmp_sal
   FROM   emp_tbl
   WHERE  empID = emp_id;
   IF tmp_sal BETWEEN 10000 AND 20000 THEN
      salary_inc := tmp_sal * 1.2;
   ELSIF tmp_sal BETWEEN 20000 AND 30000 THEN
      salary_inc := tmp_sal * 1.3;
   ELSIF tmp_sal > 30000 THEN
      salary_inc := tmp_sal * 1.4;
   END IF;
END;
/

The below PL/SQL block shows how to execute the above 'emp_salary_increase' procedure.

DECLARE
   CURSOR updated_sal IS
      SELECT empID, salary
      FROM   emp_tbl;
   pre_sal number;
BEGIN
   FOR emp_rec IN updated_sal LOOP
      pre_sal := emp_rec.salary;
      emp_salary_increase(emp_rec.empID, emp_rec.salary);
      dbms_output.put_line('The salary of ' || emp_rec.empID ||
                           ' increased from ' || pre_sal || ' to ' || emp_rec.salary);
   END LOOP;
END;
/

Posted 29th November 2011 by Prafull Dangore


0

Add a comment

Nov
29

Explicit Cursors
Explicit Cursors
An explicit cursor is defined in the declaration section of the PL/SQL Block. It is created on a
SELECT Statement which returns more than one row. We can provide a suitable name for the
cursor.
The General Syntax for creating a cursor is as given below:
CURSOR cursor_name IS select_statement;

cursor_name A suitable name for the cursor.

select_statement A select query which returns multiple rows.

How to use Explicit Cursor?


There are four steps in using an Explicit Cursor.
DECLARE the cursor in the declaration section.

OPEN the cursor in the Execution Section.

FETCH the data from cursor into PL/SQL variables or records in the Execution Section.

CLOSE the cursor in the Execution Section before you end the PL/SQL Block.
1) Declaring a Cursor in the Declaration Section:
DECLARE
CURSOR emp_cur IS
SELECT *
FROM emp_tbl
WHERE salary > 5000;

In the above example we are creating a cursor emp_cur on a query which returns the records of all
the employees with salary greater than 5000. Here emp_tbl is the table which contains the records
of all the employees.
2) Accessing the records in the cursor:
Once the cursor is created in the declaration section we can access the cursor in the execution
section of the PL/SQL program.

How to access an Explicit Cursor?


These are the three steps in accessing the cursor.
1) Open the cursor.
2) Fetch the records in the cursor one at a time.
3) Close the cursor. General Syntax to open a cursor is:
OPEN cursor_name;

General Syntax to fetch records from a cursor is:


FETCH cursor_name INTO record_name;

OR

FETCH cursor_name INTO variable_list;

General Syntax to close a cursor is:


CLOSE cursor_name;

When a cursor is opened, the first row becomes the current row. When the data is fetched it is
copied to the record or variables and the logical pointer moves to the next row and it becomes the
current row. On every fetch statement, the pointer moves to the next row. If you want to fetch
after the last row, the program will throw an error. When there is more than one row in a cursor
we can use loops along with explicit cursor attributes to fetch all the records.
Points to remember while fetching a row:
We can fetch the rows in a cursor to a PL/SQL Record or a list of variables created in the
PL/SQL Block.
If you are fetching a cursor to a PL/SQL Record, the record should have the same structure as
the cursor.
If you are fetching a cursor to a list of variables, the variables should be listed in the same order
in the fetch statement as the columns are present in the cursor.
General Form of using an explicit cursor is:
DECLARE

variables;
records;
create a cursor;
BEGIN
OPEN cursor;
FETCH cursor;
process the records;
CLOSE cursor;
END;

Lets Look at the example below


Example 1:

1> DECLARE
2>    emp_rec emp_tbl%rowtype;
3>    CURSOR emp_cur IS
4>       SELECT *
5>       FROM emp_tbl
6>       WHERE salary > 10;
7> BEGIN
8>    OPEN emp_cur;
9>    FETCH emp_cur INTO emp_rec;
10>   dbms_output.put_line (emp_rec.first_name || ' ' || emp_rec.last_name);
11>   CLOSE emp_cur;
12> END;

In the above example, first we are creating a record emp_rec of the same structure as the table
emp_tbl in line no 2. We can also create a record with a cursor by replacing the table name with
the cursor name. Second, we are declaring a cursor emp_cur from a select query in line no 3 to 6.
Third, we are opening the cursor in the execution section in line no 8. Fourth, we are fetching
the cursor into the record in line no 9. Fifth, we are displaying the first_name and last_name of the
employee in the record emp_rec in line no 10. Sixth, we are closing the cursor in line no 11.

What are Explicit Cursor Attributes?


Oracle provides some attributes known as Explicit Cursor Attributes to control the data
processing while using cursors. We use these attributes to avoid errors while accessing cursors
through OPEN, FETCH and CLOSE Statements.

When does an error occur while accessing an explicit cursor?


a) When we try to open a cursor which is not closed in the previous operation.
b) When we try to fetch a cursor after the last operation.
These are the attributes available to check the status of an explicit cursor.
Attribute       Return values                                                     Example
%FOUND          TRUE, if the fetch statement returns at least one row.            cursor_name%FOUND
                FALSE, if the fetch statement doesn't return a row.
%NOTFOUND       TRUE, if the fetch statement doesn't return a row.                cursor_name%NOTFOUND
                FALSE, if the fetch statement returns at least one row.
%ROWCOUNT       The number of rows fetched by the fetch statement so far.         cursor_name%ROWCOUNT
                If no row is returned, the PL/SQL statement returns an error.
%ISOPEN         TRUE, if the cursor is already open in the program.               cursor_name%ISOPEN
                FALSE, if the cursor is not opened in the program.

Using Loops with Explicit Cursors:


Oracle provides three types of loops, namely the SIMPLE LOOP, WHILE LOOP and FOR LOOP.
These loops can be used to process multiple rows in the cursor. Here I will modify the same
example for each loop to explain how to use loops with cursors.
Cursor with a Simple Loop:
1> DECLARE
2>    CURSOR emp_cur IS
3>       SELECT first_name, last_name, salary FROM emp_tbl;
4>    emp_rec emp_cur%rowtype;
5> BEGIN
6>    IF NOT emp_cur%ISOPEN THEN
7>       OPEN emp_cur;
8>    END IF;
9>    LOOP
10>      FETCH emp_cur INTO emp_rec;
11>      EXIT WHEN emp_cur%NOTFOUND;
12>      dbms_output.put_line(emp_rec.first_name || ' ' || emp_rec.last_name
13>                           || ' ' || emp_rec.salary);
14>   END LOOP;
15> END;
16> /

In the above example we are using two cursor attributes, %ISOPEN and %NOTFOUND.
In line no 6, we are using the cursor attribute %ISOPEN to check if the cursor is open; if the
condition is true the program does not open the cursor again, it directly moves to line no 9.
In line no 11, we are using the cursor attribute %NOTFOUND to check whether the fetch
returned any row. If no row is found the program exits the loop (the condition which exists
when you fetch the cursor after the last row); if a row is found the program continues.
We can use %FOUND in place of %NOTFOUND and vice versa. If we do so, we need to
reverse the logic of the program. So use these attributes in appropriate instances.
Cursor with a While Loop:
Lets modify the above program to use while loop.
1> DECLARE
2>    CURSOR emp_cur IS
3>       SELECT first_name, last_name, salary FROM emp_tbl;
4>    emp_rec emp_cur%rowtype;
5> BEGIN
6>    IF NOT emp_cur%ISOPEN THEN
7>       OPEN emp_cur;
8>    END IF;
9>    FETCH emp_cur INTO emp_rec;
10>   WHILE emp_cur%FOUND
11>   LOOP
12>      dbms_output.put_line(emp_rec.first_name || ' ' || emp_rec.last_name
13>                           || ' ' || emp_rec.salary);
15>      FETCH emp_cur INTO emp_rec;
16>   END LOOP;
17> END;
18> /

In the above example, in line no 10 we are using %FOUND to evaluate if the first fetch
statement in line no 9 returned a row, if true the program moves into the while loop. In the loop
we use fetch statement again (line no 15) to process the next row. If the fetch statement is not
executed once before the while loop the while condition will return false in the first instance and
the while loop is skipped. In the loop, before fetching the record again, always process the record
retrieved by the first fetch statement, else you will skip the first row.
Cursor with a FOR Loop:
When using FOR LOOP you need not declare a record or variables to store the cursor values,
need not open, fetch and close the cursor. These functions are accomplished by the FOR LOOP
automatically.
General Syntax for using FOR LOOP:
FOR record_name IN cursor_name
LOOP
process the row...
END LOOP;

Lets use the above example to learn how to use for loops in cursors.
DECLARE
   CURSOR emp_cur IS
      SELECT first_name, last_name, salary FROM emp_tbl;
   emp_rec emp_cur%rowtype;
BEGIN
   FOR emp_rec IN emp_cur
   LOOP
      dbms_output.put_line(emp_rec.first_name || ' ' || emp_rec.last_name
                           || ' ' || emp_rec.salary);
   END LOOP;
END;
/

In the above example, when the FOR loop is processed a record emp_rec of structure emp_cur
gets created, the cursor is opened, the rows are fetched into the record emp_rec and the cursor is
closed after the last row is processed. By using a FOR loop in your program, you can reduce the
number of lines in the program.
NOTE: In the examples given above, we are using the slash / at the end of the program.
This indicates to the Oracle engine that the PL/SQL program has ended and it can begin processing
the statements.
Posted 29th November 2011 by Prafull Dangore
0

Add a comment

Nov

24

how to check table size in oracle 9i ?


Scenario :
how to check table size in oracle 9i ?
Solution :
SELECT segment_name table_name,
       SUM(bytes)/(1024*1024) table_size_meg
FROM   user_extents
WHERE  segment_type = 'TABLE'
AND    segment_name = 'TABLE_NAME'
GROUP BY segment_name;
Posted 24th November 2011 by Prafull Dangore
0

Add a comment

Nov
1

How to do the incremental loading using mapping variable
Scenario:
how to do the incremental loading using mapping variable
Solution:
Step 1: create a mapping variable, $$MappingDateVariable. In the Source Qualifier, create a
filter to read only rows whose transaction date equals $$MappingDateVariable, such as:
transaction_date = $$MappingDateVariable
(I am assuming you have a column in your source table called transaction_date, or any date
column, to help do the incremental load.)

Step 2: Set the initial value of $$MappingDateVariable to the date from which you need to extract.
Step 3: In the mapping, use the variable function to set the variable value to increment one day
each time the session runs.
Let's say you set the initial value of $$MappingDateVariable to 11/16/2010. The first time the
Integration Service runs the session, it reads only rows dated 11/16/2010 and sets
$$MappingDateVariable to 11/17/2010. It saves 11/17/2010 to the repository at the end of the
session. The next time it runs the session, it reads only rows from 11/17/2010.
Posted 1st November 2011 by Prafull Dangore
0

Add a comment

Nov
1

Data type conversion issue


Scenario:
We are creating hundreds of passthrough mappings that need to store numeric source data
columns as VARCHAR2 string columns in the target Oracle staging tables. (This is to bypass a
current IDQ bug where large numbers are presented in profiling results in scientific notation - KB 117753.)
Unfortunately PowerCenter pads all available significant places with zeros.
E.g. a source column (15,2) passing value 12.3 into a target column of varchar2(20) will populate
with "12.3000000000000".
Can this be avoided without manipulating each mapping with an additional expression with a
string function and a new output port for each column?
Solution:
Enabling high precision ensures that the source data type (both scale and precision) stays intact.
So if you want to avoid trailing 0's, you can use the TRIM function in the SQ override query
(a TO_CHAR-based sketch is shown below).
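
A minimal sketch of a Source Qualifier override that avoids the padded zeros, assuming a source table
SRC_T with a NUMBER(15,2) column AMT (table and column names are illustrative only). TO_CHAR with
the TM (text minimum) format model returns the shortest possible string, so 12.3 arrives as '12.3'
rather than '12.3000000000000':

SELECT TO_CHAR(AMT, 'TM9') AS AMT_STR
FROM   SRC_T;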
Posted 1st November 2011 by Prafull Dangore
0

Add a comment


Nov
1

I get output file first field like #id,e_ID,pt_Status, but I don't want #
Scenario:
My source is a .csv file, staging is in Oracle, and the target is a .csv file. In the output file the
first field comes with a # prepended to the column name. I also want to delete dummy files; my
server is Windows.
Solution:

It is not a problem. You need to provide the target file path and name: in the input filename and
output filename you can provide the file location and the name you want to have in the target file
(final file).
Ex:
Oracle_emp(source) --> SQ --> Logic --> TGT(emp.txt)(Flatfile)
In the post-session success command:
sed 's/^#//g' d:\Informaticaroot\TGTfiles\emp.txt > d:\Informaticaroot\TGTfiles\

Posted 1st November 2011 by Prafull Dangore


0

Add a comment

Oct
21

Need to get the latest ID


Scenario:

We have data from the source coming as below.

Source is an Oracle database:
OLD_ID   NEW_ID
------   ------
101      102
102      103
103      104
105      106
106      108

Need the output as below:
OLD_ID   NEW_ID
------   ------
101      104
102      104
103      104
105      108
106      108

Can anyone help me to do this in Informatica?
Solution:
Mapping def:

SQ --> Exp1 --> Exp2 ----------> Jnr --> TGT
                  \--> Agg ---->

Explanation:
In Exp1, add two variable ports as shown below:

OLD_ID   NEW_ID   Diff_of_rows      New_id
101      102      1  (1)            1
102      103      1  (102-101)      1
103      104      1  (103-102)      1
105      106      2  (105-103)      2
106      108      1  (106-105)      2

Diff_of_rows - you have to maintain the old_id of the previous row in an expression variable, then
subtract it from the current row's old_id.

New_id - start with 1; whenever Diff_of_rows is greater than 1 (i.e. the chain breaks), increment the
value of new_id by 1.

Then send the below rows to Exp2:

OLD_ID   NEW_ID   New_id
101      102      1
102      103      1
103      104      1
105      106      2
106      108      2

and in the Agg output (group by New_id, taking the max NEW_ID):

NEW_ID   New_id
104      1
108      2

Then join the Exp2 output with the Agg output based on the New_id column, so you will get the
required output:

OLD_ID   NEW_ID
101      104
102      104
103      104
105      108
106      108

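For reference, the same result can be produced directly in Oracle with a hierarchical query; this is a
hedged sketch assuming the pairs sit in a table ID_MAP(OLD_ID, NEW_ID) (table and column names are
illustrative only):

SELECT CONNECT_BY_ROOT old_id AS old_id,
       new_id
FROM   id_map
WHERE  CONNECT_BY_ISLEAF = 1
CONNECT BY PRIOR new_id = old_id;

-- Each row is treated as a chain start; following the old_id -> new_id links,
-- the leaf of every chain carries the latest NEW_ID for that starting OLD_ID.
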
Posted 21st October 2011 by Prafull Dangore


0

Add a comment

Oct
21

Aborting a Session in Informatica 8.6.1


Scenario:
I am trying to abort a session in the workflow monitor by using 'Abort' option.
But the status of the session is still being shown as 'Aborting' and remains same for the past 4
days.
Finally I had to request the UNIX team to kill the process.
Could anybody let me know the reason behind this as I couldn't find any info in the log file as
well.
Solution:
- If the session you want to stop is part of a batch, you must stop the batch.
- If the batch is part of a nested batch, stop the outermost batch.
- When you issue the stop command, the server stops reading data. It continues processing,
writing and committing data to the targets.
- If the server cannot finish processing and committing data, you can issue the ABORT
command. It is similar to the stop command, except it has a 60 second timeout. If the server cannot
finish processing and committing data within 60 seconds, it kills the DTM process and
terminates the session.
As you said, to kill the process we need to contact the UNIX Admin.
But last time I coordinated with the Oracle team and updated the OPB table info related to the workflow status.

Posted 21st October 2011 by Prafull Dangore


0

Add a comment

Oct

20

Condition to Check for NULLS and SPACES in Informatica
Scenario:
I have String data and I want to filter out NULLS and SPACES from that set of Data.
What can be the condition given in Informatica to check for NULLS and SPACES in
ONE EXPRESSION OR FILTER TRANSFORMATION.
Solution:
Use LENGTH(LTRIM(RTRIM(column_name))) <> 0 in the Filter transformation.
OR
IIF(ISNULL(column_name) OR LTRIM(RTRIM(column_name)) = '', 0, 1) -- do this in an Expression transformation
and use this flag in the Filter. A SQL equivalent is sketched below.
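
A minimal SQL sketch of the same check, assuming a source table SRC_T with a VARCHAR2 column COL
(names are illustrative only); TRIM of an all-space or NULL value returns NULL, so both cases are
filtered out:

SELECT *
FROM   SRC_T
WHERE  TRIM(COL) IS NOT NULL;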
Posted 20th October 2011 by Prafull Dangore
0

Add a comment

Oct
20

Combining multiple rows as one based on matching column value of multiple rows
Scenario:
I have my source data like below:
ID Line-no Text
529 3 DI-9001
529 4 DI-9003
840 2 PR-031
840 2 DI-9001
616 1 PR-029
874 2 DI-9003
874 1 PR-031
959 1 PR-019

Now I want my target to be


ID Line-no Text
529 3 DI-9001
529 4 DI-9003
840 2 PR-031&DI-9001
616 1 PR-029
874 2 DI-9003
874 1 PR-031
959 1 PR-019
It means that if both the ID and the LINE_NO are the same, then the TEXT values should be
concatenated; otherwise no change.
Solution:
The mapping flow is like this:
source --> SQ --> Sorter --> Expression --> Aggregator --> target
Sorter ---> sort by ID, Line_no in ASC order
Expression --> use variable ports as
ID (I/O)
Line_no (I/O)
Text (I)
Text_v (v) : IIF(ID = pre_id AND Line_no = pre_line_no, Text_v || '&' || Text, Text)
pre_id (v) : ID
pre_line_no (v) : Line_no
Text_op (O) : Text_v
Aggregator --> group by ID and Line_no. It will return the last row per group, which is the
concatenation of the text.
Then pass it to the Target. A SQL equivalent is sketched below.
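
A minimal SQL sketch of the same grouped concatenation, assuming the source sits in a table
SRC_T(ID, LINE_NO, TEXT) (names are illustrative; LISTAGG needs Oracle 11gR2 or later):

SELECT ID,
       LINE_NO,
       LISTAGG(TEXT, '&') WITHIN GROUP (ORDER BY TEXT) AS TEXT
FROM   SRC_T
GROUP  BY ID, LINE_NO;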
Posted 20th October 2011 by Prafull Dangore
0

Add a comment

Oct
20

How to Read the Data between 2 Single Quotes
Scenario:

I have a record like this:


field1 = "id='102',name='yty,wskjd',city='eytw' "
Note: sometimes data might come as [id,name, city] or sometimes it might come as
[code,name,id]. It varies...
I need to store the value of field1 into different fields,
value1 = id='102'
value2 = name='yty,wskjd'
value3 = city='eytw'
If I split the record based on comma (,) then the result won't come as expected, as there is a
comma (,) in the value of name.
Is there a way we can achieve the solution in an easier way, i.e., if a comma comes in
between two single quotes then we have to suppress the comma (,)? I gave it a try with
different inbuilt functions but couldn't make it work. Is there a way to read the data in between 2
single quotes?

Solution:
Please try the below solution; it may help you to some extent.
Field1 = "id='102',name='yty,wskjd',city='eytw' "
Steps:
1. v_1 = Replace(field1, '"', '') -- strip the double quotes
2. v_2 = substring-after(v_1, id=) -- O/P '102',name='yty,wskjd',city='eytw'
3. v_3 = substring-after(v_1, name=) -- O/P 'yty,wskjd',city='eytw'
4. v_4 = substring-after(v_1, city=) -- O/P 'eytw'
5. v_5 = substring-before(v_2, name) -- O/P '102',
6. v_6 = substring-before(v_3, city) -- O/P 'yty,wskjd',
7. value1 = replace(v_5, ',', '') -- O/P '102'
8. value2 = replace(v_6, ',', '') -- O/P 'yty,wskjd'
9. value3 = v_4 -- O/P 'eytw'
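
A hedged SQL sketch of the same idea using regular expressions, assuming the string sits in a column
FIELD1 of table SRC_T (names are illustrative only); each key is pulled out with REGEXP_SUBSTR, so
commas inside the quoted values are ignored:

SELECT REGEXP_SUBSTR(FIELD1, 'id=''[^'']*''')   AS value1,
       REGEXP_SUBSTR(FIELD1, 'name=''[^'']*''') AS value2,
       REGEXP_SUBSTR(FIELD1, 'city=''[^'']*''') AS value3
FROM   SRC_T;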

Posted 20th October 2011 by Prafull Dangore


0

Add a comment

Oct
20

How to use Lookup Transformation for Incremental data loading in target table?

Scenario:
How to load new data from one table to another?
For example: I have done a mapping from a source table (which contains bank details) to a target table.
For the first time I will load all the data from source to target. If I run the mapping on the
second day, I need to get only the data which was newly entered in the source table.
The first time it has to load all the data from source to target; for the second or third
time, if there is any new record in the source table, only that record must be loaded to the target,
by comparing both source and target.
How to use the lookup transformation for this?

Solution:
1) In the mapping, create a lookup on the target table and select dynamic lookup cache in the
Properties tab. Once you check it you can see the NewLookupRow column in the lookup ports, through
which you can identify whether incoming rows are new or existing. So after the lookup you can use a
Router to insert or update them in the target table.
Also in the lookup ports, you can use the associated port to compare specific/all columns of the
target table lookup with the source columns. It is a connected lookup, where you send the source rows
to the lookup as input and/or output ports, and the lookup ports as output and lookup.
OR
2) If there is a primary key column in the target table, then we can create a lookup on the target
table and match the target primary key with the source primary key. If the lookup finds a match
then ignore those records; if there is no match then insert those records into the target.
The logic should be as below:
SQ --> LKP --> FILTER --> TGT
In the lookup, match the ID column from the source with the ID column in the target. The lookup will
return the ID if it is available in the target, else it will return a null value.
In the filter, allow only the null ID values returning from the lookup (a SQL sketch of this option
is shown below).
OR
3) If you have any datestamp in the source table, then you can pull only the newly inserted
records from the source table based on the timestamp (this approach is applicable only if the
source table has a lastmodifieddate column).
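
A minimal SQL sketch of option 2 (insert only the keys that are not yet in the target), assuming source
and target tables SRC_BANK and TGT_BANK keyed on BANK_ID (all names are illustrative, not from the
original post):

INSERT INTO TGT_BANK (BANK_ID, BANK_NAME)
SELECT s.BANK_ID, s.BANK_NAME
FROM   SRC_BANK s
WHERE  NOT EXISTS (SELECT 1
                   FROM   TGT_BANK t
                   WHERE  t.BANK_ID = s.BANK_ID);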

Posted 20th October 2011 by Prafull Dangore


0

Add a comment

Oct
20

Is it possible to use a parameter to specify the 'Table Name Prefix'?
Scenario:
Is it possible to use a parameter to specify the 'Table Name Prefix'?
Solution:
Yes, you can use a parameter for specifying the table name prefix.
Suppose you have a table x with different table name prefixes, like p1.x and p2.x;
you can load into the tables separately by specifying the table name prefix value in the parameter
file.
All you need to do is create a workflow variable and assign a value to the variable in the parameter
file, and use that variable in the Table Name Prefix property.
Posted 20th October 2011 by Prafull Dangore
0

Add a comment

Oct

19

Informatica PowerCenter performance Lookups


Informatica PowerCenter performance - Lookups
Lookup performance
Lookup is an important and a useful transformation when used effectively.
What is a lookup transformation? It is just not another transformation which fetches you
data to look against the source data. It is a transformation when used improperly, makes
your flow run for ages.
I now try to explain different scenarios where you can face problems with Lookup and also
how to tackle them.

Unwanted columns:
By default, when you create a lookup on a table, PowerCenter gives you all the columns in
the table, but be sure to delete the unwanted columns from the lookup as they affect the
lookup cache very much. You only need columns that are to be used in lookup condition and
the ones that have to get returned from the lookup.
SQL query:
We will start from the database. Find the execution plan of the SQL override and see if you
can add some indexes or hints to the query to make it fetch data faster. You may have to
take the help of a database developer to accomplish this if you, yourself are not an SQLer.
Size of the source versus size of lookup:
Let us say, you have 10 rows in the source and one of the columns has to be checked
against a big table (1 million rows). Then PowerCenter builds the cache for the lookup table
and then checks the 10 source rows against the cache. It takes more time to build the
cache of 1 million rows than going to the database 10 times and lookup against the table
directly.
Use uncached lookup instead of building the static cache, as the number of source rows is
quite less than that of the lookup.
Conditional call of lookup:
Instead of going for connected lookups with filters for a conditional lookup call, go for
unconnected lookup. Is the single column return bothering for this? Go ahead and change
the SQL override to concatenate the required columns into one big column. Break them at
the calling side into individual columns again.
JOIN instead of Lookup:
In the same context as above, if the Lookup transformation is after the Source Qualifier and
there is no active transformation in-between, you can as well go for the SQL override of the
Source Qualifier and join traditionally to the lookup table using database joins, if both the
tables are in the same database and schema.
Increase cache:
If none of the above seems to be working, then the problem is certainly with the cache. The
cache that you assigned for the lookup is not sufficient to hold the data or index of the
lookup. Whatever data doesn't fit into the cache is spilled to the cache files designated
in $PMCacheDir. When PowerCenter doesn't find the data you are looking up in the
cache, it swaps the data from the file to the cache and keeps doing this until it finds the
data. This is quite expensive for obvious reasons, being an I/O operation. Increase the cache
so that the whole data resides in the memory.
What if your data is huge and your whole system cache is less than that? Don't promise
PowerCenter an amount of cache that can't be allotted during the runtime. If you promise
10 MB and during runtime the system on which the flow is running runs out of cache and can
only assign 5 MB, then PowerCenter fails the session with an error.
Cachefile file-system:
In many cases, if you have cache directory in a different file-system than that of the hosting
server, the cache file piling up may take time and result in latency. So with the help of your
system administrator try to look into this aspect as well.
Useful cache utilities:
If the same lookup SQL is being used in some other lookup, then you have to go for a
shared cache or reuse the lookup. Also, if you have a table that doesn't get data
updated or inserted quite often, then use the persistent cache, because the
consecutive runs of the flow don't have to build the cache and waste time.

Posted 19th October 2011 by Prafull Dangore


0

Add a comment

Oct
19

PowerCenter objects Introduction


PowerCenter objects Introduction
A repository is the highest physical entity of a project in PowerCenter.
A folder is a logical entity in a PowerCenter project. For example,
Customer_Data is a folder.
A workflow is synonymous to a set of programs in any other programming

language.
A mapping is a single program unit that holds the logical mapping between
source and target with the required transformations. A mapping will just say a
source table by name EMP exists with some structure, and a target flat file by
name EMP_FF exists with some structure. The mapping doesn't say in which
schema this EMP table exists and in which physical location this EMP_FF
file is going to be stored.
A session is the physical representation of the mapping. The session
defines what the mapping didn't: the session stores the information about
where this EMP table comes from - which schema, and with what username and
password we can access this table in that schema. It also tells about the
target flat file: in which physical location the file is going to get created.
A transformation is a sub-program that performs a specific task with the
input it gets and returns some output. It can be assumed as a stored
procedure in any database. Typical examples of transformations are Filter,
Lookup, Aggregator, Sorter etc.
A set of transformations, that are reusable can be built into something
called mapplet. A mapplet is a set of transformations aligned in a specific
order of execution.
As with any other tool or programming language, PowerCenter also allows
parameters to be passed to have flexibility built into the flow. Parameters are
always passed as data in flat files to PowerCenter, and that file is called the
parameter file.
Posted 19th October 2011 by Prafull Dangore
0

Add a comment

Oct
19

Dynamically generate parameter files


Scenario:
Dynamically generate parameter files
Solution:
Parameter file format for PowerCenter:

For a workflow parameter which can be used by any session in the workflow, below is the format
in which the parameter file has to be created.
[Folder_name.WF:Workflow_Name]
$$parameter_name1=value
$$parameter_name2=value
For a session parameter which can be used by the particular session, below is the format in which
the parameter file has to be created.
[Folder_name.WF:Workflow_Name.ST:Session_Name]
$$parameter_name1=value
$$parameter_name2=value
3. Parameter handling in a data model:
To have flexibility in maintaining the parameter files, to reduce the overhead for the support team
of changing the parameter file every time the value of a parameter changes, and to ease the
deployment, all the parameters have to be maintained in Oracle (or any database) tables, and a
PowerCenter session is created to generate the parameter file in the required format automatically.
For this, 4 tables are to be created in the database:
1. FOLDER table will have entries for each folder.
2. WORKFLOWS table will have the list of each workflow but with a reference to the
FOLDERS table to say which folder this workflow is created in.
3. PARAMETERS table will hold all the parameter names irrespective of folder/workflow.
4. PARAMETER_VALUES table will hold the parameter of each session with references to
PARMETERS table for parameter name and WORKFLOWS table for the workflow name. When
the session name is NULL, that means the parameter is a workflow variable which can be used
across all the sessions in the workflow.

To get the actual names because PARAMETER_VALUES table holds only ID columns of
workflow and parameter, we create a view that gets all the names for us in the required format of
the parameter file. Below is the DDL for the view.
a. Parameter file view:
CREATE OR REPLACE VIEW PARAMETER_FILE
(
  HEADER,
  DETAIL
)
AS
SELECT '[' || fol.folder_name || '.WF:' || wfw.workflow_name || ']' header,
       pmr.parameter_name || nvl2(dtl.logical_name, '_' || dtl.logical_name, NULL) || '=' || dtl.value detail
FROM   folder fol,
       parameters pmr,
       workflows wfw,
       parameter_values dtl
WHERE  fol.id = wfw.folder_id
AND    dtl.pmr_id = pmr.id
AND    dtl.wfw_id = wfw.id
AND    dtl.session_name IS NULL
UNION
SELECT '[' || fol.folder_name || '.WF:' || wfw.workflow_name || '.ST:' || dtl.session_name || ']' header,
       decode(dtl.mapplet_name, NULL, NULL, dtl.mapplet_name || '.') ||
       pmr.parameter_name || nvl2(dtl.logical_name, '_' || dtl.logical_name, NULL) || '=' || dtl.value detail
FROM   folder fol,
       parameters pmr,
       workflows wfw,
       parameter_values dtl
WHERE  fol.id = wfw.folder_id
AND    dtl.pmr_id = pmr.id
AND    dtl.wfw_id = wfw.id
AND    dtl.session_name IS NOT NULL
b. FOLDER table
ID (NUMBER)
FOLDER_NAME (varchar50)
DESCRIPTION (varchar50)
c. WORKFLOWS table
ID (NUMBER)

WORKFLOW_NAME (varchar50)
FOLDER_ID (NUMBER) Foreign Key to FOLDER.ID
DESCRIPTION (varchar50)
d. PARAMETERS table
ID (NUMBER)
PARAMETER_NAME (varchar50)
DESCRIPTION (varchar50)
e. PARAMETER_VALUES table
ID (NUMBER)
WF_ID (NUMBER)
PMR_ID (NUMBER)
LOGICAL_NAME (varchar50)
VALUE (varchar50)
SESSION_NAME (varchar50)
LOGICAL_NAME is a normalization initiative in the above parameter logic. For example, in a
mapping if we need to use $$SOURCE_FX as a parameter and also $$SOURCE_TRANS as
another mapping parameter, instead of creating 2 different parameters in the PARAMETERS
table, we create one parameter $$SOURCE. Then FX and TRANS will be two
LOGICAL_NAME records of the PARAMETER_VALUES table.
m_PARAMETER_FILE is the mapping that creates the parameter file in the desired format and
the corresponding session name is s_m_PARAMETER_FILE.

Posted 19th October 2011 by Prafull Dangore


0

Add a comment

Oct
19

How to generate target file names (like YYYYMMDDHH24:MISS.csv) dynamically from the mapping?
Scenario:

How to generate target file names (like YYYYMMDDHH24:MISS.csv) dynamically from the mapping?
Solution:
In order to generate the target file names from the mapping, we should make use of the
special "FileName" port in the target file. You can't create this special port from the usual
New port button. There is a special button with label "F" on it to the right most corner of
the target flat file when viewed in "Target Designer".
Below two screen-shots tell you how to create the special port in your target file.

Once this is done, the job is done. When you want to create the file name with a
timestamp attached to it, just use a port from an Expression transformation before the
target to pass a value of an output port with the expression
$$FILE_NAME || TO_CHAR(SESSSTARTTIME, 'YYYYMMDDHH24:MISS') || '.csv'.
Please note that $$FILE_NAME is a parameter to the mapping, and I've used
SESSSTARTTIME because it will be constant throughout the session run.
If you use SYSDATE, it will change: if you have 100s of millions of records and the
session runs for an hour, each second a new file will get created.
Please note that a new file gets created with the current value of the port when the port
value which maps to the FileName changes.
We'll come to the mapping again. This mapping generates two files. One is a dummy
file with zero bytes size and the file name is what is given in the Session properties
under 'Mappings' tab for target file name. The other file is the actual file created with the
desired file name and data.
Posted 19th October 2011 by Prafull Dangore
0

Add a comment


Oct
18

To find the inserted, deleted and updated rows count
Scenario:
I want to find the no. of rows inserted, updated and deleted on the successful execution
of a session.
These details are present only in the session log file, so how do I grep these details from
the log file? Or is there any other method?
I actually have to insert these details into a table. The other details which I have to
include in the table are session name, target table name, session start time and session end
time.
Thus my table structure is:
Session_name
Tgt_Table_name
Start_time
End_time
Inserted_count
Deleted_count
Updated_count
Solution:
You will get this info through the Informatica metadata (repository) tables/views; a hedged example
is sketched below.
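
A hedged sketch against the PowerCenter repository view REP_SESS_LOG (the view and column names
below are from memory and should be verified against your repository version; the workflow name is
illustrative):

SELECT SESSION_NAME,
       ACTUAL_START,
       SESSION_TIMESTAMP,
       SUCCESSFUL_ROWS,
       FAILED_ROWS
FROM   REP_SESS_LOG
WHERE  WORKFLOW_NAME = 'wf_my_workflow';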

Posted 18th October 2011 by Prafull Dangore


0

Add a comment


Oct
18

How to load first, last and remaining records into different targets when source is a flat file?
Scenario:
How to load first,last,remaining records into different targets when source is
a flat file?
Solution:
If you are using a Sequence generator and an Aggregator, then the mapping flow should be like below:

SRC --> SQ --> EXP --> JNR --> RTR --> TGT1 / TGT2 / TGT3
SEQ --> EXP
EXP --> AGG --> JNR

In the Router: if the seq value = 1, then that record goes to target 1;
if the seq value equals the Aggregator count output, that is the last record, so it goes to target 3;
all the remaining records pass to target 2.
For SQL queries to get the first, last and remaining records, try the below:
For the first record:
select * from emp where rownum=1;
For the last record:
select * from (select * from (select empno,ename,sal,job,mgr,rownum rn from emp) order by
rn DESC) where rownum=1;
For the remaining records you can use the MINUS operator with the above outputs.

Posted 18th October 2011 by Prafull Dangore


0

Add a comment

Oct
18

Email date in subject line


Scenario:
Is there a way to add the sysdate to the email subject sent from Informatica?
I am running a mapping which creates an error file. I am sending this error file at the end of the
process via an email. But the requirement is to send it with some text as an error report and the
sysdate in the subject line.
Solution:
Below is the approach:
Create a workflow variable $$Datestamp as datetime datatype. In an Assignment task assign the
sysdate to that variable, and in the email subject use the $$Datestamp variable; it will send the
timestamp in the subject.
Posted 18th October 2011 by Prafull Dangore
0

Add a comment

Oct
18

Route records as UNIQUE AND DUPLICATE
Scenario:
I HAVE A SRC TABLE AS :
A
B
C
C
B
D
B
I HAVE 2 TGT TABLES UNIQUE AND DUPLICATE :

The first table should contain the following output


A
D
The second target should contain the following output
B
B
B
C
C
How do I do this?
Solution:
Try the following approach.

SRC --> SQ ----------------> JNR --> RTR --> TGT1
            \--> AGG ------>                 TGT2

From the source pass all the data to the Aggregator and group by the source column; add one output
port count(column).
So from the Aggregator you have two output ports:
COLUMN,COUNT
A,1
B,3
C,2
D,1
Now join this data with the source based on the column.
The output of the Joiner will be like below:
COLUMN,COUNT
A,1
B,3
C,2
C,2
B,3
D,1
B,3
In the Router create two groups, one for unique and another one for duplicate:

Unique = (count = 1)
Duplicate = (count > 1)

Posted 18th October 2011 by Prafull Dangore


0

Add a comment

Oct
18

Informatica Source Qualifier (Inner Joins )


Scenario:
have an 3 tables
ENO ENAM HIREDATE
001 XXX MAY/25/2009
002 JJJJ OCT/12/2010
008 KKK JAN/02/2011
006 HJJH AUG/12/2012
ENO S-ID
001 OO
002 OO
007 OO
ENO V-ID
006 DD
008 DD
001 DD
Using the Informatica Source Qualifier or other transformations, I should be able to club the above
tables in such a way that if HIREDATE > JAN/01/2011 then the ENO should select v-id, and if
HIREDATE < JAN/01/2011 then the ENO should select s-id, and make a target table leaving the other
ID column blank based on the condition - it should have either S-ID or V-ID but not both.
ENO ENAM HIREDATE S-ID V-ID
Please give me your best advice for this situation.
Solution:
It is better to do it in the Source Qualifier SQL query with CASE statements (outer joins keep the
employees that are missing from one of the ID tables):
select a.eno, a.enam, a.hiredate,
       case when a.hiredate <  to_date('01-JAN-2011','DD-MON-YYYY') then b.s_id end as s_id,
       case when a.hiredate >= to_date('01-JAN-2011','DD-MON-YYYY') then c.v_id end as v_id
from table1 a, table2 b, table3 c
where a.eno = b.eno(+)
and   a.eno = c.eno(+);
OR
You can use a lookup.
The second table and third table can be used as lookups.
In an Expression:
s_id = IIF(HIREDATE < TO_DATE('01/01/2011','MM/DD/YYYY'), lkp_2nd_tbl, NULL)
v_id = IIF(HIREDATE > TO_DATE('01/01/2011','MM/DD/YYYY'), lkp_3rd_table, NULL)
Posted 18th October 2011 by Prafull Dangore
0

Add a comment

Oct
18

CMN_1650 A duplicate row was attempted to be inserted into a dynamic lookup cache
Dynamic lookup error.
Scenario:
I have 2 ports going through a dynamic lookup, and then to a router. In the router it is a
simple
case of inserting new target rows (NewRowLookup=1) or rejecting existing rows
(NewRowLookup=0).
However, when I run the session I'm getting the error:
"CMN_1650 A duplicate row was attempted to be inserted into a dynamic lookup cache
Dynamic lookup error. The dynamic lookup cache only supports unique condition keys."
I thought that I was bringing through duplicate values so I put a distinct on the SQ.
There is also a not null filter on both ports.
However, whilst investigating the initial error that is logged for a specific pair of values
from the source, there is only 1 set of them (no duplicates). The pair exists on the target
so surely should just return from the dynamic lookup newrowlookup=0.
Is this some kind of persistent data in the cache that is causing this to think that it is

duplicate data? I haven't got the persistent cache or recache from database flags
checked.
Solution:
This occurs when the table on which the lookup is built has duplicate rows. Since a
dynamic cached lookup cannot be created with duplicate rows, the session fails with
this error.
Make sure there are no duplicate rows in the table before starting the session. OR Do a
Select DISTINCT in the lookup cache SQL.
OR
Make sure the data types of the source and lookup fields match and extra spaces are
trimmed. It looks like the match is failing between source and lookup, so the lookup is trying to
insert the row into the cache even though it is already present.

Posted 18th October 2011 by Prafull Dangore


0

Add a comment

Oct
18

Update Strategy for Deleting Records in


Informatica
Scenario:
I am using an update strategy transformation for deleting records from my target table. In my
Warehouse Designer, I have defined one column (say col1) as Primary Key and another column
(say col2) as Primary/Foreign Key.
My target has rows like this:
Col1 Col2 Col3
1 A value1
2 A value2
3 B value3

I want to delete the record from the target which has the combination (Col1="2" and Col2="A").
Will linking the fields Col1 and Col2 from the Update Strategy transformation to the Target serve
the purpose?
Solution:
Define both the columns as primary key in target definition and link only col1 and col2 in
mapping. This will serve your purpose.
BTW, if you do only delete then update strategy is not required at all.
Posted 18th October 2011 by Prafull Dangore
0

Add a comment

Oct
18

Target Rows as Update or Insert Option in the Target
Scenario:
When you have the option to treat target rows as update or insert in the target,
why do you need a Lookup transformation? I mean, why do you need a Lookup
transformation with an Update Strategy in a mapping to mark the records for update or
insert when you have the 'update else insert' option in the target? Is there any difference
between the two? Can someone please let me know what the difference is and when to use
which option?
Solution:
In slowly growing targets (delta loads) the target is loaded incrementally.
You need to know whether a particular record already exists in the target.
A Lookup is used to cache the target records and compare the incoming records
with the records in the target.
If an incoming record is new it will be inserted into the target, otherwise not.
An Expression is used to flag a record as either new or existing;
if it is new, the record is flagged as 'I' in the sense of Insert.
In Slowly Changing Dimensions (SCD), the history of the dimension is maintained.
Hence if a record exists in the target and it needs to be updated, then it
will be flagged as 'U' in the sense of Update.

Add a comment

Oct
17

How to Fill Missing Sequence Numbers in Surrogate Key Column in Informatica
Scenario:
Hello all,
I am new to working with surrogate key columns in database. Recently I developed a
workflow/mapping that populates an SCD table with a surrogate key column. For each record
that is inserted, I created a logic in expression t/r such that it generates a new sequence number.
This seems fine and works OK.
Now, We have a purge logic that runs every day in post-sql that will delete records that have not
been updated for the last 10 days. Due to this reason, after testing the ETL process for over 15
days, I find a lot of gaps in the surrogate key column.
Is there a way/logic in Informatica with which I can fill these gaps while loading the target and
create a new sequence number only if there a no gaps? Or can this be done at database level? I
searched over the Internet but did not find any solution whatsoever.
Please advise.
Solutions:
Hello,
If you can make a few changes to your mapping, you can achieve it.
1. First delete the records which have not been used for the last 10 days in the pre-SQL instead of
deleting at the end.
2. Load all the data into a temp table, including old and new.
3. Now load all the data into the target table with a Sequence Generator; in the SG change the
setting so that its value resets to 0 for every new run.
OR

Posted 17th October 2011 by Prafull Dangore


0

Add a comment

Oct
17

Surrogate Key
Scenario:
What is a surrogate key and where do you use it?
Solution:
A surrogate key is a substitution for the natural primary key.
It is just a unique identifier or number for each row that can be used for the primary key to the
table. The only requirement for a surrogate primary key is that it is unique for each row in the
table.
Data warehouses typically use a surrogate, (also known as artificial or identity key), key for the
dimension tables primary keys. They can use Infa sequence generator, or Oracle sequence, or
SQL Server Identity values for the surrogate key.
It is useful because the natural primary key (i.e. Customer Number in Customer table) can
change and this makes updates more difficult.
Some tables have columns such as AIRPORT_NAME or CITY_NAME which are stated as the
primary keys (according to the business users), but not only can these change, indexing on a
numerical value is probably better and you could consider creating a surrogate key called, say,
AIRPORT_ID. This would be internal to the system and as far as the client is concerned you may
display only the AIRPORT_NAME.

Posted 17th October 2011 by Prafull Dangore


0

Add a comment

Oct
14

Unique Constraint Violated


Scenario:
Database errors occurred:
ORA-00001: unique constraint (INF_PRACTICE1.SYS_C00163872) violated
Database driver error.
Function Name : Execute
SQL Stmt : INSERT INTO D_CLAIM_INJURY_SAMPLEE
(CK_SUM, DM_ROW_PRCS_DT, DM_ROW_PRCS_UPDT_DT, CLAIM_INJRY_SID, DM_CRRNT_ROW_IND,
INCDT_ID, ENAME, JOB, FIRSTNAME, LASTNAME) VALUES ( ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Database driver error...
Function Name : Execute Multiple
SQL Stmt : INSERT INTO D_CLAIM_INJURY_SAMPLEE
(CK_SUM, DM_ROW_PRCS_DT, DM_ROW_PRCS_UPDT_DT, CLAIM_INJRY_SID, DM_CRRNT_ROW_IND,
INCDT_ID, ENAME, JOB, FIRSTNAME, LASTNAME) VALUES ( ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Solution:
Check the definition of the unique index columns and then run the below query on the source to find
out the duplicate rows.
If the index definition is like
create unique index <index_name> on targettable(col1,col2,col3);
then run
select col1,col2,col3,count(1)
from sourcetable
group by col1,col2,col3
having count(1)>1
Either you have to delete those records from the source, or use an Aggregator in the Informatica
mapping (a SQL sketch for removing the duplicates is shown below).
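
If the duplicates have to be removed on the database side instead, this is a minimal sketch that keeps
one row per key and deletes the rest (table and column names follow the illustrative ones above):

DELETE FROM sourcetable s
WHERE  s.ROWID NOT IN (SELECT MIN(t.ROWID)
                       FROM   sourcetable t
                       GROUP  BY t.col1, t.col2, t.col3);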
Posted 14th October 2011 by Prafull Dangore
0

Add a comment

Oct
13

Validating Multiple Sessions in Informatica


Scenario:

Is there any way of validating multiple workflows and their respective sessions at the same time
in Informatica? Validating them separately is tedious.
Solution:
The best approach is to create worklets instead of workflows, then put a set of sessions in each
worklet and call all those worklets in a single workflow.
After doing this, you can validate the workflow which contains the multiple worklets to validate
multiple sessions.

Posted 13th October 2011 by Prafull Dangore


0

Add a comment

Oct
13

Informatica Workflow Execution based on Conditions
Scenario:
I have a table which contains a single row which has a column ABC. The value of ABC defines
different scenarios.
For ex. if the value of ABC is say 1, 1st workflow should be executed; if 2, 2nd workflow should
be executed and so on.
Solution:
If there are a few values 1, 2, 3 for ABC,

then we can have a filter in the mapping, having the source table with column ABC.
Filter the records with the conditions ABC=1, ABC=2, ABC=3 and load the target tables
in three different mappings.
Create three different sessions and then use a Decision task at the workflow level, as:
if TgtSuccessRows = 1 for session1, then run worklet1;
if TgtSuccessRows = 1 for session2, then run worklet2;
if TgtSuccessRows = 1 for session3, then run worklet3.
Posted 13th October 2011 by Prafull Dangore
0

Add a comment

Oct
13

Finding Objects in Checkout Status


Scenario:
So, does anyone know of a way to find what objects are in checkout status and who has it
checked out?
Solution:
Under the Repository database there must be folders that you have created. Open a folder,
then right click and go to Versioning -> Find Checkouts -> All Users.
This will show the history of changes made and saved on that particular code. It will show you
details such as the last check-out, last-saved time, saved-by, etc.
Posted 13th October 2011 by Prafull Dangore
0

Add a comment

Oct
13

Convert date as per specific region


Scenario : Convert date as per specific region
Solution:
Specifying an NLS parameter for an SQL function means that any User Session NLS parameters
(or the lack of) will not affect evaluation of the function.
This feature may be important for SQL statements that contain numbers and dates as string

literals. For example, the following query is evaluated correctly only if the language specified for
dates is American:
SELECT ENAME FROM EMP
WHERE HIREDATE > '1-JAN-01'

This can be made independent of the current date language by specifying


NLS_DATE_LANGUAGE:
SELECT ENAME FROM EMP
WHERE HIREDATE > TO_DATE('1-JAN-01','DD-MON-YY',
'NLS_DATE_LANGUAGE = AMERICAN')
Using all numerics is also language-independent:
SELECT ENAME FROM EMP
WHERE HIREDATE > TO_DATE('1-01-01','DD-MM-YY')

NLS settings include Character set, Language and Territory


Common character sets:

WE8ISO8859P15 - European English, includes the euro character
US7ASCII - American English

The DATE datatype always stores a four-digit year internally.


If you use the standard date format DD-MON-YY
YY will assume a year in the range 1900-1999 - it is strongly recommended you apply a specific
format mask.
Posted 13th October 2011 by Prafull Dangore
0

Add a comment

Oct
13

Compare the Total Number of Rows in a Flat File with the Footer of the Flat File
Scenario : I have a requirement where I need to find the number of rows in the flat file and then
compare the row count with the row count mentioned in the footer of the flat file.
Solution :
Using Informatica:
I believe you can identify the data records from the trailer record. You can use the following method
to identify the count of the records:
1. Use a Router to create two data streams: one for data records and the other for the trailer record.
2. Use an Aggregator (without defining any group key) and use the count() aggregate function;
now both data streams will have a single record.
3. Use a Joiner to get one record from these two data streams;
it will give you two different count ports in a single record.
4. Use an Expression for comparing the counts and proceed as per your rules.
Using UNIX:
If you are on Unix, then go for a couple of lines of script or commands:
Count the number of lines in the file with wc -l. Assign the count to a variable x = (wc -l) - 1,
i.e. neglecting the footer record.
Grep the number of records from the footer using grep/sed. Assign it to a variable y.
Now compare both these variables and take the decision.

Posted 13th October 2011 by Prafull Dangore


0

Add a comment

Oct
11

Loading Multiple Flat Files using one mapping
Scenario:
Can any one explain that how can we load multiple flat files using one mapping
Solution:
Use Indirect option in session properties and give file_list name. In the file list you can have
actual file names with complete path.
Ex: In Session Properties SourceFileType --- Indirect and File Name ABC.txt
ABC.txt will contain all the input file names with complete path.
like
/home/.../...filename.dat
/home/.../...filename1.dat
Posted 11th October 2011 by Prafull Dangore
0

Add a comment

Oct

11

Capture filename while using indirect file


Scenario : I have 5 source files which I am planning to load using indirect file as they all are of
same format and go to same target table. One requirement is to capture the source file name in
the target. Is there any simple way to achieve this? The filename column is there only for file
targets, not for file sources.
Solution:
Sol 1.
Effective with PowerCenter 8.5 there is an option called Add Currently Processed Flat File Name
Port.
If this flat file source option is selected, the file name port will be added in the ports of the
source.
To add the CurrentlyProcessedFileName port:
1. Open the flat file source definition in the Source Analyzer.
2. Click the Properties tab.
3. Select Add Currently Processed Flat File Name Port.
The Designer adds the CurrentlyProcessedFileName port as the last column on the Columns tab.
The CurrentlyProcessedFileName port is a string port with default precision of 256 characters.
4. Click the Columns tab to see your changes.
You may change the precision of the CurrentlyProcessedFileName port if you wish.
5. To remove the CurrentlyProcessedFileName port, click the Properties tab and clear the Add
Currently Processed Flat File Name Port check box.
For previous versions a shell script or batch file can be used in a pre-session command task.
------- Short desc
Double click flat file source and Go to Properties tab and check "Add Currently Processed Flat
File Name Port" check box. This will add a column "CurrentlyProcessedFileName" in flat file
columns list. So simple, Isn't it?
or
Sol 2.
You can append the filename to the file using a shell script:

#!/bin/ksh
# Append the file name (TAB-delimited) to every line of each .CSV file.
for a in $(ls *.CSV)
do
   sed -i -e "s/$/\t$a/g" "$a"
done
exit 0

This adds the file name to all .CSV files. Here the delimiter is TAB; you can change the script
according to your spec.

Posted 11th October 2011 by Prafull Dangore


0

Add a comment

Oct
11

CurrentlyProcessedFileName port in source coming as NULL
Issue - The "CurrentlyProcessedFileName" port is coming properly in test environment. I
imported the same objects into stage. But in stage, "CurrentlyProcessedFileName" port is always
coming as NULL.
Solution 1. Edit source defn by removing CurrentlyProcessedFileName port and add again, this should
solve your problem.
Posted 11th October 2011 by Prafull Dangore

Add a comment

Apr
19

How PL/SQL Exceptions Are Raised ?


How PL/SQL Exceptions Are Raised
Internal exceptions are raised implicitly by the run-time system, as are user-defined
exceptions that you have associated with an Oracle error number using EXCEPTION_INIT.
However, other user-defined exceptions must be raised explicitly by RAISE statements.
Raising Exceptions with the RAISE Statement
PL/SQL blocks and subprograms should raise an exception only when an error makes it
undesirable or impossible to finish processing. You can place RAISE statements for a given
exception anywhere within the scope of that exception. In Example 10-6, you alert your
PL/SQL block to a user-defined exception named out_of_stock.
Example 10-6 Using RAISE to Force a User-Defined Exception
DECLARE
out_of_stock EXCEPTION;
number_on_hand NUMBER := 0;
BEGIN
IF number_on_hand < 1 THEN
RAISE out_of_stock; -- raise an exception that we defined
END IF;
EXCEPTION
WHEN out_of_stock THEN
-- handle the error
DBMS_OUTPUT.PUT_LINE('Encountered out-of-stock error.');
END;
/
You can also raise a predefined exception explicitly. That way, an exception handler written
for the predefined exception can process other errors, as Example 10-7 shows:
Example 10-7 Using RAISE to Force a Pre-Defined Exception
DECLARE
acct_type INTEGER := 7;
BEGIN
IF acct_type NOT IN (1, 2, 3) THEN
RAISE INVALID_NUMBER; -- raise predefined exception
END IF;
EXCEPTION
WHEN INVALID_NUMBER THEN
DBMS_OUTPUT.PUT_LINE('HANDLING INVALID INPUT BY ROLLING BACK.');
ROLLBACK;
END;
/
How PL/SQL Exceptions Propagate

When an exception is raised, if PL/SQL cannot find a handler for it in the current block or
subprogram, the exception propagates. That is, the exception reproduces itself in successive
enclosing blocks until a handler is found or there are no more blocks to search. If no handler
is found, PL/SQL returns an unhandled exception error to the host environment.
Exceptions cannot propagate across remote procedure calls done through database links. A
PL/SQL block cannot catch an exception raised by a remote subprogram. For a workaround,
see "Defining Your Own Error Messages: Procedure RAISE_APPLICATION_ERROR".
Figure 10-1, Figure 10-2, and Figure 10-3 illustrate the basic propagation rules.
Figure 10-1 Propagation Rules: Example 1
Figure 10-2 Propagation Rules: Example 2
Figure 10-3 Propagation Rules: Example 3
(Illustrations of the propagation rules are omitted here.)

An exception can propagate beyond its scope, that is, beyond the block in which it was
declared, as shown in Example 10-8.
Example 10-8 Scope of an Exception
BEGIN
DECLARE ---------- sub-block begins
past_due EXCEPTION;
due_date DATE := trunc(SYSDATE) - 1;
todays_date DATE := trunc(SYSDATE);
BEGIN
IF due_date < todays_date THEN
RAISE past_due;
END IF;
END; ------------- sub-block ends
EXCEPTION
WHEN OTHERS THEN
ROLLBACK;
END;
/
Because the block that declares the exception past_due has no handler for it, the exception
propagates to the enclosing block. But the enclosing block cannot reference the name
PAST_DUE, because the scope where it was declared no longer exists. Once the exception
name is lost, only an OTHERS handler can catch the exception. If there is no handler for a
user-defined exception, the calling application gets this error:
ORA-06510: PL/SQL: unhandled user-defined exception
Reraising a PL/SQL Exception
Sometimes, you want to reraise an exception, that is, handle it locally, then pass it to an
enclosing block. For example, you might want to roll back a transaction in the current block,
then log the error in an enclosing block.
To reraise an exception, use a RAISE statement without an exception name, which is allowed
only in an exception handler:
Example 10-9 Reraising a PL/SQL Exception

DECLARE
salary_too_high EXCEPTION;
current_salary NUMBER := 20000;
max_salary NUMBER := 10000;
erroneous_salary NUMBER;
BEGIN
BEGIN ---------- sub-block begins
IF current_salary > max_salary THEN
RAISE salary_too_high; -- raise the exception
END IF;
EXCEPTION
WHEN salary_too_high THEN
-- first step in handling the error
DBMS_OUTPUT.PUT_LINE('Salary ' || erroneous_salary || ' is out of range.');
DBMS_OUTPUT.PUT_LINE('Maximum salary is ' || max_salary || '.');
RAISE; -- reraise the current exception
END; ------------ sub-block ends
EXCEPTION
WHEN salary_too_high THEN
-- handle the error more thoroughly
erroneous_salary := current_salary;
current_salary := max_salary;
DBMS_OUTPUT.PUT_LINE('Revising salary from ' || erroneous_salary ||
' to ' || current_salary || '.');
END;
/
Handling Raised PL/SQL Exceptions
When an exception is raised, normal execution of your PL/SQL block or subprogram stops
and control transfers to its exception-handling part, which is formatted as follows:
EXCEPTION
WHEN exception1 THEN -- handler for exception1
sequence_of_statements1
WHEN exception2 THEN -- another handler for exception2
sequence_of_statements2
...
WHEN OTHERS THEN -- optional handler for all other errors
sequence_of_statements3
END;
To catch raised exceptions, you write exception handlers. Each handler consists of a WHEN
clause, which specifies an exception, followed by a sequence of statements to be executed
when that exception is raised. These statements complete execution of the block or
subprogram; control does not return to where the exception was raised. In other words, you
cannot resume processing where you left off.
The optional OTHERS exception handler, which is always the last handler in a block or
subprogram, acts as the handler for all exceptions not named specifically. Thus, a block or
subprogram can have only one OTHERS handler. Use of the OTHERS handler guarantees that
no exception will go unhandled.
If you want two or more exceptions to execute the same sequence of statements, list the
exception names in the WHEN clause, separating them by the keyword OR, as follows:
EXCEPTION
WHEN over_limit OR under_limit OR VALUE_ERROR THEN
-- handle the error

If any of the exceptions in the list is raised, the associated sequence of statements is
executed. The keyword OTHERS cannot appear in the list of exception names; it must appear
by itself. You can have any number of exception handlers, and each handler can associate a
list of exceptions with a sequence of statements. However, an exception name can appear
only once in the exception-handling part of a PL/SQL block or subprogram.
The usual scoping rules for PL/SQL variables apply, so you can reference local and global
variables in an exception handler. However, when an exception is raised inside a cursor FOR
loop, the cursor is closed implicitly before the handler is invoked. Therefore, the values of
explicit cursor attributes are not available in the handler.
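For example, here is a minimal sketch (assuming the standard HR EMPLOYEES table) of an exception escaping a cursor FOR loop; the explicit cursor is already closed by the time the handler runs, so its attributes cannot be examined there:
DECLARE
   CURSOR c_emp IS
      SELECT employee_id, salary, commission_pct FROM employees;
   v_ratio NUMBER;
BEGIN
   FOR r IN c_emp LOOP
      v_ratio := r.salary / r.commission_pct;  -- raises ZERO_DIVIDE when commission_pct is 0
   END LOOP;
EXCEPTION
   WHEN ZERO_DIVIDE THEN
      -- The cursor FOR loop closed c_emp implicitly before this handler ran,
      -- so referencing c_emp%ROWCOUNT here would raise INVALID_CURSOR.
      DBMS_OUTPUT.PUT_LINE('Error inside the loop; cursor attributes are no longer available.');
END;
/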
Exceptions Raised in Declarations
Exceptions can be raised in declarations by faulty initialization expressions. For example, the
following declaration raises an exception because the constant credit_limit cannot store
numbers larger than 999:
Example 10-10 Raising an Exception in a Declaration
DECLARE
credit_limit CONSTANT NUMBER(3) := 5000; -- raises an error
BEGIN
NULL;
EXCEPTION
WHEN OTHERS THEN
-- Cannot catch the exception. This handler is never called.
DBMS_OUTPUT.PUT_LINE('Can''t handle an exception in a declaration.');
END;
/
Handlers in the current block cannot catch the raised exception because an exception raised
in a declaration propagates immediately to the enclosing block.
Handling Exceptions Raised in Handlers
When an exception occurs within an exception handler, that same handler cannot catch the
exception. An exception raised inside a handler propagates immediately to the enclosing
block, which is searched to find a handler for this new exception. From there on, the
exception propagates normally. For example:
EXCEPTION
WHEN INVALID_NUMBER THEN
INSERT INTO ... -- might raise DUP_VAL_ON_INDEX
WHEN DUP_VAL_ON_INDEX THEN ... -- cannot catch the exception
END;
Branching to or from an Exception Handler
A GOTO statement can branch from an exception handler into an enclosing block.
A GOTO statement cannot branch into an exception handler, or from an exception handler
into the current block.
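A minimal sketch of the legal direction: a GOTO in an inner block's handler branching to a label in the enclosing block (the reverse, branching into a handler, is not allowed):
BEGIN
   BEGIN
      RAISE NO_DATA_FOUND;
   EXCEPTION
      WHEN NO_DATA_FOUND THEN
         GOTO clean_up;  -- branch from the handler into the enclosing block
   END;
   <<clean_up>>
   DBMS_OUTPUT.PUT_LINE('Continuing in the enclosing block after the handler.');
END;
/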
Retrieving the Error Code and Error Message: SQLCODE and SQLERRM
In an exception handler, you can use the built-in functions SQLCODE and SQLERRM to find
out which error occurred and to get the associated error message. For internal exceptions,
SQLCODE returns the number of the Oracle error. The number that SQLCODE returns is
negative unless the Oracle error is no data found, in which case SQLCODE returns +100.
SQLERRM returns the corresponding error message. The message begins with the Oracle
error code.
For user-defined exceptions, SQLCODE returns +1 and SQLERRM returns the message User-Defined Exception unless you used the pragma EXCEPTION_INIT to associate the exception
name with an Oracle error number, in which case SQLCODE returns that error number and
SQLERRM returns the corresponding error message. The maximum length of an Oracle error

message is 512 characters including the error code, nested messages, and message inserts
such as table and column names.
If no exception has been raised, SQLCODE returns zero and SQLERRM returns the message:
ORA-0000: normal, successful completion.
You can pass an error number to SQLERRM, in which case SQLERRM returns the message
associated with that error number. Make sure you pass negative error numbers to SQLERRM.
Passing a positive number to SQLERRM always returns the message user-defined exception
unless you pass +100, in which case SQLERRM returns the message no data found. Passing
a zero to SQLERRM always returns the message normal, successful completion.
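For example, these calls (which can run anywhere in PL/SQL, not just in a handler) look up the message text by error number and should print the messages described above:
BEGIN
   DBMS_OUTPUT.PUT_LINE(SQLERRM(-60));  -- ORA-00060: deadlock detected while waiting for resource
   DBMS_OUTPUT.PUT_LINE(SQLERRM(100));  -- the "no data found" message
   DBMS_OUTPUT.PUT_LINE(SQLERRM(0));    -- normal, successful completion
END;
/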
You cannot use SQLCODE or SQLERRM directly in a SQL statement. Instead, you must assign
their values to local variables, then use the variables in the SQL statement, as shown in
Example 10-11.
Example 10-11 Displaying SQLCODE and SQLERRM
CREATE TABLE errors (code NUMBER, message VARCHAR2(64), happened TIMESTAMP);
DECLARE
name employees.last_name%TYPE;
v_code NUMBER;
v_errm VARCHAR2(64);
BEGIN
SELECT last_name INTO name FROM employees WHERE employee_id = -1;
EXCEPTION
WHEN OTHERS THEN
v_code := SQLCODE;
v_errm := SUBSTR(SQLERRM, 1 , 64);
DBMS_OUTPUT.PUT_LINE('Error code ' || v_code || ': ' || v_errm);
-- Normally we would call another procedure, declared with PRAGMA
-- AUTONOMOUS_TRANSACTION, to insert information about errors.
INSERT INTO errors VALUES (v_code, v_errm, SYSTIMESTAMP);
END;
/
The string function SUBSTR ensures that a VALUE_ERROR exception (for truncation) is not raised when you assign the value of SQLERRM to v_errm. The functions SQLCODE and
SQLERRM are especially useful in the OTHERS exception handler because they tell you which
internal exception was raised.
When using pragma RESTRICT_REFERENCES to assert the purity of a stored function, you
cannot specify the constraints WNPS and RNPS if the function calls SQLCODE or SQLERRM.
Catching Unhandled Exceptions
Remember, if it cannot find a handler for a raised exception, PL/SQL returns an unhandled
exception error to the host environment, which determines the outcome. For example, in the
Oracle Precompilers environment, any database changes made by a failed SQL statement or
PL/SQL block are rolled back.
Unhandled exceptions can also affect subprograms. If you exit a subprogram successfully,
PL/SQL assigns values to OUT parameters. However, if you exit with an unhandled exception,
PL/SQL does not assign values to OUT parameters (unless they are NOCOPY parameters).
Also, if a stored subprogram fails with an unhandled exception, PL/SQL does not roll back
database work done by the subprogram.
You can avoid unhandled exceptions by coding an OTHERS handler at the topmost level of
every PL/SQL program.
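A minimal sketch (with a hypothetical procedure SET_FLAG) illustrating that an unhandled exception prevents the OUT value from being copied back to the caller:
CREATE OR REPLACE PROCEDURE set_flag (p_flag OUT NUMBER) AS
BEGIN
   p_flag := 1;
   RAISE PROGRAM_ERROR;  -- subprogram exits with an unhandled exception
END;
/
DECLARE
   v_flag NUMBER := 0;
BEGIN
   set_flag(v_flag);
EXCEPTION
   WHEN OTHERS THEN
      -- v_flag is still 0: PL/SQL did not assign the OUT parameter (it is not NOCOPY)
      DBMS_OUTPUT.PUT_LINE('v_flag = ' || v_flag);
END;
/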
Tips for Handling PL/SQL Errors
In this section, you learn techniques that increase flexibility.
Continuing after an Exception Is Raised
An exception handler lets you recover from an otherwise fatal error before exiting a block.
But when the handler completes, the block is terminated. You cannot return to the current

block from an exception handler. In the following example, if the SELECT INTO statement
raises ZERO_DIVIDE, you cannot resume with the INSERT statement:
CREATE TABLE employees_temp AS
SELECT employee_id, salary, commission_pct FROM employees;
DECLARE
sal_calc NUMBER(8,2);
BEGIN
INSERT INTO employees_temp VALUES (301, 2500, 0);
SELECT salary / commission_pct INTO sal_calc FROM employees_temp
WHERE employee_id = 301;
INSERT INTO employees_temp VALUES (302, sal_calc/100, .1);
EXCEPTION
WHEN ZERO_DIVIDE THEN
NULL;
END;
/
You can still handle an exception for a statement, then continue with the next statement.
Place the statement in its own sub-block with its own exception handlers. If an error occurs
in the sub-block, a local handler can catch the exception. When the sub-block ends, the
enclosing block continues to execute at the point where the sub-block ends, as shown in
Example 10-12.
Example 10-12 Continuing After an Exception
DECLARE
sal_calc NUMBER(8,2);
BEGIN
INSERT INTO employees_temp VALUES (303, 2500, 0);
BEGIN -- sub-block begins
SELECT salary / commission_pct INTO sal_calc FROM employees_temp
WHERE employee_id = 301;
EXCEPTION
WHEN ZERO_DIVIDE THEN
sal_calc := 2500;
END; -- sub-block ends
INSERT INTO employees_temp VALUES (304, sal_calc/100, .1);
EXCEPTION
WHEN ZERO_DIVIDE THEN
NULL;
END;
/
In this example, if the SELECT INTO statement raises a ZERO_DIVIDE exception, the local
handler catches it and sets sal_calc to 2500. Execution of the handler is complete, so the
sub-block terminates, and execution continues with the INSERT statement. See also Example
5-38, "Collection Exceptions".
You can also perform a sequence of DML operations where some might fail, and process the
exceptions only after the entire operation is complete, as described in "Handling FORALL
Exceptions with the %BULK_EXCEPTIONS Attribute".
Retrying a Transaction
After an exception is raised, rather than abandon your transaction, you might want to retry
it. The technique is:
1. Encase the transaction in a sub-block.
2. Place the sub-block inside a loop that repeats the transaction.

3.

Before starting the transaction, mark a savepoint. If the transaction succeeds, commit, then
exit from the loop. If the transaction fails, control transfers to the exception handler, where
you roll back to the savepoint undoing any changes, then try to fix the problem.
In Example 10-13, the INSERT statement might raise an exception because of a duplicate
value in a unique column. In that case, we change the value that needs to be unique and
continue with the next loop iteration. If the INSERT succeeds, we exit from the loop
immediately. With this technique, you should use a FOR or WHILE loop to limit the number of
attempts.
Example 10-13 Retrying a Transaction After an Exception
CREATE TABLE results ( res_name VARCHAR(20), res_answer VARCHAR2(3) );
CREATE UNIQUE INDEX res_name_ix ON results (res_name);
INSERT INTO results VALUES ('SMYTHE', 'YES');
INSERT INTO results VALUES ('JONES', 'NO');
DECLARE
name VARCHAR2(20) := 'SMYTHE';
answer VARCHAR2(3) := 'NO';
suffix NUMBER := 1;
BEGIN
FOR i IN 1..5 LOOP -- try 5 times
BEGIN -- sub-block begins
SAVEPOINT start_transaction; -- mark a savepoint
/* Remove rows from a table of survey results. */
DELETE FROM results WHERE res_answer = 'NO';
/* Add a survey respondent's name and answers. */
INSERT INTO results VALUES (name, answer);
-- raises DUP_VAL_ON_INDEX if two respondents have the same name
COMMIT;
EXIT;
EXCEPTION
WHEN DUP_VAL_ON_INDEX THEN
ROLLBACK TO start_transaction; -- undo changes
suffix := suffix + 1;
-- try to fix problem
name := name || TO_CHAR(suffix);
END; -- sub-block ends
END LOOP;
END;
/
Using Locator Variables to Identify Exception Locations
Using one exception handler for a sequence of statements, such as INSERT, DELETE, or
UPDATE statements, can mask the statement that caused an error. If you need to know
which statement failed, you can use a locator variable:
Example 10-14 Using a Locator Variable to Identify the Location of an Exception
CREATE OR REPLACE PROCEDURE loc_var AS
stmt_no NUMBER;
name VARCHAR2(100);
BEGIN
stmt_no := 1; -- designates 1st SELECT statement
SELECT table_name INTO name FROM user_tables WHERE table_name LIKE 'ABC%';
stmt_no := 2; -- designates 2nd SELECT statement
SELECT table_name INTO name FROM user_tables WHERE table_name LIKE 'XYZ%';
EXCEPTION
WHEN NO_DATA_FOUND THEN
DBMS_OUTPUT.PUT_LINE('Table name not found in query ' || stmt_no);

END;
/
CALL loc_var();

Overview of PL/SQL Compile-Time Warnings


To make your programs more robust and avoid problems at run time, you can turn on
checking for certain warning conditions. These conditions are not serious enough to produce
an error and keep you from compiling a subprogram. They might point out something in the
subprogram that produces an undefined result or might create a performance problem.
To work with PL/SQL warning messages, you use the PLSQL_WARNINGS initialization
parameter, the DBMS_WARNING package, and the USER/DBA/ALL_PLSQL_OBJECT_SETTINGS
views.
PL/SQL Warning Categories
PL/SQL warning messages are divided into categories, so that you can suppress or display
groups of similar warnings during compilation. The categories are:
SEVERE: Messages for conditions that might cause unexpected behavior or wrong results,
such as aliasing problems with parameters.
PERFORMANCE: Messages for conditions that might cause performance problems, such as
passing a VARCHAR2 value to a NUMBER column in an INSERT statement.
INFORMATIONAL: Messages for conditions that do not have an effect on performance or
correctness, but that you might want to change to make the code more maintainable, such
as unreachable code that can never be executed.
The keyword ALL is a shorthand way to refer to all warning messages.
You can also treat particular messages as errors instead of warnings. For example, if you
know that the warning message PLW-05003 represents a serious problem in your code,
including 'ERROR:05003' in the PLSQL_WARNINGS setting makes that condition trigger an
error message (PLS_05003) instead of a warning message. An error message causes the
compilation to fail.
Controlling PL/SQL Warning Messages
To let the database issue warning messages during PL/SQL compilation, you set the
initialization parameter PLSQL_WARNINGS. You can enable and disable entire categories of
warnings (ALL, SEVERE, INFORMATIONAL, PERFORMANCE), enable and disable specific
message numbers, and make the database treat certain warnings as compilation errors so
that those conditions must be corrected.
This parameter can be set at the system level or the session level. You can also set it for a
single compilation by including it as part of the ALTER PROCEDURE ... COMPILE statement.
You might turn on all warnings during development, turn off all warnings when deploying for
production, or turn on some warnings when working on a particular subprogram where you
are concerned with some aspect, such as unnecessary code or performance.
Example 10-15 Controlling the Display of PL/SQL Warnings
-- To focus on one aspect
ALTER SESSION SET PLSQL_WARNINGS='ENABLE:PERFORMANCE';
-- Recompile with extra checking
ALTER PROCEDURE loc_var COMPILE PLSQL_WARNINGS='ENABLE:PERFORMANCE'
REUSE SETTINGS;
-- To turn off all warnings
ALTER SESSION SET PLSQL_WARNINGS='DISABLE:ALL';
-- Display 'severe' warnings, don't want 'performance' warnings, and
-- want PLW-06002 warnings to produce errors that halt compilation
ALTER SESSION SET PLSQL_WARNINGS='ENABLE:SEVERE', 'DISABLE:PERFORMANCE',
'ERROR:06002';
-- For debugging during development
ALTER SESSION SET PLSQL_WARNINGS='ENABLE:ALL';

Warning messages can be issued during compilation of PL/SQL subprograms; anonymous blocks do not produce any warnings.
The settings for the PLSQL_WARNINGS parameter are stored along with each compiled
subprogram. If you recompile the subprogram with a CREATE OR REPLACE statement, the
current settings for that session are used. If you recompile the subprogram with an ALTER ...
COMPILE statement, the current session setting might be used, or the original setting that
was stored with the subprogram, depending on whether you include the REUSE SETTINGS
clause in the statement. For more information, see ALTER FUNCTION, ALTER PACKAGE, and
ALTER PROCEDURE in Oracle Database SQL Reference.
To see any warnings generated during compilation, you use the SQL*Plus SHOW ERRORS
command or query the USER_ERRORS data dictionary view. PL/SQL warning messages all
use the prefix PLW.
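For example, a simple query sketch against USER_ERRORS (using the LOC_VAR procedure compiled earlier) that lists only the recorded warnings:
SELECT line, position, text
  FROM user_errors
 WHERE name = 'LOC_VAR'
   AND attribute = 'WARNING'
 ORDER BY sequence;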
Using the DBMS_WARNING Package
If you are writing a development environment that compiles PL/SQL subprograms, you can
control PL/SQL warning messages by calling subprograms in the DBMS_WARNING package.
You might also use this package when compiling a complex application, made up of several
nested SQL*Plus scripts, where different warning settings apply to different subprograms.
You can save the current state of the PLSQL_WARNINGS parameter with one call to the
package, change the parameter to compile a particular set of subprograms, then restore the
original parameter value.
For example, Example 10-16 is a procedure with unnecessary code that could be removed. It
could represent a mistake, or it could be intentionally hidden by a debug flag, so you might
or might not want a warning message for it.
Example 10-16 Using the DBMS_WARNING Package to Display Warnings
-- When warnings disabled, the following procedure compiles with no warnings
CREATE OR REPLACE PROCEDURE unreachable_code AS
x CONSTANT BOOLEAN := TRUE;
BEGIN
IF x THEN
DBMS_OUTPUT.PUT_LINE('TRUE');
ELSE
DBMS_OUTPUT.PUT_LINE('FALSE');
END IF;
END unreachable_code;
/
-- enable all warning messages for this session
CALL DBMS_WARNING.set_warning_setting_string('ENABLE:ALL' ,'SESSION');
-- Check the current warning setting
SELECT DBMS_WARNING.get_warning_setting_string() FROM DUAL;
-- Recompile the procedure and a warning about unreachable code displays
ALTER PROCEDURE unreachable_code COMPILE;
SHOW ERRORS;
In Example 10-16, you could have used the following ALTER PROCEDURE without the call to
DBMS_WARNING.set_warning_setting_string:
ALTER PROCEDURE unreachable_code COMPILE
PLSQL_WARNINGS = 'ENABLE:ALL' REUSE SETTINGS;

Posted 19th April 2011 by Prafull Dangore


0

Add a comment

Apr
19

What are the different types of pragma and where can we use them?
===============================
===============================
==========
What are the different types of pragma and where can we use them?
Pragma is a keyword in Oracle PL/SQL that is used to provide an instruction to the compiler.
The syntax for a pragma is as follows:
PRAGMA <instruction>;
The instruction is a statement that provides a directive to the compiler.
Pragmas are defined in the declarative section in PL/SQL.
The following pragmas are available:
AUTONOMOUS_TRANSACTION:
Prior to Oracle 8.1, each Oracle session in PL/SQL could have at most one active transaction
at a given time. In other words, changes were all or nothing. Oracle8i PL/SQL addresses that
shortcoming with the AUTONOMOUS_TRANSACTION pragma. This pragma lets a PL/SQL block
between a BEGIN and END statement run as an autonomous transaction without affecting the
enclosing transaction. For instance, if a rollback or commit needs to take place within the
block without affecting the transaction outside the block, this type of pragma can be used.
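A minimal sketch of the pragma, assuming a hypothetical ERROR_LOG table; the COMMIT inside the procedure affects only the autonomous transaction, not the caller's:
CREATE OR REPLACE PROCEDURE log_error (p_msg VARCHAR2) AS
   PRAGMA AUTONOMOUS_TRANSACTION;
BEGIN
   INSERT INTO error_log (logged_at, message)   -- error_log is a hypothetical table
   VALUES (SYSTIMESTAMP, p_msg);
   COMMIT;  -- commits only this block's work; the caller's transaction is untouched
END;
/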
EXCEPTION_INIT:
The most commonly used pragma, this is used to bind a user defined exception to a
particular error number.
For example:
DECLARE
I_GIVE_UP EXCEPTION;
PRAGMA EXCEPTION_INIT(I_GIVE_UP, -20000);
BEGIN
..
EXCEPTION
WHEN I_GIVE_UP THEN
-- do something
END;
RESTRICT_REFERENCES:
Defines the purity level of a packaged program. This is not required starting with Oracle8i.
Prior to Oracle8i if you were to invoke a function within a package specification from a SQL
statement, you would have to provide a RESTRICT_REFERENCE directive to the PL/SQL
engine for that function.
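A minimal sketch of the directive for a hypothetical packaged function (WNDS = writes no database state, WNPS = writes no package state); again, this is only needed in pre-Oracle8i style code:
CREATE OR REPLACE PACKAGE pay_pkg AS
   FUNCTION net_salary (p_gross NUMBER) RETURN NUMBER;
   PRAGMA RESTRICT_REFERENCES (net_salary, WNDS, WNPS);   -- purity assertion for net_salary
END pay_pkg;
/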

Associating a PL/SQL Exception with a Number: Pragma EXCEPTION_INIT


To handle error conditions (typically ORA- messages) that have no predefined name, you
must use the OTHERS handler or the pragma EXCEPTION_INIT. A pragma is a compiler
directive that is processed at compile time, not at run time.
In PL/SQL, the pragma EXCEPTION_INIT tells the compiler to associate an exception name
with an Oracle error number. That lets you refer to any internal exception by name and to
write a specific handler for it. When you see an error stack, or sequence of error messages,
the one on top is the one that you can trap and handle.
You code the pragma EXCEPTION_INIT in the declarative part of a PL/SQL block, subprogram,
or package using the syntax
PRAGMA EXCEPTION_INIT(exception_name, -Oracle_error_number);
where exception_name is the name of a previously declared exception and the number is a
negative value corresponding to an ORA- error number. The pragma must appear
somewhere after the exception declaration in the same declarative section, as shown in
Example 10-4.
Example 10-4 Using PRAGMA EXCEPTION_INIT
DECLARE
deadlock_detected EXCEPTION;
PRAGMA EXCEPTION_INIT(deadlock_detected, -60);
BEGIN
NULL; -- Some operation that causes an ORA-00060 error
EXCEPTION
WHEN deadlock_detected THEN
NULL; -- handle the error
END;
/
Defining Your Own Error Messages: Procedure RAISE_APPLICATION_ERROR
The procedure RAISE_APPLICATION_ERROR lets you issue user-defined ORA- error messages
from stored subprograms. That way, you can report errors to your application and avoid
returning unhandled exceptions.
To call RAISE_APPLICATION_ERROR, use the syntax
raise_application_error(
error_number, message[, {TRUE | FALSE}]);
where error_number is a negative integer in the range -20000 .. -20999 and message is a
character string up to 2048 bytes long. If the optional third parameter is TRUE, the error is
placed on the stack of previous errors. If the parameter is FALSE (the default), the error
replaces all previous errors. RAISE_APPLICATION_ERROR is part of package
DBMS_STANDARD, and as with package STANDARD, you do not need to qualify references to
it.
An application can call raise_application_error only from an executing stored subprogram (or
method). When called, raise_application_error ends the subprogram and returns a user-defined error number and message to the application. The error number and message can
be trapped like any Oracle error.
In Example 10-5, you call raise_application_error if an error condition of your choosing
happens (in this case, if the current schema owns less than 1000 tables):
Example 10-5 Raising an Application Error With raise_application_error
DECLARE
num_tables NUMBER;
BEGIN
SELECT COUNT(*) INTO num_tables FROM USER_TABLES;
IF num_tables < 1000 THEN
/* Issue your own error code (ORA-20101) with your own error message.
Note that you do not need to qualify raise_application_error with

DBMS_STANDARD */
raise_application_error(-20101, 'Expecting at least 1000 tables');
ELSE
NULL; -- Do the rest of the processing (for the non-error case).
END IF;
END;
/
The calling application gets a PL/SQL exception, which it can process using the error-reporting functions SQLCODE and SQLERRM in an OTHERS handler. Also, it can use the
pragma EXCEPTION_INIT to map specific error numbers returned by raise_application_error
to exceptions of its own, as the following Pro*C example shows:
EXEC SQL EXECUTE
/* Execute embedded PL/SQL block using host
variables v_emp_id and v_amount, which were
assigned values in the host environment. */
DECLARE
null_salary EXCEPTION;
/* Map error number returned by raise_application_error
to user-defined exception. */
PRAGMA EXCEPTION_INIT(null_salary, -20101);
BEGIN
raise_salary(:v_emp_id, :v_amount);
EXCEPTION
WHEN null_salary THEN
INSERT INTO emp_audit VALUES (:v_emp_id, ...);
END;
END-EXEC;
This technique allows the calling application to handle error conditions in specific exception
handlers.
Redeclaring Predefined Exceptions
Remember, PL/SQL declares predefined exceptions globally in package STANDARD, so you
need not declare them yourself. Redeclaring predefined exceptions is error prone because
your local declaration overrides the global declaration. For example, if you declare an
exception named invalid_number and then PL/SQL raises the predefined exception
INVALID_NUMBER internally, a handler written for INVALID_NUMBER will not catch the
internal exception. In such cases, you must use dot notation to specify the predefined
exception, as follows:
EXCEPTION
WHEN invalid_number OR STANDARD.INVALID_NUMBER THEN
-- handle the error
END;

===============================
===============================
==========
Posted 19th April 2011 by Prafull Dangore
0

Add a comment


Apr
13

A Comparison of Oracle's DATE and TIMESTAMP Datatypes
==============================================
================
A Comparison of Oracle's DATE and TIMESTAMP Datatypes
Oracle date and time data types, calculations around these data types, and just plain how to
use them often plague Oracle users more than they should. Here is an article I wrote a while
back, but it still holds some good insight (I think) into using these data types. Hope you agree. If
you want to store date and time information in Oracle, you really only have two different
options for the column's datatype. Let's take a quick look at these two datatypes and what
they offer.
DATE datatype
This is the datatype that we are all too familiar with when we think about representing date
and time values. It has the ability to store the month, day, year, century, hours, minutes,
and seconds. It is typically good for representing data for when something has happened or
should happen in the future. The problem with the DATE datatype is its granularity when
trying to determine a time interval between two events when the events happen within a
second of each other. This issue is solved later in this article when we discuss the
TIMESTAMP datatype. In order to represent the date stored in a more readable format, the
TO_CHAR function has traditionally been wrapped around the date as in Listing A.
LISTING A:
Formatting a date
SQL> SELECT TO_CHAR(date1,'MM/DD/YYYY HH24:MI:SS') "Date" FROM date_table;
Date
-------------------
06/20/2003 16:55:14
06/26/2003 11:16:36
About the only trouble I have seen people get into when using the DATE datatype is doing
arithmetic on the column in order to figure out the number of years, weeks, days, hours, and
seconds between two dates. What needs to be realized when doing the calculation is that
when you do subtraction between dates, you get a number that represents the number of
days. You should then multiply that number by the number of seconds in a day (86400)
before you continue with calculations to determine the interval with which you are
concerned. Check out Listing B for my solution on how to extract the individual time
intervals for a subtraction of two dates. I am aware that the fractions could be reduced but I
wanted to show all the numbers to emphasize the calculation.
LISTING B:
Determine the interval breakdown between two dates for a DATE datatype
SELECT TO_CHAR(date1,'MMDDYYYY:HH24:MI:SS') date1,
       TO_CHAR(date2,'MMDDYYYY:HH24:MI:SS') date2,
       trunc(86400*(date2-date1))-
       60*(trunc((86400*(date2-date1))/60)) seconds,
       trunc((86400*(date2-date1))/60)-
       60*(trunc(((86400*(date2-date1))/60)/60)) minutes,
       trunc(((86400*(date2-date1))/60)/60)-
       24*(trunc((((86400*(date2-date1))/60)/60)/24)) hours,
       trunc((((86400*(date2-date1))/60)/60)/24) days,
       trunc(((((86400*(date2-date1))/60)/60)/24)/7) weeks
  FROM date_table;

DATE1             DATE2                SECONDS    MINUTES      HOURS       DAYS      WEEKS
----------------- ----------------- ---------- ---------- ---------- ---------- ----------
06202003:16:55:14 07082003:11:22:57         43         27         18         17          2
06262003:11:16:36 07082003:11:22:57         21          6          0         12          1

TIMESTAMP datatype
One of the main problems with the DATE datatype was its inability to be granular enough to
determine which event might have happened first in relation to another event. Oracle has
expanded on the DATE datatype and has given us the TIMESTAMP datatype which stores all
the information that the DATE datatype stores, but also includes fractional seconds. If you
want to convert a DATE datatype to a TIMESTAMP datatype format, just use the CAST
function as I do in Listing C. As you can see, there is a fractional seconds part of '.000000' on
the end of this conversion. This is only because when converting from the DATE datatype
that does not have the fractional seconds it defaults to zeros and the display is defaulted to
the default timestamp format (NLS_TIMESTAMP_FORMAT). If you are moving a DATE datatype
column from one table to a TIMESTAMP datatype column of another table, all you need to do
is a straight INSERT ... SELECT FROM and Oracle will do the conversion for you. Look at Listing D
for a formatting of the new TIMESTAMP datatype where everything is the same as formatting
the DATE datatype as we did in Listing A. Beware while the TO_CHAR function works with
both datatypes, the TRUNC function will not work with a datatype of TIMESTAMP. This is a
clear indication that the use of TIMESTAMP datatype should explicitly be used for date and
times where a difference in time is of utmost importance, such that Oracle won't even let
you compare like values. If you wanted to show the fractional seconds within a TIMESTAMP
datatype, look at Listing E. In Listing E, we are only showing 3 place holders for the fractional
seconds.
LISTING C:
Convert DATE datatype to TIMESTAMP datatype
SQL> SELECT CAST(date1 AS TIMESTAMP) "Date" FROM t;
Date
-----------------------------
20-JUN-03 04.55.14.000000 PM
26-JUN-03 11.16.36.000000 AM
LISTING D:
Formatting of the TIMESTAMP datatype
SELECT TO_CHAR(time1,'MM/DD/YYYY HH24:MI:SS') "Date" FROM date_table;
Date
-------------------
06/20/2003 16:55:14
06/26/2003 11:16:36
LISTING E:
Formatting of the TIMESTAMP datatype with fractional seconds
SELECT TO_CHAR(time1,'MM/DD/YYYY HH24:MI:SS:FF3') "Date" FROM date_table;
Date
-----------------------
06/20/2003 16:55:14:000
06/26/2003 11:16:36:000

Calculating the time difference between two TIMESTAMP datatypes is much easier than with the
DATE datatype. Look at what happens when you just do straight subtraction of the columns in Listing F.
As you can see, the results are much easier to recognize: 17 days, 18 hours, 27 minutes, and 43 seconds
for the first row of output. This means no more worries about how many seconds are in a day and
all those cumbersome calculations. The calculations for getting the weeks, days, hours, minutes, and
seconds become a matter of picking out the number by using the SUBSTR function, as can be seen in Listing G.
LISTING F:
Straight subtraction of two TIMESTAMP datatypes
SELECT time1, time2, (time2-time1)
  FROM date_table;

TIME1                        TIME2                        (TIME2-TIME1)
---------------------------- ---------------------------- ---------------------------
06/20/2003:16:55:14:000000   07/08/2003:11:22:57:000000   +000000017 18:27:43.000000
06/26/2003:11:16:36:000000   07/08/2003:11:22:57:000000   +000000012 00:06:21.000000
LISTING G:
Determine the interval breakdown between two dates for a TIMESTAMP datatype
SELECT time1,
       time2,
       substr((time2-time1),instr((time2-time1),' ')+7,2) seconds,
       substr((time2-time1),instr((time2-time1),' ')+4,2) minutes,
       substr((time2-time1),instr((time2-time1),' ')+1,2) hours,
       trunc(to_number(substr((time2-time1),1,instr(time2-time1,' ')))) days,
       trunc(to_number(substr((time2-time1),1,instr(time2-time1,' ')))/7) weeks
  FROM date_table;

TIME1                      TIME2                      SECONDS MINUTES HOURS DAYS WEEKS
-------------------------- -------------------------- ------- ------- ----- ---- -----
06/20/2003:16:55:14:000000 07/08/2003:11:22:57:000000 43      27      18    17   2
06/26/2003:11:16:36:000000 07/08/2003:11:22:57:000000 21      06      00    12   1
System Date and Time
In order to get the system date and time returned in a DATE datatype, you can use the
SYSDATE function such as :
SQL> SELECT SYSDATE FROM DUAL;
In order to get the system date and time returned in a TIMESTAMP datatype, you can use the
SYSTIMESTAMP function such as:
SQL> SELECT SYSTIMESTAMP FROM DUAL;
You can set the initialization parameter FIXED_DATE to return a constant value for what is
returned from the SYSDATE function. This is a great tool for testing date and time sensitive
code. Just beware that this parameter has no effect on the SYSTIMESTAMP function. This can
be seen in Listing H.
LISTING H:
Setting FIXED_DATE and effects on SYSDATE and SYSTIMESTAMP
SQL> ALTER SYSTEM SET fixed_date = '2003-01-01-10:00:00';
System altered.
SQL> select sysdate from dual;
SYSDATE
---------
01-JAN-03
SQL> select systimestamp from dual;
SYSTIMESTAMP
---------------------------------------------------------

09-JUL-03 11.05.02.519000 AM -06:00


When working with date and time, the options are clear. You have at your disposal the DATE
and TIMESTAMP datatypes. Just be aware, while there are similarities, there are also
differences that could create havoc if you try to convert to the more powerful TIMESTAMP
datatype. Each of the two has strengths in simplicity and granularity. Choose wisely.

===============================
===============================
====
Posted 13th April 2011 by Prafull Dangore
0

Add a comment

Apr
13

Informatica Power Center performance Concurrent Workflow Execution
==============================================================
=============
Informatica Power Center performance Concurrent Workflow Execution
What is concurrent work flow?
A concurrent workflow is a workflow that can run as multiple instances concurrently.
What is workflow instance?
A workflow instance is a representation of a workflow.
How to configure concurrent workflow?
1) Allow concurrent workflows with the same instance name:
Configure one workflow instance to run multiple times concurrently. Each instance has the same
source, target, and variables parameters.
Eg: Create a workflow that reads data from a message queue that determines the source data and
targets. You can run the instance multiple times concurrently and pass different connection parameters
to the workflow instances from the message queue.
2) Configure unique workflow instances to run concurrently:
Define each workflow instance name and configure a workflow parameter file for the instance. You can
define different sources, targets, and variables in the parameter file.
Eg: Configure workflow instances to run a workflow with different sources and targets. For example,
your organization receives sales data from three divisions. You create a workflow that reads the sales
data and writes it to the database. You configure three instances of the workflow. Each instance has a
different workflow parameter file that defines which sales file to process. You can run all instances of
the workflow concurrently.
How concurrent workflow Works?
A concurrent workflow groups logical sessions and tasks together, like a sequential workflow, but runs
all the tasks at one time.
Advantages of Concurrent workflow?
This can reduce the load times into the warehouse, taking advantage of the hardware platform's
Symmetric Multi-Processing (SMP) architecture.


LOAD SCENARIO:
Source table records count: 150,622,276

==============================================================
=============

Posted 13th April 2011 by Prafull Dangore


0

Add a comment

Apr
13

Informatica Performance Improvement Tips


==============================================================
=============
Informatica Performance Improvement Tips
We often come across situations where the Data Transformation Manager (DTM) takes more time to read
from a Source or to write into a Target. The following standards/guidelines can improve the overall
performance.

Use a single Source Qualifier to join Source tables if they reside in the same schema

Make use of the Source Qualifier filter properties if the Source type is relational.

If the subsequent sessions are doing lookup on the same table, use persistent cache in the first
session. Data remains in the Cache and available for the subsequent session for usage.

Use flags as integer, as the integer comparison is faster than the string comparison.

Use tables with lesser number of records as master table for joins.

While reading from Flat files, define the appropriate data type instead of reading as String and
converting.

Have all Ports that are required connected to Subsequent Transformations else check whether
we can remove these ports

Suppress the generated ORDER BY by adding a comment (--) at the end of the override query in Lookup Transformations

Minimize the number of Update strategies.

Group by simple columns in transformations like Aggregate, Source Qualifier

Use Router transformation in place of multiple Filter transformations.

Turn off the Verbose Logging while moving the mappings to UAT/Production environment.

For large volume of data drop index before loading and recreate indexes after load.

For a large volume of records, use Bulk load and increase the commit interval to a higher value

Set Commit on Target in the sessions

==============================================================
=============

Posted 13th April 2011 by Prafull Dangore


0

Add a comment

Apr
13

What is Pushdown Optimization and things to consider
==============================================================
===================
What is Pushdown Optimization and things to consider
The process of pushing transformation logic to the source or target database by Informatica Integration
service is known as Pushdown Optimization. When a session is configured to run for Pushdown
Optimization, the Integration Service translates the transformation logic into SQL queries and sends
the SQL queries to the database. The Source or Target Database executes the SQL queries to process
the transformations.
How does Pushdown Optimization (PO) Works?
The Integration Service generates SQL statements when native database driver is used. In case of
ODBC drivers, the Integration Service cannot detect the database type and generates ANSI SQL. The
Integration Service can usually push more transformation logic to a database if a native driver is used,
instead of an ODBC driver.
For any SQL override, the Integration Service creates a view (PM_*) in the database while executing the
session task and drops the view after the task completes. Similarly, it also creates sequences (PM_*)
in the database.
Database schema (SQ Connection, LKP connection), should have the Create View / Create Sequence
Privilege, else the session will fail.
Few Benefits in using PO

There is no memory or disk space required to manage the cache in the Informatica server for
Aggregator, Lookup, Sorter and Joiner Transformation, as the transformation logic is pushed to
database.

SQL Generated by Informatica Integration service can be viewed before running the session
through Optimizer viewer, making easier to debug.

When inserting into Targets, the Integration Service does row-by-row processing using bind variables
(soft parse only, so only processing time and no repeated parsing time). In case of Pushdown
Optimization, the statement is executed once.
Without Using Pushdown optimization:
INSERT INTO EMPLOYEES(ID_EMPLOYEE, EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL,
PHONE_NUMBER, HIRE_DATE, JOB_ID, SALARY, COMMISSION_PCT,
MANAGER_ID,MANAGER_NAME,
DEPARTMENT_ID) VALUES (:1, :2, :3, :4, :5, :6, :7, :8, :9, :10, :11, :12, :13) executes 7012352 times
With Pushdown Optimization:
INSERT INTO EMPLOYEES (ID_EMPLOYEE, EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL,
PHONE_NUMBER, HIRE_DATE, JOB_ID, SALARY, COMMISSION_PCT, MANAGER_ID, MANAGER_NAME,
DEPARTMENT_ID)
SELECT CAST(PM_SJEAIJTJRNWT45X3OO5ZZLJYJRY.NEXTVAL AS NUMBER(15, 2)),
EMPLOYEES_SRC.EMPLOYEE_ID, EMPLOYEES_SRC.FIRST_NAME, EMPLOYEES_SRC.LAST_NAME,
CAST((EMPLOYEES_SRC.EMAIL || '@gmail.com') AS VARCHAR2(25)), EMPLOYEES_SRC.PHONE_NUMBER,
CAST(EMPLOYEES_SRC.HIRE_DATE AS date), EMPLOYEES_SRC.JOB_ID, EMPLOYEES_SRC.SALARY,
EMPLOYEES_SRC.COMMISSION_PCT, EMPLOYEES_SRC.MANAGER_ID, NULL, EMPLOYEES_SRC.DEPARTMENT_ID
FROM (EMPLOYEES_SRC LEFT OUTER JOIN EMPLOYEES PM_Alkp_emp_mgr_1
  ON (PM_Alkp_emp_mgr_1.EMPLOYEE_ID = EMPLOYEES_SRC.MANAGER_ID))
WHERE ((EMPLOYEES_SRC.MANAGER_ID = (SELECT PM_Alkp_emp_mgr_1.EMPLOYEE_ID
  FROM EMPLOYEES PM_Alkp_emp_mgr_1
  WHERE (PM_Alkp_emp_mgr_1.EMPLOYEE_ID = EMPLOYEES_SRC.MANAGER_ID))) OR (0=0))
executes 1 time
Things to note when using PO
There are cases where the Integration Service and Pushdown Optimization can produce different result
sets for the same transformation logic. This can happen during data type conversion, handling null
values, case sensitivity, sequence generation, and sorting of data.
The database and Integration Service produce different output when the following settings and
conversions are different:

Nulls treated as the highest or lowest value: While sorting the data, the Integration
Service can treat null values as lowest, but database treats null values as the highest value in
the sort order.

SYSDATE built-in variable: Built-in Variable SYSDATE in the Integration Service returns the
current date and time for the node running the service process. However, in the database, the
SYSDATE returns the current date and time for the machine hosting the database. If the time
zone of the machine hosting the database is not the same as the time zone of the machine
running the Integration Service process, the results can vary.

Date Conversion: The Integration Service converts all dates before pushing transformations
to the database and if the format is not supported by the database, the session fails.

Logging: When the Integration Service pushes transformation logic to the database, it cannot
trace all the events that occur inside the database server. The statistics the Integration Service
can trace depend on the type of pushdown optimization. When the Integration Service runs a
session configured for full pushdown optimization and an error occurs, the database handles
the errors. When the database handles errors, the Integration Service does not write reject
rows to the reject file.
==============================================================
===================

Posted 13th April 2011 by Prafull Dangore


0

Add a comment

Apr
13

Informatica OPB tables which give the source table, the mappings, and the folders using an SQL query
SQL query
SELECT OPB_SUBJECT.SUBJ_NAME,
       OPB_MAPPING.MAPPING_NAME,
       OPB_SRC.SOURCE_NAME
  FROM OPB_MAPPING, OPB_SUBJECT, OPB_SRC, OPB_WIDGET_INST
 WHERE OPB_SUBJECT.SUBJ_ID = OPB_MAPPING.SUBJECT_ID
   AND OPB_MAPPING.MAPPING_ID = OPB_WIDGET_INST.MAPPING_ID
   AND OPB_WIDGET_INST.WIDGET_ID = OPB_SRC.SRC_ID
   AND OPB_WIDGET_INST.WIDGET_TYPE = 1;

Posted 13th April 2011 by Prafull Dangore


0

Add a comment

Mar
28

How to remove/trim special characters in flatfile source field? - Consolidated Info
Que. How to remove special characters like # in the data below?
Can any one suggest...
Prod_Code
---------
#PC97##
#PC98##
#PC99##
#PC125#
#PC156#
...
#PC767#
#PC766#
#PC921#
#PC1020
#PC1071
#PC1092
#PC1221
I want to remove those special characters and load just the following in the target:
Prod_Code
---------
PC97
PC98
PC99
PC125
PC156
Ans:

In an Expression transformation, use the REPLACECHR function and replace '#' with NULL (see the reference below).

REPLACECHR
Availability:
Designer
Workflow Manager
Replaces characters in a string with a single character or no character.
REPLACECHR searches the input string for the characters you specify and
replaces all occurrences of all characters with the new character you specify.

Syntax
REPLACECHR( CaseFlag, InputString, OldCharSet, NewChar )
Argument      Required/   Description
              Optional
CaseFlag      Required    Must be an integer. Determines whether the arguments in this
                          function are case sensitive. You can enter any valid
                          transformation expression.
                          When CaseFlag is a number other than 0, the function is case
                          sensitive.
                          When CaseFlag is a null value or 0, the function is not case
                          sensitive.
InputString   Required    Must be a character string. Passes the string you want to
                          search. You can enter any valid transformation expression. If
                          you pass a numeric value, the function converts it to a
                          character string.
                          If InputString is NULL, REPLACECHR returns NULL.
OldCharSet    Required    Must be a character string. The characters you want to
                          replace. You can enter one or more characters. You can enter
                          any valid transformation expression. You can also enter a text
                          literal enclosed within single quotation marks, for example,
                          'abc'.
                          If you pass a numeric value, the function converts it to a
                          character string.
                          If OldCharSet is NULL or empty, REPLACECHR returns InputString.
NewChar       Required    Must be a character string. You can enter one character, an
                          empty string, or NULL. You can enter any valid transformation
                          expression.
                          If NewChar is NULL or empty, REPLACECHR removes all
                          occurrences of all characters in OldCharSet in InputString.
                          If NewChar contains more than one character, REPLACECHR uses
                          the first character to replace OldCharSet.

Return Value
String.
Empty string if REPLACECHR removes all characters in InputString.
NULL if InputString is NULL.
InputString if OldCharSet is NULL or empty.

Examples
The following expression removes the double quotes from web log data for
each row in the WEBLOG port:
REPLACECHR( 0, WEBLOG, '"', NULL )
WEBLOG                                    RETURN VALUE
"GET /news/index.html HTTP/1.1"           GET /news/index.html HTTP/1.1
"GET /companyinfo/index.html HTTP/1.1"    GET /companyinfo/index.html HTTP/1.1
GET /companyinfo/index.html HTTP/1.1      GET /companyinfo/index.html HTTP/1.1
NULL                                      NULL
The following expression removes multiple characters for each row in the
WEBLOG port:
REPLACECHR ( 1, WEBLOG, ']["', NULL )
WEBLOG                                                        RETURN VALUE
[29/Oct/2001:14:13:50 -0700]                                  29/Oct/2001:14:13:50 -0700
[31/Oct/2000:19:45:46 -0700] "GET /news/index.html HTTP/1.1"  31/Oct/2000:19:45:46 -0700 GET /news/index.html HTTP/1.1
[01/Nov/2000:10:51:31 -0700] "GET /news/index.html HTTP/1.1"  01/Nov/2000:10:51:31 -0700 GET /news/index.html HTTP/1.1
NULL                                                          NULL
The following expression changes part of the value of the customer code for
each row in the CUSTOMER_CODE port:
REPLACECHR ( 1, CUSTOMER_CODE, 'A', 'M' )
CUSTOMER_CODE   RETURN VALUE
ABA             MBM
abA             abM
BBC             BBC
ACC             MCC
NULL            NULL
The following expression changes part of the value of the customer code for
each row in the CUSTOMER_CODE port:
REPLACECHR ( 0, CUSTOMER_CODE, 'A', 'M' )
CUSTOMER_CODE   RETURN VALUE
ABA             MBM
abA             MbM
BBC             BBC
ACC             MCC
The following expression changes part of the value of the customer code for
each row in the CUSTOMER_CODE port:
REPLACECHR ( 1, CUSTOMER_CODE, 'A', NULL )
CUSTOMER_CODE   RETURN VALUE
ABA             B
BBC             BBC
ACC             CC
AAA             [empty string]
aaa             aaa
NULL            NULL
The following expression removes multiple numbers for each row in the
INPUT port:
REPLACECHR ( 1, INPUT, '14', NULL )
INPUT    RETURN VALUE
12345    235
4141     NULL
111115   5
NULL     NULL
When you want to use a single quote (') in either OldCharSet or NewChar,
you must use the CHR function. The single quote is the only character that
cannot be used inside a string literal.
The following expression removes multiple characters, including the single
quote, for each row in the INPUT port:
REPLACECHR (1, INPUT, CHR(39), NULL )

INPUT                         RETURN VALUE
'Tom Smith' 'Laura Jones'     Tom Smith Laura Jones
Tom's                         Toms
NULL                          NULL

Posted 28th March 2011 by Prafull Dangore


0

Add a comment

Mar
25

What is Delta data load? -> Consolidated Info
A delta load, by definition, is loading incremental changes to the data. When
doing a delta load to a fact table, for example, you perform inserts only...
appending the change data to the existing table.
Delta checks can be done in a number of ways, and different logics can accomplish this. One
way is to check whether the record already exists by doing a lookup on the keys. If the keys
do not exist, the record is inserted as new; if the record exists, compare the hash value of the
non-key attributes that are candidates for change. If the hash values differ, the record is
treated as an update. (For hash values you can use the MD5 function in Informatica.) If you
are keeping history (full history) for the table, it adds a little more complexity in the sense
that you have to update the old record and insert a new record for the changed data. This can
also be done with two separate tables, one holding the current version and another holding
the history.
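A minimal Oracle SQL sketch of the same idea, assuming hypothetical STG_CUSTOMER (extract) and DIM_CUSTOMER (target) tables and using ORA_HASH purely for illustration (in Informatica you would typically build the hash with MD5() in an Expression transformation):
MERGE INTO dim_customer d
USING stg_customer s
   ON (d.customer_nk = s.customer_nk)                       -- natural / business key
WHEN NOT MATCHED THEN
   INSERT (customer_nk, cust_name, cust_city, row_hash)     -- new key: insert
   VALUES (s.customer_nk, s.cust_name, s.cust_city,
           ORA_HASH(s.cust_name || '|' || s.cust_city))
WHEN MATCHED THEN
   UPDATE SET d.cust_name = s.cust_name,
              d.cust_city = s.cust_city,
              d.row_hash  = ORA_HASH(s.cust_name || '|' || s.cust_city)
        WHERE d.row_hash <> ORA_HASH(s.cust_name || '|' || s.cust_city);  -- changed rows only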
Posted 25th March 2011 by Prafull Dangore
0

Add a comment

Mar
25

Define: Surrogate Key -> Consolidated Info

Definition:
Surrogate key is a substitution for the natural primary key in Data
Warehousing.
It is just a unique identifier or number for each row that can be used for the
primary key to the table.
The only requirement for a surrogate primary key is that it is unique for each
row in the table.
It is useful because the natural primary key can change and this makes
updates more difficult.
Surrogate keys are always integer or numeric.

Scenario overview and details


To illustrate this example, we will use two made-up sources of information to
provide data about the customers dimension. Each extract contains customer
records with a business key (natural key) assigned to it.
In order to isolate the data warehouse from source systems, we will introduce
a technical surrogate key instead of re-using the source system's natural
(business) key.
A unique and common surrogate key is a one-field numeric key which is
shorter, easier to maintain and understand, and more independent from changes
in the source system than a business key. Also, if the surrogate key
generation process is implemented correctly, adding a new source system to
the data warehouse processing will not require major effort.
The surrogate key generation mechanism may vary depending on the
requirements; however, the inputs and outputs usually fit into the design
shown below (a simple SQL sketch follows the list):
Inputs:
- an input represented by an extract from the source system
- datawarehouse table reference for identifying the existing records
- maximum key lookup
Outputs:
- output table or file with newly assigned surrogate keys
- new maximum key
- updated reference table with new records
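A minimal Oracle SQL sketch of this design, assuming hypothetical CUSTOMER_EXTRACT (source) and DIM_CUSTOMER (warehouse) tables, with a sequence playing the role of the maximum-key lookup:
CREATE SEQUENCE dim_customer_seq START WITH 1 INCREMENT BY 1;

-- Assign a new surrogate key only to business keys not already present in the dimension
INSERT INTO dim_customer (customer_sk, customer_nk, cust_name)
SELECT dim_customer_seq.NEXTVAL, e.customer_nk, e.cust_name
  FROM customer_extract e
 WHERE NOT EXISTS (SELECT 1
                     FROM dim_customer d
                    WHERE d.customer_nk = e.customer_nk);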
