1. Could you tell me about stage variables and how you have used them?
Stage variables are intermediate processing variables in the Transformer stage.
They are used for holding intermediate values in complex derivations within the
Transformer stage.
2. How do you eliminate duplicates on an initial load into the Transformer, when
the hash files are empty?
Duplicates are removed by the use of "Reject" in the constraints of the Transformer
stage. Alternatively, they can be eliminated by effective use of the hash files.
3. How is the order of execution determined internally in the Transformer, with the
stage editor having multiple input links on the left-hand side and multiple output links?
Through the Link Ordering tab in the Transformer stage properties.
15. Can you explain what a Project Life Cycle is and how you implemented it in your
last project?
Project life cycle normally has the following stages
1. Analysis
2. Logical design of the project
3. Physical design
4. Code
5. Testing
6. Maintenance
17. Do you know how to tune Oracle or your target database, especially SQL tuning?
Tuning itself was done by our DBA.
I used to optimize the queries for better performance, using the query analyzer.
Last Interview:
1) How many source files and how many target files?
Source ---- 9
Target -----3
2) What were the formats of the source files? How did you receive them? And what
was the frequency: daily, weekly, or quarterly?
It depends on the project.
6) How will you pass a parameter to the job schedule if the job is running at night?
11) If there are millions of records, did you use OCI? If yes, why?
Yes, because the OCI stage is about three times faster compared to other stages.
12) Why did you use SQL*Loader or the OCI stage?
Because it is faster.
13) Difference between a package and a procedure, and the advantage of using a package.
A package is like a class: it groups related procedures and functions together.
A procedure is like a method of that class.
TEST1
1) Explain any five transformations used?
functions used:
[Diagram: a STUDENT BIODEMO fact table joined to TERM, STUDENT, MATRICULATION,
TIME, and INSTITUTION dimension tables, with further dimension tables joined to
the dimension tables.]
The above diagram is snowflaked because the dimension tables are also joined to other
dimension tables.
4) Explain the difference between OLTP data and DW data with reference to users.
The connection between a DataStage client and the server times out after 86400 seconds
(24 hours) of inactivity. To change this, open the Administrator client: Projects >
Properties > Tunables > Inter-row buffering, and change the timeout period. The time
the client stays connected to the server is called the waiting period in inter-row
buffering.
We handle rejected rows in ODBC by putting a constraint on the reject link, for example
one that simply says "REJECTED", and checking the Reject Row box on that link so it
catches the rows that fail the other constraints.
Link ordering can be done in two ways: through the stage properties of the Transformer
(or any active stage), or by specifying the execution order on the link itself, for both
input and output links.
Stage variables are variables used in the Transformer stage. They can be used in
derivations and can be assigned a value at run time for processing.
11) Give the reasons for selecting between a fact table with more dimensions and a
fact table with fewer dimensions plus hierarchies.
I prefer a fact table with more dimensions to a fact table with fewer dimensions and
hierarchies. The reason is the complexity involved in the hierarchy approach: it makes
the presentation to the user more complex, more intricate, and less intuitive. With more
dimensions, the time and space required for querying are also reduced, which makes the
system faster.
You can implement a dual level of granularity by storing data at both high and low levels
of granularity, that is, at both summarized and detailed levels.
DataStage provides a graphical Job Sequencer which allows you to specify a sequence of
server or parallel jobs to run.
20) What is the percentage and the count with reference to performance statistics?
21) How do you differentiate between an I/O bottleneck and a CPU bottleneck?
If Minimum = Average in the performance parameters, it is a CPU-bound job.
If Minimum != Average in the performance parameters, it is an I/O-bound job.
When you right-click on an active stage, you get Propagate Columns as an option; from
there you can propagate the columns to any other link connected to that stage.
A container is a group of stages and links. Containers enable you to simplify and
modularize your server job designs by replacing complex areas of the diagram with a
single container stage.
Local containers. These are created within a job and are only accessible by that job.
A local container is edited in a tabbed page of the job’s Diagram window.
Shared containers. These are created separately and are stored in the Repository in
the same way that jobs are. They are accessible to any job. You can also use shared
containers as a way of incorporating server job functionality into parallel jobs.
There are two types of shared container:
• Server shared container. Used in server jobs (can also be used in parallel jobs).
• Parallel shared container. Used in parallel jobs.
You can also include server shared containers in parallel jobs as a way of
incorporating server job functionality into a parallel stage (for example, you could
use one to make a server plug-in stage available to a parallel job).
Questions on DW
3. What is the order of execution internally in the Transformer, with the stage editor
having input links on the left-hand side and output links?
10. How would you track the performance statistics and enhance them?
Job Run Options > Tracing (select the particular stage), check Performance Statistics,
and run the job. Then view the log (performance statistics). PCMA.
11. How will you call an external function or subroutine from DataStage?
DataStage Manager > Import > External Table Definition
13. Have you written any function or shell script, and what was the purpose and logic?
Yes, I have written shell scripts.
Scenario: if we have daily data and we want to concatenate the daily files into a single
file to be fed into the weekly job, this can be done by writing a before-job routine as a
shell script. The shell script concatenates all the daily files into a single file, and it
is attached as a before-job subroutine in the job properties.
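The before-job routine described in the scenario can be sketched as follows; the
directory and file names here are invented for illustration, and the script creates its
own sample daily files so it can run standalone:

```shell
#!/bin/sh
# Sketch of a before-job shell routine: concatenate all daily extract files
# into one file for the weekly job. Paths are made-up placeholders; a real
# routine would point at the actual landing directory.
set -e
dir=$(mktemp -d)                       # stand-in for the landing directory

# Stand-in daily files (in production these arrive from the source system)
printf 'row1\nrow2\n' > "$dir/daily_mon.txt"
printf 'row3\n'       > "$dir/daily_tue.txt"

# The actual work: one cat over a glob produces the weekly input file
cat "$dir"/daily_*.txt > "$dir/weekly_input.txt"

wc -l < "$dir/weekly_input.txt"        # row count, handy for the job log
```

Attached as a before-job subroutine (ExecSH), the weekly job then reads the single
concatenated file instead of the individual daily files.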
14. Which DataStage stage is used for what in a data warehouse, and how do you use it?
15. Can you explain what a Project Life Cycle is and how you implemented it in your
last project?
Analysis, Design, Development, Testing, Deployment, Production Support.
17. Do you know how to tune Oracle or your target database, especially SQL tuning?
Use Indexes on tables.
Use Hints.
For Oracle, use the following utilities: TKPROF, EXPLAIN PLAN, UTLBSTAT,
UTLESTAT, etc.
Use SQL TRACE.
Tune the SGA Memory and Data Dictionary Cache Memory.
Use the materialized views.
18. What versions of the platforms were used in the project?
Unix Sun Solaris
Windows NT 4.0
DataStage 7.0/6.0
Oracle 9i
SQL Server 7.0/6.5
16) How will you pass a parameter to the job sequence if the job is running at night?
The job sequence is scheduled for the night in the DataStage Director. When scheduling,
the Director asks for the value of the job parameter to be supplied. This value is then
used when the job sequence runs at the scheduled time.
18) How will you pass the file name as a parameter to a job?
Using crontab and dsjob with -param; in the job parameters, the type is set to
PATHNAME.
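For example, a nightly crontab entry might look like the sketch below; the install path,
parameter name, project name, and job name are placeholders, and the -param syntax
matches the dsjob -run options listed elsewhere in these notes:

```
# Run the job at 1:00 AM every night, passing the file name as a parameter
0 1 * * * /opt/datastage/bin/dsjob -run -param InputFile=/data/in/daily.txt MyProject NightlyJob
```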
19) How will you run DataStage from Unix? What's the command?
You can start, stop, validate, and reset jobs using the -run option:
dsjob -run
[ -mode [ NORMAL | RESET | VALIDATE ] ]
[ -param name=value ]
[ -warn n ]
[ -rows n ]
[ -wait ]
[ -stop ]
[ -jobstatus ]
[ -userstatus ]
[ -local ]
A job or sequence is run with dsjob -run -jobstatus followed by the project name and
the job name.
22) How did you get the source files, and in what format?
We get the source files by connecting to the respective databases.
24) If there are millions of records, did you use OCI? If yes, why?
The OCI stage makes it faster to load from the database.
26) How will you catch the bad rows from the OCI stage?
Through a reject link.
27) How do you schedule a job, and how will it run at a particular time?
Using the job scheduler in the DataStage Director.
28) Difference between a package and a procedure, and the advantage of using a package.
31) What are XML files? How do you read data from XML files, and which stage is used?
The XML Input stage can be used to read data from XML files.
32) Have you worked on automation of DataStage jobs? If yes, how did you do it?
3. DataStage learns about the shape and size of the system from the configuration
file. The job design need not be changed if the platform is changed. The
configuration file describes the available processing power in terms of
processing nodes.
You can use the configuration file to set up node pools and resource pools. A pool
defines a group of related nodes or resources, and when you design a DataStage
job you can specify that execution be confined to a particular pool. When you run
a DataStage job, DataStage first reads the configuration file to determine the
available system resources. You can define and edit the configuration file using
the DataStage Manager.
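A minimal configuration file for a single-node system might look like the sketch below;
the node name, host name, and resource paths are invented placeholders:

```
{
  node "node1" {
    fastname "etl-host"
    pools ""
    resource disk "/data/datasets" {pools ""}
    resource scratchdisk "/data/scratch" {pools ""}
  }
}
```

Adding further node entries tells DataStage it may spread job execution across more
processing nodes, without any change to the job design.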
4. Partition methods:
round robin; random; same; entire; hash by field; modulus; range; DB2.
Same is the fastest partitioning method.
5. Inside a DataStage parallel job, data is moved around in data sets. Data sets can
be landed as persistent data sets; this is the most efficient way of moving data
between linked jobs. Persistent data sets are stored in a series of files linked by
a control file.
6. Run-time column propagation:
If your job encounters extra columns that are not defined in the metadata when it
actually runs, it will adopt these extra columns and propagate them through the
rest of the job.
You can also specify the metadata for a stage in a plain text file known as a
schema file. A partial schema means that you only need to define column
definitions for those columns that you are actually going to operate on.
7. If you have NLS enabled, parallel jobs support two types of
underlying character data types: strings and ustrings. String data
represents unmapped bytes, ustring data represents full Unicode
(UTF-16) data.
Stages: File Sets enable you to spread data across a set of files referenced by a
single control file.
8. NLS Map Tab
If you have NLS enabled on your system, some of your stages will have an
NLS Map tab. This allows you to override the project default character set
map for this stage, and in some cases, allows you to enable per-column
mapping. When per-column mapping is enabled, you can override the
character set map for particular columns (an NLS map field appears on the
columns tab allowing you to do this).
1. Could you tell me about stage variables, how you use them, and how you have used
them?
2. How do you eliminate duplicates on an initial load into the Transformer, when the
hash files are empty?
3. How is the order of execution determined internally in the Transformer, with the
stage editor having input links on the left-hand side and output links?
4. Did you write any routines?
5. If you have a million records, how do you test that in unit testing?
6. How do you do an Oracle four-way inner join with an Oracle input stage, a
Transformer, and a flat-file output?
7. Have you written SQL scripts?
8. Have you used the Aggregator stage?
9. How do you test the job?
10. How would you track the performance statistics and enhance them?
11. How will you call an external function or subroutine from DataStage?
12. What are shared containers, and how do you use them?
13. Have you written any function or shell script, and what was the purpose and logic?
14. Which DataStage stage is used for what in a data warehouse, and how do you use it?
15. Can you explain what a Project Life Cycle is and how you implemented it in your
last project?
16. What was the target database used in the last project?
17. Do you know how to tune Oracle or your target database, especially SQL tuning?
18. What versions of all the platforms were used?
Last Interview:
33) How many source files and how many target files?
34) What were the formats of the source files? How did you receive them? And what
was the frequency: daily, weekly, or quarterly?
35) How many jobs and transformations were used?
36) What are job parameters?
37) How do you schedule a job?
38) How will you pass a parameter to the job schedule if the job is running at night?
39) What happens if one job fails in the night?
40) How will you pass the file name as a parameter to a job?
41) How will you run DataStage from Unix? What's the command?
42) How will you pass a parameter to a job?
43) How did you get the source files, and in what format?
44) How did you populate the source files?
45) If there are millions of records, did you use OCI? If yes, why?
46) Why did you use SQL*Loader or the OCI stage?
47) How will you catch the bad rows from the OCI stage?
48) How do you schedule a job, and how will it run at a particular time?
49) Difference between a package and a procedure, and the advantage of using a package.
50) Unix command to find a particular pattern in a file?
51) Unix command to combine two files?
52) What are XML files? How do you read data from XML files, and which stage is used?
53) Have you worked on automation of DataStage jobs? If yes, how did you do it?
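Questions 50 and 51 above have one-line answers, grep and cat; a quick sketch with
throwaway files (the names and contents are invented) shows both:

```shell
#!/bin/sh
# Q50: grep finds lines matching a pattern; Q51: cat combines files.
set -e
dir=$(mktemp -d)
printf 'alpha\nbeta\ngamma\n' > "$dir/file1.txt"
printf 'delta\n'              > "$dir/file2.txt"

grep 'beta' "$dir/file1.txt"                       # prints: beta
cat "$dir/file1.txt" "$dir/file2.txt" > "$dir/both.txt"
wc -l < "$dir/both.txt"                            # 4 lines after combining
```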
DATASTAGE QUESTIONS
ASKED IN DIFFERENT INTERVIEWS
1. Could you tell me about stage variables and how you have used them?
2. How do you eliminate duplicates when you initially load into the Transformer, when
the hash files are empty?
3. How is the order of execution determined internally in the Transformer, with input
links on the left-hand side of the stage editor and output links on the right-hand side?
4. Did you write any routines using BASIC programming?
5. If you have a million records, how do you test that for unit testing?
6. How do you do an Oracle four-way inner join with an Oracle input stage, a
Transformer, and a flat-file output stage?
7. Have you written SQL scripts?
8. Have you used the Aggregator stage?
9. How do you test the job?
10. Are you available now?
11. Where are you now?
12. What are delta transactions? Could you explain how you used them?
1. What is the difficulty you faced with the most uncooperative user, and how did
you resolve it?
a. One user was very unwilling to discuss the requirements and kept putting
off meetings, saying that he was too busy.
b. I explained to him that his knowledge and input were very important to the
success of the project, and arranged to meet him after working hours to
discuss the requirements. He was more cooperative after that.
5. What is the most difficult situation you faced, and how did you resolve it?
a. In my last project, the users were taking a long time to start user
acceptance testing, because the department was extremely busy. I spoke to
the manager and suggested that he could allow one key user to spend 1 to
2 hours each day testing with my supervision. He made the necessary
arrangements, and we were able to get the first round of testing going,
identifying defects and addressing them.
b. In the last project it was discovered that the Fund System, which indirectly
interfaced to the Bond System, was not handling the results of the sale
transactions correctly. I reviewed the processing requirements for the Fund
System with the senior developer, wrote a detailed specification of the
interface process, discussed it and handed it over to the senior developer
for the Fund System, and generated test data from the Bond System. The
specs were used to modify the Fund System process, which was tested and
implemented successfully.
c. In the last week of the project, it was discovered that some accounting
entries were not being archived correctly for some transactions. We
investigated the full extent of the problem, and wrote scripts to update the
previous transactions and modified the stored procedures to correctly
update new transactions. It took us three nights working late to thoroughly
test and then implement the solution.
4. How can you delete a data set in Orchestrate, and what happens if I use only
delete and the data set name? As a standard, we do not delete a data set.
5. How can you copy a data set?
Through the schema files, mostly for the description of the columns; we can have a
partial schema definition as well.
TRUNCATE and DELETE are database operations. A DELETE can be rolled back before
it is committed, while TRUNCATE is DDL and commits implicitly, so it cannot be rolled
back; a truncated table can only be brought back through DBA recovery operations.
10. How can we generate an error code, and how can you find it?
15. What is the difference between an inner log and an outer log?
16. What is the difference between Star schema and Snowflake?
18. Explain briefly the server components and client components in Data Stage?
Client Components
1.DataStage Designer
2.DataStage Director
3.DataStage Manager
4.DataStage Administrator
Server Components
1.Repository
2.DataStage Server
chmod (the Unix command for changing file permissions)
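A quick illustration of chmod on a throwaway temporary file:

```shell
#!/bin/sh
# chmod changes file permissions; shown here on a temporary file.
set -e
f=$(mktemp)
chmod 600 "$f"     # owner read/write only
[ -r "$f" ]        # readable by the owner
[ ! -x "$f" ]      # not executable yet
chmod u+x "$f"     # add execute permission for the owner
[ -x "$f" ]        # now executable
rm -f "$f"
```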
Extensive knowledge
22. Have you done project plans or scopes for project plans? Explain
No
Designing jobs, importing tables, testing, debugging, and performance tuning.
(Q) I have a server job that is trying to build a hashed file for lookup purposes. It seems
to slow down immensely as it progresses. The source file has approx. 1.2 million records
in it. I am using the type 30 dynamic hashed file, and the records being returned are 3
fields: 2 keys and the lookup value. The job starts out very fast, but after 400k records
it begins to slow, and progressively slows to less than 700 rows per second. Any
suggestions?
---Up the minimum modulus from 1. What you're probably seeing is the resize-by-doubling
effect degrading performance.
--Dynamic hashed files don't resize by doubling, they resize by adding one group (logically -
probably about eight group buffers at a time physically). This actually gives you more pain, as
you're taking the hit of restructuring every N records loaded, where N is the number of
records per group.
If you create the hashed file with its minimum modulus set to approximately what you'll
need at the end, you take the hit of allocating the disk space up front, so the load
should proceed more quickly and at a non-diminishing rate.
Do you use write caching? This, too, can help load performance.
Another possibility, if your hashed file is large, is to use a static hashed file. This
is one where the disk space is necessarily pre-allocated, and you get more control over
the size of groups and the hashing algorithm used. Empirical evidence suggests that
these perform slightly better than the equivalent dynamic hashed file at larger sizes;
the downside is that they require more calculation and more maintenance.
--Thanks for your input. I do not consider this hashed file to be very large: around 65
MB, approx. 1.2 million records. I will try the write caching; right now it takes around
30 minutes of wall-clock time to build this.
How to delete or clear a hashed file from the command line?
The command used to delete a hashed file depends on the command that was used to
create the hashed file, and whether a SETFILE command has been issued since.
The command to clear the contents of a hashed file is CLEAR.FILE if issued from within the
DataStage environment, or clearfile from the operating system. You can also use SQL
within the DataStage environment (DELETE FROM hashedfile;) or the TCL command
DELETE with an active Select List.
Can I increase the limit of the hashed file size? After 2 GB my job hangs and I couldn't
get the desired result. I am giving it a separate space, specific to my application.
--In general you can use the HFC unsupported utility to get the syntax to manually
create the file. You can't get it using the regular create option in the hashed file
stage.
If the hashed file already exists, and is not corrupted, you can use the resize utility to switch it
to 64-bit addressing.
Use the following commands at the TCL prompt or in the Administrator client command
window.
Code:
RESIZE hashedfile * * * 64BIT
You will get a message indicating that the date/time stamp in the file header has been
modified. This message is terminated by an exclamation point (!) but really is just an
informative message. Don't worry about it.
Enough space for a second copy of the hashed file is required for this operation. If this does
not exist where the project or the hashed file reside, add a USING clause referring to a
directory where there is plenty of space (that is, at least as much as the hashed file currently
occupies).
Code:
RESIZE hashedfile * * * 64BIT USING pathname
Hi! Ray
There are two entities created when I run my job to create a hashed file > 2 GB:
1. D_H_CUST_SUM_ID - directory
2. H_CUST_SUM_ID - file
Ascential has released 7.5, but I haven't been able to find anything about it on their
web site, devnet, or e.services. I did find an article on dw-institute with a paragraph
on new features; it will be interesting to see the new job sequencer and the performance
visualisation and monitoring services:
Quote:
New features in the Suite include the addition of Mobile Services for RTI, through the
incorporation of RTI Mobile Director, which enables the remote administration of
DataStage from pocket PCs. Elsewhere, Ascential is touting new RTI clustering support for
Oracle, DB2, and SQLServer databases. Enhancements include improved reporting across
the Enterprise Integration Suite, a new job sequencer, real-time job validation, and
performance visualization and monitoring interfaces. Ascential has also updated the Suite’s
support for the HIPAA and EDIFACT standards to assist with compliance-related projects.
Finally, the Ascential Enterprise Integration Suite 7.5 has been certified for Oracle 10g,
Teradata V2R5.1, and SAP. Ascential will also make available a new packaged application
connectivity kit for SAP XI with the revamped Suite.
The Mobile Director program has been created so we can work from airports, hotels,
lunch breaks, and between nines on the golf course. It requires Pocket PC 2000 OS,
Pocket PC 2002 OS, or Windows Mobile for Pocket PC 2003 or higher; Microsoft .NET
Compact Framework 1.0 SP2; and RTI 7.5.
I am doing a lookup; the lookup combines 3 keys. So how do I define the constraints?
What are your requirements? Do you only want to pass input rows that have valid
matches in all three lookup tables? Or something else? If all three, you could do
something like a constraint of NOT(ISNULL(...)) on a key returned from each of the
three lookups.
There is a useful feature in the constraint screen where you can check the "reject" flag
on a constraint and it automatically gets all rows that do not go down any other
outputs. So instead of putting NOT(ISNULL.. down one constraint and ISNULL... down
the second constraint you can just turn on the reject flag of that second constraint.
The only caveat to that is to understand that these would no longer be 'silent' rejects.
With that check box marked, any rows that go down that link will be logged into the
job's log with a warning and a total rejected records count. Sometimes that's a good
thing and sometimes that can be a bad thing.
As long as you understand that and the fact that output link ordering is very important
when using 'true' reject rows, then you'll be fine.
Does anyone know of a way to modify the character used as a delimiter on the fly for a
Sequential File Stage?
We are receiving over 80 files, but some are delimited by pipes, others by commas. We'd
like to be able to use the same stage to process both file types.
Do all of your 80 files have the same metadata and the only difference is the delimiter? Seems like at best case you'd need 2 jobs to do the processing, at worst case - 80.
Yes, that is correct. 80 files have the same metadata; the only difference is the delimiter.
Sorry that your first post generates this as an answer, but... no, it can't be done.
Alas, all you can do is make copies of the job design. As Craig said, the sequential file
delimiter character is not able to be a job parameter. Indeed, none of the properties
on the Format tab can be job parameters.
Another option might be to explore creating a perl, awk or nawk prejob filter that will
convert your file delimiter to a predefined delimiter for your job.
If you choose the "Stage uses filter" checkbox then you can perform some sort of pre-processing and read directly from stdout. I've done this with other things, but not changing delimiters, and I can't see that it would be much of a problem to do.
Then you could have one job to process all different types of file delimiters as long as
the files match for metadata and the predefined delimiter is not in the data of any of
the files you will be processing.
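As a minimal sketch of that pre-filter idea (the file names and the pipe-to-comma direction are assumptions for illustration, not from the original thread), an awk one-liner can normalize the delimiter before the job reads the file:

```shell
# Create a small pipe-delimited sample file (a stand-in for one of the 80 files).
printf 'cust|amount|date\n1001|25.50|2004-06-01\n' > sample_pipe.txt

# Normalize the delimiter to comma. Assigning $1 = $1 forces awk to rebuild
# each record using the output field separator (OFS). Note this minimal sketch
# does not handle fields that themselves contain commas or quotes.
awk 'BEGIN { FS = "|"; OFS = "," } { $1 = $1; print }' sample_pipe.txt > sample_comma.txt

cat sample_comma.txt
# cust,amount,date
# 1001,25.50,2004-06-01
```

In the "Stage uses filter" field the same command would read stdin and write stdout, so one job could accept either delimiter once the files are normalized.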
I have 1 input file and 2 hash lookups. I want to direct records that have a 0 for account
number into my reject file and the rest go to the next transformer. Rows are being written
to the reject file and the next transformer, ie it puts them into both locations. For
example, my log will indicate:
125179 rows read from inRecs
125177 rows read from CC_lkup
125179 rows read from cal_day_lkup
125177 rows written to outRecs
17537 rows written to badStg
I only want to process good records. I am using a constraint right now to avoid warning messages being issued. I use 2 stage variables and check to see if either = 0; if so, the row goes to badStg. Any thoughts?
I need to have 3 or more source files joined on a 20-byte key, giving one target file with information from all 3. I found on the parallel side a stage called Join, reading in up to 60+ files and outputting up to 60+ files. On the Server side, which I have to use, I can only find a stage called Merge. The Merge only reads in 2 source files.
Does anyone know of a different function, or data merger I can use on the Server that will
allow me to join 3 or more files to create one target?
How about your Operating System? UNIX can 'merge' files based on keys, from what I
recall. You may be able to do something before job to get it ready for loading, perhaps
even using the 'Filter' capability of the Sequential File stage instead of in a before job
routine.
If the keys are unique, another option would be to load the files into hash files and do
the joins through a lookup.
Alas this can't be done in server jobs. The Merge stage reads two text files; it does not
have the ability to be driven by two streams. That's just the way it's written.
The hashed file approach is likely to be the best if the keys are unique.
There are options in the UNIX sort command for merging multiple files. Read the man
page on sort .
You might, otherwise, use an ODBC driver for text files and perform an SQL join. Performance will be horrible, particularly if the files are large.
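As a hedged sketch of the operating-system route (the file names and two-column layout are invented for illustration), the UNIX join command can be chained to combine three sorted files on a shared key, two at a time:

```shell
# Three sample files, each sorted on the key in column 1
# (stand-ins for the real files keyed on the 20-byte key).
printf 'K1 alpha\nK2 beta\n'  > f1.txt
printf 'K1 red\nK2 blue\n'    > f2.txt
printf 'K1 ten\nK2 twenty\n'  > f3.txt

# join handles two sorted files at a time, so chain it:
# (f1 join f2) piped into a second join with f3, '-' meaning stdin.
join f1.txt f2.txt | join - f3.txt > combined.txt

cat combined.txt
# K1 alpha red ten
# K2 beta blue twenty
```

If the inputs were not already sorted, a sort on the key column (or sort -m to merge pre-sorted files) would be needed first.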
Quote:
Alas this can't be done in server jobs. The Merge stage reads two text files; it does not have the ability to be driven by two streams. That's just the way it's written.
Good catch. It can't be done in a single server job using the Merge stage.
Some time ago someone showed how to generate a single output row from a server job
Transformer stage without having any input link. If someone's got this in their Favorites, or
knows the technique, can you please post it here?
Start your job with a Transformer stage and define a Stage Variable (you must define one even if it is not used). Put the constraint @OUTROWNUM = 1 on the output link so it produces exactly one row, then put anything in the derivation and take it anywhere.
I use this whenever I need to send a user-defined SQL statement to any server, especially to invoke the switch between staging and prod tables.
How do I insert parameters automatically for the jobs in a sequencer? I don't want to enter parameters manually; it should happen automatically.
So do I have to design a job that stores parameters in a file?
The sequence itself is a job and has parameters. Fill these in in the usual way, in the
Parameters grid.
Then, when you're loading parameters in a Job Activity, choose Insert Parameter to
select the sequence's parameter reference.
There is no "select all" for this functionality. If you can make a business case, log an
enhancement request with Ascential.
It is a pain having to put the job parameters into sequence job stages each time; they really do need an automap or automatch button, as frequently the sequence job parameters have the same name as those of the server jobs being called. One shortcut is to upgrade to version 7.x and use cut and paste. That way you set up the job parameters for just one server job, and you can endlessly copy and paste that stage and change the job name, leaving the job parameters intact.
Check out Parameter Manager for DataStage. This gives you the ability to store global
parameters and bulk load them into jobs and sequences when and where you will.
I wrote a routine that retrieves a parameter value from a file given the filename and
parameter name. I add a routine stage to the Job Sequence job for each parameter
that needs to be retrieved and set the parameter in the Job Activity stage to the return
value of the appropriate routine stage. It's a little bulky to have a bunch of routine
stages at the beginning of each sequencer job, but it seems to work well enough.
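A minimal shell sketch of that idea (the file format, names and values here are assumptions, not the poster's actual routine): keep NAME=VALUE pairs in a file and pull one value out by name:

```shell
# A parameter file of NAME=VALUE pairs (invented example values).
cat > params.txt <<'EOF'
SRC_DIR=/data/src
TGT_DB=DWPROD
BATCH_DATE=2004-06-30
EOF

# get_param <file> <name>: print the value recorded for one parameter name.
# head -n1 keeps only the first match; cut -d= -f2- preserves any '=' in the value.
get_param() {
    grep "^$2=" "$1" | head -n1 | cut -d= -f2-
}

get_param params.txt TGT_DB    # prints DWPROD
```

In DataStage the equivalent lookup would live in a server routine, with its return value wired into the Job Activity stage's parameter grid as the poster describes.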
The hash file is joined with the source master query to load the code value from the hash file lookup into the master target table.
Now, for loading the history table, I am using the OLDCOLA, OLDCOLB and OLDCOLC values to populate history records.
Question: can I use the same hash file and join the source NEWCOLA and OLDCOLA to get the code value for both old and new?
I know it can be done if I have multiple instances of the same hash file. But is it possible to apply conditions in one hash file and get old and new code values?
You can probably do what you want. What is in the hashed file(s), and which column is
the key column? What information do you want to return from the hashed file?
Without knowing these things it is impossible to give you an answer.
Using a hashed file on a reference input link is almost the same as using any stage
type, except that the only possible "join" is an equi-join on the hashed file key. If that
key is not found, DataStage sets the value on every column to NULL (so it behaves
like a left outer join; you can constrain the output of the Transformer stage so that
keys found and keys not found can be directed to different output links, or discarded,
thereby mimicking an inner join).
o How do you automate jobs remotely, i.e., when you want to run the jobs during night time?
1. IN
2. OUT
3. RETURN
4. IN OUT
21. Read the following code:
22. CREATE OR REPLACE TRIGGER update_show_gross
23. {trigger information}
24. BEGIN
25. {additional code}
26. END;
The trigger code should only execute when the column, COST_PER_TICKET, is
greater than $3. Which trigger information will you add?
v_yearly_budget NUMBER;
BEGIN
SELECT yearly_budget
INTO v_yearly_budget
FROM studio
WHERE id = v_studio_id;
RETURN v_yearly_budget;
END;
1. A user-defined exception must be declared and associated with the error code and handled in the EXCEPTION section.
2. Handle the error in EXCEPTION section by referencing the error code
directly.
3. Handle the error in the EXCEPTION section by referencing the
UNIQUE_ERROR predefined exception.
4. Check for success by checking the value of SQL%FOUND immediately
after the UPDATE statement.
39. Read the following code:
40. CREATE OR REPLACE PROCEDURE calculate_budget IS
41. v_budget studio.yearly_budget%TYPE;
42. BEGIN
43. v_budget := get_budget(11);
44. IF v_budget < 30000
45. THEN
46. set_budget(11,30000000);
47. END IF;
48. END;
IF SQL%FOUND THEN
RETURN TRUE;
ELSE
RETURN FALSE;
END IF;
COMMIT;
END;
1. 1
2. 2
3. 1
4. (1,3,4)
5. 2
6. 3
7. 4
8. 2
9. 2
10. (3,4)
11. 1
12. 4
13. 1
14. 1
15. 3
16. 3
17. 1
18. (1,4)
19. 4
20. 3
21. 4
22. 2
23. 4
24. 3
25. 4
PLAY_TABLE
-------------------------------------
"Midsummer Night's Dream", SHAKESPEARE
"Waiting For Godot", BECKETT
"The Glass Menagerie", WILLIAMS
1. 60494
2. LOA
3. Terminated
4. ACTIVE
7. SELECT (TO_CHAR(NVL(SQRT(59483), "INVALID")) FROM DUAL is a valid SQL statement.
1. TRUE
2. FALSE
8. The appropriate table to use when performing arithmetic calculations on
values defined within the SELECT statement (not pulled from a table
column) is
1. EMP
2. The table containing the column values
3. DUAL
4. An Oracle-defined table
9. Which of the following is not a group function?
1. avg( )
2. sqrt( )
3. sum( )
4. max( )
10. Once defined, how long will a variable remain so in SQL*Plus?
1. Until the database is shut down
2. Until the instance is shut down
3. Until the statement completes
4. Until the session completes
11. The default character for specifying runtime variables in SELECT
statements is
1. Ampersand
2. Ellipses
3. Quotation marks
4. Asterisk
12. A user is setting up a join operation between tables EMP and DEPT. There
are some employees in the EMP table that the user wants returned by the
query, but the employees are not assigned to departments yet. Which
SELECT statement is most appropriate for this user?
1. select e.empid, d.head from emp e, dept d;
2. select e.empid, d.head from emp e, dept d where e.dept# = d.dept#;
3. select e.empid, d.head from emp e, dept d where e.dept# = d.dept# (+);
4. select e.empid, d.head from emp e, dept d where e.dept# (+) = d.dept#;
13. Developer ANJU executes the following statement: CREATE TABLE
animals AS SELECT * from MASTER.ANIMALS; What is the effect of this
statement?
1. A table named ANIMALS will be created in the MASTER schema with
the same data as the ANIMALS table owned by ANJU.
2. A table named ANJU will be created in the ANIMALS schema with the
same data as the ANIMALS table owned by MASTER.
3. A table named ANIMALS will be created in the ANJU schema with the
same data as the ANIMALS table owned by MASTER.
4. A table named MASTER will be created in the ANIMALS schema with
the same data as the ANJU table owned by ANIMALS.
14. User JANKO would like to insert a row into the EMPLOYEE table, which
has three columns: EMPID, LASTNAME, and SALARY. The user would
like to enter data for EMPID 59694, LASTNAME Harris, but no salary.
Which statement would work best?
1. INSERT INTO employee VALUES (59694, 'HARRIS', NULL);
2. INSERT INTO employee VALUES (59694, 'HARRIS');
3. INSERT INTO employee (EMPID, LASTNAME, SALARY) VALUES (59694, 'HARRIS');
4. INSERT INTO employee (SELECT 59694 FROM 'HARRIS');
15. Which three of the following are valid database datatypes in Oracle? (Choose
three.)
1. CHAR
2. VARCHAR2
3. BOOLEAN
4. NUMBER
16. Omitting the WHERE clause from a DELETE statement has which of the
following effects?
1. The delete statement will fail because there are no records to delete.
2. The delete statement will prompt the user to enter criteria for the deletion
3. The delete statement will fail because of syntax error.
4. The delete statement will remove all records from the table.
17. Creating a foreign-key constraint between columns of two tables defined
with two different datatypes will produce an error.
1. TRUE
2. FALSE
18. Dropping a table has which of the following effects on a nonunique index
created for the table?
1. No effect.
2. The index will be dropped.
3. The index will be rendered invalid.
4. The index will contain NULL values.
19. To increase the number of nullable columns for a table,
1. Use the alter table statement.
2. Ensure that all column values are NULL for all rows.
3. First increase the size of adjacent column datatypes, then add the column.
4. Add the column, populate the column, then add the NOT NULL
constraint.
20. Which line of the following statement will produce an error?
1. CREATE TABLE goods
2. (good_no NUMBER,
3. good_name VARCHAR2 check(good_name in (SELECT name FROM
avail_goods)),
4. CONSTRAINT pk_goods_01
5. PRIMARY KEY (goodno));
6. There are no errors in this statement.
21. MAXVALUE is a valid parameter for sequence creation.
1. TRUE
2. FALSE
22. Which of the following lines in the SELECT statement below contain an
error?
1. SELECT DECODE(empid, 58385, "INACTIVE", "ACTIVE") empid
2. FROM emp
3. WHERE SUBSTR(lastname,1,1) > TO_NUMBER('S')
4. AND empid > 02000
5. ORDER BY empid DESC, lastname ASC;
6. There are no errors in this statement.
23. Which function below can best be categorized as similar in function to an IF-
THEN-ELSE statement?
1. SQRT
2. DECODE
3. NEW_TIME
4. ROWIDTOCHAR
24. Which two of the following orders are used in ORDER BY clauses? (choose
two)
1. ABS
2. ASC
3. DESC
4. DISC
25. You query the database with this command
SELECT name
FROM employee
WHERE name LIKE '_a%';
1. What is a major difference between SQL Server 6.5 and 7.0 platform wise?
SQL Server 6.5 runs only on Windows NT Server. SQL Server 7.0 runs on
Windows NT Server, workstation and Windows 95/98.
7. I know the NVL function only allows the same data type (i.e. number or char or date, as in NVL(comm, 0)). If commission is null, I want to display the text "Not Applicable" instead of a blank space. How do I write the query?
Output :
NVL(TO_CHAR(COMM),'NA')
-----------------------
NA
300
500
NA
1400
NA
NA
Tips: 1. Here SQL%ISOPEN is false, because Oracle automatically closes the implicit cursor after executing the SQL statement.
2. All are Boolean attributes.
18. Another way to replace a null value in query results with text:
SQL> SET NULL 'N/A'
To reset: SQL> SET NULL ''
21. What is the maximum number of triggers that can apply to a single table?
12 triggers.
Database Questions
Shell
-----
[1] How do I find out the names of files in a tar file called
"arch1.tar"?
[2] Say I have a directory tree under '.' and I want to look for the names of text files called
'*log' containing 'ORA-00054'. How could I do that?
[4] What command would you use to display which user owns the current directory?
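For reference, one plausible set of answers to [1], [2] and [4] (exact options vary slightly between UNIX flavours; the demo files are created here only so the commands have something to act on):

```shell
# Make a small demo archive so the tar command below has input.
printf 'demo\n' > member.txt
tar -cf arch1.tar member.txt

# [1] List the names of files inside arch1.tar without extracting:
tar -tf arch1.tar                 # prints member.txt

# [2] Search the tree under '.' for files named '*log' containing 'ORA-00054':
mkdir -p sub
printf 'ORA-00054: resource busy\n' > sub/alert.log
find . -name '*log' -exec grep -l 'ORA-00054' {} \;

# [4] Show which user owns the current directory (owner is in column 3):
ls -ld .
```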
=======================================================================
1) what editor do you use? (hint: a wrong answer here will heavily
influence the interview... what is on ALL Unix systems? ;)
2) what shell do you use? (hint: a wrong answer here will affect your
chances)
5) list the operating systems you are familiar with and the versions.
7) what commercial tools have you used for backup and archive?
11) what command identifies the NIS server you are bound to?
15) if you are going to modify CDE files, which directory should you
edit:
/usr/dt... or /etc/dt... ?
16) what differs between the NIS entry in the /etc/passwd file btwn HP
and Sun?
17) in Solaris 2.5 what is the name of the file with the NFS exported
files?
18) in Solaris 2.6 what is the name of the file with the NFS exported
files?
19) identify some differences btwn CDE in Solaris 2.6 and Solaris 2.7?
20) How can you tell what is attached to the SCSI chain on a Sun system?
21) How can you tell what is attached to the SCSI chain on an HP system?
22) What command will tell you how much memory a system has in Solaris?
23) What command will tell you how much memory a system has in HP-UX?
26) How would you "break" an NFS mount on a system that has users on it?
27) Explain how you could stop and start a process without rebooting?
28) What command will tell you how long a system has been running?
35) What files control basic hostname and network information on a Sun?
(hint: 3)
36) What files control basic hostname and network information on an HP?
(hint: 2)
45) What are some of the tags in HTML that do not require a closing
tag?
46) What command can you use to find the path to an application?
47) What option can you use with FTP to automatically transfer using
wildcards?
48) What character(s) show up in an ASCII file created in MSDos or Windoze when FTP'd in binary mode?
52) What are the tar commands to display the contents of an archive?
53) What directory are programs like perl and netscape installed in?
55) What command tells you who is logged into a file system?
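Hedged sketches of answers to 46, 48 and 52 (flags vary by platform; the demo files are invented for illustration):

```shell
# 46. Find the path to an application on $PATH:
which sh                          # e.g. /bin/sh

# 48. Files created on MSDos/Windows carry carriage returns (\r, shown as
#     ^M in vi) before each newline; od -c makes them visible:
printf 'line1\r\n' > dosfile.txt
od -c dosfile.txt

# 52. Display the contents of a tar archive (-t lists, -v adds detail):
tar -cf demo.tar dosfile.txt
tar -tf demo.tar                  # names only
tar -tvf demo.tar                 # verbose listing
```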
Technical - UNIX
1. How do you list the files in a UNIX directory while also showing hidden files?
2. How do you execute a UNIX command in the background?
3. What UNIX command will control the default file permissions when
files are created?
4. Explain the read, write, and execute permissions on a UNIX directory.
5. What is the difference between a soft link and a hard link?
6. Give the command to display space usage on the UNIX file system.
7. Explain iostat, vmstat and netstat.
8. How would you change all occurrences of a value using VI?
9. Give two UNIX kernel parameters that affect an Oracle install.
10. Briefly, how do you install Oracle software on UNIX?
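A hedged sketch of answers to questions 1, 2, 3 and 6 above (exact flags differ between UNIX flavours; question 8's vi answer appears as a comment since it is an editor command, not a shell one):

```shell
# 1. List files in a directory, including hidden ones (names starting with '.'):
ls -la

# 2. Run a command in the background by appending '&':
sleep 1 &

# 3. umask controls the default permission mask applied to newly created files:
umask 022
touch newfile.txt
ls -l newfile.txt                 # created rw-r--r-- under a 022 mask

# 6. Display space usage on the file system:
df -k

# 8. In vi, change all occurrences of a value in the file:  :%s/old/new/g
```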