
Test 415: IBM WebSphere IIS DataStage Enterprise Edition v7.5

Section 1 - Installation and Configuration (5%)

A. Describe how to properly install and configure DataStage EE


1. Describe users and groups
2. Describe the environment (e.g., dsenv, ODBC)
a. Setup database connectivity
3. Describe OS configuration/kernel
4. Describe USS configuration
5. Identify required components for server
a. C++ compiler
B. Identify the tasks required to create and configure a project to be used for EE jobs.
1. Project location
2. Assign DataStage EE roles
3. Environment defaults
C. Given a configuration file, identify its components and its overall intended purpose.
1. Node names and fastnames
2. Node pools and the default pool
3. Resource disk and resource scratchdisk
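A minimal example of the configuration file named by APT_CONFIG_FILE may help here; the host name and paths below are illustrative only. Each node entry defines one logical processing node, and the number of nodes sets the degree of parallelism without recompiling any job.

```
{
  node "node1"
  {
    fastname "etlserver"
    pools ""
    resource disk "/data/ds/node1" {pools ""}
    resource scratchdisk "/scratch/ds/node1" {pools ""}
  }
  node "node2"
  {
    fastname "etlserver"
    pools ""
    resource disk "/data/ds/node2" {pools ""}
    resource scratchdisk "/scratch/ds/node2" {pools ""}
  }
}
```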
D. List the steps necessary to start/stop DataStage EE properly.
1. netstat -a | grep dsrpc (verify whether the dsrpcd listener is up before stopping or after starting)
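The stop/start sequence can be sketched as below. These commands require a DataStage server and assume a standard UNIX engine install with $DSHOME set; they are a sketch, not a complete runbook (e.g., active client connections should be drained first).

```
# Source the engine environment (dsenv lives in $DSHOME)
. $DSHOME/dsenv

# Stop the DataStage engine
$DSHOME/bin/uv -admin -stop

# Confirm nothing is listening on the dsrpc port
netstat -a | grep dsrpc

# Start the engine again and re-check the listener
$DSHOME/bin/uv -admin -start
netstat -a | grep dsrpc
```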

Section 2 - Metadata (5%)

A. Demonstrate knowledge of Orchestrate schema.


1. Distinguish internal data type (Orchestrate schema) vs external data type
2. Describe how to set extended properties for table definition
3. Import metadata using plug-ins vs orchdbutil
4. Explain nullable mapping rules (e.g., source vs target)
5. NLS data types
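An illustrative Orchestrate schema fragment, showing internal data types and explicit nullability (the field names are made up). On import, external types (e.g., a database NUMBER, or a CHAR field in a flat file) are mapped to internal types like these; a field is not nullable unless declared so.

```
record (
  CustId: int32;
  Name: nullable string[max=30];
  Balance: decimal[10,2];
  Created: date;
)
```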
B. Identify the method of importing metadata.
1. Flat sources (e.g., sequential file, Orchestrate schema, ODBC, etc.)
2. COBOL CopyBook
3. XML
C. Given a scenario, demonstrate knowledge of runtime column propagation.
1. Usage
2. Impact on stage mapping and target databases

Section 3 - Persistent Storage (10%)

A. Given a scenario, explain the process of importing/exporting data to/from the parallel framework (e.g.,
sequential file, external source/target).
1. Explain the use of the various file stages (e.g., Sequential File, CFF, FileSet, DataSet) and where each is appropriate
2. If USS, define the native file format (e.g., EBCDIC, VSAM)
B. Given a scenario, describe proper use of a sequential file.
1. Read in parallel (e.g., one reader per node, multiple files)
2. Handle various formats (e.g., fixed vs. variable length, delimited vs. non-delimited, etc.)
3. Describe how to import and export nullable data
4. Explain how to identify and capture rejected records (e.g., log counts, using reject link, options for
rejection)
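Conceptually, nullable import/export in a Sequential File stage works by designating an on-disk token (the "null field value" column property) that stands in for NULL, since a flat file has no native NULL representation. A minimal Python sketch of the idea follows; this is not the DataStage API, and the names are illustrative.

```python
NULL_TOKEN = ""  # the designated "null field value"; empty string is a common choice

def export_field(value):
    """Write one nullable column value: NULL becomes the on-disk token."""
    return NULL_TOKEN if value is None else str(value)

def import_field(text):
    """Read one column value: the on-disk token maps back to NULL."""
    return None if text == NULL_TOKEN else text

# Exporting a row with a NULL RepId produces "42|"
row = {"CustId": "42", "RepId": None}
line = "|".join(export_field(v) for v in row.values())
```

Without a null field value defined, a NULL written to such a column has no representation and the record is rejected, which is the situation sample question 6 below tests.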
C. Given a scenario, describe proper usage of CFF (native not plug-in).
1. Explain how to import data from a file that has multiple varying record types (e.g., COBOL
CopyBook, EBCDIC to ASCII)
D. Describe proper usage of FileSets and DataSets.
1. Explain the differences and similarities of FileSet and DataSet (e.g., header and data file segments;
internal DataStage format (DataSet) vs. external format (FileSet))
2. Determine which tools can be used to manage FileSets and DataSets (GUI and CLI)
E. Describe use of the FTP stage for remote data (e.g., how to parallelize, plug-in vs. enterprise).
1. Restructure stages (e.g., column import/export)
F. Identify importing/exporting of XML data.
1. XML stage options and usage
2. XPath and XSLT

Section 4 - Parallel Architecture (10%)

A. Given a scenario, demonstrate proper use of data partitioning and collecting.


Sample test

1. Which DataStage EE client application is used to manage roles for DataStage projects?
A. Director
B. Manager
C. Designer
D. Administrator

Answer: D

2. Importing metadata from data modeling tools like ERwin is accomplished by which facility?
A. MetaMerge
B. MetaExtract
C. MetaBrokers
D. MetaMappers

Answer: C
3. Which two statements are true of writing intermediate results between parallel jobs to
persistent data sets? (Choose two.)
A. Datasets are pre-indexed.
B. Datasets are stored in native internal format.
C. Datasets retain data partitioning and sort order.
D. Datasets can only use RCP when a schema file is specified.

Answer: B and C

4. You are reading customer data using a Sequential File stage and sorting it by customer ID
using the Sort stage. Then the sorted data is to be sent to an Aggregator stage which will
count the number of records for each customer.

Which partitioning method is most likely to yield optimal performance without violating the
business requirements?

A. Entire
B. Random
C. Round Robin
D. Hash by customer ID

Answer: D
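Hash partitioning wins here because it guarantees that all rows sharing a customer ID land in the same partition, so each Aggregator instance sees complete groups; Round Robin or Random would scatter one customer's rows across partitions and break the counts. A small Python sketch of the idea, with crc32 standing in for the framework's hash function:

```python
import zlib
from collections import defaultdict

def hash_partition(rows, key, nparts):
    """Route each row to partition hash(key) % nparts, so rows that
    share a key value always end up in the same partition."""
    parts = defaultdict(list)
    for row in rows:
        p = zlib.crc32(str(row[key]).encode()) % nparts
        parts[p].append(row)
    return parts

rows = [{"cust": "A"}, {"cust": "B"}, {"cust": "A"}, {"cust": "C"}]
parts = hash_partition(rows, "cust", 2)
```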

5. A customer wants to create a parallel job to append to an existing Teradata table with an
input file of over 30 gigabytes. The input data also needs to be transformed and combined
with two additional flat files. The first has State codes and is about 1 gigabyte in size. The
second file is a complete view of the current data, which is roughly 40 gigabytes in size. Each
of these files will have a one-to-one match and ultimately be combined into the original file.

Which DataStage stage will communicate with Teradata using the maximum parallel
performance to write the results to an existing Teradata table?

A. Teradata API
B. Teradata Enterprise
C. Teradata TPump
D. Teradata MultiLoad

Answer: B

6. Which column attribute could you use to avoid rejection of a record with a NULL when it is
written to a nullable field in a target Sequential File?
A. null field value
B. bytes to skip
C. out format
D. pad char

Answer: A

7. You are reading customer records from a sequential file. In addition to the customer ID, each
record has a field named Rep ID that contains the ID of the company representative assigned
to the customer. When this field is blank, you want to retrieve the customer's representative
from the REP table.

Which stage has this functionality?

A. Join Stage
B. Merge Stage
C. Lookup Stage
D. No stage has this functionality.
Answer: C

8. You want to ensure that you package all the jobs that are used in a Job Sequence for
deployment to a production server.

Which command line interface utility will let you search for jobs that are used in a specified
Job Sequence?

A. dsjob
B. dsinfo
C. dsadmin
D. dssearch

Answer: D

9. Your job is running in a grid environment consisting of 50 computers each having two
processors. You need to add a job parameter that will allow you to run the job using different
sets of resources and computers on different job runs.

Which environment variable should you add to your job parameters?

A. APT_CONFIG_FILE
B. APT_DUMP_SCORE
C. APT_EXECUTION_MODE
D. APT_RECORD_COUNTS

Answer: A

10. Which two statements are valid about Job Templates? (Choose two.)
A. Job Templates can be created from any parallel job or Job Sequence.
B. Job Templates should include recommended environment variables such as APT_CONFIG_FILE.
C. Job Templates are stored on the DataStage development server where they can be shared
among developers.
D. The location where Job Templates are stored can be changed within the DataStage Designer
Tools > Options menu.

Answer: A and B

Notes

Section 2.B - Importing COBOL CopyBook metadata: to import a copybook's column structure into DataStage, use the Manager client: Import -> Table Definitions -> COBOL File Definitions.
