DataStage Enterprise Edition v7.5
Objectives, sample/assessment tests, and training resources
A. Given a scenario, explain the process of importing/exporting data to/from the framework (e.g., sequential file, external source/target).
1. Explain the use of the various file stages (e.g., Sequential File, CFF, FileSet, DataSet) and when each is appropriate
2. If USS, define the native file format (e.g., EBCDIC, VSAM)
B. Given a scenario, describe proper use of a sequential file.
1. Read in parallel (e.g., reader per node, multiple files)
2. Handle various formats (e.g., fixed vs. variable length, delimited vs. non-delimited, etc.)
3. Describe how to import and export nullable data
4. Explain how to identify and capture rejected records (e.g., log counts, using a reject link, options for rejection)
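The fixed-width vs. delimited distinction in objective B.2 can be sketched outside DataStage. A minimal Python sketch, with hypothetical field names and widths (not from any real schema):

```python
# Two common flat-file record layouts: fixed-width (fields identified by
# column position) and delimited (fields identified by a separator).

def parse_fixed(record, widths=(6, 20, 10)):
    """Slice a fixed-width record into fields by column position."""
    fields, pos = [], 0
    for w in widths:
        fields.append(record[pos:pos + w].strip())
        pos += w
    return fields

def parse_delimited(record, sep=","):
    """Split a delimited record on the separator character."""
    return [f.strip() for f in record.split(sep)]

# The same logical record in both layouts (hypothetical data).
fixed = "000123" + "Acme Widgets".ljust(20) + "2023-01-15"
delim = "000123,Acme Widgets,2023-01-15"
assert parse_fixed(fixed) == parse_delimited(delim)
```

In a Sequential File stage this choice is driven by the format properties (record type, delimiter, field widths) rather than hand-written parsing, but the underlying layouts are the same.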
C. Given a scenario, describe proper usage of the CFF stage (native, not plug-in).
1. Explain how to import data from a file that has multiple varying record types (e.g., COBOL copybook, EBCDIC to ASCII)
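The EBCDIC-to-ASCII conversion mentioned in objective C.1 can be illustrated with Python's built-in cp037 codec (EBCDIC US/Canada). In a real CFF job the layout comes from a COBOL copybook; the sample text here is illustrative only:

```python
# Round-trip a string through EBCDIC (code page 037) to show that the
# byte representation differs from ASCII but the conversion is lossless.
text = "CUSTOMER 000123"
ebcdic_bytes = text.encode("cp037")   # mainframe-style byte layout
decoded = ebcdic_bytes.decode("cp037")

assert ebcdic_bytes != text.encode("ascii")  # byte values differ
assert decoded == text                       # round-trip is lossless
```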
D. Describe proper usage of FileSets and DataSets.
1. Explain differences and similarities of FileSet and DataSet (e.g., header and data file segments, internal format (DataSet) vs. external format (FileSet))
2. Determine which tools can be used to manage FileSets and DataSets (GUI and CLI)
E. Describe use of the FTP stage for remote data (e.g., how to parallelize, plug-in vs. enterprise).
1. Restructure stages (e.g., column import/export)
F. Identify importing/exporting of XML data.
1. XML stage options and usage
2. XPath and XSLT
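Objective F mentions XPath for locating fields in XML. A minimal sketch using Python's xml.etree.ElementTree, which supports a limited XPath subset; the element and attribute names here are hypothetical, not a DataStage schema:

```python
import xml.etree.ElementTree as ET

doc = """
<orders>
  <order id="1"><customer>Acme</customer></order>
  <order id="2"><customer>Globex</customer></order>
</orders>
"""
root = ET.fromstring(doc)

# XPath-style query: every <customer> element under an <order>.
names = [c.text for c in root.findall("./order/customer")]
assert names == ["Acme", "Globex"]

# Attribute predicate: the customer of the order whose id is "2".
assert root.find("./order[@id='2']/customer").text == "Globex"
```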
1. Which DataStage EE client application is used to manage roles for DataStage projects?
A. Director
B. Manager
C. Designer
D. Administrator
Answer: D
2. Importing metadata from data modeling tools like ERwin is accomplished by which facility?
A. MetaMerge
B. MetaExtract
C. MetaBrokers
D. MetaMappers
Answer: C
3. Which two statements are true of writing intermediate results between parallel jobs to
persistent data sets? (Choose two.)
A. Datasets are pre-indexed.
B. Datasets are stored in native internal format.
C. Datasets retain data partitioning and sort order.
D. Datasets can only use RCP when a schema file is specified.
Answer: B and C
4. You are reading customer data using a Sequential File stage and sorting it by customer ID
using the Sort stage. Then the sorted data is to be sent to an Aggregator stage which will
count the number of records for each customer.
Which partitioning method is most likely to yield optimal performance without violating the
business requirements?
A. Entire
B. Random
C. Round Robin
D. Hash by customer ID
Answer: D
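Why hash partitioning by customer ID (answer D) is correct for a per-customer count: hashing the key sends every record for a given customer to the same partition, so each partition can aggregate independently and the partial counts combine without a merge step. A sketch with a hypothetical node count and sample data:

```python
from collections import Counter

def hash_partition(records, key, nodes):
    """Assign each record to a partition by hashing its key value."""
    parts = [[] for _ in range(nodes)]
    for rec in records:
        parts[hash(rec[key]) % nodes].append(rec)
    return parts

records = [{"cust": c} for c in ["A", "B", "A", "C", "B", "A"]]
parts = hash_partition(records, "cust", nodes=4)

# Per-partition counts combine into the correct global counts because
# no customer's records are split across partitions.
combined = Counter()
for p in parts:
    combined.update(r["cust"] for r in p)
assert combined == Counter({"A": 3, "B": 2, "C": 1})
```

Round Robin or Random would balance the load better but would scatter a customer's records across partitions, so per-partition counts would be wrong without a further repartition.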
5. A customer wants to create a parallel job to append to an existing Teradata table with an
input file of over 30 gigabytes. The input data also needs to be transformed and combined
with two additional flat files. The first has State codes and is about 1 gigabyte in size. The
second file is a complete view of the current data which is roughly 40 gigabytes in size. Each
of these files will have a one-to-one match and ultimately be combined into the original file.
Which DataStage stage will communicate with Teradata using the maximum parallel
performance to write the results to an existing Teradata table?
A. Teradata API
B. Teradata Enterprise
C. Teradata TPump
D. Teradata MultiLoad
Answer: B
6. Which column attribute could you use to avoid rejection of a record with a NULL when it is
written to a nullable field in a target Sequential File?
A. null field value
B. bytes to skip
C. out format
D. pad char
Answer: A
7. You are reading customer records from a sequential file. In addition to the customer ID, each
record has a field named Rep ID that contains the ID of the company representative assigned
to the customer. When this field is blank, you want to retrieve the customer's representative
from the REP table.
Which stage can meet this requirement?
A. Join Stage
B. Merge Stage
C. Lookup Stage
D. No stage has this functionality.
Answer: C
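The conditional-lookup behavior in question 7 (answer C) can be sketched as a keyed reference lookup that fires only when the incoming field is blank. Table contents and field names are hypothetical:

```python
# Reference data standing in for the REP table (hypothetical).
rep_table = {"CUST01": "REP_A", "CUST02": "REP_B"}

def resolve_rep(record):
    """Fill a blank Rep ID from the reference table, keyed by customer ID."""
    if not record["rep_id"]:
        record["rep_id"] = rep_table.get(record["cust_id"], "")
    return record

assert resolve_rep({"cust_id": "CUST02", "rep_id": ""})["rep_id"] == "REP_B"
assert resolve_rep({"cust_id": "CUST02", "rep_id": "REP_X"})["rep_id"] == "REP_X"
```

Join and Merge cannot express this "only when blank" condition on their own, which is why the Lookup stage is the answer.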
8. You want to ensure that you package all the jobs that are used in a Job Sequence for
deployment to a production server.
Which command line interface utility will let you search for jobs that are used in a specified
Job Sequence?
A. dsjob
B. dsinfo
C. dsadmin
D. dssearch
Answer: D
9. Your job is running in a grid environment consisting of 50 computers, each having two
processors. You need to add a job parameter that will allow you to run the job using different
sets of resources and computers on different job runs.
Which environment variable should be used as this job parameter?
A. APT_CONFIG_FILE
B. APT_DUMP_SCORE
C. APT_EXECUTION_MODE
D. APT_RECORD_COUNTS
Answer: A
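APT_CONFIG_FILE (answer A) points the job at a parallel configuration file describing the nodes and resources to use, so different runs can pass different files. A hedged sketch of the file's shape for two of the grid's nodes; hostnames and paths are hypothetical:

```
{
  node "node1"
  {
    fastname "gridhost01"
    pools ""
    resource disk "/data/datasets" { pools "" }
    resource scratchdisk "/scratch" { pools "" }
  }
  node "node2"
  {
    fastname "gridhost02"
    pools ""
    resource disk "/data/datasets" { pools "" }
    resource scratchdisk "/scratch" { pools "" }
  }
}
```

Defining the job parameter of type string (or using the built-in environment variable parameter) lets each run select a configuration file sized to the resources available.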
10. Which two statements are valid about Job Templates? (Choose two.)
A. Job Templates can be created from any parallel job or Job Sequence.
B. Job Templates should include recommended environment variables including
APT_CONFIG_FILE.
C. Job Templates are stored on the DataStage development server where they can be shared
among developers.
D. The location where Job Templates are stored can be changed within the DataStage Designer
Tools > Options menu.
Answer: A and B