Browse In-DB Tool

The Browse In-DB tool lets you view your data at any point in an In-DB workflow. Use it as you build an In-DB
workflow to confirm that data is passing through the way you intend.

In-Database enables blending and analysis against large sets of data without moving the data out of a database
and can provide significant performance improvements over traditional analysis methods. For more about the
In-Database tool category, see In-Database Overview.

Each Browse In-DB tool triggers a database query and can impact performance.

Configure the tool


Browse first 100 records: Specify the number of records to display in the Browse In-DB window when the
workflow is run. The default of 100 records can be changed. Entering 0 displays the maximum of 2 billion
records.

Enable caching: The Browse In-DB tool caches the records returned when the workflow is run. This option is
checked by default, but can be turned off. If the database connection and the query (including the number of
records to browse) do not change, the query is not re-run; instead, the records that display in the window are
pulled from the cache.

Clear Cache: Click the Clear Cache button to clear the data from the cache.
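
The caching rule described above can be illustrated with a minimal sketch. This is not Designer's implementation: the BrowseCache class, the fake_db function, and the connection name are hypothetical, and the only assumption taken from the text is that the cache is reused while the connection, the query, and the record limit all stay the same.

```python
# Hypothetical sketch of the caching rule; names are illustrative only.
class BrowseCache:
    def __init__(self):
        self._store = {}
        self.queries_run = 0  # how many times the database was actually queried

    def fetch(self, connection, query, limit, run_query):
        # The cache key mirrors the text: connection, query, and record limit.
        key = (connection, query, limit)
        if key not in self._store:
            self.queries_run += 1
            self._store[key] = run_query(query, limit)
        return self._store[key]

    def clear(self):
        # Equivalent in spirit to clicking Clear Cache.
        self._store.clear()


def fake_db(query, limit):
    # Stand-in for a real database call; returns `limit` dummy records.
    return list(range(limit))


cache = BrowseCache()
first = cache.fetch("conn_a", "SELECT * FROM t", 100, fake_db)   # query runs
second = cache.fetch("conn_a", "SELECT * FROM t", 100, fake_db)  # served from cache
third = cache.fetch("conn_a", "SELECT * FROM t", 50, fake_db)    # limit changed, query runs
cache.clear()
fourth = cache.fetch("conn_a", "SELECT * FROM t", 100, fake_db)  # cache cleared, query runs
```

In this sketch the database is queried three times for four browse requests, because only the second request matches an unchanged cache key.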

The tool has a display limitation of 2 billion records. If there are more than 2 billion records in the data stream,
there will be a message in the Data View statistics. All records will be written to the desired file type when
exporting from the Browse.

Connect In-DB Tool


The Connect In-DB tool creates an in-database connection in a workflow. Use the tool to connect to a new or
existing connection.

Configure the tool


1. In the Configuration window, click the Connection Name drop-down arrow and select an option:

o Manage Connections: Create a new connection or use an existing connection.


o Open File Connection: Browse to a saved database connection file.

2. Once the connection is configured, Table or Query displays the name of the selected database table.

3. (Optional) Click Query Builder to select tables and construct queries. See Choose Table or Specify Query
Window.

Add a new In-DB connection


Use an existing In-DB connection

See Manage In-DB Connections to learn how to manage In-DB connections.

Filter In-DB Tool


The Filter In-DB tool filters records with a basic filter or with a custom expression using the native language of
the database, such as SQL. Use the Filter In-DB tool to query records and return records that meet the specified
criteria.

Tool complexity

While most In-DB tools do not require SQL commands, this tool requires SQL for more advanced processing.

Configure the tool


1. Select the appropriate filter type.

o Basic Filter: Use the basic filter to construct a simple query on a single field in the database.

1. Use the drop down to select the column to filter on.

2. Use the drop down to select an operator to use.

Operator       Meaning
=              Equals
!=             Does not equal
>              Is greater than
>=             Is greater than or equal to
<              Is less than
<=             Is less than or equal to
IS NULL        Is a missing or unknown value
IS NOT NULL    Is not a missing or unknown value
LIKE           Is similar to a specified pattern in a column

3. Type in the value to complete the query.

o Custom Filter: The Custom Filter acts as a SQL WHERE clause. Use the custom filter to
construct a more complex expression or to query from multiple fields in the database.

1. Use Insert Fields to pick from available fields to construct your expression.

2. Type the rest of the query in the box using the native language of the database.

If a query is constructed using the Basic Filter, a read-only query displays in the Custom Filter
area. If the Custom Filter option is then selected, the query becomes editable.

2. Validate expression here at runtime: When selected, a query is sent to the database so that errors
in the expression are reported in the results window of this tool.

By default, errors are reported in the results window of downstream tools.
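
The difference between a basic filter and a custom filter can be sketched as WHERE clauses. The example below uses Python's sqlite3 module with an in-memory SQLite database as a stand-in; the orders table and its columns are hypothetical, and the SQL that Designer sends to a real database may differ by database type.

```python
import sqlite3

# In-memory SQLite database standing in for the real In-DB connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "West", 120.0), (2, "East", 80.0), (3, "West", None), (4, "Westfield", 300.0)],
)

def ids(where_clause):
    # Run a filter expression as a WHERE clause and return the matching ids.
    sql = f"SELECT id FROM orders WHERE {where_clause} ORDER BY id"
    return [row[0] for row in conn.execute(sql)]

# Basic-filter style: one column, one operator, one value.
basic = ids("amount >= 100")
pattern = ids("region LIKE 'West%'")
missing = ids("amount IS NULL")

# Custom-filter style: an expression that spans multiple fields.
custom = ids("region = 'West' AND (amount > 100 OR amount IS NULL)")
```

Note that the comparison on amount does not match the row whose amount is NULL; that row is only returned when IS NULL is tested explicitly.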

View the output


T anchor: Records that meet the specified criteria.

F anchor: Records that do not meet the specified criteria.

Join In-DB Tool


The Join In-DB tool combines two In-DB data streams based on common fields by performing an outer or inner
join. Use this tool to blend two database tables.

How to use this tool


1. Each input has a drop-down list of its fields. Select the join fields for each input using the
Left and Right drop-down lists. Designer automatically selects a join field from an input if the same
field name is already selected from a different input.

2. To join on multiple fields, configure an additional row of join fields: click the drop-down list
in the next row and choose an additional join field for each input.

3. To delete a join field, click the number on the left-hand side and then click the delete button on
the right.

The Join tool restricts what field types can be joined together. Mismatching data types can result in error
messages.

4. Select the Join Type:

Inner Join: Contains only the records from the Left input that joined to
records in the Right input.

Left Outer Join: Contains all records from the Left input including the
records that joined with the Right input.

Right Outer Join: Contains all records from the Right input including the
records that joined with the Left input.

Full Outer Join: Contains all records from both the Left and Right inputs.
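
The four join types can be compared on a small example. The sketch below uses Python's sqlite3 module with an in-memory database as a stand-in; the customers and orders tables are hypothetical. Because older SQLite versions lack RIGHT and FULL OUTER JOIN, the right outer join is emulated by swapping the inputs of a left outer join, and the full outer join by combining both results in Python. This is an illustration only, not the SQL Designer generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, name TEXT);          -- Left input
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER);  -- Right input
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1), (11, 3);
""")

# Inner Join: only Left records that joined to Right records.
inner = conn.execute("""
    SELECT c.name, o.order_id
    FROM customers c INNER JOIN orders o ON c.id = o.customer_id
""").fetchall()

# Left Outer Join: all Left records, with NULL where no Right record matched.
left_outer = conn.execute("""
    SELECT c.name, o.order_id
    FROM customers c LEFT OUTER JOIN orders o ON c.id = o.customer_id
    ORDER BY c.id
""").fetchall()

# Right Outer Join, emulated by swapping the inputs of a left outer join.
right_outer = conn.execute("""
    SELECT c.name, o.order_id
    FROM orders o LEFT OUTER JOIN customers c ON c.id = o.customer_id
    ORDER BY o.order_id
""").fetchall()

# Full Outer Join, emulated here as the union of both outer joins
# (a set is enough for this small example with no duplicate rows).
full_outer = set(left_outer) | set(right_outer)
```

Bob has no order and order 11 has no customer, so those rows appear only in the outer joins, padded with None (NULL).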

Data Stream In Tool


The Data Stream In tool brings data from Designer into an In-DB workflow.

Configure the tool


Connection Name: Click the drop-down to create a new connection, manage an existing connection, or select a
connection from the list of connections that have already been set up.

▪ Manage Connections: Click to create, edit, or delete a connection.

▪ Open File Connection: Click to browse to a saved database connection file.

Creation Mode: Select the appropriate option for writing the data. Choices include:

▪ Create Temporary Table: Writes to a temporary table that is available until the end of the session. This
option is useful for building In-DB predictive macros because it holds the metadata in place temporarily.
If this option is selected, the Table Name field is disabled and displays "[a unique temporary table name
is generated on each run]".

▪ Create New Table: Creates a new table. Will not overwrite an existing table.

If an HDFS Avro option is selected, the avro.schema.literal fails at 4000 characters and an error will
occur on the table creation. Try reducing the character length of the column names, or selecting fewer
columns.

▪ Overwrite Table (Drop): Completely drops the existing table and creates a new one.

Table Name: Enter the name of the database table to create or update. If Create Temporary Table is selected, the
Table Name field is disabled and displays "[a unique temporary table name is generated on each run]".

Oracle permissions

To use this tool with Oracle, you must have permissions to write to the tempspace assigned to GLOBAL
TEMPORARY. Contact your Oracle database administrator.

For more information, see Connect In-DB Tool.

Data Stream Out Tool


The Data Stream Out tool streams data from an In-DB workflow out to Designer.

Configure the tool


Sort records before streaming out: Optional. When checked, the incoming data is sorted by the fields
specified below.

1. Name: Select the field to sort on.


2. Order: Choose either Ascending or Descending.

3. Manipulate the sorting order by using the up, down, and delete buttons on the right.

Connect a standard tool to the output of the Data Stream Out tool.

Dynamic Input In-DB Tool


The Dynamic Input In-DB tool takes In-DB Connection Name and Query fields from a standard data stream
and inputs them into an In-DB data stream.

The Dynamic Input In-DB tool is used in conjunction with the Dynamic Output In-DB Tool when creating an
In-DB macro for predictive analysis.

Configure the tool


Connection Name Field: The alias for the In-DB connection to the database.

Query / Query Alias List Field: The In-DB query created at this point in the workflow.

Dynamic Output In-DB Tool


The Dynamic Output In-DB tool outputs information about the In-DB workflow to a standard workflow for
Predictive In-DB.

The Dynamic Output In-DB tool is used in conjunction with the Dynamic Input In-DB Tool when creating an
In-DB macro for predictive analysis. The Dynamic Output In-DB tool can take the metadata from the In-DB
query and pass it into a standard workflow with predictive tools.

Configure the tool


1. Output Fields: Select the type of data you want to output.

o Query: The In-DB query created at this point in the workflow.


o Connection Name: The alias for the In-DB connection to the database.

o Connection Data Source: The database type.

o Input Connection String: The information about the database that is needed to be able to
establish a connection to it.

o Output Connection String: The information about the database that is needed to be able to
establish a connection to it.

o In-DB XML: XML representation of the In-DB query.

o Record Info XML: XML representation of the database fields.

o Query Alias List: Displays each segment of the query in the form of a common table expression.

o Last Query Alias: The last alias from the Query Alias List.
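
The Query Alias List is described above as a series of query segments written as common table expressions. As a rough illustration of that idea, the sketch below chains two CTEs in an in-memory SQLite database through Python's sqlite3 module. The sales table and the alias names filtered and summarized are hypothetical; the aliases Designer actually generates are not shown in this document.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("West", 100.0), ("West", 50.0), ("East", 30.0)],
)

# Each WITH entry plays the role of one query segment (one alias);
# the last alias is the one the final SELECT reads from.
query = """
WITH filtered AS (
    SELECT region, amount FROM sales WHERE amount >= 50
),
summarized AS (
    SELECT region, SUM(amount) AS total FROM filtered GROUP BY region
)
SELECT region, total FROM summarized ORDER BY region
"""
result = conn.execute(query).fetchall()
```

Here summarized corresponds to the last alias in the list: it builds on filtered, which in turn builds on the source table.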

Possible Errors
If either the Input Connection String or the Output Connection String field is selected for output, the
following error may display:

To use this tool select an appropriate data source and select the "Allow Decryption of Password" Password
Encryption option in Manage In-DB Connections.

To resolve the error, modify the original connection string by changing the Password Encryption option to
"Allow Decryption of Password" so that the password is decrypted in the metadata.

Macro Input In-DB Tool


The Macro Input In-DB tool creates an In-DB input connection on a macro and populates it with placeholder
values.

An input anchor appears on the macro tool for each Macro Input In-DB tool used in the macro workflow, in
the order the tools are brought onto the canvas (left to right, or top to bottom). The input anchors can be
reordered in the Interface Designer Window from the Tree view.

Configure the tool


1. Specify the Template Input (For Test as a Standard Workflow). The Template Input allows the macro workflow to
be a functioning workflow and determines the field requirements for end users of the macro tool.

o Connection Name: Specify the database to connect to. Click to select the type of connection to
create.

▪ Manage Connections: Click to make edits to a connection that has already been set up or to
create a new connection. See Manage In-DB Connections.

▪ Open File Connections: Click to browse to a file.

o Table or Query: Displays the name of the selected table in the database once the connection has been
configured. Click Query Builder to easily select tables and construct queries in the Choose Table or
Specify Query Window.

You must have administrator privileges to create In-DB system connections.

2. Input Name: The Input Name will be visible to the end user when they configure the macro tool.

3. Anchor Abbreviation: This optional parameter will display an abbreviation on the input anchor of the macro tool.

4. Show Field Map: When this option is checked, and the macro tool is added to a workflow, the end user will be
asked to select the fields that match up with the selected template Input.

Macro Output In-DB Tool


The Macro Output In-DB tool creates an In-DB output connection on a macro that can be used with In-DB
workflows.

An output anchor appears on the macro tool for each Macro Output In-DB tool used in the macro workflow,
in the order the tools are brought onto the canvas (left to right, or top to bottom). The output anchors can be
reordered in the Interface Designer Window from the Tree view.

Configure the tool


1. Output Name: The Output Name will be visible to the end user when they configure the macro tool.

2. Anchor Abbreviation: This optional parameter will display an abbreviation on the output anchor of the macro
tool.

Sample In-DB Tool


The Sample In-DB tool limits the In-DB stream to a number or percentage of records. There is an option to base
the sampled records on a designated field sort order. Use the Sample In-DB tool to limit the amount of data
records to optimize runtime and throughput.

Configure the tool


1. Use the drop down list to specify the type of sample. Choices are:

o Number: Returns the number of records specified.

o Percent: Returns the percent of records specified. Selecting this option requires the data to pass through
the tool twice: once to calculate the count of records and then to return the specified percent of
records.

2. Set the number or percentage of records to include in the sample.

3. Sample records based on order: When checked, the records are sorted in-database before the number or
percent of records is chosen. Configure the sort order in the Fields table:

o Name: Select the field to sort on.

o Order: Choose either Ascending or Descending.

o Manipulate the sorting order by using the up, down, and delete buttons on the right.
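
The two sample types can be sketched against an in-memory SQLite database using Python's sqlite3 module. The readings table and the helper functions are hypothetical, and rounding the percentage down with integer division is an assumption; the document does not say how Designer rounds. The percent case shows why two passes are needed: one query to count the records and a second to return the sample.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (value INTEGER)")
conn.executemany("INSERT INTO readings VALUES (?)", [(v,) for v in range(1, 21)])

def sample_number(n, descending=False):
    # Number: sort in-database, then keep the first n records.
    order = "DESC" if descending else "ASC"
    sql = f"SELECT value FROM readings ORDER BY value {order} LIMIT ?"
    return [row[0] for row in conn.execute(sql, (n,))]

def sample_percent(percent):
    # Percent: first pass counts the records, second pass returns the sample.
    total = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
    n = total * percent // 100  # assumed rounding: round down
    return sample_number(n)

top_five = sample_number(5, descending=True)
quarter = sample_percent(25)
```

With 20 records, a 25 percent sample returns 5 records, the same count as the Number option set to 5.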

Select In-DB Tool


The Select In-DB tool selects, deselects, renames, and reorders fields in an In-DB workflow.

Use the Select In-DB tool to limit the number of fields in an In-Database data stream. In many cases,
limiting the amount of data passing through a workflow can significantly improve performance. Other
common use cases for the Select In-DB tool include renaming and reordering columns.

Configure the tool


Use the table to modify the incoming data stream. Each row in the table represents a column in the data.

o Select, deselect, and reorder columns

o Rename a column or add a description

o View more options
Summarize In-DB Tool


The Summarize In-DB tool summarizes data by grouping, summing, counting, counting distinct fields, and
more. The output contains only the result of the calculations.

Configure the tool


Data fields from the input appear in the Fields section. Click to select the field to perform summaries on (Shift +
click to select multiple fields to execute the same summary).

▪ Use the Select menu on the right to select multiple fields at once. Choices include:

o All: all fields are selected to apply to actions.

o None: deselects all fields.

o Numeric: only numeric fields are selected (integers, fixed decimals, floats, doubles) to apply to Actions.

o String: only string fields are selected to apply to Actions.

o Spatial: only spatial fields are selected to apply to Actions.

With the field(s) selected, click the Add button.

Make the selection and it will appear in the Actions section. Different summary functions are available
depending on the type of data field selected.

▪ Summarize functions include:

▪ Group by: Combines database records with identical values in a specified field into a single record. All
of the resulting data from the records in a group are then summarized. (Any non-blob or spatial object
has this option.)

If no Group by field is specified, the entire file will be summarized.

▪ Count: Count of records in the group.

▪ Count Distinct: Count of unique values in the group.

▪ Count Non Null: Count of records in the group that are not [Null]. A Null field means there is no value set
for this field; this is different from having a zero or an empty string.

▪ Count Null: Count of records in the group that are [Null].

▪ Min: Returns the minimum value.

▪ Max: Returns the maximum value.

▪ Numeric summarize functions include:

o Sum: Returns the sum value for the group. The sum is calculated by adding all of the values of a group.

o Average: Calculates an average value for the group. The average is calculated by taking the sum of all
values divided by the total number of values.

o Standard Deviation: Calculates the standard deviation for the group. Standard deviation is a
measure of variability used in statistics.

o Variance: Calculates the variance for the group. The variance is the square of the standard
deviation (StdDev^2).
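
Most of the summarize functions listed above have direct SQL aggregate counterparts. The sketch below runs them in an in-memory SQLite database through Python's sqlite3 module; the sales table is hypothetical. SQLite has no built-in standard deviation function, so the relationship between standard deviation and variance is checked with Python's statistics module instead. Whether a given database computes the sample or the population form is database-dependent and not stated in this document.

```python
import math
import sqlite3
import statistics

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("A", 10), ("A", 10), ("A", None), ("B", 4), ("B", 8)],
)

summary = conn.execute("""
    SELECT region,
           COUNT(*)                  AS count_all,
           COUNT(DISTINCT amount)    AS count_distinct,
           COUNT(amount)             AS count_non_null,
           COUNT(*) - COUNT(amount)  AS count_null,
           MIN(amount)               AS min_amount,
           MAX(amount)               AS max_amount,
           SUM(amount)               AS sum_amount,
           AVG(amount)               AS avg_amount
    FROM sales
    GROUP BY region
    ORDER BY region
""").fetchall()

# Variance is the square of the standard deviation (population form shown).
values = [4, 8]
std_dev = statistics.pstdev(values)
variance = statistics.pvariance(values)
variance_matches = math.isclose(std_dev ** 2, variance)
```

In group A, the NULL amount is included in Count but excluded from Count Non Null, Sum, and Average, which is why the average is 10.0 rather than 20 divided by 3.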

From the Actions section, you can select the field and use the up, down, and delete buttons to specify field order
for the output.

Rename a field by typing a new field name into the Output Field Name column.

Properties: Certain actions require additional properties to be specified, including Concatenate Strings and
the Finance actions.

Union In-DB Tool


The Union In-DB tool combines two or more In-DB data streams with similar data structures based on field
names or field positions. In the output, each column will contain the data from each input.

Configure the tool


1. Choose the preferred configuration mode. Choices are:

o Auto Config by Name: Aligns fields by field name.

o Auto Config by Position: Aligns fields by their field order in the stream.

2. When Fields are Different: Select how to handle nonconforming data fields from the dropdown. Choices are:

o Error - Stop Processing: Throws an error in the Results window and ends processing.

o Output All Fields: All fields will be included. Null values will populate empty fields.

o Output Common Subset of Fields: Only the fields that all inputs have in common will be output.
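
The two non-error options for differing fields can be sketched with Python's sqlite3 module and an in-memory database. The tables a and b are hypothetical, fields are aligned by name, and the use of UNION ALL (which keeps duplicate rows) is an assumption; the document does not show the SQL Designer generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, name TEXT);
    CREATE TABLE b (id INTEGER, city TEXT);
    INSERT INTO a VALUES (1, 'Ann');
    INSERT INTO b VALUES (2, 'Oslo');
""")

# Output All Fields: every field is kept; NULL fills fields an input lacks.
all_fields = conn.execute("""
    SELECT id, name, NULL AS city FROM a
    UNION ALL
    SELECT id, NULL AS name, city FROM b
    ORDER BY id
""").fetchall()

# Output Common Subset of Fields: only fields present in every input.
common_fields = conn.execute("""
    SELECT id FROM a
    UNION ALL
    SELECT id FROM b
    ORDER BY id
""").fetchall()
```

The first result has three columns with None (NULL) in the gaps; the second keeps only id, the one field both inputs share.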

Write Data In-DB Tool


The Write Data In-DB tool uses an In-DB stream to create or update a table directly in the database.

Configure the tool


1. Creation Mode: Select the appropriate option for writing the data. Choices include:

o Append Existing: Appends all the data to an existing table. The table will then contain the records it held
before the write plus the newly appended records.

o Delete Data & Append: Deletes all the original records from the table and then appends the data into
the existing table.

o Overwrite Table (Drop): Completely drops the existing table and creates a new one.

o Create New Table: Creates a new table. Will not overwrite an existing table.

o Create Temporary Table: Writes to a temporary table that is available until the end of the session. This
option is useful for building In-DB predictive macros because it holds the metadata in place temporarily.
If this option is selected, the Table Name field is disabled and displays "[a unique temporary table name
is generated on each run]".

2. Table Name: Enter the name of the database table to create or update.

3. Append Fields Mapping: This area becomes active when Append Existing or Delete Data & Append is chosen
above.

o Choose the preferred configuration mode. Choices are:

▪ Auto Config by Name: Aligns fields by field name.

▪ Auto Config by Position: Aligns fields by their field order in the stream.

o When Fields are Different: Select how to handle nonconforming data fields from the options using the
drop-down.

▪ Error - Stop Processing: Throws an error in the Results window and ends processing.

▪ Output Applicable Fields: Applicable fields will be included. Null values will populate empty
fields.
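
The five creation modes described above can be compared side by side in a minimal sketch. It uses Python's sqlite3 module with an in-memory database as a stand-in; the target, incoming, and tmp_result tables are hypothetical, and the statements are an approximation of each mode's effect, not the SQL Designer sends to a real database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE target (id INTEGER);
    INSERT INTO target VALUES (1), (2);
    CREATE TABLE incoming (id INTEGER);   -- stands in for the In-DB stream
    INSERT INTO incoming VALUES (3);
""")

def count(table):
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

# Append Existing: existing records are kept, new ones are added.
conn.execute("INSERT INTO target SELECT id FROM incoming")
after_append = count("target")

# Delete Data & Append: the table is emptied, then the data is appended.
conn.execute("DELETE FROM target")
conn.execute("INSERT INTO target SELECT id FROM incoming")
after_delete_append = count("target")

# Overwrite Table (Drop): the table is dropped and recreated.
conn.execute("DROP TABLE target")
conn.execute("CREATE TABLE target AS SELECT id FROM incoming")
after_overwrite = count("target")

# Create New Table: fails if the table already exists.
try:
    conn.execute("CREATE TABLE target AS SELECT id FROM incoming")
    create_new_failed = False
except sqlite3.OperationalError:
    create_new_failed = True

# Create Temporary Table: available only for the current session.
conn.execute("CREATE TEMPORARY TABLE tmp_result AS SELECT id FROM incoming")
temp_count = count("tmp_result")
```

The practical difference between Delete Data & Append and Overwrite Table (Drop) is that the former keeps the existing table definition, while the latter rebuilds the table from the incoming data.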
