
Chapter 7

Filter Transformation

This chapter includes the following topics:


♦ Overview
♦ Filter Condition
♦ Creating a Filter Transformation
♦ Tips
♦ Troubleshooting

Overview
Transformation type: Active, Connected

You can filter rows in a mapping with the Filter transformation. You pass all the rows from a
source transformation through the Filter transformation, and then enter a filter condition for
the transformation. All ports in a Filter transformation are input/output, and only rows that
meet the condition pass through the Filter transformation.
In some cases, you need to filter data based on one or more conditions before writing it to
targets. For example, if you have a human resources target containing information about
current employees, you might want to filter out employees who are part-time and hourly.
The mapping in Figure 7-1 passes the rows from a human resources table that contains
employee data through a Filter transformation. The filter passes only the rows for
employees whose salaries are greater than $30,000.

Figure 7-1. Sample Mapping with a Filter Transformation



Figure 7-2 shows the filter condition used in the mapping in Figure 7-1:

Figure 7-2. Specifying a Filter Condition in a Filter Transformation

With the filter condition SALARY > 30000, only rows for employees who make more than
$30,000 pass through to the target.
As an active transformation, the Filter transformation may change the number of rows passed
through it. A filter condition returns TRUE or FALSE for each row that passes through the
transformation, depending on whether a row meets the specified condition. Only rows that
return TRUE pass through this transformation. Discarded rows do not appear in the session
log or reject files.
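The row-level behavior described above can be modeled outside PowerCenter. The following
Python sketch is not Informatica code: the function name `filter_rows`, the dictionary row
format, and the sample data are all hypothetical. It only illustrates that rows whose
condition is TRUE pass through, while the rest are dropped without being recorded anywhere.

```python
from typing import Callable, Iterable, Iterator

Row = dict  # one row, keyed by port name (hypothetical representation)


def filter_rows(rows: Iterable[Row], condition: Callable[[Row], bool]) -> Iterator[Row]:
    """Yield only the rows for which the condition is true.

    Rows that fail the condition are silently dropped, mirroring the
    statement that discarded rows do not appear in logs or reject files.
    """
    for row in rows:
        if condition(row):
            yield row


# Hypothetical sample data standing in for the human resources source.
employees = [
    {"NAME": "Ana", "SALARY": 25000},
    {"NAME": "Ben", "SALARY": 30000},
    {"NAME": "Cy", "SALARY": 45000},
]

# Python equivalent of the filter condition SALARY > 30000.
passed = list(filter_rows(employees, lambda r: r["SALARY"] > 30000))
```

In this sketch three rows go in and fewer come out, which is what makes the transformation
active. A salary of exactly 30000 is dropped because the comparison is strict.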
To maximize session performance, place the Filter transformation as close to the sources in
the mapping as possible. Unwanted rows are then discarded early in the flow of data from
sources to targets instead of passing through the rest of the mapping.
You cannot concatenate ports from more than one transformation into the Filter
transformation. The input ports for the filter must come from a single transformation. The
Filter transformation does not allow setting output default values.

Filter Condition
You use the transformation language to enter the filter condition. The condition is an
expression that returns TRUE or FALSE. For example, if you want to filter out rows for
employees whose salary is $30,000 or less, you enter the following condition:
SALARY > 30000

You can specify multiple components of the condition, using the AND and OR logical
operators. If you want to filter out employees who make $30,000 or less as well as those who
make $100,000 or more, you enter the following condition:
SALARY > 30000 AND SALARY < 100000

You do not need to specify TRUE or FALSE as values in the expression. TRUE and FALSE
are implicit return values from any condition you set. If the filter condition evaluates to
NULL, the condition is treated as FALSE and the row is discarded.
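The NULL rule can be illustrated with a small Python model, offered as a sketch rather than
as Informatica code. `None` stands in for NULL, the helper names `salary_in_range` and
`row_passes` are hypothetical, and the model assumes that comparing a NULL value yields NULL.

```python
from typing import Optional


def salary_in_range(salary: Optional[int]) -> Optional[bool]:
    """Model SALARY > 30000 AND SALARY < 100000; None stands in for NULL."""
    if salary is None:
        return None  # assumption of this model: a comparison with NULL yields NULL
    return salary > 30000 and salary < 100000


def row_passes(condition_result: Optional[bool]) -> bool:
    """A NULL (None) result is treated the same as FALSE."""
    return condition_result is True


# Evaluate the model for a low, a mid-range, a boundary, and a NULL salary.
results = {s: row_passes(salary_in_range(s)) for s in (20000, 50000, 100000, None)}
```

Only the 50000 row passes in this model. The NULL salary is dropped along with the
out-of-range values, so a row with a missing salary never reaches the target.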
Enter conditions using the Expression Editor, available from the Properties tab of the Filter
transformation. The filter condition is case sensitive. Any expression that returns a single
value can be used as a filter. You can also enter a constant for the filter condition. The
numeric equivalent of FALSE is zero (0). Any non-zero value is the equivalent of TRUE. For
example, if you have a port called NUMBER_OF_UNITS with a numeric datatype, a filter
condition of NUMBER_OF_UNITS returns FALSE if the value of NUMBER_OF_UNITS
equals zero. Otherwise, the condition returns TRUE.
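As a rough illustration of the numeric rule, the following Python sketch treats zero as
FALSE and every other number as TRUE. The function name `numeric_condition_passes` and the
sample values are hypothetical, and the sketch does not model NULL values.

```python
def numeric_condition_passes(value: float) -> bool:
    """Zero is the equivalent of FALSE; any non-zero number is TRUE."""
    return value != 0


# Hypothetical NUMBER_OF_UNITS values, including zero and a negative number.
units = [0, 1, 12, -3]
kept = [u for u in units if numeric_condition_passes(u)]
```

Negative values count as TRUE as well, since only zero maps to FALSE.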
After entering the expression, you can validate it by clicking the Validate button in the
Expression Editor. When you enter an expression, validate it before continuing to avoid
saving an invalid mapping to the repository. If a mapping contains syntax errors in an
expression, you cannot run any session that uses the mapping until you correct the error.



Creating a Filter Transformation
Creating a Filter transformation requires inserting the new transformation into the mapping,
adding the appropriate input/output ports, and writing the condition.

To create a Filter transformation:

1. In the Designer, switch to the Mapping Designer and open a mapping.


2. Click Transformation > Create.
Select Filter transformation, and enter the name of the new transformation. The naming
convention for the Filter transformation is FIL_TransformationName. Click Create, and
then click Done.
3. Select and drag all the ports from a source qualifier or other transformation to add them
to the Filter transformation.
After you select and drag ports, copies of these ports appear in the Filter transformation.
Each column has both an input and an output port.
4. Double-click the title bar of the new transformation.
5. Click the Properties tab.
A default condition appears in the list of conditions. The default condition is TRUE (a
constant with a numeric value of 1).


6. Click the Value section of the condition, and then click the Open button.
The Expression Editor appears.



7. Enter the filter condition you want to apply.
Use values from one of the input ports in the transformation as part of this condition.
However, you can also use values from output ports in other transformations.
8. Click Validate to check the syntax of the conditions you entered.
You may have to fix syntax errors before continuing.
9. Click OK.
10. Select the Tracing Level, and click OK to return to the Mapping Designer.
11. Click Repository > Save to save the mapping.



Tips
Use the Filter transformation early in the mapping.
To maximize session performance, keep the Filter transformation as close as possible to the
sources in the mapping. Rather than passing rows that you plan to discard through the
mapping, you can filter out unwanted data early in the flow of data from sources to targets.
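The effect of filter placement can be sketched in Python by counting how many rows reach a
downstream step. This is a toy model under stated assumptions, not a measurement of
PowerCenter: `expensive_step` is a hypothetical stand-in for any later transformation, and
the sample salaries are invented.

```python
def expensive_step(row: dict) -> dict:
    """Stand-in for any downstream transformation (hypothetical)."""
    expensive_step.calls += 1
    return {**row, "BONUS": row["SALARY"] // 10}


expensive_step.calls = 0  # counts how many rows the step has processed

rows = [{"SALARY": s} for s in (10000, 20000, 40000, 50000)]


def keep(row: dict) -> bool:
    """Python equivalent of the filter condition SALARY > 30000."""
    return row["SALARY"] > 30000


# Filter late: every row goes through the expensive step first.
late = [r for r in map(expensive_step, rows) if keep(r)]
calls_when_filtering_late = expensive_step.calls

# Filter early: only rows that will survive reach the expensive step.
expensive_step.calls = 0
early = [expensive_step(r) for r in rows if keep(r)]
calls_when_filtering_early = expensive_step.calls
```

Both orderings produce the same output rows, but the early filter sends two rows through
the downstream step instead of four.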

Use the Source Qualifier transformation to filter.


The Source Qualifier transformation provides an alternate way to filter rows. Rather than
filtering rows from within a mapping, the Source Qualifier transformation filters rows as they
are read from a source. The main difference is that the source qualifier limits the row set extracted
from a source, while the Filter transformation limits the row set sent to a target. Since a source
qualifier reduces the number of rows used throughout the mapping, it provides better
performance.
However, the Source Qualifier transformation only lets you filter rows from relational sources,
while the Filter transformation filters rows from any type of source. In addition, because the
source qualifier filter runs in the database, its filter condition must use only standard SQL.
The Filter transformation can define a condition using any statement or transformation
function that returns either a TRUE or FALSE value.
For more information about setting a filter for a Source Qualifier transformation, see “Source
Qualifier Transformation” on page 445.

Troubleshooting
I imported a flat file into another database (Microsoft Access) and used SQL filter queries to
determine the number of rows to import into the Designer. But when I import the flat file into
the Designer and pass data through a Filter transformation using equivalent SQL
statements, I do not import as many rows. Why is there a difference?
Check two possible causes:
♦ Case sensitivity. The filter condition is case sensitive, and queries in some databases do not
take this into account.
♦ Appended spaces. If a field is padded with trailing spaces, the filter condition must
account for those spaces over the full length of the field. Use the RTRIM function to
remove the trailing spaces before comparing.
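Both causes can be reproduced in a short Python sketch. It is an illustration only: the
CITY port and its values are hypothetical, and Python's `rstrip` plays the role that RTRIM
plays in a filter condition.

```python
# Hypothetical values as they might arrive from a fixed-width flat file.
rows = [
    {"CITY": "Boston"},
    {"CITY": "Boston    "},  # padded with trailing spaces
    {"CITY": "BOSTON"},      # same city, different case
]

# A plain, case-sensitive comparison matches only the exact string.
exact = [r for r in rows if r["CITY"] == "Boston"]

# Removing trailing spaces first recovers the padded row.
trimmed = [r for r in rows if r["CITY"].rstrip(" ") == "Boston"]

# Normalizing case as well recovers the remaining row.
normalized = [r for r in rows if r["CITY"].rstrip(" ").upper() == "BOSTON"]
```

A case-insensitive database query that ignores trailing padding would count all three
rows, while the exact comparison counts one, which accounts for the kind of difference
described in the question.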

How do I filter out rows with null values?


To filter out rows containing null values or spaces, use the ISNULL and IS_SPACES
functions to test the value of the port. For example, if you want to filter out rows that contain
NULLs in the FIRST_NAME port, use the following condition:
IIF(ISNULL(FIRST_NAME),FALSE,TRUE)

This condition states that if the FIRST_NAME port is NULL, the return value is FALSE and
the row should be discarded. Otherwise, the row passes through to the next transformation.
For more information about the ISNULL and IS_SPACES functions, see “Functions” in the
Transformation Language Reference.
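The example condition above tests only for NULL. A filter that also rejects values made up
entirely of spaces can be modeled in Python as follows. This is a sketch, not Informatica
code: `None` stands in for NULL, the helper names are hypothetical, and treating an empty
string as "not spaces" is a choice made for this sketch rather than a statement about how
IS_SPACES behaves.

```python
from typing import Optional


def is_null(value: Optional[str]) -> bool:
    """Model of a NULL test: None stands in for NULL."""
    return value is None


def is_only_spaces(value: Optional[str]) -> bool:
    """Model of a spaces test: true for a non-empty string made up solely of spaces."""
    return value is not None and len(value) > 0 and value.strip(" ") == ""


def keep_row(first_name: Optional[str]) -> bool:
    """Drop the row when FIRST_NAME is NULL or contains only spaces."""
    return not (is_null(first_name) or is_only_spaces(first_name))


# Hypothetical FIRST_NAME values: two real names, a NULL, and a blank field.
names = ["Ana", None, "   ", "Ben"]
kept = [n for n in names if keep_row(n)]
```

In this model the NULL and the all-spaces value are both discarded, and the two real names
pass through to the next step.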
