Sunday, May 29, 2011

Filter Transformation complete Overview


Use the Filter transformation to filter out rows in a mapping. As an active transformation, the Filter transformation may change the number of rows passed through it. The Filter transformation allows rows that meet the specified filter condition to pass through. It drops rows that do not meet the condition. You can filter data based on one or more conditions.

A filter condition returns TRUE or FALSE for each row that the Integration Service evaluates, depending on whether a row meets the specified condition. For each row that returns TRUE, the Integration Services pass through the transformation. For each row that returns FALSE, the Integration Service drops and writes a message to the session log.

The following mapping passes the rows from a human resources table that contains employee data through a Filter transformation. The filter allows rows through for employees that make salaries of $30,000 or higher.

You cannot concatenate ports from more than one transformation into the Filter transformation. The input ports for the filter must come from a single transformation.

Tip: Place the Filter transformation as close to the sources in the mapping as possible to maximize session performance. Rather than passing rows you plan to discard through the mapping, you can filter out unwanted data early in the flow of data from sources to targets.

Filter Transformation Components
You can create a Filter transformation in the Transformation Developer or the Mapping Designer. A Filter transformation contains the following tabs:

Transformation. Enter the name and description of the transformation. The naming convention for a Filter transformation is FIL_TransformationName. You can also make the transformation reusable.

Ports. Create and configure ports.

Properties. Configure the filter condition to filter rows. Use the Expression Editor to enter the filter condition. You can also configure the tracing level to determine the amount of transaction detail reported in the session log file.

Metadata Extensions. Create a non-reusable metadata extension to extend the metadata of the transformation transformation. Configure the extension name, datatype, precision, and value. You can also promote metadata extensions to reusable extensions if you want to make it available to all transformation transformations.

Configuring Filter Transformation Ports
You can create and modify ports on the Ports tab.
You can configure the following properties on the Ports tab:

Port name. Name of the port.

Datatype, precision, and scale. Configure the datatype and set the precision and scale for each port.

Port type. All ports are input/output ports. The input ports receive data and output ports pass data.

Default values and description. Set default value for ports and add description.
Filter Condition
The filter condition is an expression that returns TRUE or FALSE. Enter conditions using the Expression Editor available on the Properties tab.
Any expression that returns a single value can be used as a filter. For example, if you want to filter out rows for employees whose salary is less than $30,000, you enter the following condition:
SALARY > 30000
You can specify multiple components of the condition, using the AND and OR logical operators. If you want to filter out employees who make less than $30,000 and more than $100,000, you enter the following condition:
SALARY > 30000 AND SALARY < 100000
You can also enter a constant for the filter condition. The numeric equivalent of FALSE is zero (0). Any non-zero value is the equivalent of TRUE. For example, the transformation contains a port named NUMBER_OF_UNITS with a numeric datatype. You configure a filter condition to return FALSE if the value of NUMBER_OF_UNITS equals zero. Otherwise, the condition returns TRUE.
You do not need to specify TRUE or FALSE as values in the expression. TRUE and FALSE are implicit return values from any condition you set. If the filter condition evaluates to NULL, the row is treated as FALSE.
Note: The filter condition is case sensitive. 
Filtering Rows with Null Values
To filter rows containing null values or spaces, use the ISNULL and IS_SPACES functions to test the value of the port. For example, if you want to filter out rows that contain NULL value in the FIRST_NAME port, use the following condition:
IIF(ISNULL(FIRST_NAME),FALSE,TRUE)
This condition states that if the FIRST_NAME port is NULL, the return value is FALSE and the row should be discarded. Otherwise, the row passes through to the next transformation.

Performance Tuning

Use the Filter transformation early in the mapping.
To maximize session performance, keep the Filter transformation as close as possible to the sources in the mapping. Rather than passing rows that you plan to discard through the mapping, you can filter out unwanted data early in the flow of data from sources to targets.

Use the Source Qualifier transformation to filter.
The Source Qualifier transformation provides an alternate way to filter rows. Rather than filtering rows from within a mapping, the Source Qualifier transformation filters rows when read from a source. The main difference is that the source qualifier limits the row set extracted from a source, while the Filter transformation limits the row set sent to a target. Since a source qualifier reduces the number of rows used throughout the mapping, it provides better performance.
However, the Source Qualifier transformation only lets you filter rows from relational sources, while the Filter transformation filters rows from any type of source. Also, note that since it runs in the database, you must make sure that the filter condition in the Source Qualifier transformation only uses standard SQL. The Filter transformation can define a condition using any statement or transformation function that returns either a TRUE or FALSE value. 


No comments:

Post a Comment