Sunday, May 29, 2011

Filter Transformation complete Overview

Use the Filter transformation to filter out rows in a mapping. As an active transformation, the Filter transformation may change the number of rows passed through it. The Filter transformation allows rows that meet the specified filter condition to pass through. It drops rows that do not meet the condition. You can filter data based on one or more conditions.

A filter condition returns TRUE or FALSE for each row that the Integration Service evaluates, depending on whether a row meets the specified condition. For each row that returns TRUE, the Integration Services pass through the transformation. For each row that returns FALSE, the Integration Service drops and writes a message to the session log.

The following mapping passes the rows from a human resources table that contains employee data through a Filter transformation. The filter allows rows through for employees that make salaries of $30,000 or higher.

You cannot concatenate ports from more than one transformation into the Filter transformation. The input ports for the filter must come from a single transformation.

Tip: Place the Filter transformation as close to the sources in the mapping as possible to maximize session performance. Rather than passing rows you plan to discard through the mapping, you can filter out unwanted data early in the flow of data from sources to targets.

Filter Transformation Components

You can create a Filter transformation in the Transformation Developer or the Mapping Designer. A Filter transformation contains the following tabs:

Transformation. Enter the name and description of the transformation. The naming convention for a Filter transformation is FIL_TransformationName. You can also make the transformation reusable.

Ports. Create and configure ports.

Properties. Configure the filter condition to filter rows. Use the Expression Editor to enter the filter condition. You can also configure the tracing level to determine the amount of transaction detail reported in the session log file.

Metadata Extensions. Create a non-reusable metadata extension to extend the metadata of the transformation transformation. Configure the extension name, datatype, precision, and value. You can also promote metadata extensions to reusable extensions if you want to make it available to all transformation transformations.

Configuring Filter Transformation Ports

You can create and modify ports on the Ports tab.

You can configure the following properties on the Ports tab:

Port name. Name of the port.

Datatype, precision, and scale. Configure the datatype and set the precision and scale for each port.

Port type. All ports are input/output ports. The input ports receive data and output ports pass data.

Default values and description. Set default value for ports and add description.

Filter Condition

The filter condition is an expression that returns TRUE or FALSE. Enter conditions using the Expression Editor available on the Properties tab.

Any expression that returns a single value can be used as a filter. For example, if you want to filter out rows for employees whose salary is less than $30,000, you enter the following condition:

SALARY > 30000

You can specify multiple components of the condition, using the AND and OR logical operators. If you want to filter out employees who make less than $30,000 and more than $100,000, you enter the following condition:

SALARY > 30000 AND SALARY < 100000

You can also enter a constant for the filter condition. The numeric equivalent of FALSE is zero (0). Any non-zero value is the equivalent of TRUE. For example, the transformation contains a port named NUMBER_OF_UNITS with a numeric datatype. You configure a filter condition to return FALSE if the value of NUMBER_OF_UNITS equals zero. Otherwise, the condition returns TRUE.

You do not need to specify TRUE or FALSE as values in the expression. TRUE and FALSE are implicit return values from any condition you set. If the filter condition evaluates to NULL, the row is treated as FALSE.

Note: The filter condition is case sensitive.

Filtering Rows with Null Values

To filter rows containing null values or spaces, use the ISNULL and IS_SPACES functions to test the value of the port. For example, if you want to filter out rows that contain NULL value in the FIRST_NAME port, use the following condition:

IIF(ISNULL(FIRST_NAME),FALSE,TRUE)

This condition states that if the FIRST_NAME port is NULL, the return value is FALSE and the row should be discarded. Otherwise, the row passes through to the next transformation.

Performance Tuning

Use the Filter transformation early in the mapping.

To maximize session performance, keep the Filter transformation as close as possible to the sources in the mapping. Rather than passing rows that you plan to discard through the mapping, you can filter out unwanted data early in the flow of data from sources to targets.

Use the Source Qualifier transformation to filter.

The Source Qualifier transformation provides an alternate way to filter rows. Rather than filtering rows from within a mapping, the Source Qualifier transformation filters rows when read from a source. The main difference is that the source qualifier limits the row set extracted from a source, while the Filter transformation limits the row set sent to a target. Since a source qualifier reduces the number of rows used throughout the mapping, it provides better performance.

However, the Source Qualifier transformation only lets you filter rows from relational sources, while the Filter transformation filters rows from any type of source. Also, note that since it runs in the database, you must make sure that the filter condition in the Source Qualifier transformation only uses standard SQL. The Filter transformation can define a condition using any statement or transformation function that returns either a TRUE or FALSE value.

Source Qualifier Transformation Complete Overview

Transformation type:

Active

Connected

For more information about Application Source Qualifier transformations, see the corresponding help topic for your PowerExchange product.

When you add a relational or a flat file source definition to a mapping, you need to connect it to a Source Qualifier transformation. The Source Qualifier transformation represents the rows that the Integration Service reads when it runs a session.

Use the Source Qualifier transformation to complete the following tasks:

Join data originating from the same source database. You can join two or more tables with primary key-foreign key relationships by linking the sources to one Source Qualifier transformation.

Filter rows when the Integration Service reads source data. If you include a filter condition, the Integration Service adds a WHERE clause to the default query.

Specify an outer join rather than the default inner join. If you include a user-defined join, the Integration Service replaces the join information specified by the metadata in the SQL query.

Specify sorted ports. If you specify a number for sorted ports, the Integration Service adds an ORDER BY clause to the default SQL query.

Select only distinct values from the source. If you choose Select Distinct, the Integration Service adds a SELECT DISTINCT statement to the default SQL query.

Create a custom query to issue a special SELECT statement for the Integration Service to read source data. For example, you might use a custom query to perform aggregate calculations.

Transformation Datatypes

The Source Qualifier transformation displays the transformation datatypes. The transformation datatypes determine how the source database binds data when the Integration Service reads it. Do not alter the datatypes in the Source Qualifier transformation. If the datatypes in the source definition and Source Qualifier transformation do not match, the Designer marks the mapping invalid when you save it.

Target Load Order

You specify a target load order based on the Source Qualifier transformations in a mapping. If you have multiple Source Qualifier transformations connected to multiple targets, you can designate the order in which the Integration Service loads data into the targets.

If one Source Qualifier transformation provides data for multiple targets, you can enable constraint-based loading in a session to have the Integration Service load data based on target table primary and foreign key relationships.

Datetime Values

When you use a datetime value or a datetime parameter or variable in the SQL query, change the date format to the format used in the source. The Integration Service passes datetime values to source systems as strings in the SQL query. The Integration Service converts a datetime value to a string, based on the source database.

The following table describes the datetime formats for each database type:


Source	Date Format
DB2	YYYY-MM-DD-HH24:MI:SS
Informix	YYYY-MM-DD HH24:MI:SS
Microsoft SQL Server	MM/DD/YYYY HH24:MI:SS
ODBC	YYYY-MM-DD HH24:MI:SS
Oracle	MM/DD/YYYY HH24:MI:SS
Sybase	MM/DD/YYYY HH24:MI:SS
Teradata	YYYY-MM-DD HH24:MI:SS

Some databases require you to identify datetime values with additional punctuation, such as single quotation marks or database specific functions. For example, to convert the $$$SessStartTime value for an Oracle source, use the following Oracle function in the SQL override:

to_date (‘$$$SessStartTime’, ‘mm/dd/yyyy hh24:mi:ss’)

For Informix, use the following Informix function in the SQL override to convert the $$$SessStartTime value:

DATETIME ($$$SessStartTime) YEAR TO SECOND

For information about database specific functions, see the database documentation.

Parameters and Variables

You can use parameters and variables in the SQL query, user-defined join, source filter, and pre- and post-session SQL commands of a Source Qualifier transformation. Use any parameter or variable type that you can define in the parameter file. You can enter a parameter or variable within the SQL statement, or you can use a parameter or variable as the SQL query. For example, you can use a session parameter, $ParamMyQuery, as the SQL query, and set $ParamMyQuery to the SQL statement in a parameter file.

The Integration Service first generates an SQL query and expands each parameter or variable. It replaces each mapping parameter, mapping variable, and workflow variable with its start value. Then it runs the query on the source database.

When you use a string mapping parameter or variable in the Source Qualifier transformation, use a string identifier appropriate to the source system. Most databases use a single quotation mark as a string identifier. For example, to use the string parameter $$IPAddress in a source filter for a Microsoft SQL Server database table, enclose the parameter in single quotes as follows: ‘$$IPAddress.’

When you use a datetime mapping parameter or variable, or when you use the built-in variable $$$SessStartTime, change the date format to the format used in the source. The Integration Service passes datetime values to source systems as strings in the SQL query.

Tip: To ensure the format of a datetime parameter or variable matches that used by the source, validate the SQL query.

Configure the following Source Qualifier transformation properties on the Properties tab:

Option

Description

SQL Query

Defines a custom query that replaces the default query the Integration Service uses to read data from sources represented in this Source Qualifier transformation. A custom query overrides entries for a custom join or a source filter.

User-Defined Join

Specifies the condition used to join data from multiple sources represented in the same Source Qualifier transformation.

Source Filter

Specifies the filter condition the Integration Service applies when querying rows.

Number of Sorted Ports

Indicates the number of columns used when sorting rows queried from relational sources. If you select this option, the Integration Service adds an ORDER BY to the default query when it reads source rows. The ORDER BY includes the number of ports specified, starting from the top of the transformation.
When selected, the database sort order must match the session sort order.

Tracing Level

Sets the amount of detail included in the session log when you run a session containing this transformation.

Select Distinct

Specifies if you want to select only unique rows. The Integration Service includes a SELECT DISTINCT statement if you choose this option.

Pre-SQL

Pre-session SQL commands to run against the source database before the Integration Service reads the source.

Post-SQL

Post-session SQL commands to run against the source database after the Integration Service writes to the target.

Output is Deterministic

Relational source or transformation output that does not change between session runs when the input data is consistent between runs. When you configure this property, the Integration Service does not stage source data for recovery if transformations in the pipeline always produce repeatable data.

Output is Repeatable

Relational source or transformation output that is in the same order between session runs when the order of the input data is consistent. When output is deterministic and output is repeatable, the Integration Service does not stage source data for recovery.

Warning: If you configure a transformation as repeatable and deterministic, it is your responsibility to ensure that the data is repeatable and deterministic. If you try to recover a session with transformations that do not produce the same data between the session and the recovery, the recovery process can result in corrupted data.

Option	Description
SQL Query	Defines a custom query that replaces the default query the Integration Service uses to read data from sources represented in this Source Qualifier transformation. A custom query overrides entries for a custom join or a source filter.
User-Defined Join	Specifies the condition used to join data from multiple sources represented in the same Source Qualifier transformation.
Source Filter	Specifies the filter condition the Integration Service applies when querying rows.
Number of Sorted Ports	Indicates the number of columns used when sorting rows queried from relational sources. If you select this option, the Integration Service adds an ORDER BY to the default query when it reads source rows. The ORDER BY includes the number of ports specified, starting from the top of the transformation. When selected, the database sort order must match the session sort order.
Tracing Level	Sets the amount of detail included in the session log when you run a session containing this transformation.
Select Distinct	Specifies if you want to select only unique rows. The Integration Service includes a SELECT DISTINCT statement if you choose this option.
Pre-SQL	Pre-session SQL commands to run against the source database before the Integration Service reads the source.
Post-SQL	Post-session SQL commands to run against the source database after the Integration Service writes to the target.
Output is Deterministic	Relational source or transformation output that does not change between session runs when the input data is consistent between runs. When you configure this property, the Integration Service does not stage source data for recovery if transformations in the pipeline always produce repeatable data.
Output is Repeatable	Relational source or transformation output that is in the same order between session runs when the order of the input data is consistent. When output is deterministic and output is repeatable, the Integration Service does not stage source data for recovery.

Default Query
For relational sources, the Integration Service generates a query for each Source Qualifier transformation when it runs a session. The default query is a SELECT statement for each source column used in the mapping. In other words, the Integration Service reads only the columns that are connected to another transformation.

Although there are many columns in the source definition, only three columns are connected to another transformation. In this case, the Integration Service generates a default query that selects only those three columns:
SELECT CUSTOMERS.CUSTOMER_ID, CUSTOMERS.COMPANY, CUSTOMERS.FIRST_NAME
FROM CUSTOMERS

If any table name or column name contains a database reserved word, you can create and maintain a file, reswords.txt, containing reserved words. When the Integration Service initializes a session, it searches for reswords.txt in the Integration Service installation directory. If the file exists, the Integration Service places quotes around matching reserved words when it executes SQL against the database. If you override the SQL, you must enclose any reserved word in quotes.
When generating the default query, the Designer delimits table and field names containing the following characters with double quotes:
/ + - = ~ ` ! % ^ & * ( ) [ ] { } ' ; ? , < > \ | <space>
Viewing the Default Query
You can view the default query in the Source Qualifier transformation.
To view the default query:

1.

From the Properties tab, select SQL Query.

The SQL Editor displays the default query the Integration Service uses to select source data.

2.

Click Generate SQL.

3.

Click Cancel to exit.

Note: If you do not cancel the SQL query, the Integration Service overrides the default query with the custom SQL query.
Do not connect to the source database. You only connect to the source database when you enter an SQL query that overrides the default query.
You must connect the columns in the Source Qualifier transformation to another transformation or target before you can generate the default query.
Overriding the Default Query
You can alter or override the default query in the Source Qualifier transformation by changing the default settings of the transformation properties. Do not change the list of selected ports or the order in which they appear in the query. This list must match the connected transformation output ports.
When you edit transformation properties, the Source Qualifier transformation includes these settings in the default query. However, if you enter an SQL query, the Integration Service uses only the defined SQL statement. The SQL Query overrides the User-Defined Join, Source Filter, Number of Sorted Ports, and Select Distinct settings in the Source Qualifier transformation.
Note: When you override the default SQL query, you must enclose all database reserved words in quotes.
Joining Source Data
Use one Source Qualifier transformation to join data from multiple relational tables. These tables must be accessible from the same instance or database server.
When a mapping uses related relational sources, you can join both sources in one Source Qualifier transformation. During the session, the source database performs the join before passing data to the Integration Service. This can increase performance when source tables are indexed.
Tip: Use the Joiner transformation for heterogeneous sources and to join flat files.
Default Join
When you join related tables in one Source Qualifier transformation, the Integration Service joins the tables based on the related keys in each table.
This default join is an inner equijoin, using the following syntax in the WHERE clause:
Source1.column_name = Source2.column_name
The columns in the default join must have:

A primary key-foreign key relationship

Matching datatypes

For example, you might see all the orders for the month, including order number, order amount, and customer name. The ORDERS table includes the order number and amount of each order, but not the customer name. To include the customer name, you need to join the ORDERS and CUSTOMERS tables. Both tables include a customer ID, so you can join the tables in one Source Qualifier transformation.

When you include multiple tables, the Integration Service generates a SELECT statement for all columns used in the mapping. In this case, the SELECT statement looks similar to the following statement:
SELECT CUSTOMERS.CUSTOMER_ID, CUSTOMERS.COMPANY, CUSTOMERS.FIRST_NAME, CUSTOMERS.LAST_NAME, CUSTOMERS.ADDRESS1, CUSTOMERS.ADDRESS2, CUSTOMERS.CITY, CUSTOMERS.STATE, CUSTOMERS.POSTAL_CODE, CUSTOMERS.PHONE, CUSTOMERS.EMAIL, ORDERS.ORDER_ID, ORDERS.DATE_ENTERED, ORDERS.DATE_PROMISED, ORDERS.DATE_SHIPPED, ORDERS.EMPLOYEE_ID, ORDERS.CUSTOMER_ID, ORDERS.SALES_TAX_RATE, ORDERS.STORE_ID
FROM CUSTOMERS, ORDERS
WHERE CUSTOMERS.CUSTOMER_ID=ORDERS.CUSTOMER_ID
The WHERE clause is an equijoin that includes the CUSTOMER_ID from the ORDERS and CUSTOMER tables.
Custom Joins
If you need to override the default join, you can enter contents of the WHERE clause that specifies the join in the custom query. If the query performs an outer join, the Integration Service may insert the join syntax in the WHERE clause or the FROM clause, depending on the database syntax.
You might need to override the default join under the following circumstances:

Columns do not have a primary key-foreign key relationship.

The datatypes of columns used for the join do not match.

You want to specify a different type of join, such as an outer join.

Heterogeneous Joins
To perform a heterogeneous join, use the Joiner transformation. Use the Joiner transformation when you need to join the following types of sources:

Join data from different source databases

Join data from different flat file systems

Join relational sources and flat files

Creating Key Relationships
You can join tables in the Source Qualifier transformation if the tables have primary key-foreign key relationships. However, you can create primary key-foreign key relationships in the Source Analyzer by linking matching columns in different tables. These columns do not have to be keys, but they should be included in the index for each table.
Tip: If the source table has more than 1,000 rows, you can increase performance by indexing the primary key-foreign keys. If the source table has fewer than 1,000 rows, you might decrease performance if you index the primary key-foreign keys.
For example, the corporate office for a retail chain wants to extract payments received based on orders. The ORDERS and PAYMENTS tables do not share primary and foreign keys. Both tables, however, include a DATE_SHIPPED column. You can create a primary key-foreign key relationship in the metadata in the Source Analyzer.
Note, the two tables are not linked. Therefore, the Designer does not recognize the relationship on the DATE_SHIPPED columns.
You create a relationship between the ORDERS and PAYMENTS tables by linking the DATE_SHIPPED columns. The Designer adds primary and foreign keys to the DATE_SHIPPED columns in the ORDERS and PAYMENTS table definitions.
The following figure shows a relationship between two tables:

If you do not connect the columns, the Designer does not recognize the relationships.
The primary key-foreign key relationships exist in the metadata only. You do not need to generate SQL or alter the source tables.
Once the key relationships exist, use a Source Qualifier transformation to join the two tables. The default join is based on DATE_SHIPPED.

Adding an SQL Query

The Source Qualifier transformation provides the SQL Query option to override the default query. You can enter an SQL statement supported by the source database. Before entering the query, connect all the input and output ports you want to use in the mapping.

When you edit the SQL Query, you can generate and edit the default query. When the Designer generates the default query, it incorporates all other configured options, such as a filter or number of sorted ports. The resulting query overrides all other options you might subsequently configure in the transformation.

You can use a parameter or variable as the SQL query or include parameters and variables within the query. When including a string mapping parameter or variable, use a string identifier appropriate to the source system. For most databases, you need to enclose the name of a string parameter or variable in single quotes.

When you include a datetime value or a datetime mapping parameter or variable in the SQL query, change the date format to match the format used by the source. The Integration Service converts a datetime value to a string based on the source system. For more information about date conversion, see Datetime Values.

When creating a custom SQL query, the SELECT statement must list the port names in the order in which they appear in the transformation.

When you override the default SQL query for a session configured for pushdown optimization, the Integration Service creates a view to represent the SQL override. It then runs an SQL query against this view to push the transformation logic to the database.

If you edit the SQL query, you must enclose all database reserved words in quotes.

To override the default query:

1.	Open the Source Qualifier transformation, and click the Properties tab.

2.	Click the Open button in the SQL Query field.

The SQL Editor dialog box appears.

3.	Click Generate SQL.

The Designer displays the default query it generates when querying rows from all sources included in the Source Qualifier transformation.

4.	Enter a query in the space where the default query appears.

Every column name must be qualified by the name of the table, view, or synonym in which it appears. For example, if you want to include the ORDER_ID column from the ORDERS table, enter ORDERS.ORDER_ID. You can double-click column names appearing in the Ports window to avoid typing the name of every column.

You can use a parameter or variable as the query, or you can include parameters and variables in the query.

Enclose string mapping parameters and variables in string identifiers. Alter the date format for datetime mapping parameters and variables when necessary.

5.	Select the ODBC data source containing the sources included in the query.

6.	Enter the user name and password to connect to this database.

7.	Click Validate.

The Designer runs the query and reports whether its syntax was correct.

8.	Click OK to return to the Edit Transformations dialog box. Click OK again to return to the Designer.

Tip: You can resize the Expression Editor. Expand the dialog box by dragging from the borders. The Designer saves the new size for the dialog box as a client setting.

Entering a User-Defined Join

Entering a user-defined join is similar to entering a custom SQL query. However, you only enter the contents of the WHERE clause, not the entire query. When you perform an outer join, the Integration Service may insert the join syntax in the WHERE clause or the FROM clause of the query, depending on the database syntax.

When you add a user-defined join, the Source Qualifier transformation includes the setting in the default SQL query. However, if you modify the default query after adding a user-defined join, the Integration Service uses only the query defined in the SQL Query property of the Source Qualifier transformation.

You can use a parameter or variable as the user-defined join or include parameters and variables within the join. When including a string mapping parameter or variable, use a string identifier appropriate to the source system. For most databases, you need to enclose the name of a string parameter or variable in single quotes.

When you include a datetime parameter or variable, you might need to change the date format to match the format used by the source. The Integration Service converts a datetime parameter and variable to a string based on the source system. For more information about automatic date conversion, see Datetime Values.

To create a user-defined join:

1.	Create a Source Qualifier transformation containing data from multiple sources or associated sources.

2.	Open the Source Qualifier transformation, and click the Properties tab.

3.	Click the Open button in the User Defined Join field.

The SQL Editor dialog box appears.

4.	Enter the syntax for the join.

Do not enter the keyword WHERE at the beginning of the join. The Integration Service adds this keyword when it queries rows.

Enclose string mapping parameters and variables in string identifiers. Alter the date format for datetime mapping parameters and variables when necessary.

Click OK to return to the Edit Transformations dialog box, and then click OK to return to the Designer.

Using Sorted Ports
When you use sorted ports, the Integration Service adds the ports to the ORDER BY clause in the default query. The Integration Service adds the configured number of ports, starting at the top of the Source Qualifier transformation. You might use sorted ports to improve performance when you include any of the following transformations in a mapping:

Aggregator. When you configure an Aggregator transformation for sorted input, you can send sorted data by using sorted ports. The group by ports in the Aggregator transformation must match the order of the sorted ports in the Source Qualifier transformation.

Joiner. When you configure a Joiner transformation for sorted input, you can send sorted data by using sorted ports. Configure the order of the sorted ports the same in each Source Qualifier transformation.

Note: You can also use the Sorter transformation to sort relational and flat file data before Aggregator and Joiner transformations.
Use sorted ports for relational sources only. When using sorted ports, the sort order of the source database must match the sort order configured for the session. The Integration Service creates the SQL query used to extract source data, including the ORDER BY clause for sorted ports. The database server performs the query and passes the resulting data to the Integration Service. To ensure data is sorted as the Integration Service requires, the database sort order must be the same as the user-defined session sort order.
When you configure the Integration Service for data code page validation and run a workflow in Unicode data movement mode, the Integration Service uses the selected sort order to sort character data.
When you configure the Integration Service for relaxed data code page validation, the Integration Service uses the selected sort order to sort all character data that falls in the language range of the selected sort order. The Integration Service sorts all character data outside the language range of the selected sort order according to standard Unicode sort ordering.
When the Integration Service runs in ASCII mode, it ignores this setting and sorts all character data using a binary sort order. The default sort order depends on the code page of the Integration Service.
The Source Qualifier transformation includes the number of sorted ports in the default SQL query. However, if you modify the default query after choosing the Number of Sorted Ports, the Integration Service uses only the query defined in the SQL Query property.
To use sorted ports:

1.

In the Mapping Designer, open a Source Qualifier transformation, and click the Properties tab.

2.

Click in Number of Sorted Ports and enter the number of ports you want to sort.

The Integration Service adds the configured number of columns to an ORDER BY clause, starting from the top of the Source Qualifier transformation.
The source database sort order must correspond to the session sort order.
Tip: Sybase supports a maximum of 16 columns in an ORDER BY clause. If the source is Sybase, do not sort more than 16 columns.

3.

Click OK.

Select Distinct
If you want the Integration Service to select unique values from a source, use the Select Distinct option. You might use this feature to extract unique customer IDs from a table listing total sales. Using Select Distinct filters out unnecessary data earlier in the data flow, which might improve performance.
By default, the Designer generates a SELECT statement. If you choose Select Distinct, the Source Qualifier transformation includes the setting in the default SQL query.
For example, in the Source Qualifier transformation in Joining Source Data, you enable the Select Distinct option. The Designer adds SELECT DISTINCT to the default query as follows:
SELECT DISTINCT CUSTOMERS.CUSTOMER_ID, CUSTOMERS.COMPANY, CUSTOMERS.FIRST_NAME, CUSTOMERS.LAST_NAME, CUSTOMERS.ADDRESS1, CUSTOMERS.ADDRESS2, CUSTOMERS.CITY, CUSTOMERS.STATE, CUSTOMERS.POSTAL_CODE, CUSTOMERS.EMAIL, ORDERS.ORDER_ID, ORDERS.DATE_ENTERED, ORDERS.DATE_PROMISED, ORDERS.DATE_SHIPPED, ORDERS.EMPLOYEE_ID, ORDERS.CUSTOMER_ID, ORDERS.SALES_TAX_RATE, ORDERS.STORE_ID
FROM
CUSTOMERS, ORDERS
WHERE
CUSTOMERS.CUSTOMER_ID=ORDERS.CUSTOMER_ID
However, if you modify the default query after choosing Select Distinct, the Integration Service uses only the query defined in the SQL Query property. In other words, the SQL Query overrides the Select Distinct setting.
To use Select Distinct:

1.

Open the Source Qualifier transformation in the mapping, and click on the Properties tab.

2.

Check Select Distinct, and Click OK.

Overriding Select Distinct in the Session
You can override the transformation level option to Select Distinct when you configure the session in the Workflow Manager.
To override the Select Distinct option:

1.

In the Workflow Manager, open the Session task, and click the Mapping tab.

2.

Click the Transformations view, and click the Source Qualifier transformation under the Sources node.

3.

In the Properties settings, enable Select Distinct, and click OK.

Adding Pre- and Post-Session SQL Commands

You can add pre- and post-session SQL commands on the Properties tab in the Source Qualifier transformation. You might want to use pre-session SQL to write a timestamp row to the source table when a session begins.

The Integration Service runs pre-session SQL commands against the source database before it reads the source. It runs post-session SQL commands against the source database after it writes to the target.

You can override the SQL commands in the Transformations view on the Mapping tab in the session properties. You can also configure the Integration Service to stop or continue when it encounters errors running pre- or post-session SQL commands.

Use the following guidelines when you enter pre- and post-session SQL commands in the Source Qualifier transformation:

Use any command that is valid for the database type. However, the Integration Service does not allow nested comments, even though the database might.

You can use parameters and variables in source pre- and post-session SQL commands, or you can use a parameter or variable as the command. Use any parameter or variable type that you can define in the parameter file.

Use a semicolon (;) to separate multiple statements. The Integration Service issues a commit after each statement.

The Integration Service ignores semicolons within /*...*/.

If you need to use a semicolon outside of comments, you can escape it with a backslash (\). When you escape the semicolon, the Integration Service ignores the backslash, and it does not use the semicolon as a statement separator.

The Designer does not validate the SQL.

Note: You can also enter pre- and post-session SQL commands on the Properties tab of the target instance in a mapping.

Saturday, May 28, 2011

Interview Question&ANSWERS-2

1.What is the advantages of converting stored procedures into Informatica mappings?
Ans:-
Informatica is intermediate tool which talks to database. If you use the stored proc , the informatica needs to talks to Database for each record since Informatica process record by record. Making a call to database every time is costly.It is good to avoid stored proc as much as possible unless it is very necessary.If you write the same code in informatica, the informatica server will execute the code.

2.How to list Top 10 salary, without using Rank Transmission?
Ans:-
BY USING SORTER TRANSFORMATION USING SORTED PORT AS SAL AND FILTER TRANSFORMATION TO GET FIRST 10 RECORDS

3.I want load the data into target with out 1st and last record.

BY USING FIRST FUNCTION, TO GET THE 1ST RECORD,TO GET THE LAST RECORD ,

SIMPLY USEAGGREGATE TRANSFORAMATION IN THIS DO NOT APPLY GROUP BY FUNCTION

BY DEFALUT IT WILL GENERATE LAST RECORD

OR

USE RANK TRANS

IN =>PROPERTIES  NO OF TOP RANKS:1

NO OF BOTTOM  RANKS:LAST RECORD

4. can u apply SCD2 on flat file tgt? if yes wat is the procedure?

No, you can't apply SCD2 on flat files.

5. we have 6 records in source , i need 2nd record in one target and 5th record in one target or 2nd & 5th record in same target.

Another way is to use rank t/r and load data to target by

filtering 2 and 5 rows.

or

We can use Router t/r also.

create as many conditions u want.

say, if rownum=2 then flow the data to target 1

if rownum=5 then flow to another target

6. Re: If u r giving different parameter file names in workflow and session for a single mapping parameter, then which one will be executed? The one u given in workflow or the one u given in the session?

workflow parameter file

7. How to update records in Target, without using Update Strategy?

YES WITHOUT A pk WE CANT UPDATE THE REC u can write sql query at targetlevel.

in target table properties there isa option updateoverride.

IN SESSION PROPERTIES WE HAVE ONE OPTION TREAT ALL ROWS AS... THERE U SELECT

UPDATE OPTION AND IN THE TARGET PROPERTIES SELECT UPDTAE OPTION.

8. A TABLE CONTAINS SOME NULL VALUES . HOW TO GET (NOT APPLICABLE(NA))

IN PLACE OF THAT NULL VALUE IN TARGET .?

in a ports tab select particular column write N/A

in the Default value text box for the particular column

9. How to load time dimension?

Run the procedure to load the ttime dimension.Its not loaded frequently,but

once or twice a year.

10. What is the main data object present inbetween sourceand target.

The main data object present inbetween source and target is staging layer only,

Staging layer will do eliminate the inconsistency data and gives the result data

object

11.Can we use unconnected lookup as dynamic lookup?

NO

Unconnected lookup will return one port only. But dynamiclookup will return

more than one port and it update andinsert the targt while session runs.

12. LOOKUP Condition is nothing but a Join condition? What type of join

condition it,by default ? Using the LookUP Condition How many types of

relational conditionswe can make ?

lkp is always behave like left outer join..it ll give you all matched

records as well as ummatched records which is not present in base table...and

those unmatched records is ll be null in case of unconnected lkp trn.....

13.  have two flat files.. containing same type of data i want to load it

to dwh..how many source qualifires i need

If the 2 flat files have the same structure,then we can go for filelist

concept ininformatica.

only one source qualifier is needed and the source should be either of the

flat files.

14.when will we use unconnected & connected lookup? Howit will effect on the

performance of mapping?

Connected Lookup

Receives input values directly from the pipeline.

We can use a dynamic or static cache

Supports user-defined default values

Unconnected Lookup

Receives input values from the result of a :LKP expression

in another transformation.

We can use a static cache

Does not support user-defined default values

When you compared both basically connected lookup will return more values

and unconnected returns one value. conn lookup is in the same pipeline of

source and it

will accept dynamic caching. Unconn lookup don't have that faclity but

in some special cases we can use Unconnected. if output of one lookup is going

as input of another lookup this unconnected lookups are favourableMoreover if the

mapping requires multiple lookup using samelookup conditions then its better

to use unconnected lookup& call the lookup wherever required.This also

prevents the mapping from being complex.

15. how do u use sequence created in oracle in informatica?

 Explain with an simple example

Using Stored Porcedure Transformation we can call the Sequence Generator

16. in which situations do u go for sequence generator ?

In the Fallowing Situations we use Sequence Generator

Transformation

1)Creating Primary Key Values

2)Replacing The Missing Key Values

3)When You Apply the Cycle Through a Sequential Range Of

Numbers

17. write a query for how to eliminate the duplicate rows

without using distinct?

DELETE FROM EMP WHERE ROWID NOT IN (SELECT MAX(ROWID)FROM

EMP)GROUP BY EMPNO;

18. source is a flat file empname, empno, sal ram, 101, 1,000 sam, 102, 2,000 ques: my target needs the data to be loaded as sal -1000 and 2000 excluding commas target empname, empno, sal ram, 101, 1000 sam, 102, 2000 how to implement this?

OPEN THE SOURCE FILE-->EDIT-->REPLACE ALL , to NULL.

USE FIXED WITH LENGTH WHILE IMPORTING TO SOURCE AND .....

HOPE IT WILL HELP

19. wht is cdc?how to use it in creation of mappings?

CDC stands for change data capture

CDC defines Whenever data is changed in OLTP Systems Only

that data Will be captured and loaded into our Target

SCD's works internally based on cdc logic

basically cdc's implented by using effective date

20. how can we store previous session logs

Go to Session-->right click -->Select Edit Task then Goto -->Config Object
then set the property

Save Session Log By --Runs

Save Session Log for These Runs --->To Number of Historical Session logs you want

21.How to read rejected data or bad data from bad file and reload it to target?

Correction the rejected data and send to target relational tables using loadorder utility. Find out the rejected data by using column indicatior and row indicator.

21. In my source table 1000 rec's are there. I want to load 501 rec to 1000 rec into my Target table ?

select * from tab_name where rownum<=1000

minus

select * from tab_name where rownum<=500;

hi connect a sequence generator next to source qualifier into the expression.Use the filter trans to filter rows like where nextval>500.

22. I have a cobol program with a sub program. How ca i find that it is a dynamic call? or static call..?

1st u check the stmt call 'sub-name' or call 'ws-name'...

if it is call 'sub-name' then it is STATIC CALL

OR

If it is call 'ws-name' then it is DYNAMIC CALL.

Labels

Sunday, May 29, 2011

Filter Transformation complete Overview

Source Qualifier Transformation Complete Overview

Transformation Datatypes

Target Load Order

Datetime Values

Parameters and Variables

Default Query

Viewing the Default Query

Overriding the Default Query

Joining Source Data

Default Join

Custom Joins

Heterogeneous Joins

Creating Key Relationships

Using Sorted Ports

Select Distinct

Overriding Select Distinct in the Session