Sunday, May 22, 2011

Q7. How can we implement an aggregation operation without using an Aggregator transformation in Informatica?

Ans.
We will use a basic property of the Expression transformation: while it processes the current row, it can also access data held over from the previous row. All we need are simple Sorter, Expression and Filter transformations to achieve aggregation at the Informatica level.

General Idea of Aggregation without Aggregator Transformation

Let us take an example. Suppose we want to find the sum of SALARY for each department in the employee table. The SQL query for this would be:
SELECT DEPTNO,SUM(SALARY) FROM EMP_SRC GROUP BY DEPTNO;
If we need to implement this in Informatica, it is very easy, as we would obviously go for an Aggregator transformation. By marking the DEPTNO port as GROUP BY and defining one output port as SUM(SALARY), the problem is solved.
Now the trick is to use only Expression transformations to achieve the functionality of the Aggregator transformation. We rely on the fundamental ability of the Expression transformation to hold the value of an attribute from the previous row.
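To make the previous-row idea concrete, here is a minimal Python sketch of the logic. It is not Informatica code: the function name `running_salary` and the sample rows are made up for illustration, and the two local variables only stand in for what variable ports would hold between rows. It assumes the input is already sorted by DEPTNO.

```python
from typing import Iterable, Iterator, Optional, Tuple

Row = Tuple[int, float]  # (DEPTNO, SALARY); hypothetical sample layout


def running_salary(rows: Iterable[Row]) -> Iterator[Tuple[int, float, float]]:
    """Yield (DEPTNO, SALARY, running sum) for rows sorted by DEPTNO."""
    prev_deptno: Optional[int] = None  # value remembered from the previous row
    running = 0.0                      # running total for the current DEPTNO
    for deptno, salary in rows:
        if deptno == prev_deptno:
            running += salary          # same department: keep accumulating
        else:
            running = salary           # new department: restart the total
        prev_deptno = deptno           # remember this row for the next one
        yield deptno, salary, running


# Illustrative data only, already sorted by DEPTNO.
rows = [(10, 1000.0), (10, 2500.0), (20, 800.0), (20, 1200.0), (20, 3000.0)]
result = list(running_salary(rows))
for row in result:
    print(row)
```

Each row is handled as it arrives, and the last row of every department carries that department's full sum. The remaining work is to keep only that row per department.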

But wait... why would we do this? Aren't we complicating the thing here?

Yes, we are. But in many cases it can bring a performance benefit, especially if the input is already sorted or you know the incoming data will not violate the order (for example, you load daily data and want it ordered by day). Remember that Informatica holds the rows in the Aggregator cache for the aggregation operation. This costs time and cache space, and it also interrupts the normal row-by-row processing in Informatica. By replacing the Aggregator with an Expression, we reduce the cache space requirement and keep row-by-row processing flowing. The mapping below shows how to do this.
Mapping for Aggregation with Expression and Sorter only

Image: Aggregation with Expression and Sorter 1

Sorter (SRT_SAL) Ports Tab

I am showing a Sorter here just to illustrate the concept. If the data already arrives sorted from the source, you do not need it, which increases the performance benefit further.

Expression (EXP_SAL) Ports Tab
Image: Expression Ports Tab Properties

Sorter (SRT_SAL1) Ports Tab


Expression (EXP_SAL2) Ports Tab


Filter (FIL_SAL) Properties Tab
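The five transformations named above can be sketched end to end in Python. This is only an illustration of one plausible way the steps fit together, not a reproduction of the port definitions in the screenshots. The function name `aggregate_without_aggregator`, the sequence number `seq`, the first-row flag, and the sample data are all assumptions introduced here.

```python
from operator import itemgetter
from typing import List, Optional, Tuple


def aggregate_without_aggregator(
    rows: List[Tuple[int, float]]
) -> List[Tuple[int, float]]:
    """Return (DEPTNO, SUM(SALARY)) using sort, running sum, re-sort, flag, filter."""
    # SRT_SAL: sort the input by DEPTNO.
    sorted_rows = sorted(rows, key=itemgetter(0))

    # EXP_SAL: running sum per DEPTNO, remembering the previous row's DEPTNO.
    # seq is an assumed helper column recording the arrival order.
    with_running = []
    prev_deptno: Optional[int] = None
    running = 0.0
    for seq, (deptno, salary) in enumerate(sorted_rows):
        running = running + salary if deptno == prev_deptno else salary
        prev_deptno = deptno
        with_running.append((deptno, seq, running))

    # SRT_SAL1: sort by DEPTNO, latest row first, so the complete
    # total is the first row of each department.
    resorted = sorted(with_running, key=lambda r: (r[0], -r[1]))

    # EXP_SAL2: flag the first row of each DEPTNO (again via previous-row compare).
    flagged = []
    prev_deptno = None
    for deptno, _seq, total in resorted:
        flagged.append((deptno, total, deptno != prev_deptno))
        prev_deptno = deptno

    # FIL_SAL: pass only the flagged rows.
    return [(deptno, total) for deptno, total, is_first in flagged if is_first]


# Illustrative, unsorted sample standing in for EMP_SRC.
emp_src = [(20, 800.0), (10, 1000.0), (20, 1200.0), (10, 2500.0), (20, 3000.0)]
totals = aggregate_without_aggregator(emp_src)
print(totals)
```

The result matches what the GROUP BY query would return for the same rows. Sorting on an arrival sequence rather than on the running sum is a choice made in this sketch so that it still works if a salary value is zero or negative.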
