Thursday, February 28, 2013

Multiplying rows with Java transformation

Problem:


One of the lacking features of PowerCenter is a standard component allowing to multiply rows on some condition. Let's say for an input row containing a list of IDs you need to transform it to one row per request, e.g.:
First, let's take a look at Java Transformation. The solution is discussed below.

Overview:


The PowerCenter Client uses the Java Development Kit (JDK) to compile the Java code and generate byte code for the transformation. When you run a session with a Java transformation, the Integration Service uses the Java Runtime Environment (JRE) to execute the byte code and process input rows and generate output rows. A Java Transformation can be active or passive. The type is set when a transformation is created and it cannot be changed later.

A Java Transformation always has one input group and one output group:


A Java Transformation can have input ports, output ports and input/output ports (treated as pass-through ports):

The ports can also be added to a Java transformation by dragging and dropping ports from another transformation (they are created as input/output ports):

The default code window will look like this:


Here are available methods desctibed:

Method Description
commit generates a transaction
failSession throws an exception with an error message and fails the session
generateRow generates an output row for active Java transformations
getInRowType returns the input type of the current row in the transformation
incrementErrorCount increases the error count for the session
isNull checks the value of an input column for NULL
logError writes an error message to the session log
logInfo writes an informational message to the session log
rollback generates a rollback transaction
setNull sets the value of an output column to NULL
setOutRowType sets the update strategy for output rows

All you need to add is basically like this:


for (int i = 1; i <= in_COUNT; i++) {
    generateRow();
}

Of course in_COUNT should contain the pre-calculated number of output ports needed.

Solution:


So, how can we achieve the list of RequestIds as stated in the beginning? Here's the overview of the key part of the mapping:

First, let's prepare some data for the transformation. In this example I've used a dummy variable port as the data source - the incoming data will be clearly visible:


Two output ports are being prepared:

  • RequestIdList (123,456,789,)
  • RequestCount (3)
Note the extra comma in the end. It will come in handy later. Than, there's the Java transformation with the following ports:


The RequestIdList and the sequence id will be needed downstream to extract appropriate substring. Here's the code:

RequestSeq values are created here. It is used in the last expression:

RequestId port will have the appropriate id extracted from the list. Here's the ugly code for that:

IIF(RequestSeq=1,
SUBSTR(RequestIdList,
RequestSeq,
INSTR(RequestIdList,',',RequestSeq-1,RequestSeq)
-1),
SUBSTR(RequestIdList, 
INSTR(RequestIdList,',',RequestSeq,RequestSeq-1)+1,
INSTR(RequestIdList,',',RequestSeq,RequestSeq)-INSTR(RequestIdList,',',RequestSeq,RequestSeq-1)-1
)
)

Each time a new sequence is started, substring from first character to the first occurence of separator is taken. Later on tere is a calculation to determine the index of separator and the length the next separator. This is where adding the comma (which in this case is the separator) is handy - for the last substring the formula doesn't change.

That's it! But... there is a nicer way to achieve it too.

Alternate solution:


As splitting the list is not convenient, we can use the already created Java transformation to handle this part as well. All we need in this case is the input string and the number of Ids. Here's the modified transformation:


Here's the code for reference:


String[] temp;
String delimiter = ",";
temp = RequestIdList.split(delimiter);

for (int i = 0; i < RequestCount; i++) {
    RequestSeq=i;
    Request= temp[i];
    generateRow();
}

Or, even simplier (without any need for pre-calculating the count):


String[] temp;
String delimiter = ",";
temp = RequestIdList.split(delimiter);

for (int i = 0; i < temp.lenght; i++) {
    RequestSeq=i;
    Request= temp[i];
    generateRow();
}



This solution is by far simpllier to implement and maintain.

1 comment:

  1. Another nice explanation with an example of splitting string to a set of strings: http://blogs.hexaware.com/informatica-way/java-transformation

    ReplyDelete