Thursday, February 28, 2013

Multiplying rows with Java transformation

Problem:


One feature PowerCenter lacks is a standard component for multiplying rows based on some condition. Say you have an input row containing a list of IDs and need to transform it into one row per request, e.g.:
First, let's take a look at the Java Transformation. The solution is discussed below.

Overview:


The PowerCenter Client uses the Java Development Kit (JDK) to compile the Java code and generate byte code for the transformation. When you run a session with a Java transformation, the Integration Service uses the Java Runtime Environment (JRE) to execute the byte code and process input rows and generate output rows. A Java Transformation can be active or passive. The type is set when a transformation is created and it cannot be changed later.

A Java Transformation always has one input group and one output group:


A Java Transformation can have input ports, output ports and input/output ports (treated as pass-through ports):

The ports can also be added to a Java transformation by dragging and dropping ports from another transformation (they are created as input/output ports):

The default code window will look like this:


Here are the available methods:

  • commit - generates a transaction
  • failSession - throws an exception with an error message and fails the session
  • generateRow - generates an output row (active Java transformations only)
  • getInRowType - returns the input type of the current row in the transformation
  • incrementErrorCount - increases the error count for the session
  • isNull - checks the value of an input column for NULL
  • logError - writes an error message to the session log
  • logInfo - writes an informational message to the session log
  • rollback - generates a rollback transaction
  • setNull - sets the value of an output column to NULL
  • setOutRowType - sets the update strategy for output rows

All you basically need to add is something like this:


for (int i = 1; i <= in_COUNT; i++) {
    generateRow();
}

Of course, in_COUNT should contain the pre-calculated number of output rows needed.
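The effect of this loop can be illustrated outside PowerCenter. The sketch below is a plain-Java simulation, not the real transformation API: the hypothetical multiply method stands in for the code block above, collecting one copy of the row per iteration where the real code would call generateRow().

```java
import java.util.ArrayList;
import java.util.List;

public class RowMultiplier {

    // Plain-Java stand-in for the loop above: returns 'count' copies
    // of the input row, where the real transformation would call
    // generateRow() once per iteration instead.
    static List<String> multiply(String row, int count) {
        List<String> out = new ArrayList<>();
        for (int i = 1; i <= count; i++) {
            out.add(row);
        }
        return out;
    }

    public static void main(String[] args) {
        // One input row fans out into three output rows.
        System.out.println(multiply("123,456,789,", 3).size()); // prints 3
    }
}
```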

Solution:


So, how can we achieve the list of RequestIds as stated in the beginning? Here's the overview of the key part of the mapping:

First, let's prepare some data for the transformation. In this example I've used a dummy variable port as the data source - the incoming data will be clearly visible:


Two output ports are being prepared:

  • RequestIdList (123,456,789,)
  • RequestCount (3)
Note the extra comma at the end. It will come in handy later. Then there's the Java transformation with the following ports:


The RequestIdList and the sequence id will be needed downstream to extract appropriate substring. Here's the code:

The RequestSeq values are created here; they are used in the last expression:

The RequestId port will hold the appropriate id extracted from the list. Here's the (admittedly ugly) code for that:

IIF(RequestSeq=1,
SUBSTR(RequestIdList,
RequestSeq,
INSTR(RequestIdList,',',RequestSeq-1,RequestSeq)
-1),
SUBSTR(RequestIdList, 
INSTR(RequestIdList,',',RequestSeq,RequestSeq-1)+1,
INSTR(RequestIdList,',',RequestSeq,RequestSeq)-INSTR(RequestIdList,',',RequestSeq,RequestSeq-1)-1
)
)

Each time a new sequence is started, the substring from the first character up to the first occurrence of the separator is taken. For subsequent values, there is a calculation to determine the index of the separator and the position of the next separator. This is where adding the trailing comma (which in this case is the separator) is handy - for the last substring the formula doesn't change.
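The same extraction logic is easier to follow in plain Java. This is only an illustration of the algorithm the expression implements; the nthId name and the 1-based index are my own choices, not anything from PowerCenter:

```java
public class ListSplitter {

    // Extracts the n-th (1-based) id from a comma-terminated list
    // such as "123,456,789,". Thanks to the trailing comma, the same
    // "find the next separator" logic also works for the last element.
    static String nthId(String list, int n) {
        int start = 0;
        for (int i = 1; i < n; i++) {
            // skip the first n-1 separators
            start = list.indexOf(',', start) + 1;
        }
        int end = list.indexOf(',', start);
        return list.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(nthId("123,456,789,", 2)); // prints 456
    }
}
```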

That's it! But... there is a nicer way to achieve it too.

Alternate solution:


As splitting the list is not convenient, we can use the already created Java transformation to handle this part as well. All we need in this case is the input string and the number of Ids. Here's the modified transformation:


Here's the code for reference:


String[] temp;
String delimiter = ",";
temp = RequestIdList.split(delimiter);

for (int i = 0; i < RequestCount; i++) {
    RequestSeq = i;
    Request = temp[i];
    generateRow();
}

Or, even simpler (without any need for pre-calculating the count):


String[] temp;
String delimiter = ",";
temp = RequestIdList.split(delimiter);

for (int i = 0; i < temp.length; i++) {
    RequestSeq = i;
    Request = temp[i];
    generateRow();
}
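One detail worth knowing here: String.split drops trailing empty strings, so the extra comma at the end of the list does not produce an empty fourth element. A quick standalone check:

```java
public class SplitDemo {
    public static void main(String[] args) {
        String[] temp = "123,456,789,".split(",");
        // split() removes trailing empty strings by default,
        // so the trailing comma yields no extra element.
        System.out.println(temp.length); // prints 3
        System.out.println(temp[2]);     // prints 789
    }
}
```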



This solution is by far simpler to implement and maintain.

Trigger as a root cause of ODBC error

Scenario:
One table fed from two different real-time sources. The requirement is to keep only the latest version of a given row in the target.

Solution:
Two real-time workflows with one target with an instead-of trigger to take care of insert-and-updates.

Error:
Code: WRT_8229
Message: FnName: Execute -- [Informatica][ODBC SQL Server Wire Protocol driver]1

Additional symptoms:
All rows were reported as "Rejected", with row details logged in the session log. However - and this was the big surprise - all rows had been properly delivered to the database, with the insert/update logic correctly executed!

Cause:
As part of the trigger definition there was a statement to display the affected row number:
print @@rowcount
The message was treated by the ODBC driver as a low-severity error. And since all rows were processed one by one, for each and every one of them the appropriate message ("1") was returned to ODBC, which added the FnName: Execute -- [Informatica][ODBC SQL Server Wire Protocol driver] information, resulting in the above error.

After the "print @@rowcount" statement was removed, all statistics were reported correctly.

Monday, February 25, 2013

Looking for a port in transformation

If you are looking for a particular port in a transformation (e.g. to link it to another transformation), you can select the desired transformation and press the first letter of the port name. In the example below, hitting "f" will loop through the highlighted ports:


Unfortunately, this doesn't work for Sources or Targets. So the question is:

How to search Sources and Targets for ports?

The reason the trick doesn't work for Sources and Targets is that it actually searches the first visible column. For transformations that is the Name column, but for sources and targets the Key column is displayed first by default:


It's quite easy to change it. Right-clicking a transformation brings up the menu, where you can choose the Set options...:


This will bring up a box where you can choose and set the order of the columns displayed for a given transformation type. In this particular example, you can choose to display the Name column as the first one for the Target transformation type:

Click Apply to take advantage of the ability to search through ports in Target definitions. In this example you can see that the Name column is now displayed first. Hitting the 'f' key will now loop over the highlighted ports:


Thursday, February 21, 2013

NewLookupRow vs Update Dynamic Cache Condition vs Associated Expression

There are some neat new features in Informatica PowerCenter 9.1 Lookup Transformation: Update Dynamic Cache Condition and Associated Expression.



 I wanted to try and use them. Here are the details.

SCENARIO:
Each incoming row has a column named 'Version'. We want to check the dynamic cache in lookup to find out if it's newer or older than the cached one. If it is a new version, we want to store it - otherwise we want to discard it. The versions come in random order.

SOLUTION:
I set the Update Dynamic Cache Condition to check whether the incoming Version is larger than the cached Version. I also set the Associated Expression on Sequence to be equal to itself.
NOTE: There is some strange behavior here - you can't simply check the 'Ignore in comparison' box. You need to switch to some other port, check the box, then switch back to Associated Expression again.
I also marked it to be ignored in comparison (which in fact shouldn't make any difference).


What are the results? First version has been inserted to DB, but it never got updated. What was wrong? I forgot to check the Insert Else Update box:


Now everything started to run smoothly. However, I wondered what the logic behind Update Dynamic Cache Condition vs NewLookupRow is. After a few experiments I came up with the following.

CONCLUSION:

If Insert Else Update property is checked:
NewLookupRow != 0 only when both of the following are TRUE: the Update Dynamic Cache Condition resolves to TRUE AND at least one of the ports differs from its Associated Port or from the result of its Associated Expression, i.e.:

  • first, the Associated Expression is executed
  • second, the AE result is compared to the port value
If either of the conditions above is false (e.g. the UDCC resolves to false OR there are no port differences), NewLookupRow will be equal to 0.
In other words, it's not the AE itself that needs to be TRUE, but the result of comparing the port value with the result of the AE.

If the Insert Else Update property is NOT checked, NewLookupRow is equal to 1 for a new row and 0 for any existing one - regardless of the result of the Update Dynamic Cache Condition or any difference in ports.
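My mental model of the rules above can be written down as a tiny decision function. This is only a sketch of the observed behavior, not any Informatica API; the return values follow the documented NewLookupRow meanings (0 = no change, 1 = insert, 2 = update):

```java
public class NewLookupRowModel {

    // Sketch of the observed behavior, not an Informatica API.
    // rowInCache:       the lookup found the row in the dynamic cache
    // insertElseUpdate: the Insert Else Update property is checked
    // udccTrue:         Update Dynamic Cache Condition resolved to TRUE
    // portsDiffer:      some port differs from its Associated Port /
    //                   the result of its Associated Expression
    static int newLookupRow(boolean rowInCache, boolean insertElseUpdate,
                            boolean udccTrue, boolean portsDiffer) {
        if (!rowInCache) {
            return 1; // new row: insert into the cache
        }
        if (insertElseUpdate && udccTrue && portsDiffer) {
            return 2; // existing row: update the cache
        }
        return 0; // no cache change
    }
}
```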


FATAL ERROR : An unexpected condition occured in file odbcdriver.cpp line 511

I've got a mapping populating a number of MS SQL Server tables. For each of the targets I've got two pipelines: Insert and Update. Upstream, there is a dynamic lookup checking if a particular row already exists and fetching its Identity Sequence. Then there is a router that splits the flow into the two mentioned:

  • The Insert flow does not use the Sequence anymore - the port is not connected, as the column uses the Identity property, so the DB takes care of creating the new value
  • The Update flow uses the Sequence fetched by the lookup to update the correct target row.
I got the following behavior when I tried to execute the workflow:
  • The session was initiated
  • Source rows were read
  • Just when I expected target rows to show up, the session failed. The last two lines in the session log were as follows:
INFO {Timestamp} {Node} {Thread} {Message Code} Start loading table [TableName] at:
FATAL  {Timestamp} {Node} *********** FATAL ERROR : An unexpected condition occured in file [/export/home/builds/pc9x_root/910HF/build/powrmart/common/odl/msodbc/odbcdriver.cpp] line [511]. Aborting the DTM process.  Contact Informatica Global Customer Support. ***********


It took me a while to find the root cause. I've been working a lot with Sybase lately and got used to using 'numeric identity' as the port type when the underlying table column is set to Identity. So after creating the target definition, I switched the port type from 'bigint' to 'numeric identity'. Here is the column definition in the DB:
CREATE TABLE [dbo].[TableName] (
   [Sequence] [bigint] IDENTITY(1,1) NOT NULL, ... )


There seems to be an issue with that, because after switching back to bigint, the workflow ran fine.



Tuesday, February 19, 2013

FATAL ERROR while running session with XML GENERATOR and pass-through ports

While running a session with an XML Generator using a few pass-through ports I got a fatal error. Here are the last two log entries:


Message Code: XMLW_31037
Message: XMLW_31037 Received all the data for all the XML groups of target [xml_.....]. Combining XML groups into final DOM tree...

Message: *********** FATAL ERROR : An unexpected condition occured in file [/export/home/build_root/861HotFix_build/powrmart/server/dmapper/widget/xml/writer/midxmlout.cpp] line [273]. Aborting the DTM process. Contact Informatica Global Customer Support. ***********


Recreating the XML Generator from scratch and adding the same set of pass-through ports did not help. Then I tried a lucky guess and changed my bigint ports to integers:
Before:

After:
Effect:

It worked. I still don't know why, though...

Monday, February 18, 2013

XML Generator pass-through ports

I'm using the XML Generator transformation to create XML from a number of input ports. I'd also like to use an audit table that would contain all input ports along with the generated XML. However, you can't link the expression ports used to create the XML with the XML Generator output. XML Generator is an active transformation and will give the "Concatenation disallowed..." error message:
To achieve that, you need to send the input ports through the XML Generator. Here's what needs to be done. The below applies to XML Parser as well.

To create a pass-through port in a midstream XML transformation:

  1. Edit the XML Transformation and open the Midstream XML Generator (or Parser) tab (the last one)
    The DataOutput or DataInput port appears (depending on the transformation type)
  2. Add the needed additional output ports and define appropriate data types
  3. Click XML Editor button:
  4. In the XML Editor right-click the XML definition and choose Add a Reference Port
  5. Select all required ports and click OK
  6. Save the changes and close the editor. In the transformation you should now see the pass-through ports available:
Now it's possible to send everything downstream in one flow.

Wednesday, February 6, 2013

Single quotes in expressions

How do you add single quotes in expressions? For example, if you need the expression to log some execution parameters? I sometimes do it so it would be easy to re-run a stored procedure manually after an error.

However, I could not find a proper escape character that would allow single quotes inside an expression. The workaround is to use:
|| CHR(39) ||
inside the expression. Not very convenient... Here are examples found at https://community.informatica.com/docs/DOC-1694


EXAMPLE A:

Source data as follows:
'don't call'
Problem: How do you use the IIF function to compare the "don't" string when it contains an apostrophe. The following expression will not work:
IIF (INPUT_STRING='don't call', RESULT1,RESULT2)

Solution: The following tests whether the INPUT_STRING matches the word "don't":
IIF (INPUT_STRING='don' || CHR(39) || 't', RESULT1,RESULT2)

EXAMPLE B:

The input is:
Hello World
The output should be:
'Hello World'
Problem: How do you write a string with 'Hello World' using the single quotes as a literal in an expression?

Solution: The following returns the literal phrase 'Hello World':
CHR(39)||'Hello World'||CHR(39)
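The same trick works in most languages, since 39 is the ASCII code of the single quote. Here is a quick Java equivalent of Example B (just an illustration, obviously unrelated to the PowerCenter expression engine):

```java
public class QuoteDemo {
    public static void main(String[] args) {
        // 39 is the ASCII code of the single quote, same as CHR(39).
        String quoted = (char) 39 + "Hello World" + (char) 39;
        System.out.println(quoted); // prints 'Hello World'
    }
}
```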

Disable SQL override parser

Entering a valid query in the Lookup SQL Override might result in a session failure with a "Contact Informatica Global Support" message. You copy the query from the log and it runs great in your SQL client, so why does it fail?
The reason for this might be the subquery.

For some reason the parser fails to check the query and refuses to run it. You can try to disable the parser in one of two ways:


To use nested loops in a Lookup SQL override do the following:
  1. Add the lookupOverrideParsingSetting custom property to the Integration Service in the Administration Console and set the value as 1:
    LookupOverrideParsingSetting=1
  2. Restart the Integration Service.
An alternative solution is to set the custom property at session level:
  1. Open the session settings and choose the Config Object tab.
  2. In the Advanced Properties section choose Custom Properties.
  3. Be sure to use a lower-case 'l' instead of an upper-case 'L'. Otherwise it won't work:
lookupOverrideParsingSetting=1


For full explanation refer to KB doc id 15532.