Monday, May 13, 2013

Stage Mapping Generator

The Problem

Recently, I've been given a sample flat file with a simple task: prepare a 1:1 mapping to load the file into a database. Sounds easy? Sounds familiar? If it does sound familiar, then you probably know it's not easy. As I had to create the appropriate table, the problem was deciding what datatype, length and precision to choose for each column. What if there are 80 columns? What if there are 10,000 rows in the flat file? Finally, what if you've got 160 files?
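
To make it concrete: each file means hand-written DDL along these lines, with every datatype, length and precision decided by eyeballing the data (the table and column names below are made up for illustration):

CREATE TABLE stg_SampleFile (
    CustomerId   int,           -- every sampled value was an integer
    CustomerName varchar(50),   -- longest sampled value: 47 characters
    Price        decimal(10,2), -- up to 8 digits before the decimal separator, 2 after
    TradeDate    varchar(10)    -- e.g. 05/13/2013, kept as a string at stage
);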

Then you've got a lot of work. This led me to a solution I'd like to share:

Stage Mapping Generator

A simple tool that does the whole job. So here it is. All you need to do is put the flat files in one place:


Run the tool and import the automatically created mappings:


Great, job done! Here is a sample mapping created with SMG - with all port names and datatypes resolved automatically and all string ports trimmed:



Configuration: what can you do?

To make the most of it, the tool comes with a configuration file. It's possible to use it without any advanced setup, but there are some tricks that make it more powerful. Here is a sample configuration. Most of the parameters are self-explanatory:

#Full folder path containing source files. All the files from the location below will be analyzed and there will be a mapping created for each file found.
SourceDir: c:\SMG\FlatFiles

#Source file code page. You can set the proper code page used in source files.
SrcFileCodePage: windows-1250

#Full folder path for target xml files. Output XML file (or files) will be created there.
TargetDir: c:\SMG\xml

#Create one output file for all mappings [0/1]. Depending on the choice made here, there can be one XML file with all the content (as depicted above) or one XML file per source file.
CombinedOutput: 0

#Column separator used in flat files:
ColumnSeparator: ,

#Thousand separator used in flat files:
ThousandSeparator: 

#Decimal separator used in flat files:
DecimalSeparator: .

#Import column names from the first row? [1/0]. If there is a header in each of the files, it will be used for the port names.
FirstRowHasNames: 0

#Number of rows used to determine the datatype (0 for all). It's possible to limit the number of rows tested for each file to reduce memory use and improve performance.
DatatypeSampleCount: 0

#Trim incoming strings [0/1]. Enable this if LTRIM/RTRIM should be added to all string columns.
AddTrimming: 1

#Convert DATE fields to date/time format ($$DateFmt parameter and file header required!). Using this will cause all columns with "Date" in their names to be converted to date/time.
ConvertDates: 1

#Repository description - this can be obtained by exporting a sample mapping from the repository and checking the XML file. This allows identifying the appropriate folder.
[Repository]
RepositoryVersion: 181.90
RepositoryDescription: <REPOSITORY NAME="DEV" VERSION="181" CODEPAGE="MS1250" DATABASETYPE="Microsoft SQL Server">
RepositoryFolderDescription: <FOLDER NAME="MYFOLDER" GROUP="" OWNER="" SHARED="SHARED" DESCRIPTION="" PERMISSIONS="rwx---r--" UUID="">

#Parameters and variables to be used in mapping:
#syntax: NAME:DATATYPE,DEFAULTVALUE,ISEXPRESSIONVARIABLE,ISPARAM,PRECISION,SCALE
#example: $$UserName:string,,NO,YES,50,0
#date example: $$Date:date/time,,NO,YES,29,9

[MappingVarParam]
$$BusinessDate: string,,NO,YES,12,0
$$DateFmt: string,MM/DD/YYYY,NO,YES,12,0

#Columns to be added to each source:
#Syntax: ColumnName: DATATYPE//PRECISION//SCALE//EXPRESSION
#Use // as separator
#Example: COL1: date/time//29//9//SUBSTR(Price,1,3)

#This section makes it possible to define any additional ports that should be added to each mapping.
[AdditionalColumns]
BusinessDate: date/time//29//9//TO_DATE($$BusinessDate,$$DateFmt)

Links to the tool:

Stage Mapping Generator DEMO
Stage Mapping Generator - full version

Final remarks

All targets created with the Stage Mapping Generator are MS SQL Server. However, you can easily change the type in the Target Designer if needed.

Feel free to contact me if you have any questions or comments!

Wednesday, May 8, 2013

[ODBC SQL Server Wire Protocol driver] Invalid cursor state.

Using a Stored Procedure as a source of data can be a little bit tricky. I've created a simple mapping with a Stored Procedure invoked in the SQL Query of the Source Qualifier:

I've created a workflow with 5 sessions, each invoking one of the stored procedures below:


------> SCENARIO 1 <------
CREATE PROCEDURE sp_test_mg
AS
BEGIN
    DECLARE @result TABLE (batchFile varchar(255), ClientName varchar(255))

    INSERT INTO @result
    SELECT TOP 10 Batchfile, ClientName FROM test_table

    SELECT * FROM @result
END
GO


------> SCENARIO 2 <------
CREATE PROCEDURE sp_test_mg_cur
AS
BEGIN
    DECLARE @result TABLE (batchFile varchar(255), ClientName varchar(255))
    DECLARE @batchFile varchar(255), @ClientName varchar(255)

    INSERT INTO @result
    EXEC sp_test_mg

    DECLARE test_cursor CURSOR STATIC FOR
    SELECT * FROM @result;

    OPEN test_cursor
    FETCH NEXT FROM test_cursor INTO @batchFile, @ClientName
    WHILE @@FETCH_STATUS = 0
    BEGIN
        SELECT @batchFile, @ClientName
        FETCH NEXT FROM test_cursor INTO @batchFile, @ClientName
    END
    CLOSE test_cursor
    DEALLOCATE test_cursor
END
GO


------> SCENARIO 3 <------
CREATE PROCEDURE sp_test_mg_tab
AS
BEGIN
    SELECT TOP 10 Batchfile, ClientName FROM test_table
END
GO


------> SCENARIO 4 <------
CREATE PROCEDURE sp_test_mg_tab_var
AS
BEGIN
    DECLARE @result TABLE (batchFile varchar(255), ClientName varchar(255))

    INSERT INTO @result
    EXEC sp_test_mg

    SELECT * FROM @result
END
GO


------> SCENARIO 5 <------
CREATE PROCEDURE sp_test_mg_temp_tab
AS
BEGIN
    CREATE TABLE #result (batchFile varchar(255), ClientName varchar(255))

    INSERT INTO #result
    EXEC sp_test_mg

    SELECT * FROM #result
END
GO


Now all executions except for scenario 3 fail with the error message: [ODBC SQL Server Wire Protocol driver] Invalid cursor state. What is the issue?

My first thought was that for some reason PowerCenter can't read from temporary structures - the 3rd scenario is the only one using a real table. But the reason - and the solution - is totally different: SET NOCOUNT ON!
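
Without SET NOCOUNT ON, every INSERT (and each FETCH) sends an extra "N rows affected" message back to the client, and the ODBC driver stumbles over these before it ever reaches the actual result set. For illustration, here is scenario 1 with the fix applied (a minimal sketch):

ALTER PROCEDURE sp_test_mg
AS
BEGIN
    -- Suppress the "N rows affected" messages that confuse the driver
    SET NOCOUNT ON

    DECLARE @result TABLE (batchFile varchar(255), ClientName varchar(255))

    INSERT INTO @result
    SELECT TOP 10 Batchfile, ClientName FROM test_table

    -- Now the only thing sent back to the client is this result set
    SELECT * FROM @result
END
GO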

Adding SET NOCOUNT ON to each stored proc makes everything work fine, with one little remark: in Scenario 2 (with the cursor) only one row is read.
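
That remark is easy to explain: each SELECT inside the WHILE loop produces its own one-row result set, and PowerCenter reads only the first of them. If the cursor has to stay, one possible workaround (a sketch - the second table variable @output is introduced here just for illustration) is to accumulate the rows and return them with a single SELECT at the end:

ALTER PROCEDURE sp_test_mg_cur
AS
BEGIN
    SET NOCOUNT ON

    DECLARE @result TABLE (batchFile varchar(255), ClientName varchar(255))
    DECLARE @output TABLE (batchFile varchar(255), ClientName varchar(255))
    DECLARE @batchFile varchar(255), @ClientName varchar(255)

    INSERT INTO @result
    EXEC sp_test_mg

    DECLARE test_cursor CURSOR STATIC FOR
    SELECT * FROM @result;

    OPEN test_cursor
    FETCH NEXT FROM test_cursor INTO @batchFile, @ClientName
    WHILE @@FETCH_STATUS = 0
    BEGIN
        -- Accumulate instead of SELECTing row by row
        INSERT INTO @output VALUES (@batchFile, @ClientName)
        FETCH NEXT FROM test_cursor INTO @batchFile, @ClientName
    END
    CLOSE test_cursor
    DEALLOCATE test_cursor

    -- One single result set for the client
    SELECT * FROM @output
END
GO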