The Problem
Recently, I've been given a sample flat file with a simple task: prepare a 1:1 mapping to load the file to database. Sounds easy? Sounds familiar? If it does sound familiar, than you probably know it's not easy. As I had to create the appropriate table, the problem was to decide what datatype, length and precision should I choose for each column. What if there's 80 columns? What if there's 10.000 rows in the flat file? Finally, what if you've got 160 files?
Than you've got a lot of work. This led me to a solution I'd like to share:
Stage Mapping Generator
A simple tool, that would do the whole job. So here it is. All you need to do is put the flat files in one place:
Run the tool and import automaticaly created mappings:
Great, job done! Here is a sample mapping created with SMG - with all port names and datatypes resolved automatically and all string ports trimmed:
#Full folder path containing source files. All the files from the location below will be analyzed and there will be a mapping created for each file found.
SourceDir: c:\SMG\FlatFiles
#Source file code page. You can set the proper code page used in source files.
SrcFileCodePage: windows-1250
#Full folder path for target xml files. Output XML file (or files) will be created there.
TargetDir: c:\SMG\xml
#Create one output file for all mappings [0/1]. Depending on choice made here there can be one XML file with all the content created (as depicted above) or there can be one XML per each source file created.
CombinedOutput: 0
#Column separator used in flat files:
ColumnSeparator: ,
#Thousand separator used in flat files:
ThousandSeparator:
#Decimal separator used in flat files:
DecimalSeparator: .
#Import column names from the first row? [1/0]. If there is a header in each of the files, it will be used for the port names.
FirstRowHasNames: 0
#Number of rows used to determine datatype (0 for all). It's possible to limit the number of rows tested for each file. It can be used to limit the memory use and to increase performance.
DatatypeSampleCount: 0
#Trimm incoming strings [0/1]. Enable this if LTRIM/RTRIM should be added to all string columns.
AddTrimming: 1
#Convert DATE fields to date/time format ($$DateFmt parameter and file header required!). Using this will cause all columns with "Date" in their names to be converted to datetime.
ConvertDates: 1
#Repository description - this can be obtained with exporting sample mapping from repository and checking XML file. This allows identifying appropriate folder.[Repository]
RepositoryVersion: 181.90
RepositoryDescription: <REPOSITORY NAME="DEV" VERSION="181" CODEPAGE="MS1250" DATABASETYPE="Microsoft SQL Server">
RepositoryFolderDescription: <FOLDER NAME="MYFOLDER" GROUP="" OWNER="" SHARED="SHARED" DESCRIPTION="" PERMISSIONS="rwx---r--" UUID="">
#Parameters and variables to be used in mapping:
#syntax: NAME:DATATYPE,DEFAULTVALUE,ISEXPRESSIONVARIABLE,ISPARAM,PRECISION,SCALE
#example: $$UserName:string,,NO,YES,50,0
#date example: $$Date:date/time,,NO,YES,29,9
[MappingVarParam]
$$BusinessDate: string,,NO,YES,12,0
$$DateFmt: string,MM/DD/YYYY,NO,YES,12,0
#Columns to be added to each source:
#Syntax: ColumnName: DATATYPE//PRECISION//SCALE//EXPRESSION
#Use // as separator
#Example: COL1: date/time//29//9//SUBSTR(Price,1,3)
#This section makes it possible to add any additional ports that should be added to each mapping.
[AdditionalColumns]
BusinessDate: date/time//29//9//TO_DATE($$BusinessDate,$$DateFmt)
Stage Mapping Generator - full version
Feel free to contact me if you'd have any questions or comments!
Run the tool and import automaticaly created mappings:
Great, job done! Here is a sample mapping created with SMG - with all port names and datatypes resolved automatically and all string ports trimmed:
Configuration: what can you do?
To make the most out of it, the tool comes with a configuration file. It's possible to use it without any advanced setup, but there are some tricks it allows to make it more powerful. Here is the sample configuraion. Most of the parameters are self-explanatory:#Full folder path containing source files. All the files from the location below will be analyzed and there will be a mapping created for each file found.
SourceDir: c:\SMG\FlatFiles
#Source file code page. You can set the proper code page used in source files.
SrcFileCodePage: windows-1250
#Full folder path for target xml files. Output XML file (or files) will be created there.
TargetDir: c:\SMG\xml
#Create one output file for all mappings [0/1]. Depending on choice made here there can be one XML file with all the content created (as depicted above) or there can be one XML per each source file created.
CombinedOutput: 0
#Column separator used in flat files:
ColumnSeparator: ,
#Thousand separator used in flat files:
ThousandSeparator:
#Decimal separator used in flat files:
DecimalSeparator: .
#Import column names from the first row? [1/0]. If there is a header in each of the files, it will be used for the port names.
FirstRowHasNames: 0
#Number of rows used to determine datatype (0 for all). It's possible to limit the number of rows tested for each file. It can be used to limit the memory use and to increase performance.
DatatypeSampleCount: 0
#Trimm incoming strings [0/1]. Enable this if LTRIM/RTRIM should be added to all string columns.
AddTrimming: 1
#Convert DATE fields to date/time format ($$DateFmt parameter and file header required!). Using this will cause all columns with "Date" in their names to be converted to datetime.
ConvertDates: 1
#Repository description - this can be obtained with exporting sample mapping from repository and checking XML file. This allows identifying appropriate folder.[Repository]
RepositoryVersion: 181.90
RepositoryDescription: <REPOSITORY NAME="DEV" VERSION="181" CODEPAGE="MS1250" DATABASETYPE="Microsoft SQL Server">
RepositoryFolderDescription: <FOLDER NAME="MYFOLDER" GROUP="" OWNER="" SHARED="SHARED" DESCRIPTION="" PERMISSIONS="rwx---r--" UUID="">
#Parameters and variables to be used in mapping:
#syntax: NAME:DATATYPE,DEFAULTVALUE,ISEXPRESSIONVARIABLE,ISPARAM,PRECISION,SCALE
#example: $$UserName:string,,NO,YES,50,0
#date example: $$Date:date/time,,NO,YES,29,9
[MappingVarParam]
$$BusinessDate: string,,NO,YES,12,0
$$DateFmt: string,MM/DD/YYYY,NO,YES,12,0
#Columns to be added to each source:
#Syntax: ColumnName: DATATYPE//PRECISION//SCALE//EXPRESSION
#Use // as separator
#Example: COL1: date/time//29//9//SUBSTR(Price,1,3)
#This section makes it possible to add any additional ports that should be added to each mapping.
[AdditionalColumns]
BusinessDate: date/time//29//9//TO_DATE($$BusinessDate,$$DateFmt)
Links to the tool:
Stage Mapping Generator DEMOStage Mapping Generator - full version
Final remarks
All targets created with the Stage Mapping Generator are MS SQL Server. However you can easily change the type on Target Developer if needed.Feel free to contact me if you'd have any questions or comments!