Discussion:
SSIS string to int when a preceeding letter
(too old to reply)
dbuchanan
2009-07-22 21:10:01 UTC
Permalink
My data source is an excel file. One column named [employee Id] should
contain numeric data, however every month the excel file is received with
about ten out of 50,000 records having the [employee Id] data proceeded with
a specific letter.

Prior to now the data was manually imported into the data table.

I am now writing an SSIS package to import that data and wish to automate
the correction.

What kind of data flow transformation task do I use to correct this data?

I attempted putting a case statement into the expression column of the
Derived Column Transformation Editor, but that was not accepted.

This is what I tried:

case when left([Employee Id],1) = 'N' then right([Employee Id],len([Employee
Id]) - 1)else [Employee Id] end

---------Note-----------------------
I did a search on the word ‘conditional’ with respect to SSIS on MSDN and
found this page:
http://msdn.microsoft.com/en-us/library/ms141680(SQL.90).aspx

Although the page contained some expression syntax and a lot of explanation
there was nothing on that page to give me a clue about where these
expressions were to be entered or what the expressions meant. There was also
no provision for supplying feedback on the page.
----------------------------------------

What is the correct way of solving my problem of removing a leading letter
from the records that contain leading letters?

thank you.
--
dbuchanan
Todd C
2009-07-23 13:11:02 UTC
Permalink
Ah, Excel. A four letter word for dba's and database developers! All the
business users love it because it is so flexible, and we hate it for the very
same reason!

Try this:
Use an OLE DB connector for the Excel file instead of the stock Excel
Connector. Based on the .xls or .xlsx extension, you will need either the
"Jet 4.0" provider, or the "MS Office 12 Access" provider. Specify the file
name and path in the Server Name property, then click on the All button. At
the very top, in the Extended Properties, put in the following as appropriate:

Excel 8.0; IMEX=1; HDR=Yes
or
Excel 12.0; IMEX=1;Hdr=Yes

Note: the IMEX=1 switch will tell Excel that intermixed data should come in
as string type (WSTR)

Put in a Data Conversion transform to change *ONE* column to the proper
numeric data type. Then configure the Error Output of the task to "Redirect
Rows" to a Derived Column Transform. In there, strip out the first character
of the field.
Have a Union All transform that brings the two flows back together.

Note: If you have multiple columns that you want to scrub in this manner,
you will need a set of Conversion/Derived Column/Union All transforms for
each column. The reason is because just because ONE column is bad, another
column in the same row may NOT be bad. Just because Column A needs to have a
character stripped off, you may not want to strip (a numeral) character from
Column B.

HTH

By the way: there is no CASE construct in SSIS. However, there is an IIF
construct. It looks like this: <True/False statement> ? < True part> : <False
part>
And there is no LEFT( ) function either!

Your logic might look like this:

SUBSTRING(<MyColumn>,1,1) = "N" ? SUBSTRING(<MyColumn>, 2, 255) : <MyColumn>

BUT, that can ONLY be used on string data types.
What happens when the "N" becomes "X" next year?
=====
Todd C
Post by dbuchanan
My data source is an excel file. One column named [employee Id] should
contain numeric data, however every month the excel file is received with
about ten out of 50,000 records having the [employee Id] data proceeded with
a specific letter.
Prior to now the data was manually imported into the data table.
I am now writing an SSIS package to import that data and wish to automate
the correction.
What kind of data flow transformation task do I use to correct this data?
I attempted putting a case statement into the expression column of the
Derived Column Transformation Editor, but that was not accepted.
case when left([Employee Id],1) = 'N' then right([Employee Id],len([Employee
Id]) - 1)else [Employee Id] end
---------Note-----------------------
I did a search on the word ‘conditional’ with respect to SSIS on MSDN and
http://msdn.microsoft.com/en-us/library/ms141680(SQL.90).aspx
Although the page contained some expression syntax and a lot of explanation
there was nothing on that page to give me a clue about where these
expressions were to be entered or what the expressions meant. There was also
no provision for supplying feedback on the page.
----------------------------------------
What is the correct way of solving my problem of removing a leading letter
from the records that contain leading letters?
thank you.
--
dbuchanan
Loading...