Monday, March 26, 2012

Import Human-Readable text file into SQL Server 2000

Hello,

I am receiving a text file that is produced from a mainframe that is
out of my control. I am attempting to find a (hopefully clean) way to
import it into a SQL Server database in an automated fashion. I am
not really concerned about how many tables it requires or what the
schema looks like as long as the data remains related and ends up in
its respective fields (I will probably use scratch tables for this).

The data is given to me in a format that is meant to be printed out
and read by human eyes (in a text file). The format looks something
like this:

Begin File:
------------------------
1234 1234 1234 1234 XYZ Company 01/01/2003
......More stuff related to XYZ company for a couple of lines ......
......(this stuff can easily be parsed by position)......

MCARD VISA AMEX DISC
------------------------
TOTAL 11111.11 4444.44 5555.55 30.01
TRANS FEE .20 .20 .15 .15
TRANS AMOUNT 2222.22 888.89 833.33 4.50
DISC .0165 .0165 .0365 .0355
------------------------

ANOTHER HEADER

.........More stuff related to XYZ Company............

End File:

Well, this isn't the exact format, but just an example. The point is
that all of the data in each column is related and should end up in
the same record which is related to the parent record of XYZ Company
(or all in a single record in a single table if that is the closest I
can get).

Also, the rows are not always present. For example, if TRANS FEE
doesn't apply to anything in the row, then the entire row will
collapse and TRANS AMOUNT would be the next line after TOTAL.

I was looking at the bcp utility and dts, but dts doesn't seem to have
the performance capabilities (or reliability for that matter) I am
looking for. Bcp seems like it might work if there is some advanced
formatting commands that I can't find in the documentation - Anyone?

The best I can come up with is to use a high level language such as C#
or VB.NET to parse the text file into another text file that is comma
delimited, and then use the bcp utility (or bulk insert) to import it
into SQL Server where I can then use TSQL to manipulate it how I want.
I am trying to eliminate the high level language parse and just go
straight from file to database. Does anybody know an easier route?

TIAI would probably still go with the high level language. You have multiple
pieces of information in different formats relating to the same bigger
object. It looks as though you have a fixed length part at the top and then
break into a pivot table in the middle. I don't know any format file that
is going to decipher that for you.

--
--

Allan Mitchell (Microsoft SQL Server MVP)
MCSE,MCDBA
www.SQLDTS.com
I support PASS - the definitive, global community
for SQL Server professionals - http://www.sqlpass.org

"Sven" <sstorhaug@.webuniverse.net> wrote in message
news:67ca584a.0309302103.1eb4b10e@.posting.google.c om...
> Hello,
> I am receiving a text file that is produced from a mainframe that is
> out of my control. I am attempting to find a (hopefully clean) way to
> import it into a SQL Server database in an automated fashion. I am
> not really concerned about how many tables it requires or what the
> schema looks like as long as the data remains related and ends up in
> its respective fields (I will probably use scratch tables for this).
> The data is given to me in a format that is meant to be printed out
> and read by human eyes (in a text file). The format looks something
> like this:
>
> Begin File:
> -----------------------
--
> 1234 1234 1234 1234 XYZ Company 01/01/2003
> .....More stuff related to XYZ company for a couple of lines ......
> .....(this stuff can easily be parsed by position)......
> MCARD VISA AMEX DISC
> -----------------------
--
> TOTAL 11111.11 4444.44 5555.55 30.01
> TRANS FEE .20 .20 .15 .15
> TRANS AMOUNT 2222.22 888.89 833.33 4.50
> DISC .0165 .0165 .0365 .0355
> -----------------------
--
> ANOTHER HEADER
> .........More stuff related to XYZ Company............
> End File:
>
> Well, this isn't the exact format, but just an example. The point is
> that all of the data in each column is related and should end up in
> the same record which is related to the parent record of XYZ Company
> (or all in a single record in a single table if that is the closest I
> can get).
> Also, the rows are not always present. For example, if TRANS FEE
> doesn't apply to anything in the row, then the entire row will
> collapse and TRANS AMOUNT would be the next line after TOTAL.
> I was looking at the bcp utility and dts, but dts doesn't seem to have
> the performance capabilities (or reliability for that matter) I am
> looking for. Bcp seems like it might work if there is some advanced
> formatting commands that I can't find in the documentation - Anyone?
> The best I can come up with is to use a high level language such as C#
> or VB.NET to parse the text file into another text file that is comma
> delimited, and then use the bcp utility (or bulk insert) to import it
> into SQL Server where I can then use TSQL to manipulate it how I want.
> I am trying to eliminate the high level language parse and just go
> straight from file to database. Does anybody know an easier route?
> TIA

No comments:

Post a Comment