Friday, March 23, 2012

Import flat file into SQL Server 2005 Express

I am new to SQL Server, and migrating part of an Access application to
SSE. I am trying to insert a comma delimited file into SSE 2005. I am
able to run a BULK INSERT statement on a simple file, specifying the
field (,) and row (\n) terminators. I can also do the same with a
format file.

Here is the problem. My csv file has 185 columns, with a mixture of
datatypes. Sometimes, a text field will contain the field delimiter as
part of the string. In this case (and only in this case) there will be
double quotes around the string to indicate that the comma is part of
the field, and not a delimiter.

Is there any way to indicate that there is a text delimiter that is
only present some of the time?

If not, any suggestions on getting the data into SSE?

Many thanks for your input.

Cheryl(cabrenner@.optonline.net) writes:

Quote:

Originally Posted by

I am new to SQL Server, and migrating part of an Access application to
SSE. I am trying to insert a comma delimited file into SSE 2005. I am
able to run a BULK INSERT statement on a simple file, specifying the
field (,) and row (\n) terminators. I can also do the same with a
format file.
>
Here is the problem. My csv file has 185 columns, with a mixture of
datatypes. Sometimes, a text field will contain the field delimiter as
part of the string. In this case (and only in this case) there will be
double quotes around the string to indicate that the comma is part of
the field, and not a delimiter.


So a file could look like this:

2,34,Enter Sandman,Pat Boone
9,34,Zabadak,"Dave, Dee, Dozy, Mich & Tich"
8,981,"Rebel, Rebel",David Bowie

There is now way to get BULK INSERT to handle this file in that shape.
If I were faced with this file, I would write Perl script that replaced
the commas outside the "" with a different delimiter and then removed the
"". And it would not be trivial.

Most other people would probably try to write a package in Integration
Services, but I have never used Integration Services myself. And for your
part - SQL Express does not come with Integration Services, I believe.

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pr...oads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodin...ions/books.mspx|||In message <Xns98B096E86FA00Yazorman@.127.0.0.1>, Erland Sommarskog
<esquel@.sommarskog.sewrites

Quote:

Originally Posted by

(cabrenner@.optonline.net) writes:

Quote:

Originally Posted by

>I am new to SQL Server, and migrating part of an Access application to
>SSE. I am trying to insert a comma delimited file into SSE 2005. I am
>able to run a BULK INSERT statement on a simple file, specifying the
>field (,) and row (\n) terminators. I can also do the same with a
>format file.
>>
>Here is the problem. My csv file has 185 columns, with a mixture of
>datatypes. Sometimes, a text field will contain the field delimiter as
>part of the string. In this case (and only in this case) there will be
>double quotes around the string to indicate that the comma is part of
>the field, and not a delimiter.


>
>So a file could look like this:
>
2,34,Enter Sandman,Pat Boone
9,34,Zabadak,"Dave, Dee, Dozy, Mich & Tich"
8,981,"Rebel, Rebel",David Bowie
>
>There is now way to get BULK INSERT to handle this file in that shape.
>If I were faced with this file, I would write Perl script that replaced
>the commas outside the "" with a different delimiter and then removed the
>"". And it would not be trivial.
>
>Most other people would probably try to write a package in Integration
>Services, but I have never used Integration Services myself. And for your
>part - SQL Express does not come with Integration Services, I believe.


Two things to add, both useful options if the amount of data is small.
First, the import filters in MS Access are better than those in SQL
Server. If the data will fit into an Access table that might just do the
trick. Second, spreadsheets have more flexible parsing options than
databases. It may be possible to load the data into a spreadsheet. That
allows different algorithms to be applied to different rows.

Lastly, text files can be opened and read by VBA code in any of the
office languages, or any of the .NET languages. Either could be used,
but writing code to cope with all of the possible options may take time.

--
Bernard Peek
back in search of cognoscenti|||Erland Sommarskog (esquel@.sommarskog.se) writes:

Quote:

Originally Posted by

So a file could look like this:
>
2,34,Enter Sandman,Pat Boone
9,34,Zabadak,"Dave, Dee, Dozy, Mich & Tich"
8,981,"Rebel, Rebel",David Bowie
>
There is now way to get BULK INSERT to handle this file in that shape.
If I were faced with this file, I would write Perl script that replaced
the commas outside the "" with a different delimiter and then removed the
"". And it would not be trivial.


In addition to Bernard's post, is not Excel able to read that format?
In such case open in Except, and save as a tab-delimited file and importing
that should be a breeze. (Assuming, of course, there are no tabs in the
data!)

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pr...oads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodin...ions/books.mspx|||Thank you both for your suggestions. Yes, I was thinking that BULK
INSERT was not going to be able to handle this. I had thought about
dumping the file into an Access table first, but the file could be very
large (200,000+ rows). I am going to try the Excel spreadsheet idea.

Erland Sommarskog wrote:

Quote:

Originally Posted by

Erland Sommarskog (esquel@.sommarskog.se) writes:

Quote:

Originally Posted by

So a file could look like this:

2,34,Enter Sandman,Pat Boone
9,34,Zabadak,"Dave, Dee, Dozy, Mich & Tich"
8,981,"Rebel, Rebel",David Bowie

There is now way to get BULK INSERT to handle this file in that shape.
If I were faced with this file, I would write Perl script that replaced
the commas outside the "" with a different delimiter and then removed the
"". And it would not be trivial.


>
In addition to Bernard's post, is not Excel able to read that format?
In such case open in Except, and save as a tab-delimited file and importing
that should be a breeze. (Assuming, of course, there are no tabs in the
data!)
>
--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se
>
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pr...oads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodin...ions/books.mspx

|||(cabrenner@.optonline.net) writes:

Quote:

Originally Posted by

Thank you both for your suggestions. Yes, I was thinking that BULK
INSERT was not going to be able to handle this. I had thought about
dumping the file into an Access table first, but the file could be very
large (200,000+ rows). I am going to try the Excel spreadsheet idea.


200000+ rows? Then Access is probably a better bet. Doesn't Excel stop
at 65536 rows?

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pr...oads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodin...ions/books.mspx|||Erland Sommarskog wrote:

Quote:

Originally Posted by

(cabrenner@.optonline.net) writes:

Quote:

Originally Posted by

>Thank you both for your suggestions. Yes, I was thinking that BULK
>INSERT was not going to be able to handle this. I had thought about
>dumping the file into an Access table first, but the file could be very
>large (200,000+ rows). I am going to try the Excel spreadsheet idea.


>
200000+ rows? Then Access is probably a better bet. Doesn't Excel stop
at 65536 rows?


I think the latest version of Excel may have a higher row limit - which
only increases the tendency of newbies to misuse Excel as a "database".|||You are correct - Excel has a limit on the number of rows. I thought
about that after I sent the reply. So now I am looking at Access.

Here is my next question. I want to use OPENROWSET in a procedure to
get the data from Access into SSE. My code looks something like this:

INSERT INTO sse_table1 Select GFF.*
FROM OPENROWSET(
'Microsoft.Jet.OLEDB.4.0',
'path to mdb';'admin';'',
'Select * FROM access_table1
) as GFF

This works great. However, the location of the access database is only
known at runtime. I can pass the path as a parameter to the stored
procedure, but using it as a variable in OPENROWSET fails. Code looks
like this

CREATE PROCEDURE [dbo].[spImportBillingFile]
@.strTableLocation varchar(255),
@.btSuccess bit OUTPUT
AS
BEGIN

DECLARE @.strConnect varchar(255)
SET @.strConnect = @.strTableLocation

INSERT INTO tbl_ups_eInvoice_tmpData Select GFF.*
FROM OPENROWSET(
'Microsoft.Jet.OLEDB.4.0',
' + @.strConnect + ';'admin';'',
'Select * FROM tbl_ups_eInvoice_tmpData'
) as GFF

Set @.btSuccess = 1

END

Does OPENROWSET not allow a variable to be used?

Thanks again for the help.

Ed Murphy wrote:

Quote:

Originally Posted by

Erland Sommarskog wrote:
>

Quote:

Originally Posted by

(cabrenner@.optonline.net) writes:

Quote:

Originally Posted by

Thank you both for your suggestions. Yes, I was thinking that BULK
INSERT was not going to be able to handle this. I had thought about
dumping the file into an Access table first, but the file could be very
large (200,000+ rows). I am going to try the Excel spreadsheet idea.


200000+ rows? Then Access is probably a better bet. Doesn't Excel stop
at 65536 rows?


>
I think the latest version of Excel may have a higher row limit - which
only increases the tendency of newbies to misuse Excel as a "database".

|||(cabrenner@.optonline.net) writes:

Quote:

Originally Posted by

INSERT INTO tbl_ups_eInvoice_tmpData Select GFF.*
FROM OPENROWSET(
'Microsoft.Jet.OLEDB.4.0',
' + @.strConnect + ';'admin';'',
'Select * FROM tbl_ups_eInvoice_tmpData'
) as GFF
>
Set @.btSuccess = 1
>
END
>
Does OPENROWSET not allow a variable to be used?


No. Either you have to use dynamic SQL, or define a linked server on the
fly. The former is probably simpler. Look at
http://www.sommarskog.se/dynamic_sql.html#OPENQUERY for a similar example
on how to deal with the nested strings.

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pr...oads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodin...ions/books.mspx|||Thanks for all the info - I figured out how to use the dynamic sql.

Erland Sommarskog wrote:

Quote:

Originally Posted by

(cabrenner@.optonline.net) writes:

Quote:

Originally Posted by

INSERT INTO tbl_ups_eInvoice_tmpData Select GFF.*
FROM OPENROWSET(
'Microsoft.Jet.OLEDB.4.0',
' + @.strConnect + ';'admin';'',
'Select * FROM tbl_ups_eInvoice_tmpData'
) as GFF

Set @.btSuccess = 1

END

Does OPENROWSET not allow a variable to be used?


>
No. Either you have to use dynamic SQL, or define a linked server on the
fly. The former is probably simpler. Look at
http://www.sommarskog.se/dynamic_sql.html#OPENQUERY for a similar example
on how to deal with the nested strings.
>
>
--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se
>
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pr...oads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodin...ions/books.mspx

sql

No comments:

Post a Comment