Friday, March 30, 2012

Import of SPSS-files

Hello

i'm trying to find out if SSIS is a right toolbox for importing of many SPSS-files into the large market research data warehouse.

A simplified version of SPSS-datasheet looks like this:

- many columns (up to 3000). some of them are interview attributes (InterviewID, Date, etc.) and variables, representing survey questions

- relative few rows with "cases" or "interviews"

- in cells there are answers to the questions (in columns) by respondent (in rows)

As first step i would like to unpivot the dataset in the following way:

- Pass-Through: all interview attributes, that shouldn't be unpivoted

- Input Columns: rest 2990 columns

- Destination Column: Question

- Pivot Key Value Column Name: Answer

It works allright, except that i has to manually define destination column for all 2990 input columns, which takes a lot of time (multiplied by the number of SPSS-files i want to import). Is there a way to automate this (default value for destination column and/or scripting?)

Many thanks for your help!

If you hook your data flow up to an OLE DB destination, there is an option to create the table to match the incoming meta data. This works well and I use it all of the time. Just click the "New..." button next to the table name drop down box.|||

Thanks for your tip. But I think i cannot upload the data without unpivoting it first. I run into 1024 columns pro table restriction of SQL Server.

Again, is there a way to set up "destination column" for all input columns of unpivot transformation automatically?

Many thanks.

|||

Denis Zorenko wrote:

Thanks for your tip. But I think i cannot upload the data without unpivoting it first. I run into 1024 columns pro table restriction of SQL Server.

Again, is there a way to set up "destination column" for all input columns of unpivot transformation automatically?

Many thanks.

Oh, sorry, I misunderstood. No, there is not. Perhaps you could write your own application to build an SSIS package programmatically, but using the out-of-the-box functionality will not allow you to do what you wish.|||

Many thanks Phil!

Does that mean that the script component won't do the job?

|||Sure. The script component could do it, but it'll be a long, drawn out challenge, I'd presume. Well, the problem is that the script component would need to know its output columns upfront, so you'd still have work ahead of you.|||

Well, the output columns are known. They are InterviewID, Question, Answer.

"InterviewID" is a column of the datasheet (pass-through). Its name could be passed as variable.

"Question" contains all other column names (pivot key value column name)

"Answer" contains the values of the datasheet cells (destination column)

What would be the best way to start out?

Many thanks

|||

Check this post and see if it works for you:

http://agilebi.com/cs/blogs/jwelch/archive/2007/05/18/dynamically-pivoting-columns-to-rows.aspx

Seems like it would be a good fit for your scenario.

|||Thanks. That will help me to start out.

No comments:

Post a Comment