Friday, March 30, 2012

Import Puzzle

Hello -

I have three feeds from sources around the world, each coming in at a
separate time.
These feeds move into a large table (3GB) that is queried by managers.
The feeds are loaded sequentially, and then the previous day's feed
rows are deleted from the table (this is done so that the user's
application is never without data).

The issue is that the import takes a lot of time, as do the deletes.
This is hurting performance significantly. I attempted to fix the
problem by creating separate tables for each feed. I then created a
view with the original table's name and used UNION ALL's. My intention
was that as each feed came in, I'd alter the view with the new table's
name, and then truncate the older table. This met both goals of
concurrency and import/delete speed.

Unfortunately, this view seems to ignore the indexes on the underlying
tables, which devastates performance. I can't index the view, since
altering it makes the index less useful.
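For reference, a minimal sketch of the view-swap approach described above (all table and view names are illustrative):

```sql
-- FeedData is the name the managers' application queries.
-- As each feed arrives, repoint the view at the freshly loaded table...
ALTER VIEW dbo.FeedData
AS
    SELECT * FROM dbo.Feed_Americas_Today
    UNION ALL
    SELECT * FROM dbo.Feed_Europe_Today
    UNION ALL
    SELECT * FROM dbo.Feed_Asia_Today;
GO

-- ...then clear yesterday's table cheaply once the view no longer
-- references it. TRUNCATE is minimally logged, unlike row-by-row DELETE.
TRUNCATE TABLE dbo.Feed_Americas_Yesterday;
```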

I'm looking for a different strategy for loading and deleting the
data, all without disruption to the applications. I'd appreciate any
suggestions...

woodyb@.hotmail.com (Buck Woody) wrote in message news:<d4e6e94d.0311250951.76030a77@.posting.google.com>...
> Hello -
> I have three feeds from sources around the world, each coming in at a
> separate time.
> These feeds move into a large table (3GB) that is queried by managers.
> The feeds are loaded sequentially, and then the previous day's feed
> rows are deleted from the table (this is done so that the user's
> application is never without data).
? You delete afterward so the users are never without data ?

> The issue is that the import takes a lot of time, as do the deletes.
> This is hurting performance significantly. I attempted to fix the
This is probably due to your managers querying the data and applying
shared read locks -- while you're trying to insert/delete.

> problem by creating separate tables for each feed. I then created a
> view with the original table's name and used UNION ALL's. My intention
> was that as each feed came in, I'd alter the view with the new table's
> name, and then truncate the older table. This met both goals of
> concurrency and import/delete speed.
Did you put NOLOCK on the view?

> Unfortunately, this view seems to ignore the indexes on the underlying
> tables, which devastates performance. I can't index the view, since
> altering it makes the index less useful.
Which version of MSSQLSERVER are you running? Only 2000 has indexed
views.

> I'm looking for a different strategy for loading and deleting the
> data, all without disruption to the applications. I'd appreciate any
> suggestions...
Ideally you would have 2 separate tables: "real data" and a nice
pretty one for your managers. But if you don't have space etc...,
make sure you put NOLOCKS or "set transaction isolation level read
uncommitted" on every stored procedure and view your managers use.
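As a sketch, the dirty-read option can be applied either per query or per session (table and column names are illustrative):

```sql
-- Per query: NOLOCK hint on the table or view reference.
SELECT feed_id, region, amount
FROM dbo.FeedData WITH (NOLOCK)
WHERE region = 'Europe';

-- Per procedure/session: set the isolation level once at the top.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT feed_id, region, amount
FROM dbo.FeedData;
```

Be aware that dirty reads can return rows from in-flight loads, which may be acceptable for management reporting but not for reconciliation.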

Hey don't sweat the small stuff. Everything is small stuff. Good
luck.
-- Louis

"Buck Woody" <woodyb@.hotmail.com> wrote in message
news:d4e6e94d.0311250951.76030a77@.posting.google.com...

The two parts that often slow down inserts, updates and deletes are the
indexes and transaction log.

For the transaction log: if data protection is not required (you can simply
re-import the feeds, and they are stale after one day anyway), then maybe you
could load through a # temp table. tempdb uses minimal logging and is often
largely memory-resident, so inserts there run much faster and generate far
less log activity.
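A minimal sketch of staging a feed load through tempdb (file names, columns, and table names are illustrative):

```sql
-- Stage the raw feed in tempdb, where logging overhead is minimal.
CREATE TABLE #feed_stage (feed_id INT, region VARCHAR(20), amount MONEY);

BULK INSERT #feed_stage
FROM 'C:\feeds\americas.dat'
WITH (FIELDTERMINATOR = ',', TABLOCK);

-- Clean/transform in the staging table as needed, then move into the
-- real table in one set-based statement.
INSERT INTO dbo.FeedData (feed_id, region, amount)
SELECT feed_id, region, amount
FROM #feed_stage;

DROP TABLE #feed_stage;
```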

For indexes, why not drop the indexes before you do the import and delete,
and recreate them afterward? This is much faster than having the (b-tree)
indexes rebalance themselves after every inserted row.
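A sketch of that pattern, combined with the fill-factor idea below (index and table names are illustrative):

```sql
-- Drop the nonclustered index before the bulk load...
DROP INDEX dbo.FeedData.IX_FeedData_Region;

-- ...perform the import and the deletes here...

-- ...then rebuild it afterward, with a low fill factor so subsequent
-- inserts have free space on each page before splits are needed.
CREATE INDEX IX_FeedData_Region
ON dbo.FeedData (region)
WITH FILLFACTOR = 70;
```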

If you can't drop the indexes then consider a low fill factor on the indexes
so they have plenty of space to grow without rebalancing.

Quick thought off the top of my head.

Generate each table (x3). Create a view over these and generate a new
table from it to combine the data. Then create a final table for
your guys to work with. You can fully control the indexes that are on
the final table, and you can mess about with the data in each of the three
sets. You will see the performance cost when creating the combined
table, but the users won't. The move from that into the table being
reported on should be reasonably quick in comparison, as you are simply
moving bulk data, not combining three sets and moving them at the same
time.
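A rough sketch of that staging pipeline (all object names are illustrative):

```sql
-- 1. Combine the three feed tables into a staging table. Users never
--    query this one, so its build cost is invisible to them.
SELECT *
INTO dbo.FeedData_Staging
FROM (SELECT * FROM dbo.Feed_Americas
      UNION ALL SELECT * FROM dbo.Feed_Europe
      UNION ALL SELECT * FROM dbo.Feed_Asia) AS combined;

-- 2. Move the staged data into the indexed reporting table as one
--    bulk, set-based operation.
BEGIN TRAN;
DELETE FROM dbo.FeedData;   -- clear yesterday's rows
INSERT INTO dbo.FeedData
SELECT * FROM dbo.FeedData_Staging;
COMMIT;

DROP TABLE dbo.FeedData_Staging;
```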

HTH

