Tuesday, March 27, 2012

History/archive database

Anybody have any favorite approach to recommend regarding archiving?
We have an OLTP database that we'd like to keep small in the interest
of speed, and on the other hand need to keep virtually all records
around for seven years for auditing and tax purposes. So, the obvious
thing is to have a process that removes closed or otherwise obsolete
logical entities from the OLTP database, and some other process that
moves them to the archive, before or after being removed from OLTP.
(I have done this before and understand the many problems involved in
trying to declare something "closed or otherwise obsolete", but am
looking for discussion, latest ideas, best practices, etc.)
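Just to make the first part concrete, the kind of move-and-delete batch I
have in mind looks roughly like this (a rough sketch only; the table and
column names below are made up for illustration, not our real schema):

BEGIN TRANSACTION;

-- Copy anything that has been closed for more than a year into the archive database.
INSERT INTO Archive.dbo.OrderArchive (OrderID, CustomerID, Status, ClosedDate, Amount)
SELECT o.OrderID, o.CustomerID, o.Status, o.ClosedDate, o.Amount
FROM   dbo.Orders AS o
WHERE  o.Status = 'Closed'
  AND  o.ClosedDate < DATEADD(year, -1, GETDATE());

-- Then remove those same rows from the OLTP database.
DELETE o
FROM   dbo.Orders AS o
WHERE  o.Status = 'Closed'
  AND  o.ClosedDate < DATEADD(year, -1, GETDATE());

COMMIT TRANSACTION;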
We've had suggestions on one side to keep only the hottest records on
the OLTP side, and then make it transparent to queries when they need to
access something "historical" (like yesterday's data). I'm more accustomed
to end-of-year processes to clean stuff out. But, we may also want to
duplicate (or replicate) stuff out on a daily or intra-day basis, in
order to run as many reports as possible on the archive instead of the
OLTP, potentially even current YTD reports.
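For the daily duplication, I'm picturing something along these lines per
table (again, the names are placeholders; this assumes each table carries a
LastModified datetime and that there's a small ETL control table to remember
the high-water mark):

-- Daily incremental copy from OLTP to the archive; inserts only, for simplicity.
-- Rows that get updated after being copied would need a separate UPDATE
-- (or delete-and-reinsert) step.
DECLARE @since datetime, @now datetime;
SET @now = GETDATE();

SELECT @since = LastCopied
FROM   Archive.dbo.EtlControl
WHERE  TableName = 'Orders';

INSERT INTO Archive.dbo.Orders (OrderID, CustomerID, Status, ClosedDate, Amount)
SELECT o.OrderID, o.CustomerID, o.Status, o.ClosedDate, o.Amount
FROM   dbo.Orders AS o
WHERE  o.LastModified > @since
  AND  o.LastModified <= @now
  AND  NOT EXISTS (SELECT 1 FROM Archive.dbo.Orders AS a WHERE a.OrderID = o.OrderID);

UPDATE Archive.dbo.EtlControl
SET    LastCopied = @now
WHERE  TableName = 'Orders';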
So, many questions, like should the archive database share the OLTP
schema, be a DW-style star, or something else? Current thinking is
that, for now, we'll keep the archive on the same schema, and
sometime in the next year or three we'll make the transition to a
star. ETL should be custom-written DTS/SQL code rather than, say,
merge replication. Latency should be about 24 hours from OLTP to
archive for duplication, and about once a year for removal from OLTP.
Sound reasonable, or does anything here sound like a mistake?
Thanks.
Josh
Josh,
It's pretty tough to give enough info in a newsgroup post to warrant the
best advice for something like this. There are many things to consider, and
knowing more about how your environment is set up and works on a daily basis
will play a big role in how to do this. Some things to keep in mind, though:
Smaller tables do not necessarily mean faster queries. With proper indexing,
size does not matter as much for a true OLTP system. In general, maintenance
gets trickier as size increases. One thing you may want to look at is SQL2005
and its partitioning features. You don't say how rows are determined to be
"Closed", but partitioning may give you some options that will change your
mind.
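For example, something along these lines becomes possible in 2005 (very
rough sketch; the table, column names, and boundary dates are just
placeholders, not anything specific to your system):

-- SQL Server 2005 range partitioning on a date column (example only).
CREATE PARTITION FUNCTION pfOrdersByYear (datetime)
AS RANGE RIGHT FOR VALUES ('20050101', '20060101', '20070101');

CREATE PARTITION SCHEME psOrdersByYear
AS PARTITION pfOrdersByYear ALL TO ([PRIMARY]);

CREATE TABLE dbo.Orders
(
    OrderID    int      NOT NULL,
    ClosedDate datetime NOT NULL,
    -- ... other columns ...
    CONSTRAINT PK_Orders PRIMARY KEY (OrderID, ClosedDate)
) ON psOrdersByYear (ClosedDate);

-- Ageing out an old year then becomes a fast, metadata-only switch into a
-- staging table of identical structure, instead of a huge DELETE:
-- ALTER TABLE dbo.Orders SWITCH PARTITION 1 TO dbo.OrdersArchiveStaging;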
Andrew J. Kelly SQL MVP
"jxstern" <jxstern@.nowhere.xyz> wrote in message
news:hgf7o1pasmbsi3um88a6cmq6511d447dd6@.4ax.com...
> Anybody have any favorite approach to recommend regarding archiving?
> We have an OLTP database that we'd like to keep small in the interest
> of speed, and on the other hand need to keep virtually all records
> around for seven years for auditing and tax purposes. So, the obvious
> thing is to have a process that removes closed or otherwise obsolete
> logical entities from the OLTP database, and some other process that
> moves them to the archive, before or after being removed from OLTP.
> (I have done this before and understand the many problems involved in
> trying to declare something "closed or otherwise obsolete", but am
> looking for discussion, latest ideas, best practices, etc)
> We've had suggestions on one side to keep only the hottest records on
> the OLTP side, and then try to make it invisible when a query needs to
> access something "historical" (like yesterday). I'm more accustomed
> to end-of-year processes to clean stuff out. But, we may also want to
> duplicate (or replicate) stuff out on a daily or intra-day basis, in
> order to run as many reports as possible on the archive instead of the
> OLTP, potentially even current YTD reports.
> So, many questions, like should the archive database share the OLTP
> schema, or be a DW-style star, or something else? Thinking seems to
> be that for now, yes we'll keep the archive on the same schema, and
> sometime in the next year or three we'll make the transition to a
> star. ETL should be custom-written DTS/SQL code rather than, say,
> merge replication. Latency should be about 24 hours from OLTP to
> archive for duplication, and about once a year for removal from OLTP.
> Sound reasonable, or does anything here sound like a mistake?
> Thanks.
> Josh
>
On Wed, 23 Nov 2005 09:56:07 -0500, "Andrew J. Kelly"
<sqlmvpnooospam@.shadhawk.com> wrote:
Andrew,
Thanks for the comments.
I'm busy trying to identify all the info that I can about our
environment, so I couldn't even give it all yet in any case.
On size, I agree that a well-designed database should be relatively
insensitive to size, and I sure hope that's true. We've been running our
new application against a database of about 3 GB, but as we ramp up
production it should start growing by about 1 GB/month now, and
5 GB/month soon, so if we're going to be able to store even one year's
data in the main OLTP database, it had better scale gracefully from
3 GB to about 60 GB! My history concerns are in this context: we need to
start planning now for how to hold multiple years of data at roughly
60 GB per year. Certainly partitioning is going to be called for:
partitioned views right now in SQL2K, and (I suppose) the new
partitioned tables when we get to SQL2005.
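(For SQL2K I'm picturing a local partitioned view along these lines; the
per-year tables and CHECK constraints below are just for illustration, not
our actual schema.)

-- SQL 2000 local partitioned view over per-year member tables (example only).
-- Each member table carries a CHECK constraint on the partitioning column so
-- the optimizer can skip the tables a query doesn't need.
CREATE TABLE dbo.Orders2004
(
    OrderID    int      NOT NULL PRIMARY KEY,
    ClosedDate datetime NOT NULL
        CHECK (ClosedDate >= '20040101' AND ClosedDate < '20050101')
);
CREATE TABLE dbo.Orders2005
(
    OrderID    int      NOT NULL PRIMARY KEY,
    ClosedDate datetime NOT NULL
        CHECK (ClosedDate >= '20050101' AND ClosedDate < '20060101')
);
GO

CREATE VIEW dbo.OrdersAll
AS
SELECT OrderID, ClosedDate FROM dbo.Orders2004
UNION ALL
SELECT OrderID, ClosedDate FROM dbo.Orders2005;
GO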
Should we keep it all in something as similar as possible to the OLTP
schema, or in a DW schema, or in a special archive schema? Questions
like that.
Just looking for odd hints and war stories.
Happy Thanksgiving to all!
Josh
