Monday, March 19, 2012

High CPU Utilization on Periodic basis

My db is supporting a high traffic website. It seems that
every 2-3 days, the total cpu utilization suddenly shoots
up to 100% and stays there until I restart sql server.
I've used perf monitor to log total cpu, sql cpu, sql
threads, etc over a period of days and it is relatively
stable with plenty of resources until this sudden spike
occurs. This is not gradual, and I do not see any sudden
spike in memory. Spike is only in cpu, and also # of sql
threads (goes up to around 100). Average total cpu
normally is around 50%.
I've run the blocker script, but there do not seem to be
any blocks going on. I've run profiler to try to catch any
never-ending sp calls or sp hogs, but haven't been able to
catch it either.
It may be that we just need to add more resources, but the
fact that this occurs on a semi-regular schedule is what
is troubling. We don't appear to be getting a sudden
increase in traffic during the problm. Could this be
symptom of an attack? It doesn't seem to be a slow leak-
type, since the graphs show a sudden spike, not a ramp up.
What is unusual, is that the sp that shows up in profiler
as a the most predominant hog during the abnormal high cpu
utlization condition, doesn't normally show up under
normal conditions.What is your system setup (Win Vers, SQL Vers, Service Packs)|||some thoughts...
* cpu spikes like this are very common. I do a lot of tuning and see this
regulalry...
* you would not expect to see blocking during a CPU spike you would expect
to see lower CPU if there are significant waits due to blocking...
* you point out that a particular proc seems to be run during the spike?
Have you looked at the cpu utilization for this proc?
* many times... when I tune things like this... it's not a single big,
expensive proc that causes the spike... but a piece of app code that is
causing a large number of relatively inexpensive procs to be run VERY
frequently. Don't rule that option out by looking just for procs that are
consuming a lot of CPU...
* all the symptoms you're outlined are consistent with a pretty run of the
mill application induced spike. Nothing that seems to strongly suggest an
attack of any kind. Don't rule that out of course... but I'd focus on
profiler to see what sql is using most of the CPU...
--
Brian Moran
Principal Mentor
Solid Quality Learning
SQL Server MVP
http://www.solidqualitylearning.com
"Cynthia" <anonymous@.discussions.microsoft.com> wrote in message
news:0cae01c3f592$333ef460$3a01280a@.phx.gbl...
> My db is supporting a high traffic website. It seems that
> every 2-3 days, the total cpu utilization suddenly shoots
> up to 100% and stays there until I restart sql server.
> I've used perf monitor to log total cpu, sql cpu, sql
> threads, etc over a period of days and it is relatively
> stable with plenty of resources until this sudden spike
> occurs. This is not gradual, and I do not see any sudden
> spike in memory. Spike is only in cpu, and also # of sql
> threads (goes up to around 100). Average total cpu
> normally is around 50%.
> I've run the blocker script, but there do not seem to be
> any blocks going on. I've run profiler to try to catch any
> never-ending sp calls or sp hogs, but haven't been able to
> catch it either.
> It may be that we just need to add more resources, but the
> fact that this occurs on a semi-regular schedule is what
> is troubling. We don't appear to be getting a sudden
> increase in traffic during the problm. Could this be
> symptom of an attack? It doesn't seem to be a slow leak-
> type, since the graphs show a sudden spike, not a ramp up.
> What is unusual, is that the sp that shows up in profiler
> as a the most predominant hog during the abnormal high cpu
> utlization condition, doesn't normally show up under
> normal conditions.
>|||Turn off autogrow for dbs, log files and especially temp db.....manage
growth manually.
"Cynthia" <anonymous@.discussions.microsoft.com> wrote in message
news:0cae01c3f592$333ef460$3a01280a@.phx.gbl...
> My db is supporting a high traffic website. It seems that
> every 2-3 days, the total cpu utilization suddenly shoots
> up to 100% and stays there until I restart sql server.
> I've used perf monitor to log total cpu, sql cpu, sql
> threads, etc over a period of days and it is relatively
> stable with plenty of resources until this sudden spike
> occurs. This is not gradual, and I do not see any sudden
> spike in memory. Spike is only in cpu, and also # of sql
> threads (goes up to around 100). Average total cpu
> normally is around 50%.
> I've run the blocker script, but there do not seem to be
> any blocks going on. I've run profiler to try to catch any
> never-ending sp calls or sp hogs, but haven't been able to
> catch it either.
> It may be that we just need to add more resources, but the
> fact that this occurs on a semi-regular schedule is what
> is troubling. We don't appear to be getting a sudden
> increase in traffic during the problm. Could this be
> symptom of an attack? It doesn't seem to be a slow leak-
> type, since the graphs show a sudden spike, not a ramp up.
> What is unusual, is that the sp that shows up in profiler
> as a the most predominant hog during the abnormal high cpu
> utlization condition, doesn't normally show up under
> normal conditions.
>|||I agree that you should manage growth by doing it manually (or pre arranged
scheduled task) but you should leave auto grow on just in case. Hopefully
you won't ever use it if you manage it correctly but it is a good fail safe.
--
Andrew J. Kelly
SQL Server MVP
"Tiffany" <tiffany.edwards@.vodafone.net> wrote in message
news:evJbGui9DHA.3404@.TK2MSFTNGP09.phx.gbl...
> Turn off autogrow for dbs, log files and especially temp db.....manage
> growth manually.
>
> "Cynthia" <anonymous@.discussions.microsoft.com> wrote in message
> news:0cae01c3f592$333ef460$3a01280a@.phx.gbl...
> > My db is supporting a high traffic website. It seems that
> > every 2-3 days, the total cpu utilization suddenly shoots
> > up to 100% and stays there until I restart sql server.
> >
> > I've used perf monitor to log total cpu, sql cpu, sql
> > threads, etc over a period of days and it is relatively
> > stable with plenty of resources until this sudden spike
> > occurs. This is not gradual, and I do not see any sudden
> > spike in memory. Spike is only in cpu, and also # of sql
> > threads (goes up to around 100). Average total cpu
> > normally is around 50%.
> >
> > I've run the blocker script, but there do not seem to be
> > any blocks going on. I've run profiler to try to catch any
> > never-ending sp calls or sp hogs, but haven't been able to
> > catch it either.
> >
> > It may be that we just need to add more resources, but the
> > fact that this occurs on a semi-regular schedule is what
> > is troubling. We don't appear to be getting a sudden
> > increase in traffic during the problm. Could this be
> > symptom of an attack? It doesn't seem to be a slow leak-
> > type, since the graphs show a sudden spike, not a ramp up.
> >
> > What is unusual, is that the sp that shows up in profiler
> > as a the most predominant hog during the abnormal high cpu
> > utlization condition, doesn't normally show up under
> > normal conditions.
> >
>|||the bad thing about profiler is that it won't catch infinite loop
problems.
you can get the sqlstatement started event (which doesn't show you how
much it cpu it is using), and you'll never get the sqlstatement ended
event because it's in an infinite loop. the ended event is the one that
will show you how much cpu the statement or sp used.
try checking the current activity in enterprise manager (or check
sysprocesses) to find the spid(s) that are using the high cpu and go
from there.
i've had the exact same problem as you (although it was a random
event). i simply killed the problematic spid and everything returned to
normal.
Cynthia wrote:
> My db is supporting a high traffic website. It seems that
> every 2-3 days, the total cpu utilization suddenly shoots
> up to 100% and stays there until I restart sql server.
> I've used perf monitor to log total cpu, sql cpu, sql
> threads, etc over a period of days and it is relatively
> stable with plenty of resources until this sudden spike
> occurs. This is not gradual, and I do not see any sudden
> spike in memory. Spike is only in cpu, and also # of sql
> threads (goes up to around 100). Average total cpu
> normally is around 50%.
> I've run the blocker script, but there do not seem to be
> any blocks going on. I've run profiler to try to catch any
> never-ending sp calls or sp hogs, but haven't been able to
> catch it either.
> It may be that we just need to add more resources, but the
> fact that this occurs on a semi-regular schedule is what
> is troubling. We don't appear to be getting a sudden
> increase in traffic during the problm. Could this be
> symptom of an attack? It doesn't seem to be a slow leak-
> type, since the graphs show a sudden spike, not a ramp up.
> What is unusual, is that the sp that shows up in profiler
> as a the most predominant hog during the abnormal high cpu
> utlization condition, doesn't normally show up under
> normal conditions.|||<<
you can get the sqlstatement started event (which doesn't show you how
much it cpu it is using), and you'll never get the sqlstatement ended
event because it's in an infinite loop. the ended event is the one that
will show you how much cpu the statement or sp used.
depends on how you define an infinite loop. Loops just mean you're doing the
same thing over and over. Of course there will be stmt ending events...
you will not see the end event if there is a bug in sql and the command
truly never ends, but that is quite rare. A command may run for a while...
but it will eventually end... and if something is running THAT long...
it's VERY easy to find the culprit in sysprocesses as you point out.
--
Brian Moran
Principal Mentor
Solid Quality Learning
SQL Server MVP
http://www.solidqualitylearning.com
"ch" <ch@.dontemailme.com> wrote in message
news:40337E84.50912EDB@.dontemailme.com...
> the bad thing about profiler is that it won't catch infinite loop
> problems.
> you can get the sqlstatement started event (which doesn't show you how
> much it cpu it is using), and you'll never get the sqlstatement ended
> event because it's in an infinite loop. the ended event is the one that
> will show you how much cpu the statement or sp used.
> try checking the current activity in enterprise manager (or check
> sysprocesses) to find the spid(s) that are using the high cpu and go
> from there.
> i've had the exact same problem as you (although it was a random
> event). i simply killed the problematic spid and everything returned to
> normal.
>
> Cynthia wrote:
> > My db is supporting a high traffic website. It seems that
> > every 2-3 days, the total cpu utilization suddenly shoots
> > up to 100% and stays there until I restart sql server.
> >
> > I've used perf monitor to log total cpu, sql cpu, sql
> > threads, etc over a period of days and it is relatively
> > stable with plenty of resources until this sudden spike
> > occurs. This is not gradual, and I do not see any sudden
> > spike in memory. Spike is only in cpu, and also # of sql
> > threads (goes up to around 100). Average total cpu
> > normally is around 50%.
> >
> > I've run the blocker script, but there do not seem to be
> > any blocks going on. I've run profiler to try to catch any
> > never-ending sp calls or sp hogs, but haven't been able to
> > catch it either.
> >
> > It may be that we just need to add more resources, but the
> > fact that this occurs on a semi-regular schedule is what
> > is troubling. We don't appear to be getting a sudden
> > increase in traffic during the problm. Could this be
> > symptom of an attack? It doesn't seem to be a slow leak-
> > type, since the graphs show a sudden spike, not a ramp up.
> >
> > What is unusual, is that the sp that shows up in profiler
> > as a the most predominant hog during the abnormal high cpu
> > utlization condition, doesn't normally show up under
> > normal conditions.
>|||Here's an update for anyone interested.
I believe the problem was just a combination of high
traffic and the need for some sp optimization. There were
no looping or blocking sps.
Here's how I troubleshot the problem:
I used sql profiler over a period of time to see which sps
were cpu hogs during the normal periods and the high cpu
periods. The profiler brought to my attention one sp
during the normal periods, and another sp during the peak
periods. Both of these sps were each doing both a select
and insert or update on tables that contained many rows.
I then used query analyzer's execution plan together with
profiler to monitor the performance of those sps on an
offline database copy.
I then measured the performance with and without indexes
on the tables (the tables originally did not have indexes
on columns being searched) I saw a tremendous select
performance improvment with the indexes, with negligible
impact to insertion performance.
I added the indexes to the live servers and got fantastic
results. Now, the cpus are probably underutilized as the
total and sql cpu % is well below 10%. It has been running
like this for many days now and the response time of the
website is excellent.
Thanks to all of those of provided feedback.
>--Original Message--
>My db is supporting a high traffic website. It seems that
>every 2-3 days, the total cpu utilization suddenly shoots
>up to 100% and stays there until I restart sql server.
>I've used perf monitor to log total cpu, sql cpu, sql
>threads, etc over a period of days and it is relatively
>stable with plenty of resources until this sudden spike
>occurs. This is not gradual, and I do not see any sudden
>spike in memory. Spike is only in cpu, and also # of sql
>threads (goes up to around 100). Average total cpu
>normally is around 50%.
>I've run the blocker script, but there do not seem to be
>any blocks going on. I've run profiler to try to catch
any
>never-ending sp calls or sp hogs, but haven't been able
to
>catch it either.
>It may be that we just need to add more resources, but
the
>fact that this occurs on a semi-regular schedule is what
>is troubling. We don't appear to be getting a sudden
>increase in traffic during the problm. Could this be
>symptom of an attack? It doesn't seem to be a slow leak-
>type, since the graphs show a sudden spike, not a ramp up.
>What is unusual, is that the sp that shows up in profiler
>as a the most predominant hog during the abnormal high
cpu
>utlization condition, doesn't normally show up under
>normal conditions.
>.
>

No comments:

Post a Comment