Showing posts with label iis. Show all posts

Friday, March 23, 2012

High reliability == single point of failure?

I am trying to learn about high-reliability clusters (for IIS and Sql in my
case) but there is a fundamental thing I don't understand about the approach
that keeps me from swallowing the high-reliability kool-aid:
Currently, I have a "non-high-reliability" (low-reliability?) system --
i.e., one server that handles IIS, Sql, and data store all at once. This
server uses RAID 1 and has redundant power supplies. Let's assume that the
probability of catastrophic failure of this server is P.
Now I move to a high-reliability solution. Microsoft suggests at least two
machines handling IIS in an active/active configuration, two machines
handling Sql in an active/passive configuration, and one SAN-style data
store running RAID 1 (at minimum) with redundant power supplies. For the
sake of argument, let's assume that all five of these new boxes have similar
complexity both to each other and to the server in my original
low-reliability system, and therefore that the probability of individual
failure of any of these servers is also P.
The weakness of my original low-reliability system is that there is a single
point of failure (the server) which may fail catastrophically with
probability P. But my new high-reliability system also has a single point of
failure (the storage disk array) which also may fail catastrophically with
probability P. What exactly have I gained?
Michael Carr
Most SAN storage arrays have multiple controllers, multiple SAN switches
and multiple disks configured in a redundant RAID array.
Therefore, for your storage array to fail completely, multiple of its
components would need to fail at the same time.
So, in your example, although the storage array may seem like a single point
of failure, due to its redundant nature it is far less likely to fail
completely than any of your other components.
HTH
Lee
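Lee's point can be put in numbers with a quick back-of-the-envelope sketch (the per-component failure probabilities below are made-up illustrative values, not vendor figures): if redundant copies fail independently, the array only loses a component when every copy of it fails, so each duplicated pair turns a probability p into p squared.

```python
def all_fail(p, n):
    """Probability that all n redundant copies of a component fail,
    assuming independent failures with per-copy probability p."""
    return p ** n

def any_fails(probs):
    """Probability that at least one component in a chain fails, given
    each component's individual failure probability."""
    survives = 1.0
    for p in probs:
        survives *= 1.0 - p
    return 1.0 - survives

p = 0.01  # illustrative per-component failure probability
single_box = p  # the original single-server SPOF
# A SAN with duplicated controllers, duplicated switches, and mirrored
# disks only fails if some duplicated pair fails entirely:
san = any_fails([all_fail(p, 2)] * 3)
print(single_box, san)  # 0.01 vs roughly 0.0003
```

This is of course only as good as the independence assumption; correlated failures (shared firmware bugs, power events) erode the p-squared advantage.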
"Michael Carr" <mcarr@.umich.edu> wrote in message
news:uX2oCtewFHA.2792@.tk2msftngp13.phx.gbl...
> [original post quoted in full; snipped]
|||Let's focus on the storage array first. A standard SCSI enclosure runs on a
single backplane, single controller system. You can run a multi-channel
controller card to two enclosures set up with RAID 1+0 across the enclosures
but you are still on a single controller. A mid-range SAN typically has two
internal controllers designed to check and supplement each other. You also
have multiple attachments (HBAs) from the host computer to the SAN, multiple
power supplies, and multiple enclosure racks. Basically, a well-designed
SAN implementation seems like a single array but in reality the only
non-redundant part of the SAN is the sheet metal enclosure. Those have a
very low failure rate once they are installed.
You can still drop your main production database through operator error, but
that is where processes and procedure either help or hurt your availability.
BTW, clustering IIS servers is a waste of time and money. Those are
typically set up as a pool of stateless servers where computers can be
dropped offline or brought online and the worst a user sees is a slow
response or maybe a retry. The idea of dedicating a server to each task is
for scalability, stability, and manageability.
Geoff N. Hiten
Senior Database Administrator
Microsoft SQL Server MVP
"Michael Carr" <mcarr@.umich.edu> wrote in message
news:uX2oCtewFHA.2792@.tk2msftngp13.phx.gbl...
> [original post quoted in full; snipped]
|||"Michael Carr" <mcarr@.umich.edu> wrote in message
news:uX2oCtewFHA.2792@.tk2msftngp13.phx.gbl...
> [original post quoted in full; snipped]
P, in your case, is the probability of your non-HA environment failing; P'
is the probability of the SAN array failing.
P' is extremely low because all internal channels of a SAN have multiple
paths, all drives configured within the SAN are RAID protected at some
level or another, and the power supplies and fans are highly redundant.
Basically, in a high-end SAN, it would take several failed components to
cause a failure of the SAN itself. Since the SAN is so vital, it will
normally have monitoring solutions keeping an eye on it and, in most cases,
a modem that dials the vendor to report any failure or pre-failure
conditions.
So P is much higher than P'.
Russ Kaufmann
MVP - Windows Server - Clustering
http://www.clusterhelp.com - Cluster Website
http://msmvps.com/clusterhelp - New Blog
http://spaces.msn.com/members/russkaufmann - Old Blog
|||"Geoff N. Hiten" <SRDBA@.Careerbuilder.com> wrote in message
news:OZ4pC$fwFHA.3756@.tk2msftngp13.phx.gbl...

> BTW, clustering IIS servers is a waste of time and money. Those are
> typically set up as a pool of stateless servers where computers can be
> dropped offline or brought online and the worst a user sees is a slow
> response or maybe a retry.
I may not agree with Geoff on this subject, depending on the definition of
the cluster referred to in the original post. If Geoff interpreted
Michael's post to mean IIS would be configured in a server cluster using
MSCS, then I agree with him: it is a waste of time and money.
However, if Michael meant NLB clustering, then I do not agree with Geoff on
this subject. IIS with NLB is a very good solution in that it:
1. Provides horizontal scaling of the application front end.
2. Provides high availability, in that users can fail over to a surviving
node in the NLB cluster in the event of a node failure.
Applications that do require state information often maintain that state
in the SQL database or in cookies on the client side.
Clustering IIS servers through NLB is not a waste of time or money if your
business depends on the availability of your web-based applications.
Russ Kaufmann
MVP - Windows Server - Clustering
http://www.clusterhelp.com - Cluster Website
http://msmvps.com/clusterhelp - New Blog
http://spaces.msn.com/members/russkaufmann - Old Blog
|||I was referring to clustering IIS using MSCS as a waste of time and money.
IIS scale-out should be done with NLB clustering or any other load-balancing
system. I am indifferent as to whether NLB or a third-party solution is
used. As always, which method is best for you will depend on your exact
situation and requirements. I have used NLB in the past and have found it
quite useful.
Geoff N. Hiten
Senior Database Administrator
Microsoft SQL Server MVP
"Russ Kaufmann [MVP]" <russ@.exchangemct.com> wrote in message
news:eXxH1RswFHA.2656@.TK2MSFTNGP09.phx.gbl...
> [quoted text snipped]

Monday, March 19, 2012

high cpu, low speed

I have a high-CPU problem. I don't know why the many .Net SqlClient Data
Provider and IIS sessions hold so much CPU resource; memory usage, by the
way, is normal.

When I restart SQL Server, the CPU time of the sqlserver.exe process shown
in Windows Task Manager starts low, but it grows gradually, and within two
days it can grow to as high as 2:xx:xx.

Can anyone tell me why the .Net SqlClient Data Provider and IIS sessions
hold so much CPU resource, and why the CPU time of sqlserver.exe grows
gradually? Thanks a lot.

Below is part of a trace report:

TextData | ApplicationName | Duration | StartTime | Reads | Writes | CPU
[Trace rows omitted: the column values ran together in the original post and
cannot be reliably separated. Most rows are NULL-TextData events from
".Net SqlClient Data Provider" and "Internet Information Services", listed
in descending order of CPU; the few captured statements are calls to an
`exec p_splitpage` paging procedure and ad hoc SELECTs against the UserInfo
and Links tables.]

If I recall correctly, CPU time is the cumulative time spent. It is NOT
the amount of the CPU that is currently being used, but rather the
amount of CPU time that has been used since the process started.
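The difference between cumulative CPU time and current CPU usage can be made concrete with a small sketch (illustrative numbers only): sample the cumulative counter twice and divide the delta by the wall-clock interval.

```python
def avg_utilization(cpu_t0, cpu_t1, wall_t0, wall_t1, n_cpus=1):
    """Average CPU utilization between two samples of a cumulative
    CPU-time counter (all values in seconds)."""
    return (cpu_t1 - cpu_t0) / ((wall_t1 - wall_t0) * n_cpus)

# A process that has accumulated 2 hours of CPU time over 2 days of
# uptime has averaged only about 4% of one CPU:
print(avg_utilization(0, 2 * 3600, 0, 48 * 3600))  # ≈ 0.0417
```

So a sqlserver.exe CPU time that grows to 2:xx:xx over two days is expected behavior, not by itself a sign of runaway CPU usage.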

High CPU utilization on Merge Replication with SQL 2005 Mobile

I have a question for anyone who has some tips/pointers for optimizing SQL merge replication publications.

The front end web server is running IIS 6.0 on Windows 2003 x86 Server Standard (Server A). The back end database server is running SQL 2000 Standard on Windows 2003 x86 Standard (Server B). The merge replication clients connect via HTTPS over the Internet from a custom C#.NET 2005 application using SQL 2005 Mobile running on Windows Mobile 5.0 (Client).

The publication itself has several filters on it. The entry point uses the user's Windows username to start the filter. Based on the user, it then filters the records in multiple tables. There are 68 articles and 44 filter statements. The filters extend multiple layers deep, in other words they are not all filtering off the HOST_NAME() variable, some tables filter from records in tables that filter from the HOST_NAME() variable. The publication is set to minimize data sent to the clients, and considers a subscription out of date if it has not synced in the last 4 days. All the rowguids are indexed as well.

There are approximately 35 clients actively using the application at any given time. On average, a client will initiate a merge replication 3-4 times per hour from 8am-5pm. Generally, a sync will take between 10 seconds and 2 minutes to complete, with most of them being around 30 seconds on average.

When a client starts a sync, there is a spike to about 50% on the server's CPU graph. If multiple clients attempt to sync at the same time the CPU utilization can be pushed to 100% for extended periods (more than 30 seconds).

I recently completed a project to increase the bandwidth available to the clients, and plan to reduce the number of filters significantly (although this will obviously increase the amount of data going to the clients and the storage needs on the individual devices). I also plan on changing the setting to not minimize the amount of data sent to the clients.

Having said all that, does anyone have any information about how to further optimize merge publications to mobile clients? The next publication will be on SQL 2005 x64 Standard if I can solve the issues in the test environment. I would like to enhance the publication as much as possible to make the end user experience better than it currently is.

Thanks!

You're talking about CPU usage at the publisher, correct?

Can you double check that all columns involved in the merge join filters are indexed as well? If the columns are not indexed, this leads to table scans during syncs which can result in high CPU usage.

Are you also getting conflicts? There's a performance issue (which will be fixed in SP1) that can slow things down due to missing indexes on some conflict tables, but this shouldn't be an issue unless you're getting hundreds and hundreds of conflicts.

Also, how "deep" are your filters? Do the merge join filters have one-to-many relationships, or many-to-many (see the @join_unique_key parameter)? The more levels deep you are, or any level that contains @join_unique_key = 0, can negatively affect performance.

Regardless, you can always run Profiler at the publisher and trace a single subscriber to see which procs are consuming the most time. You can start with RPC:Completed or SP:Completed and just grab the duration; you'll quickly see which procs are the problematic ones. From there, you can enable SP:StmtCompleted and the Execution Plan event to see exactly which statement is slow and why.
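Once the trace is saved, finding the heaviest statements is a simple aggregation. A minimal sketch (the column names mirror a Profiler export; the rows, durations, and the `p_splitpage` call are placeholders modeled on the trace earlier in this page):

```python
import csv
import io
from collections import defaultdict

# Hypothetical trace export; in practice, save the Profiler trace to a
# file or table and load it from there.
trace_csv = """TextData,ApplicationName,Duration,CPU
exec p_splitpage @sql = '...',.Net SqlClient Data Provider,481308,5578
SELECT TOP 5 L.title ...,.Net SqlClient Data Provider,30576,282
exec p_splitpage @sql = '...',.Net SqlClient Data Provider,94014,516
"""

# Total duration and CPU per statement text.
totals = defaultdict(lambda: [0, 0])
for row in csv.DictReader(io.StringIO(trace_csv)):
    totals[row["TextData"]][0] += int(row["Duration"])
    totals[row["TextData"]][1] += int(row["CPU"])

# Worst offenders first, by total duration.
for text, (dur, cpu) in sorted(totals.items(), key=lambda kv: -kv[1][0]):
    print(f"{dur:>10} {cpu:>8}  {text}")
```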

|||

Indexes were definitely a piece of the puzzle. The @join_unique_key parameter was also in play, so thanks for putting me on to that one. For anyone else who is using merge replication with SQL Mobile, there are two very useful articles:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql2k/html/sql_replmergepartitioned.asp

http://msdn2.microsoft.com/en-us/library/ms147840.aspx


Monday, March 12, 2012

high CPU time and sql 2000 getting slow performance

Hi,
I admin a server: dual 3.6 GHz Xeon with 4GB RAM, running Windows 2000
Server + SQL 2000 SP3a + DNS + IIS. The server hosts around 800 dynamic
sites, and SQL 2000 has 1GB of memory reserved for it.
The problem is: after a restart of the SQL service, the CPU time keeps
getting higher and higher (in 24 hours it is around 24), and memory also
grows until it stays at 1GB, which is the memory reserved for SQL.
Is it normal for the CPU time to keep getting higher, or should it refresh,
like dllhost.exe does on the web server? Is it normal that the memory stays
at 1GB, or should that also refresh?
The problem is that within some hours/days sites start to open slowly, and
when I restart the SQL service, sites open in a few seconds again, but after
a day or so they start to be slow again.
It's a very busy server with a few busy portals and many other dynamic
sites with few hits, but almost 100% of the sites (except the portals) read
from one of the portals' databases.
I'm not the programmer of the sites, and I don't write the SQL statements
for each site or design the databases.
Thanks in advance,
PV
It sounds to me like the server is overloaded.
Normally, in a live environment, you have a dedicated server just for
MSSQL.
It is also not advisable to change the default memory setting: for optimum
performance, SQL Server should be allowed to take as much RAM as it wants
for its own use, without having to compete for RAM with other applications.
You can turn on some performance monitors to see how busy your system is
and justify to the owner of the machine that it is time to upgrade the
hardware.
Here are some monitors you can put on (for memory only):
Memory Object: Pages/Sec
Memory Object: Available Bytes
SQLServer: Memory Manager: Total Server Memory
SQLServer: Memory Manager: Target Server Memory
Alternatively, you can turn on SQL Profiler if you are more interested in
what MSSQL is doing all day long. It has CPU, I/O, and memory events you
can trace.
Mel|||The first red flag is having SQL Server share resources with another
application on the same server; it is not optimal planning. Having said
that, it gets worse when the other application is IIS and it is a very busy
server, due to the busy portals.
SQL Server behaves somewhat like an only child: it wants all the CPU and all
possible memory. If you tell SQL Server that it can use from 0 to 1GB, it
will eventually use 1GB and keep it, just in case.
From what you describe, you have two demanding applications competing for
the hardware resources of the same machine. This could get worse if the IIS
load increases.
I would look into the shared load between IIS and the SQL Server, meaning
that if the SQL Server is only 25% of the load, DNS users 15%, and IIS takes
the 60% remaining, it may be wise to start planning about getting another
machine to move either the SQL Server or get a bigger IIS Server to handle an
increasing load.
Let me know if this helps..
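To make that 25/15/60 split concrete, here is a hypothetical Python sketch of the decision; the workload names and shares are made-up sample numbers, not measurements from this server:

```python
# Decide which workload to move off the box, given rough CPU shares
# (e.g. averaged from Task Manager or Process performance counters).

def biggest_consumer(cpu_shares):
    """cpu_shares: dict of workload name -> fraction of total CPU (0..1).
    Returns (name, share) of the heaviest workload."""
    name = max(cpu_shares, key=cpu_shares.get)
    return name, cpu_shares[name]

# Illustrative split: SQL 25%, DNS 15%, IIS 60%.
shares = {"sqlservr": 0.25, "dns": 0.15, "inetinfo (IIS)": 0.60}
heaviest, share = biggest_consumer(shares)
print(f"{heaviest} uses {share:.0%} of CPU")
# Whichever workload dominates is the candidate for its own machine.
```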
"PV" wrote:
> Hi,
> I admin a server, dual 3,6 xeon with 4GB ram.
> Windows 2000 server + SQL 2000 with SP3a + DNS + IIS
> The server have around 800 dynamic sites.
> SQL 2000 have 1Gb memory for it.
>
> the problem is:
> the CPU time after a SQL restart of service keeps getting higher and higher,
> i say in a 24h its aroung 24 also memory starts to grow and stays at 1Gb,
> wich is the reserved memory for SQL.
> Is it normal the CPU time keeping getting higher? or should refresh? like
> dllhost.exe in the webserver.
> Is it normal the memory stays at 1Gb or also should refresh?
> The problem is with some hours/days sites start to open slow and when i
> restart the SQL service, sites start to open in few seconds, but after some
> day/days start to be slow again.
> Its a very busy server with some portals, a few, others sites dynamic also
> but with few hits. But almost 100% of sites uses, except the portals, read
> for one of the portals database.
> Im not the programmer of the sites, and also i dont make the SQL statements
> need for each site, or design the databases.
> Tkx in advance,
> PV|||Hi,
Thanks for the directions.
SQL needs a dedicated server.
It's SQL 2000 that degrades server performance after some hours/days, and,
as you say, the competition for resources starts, mostly between SQL and IIS.
Despite all those sites, IIS doesn't take much processor time, except for the
portals, one especially, but not all the time like SQL.
Since it's SQL that degrades performance, I wanted to know what I asked:
is high CPU time normal, and the memory use? The answer is positive, I
presume.
Is there a way to "recycle" SQL's use of resources? There is a tool like that for
IIS 5.0, but it isn't needed on this server for the time being since IIS is OK.
Or is it prudent to force a restart of SQL once every day or two? I don't think
that is the right way to solve things.
PS:
We had one server with 1 GB of memory and a 2.0 GHz Xeon with 500-600 dynamic sites
and only one database in SQL (512 MB reserved and CPU utilization restrained to 50%),
a forum, and DNS. That server worked excellently, always online, with no problems for
months and months, while adding about 10 sites per month. It never reached 1 GB of
memory use.
But no portals... and no SQL for the portals.
Thanks again for the answer,
PV
"Edgardo Valdez, MCSD, MCDBA" wrote:
|||You will need to use SQL Profiler to capture the CPU and memory usage.
Since the monitor itself also uses some system resources, you normally
turn it on only for a period of time that you think represents a typical
business workload.
After you have the trace, you can either just open it in Profiler or, even
better, import it into a trace table (a built-in feature of Profiler:
Save as trace table). From the trace you can then see which processes
have the highest CPU/memory usage, using a SELECT query.
As you mentioned another server running just fine, it could well be that some
poorly written SQL scripts/stored procedures are running on this server.
With the trace, you can then identify the root of the problem (if
any), e.g. a stored procedure that keeps recompiling for no good reason, or
sessions not closed down properly.
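If you export the trace (or the trace table) to rows, the "highest CPU" lookup can be sketched as below; the column names TextData and CPU follow Profiler's default trace columns, but treat the whole thing as an assumption-laden illustration rather than a ready-made tool:

```python
# Aggregate Profiler trace rows to find the top CPU consumers.
# Column names (TextData, CPU) are assumed from Profiler's default
# trace columns; adjust to match your actual export.
from collections import defaultdict

def top_cpu(rows, n=3):
    """rows: iterable of dicts with 'TextData' and 'CPU' (milliseconds).
    Returns the n statements with the highest total CPU."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["TextData"]] += row["CPU"]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]

# Made-up trace rows for illustration:
trace = [
    {"TextData": "exec PortalSearch", "CPU": 1200},
    {"TextData": "select * from News", "CPU": 40},
    {"TextData": "exec PortalSearch", "CPU": 950},
]
print(top_cpu(trace, n=2))
```

Against an actual trace table, the equivalent would be a GROUP BY on TextData ordered by SUM(CPU) descending.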
Mel|||Using 1 GB of memory isn't unusual; SQL Server uses as much memory as it
can get hold of. It's also not unusual to have high CPU times.
Shut down all unnecessary services (including IIS, FTP, etc.), enable
the /3GB switch, and monitor your I/O and CPU usage. If you still run
into problems, run Profiler to determine which queries, SPs or other
operations are utilizing your CPU.
Above all, run SQL Server on its own server. Don't mix in IIS.
On Tue, 4 Apr 2006 06:42:05 -0700, PV <PV@.discussions.microsoft.com>
wrote: