Cluster resource SQL Server (InstanceName) failed to come offline


Post by JBaile » Sun, 02 Nov 2003 02:05:25


I have a 2-node cluster with 3 instances of SQL Server. When I attempt to
take one specific instance offline, I get the following error:

Cluster Administrator shows the SQL Server resource as Failed.

Event Viewer produces the following:

Source:ClusSvc
Event ID: 1117
Description: Cluster resource SQL Server (InstanceName) failed to come offline

The SQL service for this instance does stop, and the resource has no
problems coming back online.
This SQL instance is also non-production and hasn't had much traffic at all.

Any ideas why this is happening?

Thanks,
JBailey
 
 
 

Cluster resource SQL Server (InstanceName) failed to come offline

Post by BB » Sun, 02 Nov 2003 02:46:32

I have a similar situation on a cluster with 8 GB of RAM per server.
There are 2 instances, each configured to use 3.5 GB (/PAE, /3GB, AWE enabled).
There are ~350 databases on each instance.
When I take an instance offline, it gives me the same message.

I believe it's related to the Pending Timeout interval set in Cluster
Administrator. My understanding is that SQL Server simply takes longer to
shut down than the cluster is willing to wait. One time I detached most of
the databases and the problem disappeared. It returned when I re-attached
the databases. Unfortunately, it's a production environment, so I can't
experiment much with it. :-(
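
As a rough illustration of that reasoning, the sketch below compares a shutdown duration against the resource's Pending Timeout. The 180-second value is assumed to be the default shown in Cluster Administrator, and the shutdown durations are hypothetical numbers, not measurements from any system in this thread.

```python
# Sketch only: the timeout and the durations below are assumed values,
# not measurements from the clusters discussed in this thread.

ASSUMED_PENDING_TIMEOUT_S = 180  # assumed default; check the resource's Advanced tab


def offline_outcome(shutdown_seconds, pending_timeout_s=ASSUMED_PENDING_TIMEOUT_S):
    """Return 'Offline' if SQL Server stops within the timeout, else 'Failed'."""
    if shutdown_seconds <= pending_timeout_s:
        return "Offline"
    # The service may still stop cleanly later; the cluster has only stopped waiting.
    return "Failed"


if __name__ == "__main__":
    # Hypothetical shutdown times for an instance with few and with many databases.
    for label, seconds in [("few databases", 45), ("~350 databases", 240)]:
        print(f"{label}: {offline_outcome(seconds)}")
```

This would match the symptom reported above: the resource is marked Failed even though the service does stop and comes back online without trouble.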

B.




 
 
 

Cluster resource SQL Server (InstanceName) failed to come offline

Post by Geoff N. H » Sun, 02 Nov 2003 04:33:54

I have a similar problem with both a clustered and a non-clustered system.
Both have 32GB of RAM and most of it is allocated for SQL. The timeout
definitely needs extending for large memory systems.

--
Geoff N. Hiten
Microsoft SQL Server MVP
Senior Database Administrator
Careerbuilder.com




 
 
 

Cluster resource SQL Server (InstanceName) failed to come offline

Post by Allan Hir » Tue, 11 Nov 2003 10:09:49

It is not recommended to change the timeout without
consulting PSS first. Things should work without having
to modify it. If you're having problems, contact support.

The delays are due to the nature of AWE/PAE since you are
grabbing the memory and have to allocate or deallocate
it. It's not dynamic. So especially in a failover, you
need to wait for SQL Server to grab the memory.
 
 
 

Cluster resource SQL Server (InstanceName) failed to come offline

Post by Brad Bake » Tue, 10 Aug 2004 01:44:06

Hi -



Recently I added some additional memory to a server that's part of an
active/passive cluster (for a total of 8GB). After that, we started receiving
an error message when trying to move the cluster over to the passive server:



Source:ClusSvc
Event ID: 1117
Description: Cluster resource SQL Server (InstanceName) failed to come offline



I did some searching online but the only thing I could find was a thread on
google news from October 2003 (link included below). I am wondering if
anyone else has also experienced this particular problem and if so how they
may have solved it.



Google Groups Link:

http://www.yqcomputer.com/ %23i%24buE9nDHA.3256%40tk2msftngp13.phx.gbl&rnum=2&prev=/groups%3Fq%3D%2522event%2520id%25201117%2522%26hl%3Den%26lr%3D%26ie%3DUTF-8%26sa%3DN%26tab%3Dwg



Thanks!

Brad Baker
 
 
 

Cluster resource SQL Server (InstanceName) failed to come offline

Post by SQL M » Tue, 10 Aug 2004 02:10:26

Anything in the SQL Error log?

Did you re-configure SQL to use the additional RAM?

--
--------------------------------
Mike Epprecht, Microsoft SQL Server MVP
Johannesburg, South Africa
Mobile: +27-82-552-0268
IM: XXXX@XXXXX.COM

MVP Program: http://www.yqcomputer.com/

Blog: http://www.yqcomputer.com/



 
 
 

Cluster resource SQL Server (InstanceName) failed to come offline

Post by Brad Bake » Tue, 10 Aug 2004 03:42:20

Anything in the SQL Error log?

Did you re-configure SQL to use the additional RAM?

Brad
 
 
 

Cluster resource SQL Server (InstanceName) failed to come offline

Post by uttamk » Wed, 11 Aug 2004 13:43:13

The error message indicates that the SQL Server resource failed to come offline. Instead of moving the group, try taking only the SQL Server resource offline and see what results or errors you get. If it goes offline, I would take all resources in the SQL group offline, move the group to the other node, and bring the resources online one at a time, starting with those that do not depend on any other resource: the disks first, followed by the SQL IP address resource, the SQL network name, and then SQL Server. I would also uncheck the "restart" property for the resources while troubleshooting this issue (set it back to the default after the issue is resolved).
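
The bring-online order described above is a dependency ordering, which can be sketched as a topological sort. The resource names and the dependency map below are hypothetical placeholders; the real dependencies are whatever Cluster Administrator lists for each resource in your group.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical resource names; each maps to the set of resources it depends on.
DEPENDENCIES = {
    "Disk": set(),
    "SQL IP Address": set(),
    "SQL Network Name": {"SQL IP Address"},
    "SQL Server": {"Disk", "SQL Network Name"},
}


def online_order(dependencies):
    """Order resources so each one comes after everything it depends on."""
    return list(TopologicalSorter(dependencies).static_order())


def offline_order(dependencies):
    """Taking resources offline uses the reverse of the online order."""
    return list(reversed(online_order(dependencies)))


if __name__ == "__main__":
    print("Bring online:", " -> ".join(online_order(DEPENDENCIES)))
    print("Take offline:", " -> ".join(offline_order(DEPENDENCIES)))
```

Resources with no dependencies (here the disk and the IP address) may come out in either order relative to each other; SQL Server is always last online and first offline.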

NOTE: Since AWE is enabled, max server memory is set to 7GB, and the passive node has 4GB, when you move the SQL group to the passive node the SQL Server instance will acquire almost all of the available memory and leave ONLY up to 128MB of memory free. For more info, please refer to the SQL Server BOL topic "Managing AWE Memory".
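
The arithmetic behind that note can be sketched as follows. This is a simplification that treats the node's physical RAM as the memory available to SQL Server and ignores what the operating system and other processes already use; the function name is made up for illustration.

```python
MB_PER_GB = 1024
RESERVE_MB = 128  # memory left free when AWE is enabled, per the note above


def awe_memory_grab_mb(max_server_memory_mb, available_mb, reserve_mb=RESERVE_MB):
    """Estimate how much memory the instance would acquire at startup under AWE.

    Simplified model: the instance takes up to max server memory, but never
    more than the available memory minus the reserve.
    """
    return max(0, min(max_server_memory_mb, available_mb - reserve_mb))


if __name__ == "__main__":
    max_server_memory = 7 * MB_PER_GB
    # Passive node with 4GB: the instance is capped by the node, not by its setting.
    print(awe_memory_grab_mb(max_server_memory, 4 * MB_PER_GB))  # 3968
    # Node with 8GB: the 7GB max server memory setting is the limit.
    print(awe_memory_grab_mb(max_server_memory, 8 * MB_PER_GB))  # 7168
```

Under this model the 4GB node is left with only the 128MB reserve after failover, which is the situation the note warns about.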

Best Regards,

Uttam Parui
Microsoft Corporation

This posting is provided "AS IS" with no warranties, and confers no rights.
