Defunct Drives Issue

Post by Joel » Wed, 30 Mar 2005 08:03:16


We've been struggling with a problem for a while now. If anyone has had a
similar issue, I'd appreciate it if you could share it here, as it may lead
me to a solution. The cluster houses SQL and IIS (bad, I know, but it
shouldn't cause the problems we see).

We have the following Cluster hardware:

2 IBM x345 Servers
1 IBM ServeRAID 4MX (RAID 5)

Basically, whenever we have both machines connected to the cluster, at some
point (sometimes days, sometimes weeks) a failure will occur where the
clustered drives (Data and Quorum) become defunct. Bringing them back
online and restarting works fine, but it takes a while and always
carries some risk.

I've been working with IBM for months now to try to troubleshoot this but
nothing has helped to make this a "highly available" environment.
 
 
 

Defunct Drives Issue

Post by Geoff N. H » Wed, 30 Mar 2005 12:14:02

From your description it looks like this is a SCSI cluster. Can you give a
complete description of the SCSI device as well as the physical (RAID) and
logical (LUN) disk layouts for the cluster? I have an idea where your
problem might be, but I need more information to be sure.

Geoff N. Hiten
Microsoft SQL Server MVP
Senior Database Administrator

 
 
 

Defunct Drives Issue

Post by Joel » Thu, 31 Mar 2005 00:04:00

Thanks Geoff,

Here's what I think you are looking for:

1) Both servers are equipped with 2x18 GB mirrored disks (Array A). They are
connected to the internal channel of the IBM ServeRAID 4MX controller. These
are the logical C: and D: drives.
The external channel 1 of the RAID controller connects to the shared SCSI
storage.

2) Array B = 2x18 GB mirrored = Q: drive (Quorum), slots 13-14. (Physical
device is a shared IBM SCSI storage array.)

3) Array C = 5x18 GB RAID 5 = S: drive (Shared), slots 0-4. (Physical device
is a shared IBM SCSI storage array.)

Summarized:

LUNs= Q: and S: [Storage Array-Arrays B&C]

C: and D: internal Server [Array A]
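
For anyone following along, the layout above can be tallied with a small sketch. It assumes the standard capacity rules for the two RAID levels mentioned (a mirror keeps half the raw space, RAID 5 gives up one disk to parity); the function name and dictionary are made up for illustration and are not any ServeRAID tool.

```python
# Usable capacity for the arrays described in this post.
# Assumption: textbook rules only (mirror = half the raw space,
# RAID 5 = one disk's worth of parity); no controller overhead is modeled.

def usable_gb(disks, size_gb, level):
    """Return the approximate usable capacity of an array in GB."""
    if level == "RAID1":
        return (disks // 2) * size_gb      # each pair stores one copy
    if level == "RAID5":
        return (disks - 1) * size_gb       # one disk's worth of parity
    raise ValueError(f"unsupported level: {level}")

# Hypothetical labels mirroring the post: array -> (disks, size, level)
arrays = {
    "A (C:/D:, internal)": (2, 18, "RAID1"),
    "B (Q:, quorum)":      (2, 18, "RAID1"),
    "C (S:, shared data)": (5, 18, "RAID5"),
}

capacities = {name: usable_gb(*spec) for name, spec in arrays.items()}
for name, gb in capacities.items():
    print(f"Array {name}: {gb} GB usable")
```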
 
 
 

Defunct Drives Issue

Post by Geoff N. H » Fri, 01 Apr 2005 22:58:44

Looks like there is a problem sharing a controller between the clustered
resource and the local disk resources. Make the vendor show you where this
is a certified cluster solution. I don't think shared controllers are
supported.

Any way you slice it, you will get very poor performance from a SCSI storage
array in a clustered environment using RAID5 containers. Clustering
requires that the controllers operate in direct-write mode (no write cache)
so RAID5 is extremely slow.
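
To put rough numbers on the RAID5 point, here is a minimal sketch. It assumes the usual textbook write penalties for small random writes (RAID 5 does read data, read parity, write data, write parity; a mirror just writes both copies); these are not measured ServeRAID figures, and the names are hypothetical.

```python
# Physical disk I/Os generated by small random host writes when the
# controller has no write cache to absorb or coalesce them.
# Assumption: textbook penalties, not vendor-measured values.
WRITE_PENALTY = {
    "RAID1": 2,  # write the block to both mirror members
    "RAID5": 4,  # read old data + read old parity + write data + write parity
}

def physical_ios(host_writes, level):
    """Approximate number of disk I/Os needed for `host_writes` small writes."""
    return host_writes * WRITE_PENALTY[level]

raid1_ios = physical_ios(1000, "RAID1")
raid5_ios = physical_ios(1000, "RAID5")
print(f"1000 host writes -> RAID1: {raid1_ios} disk I/Os, RAID5: {raid5_ios} disk I/Os")
```

With a write cache the controller can acknowledge writes early and batch the parity updates; without one, every host write waits on all of those disk operations.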


Geoff N. Hiten
Microsoft SQL Server MVP
Senior Database Administrator
 
 
 

Defunct Drives Issue

Post by R P » Wed, 13 Apr 2005 23:39:25

This looks identical in concept to the HP prepackaged cluster setup that I am
using in an extremely similar fashion (internal mirrors are C: only). I have
not had any problems similar to that in 7 months of running.
