We have had a support case for this open for close to a month now, but no
real reply from Red Hat... Are there any kernel NFS gurus here? (If not,
what would be a better forum?) Could you please comment on this issue, and
especially the proposed (quite small) patch below. Thanks in advance...
My initial service request text:
Summary: unlink doesn't work when multiple clients access same files on nfs
This problem requires a relatively good understanding of kernel internals
(dcache and NFS), so please assign it to somebody who actually knows that.
We have a big problem with how the NFS client code in RHEL3 works. (No, this
isn't the same problem as in our two earlier requests, using UDP has cured
that. So read on.) The problem led to a huge unexpected increase in disk
for us after we moved an application from Solaris to RHEL3. We have now been
forced to move it partially back to Solaris servers.
The problem occurs when using a distributed application (ClearCase) where
applications on several client machines (RHEL or Solaris) talk to each other
using RPC and access the same files that are located on an NFS server. It
can easily be reproduced without having ClearCase, though.
The scenario is as follows: Let's say we have two NFS client machines, A and
B. A runs RHEL3, B runs any Unix. They both mount a file system from an NFS
server (doesn't matter what kind, we have reproduced the problem with EMC
Celerra, NetApp and Solaris NFS servers).
What happens is this:
0) Let F be a filename on the NFS file system. Initially this file does not
exist.
1) The application on the RHEL3 machine A does a stat() on F. The NFS client
in the kernel sends a LOOKUP request to the NFS server, which obviously
returns failure. The stat() fails with ENOENT. OK so far.
2) Immediately afterwards (a few seconds max), the application on machine B
creates the file F. No problems so far.
3) When B is done with F, a few seconds later the application on machine A
does an unlink() on F. Because of the negative dentry caching in the Linux
kernel, it doesn't even bother to send an NFS REMOVE request to the NFS
server, as (it thinks) it knows for sure the file doesn't exist. It lets the
unlink() fail with ENOENT. But the file definitely exists.
The application now thinks that the file F doesn't exist any longer, and loses
track of it. This means ever-increasing disk usage, as the above scenario
happens all the time when we run ClearCase builds for our (large) software.
After we moved our view servers from Solaris to RHEL3, the disk usage of our
ClearCase view storage doubled in a few months from 150 GB to close to 300 GB.
This was a mystery to us until we found that the view storage file system was
full of stranded files that weren't supposed to be left there, and that the
application didn't know of and thus couldn't clean out itself.
(In case you wonder why the applications work like that, well, that's how
ClearCase works. A is a view server, B is a build server where clearmake jobs
(compilations) are run, and the file F is a view-private file created and
removed during the clearmake run. The view server first checks if F exists,
then B actually creates it and writes to it, and when it isn't needed any
longer, the view server is supposed to remove it.)
It is very easy to demonstrate the problem without ClearCase: Just mount a
file system from a NFS server on two R