the utime system call is slow

Post by Matt » Wed, 02 Aug 2006 04:04:01


I have two Suns: an 8-processor 1280 and a dual-processor Netra 240.
We primarily run compiles and a lot of in-house development tools on
them. Performance on the 1280 has been degrading and when I was asked
to look into it, I saw that operations that use the utime system call
across NFS take significantly longer on the 1280 than on the 240 as
seen by truss:

1280:
syscall seconds calls
utime 466.91 3202

240:
syscall seconds calls
utime 25.37 3202
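
To put the two totals on a per-call footing, here is a minimal Python sketch. It assumes the first summary (466.91 s) is from the 1280 and the second (25.37 s) is from the 240, as the order of the sentence above suggests. The helper names `per_call_seconds` and `time_utime` are made up for illustration, and the timing loop is only a rough way to reproduce the measurement outside truss, not a replacement for it.

```python
import os
import tempfile
import time


def per_call_seconds(total_seconds, calls):
    """Average time per call from one truss -c style summary line."""
    return total_seconds / calls


def time_utime(paths):
    """Call os.utime() once per path; return (calls, total wall seconds)."""
    start = time.perf_counter()
    for path in paths:
        os.utime(path, None)  # None sets atime/mtime to the current time
    return len(paths), time.perf_counter() - start


if __name__ == "__main__":
    # Figures copied from the truss summaries above.
    slow = per_call_seconds(466.91, 3202)  # assumed to be the 1280
    fast = per_call_seconds(25.37, 3202)   # assumed to be the 240
    print(f"1280: {slow * 1000:.1f} ms per utime call")
    print(f"240:  {fast * 1000:.1f} ms per utime call")
    print(f"ratio: {slow / fast:.1f}x")

    # Demo on throwaway local files; point this at files on the NFS
    # mount in question to get numbers comparable to the truss output.
    with tempfile.TemporaryDirectory() as tmp:
        files = []
        for i in range(10):
            name = os.path.join(tmp, f"f{i}")
            open(name, "w").close()
            files.append(name)
        calls, seconds = time_utime(files)
        print(f"local: {calls} calls in {seconds:.6f} s")
```

That works out to roughly 146 ms per call on the slow machine against about 8 ms on the fast one, so each individual call is slower by a factor of about 18 rather than there being more calls.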

I have rerun my tests several times over the course of several days and
continue to see the same behavior. I dusted off my copy of System
Performance Tuning and started going down the list to see if there were
any significant bottlenecks:

Network:
netstat -i shows no errors or collisions for the client or NFS server I
am updating timestamps on. Did I mention that the slow server was
running with IP load balancing on two gigabit network interfaces and
the fast one was running at 100M? It isn't network IO.

Memory:
with 20G of RAM, there isn't any significant paging or swapping

CPU:
a lot of time is spent in system CPU according to the output of sar, often
two to three times as much as in user time. Also, Virtual Adrian from the
SE Toolkit frequently reports mutex contention on the 1280

NFS:
I'd love to blame it, but why do the same operations perform speedily
on the other server? (And on a few others too, but let's leave them out
to keep things simple.) nfsstat -c does show that a lot of calls are for
getattr, but given that this machine's job is to iterate over large NFS
filesystems repeatedly to run compiles, a high getattr count seems
reasonable. I did play with the file attribute caching mount options, but
they haven't helped, and I think getattr may be a red herring.
Retransmissions were less than 0.01%.
The DNLC hit ratio is 99% according to vmstat -s.


So, what tools are out there and available to go deeper into the system
than truss?

tia,
Matt
 
 
 

Post by Rich Tee » Wed, 02 Aug 2006 05:40:23


DTrace.

--
Rich Teer, SCNA, SCSA, OpenSolaris CAB member

President,
Rite Online Inc.

Voice: +1 (250) 979-1638
URL: http://www.yqcomputer.com/

 
 
 

Post by Matt » Wed, 02 Aug 2006 06:32:32


Right, a critical piece I forgot to mention: Solaris 2.8.
 
 
 

Post by Michael Vi » Wed, 02 Aug 2006 07:24:38

Time for an upgrade.

--
DeeDee, don't press that button! DeeDee! NO! Dee...
 
 
 

Post by tunl » Wed, 02 Aug 2006 15:07:36


If mpstat(1M) is reporting high values of "smtx" (spins on mutexes,
i.e. a lock that was not acquired on the first try), you probably have
software running on the machine that is not multithreaded enough for
an 8 CPU system.
The impact may not be noticeable on your 2 CPU system.
Also, the CPUs in the 240 have a higher clock speed and fly through
the critical code sections faster.

smtx values above 200 were considered "high" for the US-II.
I once had a couple of servers with smtx values above 1000,
and they were slow...

//Lars
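
The rule of thumb above can be checked mechanically. The following is a minimal Python sketch that reads one mpstat report and lists the CPUs whose smtx value exceeds 200. The function name `cpus_over_threshold` is hypothetical and the sample figures are invented for illustration; they are not measurements from the 1280. The smtx column is located by its header name rather than by position.

```python
SMTX_THRESHOLD = 200  # the "high" mark quoted above for the US-II

# Invented sample in the usual mpstat column layout; not real data.
SAMPLE_REPORT = """\
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
  0   12   0  310  420  210 900   40   60 1150   0  5200  20  55  5  20
  1    9   0  280  105    2 850   35   58  180   0  4800  25  40  5  30
"""


def cpus_over_threshold(report, threshold=SMTX_THRESHOLD):
    """Return {cpu_id: smtx} for every CPU whose smtx exceeds threshold."""
    rows = [line.split() for line in report.splitlines() if line.strip()]
    header, data = rows[0], rows[1:]
    cpu_col = header.index("CPU")
    smtx_col = header.index("smtx")
    flagged = {}
    for row in data:
        smtx = int(row[smtx_col])
        if smtx > threshold:
            flagged[int(row[cpu_col])] = smtx
    return flagged


if __name__ == "__main__":
    for cpu, smtx in sorted(cpus_over_threshold(SAMPLE_REPORT).items()):
        print(f"CPU {cpu}: smtx {smtx} exceeds {SMTX_THRESHOLD}")
```

The threshold is a parameter because the figure of 200 was quoted for the US-II; whether it applies to the CPUs in the 1280 is an open question.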
 
 
 

Post by Matt » Wed, 02 Aug 2006 22:50:09


I would love to, but current development software requirements keep my
company on 2.8.
 
 
 

Post by Matt » Wed, 02 Aug 2006 22:51:43


Thanks Lars. Over what interval should I check?
mpstat 1
mpstat 5
mpstat 10