TCP connection hang problem (resumes upon new TCP request)

Post by changxu » Mon, 19 Dec 2005 07:06:15


I'm running a simulation with one client machine and four server
machines (all on the same LAN, running Fedora Core 2 with kernel
2.6.5-1.358smp). The client sends about 1.2 million requests (432 bytes
each) over TCP connections to the servers, and the servers read them.

In my first simulation, the client randomly distributes each request to
one of the four servers, and it works fine. However, in my second
simulation, where the client sends all the requests to a central
distributor (running on one of the four servers) and the central
distributor then forwards each request to one of the four servers, the
TCP connection between the client and the central distributor seems to
hang after some time (from a few minutes to half an hour). The client
stops writing requests to the socket and the central distributor stops
reading from the socket.
But if I open any other TCP connection (e.g., telnet xx.xx.xx.xx 80) to
the central distributor machine, the program resumes from where it hung
(the client starts writing to the socket and the central distributor
starts reading from it again), although it hangs again after a while
unless I open another TCP connection to that machine.
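
To pin down when the hang starts, the reading side could use a receive timeout instead of blocking forever. Below is a minimal Python sketch of that idea, not the actual distributor code: the names `recv_exact` and `read_requests` and the 5-second stall threshold are assumptions made for illustration, and only the 432-byte request size comes from the description above.

```python
import socket

REQUEST_SIZE = 432  # request size stated in the post


def recv_exact(sock, size):
    """Read exactly `size` bytes; return None if the peer closed the
    connection before `size` bytes arrived."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = sock.recv(remaining)  # raises socket.timeout if nothing arrives in time
        if not chunk:
            return None
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_requests(sock, stall_seconds=5.0):
    """Yield fixed-size requests; report a stall instead of blocking forever.

    stall_seconds is a hypothetical threshold, chosen only for illustration.
    """
    sock.settimeout(stall_seconds)
    count = 0
    while True:
        try:
            request = recv_exact(sock, REQUEST_SIZE)
        except socket.timeout:
            # No data for stall_seconds: log how far we got, then stop.
            print(f"stalled after {count} requests")
            return
        if request is None:
            return  # peer closed the connection
        count += 1
        yield request
```

Logging the request count at the moment of the stall on the reader, and the equivalent on the writer, would show whether both sides stop at the same point in the stream.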


Could anyone provide a clue or hint for solving this problem? Thanks.
BTW, I do observe about 12 TCP connections in the TIME_WAIT state on
the central distributor server. They come from another thread of the
server process, which periodically opens a new socket, sends a
performance report through it to a remote machine, and then closes the
socket immediately. I guess this should not be the cause of the above
problem, but I'm not quite sure.
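
TIME_WAIT is the normal state for whichever side closes a TCP connection first, so a thread that opens, uses, and immediately closes sockets will leave such entries behind for a while. To track how many sockets are in each state over time, one option on Linux is to summarise `/proc/net/tcp`. The Python sketch below assumes the usual layout of that file (state code in the fourth whitespace-separated column); the function name `count_tcp_states` is hypothetical and the state table is deliberately partial.

```python
from collections import Counter

# Subset of the hex state codes used in /proc/net/tcp on Linux.
TCP_STATES = {
    "01": "ESTABLISHED",
    "06": "TIME_WAIT",
    "08": "CLOSE_WAIT",
    "0A": "LISTEN",
}


def count_tcp_states(proc_net_tcp_text):
    """Count sockets per TCP state from the text of /proc/net/tcp."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        state_code = fields[3]
        # Unknown codes are counted under their raw hex value.
        counts[TCP_STATES.get(state_code, state_code)] += 1
    return counts
```

On the distributor machine this could be fed with `open("/proc/net/tcp").read()` at intervals, before and during a hang, to see whether the TIME_WAIT count changes when the hang begins.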
 
 
 

Post by Maxim Yego » Mon, 19 Dec 2005 22:46:07


Use tcpdump, ethereal, etc. to see what's going on with your connection.
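
As a rough illustration of that suggestion, the Python sketch below only assembles a tcpdump command line for capturing the client-to-distributor connection; it does not run it. The interface name, peer address, port, and capture file name are placeholders, not values from this thread, and `build_tcpdump_command` is a hypothetical helper.

```python
import shlex


def build_tcpdump_command(interface, peer_host, port, capture_file):
    """Return an argv list for capturing one TCP conversation with tcpdump."""
    return [
        "tcpdump",
        "-i", interface,      # interface to listen on
        "-nn",                # do not resolve host names or port numbers
        "-s", "0",            # capture whole packets rather than a truncated snapshot
        "-w", capture_file,   # write raw packets to a file for later analysis
        # Capture filter: traffic to or from the peer on the given TCP port.
        "host", peer_host, "and", "tcp", "port", str(port),
    ]


# Placeholder values for illustration only.
cmd = build_tcpdump_command("eth0", "192.0.2.10", 5000, "hang.pcap")
print(shlex.join(cmd))
```

The printed command would normally be run as root on the distributor machine, and the resulting capture file opened in ethereal to see which side stops sending or acknowledging when the hang begins.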