[ace-users] Detection of one silent peer blocks all other peer processes from detecting.

Post by omer » Sat, 16 Aug 2003 01:57:02


I'm following up on my problem from the previous week, and I also want
to make sure I thank you. I'm posting it on the Google group so info is

Problem description is in the post following, at the end.

You were right: at least part of the reason the handle_input() calls of
the peers [of the dead peer] were serialized was that handles were
still open when the processes were spawned.
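
A rough illustration of why a stray open handle can delay handle_input(): a reader on a pipe only sees end-of-file once every copy of the write end is closed. The sketch below is plain POSIX C++, not ACE code. dup() stands in for the copy a spawned worker would inherit, and the names sees_eof and run_demo are made up for this example.

```cpp
#include <poll.h>
#include <unistd.h>
#include <cassert>

// True if a read on read_fd would report end-of-file right now.
// Assumes no data is pending on the pipe (the demo never writes any).
bool sees_eof(int read_fd) {
    assert(read_fd >= 0);
    pollfd p{};
    p.fd = read_fd;
    p.events = POLLIN;
    if (poll(&p, 1, 0) <= 0) return false;  // not readable, no hang-up yet
    char c;
    return read(read_fd, &c, 1) == 0;       // 0 bytes: all write ends are closed
}

struct DemoResult {
    bool eof_while_copy_open;    // observed while a duplicate write end exists
    bool eof_after_copy_closed;  // observed once the duplicate is closed too
};

// Returns {false, false} if the pipe cannot be created.
DemoResult run_demo() {
    int fds[2];
    if (pipe(fds) != 0) return {false, false};
    int leaked = dup(fds[1]);   // stands in for a handle inherited by a sibling
    close(fds[1]);              // the "server" end goes away
    bool before = sees_eof(fds[0]);
    close(leaked);              // the sibling holding the copy exits
    bool after = sees_eof(fds[0]);
    close(fds[0]);
    return {before, after};
}
```

Calling run_demo() should report no end-of-file while the duplicate is open and end-of-file once it is closed, which matches each worker being notified only after the previous one terminated.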

On NT the problem was solved by setting the handle_inheritance flag
to false in the Process_Options that are used to spawn the processes.

On Unix this flag has no influence as far as I can see, so we're trying
to close the open handles immediately at the start of the new process.
With ACE this is a bit tricky, as I only gain control after the exec
system call has taken place, so I don't have access to the old pointers
to the handles.
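
One possible way around losing the handles after exec is to mark them in the parent, before spawning, so that the kernel closes them in the new program. A minimal POSIX sketch, assuming the server can get at the raw descriptors behind its ACE objects; set_close_on_exec and is_close_on_exec are hypothetical helper names, not ACE calls.

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <cassert>

// Marks fd close-on-exec while keeping its other descriptor flags.
// Returns true on success.
bool set_close_on_exec(int fd) {
    assert(fd >= 0);
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1) return false;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) != -1;
}

// Reports whether fd is currently marked close-on-exec.
bool is_close_on_exec(int fd) {
    int flags = fcntl(fd, F_GETFD);
    return flags != -1 && (flags & FD_CLOEXEC) != 0;
}
```

The flag only takes effect at exec, so a child that is fork()ed without an exec still holds its copies and would have to close them explicitly.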

Thanx again, Omer Shibolet.

-----Original Message-----
From: Douglas C. Schmidt [mailto: XXXX@XXXXX.COM ]
Sent: 28 2003 16:35
To: Omer Shibolet; XXXX@XXXXX.COM
Subject: Re: ACE: Detection of one silent peer blocks all other peer
processes from detecting.

Hi Omer,
>> Let me first say how nice it is to use ACE, it's one hell of a
>> software :), very nice design too.

Thanks very much!
>> I'll express myself more freely :

Great, I appreciate that.
>> I have a server process that spawns several worker processes and
>> connects with them. The server accepts requests from clients, and
>> routes these requests (ACE_SOCK_STREAMs) to the worker processes.

>> I am trying to handle the case when the server dies.

>> So all the worker processes listen (reactor) on the pipe that connects
>> them to the server (each worker process is connected to the server
>> using an SPIPE and a Pipe). Each process has its own line and its own
>> reactor of course.

>> Now the server process aborts - it doesn't appear in the system
>> (NT or Unix) anymore.

>> I see the first worker process receives its "handle_input" while the
>> other processes are still "stuck". Only after the first worker
>> process terminates (as a result of a recv() <= 0) does the next worker
>> process receive its handle_input() event that signals the death of
>> the server. The 3rd worker process will only get to the
>> handle_input() after the second has finished.

>> It is strange, as if some synchronous process is taking place,
>> notifying all the processes, each in its turn, of the server's
>> death.

That sounds like an OS-level IPC issue, not an ACE-level issue!!
>> BTW, maybe this is related, but I also noticed that if one of the
>> worker processes is up, the server cannot open its acceptor; an error
>> "address is in use" occurs, as if the worker process occupies it.

It sounds to me like you're not closing your TCP ports properly when
you fork() the children.
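
A small sketch of how a leftover descriptor can produce "address is in use": as long as any copy of the listening socket is open, the port stays taken. This is plain POSIX C++ on the loopback interface, not ACE code. listen_on_loopback and bound_port are hypothetical helpers, and in a test a dup() of the socket can stand in for the copy a fork()ed child inherits.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>

// Binds and listens on 127.0.0.1:port (port 0 lets the kernel choose).
// Returns the socket, or -1 with errno describing the failure.
int listen_on_loopback(unsigned short port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof addr) != 0 ||
        listen(fd, 5) != 0) {
        int saved = errno;  // close() may overwrite errno
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

// Returns the local port fd is bound to, or 0 on error.
unsigned short bound_port(int fd) {
    assert(fd >= 0);
    sockaddr_in addr{};
    socklen_t len = sizeof addr;
    if (getsockname(fd, reinterpret_cast<sockaddr*>(&addr), &len) != 0) return 0;
    return ntohs(addr.sin_port);
}
```

If the server closes its own socket while a duplicate is still open elsewhere, a second listen_on_loopback() on the same port is expected to fail with EADDRINUSE, and to succeed once the duplicate is closed. Closing the acceptor's handle in each child right after fork() avoids that situation.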

Take care,