Gnus stuck in select/gettimeofday loop


Post by Neil Wood » Sun, 26 Feb 2006 16:38:10


I'm running the latest GNU Emacs snapshot from Debian, and the latest
CVS snapshot of Gnus (checked out today).

Occasionally Emacs will lock up completely whilst in Gnus; at this point
it is unresponsive to C-g. Running strace on the process gives the
following output:

% strace -Tvp 24093
[...]
select(8, [3 5 6 7], NULL, NULL, {0, 30}) = 0 (Timeout) <0.000640>
gettimeofday({1140846360, 622910}, NULL) = 0 <0.000016>
gettimeofday({1140846360, 623000}, NULL) = 0 <0.000017>
gettimeofday({1140846360, 623069}, NULL) = 0 <0.000016>
gettimeofday({1140846360, 623137}, NULL) = 0 <0.000015>
select(8, [3 5 6 7], NULL, NULL, {0, 31}) = 0 (Timeout) <0.000641>
gettimeofday({1140846360, 623910}, NULL) = 0 <0.000016>
gettimeofday({1140846360, 624019}, NULL) = 0 <0.000019>
gettimeofday({1140846360, 624089}, NULL) = 0 <0.000015>
gettimeofday({1140846360, 624157}, NULL) = 0 <0.000015>
[many repetitions of this...]

and a syscall summary from running strace for a few seconds:

% strace -c -p 24093
Process 24093 attached - interrupt to quit
Process 24093 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 98.76    7.385984         825      8952         1 select
  1.24    0.092740           3     35839           gettimeofday
  0.00    0.000035          35         1           write
  0.00    0.000012           6         2           read
  0.00    0.000005           2         3           ioctl
  0.00    0.000004           2         2         2 sigreturn
  0.00    0.000004           1         4           poll
------ ----------- ----------- --------- --------- ----------------
100.00    7.478784                 44803         3 total
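
As a rough cross-check of the summary, the short Python sketch below recomputes two figures from the counts in the table: the number of gettimeofday calls per select call, and the mean time spent in each select call. The variable names are illustrative only; the numbers are copied from the strace output.

```python
# Figures copied from the `strace -c` summary.
select_calls = 8952
gettimeofday_calls = 35839
select_seconds = 7.385984

# gettimeofday calls per select call; the -T trace shows four per iteration.
ratio = gettimeofday_calls / select_calls

# Mean time spent inside each select call, in microseconds.
mean_select_usec = select_seconds / select_calls * 1_000_000

print(f"gettimeofday per select: {ratio:.2f}")
print(f"mean select duration: {mean_select_usec:.0f} usec")
```

The ratio comes out very close to four, which is consistent with the repeating pattern of one select followed by four gettimeofday calls in the trace.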

There is a single network connection open from Emacs to sea.gmane.org,
which doesn't want to time out - I suspect this may be related.
Disconnecting from the net and waiting for 20 minutes had no effect.

Curiously, even though Emacs was unresponsive to the keyboard and mouse,
I *did* manage to exit Gnus by typing:

% gnuclient -q -f gnus-group-exit

At this point I saved buffers and exited.

Any help on this, and advice on how best I can debug this would be
greatly appreciated.

Thanks in advance.

As I mentioned, this is an occasional occurrence (there is no way to
reproduce this on demand) - the above is the output of the last
occurrence a few minutes ago.

Every time this has happened there has been an open network connection
which won't die, but stays in ESTABLISHED state, checked using netstat.

--
Neil.
Don't make a big deal out of everything; just deal with everything.
 
 
 

Gnus stuck in select/gettimeofday loop

Post by Karl Klein » Sun, 26 Feb 2006 23:06:31

Neil Woods <cnw+ XXXX@XXXXX.COM > writes:
...

Just to be clear, the fact that Emacs becomes unresponsive to C-g is a
sure indicator that you have a bug in Emacs itself, not in Gnus.

It might be more helpful if you'd let folks know the exact version of
GNU Emacs that you've got (M-x emacs-version RET); perhaps there is a
known problem with that version, and possibly you merely need to find
an update more recent than Debian's snapshot.

 
 
 

Gnus stuck in select/gettimeofday loop

Post by Neil Wood » Mon, 27 Feb 2006 07:18:57

>>>>> Karl Kleinpaste < XXXX@XXXXX.COM > writes:


Okay.


M-x emacs-version RET

GNU Emacs 22.0.50.1 (i486-pc-linux-gnu, GTK+ Version 2.8.10) of
2006-02-22 on drazi, modified by Debian

There was nothing in the Debian bug archives resembling this problem,
and searching the web provided some (very) marginally relevant hits,
though they were some years old.

This problem has been intermittent through the last several versions of
emacs-snapshot from Debian. It's conceivable that it could be a library
or kernel problem, but I've had no problems otherwise so this seems
unlikely. I'm running the last stable version of the Debian kernel:

% uname -rv
2.6.12-1-k7 #1 Tue Sep 27 13:22:07 JST 2005

--
Neil.
Don't go to bed with no price on your head.
-- Baretta
 
 
 

Gnus stuck in select/gettimeofday loop

Post by Reiner Ste » Mon, 27 Feb 2006 07:47:13


[...]

| User-Agent: Gnus/5.110004 (No Gnus v0.4) Emacs/22.0.50 (gnu/linux)

`emacs-version' will additionally show the compile date and a little
information about the system. `M-x report-emacs-bug RET' will provide
more information and send the report to the right mailing list.
Maybe running Emacs within gdb (many Emacs developers do this all the
time) would provide more information when the lock up occurs. See
etc/DEBUG for more information.


Bye, Reiner.
--
,,,
(o o)
---ooO-(_)-Ooo--- | PGP key available | http://www.yqcomputer.com/
 
 
 

Gnus stuck in select/gettimeofday loop

Post by no-spa » Mon, 27 Feb 2006 09:06:36

Neil Woods <cnw+ XXXX@XXXXX.COM > writes:


Actually, I have seen this too occasionally with CVS Emacs.

In all cases it seems like it is stuck (or just dead slow) reading the
active file from the news server.

IIRC, C-g usually manages to interrupt this, but not instantly...

--
Kim F. Storm http://www.yqcomputer.com/
 
 
 

Gnus stuck in select/gettimeofday loop

Post by Neil Wood » Mon, 27 Feb 2006 10:20:46

>>>>> Reiner Steib <reinersteib+ XXXX@XXXXX.COM > writes:



Thanks Reiner. In fact that's exactly what I'm doing right now (running
under gdb), since Emacs crashed with a segmentation fault a little while
ago. Strange. I had two frames open, one with Gnus and the other with
the *scratch* buffer. On focusing the other frame Emacs segfaulted
(I'm using FVWM2 as my window manager, in case that's relevant).

Anyway, I shall submit a bug report if/when it crashes again, using
`M-x report-emacs-bug RET'.

There's a new snapshot available as of this evening, which I'm currently
downloading and compiling with symbols, so we'll see if that makes a
difference.

Thanks.
--
Neil.
We are not loved by our friends for what we are; rather, we are loved in
spite of what we are.
-- Victor Hugo
 
 
 

Gnus stuck in select/gettimeofday loop

Post by Romain Fra » Mon, 13 Mar 2006 01:23:20


XXXX@XXXXX.COM (Kim F. Storm) writes:


Do you use article prefetching?[1]

I had never managed to reproduce this bug after seeing it reported a few
times, and after enabling prefetching a few days ago suddenly I keep
running into it... And I can confirm that C-g has absolutely no effect
when it happens.

The backtrace looks like the following:

#0  0xffffe410 in __kernel_vsyscall ()
#1  0xa7756d6d in select () from /lib/tls/i686/cmov/libc.so.6
#2  0x081ee002 in wait_reading_process_output (time_limit=0, microsecs=100,
    read_kbd=0, do_display=0, wait_for_cell=137976009, wait_proc=0x92b2190,
    just_wait_proc=0) at process.c:4496
#3  0x081eceef in Faccept_process_output (process=153821588, timeout=0,
    timeout_msecs=800, just_this_one=137976009) at process.c:3895
#4  0x081ac24c in Ffuncall (nargs=4, args=0xafa6f860) at eval.c:2889
#5  0x081e4608 in Fbyte_code (bytestr=151252459, vector=151256380, maxdepth=56)
    at bytecode.c:694
[...]

(I'm not sure if there is really a bug, or if it's just a manifestation
of the flakiness of my nntp server.)


Do you think it could be related to adaptive read buffering?

--
Romain Francoise < XXXX@XXXXX.COM > | The sea! the sea! the open
it's a miracle -- http://www.yqcomputer.com/ | sea! The blue, the fresh, the
| ever free! --Bryan W. Procter

Footnotes:
[1] See the `gnus-asynchronous' variable.
 
 
 

Gnus stuck in select/gettimeofday loop

Post by no-spa » Tue, 14 Mar 2006 18:52:14

Romain Francoise < XXXX@XXXXX.COM > writes:


No.


I don't know...

In the original report, select is called with a timeout value of 30
and 31 usecs, which is quite odd.

select(8, [3 5 6 7], NULL, NULL, {0, 30}) = 0 (Timeout) <0.000640>
select(8, [3 5 6 7], NULL, NULL, {0, 31}) = 0 (Timeout) <0.000641>

Adaptive read buffering uses usec values of 10000 - 70000 and the call
to accept-process-output in nnheader-accept-process-output has a timeout
of 100 msec corresponding to 100000 usecs.

So why is select called with such short timeout values?
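
To make the mismatch concrete, here is a small Python sketch that compares the timeouts seen in the strace output with the two sources named above. It is only an illustration of the arithmetic: the constant and function names are invented for this example, and the ranges are the ones quoted in this post, not values read from the Emacs source.

```python
USEC_PER_MSEC = 1000

# Figures quoted in the post above.
ADAPTIVE_MIN_USEC = 10_000
ADAPTIVE_MAX_USEC = 70_000
NNHEADER_TIMEOUT_USEC = 100 * USEC_PER_MSEC  # 100 msec

# Timeouts seen in the original strace output.
observed_usec = [30, 31]


def classify(timeout_usec):
    """Name the source a select timeout could plausibly come from."""
    if ADAPTIVE_MIN_USEC <= timeout_usec <= ADAPTIVE_MAX_USEC:
        return "adaptive read buffering"
    if timeout_usec == NNHEADER_TIMEOUT_USEC:
        return "nnheader-accept-process-output"
    return "unexplained"


for t in observed_usec:
    shortfall = NNHEADER_TIMEOUT_USEC / t
    print(f"{t} usec: {classify(t)}, about {shortfall:.0f}x shorter than 100 msec")
```

Neither observed value falls in either range, so the short timeouts cannot be explained by these two sources as described.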

I don't have time right now to look into this, so if someone else
would take a look, I would appreciate it.

Anyone want to analyze this further?

--
Kim F. Storm http://www.yqcomputer.com/
 
 
 

Gnus stuck in select/gettimeofday loop

Post by no-spa » Fri, 24 Mar 2006 07:52:30


Neil Woods <cnw+ XXXX@XXXXX.COM > writes:


I have installed a fix to accept-process-output which `correctly' interprets
the "msec" arg as milliseconds rather than microseconds.

So instead of waiting 100 microseconds, Gnus now waits 100 milliseconds,
which I hope will fix this "tight loop". So please tell me if you see
this again after upgrading from CVS.
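
The effect of the unit mix-up can be sketched in a few lines of Python. This is a simplified model of the unit handling only, not the actual Emacs C code; the function names and the `buggy` flag are hypothetical, and the wakeup figure is an upper bound that assumes every wait runs to its timeout.

```python
USEC_PER_MSEC = 1000
USEC_PER_SEC = 1_000_000


def timeout_usec(msec_arg, buggy):
    """Model of the unit handling only; not the actual Emacs C code."""
    # Buggy reading: the "msec" argument is used directly as microseconds.
    # Fixed reading: it is scaled to microseconds first.
    return msec_arg if buggy else msec_arg * USEC_PER_MSEC


def max_wakeups_per_second(usec):
    """Upper bound on timeouts per second if every wait expires."""
    return USEC_PER_SEC // usec


for buggy in (True, False):
    usec = timeout_usec(100, buggy)
    label = "before fix" if buggy else "after fix"
    print(f"{label}: {usec} usec per wait, "
          f"up to {max_wakeups_per_second(usec)} wakeups/s")
```

Under this model a caller that passes 100 wakes up as much as a thousand times more often with the microsecond reading than with the millisecond one, which is the kind of busy polling the strace output shows.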


--
Kim F. Storm http://www.yqcomputer.com/