Hi Valentin,
In principle, yes. The problem is that the program causing the trouble (pppoe) does not run standalone;
it is started as a child process of another process (pppd). So when pppoe exits, its return code
is not logged anywhere, and I have to write a log entry myself right after this read. I changed the code to:
----------- SNIP SNIP -------
void
asyncReadFromPPP(PPPoEConnection *conn, PPPoEPacket *packet)
{
    static unsigned char buf[2 * READ_CHUNK];
    unsigned char *ptr = buf;
    unsigned char c;
    int r;
    static FILE *debugFile;

    debugFile = conn->debugFile;
#define READ_CHUNK2 2048
    r = read(0, buf, READ_CHUNK2);
    if (debugFile && r > READ_CHUNK2) {
        fprintf(debugFile, "suspicious read: max: %d read return value: %d\n", READ_CHUNK2, r);
        fflush(debugFile);
    }
    if (r < 0) {
        fatalSys("read (asyncReadFromPPP)");
    }
----------- SNIP SNIP --------------
The buffer is now static, so it no longer lives on the stack. It is 8k, while only 2k is requested in the read call.
Logging is done before any further actions.
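The same check can be pulled out into a small stand-alone helper. What follows is only a minimal sketch, not code from pppoe: the names checked_read and checked_read_selftest are made up for illustration, and the self-test uses a pipe instead of the pty that pppd sets up. On a correctly behaving kernel, read() must never report more bytes than were requested.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of pppoe: read() plus a check that the
 * kernel never reports more bytes than were requested. */
ssize_t checked_read(int fd, void *buf, size_t count, FILE *log)
{
    ssize_t r = read(fd, buf, count);

    if (r > 0 && (size_t)r > count && log) {
        fprintf(log, "suspicious read: max: %zu read return value: %zd\n",
                count, r);
        fflush(log);
    }
    return r;
}

/* Self-test on a pipe: queue 3000 bytes, request 2048, and return
 * whatever read() reports. */
ssize_t checked_read_selftest(void)
{
    unsigned char out[3000];
    unsigned char in[2 * 2048];   /* oversized on purpose, like buf above */
    int fds[2];
    int rc = pipe(fds);
    ssize_t w, r;

    assert(rc == 0);
    memset(out, 'x', sizeof out);
    w = write(fds[1], out, sizeof out);
    assert(w == (ssize_t)sizeof out);

    r = checked_read(fds[0], in, 2048, stderr);
    close(fds[0]);
    close(fds[1]);
    return r;
}
```

Calling checked_read_selftest() should never give a value above 2048. Keeping the receive buffer larger than the requested count means that a misbehaving read() gets logged instead of silently overrunning the buffer.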
In my log, I saw the following entries:
suspicious read: max: 2048 read return value: 2629
suspicious read: max: 2048 read return value: 3271
suspicious read: max: 2048 read return value: 4095
suspicious read: max: 2048 read return value: 4095
suspicious read: max: 2048 read return value: 4095
The program itself is running fine, so the problem does not seem to be a side effect of something else.
I took a look at the code of the parent process, pppd. This program starts pppoe and communicates
with it via a pty.
In the Linux kernel, I added the following lines to fs/read_write.c (in the routine sys_read(unsigned int fd,char * buf,int count)),
just after the "error = file->f_op->read(inode,file,buf,count);" call:
---------- SNIP SNIP -------
if (error > count)
{
    printk("read_write.c: read(inode,file,buf,%d) returned %d\n", count, error);
}
---------- SNIP SNIP -------
I recompiled the kernel, rebooted the box, and now I get a log entry for this issue both in
/var/log/messages:
Sep 22 19:19:28 ratz kernel: read_write.c: read(inode,file,buf,2048) returned 3002
as well as in my pppoe log:
suspicious read: max: 2048 read return value: 3002
Since these two entries correspond, this is a bug in the kernel (2.0.38).
(Or, more precisely, possibly in the pty driver. That still has to be tracked down.)
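To help narrow this down to the pty layer, a user-space probe that takes pppd and pppoe out of the picture might be useful. The following is only a sketch under several assumptions: pty_read_probe is a made-up name, and it relies on posix_openpt(), grantpt(), unlockpt() and ptsname(), which a kernel and libc as old as the 2.0.x series may not provide (there the master/slave pair would have to be opened through the BSD-style pty device nodes instead). It writes n_written bytes into the master side and reports what a single read() on the slave side returns.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 600   /* posix_openpt(), grantpt(), unlockpt(), ptsname() */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <termios.h>
#include <unistd.h>

/* Hypothetical probe: returns what read() on the pty slave reports,
 * or -2 if no pty pair could be set up or the write failed. */
ssize_t pty_read_probe(size_t n_written, size_t n_requested)
{
    static unsigned char out[4096], in[8192];
    struct termios t;
    const char *name;
    int master, slave;
    ssize_t r;

    assert(n_written > 0 && n_written <= sizeof out);
    assert(n_requested > 0 && n_requested <= sizeof in);

    master = posix_openpt(O_RDWR | O_NOCTTY);
    if (master < 0)
        return -2;
    if (grantpt(master) != 0 || unlockpt(master) != 0 ||
        (name = ptsname(master)) == NULL ||
        (slave = open(name, O_RDWR | O_NOCTTY)) < 0) {
        close(master);
        return -2;
    }

    /* Switch the slave to a raw-style mode: no line buffering, no echo,
     * no character translation. */
    if (tcgetattr(slave, &t) == 0) {
        t.c_iflag &= ~(ICRNL | INLCR | IGNCR | ISTRIP | IXON);
        t.c_oflag &= ~OPOST;
        t.c_lflag &= ~(ICANON | ECHO | ISIG);
        t.c_cc[VMIN] = 1;
        t.c_cc[VTIME] = 0;
        tcsetattr(slave, TCSANOW, &t);
    }

    memset(out, 'x', n_written);
    if (write(master, out, n_written) < 0)
        r = -2;
    else
        r = read(slave, in, n_requested);   /* must never exceed n_requested */

    close(slave);
    close(master);
    return r;
}
```

For example, pty_read_probe(3000, 2048) should return a value between 1 and 2048 on a correct kernel. A larger value would point at the tty/pty code rather than at pppoe. The receive buffer is deliberately larger than the request, so an oversized return value cannot overrun it.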
I've found a posting about a quite similar issue on the net:
http://www.yqcomputer.com/
So I'm probably not the only one with this problem.
Thank you for your assistance,
regards,
Martin