Linux 2.4.30-rc3 md/ext3 problems (ext3 gurus : please check)

Linux 2.4.30-rc3 md/ext3 problems (ext3 gurus : please check)

Post by Hifumi His » Thu, 07 Apr 2005 19:10:10


Hi.


>Sorry, was offline for a week last week; I'll try to look at this more
>closely tomorrow. Checking the buffer_uptodate() without either a
>refcount or a lock certainly looks unsafe at first glance.
>
>There are lots of ways to pin the bh in that particular bit of the
>code. The important thing will be to do so without causing leaks if
>we're truly finished with the buffer after this flush.
>

I have measured the bh refcount before the buffer_uptodate() for a few days.
I found out that the bh refcount sometimes reached to 0 .
So, I think following modifications are effective.

diff -Nru 2.4.30-rc3/fs/jbd/commit.c 2.4.30-rc3_patch/fs/jbd/commit.c
--- 2.4.30-rc3/fs/jbd/commit.c 2005-04-06 17:14:47.000000000 +0900
+++ 2.4.30-rc3_patch/fs/jbd/commit.c 2005-04-06 17:18:49.000000000 +0900
@@ -295,6 +295,7 @@
struct buffer_head *bh;
jh = jh->b_tprev; /* Wait on the last written */
bh = jh2bh(jh);
+ get_bh(bh);
if (buffer_locked(bh)) {
spin_unlock(&journal_datalist_lock);
unlock_journal(journal);
@@ -302,11 +303,14 @@
if (unlikely(!buffer_uptodate(bh)))
err = -EIO;
/* the journal_head may have been removed now */
+ put_bh(bh);
lock_journal(journal);
goto write_out_data;
} else if (buffer_dirty(bh)) {
+ put_bh(bh);
goto write_out_data_locked;
}
+ put_bh(bh);
} while (jh != commit_transaction->t_sync_datalist);
goto write_out_data_locked;



>
>> > If some of the write succeeded and some failed, then I believe the
>> > correct behaviour is to return the number of bytes that succeeded.
>> > However this change to the return status (remember the above patch is
>> > a reversal) causes any failure to over-ride any success. This, I
>> > think, is wrong.
>>
>> Yeap, that part also looks wrong.
>
>Certainly it's normal for a short read/write to imply either error or
>EOF, without the error necessarily needing to be returned explicitly.
>I'm not convinced that the Singleunix language actually requires that,
>but it seems the most obvious and consistent behaviour.
>
>--Stephen

When an O_SYNC flag is set , if commit_write() succeed but
generic_osync_inode() return
error due to I/O failure, write() must fail .

I think that following error handling code is rational in
do_generic_file_write() .

if (file->f_flags & O_SYNC)
err = (status < 0) ? status : written;
else
err = written ? written : status;
out:

return err;


Thanks.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to XXXX@XXXXX.COM
More majordomo info at http://www.yqcomputer.com/
Please read the FAQ at http://www.yqcomputer.com/
 
 
 

Linux 2.4.30-rc3 md/ext3 problems (ext3 gurus : please check)

Post by Hifumi His » Fri, 08 Apr 2005 12:10:10

Hi,


>Yes. But it is conventional to interpret a short write as being a
>failure. Returning less bytes than were requested in the write
>indicates that the rest failed. It just doesn't give the exact nature
>of the failure (EIO vs ENOSPC etc.) For regular files, a short write is
>never permitted unless there are errors of some description.

When commit_write() FULLY succeed (requested bytes == returned bytes) but
generic_osync_inode() return error due to I/O failure, current
do_generic_file_write() cannot
return error. I encountered above situation a lot under an I/O trouble
condition .

In ver 2.6.11, the return value of generic_osync_inode() is returned
directry to user
when I/O failure occur.


thanks.



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to XXXX@XXXXX.COM
More majordomo info at http://www.yqcomputer.com/
Please read the FAQ at http://www.yqcomputer.com/

 
 
 

Linux 2.4.30-rc3 md/ext3 problems (ext3 gurus : please check)

Post by Hifumi His » Fri, 29 Apr 2005 18:10:08


>In checking out the patch one last time, though, I found one anomaly.
>The patch that was submitted to 2.6 for this problem also changed
>write_inode_now() to return an error value. The patch that got
>submitted to 2.4 had no such change. Was there a reason for this
>difference between the two versions?

Right. Also in ver 2.4, I know write_inode_now() has to be changed for perfect
I/O error detection during synchronous writing.

In do_generic_file_write(mm/filemap.c), does the current return value handling is
unchanged?

Thanks,

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to XXXX@XXXXX.COM
More majordomo info at http://www.yqcomputer.com/
Please read the FAQ at http://www.yqcomputer.com/