ext2/3 performance regression in 2.6 vs 2.4 for small interleaved writes

Post by Diego Call » Fri, 13 Feb 2004 06:30:25


On Thu, 12 Feb 2004 05:02:39 +0800, Michael Frank < XXXX@XXXXX.COM > wrote:

>> 2.4 has a deadline scheduler. 2.6 default is anticipatory.

I thought the 2.4 I/O scheduler wasn't "deadline" based. I think the first
"deadline" I/O scheduler was the one merged around 2.5.39.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to XXXX@XXXXX.COM
More majordomo info at http://www.yqcomputer.com/
Please read the FAQ at http://www.yqcomputer.com/
 
 
 

Post by Andrew Mor » Fri, 13 Feb 2004 19:10:20


I don't know why the single-stream case would be slower, but the two-stream
case is probably due to writeback changes interacting with a weakness in
the block allocator. 10 megs/sec is pretty awful either way.

You have two files, each allocating blocks from the same part of the disk.
So the blocks of the two files are intermingled.
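To make the intermingling concrete, here is a toy model (not the real ext2/ext3 allocator): it assumes two files are appended to alternately and that every new block simply takes the next free block in a shared region. The function name and the block numbering are made up for illustration.

```python
def allocate_interleaved(num_files, blocks_per_file):
    """Toy allocator: every append takes the next free block of a shared region.

    This is an illustrative assumption, not the actual ext2/ext3 algorithm.
    """
    next_free = 1
    layout = {f: [] for f in range(1, num_files + 1)}
    for _ in range(blocks_per_file):
        # The files are appended to in turn, one block at a time.
        for f in layout:
            layout[f].append(next_free)
            next_free += 1
    return layout


layout = allocate_interleaved(2, 5)
print(layout)  # {1: [1, 3, 5, 7, 9], 2: [2, 4, 6, 8, 10]}
```

Neither file ends up with two adjacent blocks, which is the layout the rest of this explanation assumes.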

The same happens in 2.4, although the effect can be worse in 2.6 if the two
files are in different directories (because 2.6 will usually still start these
files out in the same blockgroup, whereas 2.4 will spread different directories
around).


Either way, you have intermingled blocks in the files.

In 2.4, we write these blocks out in time-of-dirtying-the-block order, so
these blocks are written out to nice big linear chunks of disk - the block
write order is 1,2,3,4,5,6,7...

However in 2.6, we write the data out on a per-file basis. So we write
file 1 (blocks 1,3,5,7,9,...) and then we write file 2 (blocks
2,4,6,8,10,...). So you'll see that instead of a single full-bandwidth
write, we do two half-bandwidth writes. If it weren't for disk writeback
caching, it would be as much as 4x slower.
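The difference between the two write orders can be sketched by counting how many separate contiguous runs of blocks each order sends to the disk. This is a rough model only: it ignores request merging in the I/O scheduler and the drive's own write cache, and the helper name is invented for the example.

```python
def contiguous_runs(order):
    """Count maximal runs of consecutive block numbers in a write sequence."""
    if not order:
        return 0
    runs = 1
    for prev, cur in zip(order, order[1:]):
        if cur != prev + 1:
            runs += 1
    return runs


# Two files whose blocks are intermingled on disk.
file1 = list(range(1, 21, 2))  # blocks 1, 3, 5, ..., 19
file2 = list(range(2, 21, 2))  # blocks 2, 4, 6, ..., 20

# 2.4-style: time-of-dirtying order. The files were written alternately,
# so the blocks go out as 1, 2, 3, 4, ...
order_24 = [block for pair in zip(file1, file2) for block in pair]

# 2.6-style: per-file order. All of file 1 goes out, then all of file 2.
order_26 = file1 + file2

print(contiguous_runs(order_24))  # 1
print(contiguous_runs(order_26))  # 20
```

In this model the 2.4 order is one linear sweep, while the 2.6 order makes two passes over the same region and writes every other block on each pass.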

Reads will be slower too - you will probably find that reading back a file
which was created at the same time as a second stream is significantly
slower than reading a file which was created all on its own. 2.4 and 2.6
shouldn't behave significantly differently here.

It's an unfortunate interaction. The 2.6 writeback design is better,
really, because it is optimised for well-laid-out files: the better your
filesystem is at laying the files out, the faster it all goes. But in this
particular case, the poor layout decisions trip it up.

The ideal fix for this of course is to just fix the dang filesystems to not
do such a silly thing. But nobody got to that in time. Delayed allocation
would fix it too. You can probably address it quite well within the
application itself by buffering up a good amount of data for each write()
call. Maybe a megabyte.
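A minimal sketch of that application-side workaround, assuming the application owns a raw file descriptor and can tolerate up to one chunk of data sitting in memory. The class name `ChunkedWriter` and the 1 MB default are illustrative choices, not anything from the original report.

```python
import os


class ChunkedWriter:
    """Accumulate small writes and hand them to write() in large chunks."""

    def __init__(self, fd, chunk_size=1 << 20):  # 1 MB by default
        self.fd = fd
        self.chunk_size = chunk_size
        self.buf = bytearray()

    def write(self, data):
        self.buf += data
        # Emit only full chunks; keep the remainder buffered.
        while len(self.buf) >= self.chunk_size:
            self._write_all(bytes(self.buf[:self.chunk_size]))
            del self.buf[:self.chunk_size]

    def flush(self):
        # Push out whatever is left, e.g. before closing the file.
        if self.buf:
            self._write_all(bytes(self.buf))
            self.buf.clear()

    def _write_all(self, chunk):
        # os.write() may write fewer bytes than requested, so loop.
        view = memoryview(chunk)
        while len(view) > 0:
            written = os.write(self.fd, view)
            view = view[written:]
```

Each stream would get its own writer, so every write() call covers a long run of one file's blocks. In Python specifically, opening the file with `open(path, "wb", buffering=1 << 20)` gives a similar effect without a custom class.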

XFS will do well at this.

You might be able to improve things significantly on ext2 by increasing
EXT2_DEFAULT_PREALLOC_BLOCKS by a lot - make it 64 or 128. I don't recall
anyone trying that.
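Why a larger preallocation window should help can be seen in a toy model, where each file reserves `window` consecutive blocks at a time instead of one. This is only an illustration of the idea; the real ext2 preallocation logic differs in detail, and the function names here are made up.

```python
def allocate_with_prealloc(num_files, blocks_per_file, window):
    """Toy allocator: each file reserves `window` consecutive blocks at once."""
    next_free = 1
    reserved = {f: [] for f in range(1, num_files + 1)}
    layout = {f: [] for f in reserved}
    for _ in range(blocks_per_file):
        for f in layout:
            if not reserved[f]:
                # Reservation used up: grab the next window of free blocks.
                reserved[f] = list(range(next_free, next_free + window))
                next_free += window
            layout[f].append(reserved[f].pop(0))
    return layout


def run_lengths(blocks):
    """Lengths of the maximal runs of consecutive block numbers."""
    if not blocks:
        return []
    runs, length = [], 1
    for prev, cur in zip(blocks, blocks[1:]):
        if cur == prev + 1:
            length += 1
        else:
            runs.append(length)
            length = 1
    runs.append(length)
    return runs


for window in (1, 8, 64):
    layout = allocate_with_prealloc(2, 128, window)
    print(window, run_lengths(layout[1])[:2])
# 1 [1, 1]
# 8 [8, 8]
# 64 [64, 64]
```

In this model the contiguous run length of each file equals the window size, so per-file writeback in 2.6 would issue correspondingly longer linear writes.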


But I must say, a 21x difference is pretty wild. What filesystem was that
with, how much memory do you have, what was the bandwidth of each
stream, and how much data is the application passing to write()?