Question about reading a big binary file and write it into several text (ascii) files

Post by Albert T » Wed, 26 Jan 2005 05:44:32


I am pretty new to Python and still learning, and I hope you guys can give me
a quick start.

I have a binary file of about 1 GB from a flat panel x-ray detector; I
know there is a 128-byte header at the beginning and the rest of the
file is integers in 2-byte format.

What I want to do is to save the binary data into several smaller files
in integer format, where each smaller file has a size of 2*1024*768 bytes.

I know I can do something like

But I don't know how to save the files in integer format (converting from
binary to ascii files) and how to do this in an elegant and snappy way.

Please reply when you guys get a chance.
Warm regards,

Post by bokr » Wed, 26 Jan 2005 12:56:38

It looks like 16-bit pixels in 1024*768 images, I assume.
You could do that, but why duplicate so much data that you may never look at?
Why not write a class that provides a view of your big file in terms of an image index
and returns an efficient in-memory array, e.g. (untested):

import array
def getimage(n, f, offset=128):
    # skip the header, then jump to the start of image n
    f.seek(offset + n*2*1024*768)
    return array.array('H', f.read(2*1024*768)) # 'H' is for unsigned 2-byte integers (check endianness for swap need!)

Then usage would be
imfile = open('big_file.bin', 'rb')
imarray = getimage(23, imfile)
And you could get pixel x,y by
pixel = imarray[x+y*1024] # or maybe x*768+y etc.
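
A slightly fuller sketch of the same idea, for anyone who wants something to run. It assumes the file stores its 16-bit values little-endian (the file_byteorder default is my guess, not something stated about the detector), that rows are 1024 pixels wide, and the pixel() helper is only a hypothetical name for illustration:

```python
import array
import sys

WIDTH, HEIGHT = 1024, 768          # image geometry taken from the question
IMAGE_BYTES = 2 * WIDTH * HEIGHT   # 16-bit pixels, so 2 bytes each

def getimage(n, f, offset=128, file_byteorder='little'):
    """Return image n from the open binary file f as an array of unsigned 16-bit ints."""
    f.seek(offset + n * IMAGE_BYTES)
    data = f.read(IMAGE_BYTES)
    if len(data) != IMAGE_BYTES:
        raise ValueError('image %d is incomplete: got %d bytes' % (n, len(data)))
    im = array.array('H', data)
    # ASSUMPTION: the file is little-endian unless the caller says otherwise.
    if sys.byteorder != file_byteorder:
        im.byteswap()
    return im

def pixel(im, x, y):
    """Pixel at column x, row y, assuming rows are stored one after another."""
    return im[x + y * WIDTH]
```

If the values look wrong (for example, everything is a multiple of 256), try passing file_byteorder='big'.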

Or you could make getimage a method of a class that you initialize with
the file and which could maintain an LRU cache of images
with a particular disk directory as backup, etc., and would provide
images wrapped with nice methods to support whatever you are doing with the images.

Best is probably to leave the original format alone, e.g., (untested and needs try/except)
this should split the big file into individual image files named file0.ximg .. filen.ximg

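f = open('xray.seq', 'rb')
header = f.read(128)           # 128-byte header precedes the image data
nfile = 0
while 1:
    im = f.read(2*1024*768)    # one image of 16-bit pixels
    if not im: break
    if len(im) != 2*1024*768: print('broken tail of %s bytes' % len(im)); break
    fw = open('file%s.ximg' % nfile, 'wb')
    fw.write(im)
    fw.close()
    nfile += 1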

then you could use getimage above with offset passed as 0 and image number 0, e.g.,

im23 = getimage(0, open('file23.ximg','rb'), 0) # img 0, offset 0
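
On the "needs try/except" point: one way to get the files closed even when something goes wrong is to wrap the splitting loop in a function that uses with blocks. This is only a sketch; split_images and its parameter names are invented here, and the header and image sizes are the ones from the question:

```python
def split_images(src_path, out_pattern='file%d.ximg',
                 header_bytes=128, image_bytes=2 * 1024 * 768):
    """Copy each fixed-size image in src_path to its own file; return the count."""
    count = 0
    with open(src_path, 'rb') as src:
        src.seek(header_bytes)                 # skip the header
        while True:
            im = src.read(image_bytes)
            if not im:
                break                          # clean end of file
            if len(im) != image_bytes:
                print('broken tail of %d bytes' % len(im))
                break
            with open(out_pattern % count, 'wb') as dst:
                dst.write(im)
            count += 1
    return count
```

Calling split_images('xray.seq') should leave file0.ximg, file1.ximg, and so on in the current directory, each readable with getimage using offset 0.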

But then you might wonder about all those separate files, unless you want to
put them on a series of CDs where they wouldn't all fit on one. Whatever ;-)

You will probably lose in both speed and space if you try to make some kind
of ascii disk files. You aren't thinking XML are you??!! For this, definitely ick ;-)
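
If some other program really does need text, a rough sketch of dumping one image as ascii could look like the following. The write_ascii name and the one-row-per-line, space-separated layout are my own choices, not a format anyone asked for:

```python
import array

def write_ascii(im, path, width=1024):
    """Write a flat sequence of pixel values as text, one image row per line."""
    with open(path, 'w') as out:
        for start in range(0, len(im), width):
            row = im[start:start + width]
            out.write(' '.join(str(v) for v in row))
            out.write('\n')
```

Each 16-bit pixel takes 2 bytes in binary but up to 6 bytes as text (five digits plus a separator), which is where the space penalty comes from.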

What you want to do will depend on the big picture, which is not apparent yet ;-)

Sorry to give nothing but untested suggestions, but I have to go, and I
will be mostly offline for a while.

Bengt Richter