Can the last record in a direct access file be shorter than the record length?

Post by Arjen Mark » Wed, 04 Feb 2009 16:54:18


Hello,

I have run into a small issue with direct-access files (within the
PLplot project - http://www.yqcomputer.com/ ).

The background: some compilers express the record length of
direct-access files in "words" rather than bytes. The minimum length
of a record is then 4 bytes, but the file I need to read is actually
a binary file, so I would like to deal with individual bytes instead.

The algorithm I have deals nicely with this problem, but it does
assume that the last record can be read regardless of its actual
length. The compilers I have tried do not complain about it, but that
may be a mere coincidence.

To illustrate the problem more concretely, here is a sample program
(it uses integers instead of characters, but the principle is the
same):

! chk_direct_access.f90 --
!     Quick check: can we read an incomplete last record
!     from a direct access file?
!
program chk_direct
    implicit none
    integer :: a, b, c
    integer :: ierr, ierr2

    ! Write a single record of length 4
    open( 10, file = 'chk_direct.xxx', access = 'direct', recl = 4 )
    a = 123456789
    write( 10, rec = 1 ) a
    close( 10 )

    ! Reopen with a record length of 8: record 1 is now incomplete
    open( 11, file = 'chk_direct.xxx', access = 'direct', recl = 8 )
    b = 0
    c = 0
    read( 11, rec = 1, iostat = ierr ) b
    read( 11, rec = 2, iostat = ierr2 ) c
    close( 11 )

    write(*,*) 'a = ', a
    write(*,*) 'b = ', b , ierr
    write(*,*) 'c = ', c , ierr2

end program chk_direct

Invariably, ierr is printed as 0 and ierr2 (the result of reading well
beyond the file - logical record 2 contains not a single byte inside
the file) is not.

Is this a coincidence or is it well-defined behaviour? (I can live
with both, but the first might cause trouble in the future)
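
One way to avoid depending on this behaviour might be to work out from
the file size whether the last record is complete. The following is
only a sketch: it assumes a Fortran 2003 compiler (for the size=
specifier of inquire) and that the size is reported in bytes; the
program name and nbytes_per_record are hypothetical.

```fortran
! Sketch: detect a short trailing record from the file size.
! Assumptions: inquire( size = ... ) is supported (Fortran 2003) and
! reports bytes; nbytes_per_record is an illustrative value.
program chk_last_record
    implicit none
    character(len=*), parameter :: fname = 'chk_direct.xxx'
    integer, parameter :: nbytes_per_record = 8
    integer :: fsize, nfull, nleft

    inquire( file = fname, size = fsize )

    if ( fsize < 0 ) then
        ! A negative size means the size could not be determined
        write(*,*) 'File size could not be determined'
    else
        nfull = fsize / nbytes_per_record
        nleft = mod( fsize, nbytes_per_record )
        write(*,*) 'Complete records:        ', nfull
        write(*,*) 'Bytes in partial record: ', nleft
    endif
end program chk_last_record
```

If nleft is non-zero, the final record is shorter than the record
length and could be handled separately instead of relying on the read
succeeding.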

Regards,

Arjen
 
 
 

Post by Arjen Mark » Wed, 04 Feb 2009 17:36:31


Ah, well, we will just wait for a bug report then. Meanwhile it
seems to work for a fair number of compilers.

Regards,

Arjen

 
 
 

Post by Terenc » Wed, 04 Feb 2009 17:46:37


I don't use many Fortran compilers, but only one old one assumes the
RECL value to be in 1-byte units; the other three assume 4-byte
units, but offer a compiler option to use 1-byte units.
 
 
 

Post by Arjen Mark » Wed, 04 Feb 2009 17:55:44


Of the ones I have easy access to, two use 1-byte units, the other
two use 4-byte units with an option to use 1-byte units.

Unfortunately, in the context of this library we cannot assume
anything about the unit nor enforce anything. It is also just one
example program where this situation occurs, so it is a very minor
issue. But still annoying enough for me to try and solve it as
best I could.
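
Since the unit cannot be assumed, it can at least be estimated at run
time with inquire( iolength = ... ). The sketch below is an
illustration only: the program and variable names are hypothetical,
and it assumes that the reported length contains no padding.

```fortran
! Sketch: estimate how many bytes one RECL unit represents.
! Assumption: the iolength value for the array contains no padding.
program chk_recl_unit
    implicit none
    integer, dimension(4) :: words
    integer :: reclen, nbytes

    words = 0

    ! Record length, in the compiler's own units, needed for 4 integers
    inquire( iolength = reclen ) words

    ! The same amount of data expressed in bytes
    nbytes = size(words) * bit_size(words) / 8

    write(*,*) 'Record length for 4 default integers: ', reclen
    write(*,*) 'Size of the same data in bytes:       ', nbytes
    write(*,*) 'Apparent bytes per RECL unit:         ', nbytes / reclen
end program chk_recl_unit
```

Passing the iolength result directly as recl avoids hard-coding the
unit, but it does not help when single bytes must be addressed on a
compiler that counts in words.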

Regards,

Arjen
 
 
 

Post by Arjen Mark » Wed, 04 Feb 2009 21:28:54


Interesting, yes, I had not thought of that before. As for the
remarks you make at the end of your tutorial, I have indeed
followed that strategy in the past:
- Use various combinations of values for form and access in the open statement
- Determine which one succeeds

I can add that an older version of (I think) the Lahey compiler
used "form='unformatted', access='transparent'" files for this.

And in some cases you had to actually write to the file before
you knew the open statement really was successful.
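
The strategy described above could be sketched roughly as follows.
This is not the code actually used: the program name, the scratch
file name and the list of candidate access values are assumptions,
and 'transparent' is a non-standard value that most compilers will
refuse.

```fortran
! Sketch: try several access modes and report the first one that works.
! The candidate list is illustrative; 'transparent' is non-standard.
program chk_open_modes
    implicit none
    ! The values are kept in a variable because a compiler may reject
    ! an unknown literal value in the open statement at compile time
    character(len=11), dimension(2) :: modes = &
        (/ 'stream     ', 'transparent' /)
    integer :: k, ierr

    do k = 1, size(modes)
        open( 10, file = 'chk_open.tmp', form = 'unformatted', &
              access = trim(modes(k)), status = 'replace', iostat = ierr )
        if ( ierr == 0 ) then
            ! Write one byte: the open alone may not reveal a failure
            write( 10, iostat = ierr ) 'x'
            close( 10, status = 'delete' )
        endif
        if ( ierr == 0 ) then
            write(*,*) 'Usable access mode: ', trim(modes(k))
            exit
        endif
    enddo

    if ( ierr /= 0 ) write(*,*) 'No byte-oriented access mode found'
end program chk_open_modes
```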

Regards,

Arjen
 
 
 

Post by Dave Allur » Fri, 06 Feb 2009 02:17:48


Nice tutorial, Clive. Thanks.

We had a similar discussion recently about reading the last record with
stream I/O:

http://www.yqcomputer.com/

In that thread Thomas Koenig offered a pretty good solution, repeated
here for your convenience:



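program main
    implicit none
    integer(kind=1), dimension(200000) :: a
    integer :: i, j

    open(unit=10, file="rc", status="old", access="stream", &
         form="unformatted")
    i = 1   ! stream positions start at 1; covers files shorter than one chunk
    do
        ! Read chunks of 200000 bytes until the end of the file is hit
        read (10, end=10) a
        inquire(10, pos=i)
    end do
10  continue
    inquire(10, pos=j)
    print *, j-i
end program main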

This is the best I have seen so far for combining portability and
efficiency. I have not yet had a chance to try it out.

--Dave