C0X: File information funcs

Post by davi » Thu, 09 Oct 2003 04:48:14


I've posted a new proposal for ISO C0X (C200X, i.e., the next ISO C
standard) for a set of types and functions that retrieve information
about a file or I/O stream. It's at:

http://www.yqcomputer.com/

(There is also a related proposal for functions that retrieve info
about a file system, to which I have devoted a separate thread
with the subject "C0X: File system info funcs".)

SUMMARY

Addition of a new structure type to <stdio.h>:

    struct _fileinfo
    {
        int                 fi_type;        // File type
        unsigned long int   fi_perms;       // Access permissions
        long long int       fi_id;          // Serial number
        long long int       fi_size;        // Size, in bytes
        time_t              fi_modified;    // Modification time
        time_t              fi_accessed;    // Last access time
        time_t              fi_created;     // Creation time
    };

Addition of constants (preprocessor macros) for the fi_type member:

_FILE_TYPE_FILE // Regular file
_FILE_TYPE_DIR // Directory
_FILE_TYPE_UNKNOWN // Unknown type

Addition of constants (preprocessor macros) for the fi_perms member:

_FILE_PERM_READ // Read access
_FILE_PERM_WRITE // Write access
_FILE_PERM_EXEC // Execute access
_FILE_PERM_SEARCH // Search access

Addition of functions to interrogate the implementation for information
about a file name or I/O stream:

extern int _getfileinfo(const char *fname, struct _fileinfo *info);
extern int _fgetfileinfo(FILE *fp, struct _fileinfo *info);

Comments?

-- David R. Tribble, david at tribble.com --
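
For readers who want to see how the proposed interface might map onto an existing system, here is a minimal sketch of a POSIX emulation. Everything in it is an assumption, not part of the proposal: the x_/X_ prefixes replace the proposal's reserved-namespace names, the constant values are arbitrary, the 0/-1 return convention is guessed, and X_TIME_UNKNOWN stands in for the _TIME_UNKNOWN value discussed later in the thread.

```c
#include <stdio.h>
#include <time.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical stand-ins for the proposal's names; values are arbitrary. */
enum { X_FILE_TYPE_UNKNOWN = 0, X_FILE_TYPE_FILE, X_FILE_TYPE_DIR };
enum {
    X_FILE_PERM_READ   = 1 << 0,
    X_FILE_PERM_WRITE  = 1 << 1,
    X_FILE_PERM_EXEC   = 1 << 2,
    X_FILE_PERM_SEARCH = 1 << 3
};
#define X_TIME_UNKNOWN ((time_t)-1)

struct x_fileinfo {
    int               fi_type;
    unsigned long int fi_perms;
    long long int     fi_id;
    long long int     fi_size;
    time_t            fi_modified;
    time_t            fi_accessed;
    time_t            fi_created;
};

/* Returns 0 on success, -1 on failure (an assumed convention). */
int x_getfileinfo(const char *fname, struct x_fileinfo *info)
{
    struct stat st;

    if (fname == NULL || info == NULL || stat(fname, &st) != 0)
        return -1;

    if (S_ISREG(st.st_mode))
        info->fi_type = X_FILE_TYPE_FILE;
    else if (S_ISDIR(st.st_mode))
        info->fi_type = X_FILE_TYPE_DIR;
    else
        info->fi_type = X_FILE_TYPE_UNKNOWN;

    /* access() checks against the real user and group IDs of the process. */
    info->fi_perms = 0;
    if (access(fname, R_OK) == 0)
        info->fi_perms |= X_FILE_PERM_READ;
    if (access(fname, W_OK) == 0)
        info->fi_perms |= X_FILE_PERM_WRITE;
    if (access(fname, X_OK) == 0)
        info->fi_perms |= S_ISDIR(st.st_mode) ? X_FILE_PERM_SEARCH
                                              : X_FILE_PERM_EXEC;

    info->fi_id       = (long long)st.st_ino;
    info->fi_size     = (long long)st.st_size;
    info->fi_modified = st.st_mtime;
    info->fi_accessed = st.st_atime;
    info->fi_created  = X_TIME_UNKNOWN;  /* struct stat has no creation time */
    return 0;
}
```

Calling x_getfileinfo(".", &fi) on a POSIX system should report a directory with fi_created left at the unknown value, which is the case the replies below pick up on.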
 
 
 

C0X: File information funcs

Post by Keith Thom » Thu, 09 Oct 2003 08:10:07


XXXX@XXXXX.COM (David R Tribble) writes:
[...]
[...]

Note that the creation time (fi_created) is undefined (by your
proposal, set to _TIME_UNKNOWN) on Unix-like systems.

BTW, if you're going to add _TIME_UNKNOWN, it should be in <time.h>.

--
Keith Thompson (The_Other_Keith) XXXX@XXXXX.COM < http://www.yqcomputer.com/ ~kst>
San Diego Supercomputer Center <*> < http://www.yqcomputer.com/ ~kst>
Schroedinger does Shakespeare: "To be *and* not to be"

 
 
 

C0X: File information funcs

Post by Douglas A. » Thu, 09 Oct 2003 12:18:13


I think before we buy into such a facility, there will have
to be a lot of negotiation concerning the information that
will be made available for the specified file. One model
is what NFS provides for status information.

As to the "permissions", there are various file protection
schemes that do not fit well if at all into the proposed
scheme. Even venerable Unix has multiple levels of file
permissions. Perhaps this aspect (at least) should be
modularized, e.g. hidden within some opaque type with
access-probing functions specified.

Also, we need support for directories before it makes
sense to report that a file is a directory. The POSIX
<dirent.h> functions are a start, but for many purposes
more is needed. Check 9P2000 to see one successful
protocol that supports tree walking.
 
 
 

C0X: File information funcs

Post by Richard Ke » Thu, 09 Oct 2003 19:49:59


XXXX@XXXXX.COM (David R Tribble) writes:

The web page says:

Implementations may represent the file size as a value rounded up
to the next multiple of an implementation-dependent block size

Is this meant to match up the size exactly with the effect of 7.19.2:

Such a stream may, however, have an implementation-defined number
of null characters appended to the end of the stream.

The relationship between the two should probably be made explicit (or
explicitly disavowed, if there isn't supposed to be one).
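
For context, the only way to ask for a size in standard C today is to seek to the end of a stream and call ftell(). The sketch below shows that idiom; the name stream_size is made up for illustration. It is hedged on purpose: the standard says a binary stream need not meaningfully support SEEK_END, and for a text stream the ftell() result is not guaranteed to be a byte count, which is exactly where the padding question above comes in.

```c
#include <stdio.h>

/* Returns the position ftell() reports at end-of-stream, or -1L on failure.
   The original stream position is restored before returning. */
long stream_size(FILE *fp)
{
    long saved, size;

    if (fp == NULL)
        return -1L;

    saved = ftell(fp);
    if (saved < 0)
        return -1L;

    if (fseek(fp, 0L, SEEK_END) != 0)
        return -1L;
    size = ftell(fp);

    if (fseek(fp, saved, SEEK_SET) != 0)
        return -1L;
    return size;
}
```

On an implementation that pads binary files, the value returned here would presumably include the padding, so it may be the natural thing for fi_size to agree with.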

--
http://www.yqcomputer.com/
 
 
 

C0X: File information funcs

Post by Dan.Po » Thu, 09 Oct 2003 20:07:39

In < XXXX@XXXXX.COM > XXXX@XXXXX.COM (David R Tribble) writes:


Why reinvent the wheel instead of standardising as much as possible
(in the context of a C standard) from the POSIX <stat.h> and <dirent.h> ?

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: XXXX@XXXXX.COM
 
 
 

C0X: File information funcs

Post by davi » Fri, 10 Oct 2003 01:12:08

>> time_t fi_created; // Creation time

Keith Thompson < XXXX@XXXXX.COM > wrote:

Hmmm, I was thinking it would correspond to the 'st_ctime' member of
'struct stat' in POSIX. But you're right, 'st_ctime' is not the
creation time of the file, but the time that its inode info was last
modified. So for POSIX, my 'fi_accessed' would probably be the greater
of 'st_ctime' and 'st_atime'.

Still, there are systems that provide a file's creation time (most
notably, Win32).



Yes, that's what is specified in the proposal.

-- David R. Tribble, david at tribble.com --
 
 
 

C0X: File information funcs

Post by davi » Fri, 10 Oct 2003 01:40:32


I tried to find the most minimal but most useful set of attributes.
But of course, I'm open to further ideas.

I also left the door open for implementations to provide
additional attributes if they choose.



I want to keep things as simple as possible. Hence the use
of basically two fundamental access permissions: READ and WRITE.
I based this on the theory that a given program can determine
if it can do either of those operations on a given file or stream.

If this involves multiple levels of access checking, then
presumably the O/S provides mechanisms for doing just that.
I assume, after all, that if fopen() can determine whether
the program can read/write the file, then so can a function
like _getfileinfo().

-- David R. Tribble, david at tribble.com --
 
 
 

C0X: File information funcs

Post by davi » Fri, 10 Oct 2003 01:50:11


Yes, as I note in passing in my proposal, ISO C does not define
or even mention (AFAIK) the concept of "directory". Since I
think it's a reasonably useful minimum to determine whether a
given file is a FILE or DIRECTORY, those are the only two file
types that I specifically mandate in my proposal. Of course,
I leave the door open for implementations to provide additional
file types as well (e.g., SOCKET, PIPE, etc.).

It would be justified, in my opinion, to add a simple, suitably
vague definition of "directory" to the std in order to support,
at the very minimum, the ability to determine that a given file
name is one or not.

Adding more elaborate support for directories, such as
functions for searching/traversing them, would be best handled
in a separate proposal. A simplified POSIX or BSD model would
probably be a good place to start.
(And, yes, I have considered doing just that.)
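
As a reference point for what the POSIX model looks like in practice, here is a minimal sketch that walks one directory with the <dirent.h> functions and counts its entries. The function name count_dir_entries is made up for illustration, and this is existing POSIX practice, not anything from the proposal.

```c
#include <stddef.h>
#include <dirent.h>

/* Returns the number of entries readdir() yields for the directory
   (typically including "." and ".."), or -1 if it cannot be opened. */
long count_dir_entries(const char *path)
{
    DIR *dir;
    long n = 0;

    if (path == NULL)
        return -1;

    dir = opendir(path);
    if (dir == NULL)
        return -1;

    /* readdir() returns NULL at end of directory and also on error;
       this sketch does not distinguish the two. */
    while (readdir(dir) != NULL)
        n++;

    closedir(dir);
    return n;
}
```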

-- David R Tribble, david at tribble.com --
 
 
 

C0X: File information funcs

Post by davi » Fri, 10 Oct 2003 03:47:56

>> Comments?



Hey, don't think I didn't consider simply proposing that POSIX's
struct stat, stat(), and fstat() should become part of ISO C.

But it's not that simple, and here are my reasons why:

1. Function fstat() accepts a file descriptor, which has no meaning in
ISO C (note that fileno() is not part of ISO C). The accepted way
to specify I/O streams in ISO C is to use type FILE*. So I'd have
to invent a new function to do this (perhaps fpstat()), and fstat()
as it stands cannot be added to ISO C.

2. The stat structure contains several members that have Unix-specific
meanings. Trying to get attributes like user-ID and group-ID accepted
into ISO C would require adding definitions of those concepts to the
std, concepts which are used nowhere else in the std. This reduces
the chances that C, a widespread, almost generic systems programming
language, would incorporate such system-specific concepts.

3. The stat.st_mode member actually represents two attributes of a file,
the type (file/dir/etc.) and the access permissions. I took the simpler
tactic of providing those two attributes in two distinct members.

4. The stat.st_mode permissions bits reflect a Unix-specific concept of
user/group/other access levels. Again, I don't think grafting these
concepts onto ISO C is the way to keep C generic for (almost) all
platforms. We'd just end up adding definitions for these concepts in
the std that would not be referenced anywhere else in the std.

The alternative would be to ignore the multi-level access concepts and
provide a simple READ/WRITE access check. Which is the approach I took.

5. The names 'stat()' and 'struct stat' intrude into the user namespace.
These names have not been reserved in previous (future) editions of
ISO C. If the semantics of those names is not exactly the same as Unix
(and MS Visual C, and others), this will cause compatibility problems.

So I decided that rather than choose one implementation (widespread as
it is) as the model to standardize, it would be better to invent a new,
simpler model that (practically) all implementations could live with.
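
Points 1 and 3 above can be made concrete with a short sketch of what a FILE*-based wrapper has to do on a POSIX system. The names stream_mode_parts and struct mode_parts are hypothetical. Note that the wrapper has to go through fileno(), which is POSIX and not ISO C, and that it has to pull the file type and the permission bits apart from the single st_mode member.

```c
#define _POSIX_C_SOURCE 200809L  /* fileno() and fstat() are POSIX, not ISO C */
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical result type: file type and permissions kept separate. */
struct mode_parts {
    int          is_regular;  /* nonzero for a regular file */
    int          is_dir;      /* nonzero for a directory */
    unsigned int perm_bits;   /* Unix user/group/other permission bits */
};

/* Returns 0 on success, -1 on failure (an assumed convention). */
int stream_mode_parts(FILE *fp, struct mode_parts *out)
{
    struct stat st;
    int fd;

    if (fp == NULL || out == NULL)
        return -1;

    fd = fileno(fp);                  /* the step ISO C has no equivalent for */
    if (fd < 0 || fstat(fd, &st) != 0)
        return -1;

    out->is_regular = S_ISREG(st.st_mode) ? 1 : 0;
    out->is_dir     = S_ISDIR(st.st_mode) ? 1 : 0;
    out->perm_bits  = (unsigned int)(st.st_mode & 07777);
    return 0;
}
```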

-- David R Tribble, david at tribble.com --
 
 
 

C0X: File information funcs

Post by Richard He » Fri, 10 Oct 2003 06:08:47


What about important hosted environments, such as OS390, which don't have
the concept of "directory"? Will you add support for PDSs, sequential data
sets, GDGs, etc? Where do you stop?

--
Richard Heathfield : XXXX@XXXXX.COM
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.yqcomputer.com/ ~scs/C-faq/top.html
K&R answers, C books, etc: http://www.yqcomputer.com/
 
 
 

C0X: File information funcs

Post by Douglas A. » Fri, 10 Oct 2003 14:43:59


It may have to actually write or read something in order
to do that. For a write-only file that would be a bad
idea.


But fopen doesn't probe the readability or writability of
the file; it merely sets up the standard library for later
attempts to read or write, and it's those attempts that
will discover whether the operation is allowed.
 
 
 

C0X: File information funcs

Post by Douglas A. » Fri, 10 Oct 2003 14:47:47


Presumably no file info would ever say that a file is
a directory, in that case.

I would suggest to anybody who wants to walk this path
that they look up the Software Tools portable interface,
which was developed to suit a great variety of mainframe
OSes.
 
 
 

C0X: File information funcs

Post by Keith Thom » Fri, 10 Oct 2003 16:10:35

"Douglas A. Gwyn" < XXXX@XXXXX.COM > writes:

[...]

C99 7.19.5.3 p4 says:

Opening a file with read mode ('r' as the first character in the
mode argument) fails if the file does not exist or cannot be read.

I don't see a similar statement for write mode. An fopen() call with
mode "w" typically fails for a read-only file, but I suppose that's
not required.
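
The read-mode case can be sketched in standard C as a simple probe; the name can_open_for_reading is made up for illustration. There is no equally harmless probe for writing: mode "w" truncates or creates the file, "a" creates it if it is missing, and "r+" also demands read access.

```c
#include <stdio.h>

/* Returns 1 if fopen() in read mode succeeds, 0 otherwise.
   A successful open is closed again immediately. */
int can_open_for_reading(const char *fname)
{
    FILE *fp;

    if (fname == NULL)
        return 0;

    fp = fopen(fname, "r");
    if (fp == NULL)
        return 0;

    fclose(fp);
    return 1;
}
```

A result of 0 only says that the open failed. It does not say whether the file is missing or merely unreadable, which is part of what a file-info function would add.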

--
Keith Thompson (The_Other_Keith) XXXX@XXXXX.COM < http://www.yqcomputer.com/ ~kst>
San Diego Supercomputer Center <*> < http://www.yqcomputer.com/ ~kst>
Schroedinger does Shakespeare: "To be *and* not to be"
 
 
 

C0X: File information funcs

Post by algran » Fri, 10 Oct 2003 17:02:36


A PDS is a similar situation to something like a multi-streamed file
under OS/2 (or its Windows emulation) or Macintosh. Some properties
are 'file-like' (e.g. access) and others are 'directory-like' (e.g.
individual elements can be opened in their own right). Then there
are things like tar files, zip files etc. Are we talking about 'the
operating system' or 'the environment in general'? Do we need to
know about compression? Migration? The number of real blocks
occupied by a sparse file (without reading it)? Carriage controls?

OS390 brings up a whole other set of questions. Just obtaining the
exact size of a file in bytes may require reading it. Or you may
just need an upper limit. For a catalogued tape dataset you might
be able to get the permissions without mounting the tape but have
no idea of the size. In general there may be various different
types of information that are obtained from different sources.
There needs to be some way of indicating what information you need
and what the resulting structure contains.
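
One possible shape for that is a pair of bit masks: the caller passes the fields it wants, and the result says which fields were actually filled in. The sketch below is purely illustrative. All of the names (query_file, struct query_result, the Q_ constants) are invented here, and it is built on POSIX stat() only to have something runnable. Linux's statx() uses a similar request/result mask.

```c
#include <stddef.h>
#include <time.h>
#include <sys/stat.h>

/* Hypothetical field-selection bits. */
enum {
    Q_TYPE  = 1 << 0,
    Q_SIZE  = 1 << 1,
    Q_MTIME = 1 << 2,
    Q_BTIME = 1 << 3   /* creation ("birth") time */
};

struct query_result {
    unsigned int valid;   /* which members below were filled in */
    int          is_dir;
    long long    size;
    time_t       mtime;
    time_t       btime;
};

/* Returns 0 if the file could be examined at all, -1 otherwise. */
int query_file(const char *fname, unsigned int wanted,
               struct query_result *out)
{
    struct stat st;

    if (fname == NULL || out == NULL || stat(fname, &st) != 0)
        return -1;

    out->valid = 0;
    if (wanted & Q_TYPE) {
        out->is_dir = S_ISDIR(st.st_mode) ? 1 : 0;
        out->valid |= Q_TYPE;
    }
    if (wanted & Q_SIZE) {
        out->size = (long long)st.st_size;
        out->valid |= Q_SIZE;
    }
    if (wanted & Q_MTIME) {
        out->mtime = st.st_mtime;
        out->valid |= Q_MTIME;
    }
    /* Q_BTIME: struct stat has no creation time, so the bit stays clear. */
    return 0;
}
```

On a system where the size is expensive to obtain, an implementation along these lines could skip that work whenever Q_SIZE is not requested. With stat() everything arrives at once, so the mask here only controls what is reported.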

(Incidentally the C programming API for Novell fileservers is
possibly the most complete I've ever encountered in terms of dealing
with different filing system models, though even that only copes with
Windows, OS/2, Mac and POSIX/NFS.)
 
 
 

C0X: File information funcs

Post by rridg » Fri, 10 Oct 2003 18:30:53


Who's going to use these services? What problems does it solve?

I'm not sure much of the functionality would be very useful in a program
that doesn't depend on other non-standard functions.

The only thing I can see being useful is getting the last modification
time, which is a fairly widespread concept, and standard C has functions
for dealing with time. Otherwise I can test permissions already with
fopen(), and can't do anything portable knowing that something is a
directory or an executable. Can't compare serial numbers portably
because I can't tell if files are on different filesystems portably.
I can get some portability out of file sizes, I suppose.
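
To illustrate the serial-number point: on POSIX, a same-file test compares the device ID together with the inode number, because inode numbers are only unique within one filesystem. A minimal sketch follows, with the hypothetical name same_file. The proposed fi_id has no counterpart to st_dev, which is the gap being described.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Returns 1 if both names refer to the same file, 0 if they do not,
   and -1 if either name cannot be examined. */
int same_file(const char *a, const char *b)
{
    struct stat sa, sb;

    if (a == NULL || b == NULL)
        return -1;
    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;

    /* The inode number alone is not enough: two files on different
       filesystems may share one, so the device ID is compared too. */
    return (sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino) ? 1 : 0;
}
```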

I think what might be good in this area is something that's somewhere
between Standard C and POSIX. Something that has a wider and
useful scope than what you're proposing but isn't supposed to be
practical on every possible or existing Standard C implementation.
Basically, something that standardized the existing practice of many
C implementations which support a fair number of Unix-like file I/O
functions. Maybe as an optional annex to the C standard.

Ross Ridge

--
l/ // Ross Ridge -- The Great HTMU
[oo][oo] XXXX@XXXXX.COM
-()-/()/ http://www.yqcomputer.com/
db //