type punning question...

Post by Jeff W. Bo » Fri, 15 Sep 2006 09:30:53


I'm trying to understand the instances when aliasing is allowed based on ISO
9899:1999.

Specifically - taking the C socket API as specified by RFC2133 as an example...

Most implementations of the functions that take 'struct sockaddr *' as an
argument internally alias that memory using 'struct sockaddr_in *' or 'struct
sockaddr_in6 *' eventually.

For example, as taken from the latest FreeBSD source
http://www.yqcomputer.com/

In the 'getnameinfo' function:

int
getnameinfo(const struct sockaddr *sa, socklen_t salen,
    char *host, size_t hostlen, char *serv, size_t servlen,
    int flags)
{
    ...
    switch (sa->sa_family) {
    case AF_INET:
        v4a = (u_int32_t)
            ntohl(((const struct sockaddr_in *)sa)->sin_addr.s_addr);
        if (IN_MULTICAST(v4a) || IN_EXPERIMENTAL(v4a))
            flags |= NI_NUMERICHOST;
        v4a >>= IN_CLASSA_NSHIFT;
        if (v4a == 0)
            flags |= NI_NUMERICHOST;
        break;
#ifdef INET6
    case AF_INET6:
        {
            const struct sockaddr_in6 *sin6;
            sin6 = (const struct sockaddr_in6 *)sa;
            switch (sin6->sin6_addr.s6_addr[0]) {
            case 0x00:
                if (IN6_IS_ADDR_V4MAPPED(&sin6->sin6_addr))
                    ;
                else if (IN6_IS_ADDR_LOOPBACK(&sin6->sin6_addr))
                    ;
                else
                    flags |= NI_NUMERICHOST;
                break;
            default:
                if (IN6_IS_ADDR_LINKLOCAL(&sin6->sin6_addr)) {
                    flags |= NI_NUMERICHOST;
                }
                else if (IN6_IS_ADDR_MULTICAST(&sin6->sin6_addr))
                    flags |= NI_NUMERICHOST;
                break;
            }
        }
        break;
#endif
    }
    ...
}

Is this usage valid? I do not mean to pick on FreeBSD here - I'm sure this code
is older than the C99 standard. I use this as an example because I have similar
code of my own to update (socket code) and there is some benefit to my code
looking similar to the socket API. I'm wondering if this is acceptable usage or not.

I have looked at section 6.5, paragraph 7 of the standard. From my reading of
it, this usage is not allowed. But if this is true (and I must admit I'm not
overly confident in my reading of this) I'm trying to figure out how to
implement these kinds of functions - in a backwards compatible way.

I guess I would need to memcpy the structure into a union of all possible
structures internal to the function - and then access the fields through the
union? And then memcpy the specific portion of that union back to return it?
(Not exactly great for optimization...)
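
A minimal sketch of the copy-in idea, assuming a POSIX system. The helper name ipv4_host_order is hypothetical and not part of any socket API. It copies the caller's bytes into a local struct sockaddr_in with memcpy, so each member is read from an object of its declared type and no cast pointer is dereferenced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: returns 1 and stores the IPv4 address in host
 * byte order in *out if sa holds an AF_INET address; returns 0 otherwise. */
static int
ipv4_host_order(const struct sockaddr *sa, socklen_t salen, uint32_t *out)
{
    sa_family_t family;
    struct sockaddr_in sin;

    assert(sa != NULL && out != NULL);

    /* Too short to contain even the generic header. */
    if (salen < (socklen_t)sizeof(struct sockaddr))
        return 0;

    /* Read the family by copying bytes, not by dereferencing sa. */
    memcpy(&family,
        (const unsigned char *)sa + offsetof(struct sockaddr, sa_family),
        sizeof family);
    if (family != AF_INET || salen < (socklen_t)sizeof sin)
        return 0;

    /* Copy into an object whose declared type is struct sockaddr_in. */
    memcpy(&sin, sa, sizeof sin);
    *out = ntohl(sin.sin_addr.s_addr);
    return 1;
}
```

Many compilers turn a small fixed-size memcpy like this into plain loads and stores, so the cost may be smaller than it looks, though that is an optimization detail and not something the standard promises.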

Sorry this has become so long.

Thanks,
jeff
 
 
 


Post by Douglas A. » Sat, 16 Sep 2006 03:36:19


There is a lot of historical cruft in the sockets interface,
and much existing code conflates AF_* with PF_* and has other
technical infelicities.

What C guarantees is that you can pun a pointer to one structure
type to that of another structure type, and use the result to
access any of the initial set of structure members that have
the same type according to both structure declarations. I leave
it to you to verify whether or not the posted example followed
this rule.

In other cases, use of a union type is suggested.

 
 
 


Post by Keith Thom » Sat, 16 Sep 2006 07:07:07

"Douglas A. Gwyn" < XXXX@XXXXX.COM > writes:


Where is that guaranteed?

C99 6.5.2.3p5 says:

One special guarantee is made in order to simplify the use of
unions: if a union contains several structures that share a common
initial sequence (see below), and if the union object currently
contains one of these structures, it is permitted to inspect the
common initial part of any of them anywhere that a declaration of
the complete type of the union is visible. Two structures share a
_common initial sequence_ if corresponding members have compatible
types (and, for bit-fields, the same widths) for a sequence of one
or more initial members.

Strictly speaking, this guarantee applies only if the two structures
are members of a union.
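
A small sketch of the case that 6.5.2.3p5 does cover. The type and function names here (v4_like, v6_like, any_addr, family_of) are made up for illustration and do not come from any socket header. Both structures start with a member of the same type, both are members of one union, and the complete union type is visible where the inspection happens.

```c
#include <assert.h>

/* Two hypothetical structures sharing a common initial sequence:
 * the first member of each has the same type. */
struct v4_like {
    int family;
    unsigned int addr;
};

struct v6_like {
    int family;
    unsigned char addr[16];
};

/* The union whose complete declaration must be visible. */
union any_addr {
    struct v4_like v4;
    struct v6_like v6;
};

/* Inspects the common initial part through the v4 member, whichever
 * of the two structures the union currently contains. */
static int
family_of(const union any_addr *u)
{
    assert(u != 0);
    return u->v4.family;
}
```

Note that the function takes a pointer to the union, not a pointer to one of the structures; a function receiving only a struct v4_like * would be back in the situation the paragraph above calls unguaranteed.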

Of course the simplest way to satisfy the requirement is to lay out
the structures in the same way whether they're part of a union or not,
but at least in theory the common initial sequences of two structures
could be laid out incompatibly if the compiler can prove that
they're never used as members of the same union.

If it were intended to allow pointer punning as you describe, why
doesn't the standard say so?

--
Keith Thompson (The_Other_Keith) XXXX@XXXXX.COM < http://www.yqcomputer.com/ ~kst>
San Diego Supercomputer Center <*> < http://www.yqcomputer.com/ ~kst>
We must do something. This is something. Therefore, we must do this.
 
 
 


Post by Robert Gam » Sat, 16 Sep 2006 08:56:35

Jeff W. Boote wrote:

The gist of your question seems to be whether it is allowed to access
the members of one structure type through a pointer to a different
structure type. In general the answer is no (I do not agree with Doug's
statement elsethread about this). There is a "special guarantee" made
in 6.5.2.3p5 which allows the *inspection* of any of the common initial
members of several structures when they are all members of a union whose
complete declaration is visible.

The POSIX and X/Open extensions define certain behavior of the C
language which is unspecified or undefined by the ISO C Standard. For
example, XSI requires that a function pointer may be safely stored in a
void pointer which is a requirement of the dlsym() function. I am not
aware of any place where all such language extensions provided by POSIX
are enumerated although such a list would be useful.

The POSIX specification clearly states that the common initial members
of the different sockaddr* structures may be accessed through a pointer
to one of the other protocol-specific structure types but it does not
appear to specify how this is legally implemented, i.e. I can't tell if
the ISO C guarantee is extended to include structures that are not all
members of a union or if this must be implemented by placing all such
structures into a union. The GNU C library uses the union method in
socket.h, I don't know if FreeBSD uses a similar method.

Robert Gamble

 
 
 


Post by Jeff W. Bo » Sun, 17 Sep 2006 01:52:43


Yes, you stated the gist of the question much more succinctly than I was able.

My biggest confusion is in trying to understand one of the bullets in the
'aliasing' paragraph in the standard C99.6.5p7 - bullet 5:

an aggregate or union type that includes one of the aforementioned types among
its members (including, recursively, a member of a subaggregate or contained
union), or

What is this bullet saying? The 'union' portion makes sense. I believe it is
saying that an alias is valid if done through a union. But, the aggregate
portion is confusing for me. The standard defines an aggregate as an array or a
structure. So, if we consider this bullet with regard to a structure - it is not
clear to me what it is saying.

Unless it is just saying that an int pointer can point at an int that is
contained within a structure?

I unfortunately agree with your conclusion.

Therefore, I plan on creating a union of all these types in my code.

Unfortunately, if I don't change the API - I will need to add assignments inside
all of my functions to copy the sockaddr object into the union. This is probably
not too costly because these are not the kinds of functions that are called many
times - but it is kind of ugly. I guess this is the price that is paid for
maintaining legacy APIs.
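
A rough sketch of what that plan could look like, assuming a POSIX system. The names sockaddr_any, sockaddr_any_load and is_ipv6_loopback are hypothetical. The legacy 'struct sockaddr *' signature is kept, and the caller's bytes are copied into a local union before any field is read.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical union of the address types this code cares about. */
union sockaddr_any {
    struct sockaddr     sa;
    struct sockaddr_in  sin;
    struct sockaddr_in6 sin6;
};

/* Copies salen bytes from src into *dst.
 * Returns 0 on success, -1 if salen cannot be a supported address. */
static int
sockaddr_any_load(union sockaddr_any *dst, const struct sockaddr *src,
    socklen_t salen)
{
    assert(dst != NULL && src != NULL);
    if (salen < (socklen_t)sizeof dst->sa || salen > (socklen_t)sizeof *dst)
        return -1;
    memset(dst, 0, sizeof *dst);
    memcpy(dst, src, salen);
    return 0;
}

/* Example of a function that keeps the legacy signature:
 * returns 1 if sa is the IPv6 loopback address, 0 otherwise. */
static int
is_ipv6_loopback(const struct sockaddr *sa, socklen_t salen)
{
    union sockaddr_any u;

    if (sockaddr_any_load(&u, sa, salen) != 0)
        return 0;
    /* All accesses from here on go through the local union. */
    if (u.sa.sa_family != AF_INET6 || salen < (socklen_t)sizeof u.sin6)
        return 0;
    return IN6_IS_ADDR_LOOPBACK(&u.sin6.sin6_addr) ? 1 : 0;
}
```

Only the copy helper touches the caller's pointer, and it does so with memcpy, so each public function needs just one extra call at the top. Functions that return an address to the caller would need a matching copy-out step, which is not shown here.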

Thanks,
jeff
 
 
 


Post by Robert Gam » Sun, 17 Sep 2006 03:22:42


It says that an object may have its value accessed through an lvalue of
a structure, union, or array type which has one of the aforementioned
types as a member. For example:

#include <assert.h>

int main(void)
{
    struct {
        int i;
    } s1, s2;

    int *pi = &s1.i;

    s1.i = 1;
    s2.i = 2;

    /* The int object pointed to by pi is accessed (and modified)
       by the structure as a whole. */
    s1 = s2;
    assert(*pi == 2); /* Always true */
}

In the above example the int object was not modified through the
pointer or the structure member but by the structure as a whole.
Without this clause the last statement would not have to be true
because of optimizations that could occur due to the aliasing rules.
The rule similarly applies to arrays, unions, structures containing
arrays, etc.

Robert Gamble
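
As a sketch of the "structures containing arrays" case mentioned above (the names block and copy_and_read are made up for illustration): the structure assignment below accesses every int in dst->data through an lvalue of type struct block, so the compiler has to assume it may change the int that p points to.

```c
#include <assert.h>

/* Hypothetical structure that contains an int only indirectly,
 * as an element of a member array. */
struct block {
    int data[4];
};

/* Assigns *src to *dst as a whole, then reads an int through p.
 * If p points into dst->data, the value read is the newly copied one. */
static int
copy_and_read(struct block *dst, const struct block *src, const int *p)
{
    assert(dst != 0 && src != 0 && p != 0);
    *dst = *src;    /* whole-structure access covers each contained int */
    return *p;
}
```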