char **argv & char *argv[]


Post by jab3 » Sun, 05 Dec 2004 12:51:35


(again :))

Hello everyone.

I'll ask this even at risk of being accused of not researching adequately.
My question (before longer reasoning) is: How does declaring (or defining,
whatever) a variable **var make it an array of pointers?

I realize that 'char **var' is a pointer to a pointer of type char (I hope).
And I realize that with var[], var is actually a memory address (or at
least as it is represented by C, IIRC (an internal copy which is a fixed
pointer)) pointing (permanently) to the first element of an array. And I
realize that *var[] is an array of pointers where each pointer can point to
the beginning of a string (or whatever). But then there is **var. How
does that then become an array of pointers?

Hmmm. It's coming to me. Wait. So we declare 'char **var'. *var
is/contains a memory address of size char, which could point to the
beginning of a string (**var). (?) *var+1 would be the next char memory
address, which could point to a string (*(*var+1)) (moving char bytes
through memory (1)). That's my hangup. How is the **argv structure
formed? Are the arguments added, then memory allocated for that many, then
dividing them up across the argv variable? Because I've learned to accept
that **argv points to a string, *(*argv+1) points to the next, etc. (Are
the () necessary? Am I even right?) But how does it get that way? I feel
like I'm almost to a satori experience with this aspect of pointers (which
would be nice :)), but there's something holding me back (my mind maybe?).
I think I just need to get a grasp of the mechanics behind the creation of
argv. (Don't ask; whenever I'm studying pointers I get stuck on these
issues and I can't stop thinking about how, so I become unable to wrap my
head around it)

Where does the program store the arguments before putting them in argv? Is
there a buffer it puts each argument in, then copies it into argv? It's
driving me crazy. (Similar to how passing a pointer to printf with %s
(char *str = "Confused";printf("%s", str);) is the same as a string. How
does it (the compiler, program, ??) know? I then figured when it receives
the memory address, expecting a string, it dereferences the pointer,
traversing it until it gets a '\0'? Close?) Although it actually just hit
me that if I were to pass a normal string variable (char str[6] = "idiot")
as 'printf("hello, %s", str)' then str is actually a pointer to the first
element of str[6]. Ahhhh... :)

I realize that perhaps the argv example is implementation specific and not
topical. Perhaps you could imagine a similar situation, i.e. passing a
**var in a function that is in fact an array of pointers. Is the **var
construction often used without being an array of pointers? Also, why is
it technically more accurate to define argv as **argv and not *argv[]?
(according to a book I have, Linux Programming by Example).


Please excuse the rambling. I know I'm not being very clear. There's a
reason for that; hence the post :). Thanks for any help or guidance, and
patience.

-jab3
 
 
 

char **argv & char *argv[]

Post by Yan » Sun, 05 Dec 2004 13:48:34

(Those who see errors in my answers, please point them out.)

jab3 wrote:

In the declaration 'char var[];' var, when used by itself in an
expression, is converted to a pointer to the first element, and when you
access the first or second element your compiler even turns that var[0]
into *(var+0) and var[1] into *(var+1), literally using the number you
gave it as the offset. So you can think of var[] as being sort of equal
to *var, so *var[] is sort of equal to **var. I don't know if I'm making
much sense, but check out K&R's book; it explains it well.
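
A minimal sketch of the var[i] versus *(var+i) equivalence described above. The function name same_by_index_and_pointer is made up for illustration; it only checks that both spellings reach the same address and the same value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if var[i] and *(var + i) agree
 * for every i in [0, n), and 0 otherwise. */
int same_by_index_and_pointer(const char *var, size_t n)
{
    size_t i;

    assert(var != NULL);
    for (i = 0; i < n; i++) {
        if (&var[i] != var + i)      /* same address ... */
            return 0;
        if (var[i] != *(var + i))    /* ... so same value */
            return 0;
    }
    return 1;
}
```

Note that the array itself is still not a pointer: for char word[] = "hello", sizeof word is 6, not the size of a char *. Only the value the array produces in an expression is a pointer.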


Yeah, they are necessary, because + has lower precedence than unary *.


really check out K&R's book and go to the chapter on pointers, it's
really of great help


When a process is created (at least under Unix), the operating system
typically puts your environment variables, your arguments, and your
count of args on the new process's stack before your program starts
running. The startup code then picks up the count and the args and
passes them to main(). All of that is done for you by the operating
system and the C runtime.
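
To make the mechanics less mysterious, here is a sketch of how an argv-like structure could be built by hand. This is not what any particular operating system does; build_argv and its parameters are hypothetical. It splits a command line in place and records where each word starts, so the "array of pointers" is just a table of addresses into one buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: split `line` in place on spaces and store a
 * pointer to the start of each word in argv_out. The table is ended
 * with a null pointer, as the real argv is. Returns the word count. */
int build_argv(char *line, char **argv_out, int max_args)
{
    int argc = 0;
    char *p = line;

    assert(line != NULL && argv_out != NULL && max_args >= 1);

    while (*p != '\0' && argc < max_args - 1) {
        while (*p == ' ')                /* skip separators */
            p++;
        if (*p == '\0')
            break;
        argv_out[argc++] = p;            /* remember where this word starts */
        while (*p != '\0' && *p != ' ')
            p++;
        if (*p == ' ')
            *p++ = '\0';                 /* terminate the word in place */
    }
    argv_out[argc] = NULL;               /* argv[argc] is a null pointer */
    return argc;
}
```

With char line[] = "prog -v file.txt" and char *av[8], a call build_argv(line, av, 8) leaves av[0] pointing at "prog", av[1] at "-v", and av[2] at "file.txt". Passing av to a function that takes char ** then gives that function exactly the shape main() sees.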


Any string that's in quotes in a C program typically gets stored in a
read-only part of your program when it's running, so for the line:

char *str = "Confused";

the string is already sitting in (usually read-only) memory, and the
line just assigns that address to str, which is why you can't change
strings like that (trying to is undefined behavior). When you call
printf() with that str pointer as one of the args, it simply goes to
that location in memory and reads it.


yup that's how c does strings

Now, saying:

char str[6] = "idiot";

is different from what I said above, since in that statement you declare
an array of chars of length 6 and initialize it with the string "idiot"
(read: writeable memory). That statement is equivalent to:

char str[6] = { 'i', 'd', 'i', 'o', 't', '\0' };



So, as I said above, str by itself is just a pointer to the location of
the first char, just as it was with the constant string and just as it
was with the array I mentioned first thing in this response, so to the
printf statement it pretty much looks like the same thing.
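
A small sketch of why the two cases look the same to printf. The function walk_to_nul is a made-up stand-in for what a "%s" conversion has to do: start at the address it was given and read chars until it reaches '\0'.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for what "%s" must do: walk from `s` to the
 * terminating '\0' and report how many chars came before it. */
size_t walk_to_nul(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    while (s[n] != '\0')
        n++;
    return n;
}
```

Called with a pointer to the literal "Confused" it returns 8; called with the array char str[6] = "idiot" it returns 5. In both calls the function only ever receives the address of the first char. The difference is on the caller's side: the array's elements may be modified, the literal's may not.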


Also, why is

It's more accurate because, in your program, argv is exactly that: a
pointer to a pointer. The first dereferencing (*argv) gives the pointer
to where the first string is (in argv's case, your program's name), and
the next dereferencing (**argv) gives the actual first letter of the
first string. **(argv+1), i.e. argv[1][0], gives the first letter of the
second string, etc. Note that *(*argv+1) is something else: it is the
second letter of the first string.
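
A sketch of what each spelling actually reaches, since the placement of the parentheses decides whether you step to the next string or to the next character. The two function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* First character of the i-th string: **(argv + i), i.e. argv[i][0]. */
char first_char_of_arg(char **argv, int i)
{
    assert(argv != NULL && argv[i] != NULL);
    return **(argv + i);
}

/* n-th character of the first string: *(*argv + n), i.e. argv[0][n]. */
char char_in_first_arg(char **argv, int n)
{
    assert(argv != NULL && *argv != NULL);
    return *(*argv + n);
}
```

Adding 1 to argv moves one pointer along the table of strings; adding 1 to *argv moves one char along the first string.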


 
 
 

char **argv & char *argv[]

Post by CBFalcone » Sun, 05 Dec 2004 14:27:09


It doesn't. It makes it a variable holding a pointer to some other
type of pointer. The confusion arises because this is exactly what
you get when you pass an array of those pointers to a function. A
passed array is represented by a pointer to its zeroth element.
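
A sketch of both uses of a char ** parameter, with made-up function names. In count_strings the caller passes an array of pointers and the function receives a pointer to its zeroth element. In redirect there is no array at all: the char ** simply points at a single char * variable so the function can change it.

```c
#include <assert.h>
#include <stddef.h>

/* The caller may pass an array of char *; what arrives here is a
 * pointer to that array's zeroth element. Counts entries up to the
 * NULL sentinel. */
int count_strings(char **list)
{
    int n = 0;

    assert(list != NULL);
    while (list[n] != NULL)
        n++;
    return n;
}

/* Here the char ** points at one char * variable, not into an array. */
void redirect(char **target, char *replacement)
{
    assert(target != NULL);
    *target = replacement;
}
```

The type is the same in both functions; whether it "is an array of pointers" depends entirely on what the caller made it point to.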

--
Chuck F ( XXXX@XXXXX.COM ) ( XXXX@XXXXX.COM )
Available for consulting/temporary embedded and systems.
< http://www.yqcomputer.com/ > USE worldnet address!
 
 
 

char **argv & char *argv[]

Post by Chris Tore » Tue, 07 Dec 2004 07:12:37

>jab3 wrote:

The short answer is, "it does not".


More precisely, "char **var" declares "var" as a variable of *type*
"pointer to pointer to char". Whether "var" actually points to
anything at all (much less "anything useful") is up to you, the
programmer.


This is also wrong, or at least, not quite right. :-)

There are some "gotchas" with array declarations that do not occur
with pointers, so we have to start adding more context. If we write,
for instance:

int arr1[8] = { 1, 2, 3, 0 };

outside of a function, or inside a block, we have both declared
and defined "arr1" as a variable of type "array 8 of int" (to use
the "cdecl" program's syntax). Because we initialized the array,
we can omit the size, and have the compiler figure it out:

int arr2[] = { 1, 2, 3, 0 };

but now we get an "array 4 of int", because we only used four
initializers.
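
A sketch that checks the two sizes, using the same arr1 and arr2 as above. The COUNT_OF macro and the accessor functions are illustrative names, not anything standard.

```c
#include <assert.h>
#include <stddef.h>

#define COUNT_OF(a) (sizeof (a) / sizeof (a)[0])

int arr1[8] = { 1, 2, 3, 0 };   /* size given: 8 elements, rest zeroed */
int arr2[]  = { 1, 2, 3, 0 };   /* size omitted: 4 initializers, 4 elements */

size_t arr1_count(void) { return COUNT_OF(arr1); }
size_t arr2_count(void) { return COUNT_OF(arr2); }

/* Bounds-checked read, to show the elements past the initializers. */
int arr1_get(size_t i)
{
    assert(i < COUNT_OF(arr1));
    return arr1[i];
}
```

COUNT_OF only works where the name still has array type; applied to a function parameter it would divide the size of a pointer instead.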

On the other hand, we have a peculiar feature of the C language in
which function parameters that *look like* arrays are actually
declared as pointers. If we write:

void somefunc(char s[]) {
    /* code */
}

the compiler is obligated to pretend that we actually wrote:

void somefunc(char *s) {
    /* code */
}

That is, the local-variable "s" within the function somefunc() has
type "pointer to char", rather than "array MISSING_SIZE of char".
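
One way to see the adjustment for yourself, as a sketch: inside the function below, taking the address of s yields a char **, which is only possible if s is a pointer. If s were really an array, &s would have a pointer-to-array type and the initialization would not compile cleanly. The name size_seen_inside is made up.

```c
#include <assert.h>
#include <stddef.h>

/* Declared with array notation, but s has type char * in here. */
size_t size_seen_inside(char s[])
{
    char **pp = &s;      /* fine only because s is a char * */

    assert(*pp == s);
    return sizeof *pp;   /* size of a char *, whatever was passed in */
}
```

Passing a char big[100] to this function returns sizeof(char *), while sizeof big in the caller is still 100.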

The reason for this peculiar feature has to do with what I call
"The Rule" about arrays and pointers in C, combined with the fact
that C passes arguments by value. For (much) more about The Rule,
see <http://web.torek.net/torek/c/pa.html>.

Except for some new features in C99 intended for optimization,
there is never any reason you *have* to use the array notation to
declare formal parameter names in function definitions, and I
encourage programmers to use the pointer notation, so that the
declaration is not misleading: since "s" inside somefunc() has type
"char *", we should all declare it as "char *" in the first place.

Ever since the C89 standard came out, something peculiar happens
if we write:

int arr3[];

outside a function. This is a "tentative definition" of the
array "arr3", and if we reach the end of a translation unit (roughly,
"C source file") without coming across any more details for arr3[],
it acts as if we had written:

int arr3[1] = { 0 };

On the other hand, though, if we try to use empty square brackets
*inside* a function (not as a parameter but inside the {}s):

void wrong(void) {
    int arr4[]; /* ERROR */
    /* more stuff */
}

we have done something wrong. Empty square brackets are not allowed
here.

Finally, C99 has something called a "flexible array member" of
structures, which we can ignore for now, but does give you one more
place where you can write empty square brackets and have it mean
something special.

All of these are just things you have to memorize -- quirks about
C that "are just the way they are": not for any particular reason
other than that Dennis Ritchie and/or the C standards folks said
so. They all make it a little more tricky to talk about arrays in
C.


If it is indeed an array at all -- for instance, if we write:

char *arr5[100];

either outside or inside a function (not as a parameter to a
function), then arr5 has type "array 100 of pointer to char", and
each of those 100 "pointer to char"s can point to the first of a
 
 
 

char **argv & char *argv[]

Post by jab3 » Thu, 09 Dec 2004 12:31:47

Chris Torek graciously wrote on Sunday 05 December 2004 05:12 pm:


So I hear :).


Is "var" an identifier, an object, an lvalue, or a variable? :) Seriously,
it could also be a value in certain contexts I see, but what is the
situation with lvalue and variable and identifier and object? I see in
K&R2 that an object is a "named region of storage," and an lvalue is an
"expression referring to an object." (197; I don't have the C Standard)
Then it says an identifier is a sequence of letters and digits. (192) Then
in your C for Smarties (which is good BTW; I'll have to digest it some
more), at first you say lvalues and objects are the same (the former being
ISO's term and the latter yours, sort of :)), but then you clarify it by
saying that an lvalue names an object, which is how I see K&R2. Then you
say variables are the best examples of objects. Is that the name or the
location/storage? And where do identifiers fit in? :) Am I right to think
of objects, strictly speaking, as the hardware location of 'stuff'? And
the lvalue is the name I've given that 'stuff,' for instance char stf[] =
"blah". 'stf' is the lvalue, and its location in memory is the object? I
think the more I type the more I confuse myself :).


That IIRC comment was an attempt at remembering your article about "The
Rule" I read a couple of months ago, but I didn't have the
conceptual...framework to process it (I didn't read the previous 3 articles
about types and objects and values and contexts, etc. then) Oh well.


Ahh...That's why "The Rule" takes effect. The argument is in a value
context, and C stipulates that the value of an array is a pointer to its
first element, so "The Rule" happens (close?).


Ah, I see. (At least I think I do. Right now. Tonight :))


(BTW, what _is_ a translation unit? I see it used in the K&R2 Appendix A,
and I see it here, but I couldn't find that K&R2 defined what it meant.
They just say "a program consists of one or more _translation units_ stored
in files." (191) Granted, I haven't made it through the book yet. Just
skipped to that Appendix :))

So why is it wrong to declare an 'incomplete' type inside a function?


Umm...I think that's what I meant :).



Yeah, that's what I didn't understand before this reply and further reading
on your site. I had forgotten that "The Rule" is something that happens in
certain situations; not something that is persistent. Right? I mean,
let's say a function is called with a parameter of (char *str) but the
argument passed is "char a_str[20]". So inside of the function, a_str
'becomes' a pointer to char, the first element specifically. So then when
the function is over, is the pointer destroyed?


This reminds me of scalar and list context in Perl. Sort of. :) Not as far
as what each context means, but just the different contexts and how a
'variable' behaves/is treated differently based on how it is being used. I
can get that, for the most part; I'm sure there are tricky ones. But that
still doesn't clarify my confusion over objects, lvalues, identifiers, and
variables.

For instance, what is an example of an object that is not named? The
pointer produced by "The Rule?"


Everything between this and my last comment I'll have to read some more and
think about some more. I'm getting it, but you know. (It's getting
late....for me; work comes early at 6:15am) But anyway, that was a lot of
good stuff. :)
 
 
 

char **argv & char *argv[]

Post by Chris Tore » Thu, 09 Dec 2004 17:15:13

I do not have time to answer all of this now, but I will put in two
short answers... (well, short-ish; they got longer than I expected.)

[I wrote]

In article < XXXX@XXXXX.COM >
jab3 < XXXX@XXXXX.COM > wrote:

All three, in fact.

The name -- the three-letter sequence v, a, r -- is an identifier.
(This is a syntactic element, i.e., something the compiler uses to
figure out what you wrote. Each token is a syntactic element of
some sort; some tokens are keywords, like "char",
some are single character thingies like the '*'; some are two-character
thingies like an && operator. This particular syntactic element
is an identifier.)

The compiler must look up the identifier to see how it is declared
and/or defined. If it is defined as, for instance, a typedef-name
-- such as the ST_TYPE in:

typedef struct st ST_TYPE;

-- then it would be an identifier, but not a variable or lvalue.
But here, it has now been declared (and also defined, eventually)
as a variable:

char **var;

so it is a variable. Identifiers have a bunch of properties, such
as scopes and name-spaces, and a single identifier can actually
have multiple meanings, as in the (really awful) code:

void x(void) {
    int x;
    goto x;
y:
    x += 17;
    printf("the answer is %d\n", x);
    return;
x:
    x = 25;
    goto y;
}

Here the single identifier "x" has three different meanings: it is
the name of the function x(), it is the name of a variable of type
int also called x, and it is a goto-label just like "y". (Yuck!)

C99 has kind of mucked up the word "lvalue", which was pretty well
defined in C89; but it is safe to say that all ordinary variables
are lvalues. Even array variables are still lvalues, except that,
confusingly enough, they are "non-modifiable" lvalues. (The term
lvalue dates back to compiler guys saying "the thing on the left
of an assignment", so if you cannot put an array on the left of an
assignment -- because the array is not modifiable -- then why call
it an lvalue at all? Probably it was a bad idea, just like us
USAliens using the word "gas" to refer to both petrol and methane.
But, as Kurt Vonnegut wrote, so it goes.)


The kind of problem we want to solve, by using different words like
"lvalue" and "identifier" and "object", is to be able to talk about
what *p or p[i] means when p has a value from malloc():

char *p;

p = malloc(len + 1);
if (p == NULL) ... handle error ...
strcpy(p, str);

The strcpy() writes on various p[i]'s, e.g., setting p[0] to 'h'
and p[1] to 'e' and so on to put "hello world" into it. These
p[i]'s must be storage, but it is, at least in how we can talk
about it, *different* from that for, e.g.:

char buf[100];
p = &buf[0];
strcpy(p, str);

because in this second case we know that p[0] is the same thing as
buf[0], and so on. When the memory comes from malloc(), p[0] has
no other name like buf[0] -- but it is still memory; it can still
hold values. I call p[0] an object (and so do both C89 and C99).
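
A compilable sketch of the malloc() case, with the error handling filled in under one assumption: the hypothetical helper copy_to_heap reports failure by returning NULL and leaves freeing to the caller.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy `str` into freshly allocated storage.
 * Returns NULL if malloc() fails; the caller must free() the result. */
char *copy_to_heap(const char *str)
{
    char *p;

    assert(str != NULL);
    p = malloc(strlen(str) + 1);   /* +1 for the terminating '\0' */
    if (p == NULL)
        return NULL;
    strcpy(p, str);                /* writes p[0], p[1], ... */
    return p;
}
```

The chars reached through the returned pointer have no name of their own; p[0] is the only way to refer to the first one. With char buf[100] and p = &buf[0], by contrast, p[0] and buf[0] are two spellings for the same object.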


15 is not an object, it is just a value. Objects hold values (or
hold garbage); values are the things you stick into objects. The
name "i" is an identifier that, in this case, names the object;
the C standards (both C89 and C99) say it is indeed an lvalue.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City,