[PATCH -mm 1/9] unshare system call: system call handler function

Post by ebieder » Sat, 17 Dec 2005 05:00:31


JANAK DESAI < XXXX@XXXXX.COM > writes:


If it isn't legal, how about we deny the unshare call?
Then we can share this code with clone.

Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to XXXX@XXXXX.COM
More majordomo info at http://www.yqcomputer.com/
Please read the FAQ at http://www.yqcomputer.com/
 
 
 


Post by ebieder » Sat, 17 Dec 2005 06:20:11

JANAK DESAI < XXXX@XXXXX.COM > writes:



I follow but I am very disturbed.

You are leaving CLONE_NEWNS to mean you want a new namespace.

For clone, CLONE_FS unset means generate an unshared fs_struct;
CLONE_FS set means share the fs_struct with the parent.

But for unshare, CLONE_FS unset means share the fs_struct with others,
and CLONE_FS set means generate an unshared fs_struct.

The meaning of CLONE_FS between the two calls is now flipped,
but CLONE_NEWNS is not. Please let's not implement it this way.
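
A minimal sketch of the inversion described above. The DEMO_ flag macros and the three helper names are illustrative assumptions, not kernel code; the bit values mirror the kernel's clone flags but are redefined so the sketch stands alone. "Shared" here means "possibly still shared after the call".

```c
#include <stdbool.h>

/* Illustrative flag bits (assumed names; values mirror the clone flags). */
#define DEMO_CLONE_FS    0x00000200UL
#define DEMO_CLONE_NEWNS 0x00020000UL

/* clone(): CLONE_FS set means the child shares fs_struct with the parent. */
static bool fs_shared_after_clone(unsigned long flags)
{
	return (flags & DEMO_CLONE_FS) != 0;
}

/* unshare() as posted: CLONE_FS set means "generate an unshared fs_struct",
 * so the same bit gives the opposite answer. */
static bool fs_shared_after_unshare_as_posted(unsigned long flags)
{
	return (flags & DEMO_CLONE_FS) == 0;
}

/* CLONE_NEWNS is not flipped: set means a new, unshared namespace in
 * both calls. */
static bool ns_shared_after_either_call(unsigned long flags)
{
	return (flags & DEMO_CLONE_NEWNS) == 0;
}
```

Passing the same flag word to the first two helpers always yields opposite results, while the third behaves identically for both calls, which is the asymmetry being objected to.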

Part of the problem is the double negative in the name, leading
me to suggest that sys_share might almost be a better name.

So please don't invert the meaning of the bits in the code. This will
allow sharing of the sanity checks with clone.

In addition, this leaves open the possibility that routines like
copy_fs, properly refactored, can be shared between clone and unshare.
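
As a rough sketch of what sharing the sanity checks could look like if the bits keep one meaning: a single validation routine called from both entry points. The names check_clone_flags, demo_clone_entry and demo_unshare_entry are hypothetical, the DEMO_ macros are stand-ins for the real flags, and the three rules are modeled on checks clone applies; treat the exact rule set as an assumption.

```c
#include <errno.h>

/* Illustrative flag bits (assumed names; values mirror the clone flags). */
#define DEMO_CLONE_VM      0x00000100UL
#define DEMO_CLONE_FS      0x00000200UL
#define DEMO_CLONE_SIGHAND 0x00000800UL
#define DEMO_CLONE_THREAD  0x00010000UL
#define DEMO_CLONE_NEWNS   0x00020000UL

/* Hypothetical helper: one validity check used by both entry points. */
static int check_clone_flags(unsigned long flags)
{
	/* A new namespace cannot be combined with a shared fs_struct. */
	if ((flags & (DEMO_CLONE_NEWNS | DEMO_CLONE_FS)) ==
	    (DEMO_CLONE_NEWNS | DEMO_CLONE_FS))
		return -EINVAL;

	/* A thread group must share signal handlers. */
	if ((flags & DEMO_CLONE_THREAD) && !(flags & DEMO_CLONE_SIGHAND))
		return -EINVAL;

	/* Shared signal handlers require a shared VM. */
	if ((flags & DEMO_CLONE_SIGHAND) && !(flags & DEMO_CLONE_VM))
		return -EINVAL;

	return 0;
}

/* Both entry points reject an illegal combination instead of silently
 * rewriting the flag word. Only the validation step is modeled here. */
static int demo_clone_entry(unsigned long flags)
{
	return check_clone_flags(flags);
}

static int demo_unshare_entry(unsigned long flags)
{
	return check_clone_flags(flags);
}
```

With the bit meanings kept identical, a flag word that clone refuses is refused by unshare for the same reason, and a new rule added to the helper applies to both.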


Eric

Post by ebieder » Sat, 17 Dec 2005 07:40:12

Jamie Lokier < XXXX@XXXXX.COM > writes:


Internally I doubt it would make much difference. There are
real differences between modifying current and copying from current.
Mostly it is ref counting, but just enough that CLONE_SELF
is unlikely to be a sane thing to do.

Of course we could always implement spawn. The syscall with
every possible option :)

Eric

Post by ebieder » Sat, 17 Dec 2005 21:50:17

JANAK DESAI < XXXX@XXXXX.COM > writes:


Right but that leaves that data shared. It is a bit challenging
describing the bits in a way that makes sense from both the
clone and the unshare perspective.

What I see as fundamental is that after the syscall a resource
may be in one of two states:
- possibly shared with others
- definitely not shared with others

In the clone case the sharing is guaranteed unless your
parent has just exited. But if you don't count your parent
then clone and unshare should look alike.


Carefully note that CLONE_NEWNS behaves differently than most
of the other clone flags because by default the NS stays shared.


So you are having clone and unshare do opposite things.

If that is the intention, you are broken with respect to CLONE_NEWNS.
In your implementation,
clone(NEWNS) implies you don't share struct namespace and
unshare(NEWNS) implies you don't share struct namespace.


Another way to think of it: one way to implement unshare is
to call clone with the appropriate flags and to have the parent
exit. Not counting pids and the process tree, this should
give identical results to unshare.

If I called sys_unshare(0) without changing the meaning of
the bits I would expect my VM to be unshared, my FS state
to be unshared, my file descriptors to be unshared, my signal
handling to be unshared, my tgid to be unshared, to continue
sharing my namespace, my sysv semaphore undo semantics to be
unshared, and my thread local storage to be unshared.

Basically I expect sys_unshare(0) to take a thread and turn
it into a full process by default. That preserves the current
defaults about which things should be shared and which things
should be unshared. Basically I expect fork() unshare(0) to
be a noop, but not clone(...) unshare(0).
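
The defaults argued for above can be sketched as a small model. The helper names may_stay_shared and count_possibly_shared and the DEMO_ macros are assumptions for illustration only. Thread local storage is left out because it is not modeled as a sharing bit here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag bits (assumed names; values mirror the clone flags). */
#define DEMO_CLONE_VM      0x00000100UL
#define DEMO_CLONE_FS      0x00000200UL
#define DEMO_CLONE_FILES   0x00000400UL
#define DEMO_CLONE_SIGHAND 0x00000800UL
#define DEMO_CLONE_THREAD  0x00010000UL
#define DEMO_CLONE_NEWNS   0x00020000UL
#define DEMO_CLONE_SYSVSEM 0x00040000UL

static const unsigned long demo_resources[] = {
	DEMO_CLONE_VM, DEMO_CLONE_FS, DEMO_CLONE_FILES, DEMO_CLONE_SIGHAND,
	DEMO_CLONE_THREAD, DEMO_CLONE_NEWNS, DEMO_CLONE_SYSVSEM,
};

/* True if the resource named by `bit` may remain shared after the call,
 * using the same bit meanings as clone. */
static bool may_stay_shared(unsigned long flags, unsigned long bit)
{
	/* The namespace is the exception: it stays shared unless asked for. */
	if (bit == DEMO_CLONE_NEWNS)
		return (flags & bit) == 0;

	/* Every other resource is unshared unless its bit is set. */
	return (flags & bit) != 0;
}

/* Count how many modeled resources are left possibly shared. */
static size_t count_possibly_shared(unsigned long flags)
{
	size_t i, n = 0;

	for (i = 0; i < sizeof(demo_resources) / sizeof(demo_resources[0]); i++)
		if (may_stay_shared(flags, demo_resources[i]))
			n++;
	return n;
}
```

Under this model a flag word of 0 leaves only the namespace possibly shared, matching the expectation that sys_unshare(0) turns a thread into a full process, and the same flag word means the same thing when passed to clone.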

Just using the bits as resource identifiers and not preserving
the default share/unshare status leads to confusion for
implementors because the set of legal bits is different
in the two cases, and it is confusing to users because they
can't just take bits passed into clone and get the same results
passing those bits into unshare.

They are different in that unshare does not create a new process,
and the difference between operating on a new process versus
your current process should be the only difference, not something
that is lost in the noise of giving fresh meaning to the bits.

The core expectation is that unshare(CLONE_NEWNS) will give
a new namespace. Currently to meet that expectation you are
changing the set of bits internally, instead of erring about
a bad set of bits being specified. If however you leave the
defaults to unshare everything but the namespace you don't
have to toggle the bits to meet the expectation of what will
work, and you don't have to change the meaning of the bits.


Of the two names I agree that unshare is more appropriate; however,
because we have double negatives things get confusing. That was
the point in mentioning this strawman. For my vision of a syscall
that unshared practically everything by default I think unshare
is clearly a better name.


We clearly cannot reuse these copy_* functions as is.

I hadn't looked closely yet. But let's see.

static int share_fs(unsigned long clone_flags, struct fs_struct **fsp)
{
	struct fs_struct *fs = current->fs;
	if (clone_flags & CLONE_FS) {
		atomic_inc(&fs->count);
	}
 
 
 


Post by ebieder » Sat, 17 Dec 2005 22:00:18

Jamie Lokier < XXXX@XXXXX.COM > writes:


Why? All it requires is that whenever someone updates clone they update
unshare. Given the tiniest bit of refactoring we should be
able to share all of the interesting code paths.

Which should also improve maintenance considerably.

Eric

Post by ebieder » Sun, 18 Dec 2005 11:30:07

Jamie Lokier < XXXX@XXXXX.COM > writes:



The only way I can see to confuse unshare is to add a clone
flag and not implement it in unshare. If there is enough
in common between the implementations I don't see that being
a problem.

Eric