I have some ideas, which I think:
- Wouldn't be too hard to implement,
- And would give Linux a distinctive advantage over competing
platforms such as Solaris or Windows, when executing Java or .NET code
- Enable some user processes to execute user code in kernel mode. This
would reduce the need for context switches, especially for programs
which spend a great deal of time making system calls.
- However, this could threaten system stability if the user process'
code had bugs in it. So the functionality ought to be restricted only
to trusted processes.
So lets define a new capability: CAP_KERNELMODE meaning the process
can enter kernel mode.
New system call: syskernelmode(int flag). Provides a means of a user
process to enter/exit kernel mode execution. Flag turns kernel mode
on/off. Only can be called by processes with the CAP_KERNELMODE
Mono & Java
The kernel of an operating system provides isolation and protection
between processes, partially since traditional programming languages
such as C provide no such protection. However, some newer languages
such as Java and Mono/CLI/.NET do. Thus, for programs written in such
a system, it should be safe (assuming the security and safety
mechanisms provided by the language are sufficiently complete and the
runtime is bug-free) to run them in kernel mode, without the OS'
protection mechanisms being exercised, relying instead on the
guarantees of safety provided by the compiler.
Also, in both Java and the CLI, the ability is provided to develop
software in a mixture of Java/C#/whatever, and traditional programming
languages such as C, which do not provide the same safety guarantees.
Thus the same process may contain some code which we want to allow to
run in kernel mode, and some code which we don't.
So, I would propose controlling the ability to run in kernel mode on a
page-by-page basis. So with each page, the following flag would be
associated or not: KERNELTRUST, indicating the code contained in this
page is trusted to run in kernel mode. Now, JITed bytecode can be put
in KERNELTRUST pages, and allowed to run in kernel mode by calling
syskernelmode; but native code pages will not have KERNELTRUST, and
attempts by them to call syskernelmode will fail.
A few other considerations:
- As well as CAP_KERNELMODE capability, we should have another one
CAP_KRNLMODEBYPAGE, indicating that the CAP_KERNELMODE capability
extending only to KERNELTRUST pages of a process.
- KERNELTRUST pages should all be set read/only, so native code cannot
One problem is preventing native code from jumping to the middle of a
function in a KERNELTRUST page; it should only be able to reach them
through valid entry points. One solution might be to have a
syskerneljump(void * address) system call, which enters kernel mode
and jumps to address, but only if address is declared in executable
file to be a valid entry point in a KERNELTRUST page. Then, a
KERNELTRUST page would only be privileged if entered via
syskerneljump, which checks that a valid entry point is being called.
Finally, attributes can be added to functions in the programming
language (Java or C#)
* AlwaysKernelMode function should always execute in kernel mode; if
called in user mode, context switch to kernel mode.
* AlwaysUserMode function should always execute in user mode; if
called in kernel mode, context switch to user mode.