GFDL:Microkernel

A microkernel is a minimal form of computer operating system kernel providing a set of primitives, or system calls, to implement basic operating system services such as address space management, thread management, and inter-process communication. All other services, those normally provided by the kernel such as networking, are implemented in user-space programs referred to as servers.

Later extensions of this concept led to new architectures such as nanokernels, exokernels and Hardware Abstraction Layers.

Background
One innovation of the Unix operating system is the use of a large number of small programs that can be strung together to complete a task with a pipe, as opposed to using a single larger program that includes all of the same functionality. The result is more flexibility and improved development; since each program is small and dedicated to a single role, it is much easier to understand and debug.

Under Unix, the "operating system" consists of many of these utilities along with the master control program, the kernel. The kernel provides services to start and stop programs, handle the file system and other common "high level" tasks that most programs share, and, perhaps most importantly, schedules access to hardware to avoid conflicts if two programs attempt to simultaneously access the same resource or device. In order to mediate such access, the kernel was given special rights on the system, which led to the division between user-space and kernel-space.

Kernel bloat
Early operating system kernels were rather small, partly because computer memories were small. As the capability of computers grew, the number of devices the kernel had to control also grew. Early versions of UNIX had kernels of quite modest size, even though those kernels contained device drivers and file system managers. When address spaces increased from 16 to 32 bits, kernel design was no longer cramped by the hardware architecture, and kernels began to grow.

Berkeley UNIX (BSD) began the era of the "big kernel". In addition to operating a basic system consisting of the CPU, disks and printers, BSD started adding additional file systems, a complete TCP/IP networking system, and a number of "virtual" devices that allowed the existing programs to work invisibly over the network.

This growth trend continued for several decades, resulting in UNIX, Linux, and Microsoft Windows kernels containing millions of lines of code. Current versions of Linux, Red Hat 7.1 for instance, contain about 2.5 million lines of source code in the kernel alone (of about 30 million in total), while Windows XP is estimated at twice that.

Inter-process communication
Microkernels tried to reverse the growing size of kernels and return to a system in which most tasks would be completed by smaller utilities. Unix attempted to model the world as files, using pipes to move data between them. In an era when a "normal" computer consisted of a hard disk for storage and a printer for input/output, this model worked quite well as most I/O was "linear". The introduction of interactive terminals required only minor "adjustments" to this model; while the display itself was no longer strictly linear, the series of interactions between the user's input and the computer's output remained fairly similar to older systems.

However, modern systems including networking and other new devices no longer seemed to map as cleanly onto files. For instance, trying to describe a window being driven by mouse control in an "interrupt driven" fashion simply doesn't seem to map at all onto the 1960s batch-oriented model. Work on systems supporting these new devices in the 1980s led to a new model; the designers took a step back and considered pipes as a specific example of a much more general concept: inter-process communication, or IPC. IPC could be used to emulate Unix-style pipes, but it could also be used for practically any other task, such as passing data at high speeds between programs. Systems generally refer to one end of the IPC channel as a port.

With IPC the operating system can once again be built up of a number of small programs exchanging data through their ports. Networking can be removed from the kernel and placed in a separate user-space program, which is then called by other programs on the system. All hardware support is handled in this fashion, with programs for networking, file systems, graphics, etc.
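
The idea of small programs exchanging data through ports can be illustrated with a minimal sketch. Everything here is hypothetical and chosen only for illustration: the names Port, NetworkServer and pipe, the message layout, and the use of an in-process queue in place of a real kernel IPC mechanism.

```python
from collections import deque


class Port:
    """One end of an IPC channel: a queue of messages awaiting a receiver."""

    def __init__(self, name):
        self.name = name
        self._queue = deque()

    def send(self, message):
        self._queue.append(message)

    def receive(self):
        # Returns None when nothing is waiting; a real kernel would
        # normally block the caller instead.
        return self._queue.popleft() if self._queue else None


class NetworkServer:
    """Hypothetical user-space networking server fed through its port."""

    def __init__(self):
        self.port = Port("network")
        self.sent_packets = []

    def run_once(self):
        request = self.port.receive()
        if request is None:
            return False
        reply_port, payload = request
        self.sent_packets.append(payload)       # stand-in for real hardware I/O
        reply_port.send(("ok", len(payload)))   # answer on the caller's port
        return True


def pipe(source_port, sink_port):
    """Emulate a Unix-style pipe by forwarding messages between two ports."""
    moved = 0
    while (message := source_port.receive()) is not None:
        sink_port.send(message)
        moved += 1
    return moved
```

A client would create its own reply port, send `(reply_port, b"hello")` to `server.port`, and read the answer from the reply port once the server has run. The pipe helper shows that the older pipe model falls out as a special case of port-to-port forwarding.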

Microkernel servers
Servers are programs like any others, although the kernel grants them privileges to interact with parts of memory that are otherwise off limits to most programs. This allows the servers to interact directly with hardware. A "pure" microkernel-based operating system would generally start a number of servers while booting: servers for handling the file system, networking, and so on.

The system then functions as if it had a full Unix kernel; the fact that the networking support is being "handed off" is invisible. Instead of a single six-million-line kernel, there is a series of smaller programs. Additionally, users can choose capabilities as needed and run only those programs, tailoring the system to their needs. For instance, an isolated machine could be instructed not to start the networking server, thereby freeing those resources. The same sort of changes to a traditional kernel, also known as a monolithic kernel or monokernel, are very difficult due to the high level of interconnectedness between parts of the system.

The role of the kernel in such a system is limited. In addition to providing basic task management (starting and stopping other programs), it provides the IPC system and security. When booting, the kernel starts up a series of servers to handle the hardware on the system, granting those servers additional rights as needed. New programs, those being started by the user, use the IPC system to access hardware, calling the kernel to pass along messages, which are first checked for rights and validity. To handle these tasks, ports introduce filesystem-like endpoints into the IPC system, complete with rights for other programs to use them. For instance, a network server would hold the write permissions to the networking hardware, and keep a number of ports open for reading to allow other programs to call it. Other programs could not take over the networking hardware without the kernel specifically granting this access, and only after the networking server agreed to give up those rights.
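
The rights check described above might look roughly like the following sketch. It is not modelled on any particular microkernel: the Kernel class, the "send" and "receive" right names, and the rule that only a port's owner may grant send rights are all assumptions made for illustration.

```python
class RightsError(Exception):
    """Raised when a task uses a port without holding the needed right."""


class Kernel:
    """Hypothetical kernel that only queues messages and checks port rights."""

    def __init__(self):
        self._queues = {}   # port name -> list of (sender, message)
        self._rights = {}   # (task, port name) -> set of rights held

    def _has(self, task, port, right):
        return right in self._rights.get((task, port), set())

    def create_port(self, owner, port):
        # The creating task (typically a server) gets both rights.
        self._queues[port] = []
        self._rights[(owner, port)] = {"send", "receive"}

    def grant_send(self, owner, port, task):
        # Assumed rule: only the holder of the receive right may grant access.
        if not self._has(owner, port, "receive"):
            raise RightsError(f"{owner} does not own {port}")
        self._rights.setdefault((task, port), set()).add("send")

    def send(self, task, port, message):
        if not self._has(task, port, "send"):
            raise RightsError(f"{task} may not send to {port}")
        self._queues[port].append((task, message))

    def receive(self, task, port):
        if not self._has(task, port, "receive"):
            raise RightsError(f"{task} may not receive from {port}")
        queue = self._queues[port]
        return queue.pop(0) if queue else None
```

In this sketch a networking server would call `create_port("netserver", "net")` and then `grant_send("netserver", "net", "browser")`; any other task that tries to send to or receive from the port is refused by the kernel before the message reaches the server.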

The "collection of servers" model offers many advantages over traditional operating systems. With the majority of code in well-separated programs, development on such a system becomes considerably easier. Developing new networking stacks on a traditional monolithic kernel requires the entire kernel to be recompiled and rebooted, and a bug can hard-crash the machine and force another reboot. With a microkernel there is less chance that an updated networking system would do anything other than inconvenience the user and require that one program be relaunched. It also offers considerably more security and stability for the same reasons. Additionally, the kernel itself becomes smaller; later versions of Mach were only 44,000 lines of code.

Additionally, many "crashes" can be corrected for by simply stopping and restarting the server. In a traditional system, a crash in any of the kernel-resident code would result in the entire machine crashing, forcing a reboot. However, part of the system state is lost with the failing server, and it is generally difficult to continue execution of applications, or even of other servers, with a fresh copy. For example, if a server responsible for TCP/IP connections is restarted, applications could be told the connection was "lost" and reconnect to the new instance of the server. However, other system objects, like files, do not have these convenient semantics: they are supposed to be reliable, to not become unavailable randomly, and to keep all the information previously written to them.

In order to make all servers restartable, some microkernels have concentrated on adding database-like techniques such as transactions, replication and checkpointing, which are used between servers to preserve essential state across single server restarts. A good example of this is ChorusOS, which was targeted at high-availability applications in the telecommunications world. Chorus included features to allow any "properly written" server to be restarted at any time, with clients using those servers being paused while the server brought itself back into its original state.
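
As a rough illustration of checkpointing across a restart, the sketch below saves a server's state after each completed request and hands it to the replacement instance. It does not reflect how ChorusOS implemented this; the CounterServer and Supervisor names, the JSON checkpoint format, and the checkpoint-after-every-request policy are assumptions.

```python
import json


class CounterServer:
    """Hypothetical server whose entire state is a dict of named counters."""

    def __init__(self, checkpoint=None):
        # A fresh server starts empty; a restarted one reloads its checkpoint.
        self.counters = json.loads(checkpoint) if checkpoint else {}

    def handle(self, name):
        self.counters[name] = self.counters.get(name, 0) + 1
        return self.counters[name]

    def checkpoint(self):
        return json.dumps(self.counters)


class Supervisor:
    """Keeps the latest checkpoint outside the server so it survives a crash."""

    def __init__(self):
        self.server = CounterServer()
        self.restarts = 0
        self._saved = None

    def call(self, name):
        result = self.server.handle(name)
        self._saved = self.server.checkpoint()   # save after each request
        return result

    def restart_server(self):
        # Clients would be paused here while the new instance restores state.
        self.server = CounterServer(self._saved)
        self.restarts += 1
```

With this arrangement a client that calls `call("a")` twice, sees the server restarted, and calls again receives 3 rather than 1, because the counters were restored from the saved checkpoint.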

Essential components of a microkernel
The minimum set of services required in a microkernel seems to be address space management, thread management, inter-process communication, and timer management. Everything else can be done in a user program, although in a minimal microkernel, some user programs may require special privileges to access I/O hardware. A few operating systems approach this ideal, notably QNX and IBM's VM.

Most microkernel systems don't go quite that far. Most put at least some device drivers in the kernel. LynxOS is an example. Most also include a file system in the kernel.

A key component of a microkernel is a good inter-process communication system. Since many services will be performed by user programs, good means for communications between user programs are essential, far more so than in monolithic kernels. The design of the inter-process communication system makes or breaks a microkernel. To be effective, the inter-process communication system must not only have low overhead, it must interact well with CPU scheduling.

Start up, or booting, of a microkernel can be difficult. The kernel alone may not contain enough services to start up the machine. Thus, either additional code for startup, such as key device drivers, must be placed in the kernel, or means must be provided to load an appropriate set of service programs during the boot process.

Some microkernels are designed for high security applications. EROS and KeyKOS are examples. Part of secure system design is to minimize the amount of trusted code; hence, the need for a microkernel. Work in this direction, with the notable exception of systems for IBM mainframes such as KeyKOS and IBM's VM, has not resulted in widely deployed systems.

Performance
Microkernels need a highly efficient way for one process to call another, in a way similar to a subroutine call or a kernel call. The traditional performance problems with microkernels revolve around the costs of such calls. Microkernels must do extra work to copy data between servers and application programs, and the necessary interprocess communication results in extra context switch operations. The components of that cost are thus copying cost and context switch cost.

Attempts have been made to reduce or eliminate copying costs by using the memory management unit, or MMU, to transfer the ownership of memory pages between processes. This approach, which is used by Mach, adds complexity but reduces the overhead for large data transfers. L4 adds a lightweight mechanism using registers if the amount of data being passed is small, which can dramatically improve performance, both in terms of copying, as well as avoiding cache misses in the CPU's cache. On the other hand, QNX does all IPC by direct copying, incurring some extra copying costs but reducing complexity and code size.
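
The trade-off between these transfer mechanisms can be sketched as a simple size-based policy. This is a toy model, not the behaviour of Mach, L4 or QNX: the 64-byte register limit, the 4096-byte page size, and the rule that only whole pages are remapped are assumptions picked for illustration.

```python
REGISTER_BYTES = 64   # assumed capacity of a register-based fast path
PAGE_SIZE = 4096      # assumed MMU page size


def choose_transfer(size):
    """Pick a transfer mechanism for a message of `size` bytes."""
    if size <= REGISTER_BYTES:
        return "registers"      # small message: passed without a memory copy
    if size >= PAGE_SIZE:
        return "remap-pages"    # large message: page ownership moved via the MMU
    return "copy"               # everything in between: plain copy


def bytes_copied(size):
    """Bytes copied in memory under the assumed policy above.

    When remapping, only the tail that does not fill a whole page is copied.
    """
    method = choose_transfer(size)
    if method == "registers":
        return 0
    if method == "remap-pages":
        return size % PAGE_SIZE
    return size
```

Under these assumptions a 16-byte message and an 8192-byte message both avoid copying entirely, while a 1000-byte message is copied in full. A kernel that always copies, as the text says of QNX, corresponds to dropping the first two branches in exchange for simpler code.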

Systems that support virtual memory and page memory out to disk create additional problems for interprocess communication. Unless both the source and destination areas are currently in memory, copying must be delayed, or staged through kernel-managed memory. Copying through kernel memory adds an extra copy cost and requires extra memory. Delaying copying for paging delays complicates the interprocess communication system. QNX ducks this problem entirely by not supporting paging, which is an appropriate solution for a hard real-time system like QNX.

Reducing context switch cost requires careful design of the interaction between interprocess communication and CPU scheduling. Historically, UNIX interprocess communication has been based on the UNIX pipe mechanism and the Berkeley sockets mechanism used for networking. Neither of these mechanisms has the performance needed for a usable microkernel. Both are unidirectional I/O-type operations, rather than the subroutine-like call-and-return operations needed for efficient user to server interaction. Mach has very general primitives which tend to be used in a unidirectional manner, resulting in scheduling delays. The Vanguard microkernel supported the "chaining" of messages between servers, which reduced the number of context switches in cases where a message required several servers to handle the request. A number of other microkernels "wrote down" information about the caller, allowing the message to be returned without having to look up the client.
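
The effect of chaining on context switches can be shown with a small counting model. This is not a description of Vanguard's actual mechanism: the function names are hypothetical, servers are modelled as plain functions, and each transfer of control between programs is assumed to cost exactly one context switch.

```python
def run_unchained(servers, message):
    """Client calls each server in turn; control returns to the client each time."""
    switches = 0
    for server in servers:
        switches += 1              # client -> server
        message = server(message)
        switches += 1              # server -> client
    return message, switches


def run_chained(servers, message):
    """Each server forwards the message directly to the next one."""
    switches = 0
    for server in servers:
        switches += 1              # previous holder -> this server
        message = server(message)
    if servers:
        switches += 1              # last server -> client
    return message, switches
```

For a request that passes through three servers, this model gives six switches without chaining and four with it, and the saving grows with the length of the chain.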

Other microkernels used a variety of more advanced techniques to avoid both of these problems. One solution is to allow the operating system to optionally "promote" certain programs, notably servers, to run inside the kernel's memory space, a technique known as co-location. This can dramatically reduce the IPC overhead, reducing it to something similar to a normal procedure call. This solution does, however, require additional complexity in the kernel's scheduler, as it must now be able to schedule programs running "within" it, as well as normal programs running in other spaces. Similar complexity is being added to most kernels for other reasons, notably multiprocessor support. Co-location also reduces the number of context switches dramatically, at least in the case where the operating system as a whole interacts heavily with other co-located servers.

The question of where to put device drivers owes more to history than design intent. In the mainframe world, where I/O channels have memory management hardware to control device access to memory, drivers need not be entirely trusted. The Michigan Terminal System (MTS), in 1967, had user-space drivers and was the first operating system to be architected in that way.

Minicomputers and microcomputers have not, with a few exceptions, interposed a memory management unit between devices and memory. (Exceptions include the Apollo/Domain workstations of the early 1980s.) Since device drivers thus had the ability to overwrite any area of memory, they were clearly trusted programs, and logically part of the kernel. This led to the traditional driver-in-the-kernel style of UNIX, Linux, and Windows.

As peripheral manufacturers introduced new models, driver proliferation became a headache, with thousands of drivers, each able to crash the kernel, available from hundreds of sources. This unsatisfactory situation is today's mainstream technology.

With the advent of multiple-device network-like buses such as USB and FireWire, more operating systems are separating the driver for the bus interface device and the drivers for the peripheral devices. The latter are good candidates for moving outside the kernel. So a basic feature of microkernels is becoming part of monolithic kernels.

Security
Recently (2006) a debate has started about the potential security benefits of the microkernel design.

Many attacks on computer systems take advantage of bugs in various pieces of software. For instance, one of the common attacks is the buffer overflow, in which malicious code is "injected" by asking a program to process some data, and then feeding in more data than it stated it would send. If the receiving program does not specifically check the amount of data it received, it is possible that the extra data will be blindly copied into the receiver's memory. This code can then be run under the permissions of the receiver. This sort of bug has been exploited repeatedly, including a number of recent attacks through web browsers.
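
The missing length check can be simulated in a few lines. Python itself is memory-safe, so the sketch models memory as a flat byte array in which an 8-byte receive buffer sits directly in front of a 4-byte field standing in for control data such as a return address; this layout and all names are assumptions for illustration.

```python
BUFFER_SIZE = 8   # assumed size of the receive buffer


def make_memory():
    """Simulated memory: the buffer followed by adjacent control data."""
    return bytearray(b"\x00" * BUFFER_SIZE + b"SAFE")


def control_data(memory):
    return bytes(memory[BUFFER_SIZE:BUFFER_SIZE + 4])


def unchecked_receive(memory, data):
    # Bug: copies whatever the sender supplies, past the end of the buffer.
    for offset, byte in enumerate(data):
        memory[offset] = byte


def checked_receive(memory, data):
    # Fix: refuse anything that does not fit in the buffer.
    if len(data) > BUFFER_SIZE:
        raise ValueError("message longer than receive buffer")
    memory[0:len(data)] = data
```

Sending eight filler bytes followed by `b"EVIL"` through `unchecked_receive` overwrites the control field, whereas `checked_receive` rejects the same input and leaves the field untouched.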

To see how a microkernel can help address this, first consider the problem of having a buffer overflow bug in a device driver. Device drivers are notoriously buggy, but nevertheless run inside the kernel of a traditional operating system, and therefore have "superuser" access to the entire system. Malicious code exploiting this bug can thus take over the entire system, with no boundaries to its access to resources. For instance, an attack on the networking stack over the internet could then ask the file system to delete everything on the hard drive, and no security check would be applied because the request is coming from inside the kernel. Even if such a check were made, the malicious code could simply copy data directly into the target drivers, as memory is shared among all the modules in the kernel.

A microkernel system is somewhat more resistant to these sorts of attacks for two reasons. For one, an identical bug in a server would allow the attacker to take over only that program, not the entire system. This isolation of "powerful" code into separate servers helps contain potential intrusions, notably as it allows the CPU's memory management unit to check for any attempt to copy data between the servers.

But a more important reason for the additional security is that the servers are isolated in smaller code libraries, with well-defined interfaces. That means that one can audit the code, as its smaller size makes this easier to do (in theory) than if the same code were simply one module in a much larger system. This doesn't mean that the code is any more secure, per se, but that it should contain fewer bugs in general. This not only makes the system more secure, but more stable as well.

Key to the argument is the fact that a microkernel "automatically" isolates high-privilege code in protected memory because that code runs in separate servers. This isolation could likely be applied to a traditional kernel as well. However, it is precisely this mechanism that forces data to be copied between programs, leading to the microkernel's generally slower performance. In the past, outright performance was the main concern of most programs. Today this is no longer quite as powerful an argument as it once was, as security problems become endemic in a well-connected world.

But securing the kernel by no means guarantees system security. For instance, if a bug remained in the system's web browser that allowed attack, that attack could still legally ask the file system to erase the drives via the normal IPC messages. Securing against these sorts of "reasonable requests" is considerably more difficult unless a very complex system of rights is available. Even with this capability, the complexity of the interconnections between various programs in the system makes it difficult to apply security checks that are themselves free of bugs.

Examples
Examples of microkernels and operating systems based on microkernels:


 * AmigaOS
 * Amoeba
 * Brainix
 * Chorus microkernel
 * Coyotos
 * EROS
 * K42
 * LSE/OS (a nanokernel)
 * KeyKOS (a nanokernel)
 * The L4 microkernel family, used in TUD:OS, GNU Hurd.
 * MERT
 * Minix
 * MorphOS
 * Phoenix-RTOS
 * QNX
 * RadiOS
 * Spring operating system
 * Symbian OS
 * VSTa