Hacking Guide
An article from HurdFr_Wiki.
This article is a stub, which means its author does not consider it finished.
You are free to improve it and to remove this notice if you judge that the article is now complete.
This document is an introduction to GNU Hurd and Mach programming. The purpose of this guide is to help interested people start hacking the Hurd or extending it (by writing translators). It gives lots of references to the Hurd - or GNU Mach - source files. It is recommended that you read through some of these sources. Indeed the Hurd sources are very well written and commented and you can learn a lot by reading them.
The Hurd looks very complex and hard to learn at first glance. But it isn't: you don't need to understand everything at once. You can proceed slowly, step by step, and apply your existing knowledge. There are also libraries that make it easy to write certain common kinds of translators. I think the only real problem is the absence of nice documentation like the "Linux Module Programming Guide", which would make it possible to get into the Hurd step by step. This document tries to fill that gap.
Licensing
This is a modified and wikified version of the Hurd Hacking Guide.
The Hurd Hacking Guide is Copyright © 2001, 2002 Wolfgang Jährling <wolfgang@pro-linux.de>.
The modifications are Copyright © 2006 HurdFR <wiki@hurdfr.org>
GFDL
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
Requirements
- You should know at least the basics of the Hurd. I suggest you read this guide by Marcus Brinkmann, this document by Gaël Le Mignot, or, for French-speaking readers, this introduction by HurdFR.
- A good understanding of the Hurd design is also welcome.
- It would be good to know what a translator is; Marcus Brinkmann also wrote a fine introduction.
- The sources of the Hurd and GNU Mach are also useful:
  # cvs -d:pserver:anoncvs@cvs.sv.gnu.org:/cvsroot/hurd login    # empty string
  # cvs -z3 -d:pserver:anonymous@cvs.sv.gnu.org:/cvsroot/hurd co hurd
  # cvs -z3 -d:pserver:anonymous@cvs.sv.gnu.org:/cvsroot/hurd co -r gnumach-1-branch gnumach
- A GNU/Hurd installation might help you, too[1]. Alternatively, HurdFR gives access to GNU/Hurd boxes to its members and some guests.
- Diving into the header files of the Hurd's libraries is dangerous: you can drown very easily and never find your way out of all these data structures. But there are many enlightening and interesting comments in there; look them up when you need more information.
- If you know the principles of Mach[2], what MiG is, etc., then this will of course help you a lot, but it should not be necessary. Knowing about the Linux kernel might also help to some degree.
- Oh, and you should know the C programming language. :-)
Short overview of the Hurd and Mach
"We're way ahead of you here. The Hurd has always been on the cutting edge of not being good for anything." (Roland McGrath)
"In short: just say NO TO DRUGS, and maybe you won't end up like the Hurd people." (Linus Torvalds)
As you already know, the Hurd is a set of servers running on top of a microkernel. These servers need to communicate frequently to implement the functions usually provided by monolithic kernels, and many more. The communication mechanism must be simple yet powerful, and as fast as possible. The advantage of the Hurd over other systems is that it provides such a facility without requiring existing applications to be modified to take advantage of it. How does the Hurd reach this goal?
The communication framework is provided by the microkernel, Mach. Communication is done by sending messages through so-called "ports", which are a kind of message queue living in the microkernel. For each port, there is one and only one task with receive permission (i.e. this task receives the messages sent to this port). Other tasks may have send permission or send-once permission for this port (the latter is used for getting a reply from a server, because ports are one-way channels), or no permission at all.
Now that we know the means of communication, we need to find out how the "connection" between two servers is initiated, i.e. how servers find each other. Each communication system has its own scheme: DNS on the Internet, IORs for CORBA, displays for X11, and so on. The Hurd way of finding another server, or more specifically a port, is really simple: through the file system. Opening a file on an ext2 filesystem, in the Hurd, means getting a send right to a port associated with the ext2fs translator (i.e. a port for which ext2fs has the receive permission).
Your favourite e-mail client doesn't support random signatures? Write a random-signature-translator[3] (which returns a new signature each time you read from it). And the best thing is: now _all_ e-mail clients may use this feature! That's why GNU/Hurd does not require programs to be modified to take advantage of most of the nifty features it provides. They think they're just reading the contents of a file, while in fact they're contacting another server, which may provide different content each time.
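To make the point concrete, here is a minimal sketch of what such an unmodified client does. It uses nothing but standard C I/O; the helper name read_node is made up for this example, and any path you pass it (say, a signature file with a translator set on it) is your own assumption. Nothing in the code knows or cares whether a translator sits behind the node.

```c
#include <stdio.h>

/* Hypothetical helper: read at most size-1 bytes from the node at `path`
   into `buf` and NUL-terminate the result.
   Returns the number of bytes read, or -1 if the node cannot be opened.
   On GNU/Hurd the open is answered by whatever translator serves the node;
   this code is identical either way. */
long read_node(const char *path, char *buf, size_t size)
{
    FILE *f;
    size_t n;

    if (size == 0)
        return -1;

    f = fopen(path, "r");
    if (f == NULL)
        return -1;

    n = fread(buf, 1, size - 1, f);
    buf[n] = '\0';
    fclose(f);
    return (long) n;
}
```

Calling read_node twice on the same path is exactly what a mail client does when it composes two messages. With a plain file you get the same bytes twice; with a random-signature translator behind the node you may get two different signatures, without changing a line of the client.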
We can say that the file system is the name-space for services. This is also true in the other direction: the name-space for services _is_ the file system. This is a very important thing to understand. While the file system is the canonical way to get a port, there are other ways as well; for example, you can receive a port in a message.
If you are wondering why I compared this kind of communication with CORBA, the following quote from the paper Towards a New Strategy of OS Design[4] might help you understand the reason:
“With translators, the filesystem can act as a rendezvous for interfaces which are not similar to files. Consider a service which implements some version of the X protocol, using Mach messages as an underlying transport. For each X display, a file can be created with the appropriate program as its translator. X clients would open that file. At that point, few file operations would be useful (read and write, for example, would be useless), but new operations (XCreateWindow or XDrawText) might become meaningful. In this case, the filesystem protocol is used only to manipulate characteristics of the node used for the rendezvous. The node need not support I/O operations, though it should reply to any such messages with a message_not_understood return code.”
Basics of Mach and MiG
GNU/Hurd currently runs only on the GNU Mach microkernel, so the Hurd interfaces use Mach abstractions. While many of them are used in translators, there is one you must understand and know how to handle: ports, together with port sets.
Mach ports
The terminology on ports is quite complex and, as Marcus Brinkmann puts it[5], "[the Hurd developers] are not exactly strict in our wording when talking about ports".
Let's make the distinction between ports, port rights and port names clear (quoting Marcus Brinkmann):
- As explained earlier, a port is a unidirectional channel that conveys messages from N clients to one server.
- Ports are accessed through port rights. Each port has only one receive right and can have multiple send and send-once rights.
- A port name denotes an entry in the task's port name space. A port name actually identifies port rights. An entry is associated with one of:
  - a port right;
  - a port set;
  - a dead name entry, which signifies that the port right the entry once contained referred to a port that is now dead;
  - a null entry, which indicates that the port name is associated with no port right.
Port names are used through the mach_port_t type, which is actually the unsigned integer that serves as the index of the port name in the task's port name space. Still quoting Marcus Brinkmann:
"Assume you have mach_port_t 5, and want to send a message to it. To send a message, you pass the task mach_task_self() [which returns a send right to the task port], the port name, the message ID and the arguments. The task is used to get the ipc name space, the port name is used to find the entry in this name space. The entry tells Mach about the port rights you have for the associated port with this port name. You have a single port name for all receive/send rights you might have for a port. But you have distinct port names for send once rights, because this is easier for Mach to manage - and for user programs, too."
A port name is either a valid, non-null integer or one of two special values: MACH_PORT_NULL (0, indicating a null entry) or MACH_PORT_DEAD (~0, indicating a dead name entry). You can check with MACH_PORT_VALID (port) that a port name is neither of these two values.
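The check described above can be sketched in a few lines. So that it compiles anywhere, this sketch does not include the Mach headers: port_name_t, PORT_NULL, PORT_DEAD and PORT_VALID are stand-ins I made up to mirror mach_port_t, MACH_PORT_NULL, MACH_PORT_DEAD and MACH_PORT_VALID. On a real GNU/Hurd system you would use the Mach names directly.

```c
/* Stand-ins for the Mach definitions, so that the sketch is self-contained.
   These names are hypothetical; the real ones come from the Mach headers. */
typedef unsigned int port_name_t;            /* mirrors mach_port_t */
#define PORT_NULL ((port_name_t) 0)          /* mirrors MACH_PORT_NULL */
#define PORT_DEAD ((port_name_t) ~0)         /* mirrors MACH_PORT_DEAD */
#define PORT_VALID(p) ((p) != PORT_NULL && (p) != PORT_DEAD)

enum name_kind { NAME_NULL, NAME_DEAD, NAME_VALID };

/* Classify a port name the way code usually does before sending to it. */
enum name_kind classify_port_name(port_name_t name)
{
    if (name == PORT_NULL)
        return NAME_NULL;    /* null entry: no right at all */
    if (name == PORT_DEAD)
        return NAME_DEAD;    /* dead name: the port no longer exists */
    return NAME_VALID;       /* anything else names a right or a port set */
}
```

Note that a "valid" name only means the value is not one of the two special ones. It says nothing about which rights the entry actually holds.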
Exchanging ports
Messages often carry port rights. The logical way to do this would be to create a port right using mach_port_allocate () and then send it in the body of the message. However, this implies two system calls for each port right transfer. Thus, Mach allows this to be done in one step.
To do this, Mach provides special values that can be used in the message body so that Mach automatically creates the needed port right and sends it in the same operation. They are named MACH_MSG_TYPE_(MAKE|MOVE|COPY)_(RECEIVE|SEND|SEND_ONCE), although not every combination exists: the defined values are MACH_MSG_TYPE_MOVE_RECEIVE, MACH_MSG_TYPE_MOVE_SEND, MACH_MSG_TYPE_MOVE_SEND_ONCE, MACH_MSG_TYPE_COPY_SEND, MACH_MSG_TYPE_MAKE_SEND and MACH_MSG_TYPE_MAKE_SEND_ONCE. For MAKE operations, you have to supply a receive right; for COPY and MOVE operations, you have to supply a port right of the type that's supposed to be transmitted.
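The rule about which right you must supply is easier to remember as a table. The following sketch encodes it in plain C. It is only a model: the enum and struct names are made up for this example and are not the Mach constants, and the sender_keeps column is a simplification (moving a send right really removes one user reference from the sender).

```c
#include <stddef.h>

/* Hypothetical stand-ins for the three kinds of port rights. */
enum right { RIGHT_RECEIVE, RIGHT_SEND, RIGHT_SEND_ONCE };

/* Hypothetical stand-ins for the six MACH_MSG_TYPE_* transfer values. */
enum disposition {
    DISP_MOVE_RECEIVE, DISP_MOVE_SEND, DISP_MOVE_SEND_ONCE,
    DISP_COPY_SEND, DISP_MAKE_SEND, DISP_MAKE_SEND_ONCE
};

struct transfer {
    enum right must_hold;   /* right the sender has to supply */
    enum right delivered;   /* right the receiver ends up with */
    int sender_keeps;       /* 1 if the sender still has its right afterwards */
};

static const struct transfer transfers[] = {
    [DISP_MOVE_RECEIVE]   = { RIGHT_RECEIVE,   RIGHT_RECEIVE,   0 },
    [DISP_MOVE_SEND]      = { RIGHT_SEND,      RIGHT_SEND,      0 },
    [DISP_MOVE_SEND_ONCE] = { RIGHT_SEND_ONCE, RIGHT_SEND_ONCE, 0 },
    [DISP_COPY_SEND]      = { RIGHT_SEND,      RIGHT_SEND,      1 },
    [DISP_MAKE_SEND]      = { RIGHT_RECEIVE,   RIGHT_SEND,      1 },
    [DISP_MAKE_SEND_ONCE] = { RIGHT_RECEIVE,   RIGHT_SEND_ONCE, 1 },
};

/* Look up what a transfer needs and produces; NULL for an unknown value. */
const struct transfer *transfer_for(enum disposition d)
{
    if ((size_t) d >= sizeof transfers / sizeof transfers[0])
        return NULL;
    return &transfers[d];
}
```

Reading the table, a server that owns a port and wants to hand a client a way to talk to it needs the MAKE_SEND row: it supplies its receive right, the client gets a send right, and the server keeps receiving.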
Port sets
A port set can often be used wherever a port name would be used. The fact that you can pass either a port set or a port gives user programs a great deal of flexibility. Here is how Marcus Brinkmann defines them:
"A port set is a set of ports. It is useful to combine ports into a port set if you just want the next message on any of the ports you have a receive right for. In the Hurd, we use port classes and buckets provided by libports, though. [...] You can think of port rights as capabilities associated with ports and port sets, if you want."
MiG
The basic way to do IPC with Mach is neither trivial nor enjoyable. It involves filling multiple headers with numerous fields and passing them to the mach_msg () system call. Implementing RPCs (Remote Procedure Calls) on top of that adds complexity. It is nevertheless a really good way to get first-hand experience with handling ports and messages: that's why you should try it as an exercise[6]. If you really don't want to do this exercise, you can read the solution. In any case, you really should read the first two chapters of the Mach Kernel Principles[7] and have a good look at the rest of the book.
Making it easier is the purpose of MiG (Mach Interface Generator). To define RPCs, you write an interface definition file, feed it into MiG and then it outputs two C sources and a header file, which do the Mach port magic for you. Then you can send messages with simple function calls.
Of course, you only need to do that if you want to define your own interfaces. The Hurd already contains various interfaces. The appropriate functions are in glibc (through libhurduser), so you don't have to specify any special flags if you want to use them.
The syntax of MiG files is similar to Pascal and should not be hard to understand. You could try the same Hello World-style exercise as above with MiG, as suggested in this exercise (you can find the solution there).
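If you wonder what "doing the Mach port magic for you" amounts to, here is a toy model of the two halves MiG generates: a client stub that packs arguments into a message, and a server demuxer that dispatches on the message ID. Everything here is made up for illustration (struct toy_msg, TOY_ID_DOUBLE, the toy_* functions), there is no Mach call in it, and the "transport" is a plain function call standing in for mach_msg (). Real generated code is considerably more involved.

```c
/* Toy message: a real Mach message has a header plus typed data. */
struct toy_msg {
    int id;        /* plays the role of the message ID */
    int arg;       /* the single argument of our one RPC */
    int result;    /* reply value, filled in by the server */
    int retcode;   /* 0 on success, -1 if the ID is not understood */
};

enum { TOY_ID_DOUBLE = 1000 };   /* hypothetical message ID */

/* Server side: the work function, the only part written by hand. */
static int toy_double_impl(int x)
{
    return 2 * x;
}

/* Server side: roughly the job of a generated demuxer. It unpacks the
   message, calls the work function and fills in the reply.
   Returns 1 if the message was handled, 0 otherwise. */
int toy_server_demux(struct toy_msg *m)
{
    switch (m->id) {
    case TOY_ID_DOUBLE:
        m->result = toy_double_impl(m->arg);
        m->retcode = 0;
        return 1;
    default:
        m->retcode = -1;   /* "message not understood" */
        return 0;
    }
}

/* Client side: roughly the job of a generated user stub. To the caller
   it looks like an ordinary function call. */
int toy_double(int x, int *out)
{
    struct toy_msg m = { TOY_ID_DOUBLE, x, 0, 0 };

    toy_server_demux(&m);  /* stands in for sending and awaiting the reply */
    if (m.retcode != 0)
        return m.retcode;
    *out = m.result;
    return 0;
}
```

The caller simply writes toy_double (21, &n) and checks the return code. That is the convenience MiG buys you: the message layout and the dispatch on IDs are generated from the interface definition instead of being written by hand.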
Notes and references
- [1] http://hurdfr.org/pages/doc/GuideInstall.pdf
- [2] FIXME compilation of PDF docs
- [3] Or better, use the already existing `run' translator, which was written by Marcus Brinkmann and is much more flexible; using the existing filemux translator would also be possible.
- [4] http://www.gnu.org/software/hurd/hurd-paper.html
- [5] http://mail.gnu.org/pipermail/help-hurd/2001-July/004700.html
- [6] http://web.walfield.org/pub/people/neal/papers/hurd-misc/mach-ipc-without-mig.txt
- [7] http://www.cs.cmu.edu/afs/cs/project/mach/public/www/doc/osf.html

