<!--<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook V4.2//EN">-->

<chapter label="2">
<title>Conception of a UNIX system: Defora GNU/Linux</title>
<para>
This chapter deals with the whole process of the creation of a UNIX distribution. Existing software has been used for the system itself. Some tools have been written for the project though, such as the software manager described in chapter 3.
</para>

<sect1>
<title>Creation of the base system</title>

<sect2>
<title>The software choice</title>
<formalpara>
<title>The kernel</title>
<para>
</para>
</formalpara>
<para>
This choice was obvious: <filename>linux</filename>. There are many reasons; the most important are:
</para>
<itemizedlist>
<listitem>
	<para><emphasis>It's free</emphasis>.</para>
	<para>Not only does it cost nothing, but full access to the source code is provided. This is the main reason for its popularity, which in turn led to its other advantages. Although at least one of the <filename>BSD</filename> flavours could have become as famous instead, the strength of <filename>linux</filename> certainly lies in its license: it requires any distributed modification of the kernel to be open source as well, so that contributions always benefit the community.</para>
</listitem>
<listitem>
	<para><emphasis>It's working</emphasis>.</para>
	<para>Its stability and extensibility have been proven. It has been ported to many architectures, and it has device drivers for most available peripherals.</para>
</listitem>
<listitem>
	<para><emphasis>It's easy</emphasis>.</para>
	<para>The accessibility of linux-based distributions is now very good, and they still allow an advanced user to completely tune their system. Such tuning, for instance recompiling the kernel, is certainly easier to grasp on a <filename>linux</filename> system than on the possible alternatives. Moreover, for a comparable set of possibilities, these alternatives are limited to FreeBSD and OpenBSD. Finally, my own experience so far is mainly with linux-based systems.</para>
</listitem>
<listitem>
	<para><emphasis>It's rich</emphasis>.</para>
	<para>The kernel has a huge list of features, but would not be of much use without software running on top of it. Thanks to a community of millions of volunteers, and to many companies, <filename>linux</filename> has become a platform of choice for server, development, workstation and desktop uses.</para>
</listitem>
</itemizedlist>

<formalpara>
<title>The libc and system tools</title>
<para>
</para>
</formalpara>
<para>
Like the kernel, the choice is quite obvious, particularly with the <filename>linux</filename> kernel: the GNU libc, called <filename>glibc</filename>. It ships with almost every Linux distribution, and is a major part of the success of GNU/Linux systems. Whenever <filename>linux</filename> is said to be stable, this is also thanks to <filename>glibc</filename>. The GNU project also provides free implementations of all the common basic UNIX tools, which are of course designed to run with the GNU libc, and are often the best versions available. These tools are free in the same way as the <filename>linux</filename> kernel: the Free Software Foundation, which sponsors the GNU project, is the author of the license used, the GPL (GNU General Public License, available from http://www.gnu.org/licenses/gpl.html).
</para>

<formalpara>
<title>Additional tools</title>
<para>
</para>
</formalpara>
<para>
The system would not be complete without a way to install and manage it. The most important tool defining a UNIX distribution is its software manager. That is why one has been written for this project: its philosophy and use are presented in the administration section of this chapter, while its design details are in chapter 3.
</para>
</sect2>

<sect2>
<title>Preparation of the system</title>
<para>
This step was inspired by the "Linux From Scratch" guide. It is written by a community of GNU/Linux users who assemble their own systems themselves; it also documents the known problems with software compilation and installation, which proved very helpful at times.
</para>
<formalpara>
<title>Creation of a nested compilation farm</title>
<para>
</para>
</formalpara>
<para>
Of course one needs a prior system, in order to compile the final system. But the compiled programs have to be linked against the final system libraries, not the initial system's. That's why an intermediate system is needed.
</para>
<para>
However, when starting an intermediate system, there are no system libraries yet (and indeed no software at all). That is why the programs starting this system have to be compiled statically, which means that every executable file produced contains the code of every function it calls, even the ones that would usually be shared (at least those from the libc). This intermediate system consequently consumes more space, but only a small amount of software needs to be compiled there.
</para>
<para>
The software to compile at this stage is basically the libc, the compiler, a shell, and some essential tools and low-level libraries such as <filename>make</filename> (compilation helper) or <filename>gzip</filename> (compression tool and library).
</para>

<formalpara>
<title>Creation of the new system base</title>
<para>
</para>
</formalpara>
<para>
When every tool needed to compile the future system base has been statically compiled and installed, the creation of the new system may start. The first software to compile is the libc, and then the C compiler. However there is still a linking problem, because the default libraries used for compiling and linking are those present in <filename>/</filename> and <filename>/usr</filename>, and not those from the intermediate system. To solve this a special technique is used, called <filename>chroot</filename>. It consists of spawning a program, typically an interactive shell, in a jailed environment where the <filename>/</filename> directory is actually bound to another directory. In our case we need to launch a shell with our base system directory as the faked <filename>/</filename>. This technique is often used to increase the security of networked processes, because in case of a compromise the possible damage can only affect the files present in their environment.
</para>
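<para>
As a minimal sketch of the mechanism, and not the procedure actually followed for this project, the Python fragment below uses the standard <filename>os.chroot</filename> call. The directory <filename>/mnt/defora</filename> and the helper names are hypothetical, chosen only for illustration; entering the jail requires root privileges, so only the path translation is exercised here.
</para>
<programlisting><![CDATA[
import os


def host_path(new_root, path_in_jail):
    """Return the host-side file that an absolute path designates once
    new_root has become '/' (illustration only, symlinks are ignored)."""
    return os.path.join(new_root, path_in_jail.lstrip("/"))


def enter_chroot(new_root, shell="/bin/sh"):
    """Replace the current process with a shell jailed in new_root.
    Requires root privileges; 'shell' is looked up inside the new root."""
    os.chroot(new_root)   # new_root becomes '/' for this process
    os.chdir("/")         # do not keep a working directory outside the jail
    os.execv(shell, [shell])


# /mnt/defora is a hypothetical location for the new system base.
print(host_path("/mnt/defora", "/usr/lib/libc.so"))
]]></programlisting>
<para>
Once inside the jail, a linker looking for <filename>/usr/lib/libc.so</filename> therefore finds the copy belonging to the new system, not the one from the initial system.
</para>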
<para>
From this point the compilation and installation of the other programs and libraries are done one by one, depending on their respective needs. For instance, <filename>bash</filename> (shell) may be installed immediately, to test the new system, but it may benefit from the <filename>curses</filename> library, which should then be compiled beforehand (of course this can also be done afterwards).
</para>
<para>
This example clearly illustrates the need for dependency tracking when the system has to be distributed in binary format (and even in source form). This is a good reason to use a software management system.
</para>
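<para>
The ordering problem can be sketched as a topological sort. The fragment below is only an illustration, assuming Python 3.9 or later for the standard <filename>graphlib</filename> module; the dependency map is a hypothetical simplification and does not describe the real packages.
</para>
<programlisting><![CDATA[
from graphlib import TopologicalSorter

# Hypothetical dependency map: package -> packages that must be built first.
DEPENDS = {
    "glibc": set(),
    "ncurses": {"glibc"},
    "bash": {"glibc", "ncurses"},
}


def build_order(depends):
    """Return the packages in an order where dependencies come first."""
    return list(TopologicalSorter(depends).static_order())


print(build_order(DEPENDS))  # ['glibc', 'ncurses', 'bash']
]]></programlisting>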
</sect2>

</sect1>

<sect1>
<title>Administration of the system</title>

<sect2>
<title>Configuration</title>
<para>
In an attempt to present the system setup in a convenient way for the user, setup files have been placed according to this simple rule:
</para>
<itemizedlist>
<listitem>
	<para><emphasis>A specific program or task needs one file</emphasis>:</para>
	<para>the file is directly placed in <filename>/etc</filename>.</para>
</listitem>
<listitem>
	<para><emphasis>Otherwise</emphasis>:</para>
	<para>the files are placed in a subdirectory of <filename>/etc</filename>.</para>
</listitem>
</itemizedlist>
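<para>
Expressed as code, the rule could look like the following sketch. The function and the example names (<filename>mytool</filename>, <filename>mysuite</filename>) are hypothetical and are not part of the distribution.
</para>
<programlisting><![CDATA[
import posixpath


def config_paths(name, files):
    """Apply the placement rule to the setup files of a program or task."""
    if len(files) == 1:
        # A single file goes directly in /etc.
        return [posixpath.join("/etc", files[0])]
    # Several files are grouped in a subdirectory of /etc.
    return [posixpath.join("/etc", name, f) for f in files]


print(config_paths("mytool", ["mytool.conf"]))
print(config_paths("mysuite", ["main.conf", "users.conf"]))
]]></programlisting>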
<para>
This rule has been followed where possible, and some programs had to be patched to respect it. Most original software packages use the GNU <filename>autotools</filename> system, which is very flexible in this respect, but some required symbolic links to other parts of the system, like <filename>/var/lib</filename> subdirectories.
</para>
<para>
Where scripts had to be written, for example the <filename>init</filename> scripts (for system initialization), consistency was the main concern. These scripts are installed in <filename>/etc/init/rc.d</filename> by their respective packages. Unfortunately, the attempt to share code between them (which would imply consistency) has not been very successful, because of the complexity of some services.
</para>
<para>
More generally, some software needed adjustments, for example in order to respect the filesystem hierarchy standard. Here too the "Linux From Scratch" guide was very helpful.
</para>
</sect2>

<sect2>
<title>Software management</title>
<para>
This is certainly the main role of any UNIX distribution: providing software to users. It is the most frequent operation performed by system administrators, so it has to be easy, fast, and efficient. Almost every distribution has its own system, and I had my own idea of how things should be done, so Defora had to have its own.
</para>
<para>
The idea is simply to allow the installation or uninstallation of multiple packages at the same time, without the need to operate on package files directly. This requires taking the dependencies between packages into account, as well as the concept of remote repositories. Of course the usual database and package information operations also have to be supported: probing packages, listing installed packages, and searching for files.
</para>
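<para>
To make the idea concrete, the sketch below computes what has to be fetched when several packages are requested at once. It is not the algorithm of the software manager described in chapter 3: the function, the repository index and its contents are assumptions made for this illustration.
</para>
<programlisting><![CDATA[
def install_plan(requested, repository, installed):
    """Return the packages to fetch, dependencies before dependents,
    skipping whatever is already installed."""
    plan = []
    seen = set(installed)

    def visit(name):
        if name in seen:
            return
        seen.add(name)  # marked before recursing, so cycles terminate
        for dep in repository[name]:
            visit(dep)
        plan.append(name)

    for name in requested:
        visit(name)
    return plan


# Hypothetical index of a remote repository: package -> dependencies.
REPOSITORY = {
    "glibc": [],
    "ncurses": ["glibc"],
    "bash": ["glibc", "ncurses"],
    "vim": ["glibc", "ncurses"],
}

print(install_plan(["bash", "vim"], REPOSITORY, installed={"glibc"}))
# ['ncurses', 'bash', 'vim']
]]></programlisting>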
<para>
Ideally these operations should require very little effort: the software manager should accept a simple syntax, and offer help that is sufficient but short. This is not entirely the case in many well-known distributions, so I hope this work will be easier to grasp.
</para>
</sect2>

</sect1>

<sect1>
<title>Distribution of the system</title>

<sect2>
<title>Using the software manager</title>
<para>
As it is able to simply uncompress packages, the software manager alone is able to create a base system of the distribution. The following software is needed:
</para>
<para>
<filename>bash</filename>, <filename>bzip2</filename>, <filename>gcc</filename>, <filename>glibc</filename>, <filename>libtools</filename>, <filename>ncurses</filename>, <filename>pkgr</filename> and <filename>tar</filename>.
</para>
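<para>
The uncompressing step can be approximated as follows. This sketch assumes that packages are <filename>tar</filename> archives compressed with <filename>bzip2</filename>, which the list above suggests but which is not stated here; the function name and the paths in the usage comment are hypothetical.
</para>
<programlisting><![CDATA[
import tarfile
from pathlib import Path


def unpack_packages(archives, root):
    """Extract each .tar.bz2 archive into the directory tree under root."""
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    for archive in archives:
        with tarfile.open(archive, "r:bz2") as tar:
            tar.extractall(root)


# Hypothetical usage:
# unpack_packages(["glibc.tar.bz2", "bash.tar.bz2"], "/mnt/defora")
]]></programlisting>
<para>
A real installation would also have to preserve file ownership and special files, which generally requires root privileges.
</para>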
<para>
From this point it is possible to <filename>chroot</filename> inside this minimal system, just like when it was still being compiled, and install the rest of the desired software. Depending on the needs, the following should be installed: a kernel image, filesystem utilities (e2fsprogs, fileutils, util-linux, ...), a bootloader (lilo), a text editor (vim), system initialization (sysvinit).
</para>
<para>
If installed on a bootable device, this new system can be run as the main one, and is completely self-manageable and reproducible. This method needs a UNIX system already installed on the machine, but the next one does not.
</para>
</sect2>

<sect2>
<title>Without any initial system</title>
<para>
The only way to install the system without one already installed is to start the computer from a removable device containing one. Nowadays computers can boot from floppy disks, CD-ROM drives (actually through floppy disk emulation), ZIP drives, and even USB drives for the most recent ones.
</para>
<formalpara>
<title>Defora on a CD-ROM</title>
<para>
</para>
</formalpara>
<para>
The creation of such a system is not trivial, but one has been built for this project. A Defora system has been created with the above method, to be burned on a CD. It has been set up to launch a graphical session automatically, from which a special... shell script can be invoked to install the system interactively.
</para>
<formalpara>
<title>A particular configuration</title>
<para>
</para>
</formalpara>
<para>
The first difficulty is to set up the system so that it needs as little writable disk space as possible, because every temporary file has to be stored in main memory. Moreover, one cannot boot directly from the CD-ROM drive, because it is then seen as a floppy disk drive. So a special bootable disk had to be prepared.
</para>
<formalpara>
<title>The bootable image</title>
<para>
</para>
</formalpara>
<para>
A specially tuned kernel is booted from the floppy disk: it has to be small, because the floppy also hosts a minimal system image, loaded in main memory as the root filesystem, in order to continue the process. This image can reasonably include only one executable file, which then has to load the necessary drivers, search every drive for the installation CD, and mount it. The kernel is then told to keep its root filesystem in main memory, so that writable disk space is available, and to launch the initialization sequence from the CD.
</para>
<formalpara>
<title>Burning the CD</title>
<para>
</para>
</formalpara>
<para>
To burn the bootable CD, the floppy disk image has to be dumped to a file, and placed among the CD-ROM contents as a regular file. Then the CD-ROM image can be created on disk, or directly burned to a recordable CD, using the "El-Torito" standard to specify the wanted floppy image as the emulated boot device.
</para>
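<para>
As an illustration only, the image creation could be driven as below. The use of <filename>mkisofs</filename> is an assumption, since the tool is not named here, and the directory and file names are hypothetical. The <filename>-b</filename> option designates the El Torito boot image and <filename>-c</filename> the boot catalog, both relative to the source tree; floppy emulation expects an image of exactly 1200, 1440 or 2880 kB.
</para>
<programlisting><![CDATA[
import subprocess


def mkisofs_command(tree, boot_image, output_iso):
    """Build the argument list creating a bootable ISO image from tree."""
    return [
        "mkisofs",
        "-R",                  # Rock Ridge: keep UNIX permissions and names
        "-b", boot_image,      # El Torito boot image, relative to tree
        "-c", "boot.catalog",  # boot catalog, generated inside the image
        "-o", output_iso,      # resulting ISO 9660 image
        tree,
    ]


def build_iso(tree, boot_image, output_iso):
    """Run mkisofs; raises CalledProcessError if the command fails."""
    subprocess.run(mkisofs_command(tree, boot_image, output_iso), check=True)


# Hypothetical usage:
# build_iso("cdroot", "boot/floppy.img", "defora.iso")
]]></programlisting>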
<para>
Defora GNU/Linux is thus a self-reproducing UNIX system.
</para>
</sect2>

</sect1>

</chapter>
