Anatomy of a Linux Distribution...Create a Debian spinoff...
A trip through the major components that make up a working Linux OS.
More than 81% of all web servers on the internet run open source web server software (Apache or Nginx), and more than 70% of servers in the world are powered by Linux (the open source operating system kernel written by Linus Torvalds in 1991).
Linux distributions like Ubuntu, Kali, or Manjaro are assembled out of open source components. Although the result looks highly customized (branded is the technical term for that), it is nonetheless built from a lot of simple, openly available components. This makes those distributions perfect candidates to be played with.
In this article we are going to talk about:
- Motherboard firmware.
- Bootloaders.
- The Linux kernel!
- Init systems.
- Package managers.
- A display server (Xorg or Wayland), display managers and desktop environments: GNOME & GDM vs KDE & SDDM.
- A brief comparison of the major Linux distribution families: Debian, Arch and Fedora.
- And how to create your own Debian-based distribution!
The story in chronological order
Let's start at boot, with the power button pressed. The first software that runs on the CPU is firmware that resides in non-volatile memory on your motherboard. In the old days this firmware was the BIOS, but it has now been replaced by UEFI, which offers a lot more features than BIOS along with better security.
This piece of software is generally not open source. Vendors tune it to work on their specific hardware, and if someone alters it, the system can break or sometimes be bricked completely. Clearly it is not something to mess with, and we are not going to worry about it anyway.
How does UEFI look for an operating system kernel
In the early days of BIOS, you would place compiled code known as a bootloader in the first 512 bytes of a storage medium, known as the boot sector. The BIOS would check whether that code was there by looking for a magic number in the last 2 bytes of this 512-byte sector. If the magic number was 0xAA55 (stored on disk as the bytes 0x55, 0xAA), the bootloader was present: the BIOS loaded the sector, whose first 510 bytes are available for code, into memory and jumped to that location to start the bootloader.
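The signature check described above can be reproduced from a shell. This is a minimal sketch, not what a BIOS literally runs: the function name has_boot_signature and the image file name in the usage comment are made up for illustration.

```shell
# has_boot_signature FILE
# Succeeds when bytes 510 and 511 of FILE are 0x55 0xAA, which is how the
# magic number 0xAA55 is laid out on disk (little-endian).
has_boot_signature() {
    # od flags: -An no offsets, -tx1 hex bytes, -j510 skip 510 bytes, -N2 read 2
    sig=$(od -An -tx1 -j510 -N2 "$1" 2>/dev/null | tr -d ' \n')
    [ "$sig" = "55aa" ]
}

# Usage, with a hypothetical image file name:
#   has_boot_signature disk.img && echo "boot signature found"
```

Pointing it at a real disk device instead of an image file would need root privileges.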
Now, with the advent of UEFI, the scenario is slightly different. The disk carries a FAT32 EFI system partition, typically around 100 MB in size, which holds a set of bootloaders as EFI binaries that the UEFI firmware can execute. It is the job of such a binary to find the next link in the boot process, the kernel, and start its execution.
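To see which EFI binaries a system carries, you can search the mounted EFI system partition. The sketch below assumes the partition is mounted at /boot/efi, which is the usual place on Debian but may differ on your system; the function name list_efi_binaries is only illustrative.

```shell
# list_efi_binaries [MOUNT_POINT]
# Print every *.efi file under the EFI system partition, one per line.
# MOUNT_POINT defaults to /boot/efi (an assumption, adjust as needed).
list_efi_binaries() {
    esp=${1:-/boot/efi}
    # -iname because file names on FAT are often upper case (BOOTX64.EFI)
    find "$esp" -type f -iname '*.efi' 2>/dev/null | sort
}

# Usage (reading the partition usually needs root):
#   sudo sh -c '. ./efi.sh; list_efi_binaries'
```

On a Debian install with GRUB you would typically expect to find something like EFI/debian/grubx64.efi in the output.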
The Bootloader
The bootloader is not a part of the kernel and exists independently. Most distributions nowadays use GRUB (the older LILO has largely fallen out of use), and we shall stick with GRUB when building our own distro at the end.
Linux Kernel
The Linux kernel, by Linus Torvalds, is written in C and is open source. If you are already running a Linux distribution, you can find the kernel executable, vmlinuz, in the /boot directory under your root /.
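A quick way to check this on your own machine is sketched below. It assumes the Debian naming convention /boot/vmlinuz-<release>; other distributions may name the file differently, so the script only reports what it finds.

```shell
# Release string of the running kernel, e.g. as shown by "uname -r".
release=$(uname -r)
echo "running kernel: $release"

# Debian-style image name (an assumption; Arch, for example, differs).
image="/boot/vmlinuz-$release"
if [ -e "$image" ]; then
    ls -l "$image"
else
    echo "no $image on this system"
fi
```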
The role of the kernel is to provide resources to user space programs and special access to hardware to selected processes. It manages memory, CPU, networking, processes, devices, and more. Basically, it is the go-to for everything that a user space software might need one day.
Because the kernel is the only connection to the hardware, it has to contain the device drivers that talk to the hardware. These drivers know how to communicate with their devices, either by memory-mapped I/O or by port-mapped I/O. With memory-mapped I/O, for example, the device's registers and buffers appear at addresses in the CPU's address space, so the driver reads and writes them like ordinary memory. When a device has data ready, it typically raises an interrupt to let the CPU know, and the CPU passes the interrupt to the handler registered by the appropriate device driver.
The kernel is modular, so it allows users to add and remove drivers, which are also called modules, on a running kernel with the insmod and rmmod commands.
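To look at the modules currently loaded, you can read /proc/modules, whose first column is the module name (lsmod presents the same data). A small sketch, with loaded_modules as an illustrative function name:

```shell
# loaded_modules [FILE]
# Print the names of loaded kernel modules, one per line.
# FILE defaults to /proc/modules; the first field of each line is the name.
loaded_modules() {
    awk '{ print $1 }' "${1:-/proc/modules}"
}

# Usage:
#   loaded_modules | head
#
# Loading and unloading need root. insmod takes a path to a .ko file, while
# modprobe takes a module name and resolves dependencies. "loop" below is
# only an example and may be built into your kernel instead of a module:
#   sudo modprobe loop
#   sudo rmmod loop
```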
The kernel sets up memory and the process tree, initializes the other CPU cores by running kernel threads on them, and enables interrupts. Somewhere along the way it can also set up an initial RAM disk (initrd) as a temporary root file system in RAM, although this step is optional. At the end of all this, it starts the first user space program, the init process.
Init Process
Line 1558 of the kernel source file main.c (the exact line depends on the kernel version) runs /sbin/init, which is a symlink pointing to /lib/systemd/systemd when systemd is the init system. The init process in Linux always has a process ID (PID) of 1 and is an ancestor of every other user space process in the system. An init process never returns, because there is no parent to receive its exit status.
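You can check both claims on a running system. The sketch below reads the name of PID 1 from procfs, falling back to ps, and resolves the /sbin/init symlink; pid1_name is an illustrative function name.

```shell
# pid1_name
# Print the command name of the process with PID 1.
pid1_name() {
    cat /proc/1/comm 2>/dev/null || ps -p 1 -o comm=
}

echo "PID 1 is: $(pid1_name)"

# Resolve the /sbin/init symlink to its final target, if it can be resolved.
readlink -f /sbin/init 2>/dev/null || true
```

On a systemd-based distribution the first line should report systemd; inside a container PID 1 is usually something else entirely.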
When the bootloader starts the kernel, it passes it a set of command line options, one of which (init=) gives the path of the init process. We can edit this in GRUB and pass any other program, but you should not casually substitute one you wrote by hand. Why?
An init process has some subtle responsibilities. First and foremost, it manages the dependency tree of daemons (background services): daemons depend on other daemons for their functionality, so systemd starts them in the right order and keeps them running for as long as the system is up. It also has to honor boot scripts. And if a process that has spawned a child terminates without waiting for that child to finish, the init process becomes the child's new parent and collects its exit status.
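Coming back to the kernel command line for a moment: the options the bootloader passed are visible in /proc/cmdline. The following sketch pulls out an init= option if one was given; the function name init_from_cmdline is made up, and if init= appears more than once this sketch simply reports the last occurrence.

```shell
# init_from_cmdline "CMDLINE"
# Print the value of the init= option in a kernel command line string,
# or nothing when the option is absent.
init_from_cmdline() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^init=//p' | tail -n 1
}

# Usage on a running system:
#   init_from_cmdline "$(cat /proc/cmdline)"
```

On most installations this prints nothing, because no init= option is set and the kernel falls back to its default paths such as /sbin/init.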
Systemd has taken on more responsibilities than a typical init system would. Some people worry that it goes against the KISS philosophy, under which each program should handle a single task. OpenRC is one alternative that appeals to this part of the community; it originated in the Gentoo distribution and quickly gained popularity. It is written in C and is also modular.
Package Managers
A package manager like apt or pacman relies on repositories of precompiled packages, built for the most common hardware architectures, which clients consult for installs and updates. Linux has some of the best package managers in the world, and they are one of the reasons behind its success (it is debatable whether Linux is successful or not... I think it is, just not famous enough).
apt is the default package manager for Debian, as pacman is for Arch-based distros. Arch Linux follows the principle of KISS (Keep It Simple, Stupid), so pacman only resolves the dependency tree and installs packages. APT, on the other hand, does more as part of an install, such as enabling systemd services or configuring other subsystems; with pacman you have to do those steps manually.
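If you are not sure which family a machine belongs to, checking which package manager is on the PATH is a reasonable first guess. This is only a heuristic sketch (detect_package_manager is an illustrative name, and a system can have more than one of these tools installed):

```shell
# detect_package_manager
# Print the first of apt, pacman or dnf found on the PATH, or "unknown".
detect_package_manager() {
    for pm in apt pacman dnf; do
        if command -v "$pm" >/dev/null 2>&1; then
            echo "$pm"
            return 0
        fi
    done
    echo "unknown"
    return 1
}

# Installing the same package with each tool looks like this:
#   sudo apt install blender
#   sudo pacman -S blender
#   sudo dnf install blender
```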
Display
In Linux, we have something called display servers. These are the programs that control the display pipeline, and these days they fall into two categories.
- Xorg: old and legacy, implements the X11 protocol
- Wayland: new and shiny, a protocol implemented by compositors such as GNOME's Mutter or KDE's KWin
Browsing Stack Overflow, I found a very apt architecture diagram for both of these servers, which I brazenly borrow.
Display on a Linux system is extremely complicated in itself, so I am not even going to try explaining it here. For our purposes, we need a desktop environment to make it all work out of the box.
Once the daemons are initialized and running, the init system launches a display manager. A display manager is a program that provides the user with a login screen. If no display manager is installed, a text-based login prompt (getty) is shown instead.
If a display manager such as SDDM is installed, its graphical login screen is shown.
It is the job of the display manager to set up some environment variables and start a desktop environment, like KDE or GNOME, for the logged-in user.
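Two of these pieces are easy to inspect from a terminal. The sketch below assumes a systemd-style login session, where XDG_SESSION_TYPE is normally set to values such as x11, wayland or tty, and a Debian-style system, where the default display manager is recorded in /etc/X11/default-display-manager. Both function names are illustrative.

```shell
# session_type
# Print the session type exported into the login session, or "unknown".
session_type() {
    echo "${XDG_SESSION_TYPE:-unknown}"
}

# default_display_manager [FILE]
# Print the display manager named in FILE (Debian convention), or "none".
default_display_manager() {
    file=${1:-/etc/X11/default-display-manager}
    if [ -r "$file" ]; then
        # The file holds a full path such as /usr/bin/sddm
        basename "$(cat "$file")"
    else
        echo "none"
    fi
}

echo "session type:    $(session_type)"
echo "display manager: $(default_display_manager)"
```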
Distributions rely heavily on customizing these programs to give themselves a unique look.
Mainline Distributions: Debian, Fedora and Arch
The choice of init system, package manager, desktop environment, and the packages and daemons installed by default is what sets distributions apart. Most of the popular distributions fall into one of these three families, essentially the three kings of the distro world, if you will; the majority of the rest are their derivatives.
Debian is the most stable one, and you would want your work machine to be Debian-based if reliability is what you are after. Arch is better known for living on the edge: venturing into the wild and trying out new subsystems and packages before anyone else, which makes it a great tool for learning and experimentation. I have personally never used Fedora, but I did read somewhere that it is a good start if you want to work with Red Hat distributions.
Anatomy of a Debian Distribution
A pure and raw Debian system will contain:
- The GRUB bootloader
- The Linux kernel
- The systemd init system
- The apt package manager with Debian repository mirrors configured!
- A lot of packages, like VNC, OpenSSH, PulseAudio, libraries and more.
Please note that there is no desktop environment in this Debian system, so all you get is a text-mode login prompt on the console.
If we need anything on top of Debian, either
- we can install it manually after installing the system, or
- we can add our own packages and desktop environments to create a completely new distribution.
The former is the usual way of handling things, but the latter is the interesting one. For a cool stunt, let me show you how to create a distribution now.
Create your own Debian Spinoff Distribution!
Task: create a live ISO image that gives you two features:
- A live environment to be able to test the distribution.
- A way to install the distribution on a new PC.
Pre-requisites
- A running Debian-based system. I am using Debian 11 for this procedure.
- The live-build tool, which is made by Debian and is what we will use here.
- Install the live-build package: sudo apt install live-build
- Create a directory to be used as a workspace, mkdir -p /home/<user>/distro_build, and cd /home/<user>/distro_build into it.
- Now configure the build with live-build:
$ lb config -b iso --cache true --apt-recommends true -a amd64 --binary-images iso --debian-installer live --mode debian --debian-installer-gui true --archive-areas "main contrib" --security true --win32-loader false --interactive shell
Note: if you are not running on the amd64 architecture, you may have to put your own architecture in place of amd64.
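If you are unsure what to put there, note that Debian architecture names differ from what uname -m prints. The sketch below maps a few common machine names to their Debian equivalents; debian_arch is an illustrative helper covering only these cases. On a Debian system, dpkg --print-architecture gives the answer directly.

```shell
# debian_arch [MACHINE]
# Map a "uname -m" machine name to the Debian architecture name.
# Only a handful of common cases are covered; fails for anything else.
debian_arch() {
    case ${1:-$(uname -m)} in
        x86_64)              echo amd64 ;;
        aarch64)             echo arm64 ;;
        i386|i486|i586|i686) echo i386 ;;
        armv7l)              echo armhf ;;
        *)                   return 1 ;;
    esac
}

# Usage:
#   debian_arch            # architecture of this machine
#   debian_arch aarch64    # prints arm64
```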
Once lb config finishes, we can chroot into the build environment with,
$ lb chroot
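Let it run; after a while (about 5 minutes on a 4-core machine) the prompt changes, because you are now chrooted inside the workspace directory we created at the start. If it instead stops with a complaint that the bootstrap stage has not been run, running lb bootstrap first and then repeating lb chroot should get you here: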
(live) root@debian:/#
This is the most important step! Here we install all the packages that we want in the live boot media and in the installed OS. Any configuration change made to the chrooted system here will stay in the live version, and so in the installed one too.
So, for example, let us say that we need the 3D modeling tool Blender in our distribution: apt install blender
If there is anything to add, this is the time: apt install everything that you need. You may bloat up the size of the ISO file, but go crazy and install all the basic packages that you need.
One crucial package that we will add here is the desktop environment, KDE Plasma, by issuing the command: apt install task-kde-desktop
It will take a long time to install. Once it is done, exit the chrooted environment by typing,
/# exit
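As an alternative to installing packages by hand inside the chroot, live-build can, as far as I know, also read package names from list files under config/package-lists/ in the workspace; files ending in .list.chroot are installed during the chroot stage. The helper below is a sketch of that approach, not the procedure used in this article: the function name and the list name "desktop" are made up.

```shell
# add_package_list WORKSPACE NAME PACKAGE...
# Write PACKAGE names, one per line, to
# WORKSPACE/config/package-lists/NAME.list.chroot
add_package_list() {
    workspace=$1
    name=$2
    shift 2
    mkdir -p "$workspace/config/package-lists"
    printf '%s\n' "$@" > "$workspace/config/package-lists/$name.list.chroot"
}

# Usage, run from the workspace directory before the chroot stage:
#   add_package_list . desktop task-kde-desktop blender
```

The advantage of a list file is that the package selection is recorded in the workspace and can be rebuilt later without retyping anything.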
Then start the actual build of the ISO file,
$ lb binary
Let it run for a very long time. If everything went fine, you will have a live-image-amd64.iso in the current directory.
root@debian:~/live# ls -l
total 1907440
drwxr-xr-x 2 root root 4096 Apr 24 16:12 auto
drwxr-xr-x 9 root root 4096 Apr 24 17:58 binary
-rw-r--r-- 1 root root 1174 Apr 24 18:13 binary.modified_timestamps
drwxr-xr-x 10 root root 4096 Apr 24 16:22 cache
drwxr-xr-x 17 root root 4096 Apr 24 18:20 chroot
-rw-r--r-- 1 root root 11317986 Apr 24 17:58 chroot.files
-rw-r--r-- 1 root root 4902 Apr 24 16:34 chroot.packages.install
-rw-r--r-- 1 root root 54498 Apr 24 17:57 chroot.packages.live
-rw-r--r-- 1 root root 236 Apr 24 14:01 cmd
drwxr-xr-x 20 root root 4096 Apr 24 14:02 config
-rw-r--r-- 1 root root 219 Apr 24 16:07 config_cmd
-rw-r--r-- 1 root root 18253 Apr 24 18:13 live-image-amd64.contents
-rw-r--r-- 1 root root 11317986 Apr 24 17:58 live-image-amd64.files
-rw-r--r-- 1 root root 1922877440 Apr 24 18:19 live-image-amd64.iso
-rw-r--r-- 1 root root 7511453 Apr 24 18:20 live-image-amd64.iso.zsync
-rw-r--r-- 1 root root 54498 Apr 24 17:57 live-image-amd64.packages
drwxr-xr-x 3 root root 4096 Apr 24 16:12 local
That is the bootable ISO for your distribution. Now spin up a virtual machine to test the ISO. Upon bootup you should see a KDE desktop running out of the live boot.
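Before booting it, a cheap sanity check is to confirm the file really is an ISO 9660 image: such images carry the string CD001 at byte offset 32769 (the start of sector 16, plus one byte). The sketch below does that check and then shows one way to boot the image with QEMU, assuming QEMU is installed; looks_like_iso is an illustrative name and any other virtual machine software works just as well.

```shell
# looks_like_iso FILE
# Succeeds when FILE has the ISO 9660 identifier "CD001" at offset 32769.
looks_like_iso() {
    magic=$(dd if="$1" bs=1 skip=32769 count=5 2>/dev/null | tr -d '\0')
    [ "$magic" = "CD001" ]
}

# Usage:
#   looks_like_iso live-image-amd64.iso && echo "looks like an ISO image"
#
# Boot the image in a VM with 2 GB of RAM (add -enable-kvm if KVM is available):
#   qemu-system-x86_64 -m 2048 -cdrom live-image-amd64.iso -boot d
```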
As you can see, we have successfully put a desktop environment on top of a Debian live medium. The live version has a password of live for the account 'Debian live user'.
If you look through the applications, you will find the Blender that we installed during the chroot step.
To install this distribution, simply reboot the system and select install at the GRUB menu.
Note: I have not touched on BRANDING, which is what you do when you want to put your own banners in the installation phase and your own artwork for the splash screen and wallpapers in the live and the installed system. Please look into the branding options of the particular tools to showcase your brand.
Thank you for reading, and have fun creating more distros!