Bogdan edited this page Dec 13, 2025 · 1 revision

This page documents the why and how behind mini-init-asm: a tiny init process meant to run as PID 1 in Linux containers, implemented in pure assembly (x86-64 NASM + ARM64 GAS) with direct Linux syscalls.

It's not trying to compete with mature inits. If you just want the default, battle-tested solution, Tini (or Docker's --init) is usually the right answer. This project exists because:

  • PID 1 behavior in containers is a real footgun,
  • the "tiny init" pattern is easy to explain and test,
  • and building it in assembly keeps the control-flow explicit and auditable.


The PID 1 problem in containers

A lot of containers run the application directly as PID 1:


PID 1: your-app
├─ child-1
└─ child-2

This works until it doesn't. Common symptoms:

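  • docker stop hangs until the kill timeout (the kernel applies no default signal actions to PID 1, so an app without a SIGTERM handler ignores SIGTERM; or the app never forwards signals to its children)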
  • Ctrl+C / SIGINT doesn't cleanly stop everything
  • zombies accumulate (exited children stay as zombies until PID 1 reaps them)

PID 1 has special semantics on Linux:

  • orphaned processes in its PID namespace are reparented to it, so it is responsible for reaping them (in containers this often includes grandchildren whose parent exited first),
  • and it should be the place where shutdown behavior is centralized (forward signals, enforce grace periods).

A tiny init puts a "real parent" in front of your app:


PID 1: mini-init-asm
└─ PGID leader: your-app
├─ child-1
└─ child-2

Your application becomes "just another process" again, with a parent that behaves like a container-friendly init.


Design goals

The project constraints were intentionally simple:

  1. Behave like a responsible PID 1
  • Forward termination signals to the whole process group
  • Reap zombies (including tricky cases where grandchildren get reparented)
  • Exit with meaningful status codes
  2. Keep it small and auditable
  • No libc, no runtime, direct syscalls only
  • Single static binary per architecture
  • Straightforward control-flow (easy to trace and test)
  3. Be container-friendly
  • Suitable for FROM scratch images
  • Graceful shutdown window + SIGKILL escalation if needed
  • Optional restart-on-crash mode (small, env-driven; not a full supervisor)
  4. Support amd64 and arm64
  • x86-64 NASM (SysV ABI)
  • ARM64 GAS (AArch64)

High-level behavior

Always run the target in its own session + process group (PGID mode)

mini-init-asm is PGID-first by design:

  • the child becomes a session leader (setsid)
  • the child's PID becomes the PGID (signals can be sent to the entire group)
  • signal forwarding uses kill(-pgid, sig) rather than targeting only the child PID

That means multi-process containers behave predictably: the whole "tree" gets the shutdown signal.
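
As a rough illustration of the PGID-first pattern (not the project's assembly), the sketch below uses Python's `os` module, whose functions are thin wrappers over the same fork, setsid, execvp, and kill calls. The helper names `spawn_in_own_group` and `forward_to_group` are hypothetical, and the close-on-exec pipe used to wait for the child is this sketch's own addition rather than something the project is known to do.

```python
import os
import signal


def spawn_in_own_group(argv):
    """Fork, make the child a session and group leader, then exec argv."""
    r, w = os.pipe()  # Python creates both ends close-on-exec by default
    pid = os.fork()
    if pid == 0:
        try:
            os.close(r)
            os.setsid()               # new session: the child's PID becomes its PGID
            os.execvp(argv[0], argv)  # does not return on success
        finally:
            os._exit(127)             # reached only if setsid or exec failed
    os.close(w)
    os.read(r, 1)  # returns b"" (EOF) once the child has exec'd or exited
    os.close(r)
    return pid


def forward_to_group(pgid, sig):
    """Signal every process in the group; a negative PID targets the PGID."""
    os.kill(-pgid, sig)
```

Calling `spawn_in_own_group(["sleep", "30"])` and then `forward_to_group(pid, signal.SIGTERM)` should terminate the sleep together with anything it had started in the same group.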

Graceful shutdown, then escalation

On a "soft" shutdown signal (TERM/INT/HUP/QUIT):

  1. forward the signal to the process group
  2. start a grace timer (EP_GRACE_SECONDS, default: 10)
  3. if the grace window expires and the main child is still alive → send SIGKILL to the process group
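
The three steps above can be sketched in Python as follows. This is a simplified model under stated assumptions: it polls against a monotonic deadline instead of using a timerfd, it relies on `subprocess.Popen(..., start_new_session=True)` as a stand-in for the setsid-based spawn, and `stop_group` is a hypothetical name.

```python
import os
import signal
import subprocess
import time


def stop_group(proc, sig=signal.SIGTERM, grace_seconds=10.0, poll_interval=0.05):
    """Forward sig to proc's group, then SIGKILL the group if proc outlives
    the grace window. Returns (returncode, escalated)."""
    pgid = proc.pid  # valid only because proc was started as a session leader
    os.kill(-pgid, sig)                        # step 1: forward to the group
    deadline = time.monotonic() + grace_seconds  # step 2: start the grace window
    while time.monotonic() < deadline:
        if proc.poll() is not None:
            return proc.returncode, False
        time.sleep(poll_interval)
    if proc.poll() is not None:
        return proc.returncode, False
    os.kill(-pgid, signal.SIGKILL)             # step 3: escalate
    return proc.wait(), True
```

With `subprocess`, a child killed by a signal reports a negative return code, so an escalated shutdown returns `(-9, True)` here; mapping that to a container exit code is a separate step.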

Zombie reaping

On SIGCHLD:

  • call waitpid(-1, WNOHANG) in a loop
  • track the main child exit, but keep reaping until the container is ready to exit
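
A minimal Python sketch of that reaping loop, using `os.waitpid` (a direct wrapper over the syscall). The function name `reap_children` is hypothetical; the real implementation also decides when the container is ready to exit, which is left out here.

```python
import os


def reap_children(main_pid):
    """Reap every child that has already exited, without blocking.
    Returns the main child's raw wait status, or None if it has not exited."""
    main_status = None
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:  # ECHILD: no children left at all
            break
        if pid == 0:               # children exist, but none has exited yet
            break
        if pid == main_pid:
            main_status = status   # remember it, but keep draining the rest
    return main_status
```

Looping until `waitpid` reports nothing more matters because several children can exit while a single SIGCHLD is pending; reaping only one per notification would leave zombies behind.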

Exit code semantics

  • child exits normally → return the child's exit code
  • child dies by signal → return EP_EXIT_CODE_BASE + signal_number (default base 128)
      • e.g. SIGTERM (15) → 128 + 15 = 143
  • if SIGKILL was used after grace timeout → EP_EXIT_CODE_BASE + 9 (137 with the default base)
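
The mapping above can be written as a small function over a raw wait status. This is an illustrative Python sketch, with `map_exit_status` as a hypothetical name; it uses the standard `os.WIF*` helpers to decode the status.

```python
import os


def map_exit_status(status, exit_code_base=128):
    """Map a raw waitpid status to the exit code the init should return."""
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)              # normal exit: pass through
    if os.WIFSIGNALED(status):
        return exit_code_base + os.WTERMSIG(status)  # killed by signal: base + signo
    raise ValueError("status is neither an exit nor a termination by signal")
```

On Linux a normal exit with code 3 is encoded as `3 << 8` and a death by SIGTERM as `15`, so `map_exit_status(3 << 8)` gives 3 and `map_exit_status(15)` gives 143.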

How the event loop works

Instead of async signal handlers, mini-init-asm uses a small fd-driven loop:

  • block signals in PID 1
  • create three file descriptors:
      • signalfd for HUP, INT, QUIT, TERM, CHLD (+ optional extras)
      • timerfd for grace-period management
      • epoll to wait on both fds

Pseudo-logic:

for (;;) {
    n = epoll_wait(epfd, events, MAX, -1);

    for (i = 0; i < n; i++) {
        if (events[i].data.fd == signalfd_fd) {
            read(signalfd_fd, &si, sizeof(si));
            sig = si.ssi_signo;

            if (sig in {HUP,INT,QUIT,TERM}) {
                kill(-pgid, sig);                  /* forward to the whole group */
                arm_timerfd_once(grace_seconds);   /* start the grace window */
            } else if (sig == SIGCHLD) {
                while ((pid = waitpid(-1, &st, WNOHANG)) > 0) {
                    if (pid == main_child) remember_exit_status(st);
                }
                if (main_child_exited) exit(mapped_status);
            } else {
                kill(-pgid, sig);                  /* extra signals are forwarded as-is */
            }

        } else if (events[i].data.fd == timerfd_fd) {
            read(timerfd_fd, &expirations, sizeof(expirations));
            if (!main_child_exited) kill(-pgid, SIGKILL);
        }
    }
}
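
The core idea of the loop, blocking signals and consuming them synchronously instead of running async handlers, can be tried out in Python. The standard library has no signalfd wrapper, so this sketch swaps in `signal.sigtimedwait`, with its timeout loosely playing the role of the timerfd. The names `block_watched_signals`, `next_event`, and `classify` are hypothetical.

```python
import signal

SOFT = {signal.SIGHUP, signal.SIGINT, signal.SIGQUIT, signal.SIGTERM}
WATCHED = SOFT | {signal.SIGCHLD}


def block_watched_signals():
    """Block the watched signals so they queue up instead of running handlers.
    Returns the previous mask so the caller can restore it."""
    return signal.pthread_sigmask(signal.SIG_BLOCK, WATCHED)


def next_event(timeout_seconds):
    """Wait for one watched signal; None means the timeout expired first."""
    info = signal.sigtimedwait(WATCHED, timeout_seconds)
    return None if info is None else signal.Signals(info.si_signo)


def classify(sig):
    """Name the branch of the pseudo-logic that a given event would take."""
    if sig is None:
        return "timeout"
    if sig in SOFT:
        return "forward-and-arm-grace-timer"
    if sig == signal.SIGCHLD:
        return "reap"
    return "forward"
```

Because the signals are blocked before anything waits on them, none can be lost between setup and the first wait, which is the same property the signalfd design relies on.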

Implementation detail: on x86-64 the child is spawned via clone(SIGCHLD, ...) (fork-like usage), then setsid + optional setpgid in the child before execve.


Configuration

The goal is to keep the CLI surface tiny and push most knobs into environment variables.

CLI

  • -v, --verbose - verbose logging
  • -V, --version - version output

Environment variables

Shutdown behavior:

  • EP_GRACE_SECONDS - seconds between first soft signal and SIGKILL escalation (default 10)
  • EP_EXIT_CODE_BASE - base for "killed-by-signal" exit codes (default 128)

Signal fan-out:

  • EP_SIGNALS - CSV list of extra signals to listen for and forward (example: USR1,RT1,RT5)
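
One way such a list could be parsed is sketched below in Python. Two points are assumptions rather than documented behavior: that `RTn` means `SIGRTMIN + n`, and that unrecognised tokens are skipped silently. `parse_signal_list` is a hypothetical name.

```python
import signal


def parse_signal_list(csv):
    """Turn a string like 'USR1,RT1' into a list of signal numbers."""
    numbers = []
    for token in csv.split(","):
        name = token.strip().upper()
        if name.startswith("RT") and name[2:].isdecimal():
            signo = signal.SIGRTMIN + int(name[2:])  # assumption: RTn = SIGRTMIN + n
            if signo <= signal.SIGRTMAX:
                numbers.append(signo)
            continue
        member = signal.Signals.__members__.get("SIG" + name)
        if member is not None:                       # assumption: skip unknown names
            numbers.append(int(member))
    return numbers
```

For the example value in the list above, `parse_signal_list("USR1,RT1,RT5")` yields the numbers for SIGUSR1, SIGRTMIN+1, and SIGRTMIN+5 under these assumptions.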

Reaping and supervision:

  • EP_SUBREAPER=1 - enable subreaper mode (PR_SET_CHILD_SUBREAPER)
  • EP_RESTART_ENABLED=1 - enable restart-on-crash
  • EP_MAX_RESTARTS - max restart attempts (0 = unlimited)
  • EP_RESTART_BACKOFF_SECONDS - backoff delay between restarts
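
The restart knobs can be modelled with a short loop. This Python sketch only illustrates the documented `0 = unlimited` rule and the backoff delay; what counts as a "crash" is not specified here, so the `is_crash` predicate (any non-zero return code) is an assumption, and both function names are hypothetical.

```python
import subprocess
import time


def restarts_allowed(restarts_done, max_restarts):
    """EP_MAX_RESTARTS semantics: 0 means unlimited."""
    return max_restarts == 0 or restarts_done < max_restarts


def run_with_restarts(argv, max_restarts, backoff_seconds,
                      is_crash=lambda rc: rc != 0):  # assumption: non-zero = crash
    """Run argv, restarting after a crash until the restart budget is spent.
    Returns (last_returncode, restarts_performed)."""
    restarts = 0
    while True:
        rc = subprocess.call(argv, start_new_session=True)
        if not is_crash(rc) or not restarts_allowed(restarts, max_restarts):
            return rc, restarts
        time.sleep(backoff_seconds)  # EP_RESTART_BACKOFF_SECONDS
        restarts += 1
```

With `max_restarts=2`, a command that always fails is run three times in total: the initial attempt plus two restarts.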

Parsing notes:

  • numeric env vars are parsed as non-negative decimal integers
  • invalid / overflowed values are ignored (defaults apply)
  • timer-related values are clamped to fit signed 64-bit seconds
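
These rules can be sketched in Python as below. The overflow boundary is an assumption (unsigned 64-bit), since the notes above only say that overflowed values are ignored; `parse_env_uint` and `grace_seconds` are hypothetical names.

```python
U64_MAX = 2**64 - 1  # assumption: values above this count as overflowed
I64_MAX = 2**63 - 1  # timer values are clamped to signed 64-bit seconds


def parse_env_uint(value, default):
    """Parse a non-negative decimal string; fall back to default if invalid."""
    if value is None or not value.isascii() or not value.isdigit():
        return default   # missing, empty, signed, or non-numeric
    number = int(value)
    if number > U64_MAX:
        return default   # overflowed: ignored, the default applies
    return number


def grace_seconds(env):
    """Read EP_GRACE_SECONDS (default 10) and clamp it for timer use."""
    return min(parse_env_uint(env.get("EP_GRACE_SECONDS"), 10), I64_MAX)
```

For example, `grace_seconds({"EP_GRACE_SECONDS": "30"})` gives 30, while `"-5"` or `"abc"` falls back to the default of 10.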

Trying it in Docker

A minimal pattern is: copy mini-init into the image and use it as ENTRYPOINT.

Example (single-arch, scratch runtime):

FROM debian:stable-slim AS build

RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        nasm make binutils ca-certificates && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /src
COPY . .
RUN make

FROM scratch
COPY --from=build /src/build/mini-init-amd64 /mini-init
COPY your-app /your-app

ENTRYPOINT ["/mini-init", "--"]
CMD ["/your-app"]

Build and run:

docker build -t mini-init-asm-demo .
docker run --rm -it mini-init-asm-demo

# From another terminal:
docker stop <container-id>

Expected behavior:

  • your app receives SIGTERM (and should exit cleanly if it handles shutdown)
  • any subprocesses get the same signal (group-wide fan-out)
  • zombies get reaped by PID 1

Build and test

Prerequisites (Debian/Ubuntu)

sudo apt-get update
sudo apt-get install -y nasm make binutils

Build (x86-64)

make
./build/mini-init-amd64 -- /bin/sh -c 'echo hello && sleep 5'

Cross-build (ARM64 / AArch64)

sudo apt-get install -y gcc-aarch64-linux-gnu binutils-aarch64-linux-gnu
make build-arm64

Run ARM64 via QEMU (on x86 host)

sudo apt-get install -y qemu-user-static

# Recommended: ensures the second `--` reaches mini-init (QEMU can swallow delimiters)
qemu-aarch64-static -- ./build/mini-init-arm64 -- /bin/sh -c 'echo hello && sleep 5'

Tests

make test # e2e tests on x86-64 host
make test-all # broader coverage (edge cases, restart, exit mapping, etc.)
make test-arm64 # ARM64 smoke tests via QEMU

(See scripts/ for individual test harnesses.)


Notes and limitations

This project is intentionally narrow:

  • no privilege dropping
  • no seccomp/capability/AppArmor/SELinux configuration (use orchestrator/image-level security policies)
  • not trying to be a general-purpose supervisor
  • restart mode is intentionally minimal (use real supervisors if you need richer lifecycle management)

Compared to Tini

If you already use Tini, the "core semantics" should feel familiar:

  • signal forwarding
  • zombie reaping
  • optional subreaper behavior

Where mini-init-asm differs:

  • PGID-mode is always on (signals go to the process group by default)
  • implementation is pure assembly with direct syscalls (no libc)
  • it has a small, env-driven restart-on-crash mode (Tini intentionally avoids supervision)

For most production workloads, Tini (or Docker's --init) remains the easiest default. Use mini-init-asm when you specifically want the PGID-first behavior + a tiny static binary + "readable down to syscalls".