Systemd Sandboxing
The server applies systemd-level hardening to both host services and system daemons. This page covers Cloudflared tunnel sandboxing, journald hardening, emergency mode, disabled services, and shutdown timeouts.
Source: server/containers/cloudflared.nix, server/modules/systemd.nix
Cloudflared Sandboxing
Cloudflared runs as a systemd service on the host (not in a container), so its sandboxing is critical. It has one of the most restrictive systemd configurations in the setup.
Source: server/containers/cloudflared.nix
Full Configuration

    systemd.services.cloudflared-tunnel = {
      after = [ "network-online.target" ];
      wants = [ "network-online.target" ];
      wantedBy = [ "multi-user.target" ];
      serviceConfig = {
        ExecStart = "${pkgs.cloudflared}/bin/cloudflared tunnel run --token-file ${config.age.secrets.cloudflare_tunnel_token.path}";
        Restart = "on-failure";
        RestartSec = 5;
        User = "cloudflared";
        Group = "cloudflared";
        # ... (full sandboxing below)
      };
    };

Sandboxing Breakdown
User and Process Isolation
| Directive | Value | Effect |
|---|---|---|
| User / Group | cloudflared | Runs as a dedicated unprivileged system user |
| PrivateUsers | true | Runs in a user namespace; only root and the service's own user and group are mapped, and all other host users appear as nobody |
| PrivatePIDs | true | Separate PID namespace; cannot see or signal other processes |
| UMask | 0077 | Files created by the service are owner-only |
| KeyringMode | private | Separate kernel keyring; cannot access host or other services' keys |
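
The dedicated account and the token it reads have to be declared somewhere in the configuration. A minimal sketch, assuming the secret is managed with agenix (the ExecStart line references config.age.secrets) and that the encrypted file lives at a hypothetical ../secrets/cloudflare_tunnel_token.age; the actual declarations in this repository may differ:

```nix
{
  # Dedicated unprivileged system account for the tunnel.
  users.users.cloudflared = {
    isSystemUser = true;
    group = "cloudflared";
  };
  users.groups.cloudflared = { };

  # Assumption: agenix manages the token; the .age path is hypothetical.
  age.secrets.cloudflare_tunnel_token = {
    file = ../secrets/cloudflare_tunnel_token.age;
    owner = "cloudflared";
    group = "cloudflared";
    mode = "0400"; # readable by the service user only
  };
}
```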
Filesystem Isolation
| Directive | Value | Effect |
|---|---|---|
| ProtectSystem | strict | Entire filesystem is read-only except explicit ReadWritePaths |
| ProtectHome | true | /home, /root, /run/user are inaccessible |
| PrivateTmp | true | Separate /tmp and /var/tmp; cleaned on service stop |
| PrivateMounts | true | Separate mount namespace; mounts made by the service do not propagate to the host |
| PrivateDevices | true | Only pseudo-devices (/dev/null, /dev/zero, etc.) are available |
| ProcSubset | pid | Only process-related entries are visible in /proc; other top-level files and directories are hidden |
| ProtectProc | invisible | Processes owned by other users are hidden in /proc |
| ReadOnlyPaths | [token path] | The tunnel token is mounted read-only for the service |
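
Expressed in Nix, the filesystem rows above might look like the following. This is a sketch assembled from the table, not a copy of the source file; in particular, the ReadOnlyPaths value assumes the token path is the same agenix path that ExecStart passes to --token-file.

```nix
{ config, ... }:
{
  systemd.services.cloudflared-tunnel.serviceConfig = {
    ProtectSystem = "strict";
    ProtectHome = true;
    PrivateTmp = true;
    PrivateMounts = true;
    PrivateDevices = true;
    ProcSubset = "pid";
    ProtectProc = "invisible";
    # Assumption: same secret path that ExecStart passes to --token-file.
    ReadOnlyPaths = [ config.age.secrets.cloudflare_tunnel_token.path ];
  };
}
```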
Kernel Protection
| Directive | Value | Effect |
|---|---|---|
| ProtectKernelTunables | true | /proc/sys, /sys, etc. are read-only |
| ProtectKernelModules | true | Cannot load/unload kernel modules |
| ProtectKernelLogs | true | Cannot read kernel ring buffer (dmesg) |
| ProtectControlGroups | true | Cgroup hierarchy is read-only |
| ProtectHostname | true | Cannot change the system hostname |
| ProtectClock | true | Cannot change the system clock |
Capability Restrictions

    AmbientCapabilities = null;
    CapabilityBoundingSet = null;

All Linux capabilities are dropped, so the service has no elevated privileges. Cloudflared only needs to make outbound network connections, which require no capabilities.
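
For contrast, a service that must bind a port below 1024 cannot run with an empty capability set. The sketch below uses a hypothetical unit name (example-web is not part of this setup) and grants exactly one capability:

```nix
{
  # Hypothetical service, shown only to contrast with cloudflared's empty sets.
  systemd.services.example-web.serviceConfig = {
    # Permit binding to ports below 1024 and nothing else.
    AmbientCapabilities = [ "CAP_NET_BIND_SERVICE" ];
    CapabilityBoundingSet = [ "CAP_NET_BIND_SERVICE" ];
  };
}
```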
Network Restrictions

    RestrictAddressFamilies = [ "AF_INET" "AF_INET6" ];

Only IPv4 and IPv6 sockets are allowed. Unix domain sockets (AF_UNIX), Netlink (AF_NETLINK), and all other socket families are blocked.
System Call Filtering

    SystemCallArchitectures = "native";
    SystemCallFilter = [
      "~@clock"
      "~@cpu-emulation"
      "~@debug"
      "~@module"
      "~@mount"
      "~@obsolete"
      "~@privileged"
      "~@raw-io"
      "~@reboot"
      "~@resources"
      "~@swap"
    ];

| Blocked Group | Syscalls Blocked |
|---|---|
| @clock | clock_settime, settimeofday, etc. |
| @cpu-emulation | modify_ldt, subpage_prot, etc. |
| @debug | ptrace, perf_event_open, etc. |
| @module | init_module, finit_module, delete_module |
| @mount | mount, umount2, pivot_root, etc. |
| @obsolete | Legacy syscalls (sysfs, uselib, etc.) |
| @privileged | chroot, setuid, capset, etc. |
| @raw-io | ioperm, iopl, pciconfig_read, etc. |
| @reboot | reboot, kexec_load, kexec_file_load |
| @resources | setrlimit, sched_setscheduler, etc. |
| @swap | swapon, swapoff |

SystemCallArchitectures = "native" restricts the service to the native syscall ABI, so 32-bit syscalls are blocked on the 64-bit system, preventing compatibility-mode exploitation.
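
The filter above is a deny-list: any syscall not named in a blocked group stays allowed. A commonly used alternative starts from systemd's @system-service allow-list and then subtracts groups. The sketch below shows that variant for comparison only; it would replace the list above rather than extend it, it is not what the source file uses, and whether cloudflared runs cleanly under it is an untested assumption.

```nix
{
  systemd.services.cloudflared-tunnel.serviceConfig = {
    SystemCallArchitectures = "native";
    SystemCallFilter = [
      # Start from the allow-list systemd provides for typical services...
      "@system-service"
      # ...then remove groups that the allow-list still includes.
      "~@privileged"
      "~@resources"
    ];
    # Return EPERM for a blocked call instead of killing the process.
    SystemCallErrorNumber = "EPERM";
  };
}
```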
Other Restrictions
| Directive | Value | Effect |
|---|---|---|
| NoNewPrivileges | true | Cannot gain privileges via SUID/SGID binaries |
| LockPersonality | true | Cannot change execution domain (blocks personality-based attacks) |
| MemoryDenyWriteExecute | true | Cannot create writable+executable memory mappings (blocks JIT exploitation) |
| RestrictRealtime | true | Cannot acquire realtime scheduling |
| RestrictSUIDSGID | true | Cannot create SUID/SGID files |
| RestrictNamespaces | true | Cannot create new namespaces (blocks container escape techniques) |
| RemoveIPC | true | All IPC resources are cleaned on service stop |
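
systemd can score a unit's sandboxing with systemd-analyze security. One way to make that check repeatable is a oneshot unit that fails when the exposure score rises above a chosen limit. A sketch, assuming a hypothetical unit name cloudflared-sandbox-audit and an arbitrary threshold of 20 on systemd-analyze's 0 to 100 scale:

```nix
{ config, ... }:
{
  # Hypothetical helper unit; start it manually with:
  #   systemctl start cloudflared-sandbox-audit
  systemd.services.cloudflared-sandbox-audit = {
    description = "Fail if cloudflared-tunnel sandboxing regresses";
    serviceConfig = {
      Type = "oneshot";
      # Exits non-zero when the unit's exposure score exceeds the threshold.
      ExecStart = "${config.systemd.package}/bin/systemd-analyze security --no-pager --threshold=20 cloudflared-tunnel.service";
    };
  };
}
```

The threshold value is a placeholder; a sensible limit would be chosen by running the command once and reading the current score.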
Journald Hardening
Source: server/modules/systemd.nix

    systemd.services.systemd-journald.serviceConfig = {
      # Quoted so the leading zeros survive; Nix has no octal integer literals.
      UMask = "0077";
      PrivateNetwork = true;
      ProtectHostname = true;
      ProtectKernelModules = true;
    };

| Directive | Value | Effect |
|---|---|---|
| UMask | 0077 | Files created by journald are owner-only (root) by default |
| PrivateNetwork | true | Journald has no network access (blocks remote log exfiltration) |
| ProtectHostname | true | Cannot change hostname |
| ProtectKernelModules | true | Cannot load kernel modules |
Emergency Mode

    boot.initrd.systemd.suppressedUnits = [ "emergency.service" "emergency.target" ];
    systemd.enableEmergencyMode = false;

Emergency mode is completely disabled. On a headless server with no physical access, an emergency shell is an attack vector, not a recovery tool. If the system fails to boot, it should reboot (via panic=1) and attempt to recover automatically.
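
The panic=1 behaviour is a kernel parameter rather than a systemd setting. A minimal sketch of how it could be set in the same configuration; where the source actually sets it is not shown on this page:

```nix
{
  # Reboot one second after a kernel panic instead of halting.
  boot.kernelParams = [ "panic=1" ];
}
```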
Disabled Services
Unnecessary services are explicitly disabled to reduce attack surface:

    systemd.services = {
      pre-sleep.enable = false;
      prepare-kexec.enable = false;
      systemd-rfkill.enable = false;
      systemd-hibernate-clear.enable = false;
      systemd-networkd-wait-online.enable = false;
    };

| Service | Reason for Disabling |
|---|---|
| pre-sleep | Server never sleeps |
| prepare-kexec | kexec is disabled via sysctl (kexec_load_disabled = 1) |
| systemd-rfkill | No wireless hardware |
| systemd-hibernate-clear | Hibernation is disabled (hibernate=no boot param) |
| systemd-networkd-wait-online | Static IP; no need to wait for DHCP |
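
Two of the reasons in the table rest on settings made elsewhere. A sketch of what those might look like in Nix, assuming they are set through boot.kernel.sysctl and boot.kernelParams (the table names the values but not the options used):

```nix
{
  # Disables loading a new kernel via kexec until the next reboot.
  boot.kernel.sysctl."kernel.kexec_load_disabled" = 1;

  # Tells the kernel to disable hibernation and resume.
  boot.kernelParams = [ "hibernate=no" ];
}
```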
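Shutdown Timeouts

    systemd.settings.Manager = {
      RuntimeWatchdogSec = "15s";
      RebootWatchdogSec = "30s";
      KExecWatchdogSec = "1m";
    };

Short watchdog timeouts ensure the server doesn't hang indefinitely. If systemd stops responding for 15 seconds at runtime, the hardware watchdog resets the machine. If a reboot has not completed within 30 seconds, or a kexec within 1 minute, the watchdog forces a reset. These settings require a hardware watchdog device and do not change how long systemd waits for an individual service to stop.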
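
The time systemd waits for an individual service to stop before sending SIGKILL is controlled by the DefaultTimeoutStopSec manager setting (90 seconds by default upstream). A sketch, assuming a 15-second limit is wanted for every unit and that it is set through the same systemd.settings.Manager option:

```nix
{
  # Assumption: applies to all units that do not set their own TimeoutStopSec.
  systemd.settings.Manager.DefaultTimeoutStopSec = "15s";
}
```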