Troubleshooting
Common issues and their solutions for both systems.
Boot Issues
Laptop: Secure Boot Failure
Symptom: System won't boot after NixOS rebuild; UEFI reports Secure Boot violation.
Cause: Lanzaboote failed to sign a boot component, or Secure Boot keys were not enrolled.
Fix:
# 1. Disable Secure Boot in BIOS temporarily
# 2. Boot into NixOS
# 3. Check signing status
sbctl verify
# 4. If unsigned entries exist, rebuild
sudo nixos-rebuild switch --flake ~/nixos-config/laptop#laptop
# 5. Re-enable Secure Boot in BIOS
If keys need re-enrollment:
sudo sbctl enroll-keys --microsoft
Server: Stuck at LUKS Prompt
Symptom: Server is unreachable after reboot; it's waiting for the LUKS passphrase in the initrd.
Fix: Connect via SSH to the initrd and unlock:
ssh -p 22 root@192.168.1.20
# In the initrd shell:
cryptsetup-askpass
# Enter LUKS passphrase
Network Requirements
The initrd SSH server requires the r8169 network driver (loaded via availableKernelModules) and DHCP via udhcpc. Ensure the server's Ethernet port is connected and the router provides DHCP.
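After a reboot it can take a while before the initrd has loaded the driver and obtained a DHCP lease. A minimal sketch of a wait loop follows. The helper name wait_for_port and its defaults are hypothetical, and it assumes bash and timeout are available on the machine you connect from:

```shell
# Sketch: poll until a TCP port accepts connections.
# wait_for_port is a hypothetical helper, not part of NixOS.
# Arguments: host, port, number of tries (default 30), delay in seconds (default 2).
wait_for_port() {
  host=$1; port=$2; tries=${3:-30}; delay=${4:-2}
  i=0
  while [ "$i" -lt "$tries" ]; do
    # bash's /dev/tcp pseudo-device opens a TCP connection;
    # timeout bounds each attempt to 2 seconds.
    if timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example: wait_for_port 192.168.1.20 22 && ssh -p 22 root@192.168.1.20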
Server: Boot Loop (Panic)
Symptom: Server reboots repeatedly; never becomes reachable.
Cause: A critical boot service failed, triggering boot.panic_on_fail → panic=1 (auto-reboot after 1 second).
Fix:
- Connect a monitor and keyboard (or serial console)
- At the systemd-boot menu, select a previous generation
- Once booted, check logs:
journalctl -b -1 -p err
- Fix the issue and rebuild
If no previous generations work, boot from a live USB:
# Mount the BTRFS volume
cryptsetup open /dev/nvme0n1p2 crypted
mount /dev/mapper/crypted /mnt -o subvol=nix
# Chroot and debug
nixos-enter --root /mnt
Impermanence Issues
Missing Files After Reboot
Symptom: A file or directory you created is gone after reboot.
Cause: The root filesystem is wiped on every boot. Only paths under /persist survive.
Fix: Add the path to the impermanence configuration in server/modules/impermanence.nix:
environment.persistence."/persist" = {
  directories = [
    "/your/directory"
  ];
  files = [
    "/your/file"
  ];
};
Then rebuild:
sudo nixos-rebuild switch --flake /home/nixos/nixos-homelab#homelab
Container Data Missing
Symptom: A container lost its data after reboot.
Cause: Container data is stored under /var/lib/nixos-containers/<name>/. This path must be persisted.
Verify: Check that the container's data directory exists under /persist:
ls -la /persist/var/lib/nixos-containers/
The impermanence module persists /var/lib/nixos-containers as a directory, so all container state should survive. If a specific container's data is missing, check if it stores data outside this path.
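To check several paths at once, something like the following sketch may help. The helper names persisted_path and check_persisted and the PERSIST_ROOT variable are hypothetical. The sketch assumes that a persisted path lives at the same location under /persist:

```shell
# Sketch: report which paths have a counterpart under the persist root.
# PERSIST_ROOT, persisted_path and check_persisted are hypothetical names.
persisted_path() {
  printf '%s%s\n' "${PERSIST_ROOT:-/persist}" "$1"
}

check_persisted() {
  # Prints one line per path; returns non-zero if any path is missing.
  missing=0
  for p in "$@"; do
    if [ -e "$(persisted_path "$p")" ]; then
      echo "ok       $p"
    else
      echo "MISSING  $p"
      missing=1
    fi
  done
  return "$missing"
}
```

For example: check_persisted /var/lib/nixos-containers /your/directory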
Container Networking
Container Can't Reach the Internet
Symptom: A container can't download packages or connect to external services.
Check NAT configuration:
# Verify NAT is active
sudo iptables -t nat -L -n
# Check if the container's veth interface exists
ip link show | grep ve-
# Check container's IP
machinectl shell <container> /run/current-system/sw/bin/ip addr
Common causes:
- NAT not configured for the interface: Ensure networking.nat.internalInterfaces = [ "ve-+" ] is set
- DNS not resolving: The container should use the host as its DNS server or have its own DNS config
- Firewall blocking: Check iptables -L -n for relevant FORWARD rules
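The first cause can be narrowed down by filtering command output rather than reading it by eye. The sketch below uses two hypothetical helpers, has_masquerade and list_veths, that read command output on stdin. It assumes the iptables backend and a MASQUERADE rule; a setup that uses SNAT or nftables would need a different pattern:

```shell
# Sketch: filters for the NAT and veth checks above.
# has_masquerade and list_veths are hypothetical helper names.

# Reads `iptables -t nat -S` output; succeeds if any rule masquerades traffic.
has_masquerade() {
  grep -q -- '-j MASQUERADE'
}

# Reads `ip -o link show` output; prints the names of container veth interfaces.
list_veths() {
  awk -F': ' '{print $2}' | grep '^ve-' | sed 's/@.*//'
}
```

For example: sudo iptables -t nat -S | has_masquerade || echo "no MASQUERADE rule found", and ip -o link show | list_veths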
Container Can't Reach Host Services
Symptom: A container can't connect to services on the host (e.g., AdGuard on port 53).
Fix: Containers use their hostAddress to reach the host. Verify the addressing:
# Inside the container
ping <hostAddress> # e.g., 192.168.100.10
If pinging fails, check that the container's network config matches:
containers.<name> = {
  hostAddress = "192.168.100.10";
  localAddress = "192.168.100.11";
};
Service Unreachable via Domain Name
Symptom: https://service.nemnix.site doesn't work from the LAN.
Causes:
- DNS not pointing to server: Ensure your router uses 192.168.1.20 as DNS, or configure AdGuard DNS rewrites
- AdGuard DNS rewrite missing: Check AdGuard Home UI for a rewrite rule mapping service.nemnix.site → 192.168.1.20
- Traefik route missing: Check Traefik logs: journalctl -M traefik -u traefik
- TLS certificate not issued: Check Traefik ACME logs for DNS-01 challenge failures
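The two DNS causes can be told apart by asking the server directly and comparing the answer with the expected address. A minimal sketch follows. The helper name answer_matches is hypothetical, and the usage line assumes dig is installed on the client:

```shell
# Sketch: check a resolver answer against an expected address.
# answer_matches is a hypothetical helper name.
# Reads addresses on stdin (one per line); succeeds if one equals $1 exactly.
answer_matches() {
  grep -Fxq -- "$1"
}
```

For example: dig +short service.nemnix.site @192.168.1.20 | answer_matches 192.168.1.20 || echo "rewrite missing or wrong". If this succeeds but the name still fails to resolve from a LAN client, the client is probably not using 192.168.1.20 as its DNS server.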
Secrets Issues
agenix Decryption Failure
Symptom: nixos-rebuild switch fails with an agenix error saying it cannot decrypt secrets.
Cause: The SSH host key doesn't match the public keys in secrets.nix.
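Whether the host key is listed can also be checked mechanically. The sketch below assumes the usual OpenSSH public key layout (type, base64 body, comment) and that secrets.nix contains the key body verbatim. The helper name host_key_listed is hypothetical:

```shell
# Sketch: succeed if the base64 body of a public key file appears in a secrets file.
# host_key_listed is a hypothetical helper name.
# Arguments: path to the .pub file, path to secrets.nix.
host_key_listed() {
  pubfile=$1; secrets=$2
  # Field 2 of an OpenSSH public key line is the base64-encoded key.
  key_body=$(awk '{print $2; exit}' "$pubfile")
  [ -n "$key_body" ] && grep -Fq -- "$key_body" "$secrets"
}
```

For example: host_key_listed /persist/etc/ssh/ssh_host_ed25519_key.pub server/secrets/secrets.nix || echo "host key not in secrets.nix"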
Fix:
# Check the host key fingerprint
ssh-keygen -lf /persist/etc/ssh/ssh_host_ed25519_key.pub
# Compare with the key in secrets.nix
cat server/secrets/secrets.nix
If they don't match, you need to either:
- Restore the correct host key from backup
- Or re-key all secrets with the new key:
# Add the new public key to secrets.nix
# Then re-encrypt all secrets
cd server/secrets
agenix -r
Secret File Not Found at Runtime
Symptom: A service fails to start, logging that its secret file doesn't exist.
Check:
# List decrypted secrets
ls -la /run/agenix/
# Check the service's expected path
systemctl show <service> | grep -i secret
Common causes:
- Secret not defined in the module: Ensure age.secrets.<name> is defined
- Identity path wrong: Verify age.identityPaths points to /persist/etc/ssh/ssh_host_ed25519_key
- File permission issue: Check that the secret's owner matches the service's User
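The permission check in the last item can be sketched as a small comparison. The helper name owner_matches is hypothetical, and stat -c assumes GNU coreutils:

```shell
# Sketch: succeed if a file is owned by the given user.
# owner_matches is a hypothetical helper name.
# Arguments: path to the secret file, expected user name.
# The expected user can be read with: systemctl show <service> -p User --value
owner_matches() {
  [ "$(stat -c %U "$1")" = "$2" ]
}
```

An empty User value from systemctl show means the service has no User set, so compare against root in that case.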
Backup Issues
Restic Backup Failing
# Check the timer and last run
systemctl status restic-backups-backup.timer
journalctl -u restic-backups-backup.service --since today
# Test manually
sudo restic -r /backup --password-file /run/agenix/restic_password snapshots
Common causes:
- Backup disk full: Check df -h /backup
- Password file missing: Check ls -la /run/agenix/restic_password
- Repository corruption: Run restic -r /backup --password-file /run/agenix/restic_password check
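The disk-full check is easy to script, for instance as a pre-backup warning. A minimal sketch follows. The helper names usage_percent and backup_disk_full and the 90 percent default threshold are hypothetical:

```shell
# Sketch: warn when a filesystem is nearly full.
# usage_percent and backup_disk_full are hypothetical helper names.

# Prints the used capacity of the filesystem holding $1 as a bare number.
usage_percent() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeeds if usage is at or above the threshold ($2, default 90).
backup_disk_full() {
  [ "$(usage_percent "$1")" -ge "${2:-90}" ]
}
```

For example: backup_disk_full /backup && echo "backup disk is nearly full"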
Restic Repository Repair
# Check for errors
restic -r /backup --password-file /run/agenix/restic_password check
# If pack files may be damaged, also verify their contents (slow: reads all data)
restic -r /backup --password-file /run/agenix/restic_password check --read-data
# Repair index
restic -r /backup --password-file /run/agenix/restic_password repair index
# Repair snapshots
restic -r /backup --password-file /run/agenix/restic_password repair snapshots
Auto-Upgrade Issues
Upgrade Failed
# Check the upgrade log
journalctl -u nixos-upgrade.service -b
# Common issues:
# - Network failure during flake input fetch
# - Build failure in updated packages
# - Git commit failure (dirty working tree)
If the upgrade broke the system:
# Roll back to the previous generation
sudo nixos-rebuild switch --rollback
# Or select a previous generation at boot
Git Amend Failed
The ExecStartPost git amend has || true, so it won't fail the upgrade. If commits aren't appearing, check the repository state:
cd /home/nixos/nixos-homelab
git log --oneline -5
git status
Common cause: the working tree has uncommitted changes that prevent the amend.
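A dirty tree can be detected without reading the git status output by hand. The sketch below uses a hypothetical helper, tree_is_dirty, which treats any output from git status --porcelain (including untracked files) as dirty:

```shell
# Sketch: succeed if the repository at $1 has uncommitted or untracked changes.
# tree_is_dirty is a hypothetical helper name.
tree_is_dirty() {
  [ -n "$(git -C "$1" status --porcelain)" ]
}
```

For example: tree_is_dirty /home/nixos/nixos-homelab && echo "commit or stash changes first"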
Performance Issues
High Memory Usage
# Check per-process memory
btop # or: ps aux --sort=-%mem | head
# Check container memory
for c in $(machinectl list --no-legend | awk '{print $1}'); do
  echo "$c: $(systemctl show "systemd-nspawn@$c" --property=MemoryCurrent)"
done
Disk I/O Issues
# Check I/O scheduler per device
for dev in /sys/block/*/queue/scheduler; do
  echo "$dev: $(cat "$dev")"
done
# Check if fstrim has run recently
journalctl -u fstrim.service --since "1 week ago"
Getting Help
# System overview
btop
# Security audit
sudo lynis audit system
# Check all failed services
systemctl --failed
# Full system journal (errors only)
journalctl -p err -b