Troubleshooting and Recovery


The Pi 5 Vault Node is a Raspberry Pi 5 that hosts a 5×10TB RAID5 array (/dev/md0) mounted at /media/plutus, exposes a single Samba share named Cornucopia, and runs the arr stack natively (Sonarr, Radarr, Prowlarr, qBittorrent‑nox, NZBGet). This runbook covers detection, containment, recovery, verification, and post‑incident documentation for the most common failures.


Quick Incident Checklist (30‑second view)

  1. Identify subsystem: RAID / drive / filesystem / service / Samba.

  2. Isolate: if filesystem corruption is suspected or a rebuild is required, stop the arr services first.

  3. Diagnose with mdadm, smartctl, journalctl, getfacl.

  4. Act using the steps below for the specific scenario.

  5. Verify with the verification commands.

  6. Document commands run, timestamps, and outcomes.
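Step 6 is easier to follow if every command is recorded as it runs. The helper below is a minimal sketch of that idea; the function name run_logged and the INCIDENT_LOG path are hypothetical and not part of the node's existing tooling.

```shell
# Hypothetical helper: run a command and append a timestamped record of it.
# INCIDENT_LOG is an assumed path; override it to suit your environment.
INCIDENT_LOG="${INCIDENT_LOG:-./incident.log}"

run_logged() {
    # Record the start time (UTC) and the exact command line.
    printf '%s RUN  %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*" >> "$INCIDENT_LOG"
    # Capture combined stdout/stderr and the exit status.
    rl_rc=0
    rl_out=$("$@" 2>&1) || rl_rc=$?
    # Show the output on the terminal and append it to the log.
    printf '%s\n' "$rl_out" | tee -a "$INCIDENT_LOG"
    # Record the outcome with its own timestamp.
    printf '%s EXIT %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$rl_rc" >> "$INCIDENT_LOG"
    return "$rl_rc"
}

# Example (not run here):
#   run_logged sudo mdadm --detail /dev/md0
```

Output is shown only after the command finishes, so long-running commands such as watch are better run outside the helper.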

Commands — Verification (run after fixes)

bash

# RAID health
sudo mdadm --detail /dev/md0
cat /proc/mdstat

# Mount & usage
df -h /media/plutus
mount | grep md0

# Services
systemctl status sonarr radarr prowlarr qbittorrent-nox nzbget

# Samba & ACLs
sudo testparm -s
getfacl /media/plutus | head -n 20
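The RAID health check can also be scripted. The sketch below reads the member map that /proc/mdstat prints for each array (for example [UUUUU] when all five members are up, with an underscore for each missing member). The function name md_is_healthy is hypothetical, and the second argument exists only so the check can be tried against a saved copy of the file.

```shell
md_is_healthy() {
    # $1: array name as shown in /proc/mdstat (e.g. md0)
    # $2: file to read (defaults to /proc/mdstat)
    awk -v dev="$1" '
        $1 == dev && $2 == ":" { found = 1; next }   # header line of our array
        found && $2 == ":"     { exit }              # reached another array first
        found && /\[[U_]+\]/ {                       # line carrying the member map
            seen = 1
            bad = ($0 ~ /\[[U_]*_[U_]*\]/)           # any underscore: member missing
            exit
        }
        END { if (seen && !bad) exit 0; exit 1 }
    ' "${2:-/proc/mdstat}"
}

# Example (not run here):
#   md_is_healthy md0 && echo "md0 OK" || echo "md0 degraded or not found"
```

The function returns non-zero both when the array is degraded and when it is not listed at all, so an unassembled array is not reported as healthy.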

Scenario Playbooks (copy/paste)

1 Drive failure (detected via SMART or mdadm)

Detect

bash

sudo mdadm --detail /dev/md0
sudo smartctl -a /dev/sdX

Mark failed and remove

bash

sudo mdadm --manage /dev/md0 --fail /dev/sdX
sudo mdadm --manage /dev/md0 --remove /dev/sdX

Replace, partition, add

bash

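# Wipe any old partition table on the NEW disk (destructive; confirm the device first)
sudo sgdisk -Z /dev/sdX
# Create one full-disk partition of type fd00 (Linux RAID)
sudo sgdisk -n1:0:0 -t1:fd00 /dev/sdX
# Add the new partition. If mdadm --detail shows the existing members are whole
# disks rather than partitions, skip the sgdisk -n step and add /dev/sdX instead.
sudo mdadm --add /dev/md0 /dev/sdX1
watch -n 10 cat /proc/mdstat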

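Safety: double‑check the device node before failing or removing it. /dev/sdX names can change between boots, so match the drive's serial number (smartctl -i /dev/sdX) against the label on the physical disk.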
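One way to confirm which physical disk a device node refers to is to list the persistent names udev creates under /dev/disk/by-id, which normally embed the model and serial number. This is a sketch; the function name disk_ids is hypothetical, and the second argument is only there so the lookup can be pointed at another directory.

```shell
disk_ids() {
    # $1: device node (e.g. /dev/sdX)
    # $2: directory of persistent-name symlinks (defaults to /dev/disk/by-id)
    di_target=$(readlink -f "$1")
    for di_link in "${2:-/dev/disk/by-id}"/*; do
        # Skip the unexpanded pattern when the directory is empty.
        [ -e "$di_link" ] || continue
        # Print every persistent name that resolves to the same device.
        if [ "$(readlink -f "$di_link")" = "$di_target" ]; then
            basename "$di_link"
        fi
    done
}

# Example (not run here):
#   disk_ids /dev/sdX
```

Comparing the printed name with the serial on the drive label before running --fail or sgdisk -Z guards against acting on the wrong disk.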

2 Array degraded but drive healthy

Checks

bash

cat /proc/mdstat
sudo mdadm --detail /dev/md0
dmesg | grep -i sata

Actions
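One possible action, sketched here as an assumption rather than a confirmed procedure for this node: if the checks point to a transient cable, power, or controller fault and SMART looks clean, reseat the connection and return the member to the array. The function name readd_member and the DRY_RUN switch are hypothetical; the sketch also assumes the member already shows as removed in mdadm --detail.

```shell
readd_member() {
    # $1: array (e.g. /dev/md0); $2: member that dropped out (e.g. /dev/sdX)
    # With DRY_RUN=1 the mdadm command is printed instead of executed.
    rm_run=""
    if [ "${DRY_RUN:-0}" = 1 ]; then
        rm_run="echo"
    fi
    # Try --re-add first: with a write-intent bitmap it may need only a
    # partial resync. Fall back to --add, which triggers a full rebuild.
    $rm_run sudo mdadm --manage "$1" --re-add "$2" ||
        $rm_run sudo mdadm --manage "$1" --add "$2"
}

# Example (not run here):
#   DRY_RUN=1 readd_member /dev/md0 /dev/sdX   # preview
#   readd_member /dev/md0 /dev/sdX && watch -n 10 cat /proc/mdstat
```

If the same member drops out again after being re-added, treating it as a failing drive (scenario 1) is the safer path.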

3 Corrupted filesystem

Take offline

bash

sudo systemctl stop sonarr radarr prowlarr qbittorrent-nox nzbget
sudo umount /media/plutus

Repair

bash

sudo fsck.ext4 -f /dev/md0
sudo mount /media/plutus
df -h /media/plutus

Safety: do not run fsck on a mounted filesystem.
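The safety rule above can be enforced with a small guard that checks the kernel's mount table before fsck runs. This is a sketch; the function name is_mounted is hypothetical, and the optional second argument exists only so the check can be tried against a saved mounts table.

```shell
is_mounted() {
    # $1: device or mount point to look for (e.g. /dev/md0 or /media/plutus)
    # $2: mounts table to read (defaults to /proc/mounts)
    # In /proc/mounts, field 1 is the device and field 2 is the mount point.
    awk -v t="$1" '$1 == t || $2 == t { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# Example (not run here):
#   if is_mounted /dev/md0; then
#       echo "/dev/md0 is still mounted; refusing to run fsck" >&2
#   else
#       sudo fsck.ext4 -f /dev/md0
#   fi
```

Checking the device as well as the mount point catches the case where the array is mounted somewhere other than /media/plutus.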

4 Service fails to import (Sonarr/Radarr)

Check

bash

ls -ld /media/plutus/downloads /media/plutus/Media
getfacl /media/plutus/Media
journalctl -u sonarr -n 200

Fixes

bash

sudo setfacl -R -m g:media:rwx /media/plutus
sudo setfacl -R -d -m g:media:rwx /media/plutus
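The ACL fix only helps if the service accounts actually belong to the media group. The sketch below checks that membership; the function name user_in_group is hypothetical, and the account names sonarr and radarr in the example are assumptions about how the services were installed.

```shell
user_in_group() {
    # $1: user name; $2: group name
    # id -nG prints the names of all groups the user belongs to on one line.
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -Fxq -- "$2"
}

# Example (not run here; account names are assumed):
#   for u in sonarr radarr; do
#       user_in_group "$u" media || echo "$u is not in group media"
#   done
```

A user added to the group with usermod -aG picks up the membership only after its service is restarted.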

5 Samba permission mismatch

Verify

bash

sudo testparm -s
getfacl /media/plutus

Reapply

bash

sudo chown -R plutus:plutus /media/plutus
sudo setfacl -R -m g:media:rwx /media/plutus
sudo setfacl -R -d -m g:media:rwx /media/plutus
sudo systemctl restart smbd
smbclient -L localhost -N

Note: changing the force user setting in smb.conf affects every client of the share, because all file operations on it then run as that user.
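Before changing anything, it can help to read back what Samba currently applies to the share. The sketch below pulls one parameter for one share out of the testparm -s dump; the function name share_param is hypothetical, and it assumes the dump's usual layout of [section] headers followed by key = value lines.

```shell
share_param() {
    # $1: share name (e.g. Cornucopia); $2: parameter name (e.g. "force user")
    # stdin: output of testparm -s
    awk -v want="[$1]" -v key="$2" '
        /^\[/ { in_share = ($0 == want); next }      # section header
        in_share {
            pos = index($0, "=")
            if (pos == 0) next
            k = substr($0, 1, pos - 1)
            v = substr($0, pos + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)           # trim the key
            gsub(/^[ \t]+|[ \t]+$/, "", v)           # trim the value
            if (k == key) print v
        }
    '
}

# Example (not run here):
#   sudo testparm -s 2>/dev/null | share_param Cornucopia "force user"
```

An empty result means the parameter is not set for that share in the dump, which usually indicates it is at its default.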

Post‑Incident Documentation (minimum)

Evidence to include in report (minimal, high‑signal)
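As a starting point, the outputs of the verification commands from this runbook can be captured into a single timestamped file. This sketch assumes those outputs are a reasonable minimum set of evidence; the function name collect_evidence and the report path are hypothetical.

```shell
collect_evidence() {
    # $1: output file for the report
    {
        printf '== Incident evidence, collected %s ==\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
        # Each entry is a command from the verification section of this runbook.
        for ce_cmd in \
            "cat /proc/mdstat" \
            "mdadm --detail /dev/md0" \
            "df -h /media/plutus" \
            "systemctl --no-pager status sonarr radarr prowlarr qbittorrent-nox nzbget" \
            "testparm -s"
        do
            printf '\n--- %s ---\n' "$ce_cmd"
            # Word splitting is intentional: entries are simple commands.
            # A failing command is noted in the report instead of aborting.
            $ce_cmd 2>&1 || printf '(exit status %s)\n' "$?"
        done
    } > "$1"
}

# Example (not run here; run as root so mdadm can read the array):
#   collect_evidence "/root/incident-$(date -u +%Y%m%dT%H%M%SZ).txt"
```

The commands typed during the incident and their timestamps still need to be added to the report by hand or from a command log.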