This repository has been archived on 2026-05-24. You can view files and clone it. You cannot open issues or pull requests or push a commit.

Development Infrastructure Recovery Runbook

Purpose

This runbook gives a simple recovery path when the Agrarian development machines or shared project storage are unreachable. It covers:

  • Unraid DevBox
  • Ubuntu-Codex
  • Windows-Builder
  • The projects SMB share

Use the least disruptive recovery path first. Do not reboot DevBox until VM-level and service-level checks have failed, because it hosts shared storage and the development VMs.

Current Baseline

| System | Role | Address / Name | Notes |
|---|---|---|---|
| DevBox | Unraid host, SMB storage, VM host | 192.168.5.8 / DevBox | Hosts projects share and VMs. |
| Ubuntu-Codex | Source-control and automation VM | 192.168.5.10 expected; current host access may also show 192.168.5.6 or 192.168.5.9 depending on interface | Mounts //192.168.5.8/projects at /mnt/projects. |
| Windows-Builder | Unreal/Visual Studio/GPU build VM | 192.168.5.12 | Uses fixed VirtIO MAC 52:54:00:17:ec:5d. |
| projects share | Shared Unreal project storage | \\DevBox\projects / /mnt/projects | Repo path is /mnt/projects/AgrarianGameBulid. |

First Triage

From Ubuntu-Codex or another LAN machine:

ping -c 3 192.168.5.8
ping -c 3 192.168.5.12
nc -vz -w 5 192.168.5.12 3389
mount | rg '/mnt/projects|cifs|smb'

From DevBox:

virsh list --all
virsh domiflist Windows-Builder
virsh domiflist Ubuntu-Codex

Expected VM NIC baselines:

  • Windows-Builder: bridge br0, model virtio-net, MAC 52:54:00:17:ec:5d
  • Ubuntu-Codex: bridge br0, model virtio-net, MAC 52:54:00:a5:cf:63
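
The NIC baselines above can be checked mechanically rather than by eye. The following is a minimal sketch, not existing tooling: check_nic_baseline is a hypothetical helper name, and it assumes the usual five-column virsh domiflist layout (Interface, Type, Source, Model, MAC). It checks only the bridge and MAC address, not the model.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): reads `virsh domiflist <VM>`
# output on stdin and succeeds only if some interface row uses the expected
# bridge (Source column) and MAC address (MAC column).
check_nic_baseline() {
  local expected_bridge="$1" expected_mac="$2"
  # Assumed column order: Interface Type Source Model MAC
  awk -v bridge="$expected_bridge" -v mac="$expected_mac" '
    $3 == bridge && tolower($5) == tolower(mac) { found = 1 }
    END { exit (found ? 0 : 1) }
  '
}
```

From DevBox, one possible use is: virsh domiflist Windows-Builder | check_nic_baseline br0 52:54:00:17:ec:5d && echo "NIC baseline OK"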

Safe Reboot Order

Use this order when multiple systems are unhealthy:

  1. Save or stop active work where possible.
  2. Restart only the failing service if the host is reachable.
  3. Restart the affected VM from inside the guest if guest access works.
  4. Use virsh shutdown <VM> from DevBox if guest access does not work.
  5. Use virsh reboot <VM> only when a graceful shutdown is not enough.
  6. Use virsh destroy <VM> only when the VM is hung and no graceful path works.
  7. Reboot DevBox only after confirming SMB, libvirt, or host networking cannot be recovered individually.
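
Step 4 of the order above can be sketched as a graceful shutdown that waits for the VM to stop before anyone considers escalating. This is an illustration under assumptions: graceful_shutdown is a hypothetical helper name, and the 120-second timeout and 5-second poll interval are placeholder values, not settings taken from this environment. It deliberately never calls virsh destroy; that decision stays with the operator.

```shell
#!/usr/bin/env bash
# Hypothetical helper: request a graceful shutdown, then poll the VM state.
# Timeout and interval defaults are assumed values; tune them per VM.
graceful_shutdown() {
  local vm="$1" timeout="${2:-120}" interval="${3:-5}" waited=0
  virsh shutdown "$vm" || return 1
  while [ "$waited" -lt "$timeout" ]; do
    # `virsh domstate` prints "shut off" once the guest has stopped.
    if [ "$(virsh domstate "$vm")" = "shut off" ]; then
      echo "$vm is shut off"
      return 0
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "$vm still running after ${timeout}s; inspect before escalating" >&2
  return 1
}
```

A non-zero exit means the guest did not stop in time, which is the point at which the later steps in the list apply.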

Before planned VM shutdowns, consider a manual VM backup if the change is risky:

/bin/bash /boot/config/custom/agrarian-vm-backup.sh --shutdown-running --vm Windows-Builder
/bin/bash /boot/config/custom/agrarian-vm-backup.sh --shutdown-running --vm Ubuntu-Codex

Windows-Builder Recovery

Use these in order:

  1. Confirm the VM is running:

    virsh domstate Windows-Builder
    virsh domiflist Windows-Builder
    
  2. From Ubuntu-Codex, confirm that RDP is listening:

    nc -vz -w 5 192.168.5.12 3389
    
  3. When possible, run guest commands through the QEMU guest agent before relying on RDP.

  4. If RDP is down but guest commands work, check the following in the guest (PowerShell):

    Get-Service TermService
    Get-NetConnectionProfile
    Get-NetFirewallRule -DisplayGroup "Remote Desktop"
    
  5. Restart RDP only if it is not listening:

    Restart-Service TermService -Force
    
  6. If Unreal visual inspection is needed, use Sunshine/Moonlight instead of RDP.
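
For step 3, one way to test whether the guest-agent path is usable is a guest-ping sent from DevBox. This sketch assumes the QEMU guest agent is installed and running inside Windows-Builder and that its channel is configured in the VM definition. guest_agent_alive is a hypothetical helper name and the 5-second timeout is an assumed value.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the QEMU guest agent inside the VM answers.
# guest-ping is a standard guest-agent command with no side effects.
guest_agent_alive() {
  local vm="$1"
  virsh qemu-agent-command "$vm" --timeout 5 '{"execute":"guest-ping"}' >/dev/null 2>&1
}
```

For example: guest_agent_alive Windows-Builder && echo "guest agent reachable". If this fails while the VM is running, the agent service or its channel is a likely suspect before RDP itself.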

Detailed Windows-Builder references:

  • Docs/Ops/WindowsBuilderNetworkRdpStability.md
  • Docs/Ops/WindowsBuilderGpuRemoteAccess.md

Ubuntu-Codex Recovery

Use these in order:

  1. Confirm VM state from DevBox:

    virsh domstate Ubuntu-Codex
    virsh domiflist Ubuntu-Codex
    
  2. Confirm SSH or console access.

  3. Confirm the project mount:

    mount | rg '/mnt/projects'
    ls -la /mnt/projects/AgrarianGameBulid
    
  4. If /mnt/projects is missing, remount the SMB share using the existing system mount configuration rather than creating a new ad hoc mount.

  5. Confirm Git access:

    git -C /mnt/projects/AgrarianGameBulid status --short
    git -C /mnt/projects/AgrarianGameBulid remote -v
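
The remount in step 4 might look like the sketch below. It assumes the "existing system mount configuration" is an /etc/fstab entry for /mnt/projects; if the share is mounted by a systemd unit or autofs instead, the equivalent command differs. ensure_projects_mounted is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: remount the projects share only if it is missing.
# Assumes /mnt/projects has an entry in /etc/fstab (not confirmed by this runbook).
ensure_projects_mounted() {
  local mount_point="${1:-/mnt/projects}"
  if mountpoint -q "$mount_point"; then
    echo "$mount_point already mounted"
    return 0
  fi
  # Given only a mount point, mount(8) takes the source and options from /etc/fstab,
  # so no ad hoc credentials or options are introduced here.
  sudo mount "$mount_point"
}
```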
    

Do not wipe local changes to recover the VM. Preserve uncommitted work first with a commit, patch, or backup copy outside the repo.
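
As one possible way to take the "patch" route, the sketch below writes all tracked changes to a timestamped patch file outside the repo and lists untracked files separately. save_uncommitted_patch is a hypothetical helper name and the backup directory is whatever the operator chooses. Large binary Unreal assets can make the patch very big, in which case a plain backup copy may be the better choice.

```shell
#!/usr/bin/env bash
# Hypothetical helper: save uncommitted work as a patch outside the repo.
# Prints the path of the patch file on success.
save_uncommitted_patch() {
  local repo="$1" backup_dir="$2" stamp patch
  stamp="$(date +%Y%m%d-%H%M%S)"
  patch="$backup_dir/uncommitted-$stamp.patch"
  mkdir -p "$backup_dir" || return 1
  # Diff against HEAD covers staged and unstaged changes to tracked files;
  # --binary keeps binary changes re-appliable with `git apply`.
  git -C "$repo" diff --binary HEAD > "$patch" || return 1
  # Untracked files are NOT in the patch; list them so they can be copied by hand.
  git -C "$repo" ls-files --others --exclude-standard > "$patch.untracked" || return 1
  echo "$patch"
}
```

For example: save_uncommitted_patch /mnt/projects/AgrarianGameBulid "$HOME/recovery-backups", where the second argument is an illustrative location.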

DevBox And SMB Recovery

Use these in order:

  1. Confirm the host is reachable:

    ping -c 3 192.168.5.8
    
  2. Confirm Unraid services and VM state through the Unraid UI or SSH.

  3. Confirm the projects share is visible:

    smbclient -L //192.168.5.8 -N
    
  4. Confirm Ubuntu-Codex sees the share mounted:

    mount | rg '/mnt/projects'
    
  5. If the DNS name DevBox fails to resolve but the IP address works, use the IP address temporarily and repair the router/DNS record later.

  6. Avoid storing project files directly on the Unraid OS boot filesystem. Project data belongs on the projects share or in VMs.
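
The name-versus-IP fallback in step 5 can be sketched as a small helper that prefers the hostname and falls back to the baseline address. This is an illustration only: smb_host is a hypothetical name, and getent consults the system resolver (NSS), which may not match how every SMB client resolves names.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the hostname if it resolves, otherwise the fallback IP.
smb_host() {
  local name="${1:-DevBox}" fallback_ip="${2:-192.168.5.8}"
  if getent hosts "$name" >/dev/null; then
    echo "$name"
  else
    # Temporary workaround only; the DNS record still needs repair.
    echo "$fallback_ip"
  fi
}
```

For example: smbclient -L "//$(smb_host)" -N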

When To Stop And Inspect Before Rebooting

Pause before rebooting if:

  • A build, package, backup, or VM disk copy is running.
  • Unreal Editor is open with unsaved assets.
  • Git shows uncommitted changes whose origin or purpose is unclear.
  • A backup or restore test is in progress.
  • DevBox has disk or array warnings.

After Recovery

After any VM or DevBox recovery:

  1. Confirm /mnt/projects/AgrarianGameBulid is reachable.
  2. Run git status --short in the repo.
  3. Confirm Windows-Builder RDP with nc -vz -w 5 192.168.5.12 3389.
  4. Confirm Sunshine only if visual inspection is needed.
  5. Record unusual recovery steps in the handoff notes or the relevant ops doc.
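
The first three checks above lend themselves to a single pass/fail summary run from Ubuntu-Codex. The sketch below is one way to do that; run_check and post_recovery_checks are hypothetical helper names, and the Sunshine check and handoff notes are left as manual steps.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command quietly and print PASS or FAIL with a label.
run_check() {
  local label="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS  $label"
  else
    echo "FAIL  $label"
    return 1
  fi
}

# Hypothetical wrapper around checks 1 to 3; returns non-zero if any check fails.
post_recovery_checks() {
  local repo="${1:-/mnt/projects/AgrarianGameBulid}" failed=0
  run_check "repo reachable" test -d "$repo" || failed=1
  run_check "git status" git -C "$repo" status --short || failed=1
  run_check "Windows-Builder RDP" nc -vz -w 5 192.168.5.12 3389 || failed=1
  return "$failed"
}
```

Any FAIL line points back to the matching recovery section rather than to a reboot.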