mitchross / talos-argocd-proxmox
Talos ArgoCD Homelab. My personal production Cluster.
Repository Overview (README excerpt)
Talos ArgoCD Proxmox Cluster

> Production-grade GitOps Kubernetes cluster on Talos OS with self-managing ArgoCD, Cilium, and zero-touch PVC backup/restore

A GitOps-driven Kubernetes cluster using **Talos OS** (secure, immutable Linux for K8s), ArgoCD, and Cilium, running on Proxmox. Managed via **Omni** (Sidero's Talos management platform) with the **Proxmox Infrastructure Provider** for automated node provisioning.

Key Features

• **Self-Managing ArgoCD** - ArgoCD manages its own installation, upgrades, and ApplicationSets from Git
• **Directory = Application** - Apps discovered automatically by directory path, no manual Application manifests
• **Sync Wave Ordering** - Strict deployment ordering prevents race conditions
• **Zero-Touch Backups** - Add a label to a PVC, get automatic Kopia backups to NFS with disaster recovery
• **Gateway API** - Modern ingress via Cilium Gateway API (not legacy Ingress)
• **GPU Support** - Full NVIDIA GPU support via Talos system extensions and GPU Operator
• **Zero SSH** - All node management via Omni UI or Talos API

Repositories & Resources

| Resource | Description |
|----------|-------------|
| Omni | Talos cluster management platform |
| Proxmox Infra Provider | Proxmox infrastructure provider for Omni |
| Starter Repo | Full config & automation for Sidero Omni + Talos + Proxmox |
| Reference Guide | VirtualizationHowTo guide for Talos Omni on-prem setup |

Architecture

Sync Wave Architecture

ArgoCD deploys applications in strict order to prevent dependency issues:

| Wave | Component | Purpose |
|------|-----------|---------|
| **0** | Foundation | Cilium (CNI), ArgoCD, 1Password Connect, External Secrets, AppProjects |
| **1** | Storage | Longhorn, VolumeSnapshot Controller, VolSync |
| **2** | PVC Plumber | Backup existence checker (must run before Kyverno in Wave 4) |
| **4** | Infrastructure AppSet | Cert-Manager, External-DNS, GPU Operators, Kyverno, Gateway, databases (explicit path list) |
| **5** | Monitoring AppSet | Discovers (Prometheus, Grafana, Loki) |
| **6** | My-Apps AppSet | Discovers (user applications) |

Prerequisites

• **Omni deployed and accessible** - See Omni Setup Guide
• **Sidero Proxmox Provider configured** - See proxmox provider config
• **Cluster created in Omni** - Talos cluster provisioned and healthy
• **kubectl access** - Download kubeconfig from Omni UI
• **Local tools installed**: , , CLI, CLI ( )

Bootstrap Process

Once your cluster is provisioned via Omni, follow these steps to install the GitOps stack.

Step 1: Install Cilium CNI

Omni provisions Talos clusters without a CNI. Install Cilium to get networking functional:

> **Important: version must match.** The CLI version must match the Helm chart version in (currently **1.19.0**). Use to pin it. If versions differ, ArgoCD upgrades Cilium at Wave 0 and regenerates some Hubble certs but not others, causing TLS handshake failures ( ) that block all sync waves.
>
> **Important: Hubble is disabled at bootstrap on purpose.** The CLI install only provides basic CNI networking. ArgoCD enables Hubble at Wave 0 via the full (which has ). This ensures ArgoCD is the sole owner of Hubble TLS certificates, so there is no cert mismatch between the CLI install and ArgoCD's Helm render. The in then preserves those certs on subsequent syncs.
>
> **Important: cluster name must match.** must match for Hubble certificate SANs. If is run without , certificates are generated for or , causing TLS failures.
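
The notes above all concern settings that the bootstrap install and ArgoCD's later Helm render need to agree on. As a rough illustration only, a Cilium Helm values fragment carrying two of those settings might look like the sketch below. `cluster.name` and `hubble.enabled` are standard Cilium chart values, but the cluster name `my-cluster` is a placeholder, and the fragment as a whole is an assumption rather than a copy of this repository's values file.

```yaml
# Hypothetical Cilium Helm values fragment (not taken from the repository).
cluster:
  # Placeholder name. It should presumably be identical wherever Cilium is
  # installed or rendered, so that Hubble certificate SANs stay consistent.
  name: my-cluster
hubble:
  # Per the note above, the full values rendered by ArgoCD at Wave 0 enable
  # Hubble, while the bootstrap CLI install leaves it off.
  enabled: true
```

Keeping such values in Git and letting ArgoCD render them is what makes ArgoCD the single owner of the Hubble certificates, as the second note describes.
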
Step 2: Install Gateway API CRDs

Verify Cilium:

Step 3: Pre-Seed 1Password Secrets

Step 4: Bootstrap ArgoCD

**Option A: Bootstrap Script (Recommended)**

**Option B: Manual Steps**

Step 5: Verify

Step 6: Access ArgoCD UI (Optional)

What Happens After Bootstrap

ArgoCD takes over and manages everything from Git:

• **Wave 0**: Cilium, 1Password Connect, External Secrets deploy in parallel
• **Wave 1**: Longhorn, Snapshot Controller, VolSync deploy after networking + secrets are ready
• **Wave 2**: PVC Plumber deploys (backup checker for Kyverno)
• **Wave 4**: Infrastructure AppSet deploys cert-manager, Kyverno, GPU operators, databases, gateway, etc.
• **Wave 5**: Monitoring AppSet deploys Prometheus, Grafana, Loki
• **Wave 6**: My-Apps AppSet deploys user applications

New applications are discovered automatically by directory structure: add a directory with a and push to Git.

Cluster Access (Omni)

• **Kubeconfig**: Download from Omni UI > your cluster > "Download Kubeconfig"
• **Node management**: All done through Omni web UI (upgrades, configuration, patches)
• **No needed**: Omni handles Talos upgrades and system extensions

Backup System

All PVC backups use **Kopia on NFS** via VolSync, automated by Kyverno policies. Add or label to any PVC and backups happen automatically with zero-touch disaster recovery.
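
As a sketch of what opting a volume into backups could look like: the label key and value below (`example.com/backup: "daily"`) are hypothetical stand-ins, because this excerpt does not show the actual label names. The PVC name, namespace, size, and the `longhorn` storage class name are likewise illustrative assumptions.

```yaml
# Hypothetical PVC that opts in to backups via a label.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data                  # hypothetical name
  namespace: my-app               # hypothetical namespace
  labels:
    example.com/backup: "daily"   # hypothetical label; substitute the real one
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn      # assumed storage class name
  resources:
    requests:
      storage: 5Gi
```

The point of the design is that the label is the only per-PVC input: the Kyverno policies are described as generating the backup configuration from it, so no VolSync manifests should need to be written by hand.
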
• **Backend**: Kopia filesystem repository on TrueNAS NFS ( )
• **Encryption**: Kopia password from 1Password ( item)
• **Restore**: Automatic on PVC recreation - PVC Plumber checks for existing backups, Kyverno injects
• **Details**: See docs/pvc-plumber-full-flow.md, docs/backup-restore.md, and docs/cnpg-disaster-recovery.md
• **AI-guided database recovery**: Copy/paste prompts are in LLM Recovery Prompt Templates

Hardware

Troubleshooting

| Issue | Steps |
|-------|-------|
| **ArgoCD not syncing** | / / Force refresh: delete and re-apply |
| **Cilium issues** | / / |
| **Storage issues** | / |
| **Secrets not syncing** | / / |
| **GPU issues** | / |
| **Backup issues** | / |

Emergency Reset

Documentation

• **CLAUDE.md** - Full development guide and patterns for this repository
• **docs/pvc-plumber-full-flow.md** - Complete PVC backup/restore flow diagram
• **docs/backup-restore.md** - Backup/restore workflow
• **docs/argocd.md** - ArgoCD GitOps patterns
• **docs/network-topology.md** - Network architecture
• **docs/network-policy.md** - Cilium network policies
• **omni/** - Omni deployment configs, machine classes, and cluster templates
• **omni/omni/README.md*…