GitOps with ArgoCD

How STOA leverages GitOps for declarative, auditable configuration management across multiple environments.

GitOps Philosophy

STOA embraces GitOps principles where Git is the single source of truth for all platform configuration:

  • Declarative Configuration: desired state is defined in YAML, not imperative scripts
  • Git as Source of Truth: all configuration is stored, versioned, and auditable in Git
  • Automated Sync: ArgoCD continuously reconciles actual vs desired cluster state
  • Self-Healing: drift is detected automatically and cluster state is restored to match Git
  • Audit Trail: every change has a Git commit with author, timestamp, and rationale

Architecture

STOA uses ArgoCD for Kubernetes resource management. Each managed component is an ArgoCD Application:

| Component         | ArgoCD App        | Sync Policy           |
|-------------------|-------------------|-----------------------|
| STOA Gateway      | stoa-gateway      | Auto-sync + self-heal |
| Control Plane API | control-plane-api | Auto-sync + self-heal |
| Console UI        | control-plane-ui  | Auto-sync + self-heal |
| Developer Portal  | stoa-portal       | Auto-sync + self-heal |

Multi-Environment Promotion (ADR-040)

STOA implements a "Born GitOps" model where environments are first-class citizens, not an afterthought.

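Three Environments

| Environment | Mode      | Color | Purpose                              |
|-------------|-----------|-------|--------------------------------------|
| Development | full      | Green | Unrestricted: create, modify, delete |
| Staging     | full      | Amber | Pre-production validation            |
| Production  | read-only | Red   | Locked: changes via promotion only   |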

Promotion Flow

  1. Develop in dev: full CRUD access, rapid iteration
  2. Promote to staging: automated tests, integration validation
  3. Approve for prod: manual gate; read-only enforcement prevents direct edits
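
The promotion order above can be written down as a small lookup. This is a minimal sketch, not STOA's implementation: the names `PROMOTION_ORDER`, `next_environment`, and `requires_manual_approval`, and the short identifiers `dev`, `staging`, and `prod`, are hypothetical and chosen only for illustration.

```python
from typing import Optional

# Hypothetical identifiers for the three environments, in promotion order.
PROMOTION_ORDER = ["dev", "staging", "prod"]


def next_environment(current: str) -> Optional[str]:
    """Return the environment that `current` promotes to, or None for the last one."""
    index = PROMOTION_ORDER.index(current)  # raises ValueError for unknown names
    if index + 1 < len(PROMOTION_ORDER):
        return PROMOTION_ORDER[index + 1]
    return None


def requires_manual_approval(target: str) -> bool:
    """Only promotion into production passes through the manual gate."""
    return target == "prod"
```

Keeping the order in a single list means that a hypothetical fourth environment would need only one change.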

Environment-Scoped Operations

The Console UI reflects the current environment with visual indicators:

  • Green dot: Development (all actions available)
  • Amber dot: Staging (all actions available)
  • Red dot + lock icon: Production (read-only, no create/edit/delete)

API queries are environment-scoped: GET /v1/apis?environment=staging returns only staging APIs.
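
As an illustration of environment scoping and read-only enforcement, the sketch below builds the scoped query path and checks whether an action is permitted. It rests on assumptions: only `environment=staging` appears in the text above, so the identifiers `dev` and `prod`, the action names, and both function names are hypothetical.

```python
from urllib.parse import urlencode

# Modes come from the environment table; the dictionary keys are assumed identifiers.
ENVIRONMENT_MODES = {"dev": "full", "staging": "full", "prod": "read-only"}
WRITE_ACTIONS = {"create", "edit", "delete"}


def apis_query_path(environment: str) -> str:
    """Build the environment-scoped path for listing APIs."""
    return "/v1/apis?" + urlencode({"environment": environment})


def is_action_allowed(environment: str, action: str) -> bool:
    """Reject write actions in read-only environments; allow everything else."""
    if ENVIRONMENT_MODES[environment] == "read-only":
        return action not in WRITE_ACTIONS
    return True
```

For example, `apis_query_path("staging")` yields `/v1/apis?environment=staging`, and `is_action_allowed("prod", "delete")` is `False`.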

ArgoCD Application Example

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: stoa-gateway
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/stoa-platform/stoa
    targetRevision: main
    path: stoa-gateway/k8s
  destination:
    server: https://kubernetes.default.svc
    namespace: stoa-system
  syncPolicy:
    automated:
      prune: true      # Delete resources removed from Git
      selfHeal: true   # Revert manual cluster changes
    syncOptions:
      - CreateNamespace=true

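Sync Policies

| Policy                            | Effect                                                |
|-----------------------------------|-------------------------------------------------------|
| automated.prune: true             | Resources deleted from Git are removed from the cluster |
| automated.selfHeal: true          | Manual kubectl changes are reverted to match Git      |
| syncOptions: CreateNamespace=true | Namespace is created automatically if missing         |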
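
To make the three settings concrete, the following sketch inspects an Application manifest that has already been loaded into a Python dict and reports which behaviours are enabled. The function name `describe_sync_policy` and the keys of the returned dict are hypothetical; the manifest fields it reads are the ones shown in the example above.

```python
def describe_sync_policy(application: dict) -> dict:
    """Report which sync behaviours an Application manifest (as a dict) enables."""
    sync_policy = application.get("spec", {}).get("syncPolicy", {})
    automated = sync_policy.get("automated")  # absent means manual sync only
    return {
        "auto_sync": automated is not None,
        "prune": bool(automated and automated.get("prune", False)),
        "self_heal": bool(automated and automated.get("selfHeal", False)),
        "create_namespace": "CreateNamespace=true" in sync_policy.get("syncOptions", []),
    }


# The syncPolicy section of the stoa-gateway example, written as a Python dict.
stoa_gateway = {
    "spec": {
        "syncPolicy": {
            "automated": {"prune": True, "selfHeal": True},
            "syncOptions": ["CreateNamespace=true"],
        }
    }
}
print(describe_sync_policy(stoa_gateway))
```

A check like this could run in CI to flag an Application that has lost `prune` or `selfHeal`, although whether STOA does so is not stated here.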

CI/CD Integration

ArgoCD integrates with STOA's CI pipeline:

| Step         | Tool           | Trigger               |
|--------------|----------------|-----------------------|
| Code push    | GitHub         | Developer merge       |
| CI pipeline  | GitHub Actions | Push to main          |
| Docker build | GitHub Actions | CI success            |
| ArgoCD sync  | ArgoCD         | Image change detected |
| Health check | ArgoCD         | Post-sync probe       |
| Alerting     | Prometheus     | Health degraded       |

Image Update Strategy

STOA deploys new images by setting imagePullPolicy: Always and running kubectl rollout restart, so restarted pods pull the image again for their configured tag. ArgoCD monitors the deployment and reports sync status.

Drift Detection

ArgoCD continuously compares the live cluster state with the Git-defined state:

| Status           | Meaning                             | Action                          |
|------------------|-------------------------------------|---------------------------------|
| Synced + Healthy | Cluster matches Git, pods running   | None                            |
| OutOfSync        | Git changed, cluster not yet updated | Auto-sync applies changes       |
| Degraded         | Pods failing health checks          | Investigate, potential rollback |
| Unknown          | ArgoCD can't reach the app          | Check repo access               |

When self-heal is enabled, ArgoCD automatically reverts any manual changes made via kubectl, so Git remains the single source of truth.
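
The status table can be read as a simple decision function. The sketch below is illustrative only: the status strings are ArgoCD's sync and health values, but the function name `recommended_action`, the returned strings, and the order in which conditions are checked are assumptions.

```python
def recommended_action(sync_status: str, health_status: str) -> str:
    """Map ArgoCD sync and health statuses to the action listed in the table."""
    if sync_status == "Unknown" or health_status == "Unknown":
        return "check repo access"
    if health_status == "Degraded":
        return "investigate, consider rollback"
    if sync_status == "OutOfSync":
        return "wait for auto-sync"
    if sync_status == "Synced" and health_status == "Healthy":
        return "none"
    return "investigate"  # any combination the table does not cover
```

Checking `Degraded` before `OutOfSync` is a deliberate choice in this sketch: failing pods are treated as more urgent than a pending sync.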

Configuration Repository Structure

stoa/
├── stoa-gateway/
│   └── k8s/
│       └── deployment.yaml   # Gateway K8s manifest
├── control-plane-ui/
│   └── k8s/
│       └── deployment.yaml   # Console UI manifest
├── portal/
│   └── k8s/
│       └── deployment.yaml   # Portal manifest
└── charts/
    └── stoa-platform/
        ├── crds/             # CRD definitions (Tool, ToolSet)
        ├── templates/        # Helm templates
        └── values.yaml       # Default values

Best Practices

  • Never manually edit cluster resources: all changes go through Git PRs
  • Use PRs for all changes: code review + CI validation before merge
  • Environment separation: dev for experimentation, staging for validation, prod via promotion
  • Secrets via Infisical: never store secrets in Git; use external secret management
  • Monitor ArgoCD sync status: Grafana dashboard with sync/health alerts
  • Rollback via Git: git revert the problematic commit and ArgoCD auto-syncs the result
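
The rollback practice can be scripted. The following is a minimal sketch under stated assumptions: the remote is named `origin`, the tracked branch is `main` (matching `targetRevision` in the Application example), and a direct push is permitted. Where all changes must go through PRs, the revert would be pushed to a branch and merged by PR instead. The function names are hypothetical.

```python
import subprocess
from typing import List


def rollback_commands(commit: str, branch: str = "main") -> List[List[str]]:
    """Return the git commands that revert `commit` and publish the revert."""
    return [
        ["git", "revert", "--no-edit", commit],
        ["git", "push", "origin", branch],  # assumes a remote named "origin"
    ]


def run_rollback(commit: str, branch: str = "main") -> None:
    """Run each command in the current repository, stopping at the first failure."""
    for command in rollback_commands(commit, branch):
        subprocess.run(command, check=True)
```

Because the revert is an ordinary commit, the rollback itself appears in the Git audit trail, and ArgoCD picks it up like any other change.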