GitOps and Delivery
Core Principle
GitOps means desired cluster state is declarative, versioned, reviewed, and reconciled by controllers. The cluster should not depend on someone remembering the right kubectl command.
flowchart LR
  Change[Change request] --> PR[Pull request]
  PR --> CI[CI validation]
  CI --> Merge[Merge to environment branch]
  Merge --> GitOps[Argo CD or Flux]
  GitOps --> Cluster[Kubernetes cluster]
  Cluster --> Health[Health and drift status]
  Health -. feedback .-> PR
Argo CD
Argo CD is a Kubernetes controller that compares live cluster state to desired state in Git, reports drift as out-of-sync, and can sync changes automatically or manually.
Use it for:
- Application and platform component deployment.
- Multi-tenant app projects.
- Drift visibility.
- Manual gates for sensitive environments.
- Progressive promotion with related tools.
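The compare-and-sync behavior described above can be sketched in a few lines. This is a simplified illustration, not Argo CD's actual implementation: the plain-dict manifests, the `diff_state` and `reconcile` helpers, and the `auto_sync` flag are hypothetical names chosen for the example.

```python
from typing import Any

# Hypothetical model: resource name -> manifest contents.
State = dict[str, dict[str, Any]]


def diff_state(desired: State, live: State) -> dict[str, str]:
    """Return a drift report keyed by resource name."""
    drift: dict[str, str] = {}
    for name in sorted(desired.keys() | live.keys()):
        if name not in live:
            drift[name] = "missing"    # declared in Git, absent from the cluster
        elif name not in desired:
            drift[name] = "extra"      # running in the cluster, absent from Git
        elif desired[name] != live[name]:
            drift[name] = "modified"   # present in both, contents differ
    return drift


def reconcile(desired: State, live: State, auto_sync: bool) -> tuple[str, State]:
    """Return (sync status, resulting live state)."""
    if not diff_state(desired, live):
        return "Synced", live
    if not auto_sync:
        # Manual gate: report drift and wait for a human to trigger the sync.
        return "OutOfSync", live
    # Simplification: replace live state wholesale, which also removes extras.
    return "Synced", {name: dict(spec) for name, spec in desired.items()}
```

With `auto_sync=False` the function only reports drift, which is the shape of a manual gate for a sensitive environment. The wholesale replacement on sync is a shortcut for the sketch; a real controller applies changes per resource, and removing extra resources is usually an explicit opt-in.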
Flux
Flux keeps clusters in sync with configuration sources and can automate updates when new code or images are available. It is built from composable controllers and works well for GitOps at fleet scale.
Use it for:
- Cluster bootstrap.
- Helm release management.
- Image update automation.
- SOPS-based secret flows.
- App and infrastructure dependencies.
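Dependency handling between apps and infrastructure comes down to applying components in an order where every dependency goes first. Flux expresses this declaratively (for example with `dependsOn` on its resources); the sketch below only models the ordering idea using Python's standard `graphlib`. The `apply_order` helper and the component names in the usage example are hypothetical.

```python
from graphlib import TopologicalSorter


def apply_order(depends_on: dict[str, list[str]]) -> list[str]:
    """Return an order in which every component follows its dependencies.

    `depends_on` maps a component to the components it needs first.
    Raises graphlib.CycleError if the dependencies form a cycle.
    """
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    order = apply_order(
        {
            "inference-api": ["ingress", "gpu-operator"],
            "ingress": ["cert-manager"],
            "cert-manager": [],
            "gpu-operator": [],
        }
    )
    print(order)
```

Components with no ordering constraint between them can come out in any relative order, so downstream logic should rely only on "dependencies first", not on an exact sequence.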
Helm vs Kustomize
| Tool | Good for | Risk |
|---|---|---|
| Helm | Packaged apps with values. | Template complexity and values drift. |
| Kustomize | Environment overlays and patching. | Overlay sprawl and hidden differences. |
Staff answer:
I care less about Helm vs Kustomize than about reproducibility: pinned versions, rendered diff, policy checks, promotion path, rollback, and post-sync validation.
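Two of those reproducibility checks, the rendered diff and pinned versions, are tool-agnostic because they run on the rendered output rather than on Helm or Kustomize sources. A minimal sketch, assuming the manifests have already been rendered to text; `rendered_diff`, `is_pinned`, and `unpinned_images` are hypothetical helpers, and the line-based `image:` scan is a deliberate shortcut rather than a real YAML parser.

```python
import difflib


def rendered_diff(current: str, proposed: str) -> str:
    """Unified diff between two rendered manifests ('' if identical)."""
    return "".join(
        difflib.unified_diff(
            current.splitlines(keepends=True),
            proposed.splitlines(keepends=True),
            fromfile="live",
            tofile="proposed",
        )
    )


def is_pinned(ref: str) -> bool:
    """True if the image reference has a digest or a non-'latest' tag."""
    if "@sha256:" in ref:
        return True
    name = ref.rsplit("/", 1)[-1]   # drop the registry host (and its port)
    if ":" not in name:
        return False                # no tag at all
    return name.split(":", 1)[1] not in ("", "latest")


def unpinned_images(rendered: str) -> list[str]:
    """Return image references that use a floating tag or no tag."""
    found = []
    for line in rendered.splitlines():
        text = line.strip()
        if text.startswith("- "):
            text = text[2:]
        if text.startswith("image:"):
            ref = text.split(":", 1)[1].strip().strip("\"'")
            if not is_pinned(ref):
                found.append(ref)
    return found
```

In a pipeline, a non-empty `unpinned_images` result would typically fail the build, while the `rendered_diff` output is attached to the pull request for review.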
Progressive Delivery
For inference:
- Canary model/runtime changes.
- Use synthetic requests before real traffic.
- Gate on p99, error rate, queue time, GPU memory, and correctness checks.
- Roll back traffic first; clean up state second.
- Separate model artifact rollout from driver/runtime rollout when possible.
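The gating step in the list above can be sketched as a pure function over canary metrics. The threshold values, the `CanaryMetrics` fields, and the `gate_failures` and `decide` helpers are all assumptions made for illustration; real limits would come from the service's SLOs and be compared against a baseline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CanaryMetrics:
    p99_latency_ms: float
    error_rate: float             # failed requests / total requests
    queue_time_ms: float
    gpu_memory_fraction: float    # used GPU memory / total GPU memory
    correctness_pass_rate: float  # passing correctness checks / total checks


# Hypothetical limits, not recommendations.
MAX_P99_LATENCY_MS = 800.0
MAX_ERROR_RATE = 0.01
MAX_QUEUE_TIME_MS = 200.0
MAX_GPU_MEMORY_FRACTION = 0.90
MIN_CORRECTNESS_PASS_RATE = 0.99


def gate_failures(m: CanaryMetrics) -> list[str]:
    """Return the name of every gate the canary failed."""
    checks = {
        "p99_latency": m.p99_latency_ms <= MAX_P99_LATENCY_MS,
        "error_rate": m.error_rate <= MAX_ERROR_RATE,
        "queue_time": m.queue_time_ms <= MAX_QUEUE_TIME_MS,
        "gpu_memory": m.gpu_memory_fraction <= MAX_GPU_MEMORY_FRACTION,
        "correctness": m.correctness_pass_rate >= MIN_CORRECTNESS_PASS_RATE,
    }
    return [name for name, ok in checks.items() if not ok]


def decide(m: CanaryMetrics) -> str:
    """Any failed gate means rollback; otherwise the canary may be promoted."""
    return "rollback" if gate_failures(m) else "promote"
```

Returning the list of failed gates, rather than a bare boolean, keeps the rollback decision explainable in the incident record.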
Tools you can mention:
- Argo Rollouts.
- Flagger.
- Gateway API/service mesh traffic splitting.
- Prometheus/OpenTelemetry metrics as rollout gates.
stateDiagram-v2
  [*] --> Deploy
  Deploy --> ValidateHealth
  ValidateHealth --> Shadow
  Shadow --> Canary
  Canary --> Promote: SLO and correctness pass
  Canary --> Rollback: SLO or correctness fail
  Promote --> Full
  Full --> [*]
  Rollback --> Previous
  Previous --> [*]
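The same state machine can be written as a small transition table, which makes it easy to test that every rollout ends in either `Full` or `Previous`. The `walk` helper and its `canary_passed` flag are hypothetical; the state names are taken from the diagram.

```python
# Unconditional transitions from the diagram.
NEXT_STATE = {
    "Deploy": "ValidateHealth",
    "ValidateHealth": "Shadow",
    "Shadow": "Canary",
    "Promote": "Full",
    "Rollback": "Previous",
}
TERMINAL_STATES = {"Full", "Previous"}


def walk(canary_passed: bool) -> list[str]:
    """Return the sequence of states visited for one rollout."""
    state = "Deploy"
    path = [state]
    while state not in TERMINAL_STATES:
        if state == "Canary":
            # The only branch: SLO and correctness results decide the path.
            state = "Promote" if canary_passed else "Rollback"
        else:
            state = NEXT_STATE[state]
        path.append(state)
    return path
```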
CI/CD Pipeline
- Build runtime image.
- Scan and sign image.
- Register immutable model artifact.
- Run unit/integration/load tests.
- Render Kubernetes manifests.
- Policy check.
- Diff against target.
- Merge to environment branch/path.
- GitOps controller syncs.
- Post-sync validation.
- Progressive traffic shift.
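The ordering of these steps matters because each one is a gate for the next: a failed policy check should mean the merge and sync never happen. A minimal fail-fast sketch, where `run_pipeline` and the `Stage` alias are hypothetical and each stage is reduced to a callable returning pass or fail:

```python
from typing import Callable, Optional

# Hypothetical stage shape: (name, check that returns True on success).
Stage = tuple[str, Callable[[], bool]]


def run_pipeline(stages: list[Stage]) -> tuple[list[str], Optional[str]]:
    """Run stages in order.

    Returns (names of stages that passed, name of the first failed stage or None).
    """
    passed: list[str] = []
    for name, check in stages:
        if not check():
            return passed, name   # stop here: later stages never run
        passed.append(name)
    return passed, None


if __name__ == "__main__":
    passed, failed = run_pipeline(
        [
            ("build image", lambda: True),
            ("scan and sign", lambda: True),
            ("policy check", lambda: False),   # simulated failure
            ("merge to environment path", lambda: True),
        ]
    )
    print(passed, failed)
```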
Incident Scenario
Question: “Someone hotfixed production with kubectl and GitOps reverted it.”
Answer:
- Confirm impact.
- If the hotfix is needed, commit it to the source of truth, or pause sync intentionally and with approval.
- Avoid fighting the controller: while automated sync is on, manual changes will keep being reverted.
- Afterward, define a break-glass process: who can pause sync, how changes are recorded, and how drift is reconciled.
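One way to make "pause sync with approval" concrete is to treat every pause as a recorded, time-limited object. The sketch below is an assumption about what such a process could look like, not a feature of Argo CD or Flux: the `SyncPause` record, the `request_pause` and `is_active` helpers, the two-hour default, and the no-self-approval rule are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class SyncPause:
    """Audit record for one intentional pause of GitOps sync."""
    app: str
    requested_by: str
    approved_by: str
    reason: str
    expires_at: datetime


def request_pause(
    app: str,
    requested_by: str,
    approved_by: str,
    reason: str,
    approvers: set[str],
    now: datetime,
    ttl: timedelta = timedelta(hours=2),   # hypothetical default
) -> SyncPause:
    """Create a pause record, or raise PermissionError if approval is invalid."""
    if approved_by not in approvers:
        raise PermissionError(f"{approved_by} may not approve sync pauses")
    if approved_by == requested_by:
        raise PermissionError("requester cannot approve their own pause")
    return SyncPause(app, requested_by, approved_by, reason, now + ttl)


def is_active(pause: SyncPause, now: datetime) -> bool:
    """A pause lapses on its own, so sync resumes even if nobody cleans up."""
    return now < pause.expires_at
```

The expiry is the important design choice: drift introduced during the pause has to be committed to Git before the pause ends, or the controller will reconcile it away.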
Senior Close
GitOps is not a substitute for release engineering. It gives you reconciliation and auditability, but you still need rollout policy, health gates, dependency ordering, and a humane break-glass path.