Overview
GitOps poses a distinct observability challenge: the desired state of your cluster is defined in Git, so it is not enough to watch what is running. You need to monitor both sync status (does the cluster match Git?) and application health (is what was deployed actually working?).
Metrics to Track
- Sync Status: Is the cluster in sync with Git? (Is the operator lagging?)
- Sync Duration: How long does it take for a change in Git to appear in the cluster?
- Drift Detection: Has someone manually modified the cluster state, deviating from Git?
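As a rough sketch of how these three questions might map onto Prometheus, the recording rules below assume Argo CD is the GitOps operator and that its application controller metrics (argocd_app_info and the argocd_app_reconcile histogram) are already being scraped. The group name and the recorded series names are hypothetical, and reconciliation time is only a proxy for Git-to-cluster latency, not an exact measurement of it. Verify the metric names against your Argo CD version before relying on them.

```yaml
groups:
  - name: gitops-sync-overview          # hypothetical group name
    rules:
      # Sync status / drift: how many applications are currently OutOfSync.
      # Argo CD reports manual drift as OutOfSync as well, so this one series
      # covers both questions. "or vector(0)" keeps the series present at zero.
      - record: gitops:argocd_apps_out_of_sync:count   # hypothetical name
        expr: count(argocd_app_info{sync_status="OutOfSync"}) or vector(0)

      # Sync duration (proxy): 95th percentile application reconciliation
      # time over the last 5 minutes, taken from the controller histogram.
      - record: gitops:argocd_app_reconcile_seconds:p95   # hypothetical name
        expr: histogram_quantile(0.95, sum by (le) (rate(argocd_app_reconcile_bucket[5m])))

      # Application health: applications whose health is anything but Healthy.
      - record: gitops:argocd_apps_unhealthy:count   # hypothetical name
        expr: count(argocd_app_info{health_status!="Healthy"}) or vector(0)
```

Recording the counts as their own series keeps dashboards cheap to query and makes a sustained non-zero value easy to spot.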
Example: Monitoring ArgoCD Sync Status (Prometheus)
# Monitor if an application is out of sync
argocd_app_info{sync_status="OutOfSync"}
Expected Result:
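An alert fires if an application's sync_status remains OutOfSync for more than 5 minutes. The query on its own only selects applications that are OutOfSync right now; the 5-minute threshold comes from the alerting rule that wraps it.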
Alert: ArgoCDApplicationOutOfSync
Labels: app=my-web-app, status=critical
Message: Application my-web-app is OutOfSync for > 5m
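The query by itself does not produce the alert shown above; it needs to be wrapped in a Prometheus alerting rule. A minimal sketch of such a rule follows. The group name is hypothetical, and the rule assumes the application name is exposed in the name label of argocd_app_info, which is copied into the app label to match the example output.

```yaml
groups:
  - name: argocd-sync-alerts            # hypothetical group name
    rules:
      - alert: ArgoCDApplicationOutOfSync
        # Selects every application currently reported as OutOfSync.
        expr: argocd_app_info{sync_status="OutOfSync"}
        # The condition must hold continuously for 5 minutes before firing.
        for: 5m
        labels:
          # Assumes the metric carries the application name in "name".
          app: "{{ $labels.name }}"
          status: critical
        annotations:
          message: "Application {{ $labels.name }} is OutOfSync for > 5m"
```

The for clause is what prevents the alert from firing during a normal sync, when an application is briefly OutOfSync between a commit landing and the operator applying it.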