Helm Deployment Alert Rule
The CEE Ops Center includes a built-in alert rule, helm-deploy-failure, that indicates when a Helm chart deployment has failed. The rule is installed by default as a Prometheus alerting rule during CEE deployment.
The following is the definition of the helm-deploy-failure alert rule in Prometheus:
- alert: helm-deploy-failure
  annotations:
    type: Processing Error Alarm
    description: 'Helm chart {{$labels.chart}}/{{$labels.namespace}} deployment failed'
    summary: 'Helm chart failed to deploy for 5 minutes'
  expr: |
    helm_chart_deploy_success < 1
  labels:
    severity: critical
  for: 5m
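In Prometheus, a `for` clause delays an alert: while the expression is true the alert is pending, and it fires only after the expression has stayed true for the whole duration. The following Python sketch imitates that behaviour for `helm_chart_deploy_success < 1` with `for: 5m`. It is an illustration only, not how CEE or Prometheus is implemented; the function name `alert_states` and the sample series are hypothetical.

```python
from datetime import datetime, timedelta

FOR_DURATION = timedelta(minutes=5)  # mirrors `for: 5m` in the rule


def alert_states(samples, for_duration=FOR_DURATION):
    """Return (timestamp, state) per sample for the expression `value < 1`.

    samples: iterable of (datetime, float) in time order, standing in for
    successive evaluations of helm_chart_deploy_success (hypothetical data).
    """
    states = []
    pending_since = None  # when the expression first became true
    for ts, value in samples:
        if value < 1:
            if pending_since is None:
                pending_since = ts
            held = ts - pending_since
            states.append((ts, "firing" if held >= for_duration else "pending"))
        else:
            pending_since = None  # condition cleared, so the timer resets
            states.append((ts, "inactive"))
    return states


start = datetime(2020, 4, 17, 17, 50)
# Hypothetical series: success at minute 0, failure at every minute after.
samples = [(start + timedelta(minutes=m), 1.0 if m == 0 else 0.0) for m in range(8)]
for ts, state in alert_states(samples):
    print(ts.strftime("%H:%M"), state)
```

In this series the deployment starts failing at minute 1, so the alert stays pending through minute 5 and fires from minute 6 onward. A single successful sample in between would reset the timer.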
The following example shows an alert generated when a Helm chart deployment fails:
alerts active helm-deploy-failure 3edde79a3f86
 state       active
 severity    critical
 type        "Processing Error Alarm"
 startsAt    2020-04-17T17:55:57.084Z
 source      tfchan-dev
 labels      [ "chart: smi-show-tac" "chartVersion: 0.1.0-helmfail-0108-200310183805-6888120" "component: ops-center" "exported_release: cee-smi-show-tac" "instance: 192.168.190.28:8082" "job: kubernetes-pods" "namespace: cee" "pod: ops-center-cee-ops-center-5ccddd5d9f-6rffw" "pod_template_hash: 5ccddd5d9f" "release: cee-ops-center" ]
 annotations [ "description: Helm chart smi-show-tac/cee deployment failed" "summary: Helm chart failed to deploy for 5 minutes" ]
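The labels and annotations lines list quoted "key: value" pairs. If this output needs to be post-processed in a script, a sketch like the following could split such a line into a dictionary. It assumes the layout seen in the example above (double-quoted items, with the key separated from the value by the first ": "); the helper name `parse_pairs` is hypothetical, and the format is inferred from this one sample rather than from a documented specification.

```python
import re


def parse_pairs(line):
    """Turn a line like 'labels [ "k: v" "k2: v2" ]' into a dict of k -> v."""
    pairs = {}
    for item in re.findall(r'"([^"]*)"', line):
        # Split on the first ": " only, so values such as
        # "192.168.190.28:8082" keep their own colons.
        key, _, value = item.partition(": ")
        pairs[key] = value
    return pairs


# Shortened copy of the labels line from the example alert.
line = 'labels [ "chart: smi-show-tac" "namespace: cee" "instance: 192.168.190.28:8082" ]'
labels = parse_pairs(line)
print(labels["chart"], labels["namespace"], labels["instance"])
```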
Note: A Helm deployment can fail, for instance, when there is already a conflict of resources. If SNMP Trapper is configured, this alert is also sent to the external SNMP receiver as an SNMP trap.