CEE Ops-Center Notification

The CEE Ops-Center alert-notification sends the following alerts for different VM states:

  • vm-deployed: minor - DEPLOYED

  • vm-alive: minor - ALIVE (the alert lasts for a short time and clears automatically)

  • vm-error: major - ERROR

  • vm-recovering: warning - RECOVERING

  • vm-recovery_failed: critical - RECOVERY_FAILED

Notifications from alert-notification carry all required fields in their alert labels. All VM alerts can be viewed on the Grafana dashboard.
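The state-to-severity pairs listed above can be captured in a small lookup table, for example when filtering notifications in a script. This is a minimal sketch: the names SEVERITY_BY_STATE, SEVERITY_RANK, severity_for, and at_least are hypothetical, and the severity ordering is an assumption made for illustration, since this section does not rank the severities.

```python
# VM state -> alert severity, transcribed from the list above.
SEVERITY_BY_STATE = {
    "DEPLOYED": "minor",
    "ALIVE": "minor",
    "ERROR": "major",
    "RECOVERING": "warning",
    "RECOVERY_FAILED": "critical",
}

# Assumed ordering, for illustration only.
SEVERITY_RANK = {"warning": 0, "minor": 1, "major": 2, "critical": 3}


def severity_for(state):
    """Return the severity for a VM state, or None if the state is not listed."""
    return SEVERITY_BY_STATE.get(state.upper())


def at_least(state, threshold):
    """Return True if the state's severity is at or above the threshold."""
    severity = severity_for(state)
    if severity is None:
        return False
    return SEVERITY_RANK[severity] >= SEVERITY_RANK[threshold]
```

For instance, at_least("RECOVERY_FAILED", "major") is true under the assumed ordering, while at_least("ALIVE", "major") is false.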

VM Action Notifications

Delete Action: When the delete VM action is triggered, the CM sends a notification that the VM is deleted. For the delete action, the possible VM states are UNDEPLOYED and ERROR.

clusters abc-cluster-15 nodes kvm-1 vms upf1 actions delete

Redeploy Action: When a VM is in the RECOVERY_FAILED state, NSO sends a request to redeploy the VM. A redeploy action performs both a delete action and a sync action.

clusters abc-cluster-15 nodes kvm-1 vms upf1 actions redeploy

Redeploy Action Notification: The redeploy action sends a notification to the CM. The redeploy VM action has the following four states: UNDEPLOYED, ERROR, REDEPLOYED, and REDEPLOY_ERROR.

show notification stream vm-state
 
notification
 eventTime 2021-02-23T21:27:28.692+00:00
 vm-state-notification
  cluster_name cndp-testbed
  node_name kvm-1
  vm_name upf2
  state UNDEPLOYED
  message
 !
!
notification
 eventTime 2021-02-23T21:29:18.699+00:00
 vm-state-notification
  cluster_name cndp-testbed
  node_name kvm-1
  vm_name upf2
  state REDEPLOYED
  message
 !
!
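Output in the format shown above can be turned into one record per notification for scripting. The following is a minimal sketch, assuming the text has already been captured from the CLI and that each field is a key followed by an optional value on one line, as in the sample. The function name parse_vm_state_notifications is hypothetical.

```python
def parse_vm_state_notifications(text):
    """Return one dict per 'notification' block in the captured CLI output."""
    records = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "!":
            continue  # blank lines and block terminators carry no data
        if line == "notification":
            current = {}
            records.append(current)
            continue
        if current is None or line == "vm-state-notification":
            continue  # skip the command echo and the container keyword
        key, _, value = line.partition(" ")
        current[key] = value.strip()  # 'message' with no text becomes ""
    return records
```

Each record then exposes fields such as eventTime, vm_name, and state by name.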

Configuring the Alert Notification in CEE

The user must configure alert notifications when deploying the UPF VMs. Log in to the CEE CLI and add the following configuration:

config
bulk-stats prune-interval-days 3
prometheus kvm-metrics defaults private-key "-----BEGIN OPENSSH PRIVATE KEY-----LGXtil23N4YV=\n-----END OPENSSH PRIVATE KEY-----\n"
prometheus kvm-metrics defaults user cloud-user
prometheus kvm-metrics monitor-server 10.194.62.41
 hostname abc-bm-15-master
exit

Note: The user must replace the IP address, hostname, private key, and user details with the values for their own deployment.
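Because these four values change per deployment, the configuration lines can be generated from a template rather than edited by hand. The sketch below is illustrative only: render_kvm_metrics_config is a hypothetical helper, not part of CEE, and it assumes the CLI accepts the private key as one quoted string with newlines written as literal \n, as in the sample above.

```python
import ipaddress


def render_kvm_metrics_config(monitor_ip, hostname, user, private_key, prune_days=3):
    """Return the CEE configuration lines with deployment-specific values filled in."""
    ipaddress.ip_address(monitor_ip)  # raises ValueError for a malformed address
    # Assumption: real newlines in the key are written as literal "\n" sequences.
    escaped_key = private_key.replace("\n", "\\n")
    lines = [
        "config",
        f"bulk-stats prune-interval-days {prune_days}",
        f'prometheus kvm-metrics defaults private-key "{escaped_key}"',
        f"prometheus kvm-metrics defaults user {user}",
        f"prometheus kvm-metrics monitor-server {monitor_ip}",
        f" hostname {hostname}",
        "exit",
    ]
    return "\n".join(lines)
```

The returned text can be reviewed and then pasted into the CEE CLI session.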

Sample Notification from the Alert Notification Stream

notification
 eventTime 2021-01-08T03:28:54.501+00:00
 smi-alert-notification
  starts-at 2021-01-08T03:28:24.493874101Z
  ends-at 0001-01-01T00:00:00Z
  alert-status firing
  smi-alert-notification alert-label
   name alertname
   value vm-recovery-failed
  !
  smi-alert-notification alert-label
   name cluster
   value test-cee-kvm_cee-voice
  !
  smi-alert-notification alert-label
   name hostname
   value test-bm-15-master
  !
  smi-alert-notification alert-label
   name instance
   value metrics-proxy-test-bm-15-master:9100
  !
  smi-alert-notification alert-label
   name job
   value metrics-proxy
  !
  smi-alert-notification alert-label
   name message
   value 10.1.1.3
  !
  smi-alert-notification alert-label
   name monitor
   value prometheus
  !
  smi-alert-notification alert-label
   name node_name
   value master
  !
  smi-alert-notification alert-label
   name replica
   value test-cee-kvm_cee-voice
  !
  smi-alert-notification alert-label
   name severity
   value critical
  !
  smi-alert-notification alert-label
   name state
   value RECOVERY_FAILED
  !
  smi-alert-notification alert-label
   name vm_name
   value upf2
  !
  smi-alert-notification alert-annotation
   name summary
   value upf2 failed to recover.
  !
  smi-alert-notification alert-annotation
   name type
   value Equipment Alarm
  !
 !
!
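The label and annotation entries in the sample above are name/value pairs, so they can be collected into dictionaries for easier lookup. This is a minimal sketch, assuming the notification text has already been captured and follows the layout shown in the sample. The function name parse_alert_notification and the shape of the returned dictionary are hypothetical.

```python
def parse_alert_notification(text):
    """Parse one smi-alert-notification block into top-level fields, labels, and annotations."""
    alert = {"labels": {}, "annotations": {}}
    target = None  # dict that receives the next name/value pair
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "!" or line in ("notification", "smi-alert-notification"):
            continue  # structural lines carry no data
        key, _, value = line.partition(" ")
        value = value.strip()
        if key == "smi-alert-notification":
            # Start of a label or annotation entry.
            if value == "alert-label":
                target = alert["labels"]
            elif value == "alert-annotation":
                target = alert["annotations"]
            else:
                target = None
            name = None
        elif key == "name" and target is not None:
            name = value
        elif key == "value" and target is not None and name is not None:
            target[name] = value
            name = None
        else:
            alert[key] = value  # eventTime, starts-at, ends-at, alert-status
    return alert
```

With the sample above, the result would expose the severity as alert["labels"]["severity"] and the summary text as alert["annotations"]["summary"].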

For more information, refer to the SMI Cluster Manager Operations chapter of the UCC SMI Operations Guide.