CDL Alarms

This section describes the alarms in the CDL deployment.

The following alarms are introduced to detect failure:

Alarms

Alarm

Severity

Description

cdlLocalRequestFailure

critical

This alarm is triggered if the local request success rate is less than 90%, for more than 5 minutes.

cdlRemoteConnectionFailure

critical

This alarm is triggered if the active connections from the endpoint pod to the remote site reaches 0, for longer than 5 minutes. This alarm is for GR deployment only.

cdlRemoteRequestFailure

critical

This alarm is triggered if the incoming remote request success rate is less than 90%, for more than 5 minutes. This alarm is for GR deployment only.

cdlReplicationError

critical

If the ratio of outgoing replication requests to the local requests in cdl-global namespace is below 90%, for more than 5 minutes. This alarm is for GR deployment only.

cdlKafkaRemoteReplicationDelay

critical

This alarm is triggered if the kafka replication delay to the remote site crosses 10 seconds, for longer than 5 minutes. This alarm is for GR deployment only.