CDL Pods are Down

This section describes how to bring up the CDL pods when they are down.

Issue Description

The CDL pods are not in the "Running" state because of an incorrect CDL configuration.

Identifying the Issue

To identify why the pods are down, review the "describe pods" output (Containers, Member, State, Reason, or Events) using the following command:

kubectl describe pods -n <namespace> <failed pod name>
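
Before describing an individual pod, it can help to list which pods are not running. The following is a minimal sketch, not part of the CDL tooling: list_unhealthy_pods is a hypothetical helper name, and it assumes the default "kubectl get pods --no-headers" column layout (NAME READY STATUS RESTARTS AGE).

```shell
#!/bin/sh
# Hypothetical helper: print the name and status of every pod whose STATUS
# column is neither "Running" nor "Completed".
# Input is read from stdin and is assumed to be in the default
# `kubectl get pods --no-headers` layout: NAME READY STATUS RESTARTS AGE.
list_unhealthy_pods() {
    awk '$3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Example usage against a live cluster (namespace is a placeholder):
#   kubectl get pods -n <namespace> --no-headers | list_unhealthy_pods
```

Each pod name printed by the helper can then be passed to the "kubectl describe pods" command above.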

Possible Causes

The possible causes are:

  • Pods are in "Pending" state.

  • Pods are in "CrashLoopBackOff" failure state.

  • Pods are in "ImagePullBackOff" failure state.

Resolution

  • When the pods are in "Pending" state:

    • Verify whether the k8s nodes with the label value cdl/node-type are present. Also, ensure that the number of replicas is less than or equal to the number of k8s nodes with the label value cdl/node-type.

      kubectl get nodes -l smi.cisco.com/node-type=<value of cdl/node-type, default value is 'session' in multi node setup>

  • When the pods are in "CrashLoopBackOff" failure state:

    • Verify the status of the ETCD pods.

      kubectl describe pods -n <namespace> <etcd pod name>

      If the ETCD pods are not running, resolve the ETCD issues to bring up the pods.

  • When the pods are in "ImagePullBackOff" failure state:

    • Verify whether the helm repository and image registry are accessible.

    • Verify whether the required proxy and DNS servers are configured.
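
For the "Pending" case above, the comparison between the replica count and the number of labeled nodes can be sketched as a small shell function. This is an illustrative sketch only: replicas_fit_nodes is a hypothetical helper name, the replica count must be supplied by the operator from the CDL configuration, and the input is assumed to be the output of "kubectl get nodes ... --no-headers" with one node per line.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the requested replica count fits on the
# labeled nodes, and fail (non-zero status) otherwise.
#   $1    = number of replicas configured for the CDL pods
#   stdin = output of `kubectl get nodes -l <label> --no-headers`
replicas_fit_nodes() {
    replicas=$1
    # Count non-empty input lines; each one represents a matching node.
    nodes=$(awk 'NF { n++ } END { print n + 0 }')
    if [ "$replicas" -le "$nodes" ]; then
        echo "OK: $replicas replicas, $nodes labeled nodes"
    else
        echo "INSUFFICIENT: $replicas replicas, $nodes labeled nodes"
        return 1
    fi
}

# Example usage ('session' is the default label value in a multi node setup;
# the replica count of 2 is an assumed value for illustration):
#   kubectl get nodes -l smi.cisco.com/node-type=session --no-headers \
#     | replicas_fit_nodes 2
```

When the helper reports INSUFFICIENT, either label additional nodes or reduce the replica count so that the pods can be scheduled.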