Troubleshooting DKAM Failed State
There is a known issue where the DKAM may enter a Failed state because of storage problems when deployed on-premises. The root cause is that the Persistent Volume Claim (PVC) is provisioned with the openebs-hostpath storage class, which is not well suited to this setup. The recommended solution is to move to the openebs-lvmpv storage class, which will be incorporated in a future release.
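To confirm which storage class the claim was provisioned with, a query along the following lines may help. This is a minimal sketch, not an official procedure: the namespace orch-infra and the claim name dkam-pvc are taken from the verification commands later in this section, and the function name is hypothetical.

```shell
# Sketch: print the storage class of the DKAM PVC.
# Assumed values (from the verification commands in this section):
NAMESPACE="${NAMESPACE:-orch-infra}"
PVC_NAME="${PVC_NAME:-dkam-pvc}"

# Hypothetical helper: prints the storageClassName recorded in the PVC spec,
# for example openebs-hostpath.
dkam_pvc_storage_class() {
  kubectl get pvc "$PVC_NAME" -n "$NAMESPACE" \
    -o jsonpath='{.spec.storageClassName}'
}
```

Calling dkam_pvc_storage_class on an affected deployment would be expected to print openebs-hostpath.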
Check for DKAM Failed State
Verify whether the DKAM pod is in a Running or Failed state.
Run the commands below to check for the Failed state:
$ kubectl get pods -n orch-infra | grep dkam
orch-infra   mi-dkam-867d9bd977-qk8cb   2/2   Failed   21 (43m ago)   3d23h

$ kubectl get pvc -A | grep dkam-pvc
orch-infra   dkam-pvc   Pending

$ kubectl describe pod mi-dkam-867d9bd977-qk8cb -n orch-infra
Warning   Failed   30s   kubelet, node-01   Failed to start container with id xxxxxxxx on node node-01: Error response from daemon: {Reason of the failure}
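The PVC check above can also be scripted so that it returns a pass/fail result instead of requiring the output to be read by eye. The following is a minimal sketch under the same assumptions as the commands above (namespace orch-infra, claim dkam-pvc); the function names are hypothetical.

```shell
# Sketch: report whether the DKAM PVC has been bound to a volume.
# Assumed values (from the commands above):
NAMESPACE="${NAMESPACE:-orch-infra}"
PVC_NAME="${PVC_NAME:-dkam-pvc}"

# Hypothetical helper: prints the PVC phase (Pending, Bound, or Lost).
dkam_pvc_phase() {
  kubectl get pvc "$PVC_NAME" -n "$NAMESPACE" -o jsonpath='{.status.phase}'
}

# Hypothetical helper: returns 0 if the PVC is Bound, non-zero otherwise.
dkam_pvc_is_bound() {
  [ "$(dkam_pvc_phase)" = "Bound" ]
}
```

A claim that stays in the Pending phase, as in the sample output above, makes dkam_pvc_is_bound return a non-zero status, which matches the failure described in this section.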
Workaround to Resolve the Issue
If the DKAM fails to start because it cannot acquire the necessary PVC, administrators are advised to implement one of these temporary workarounds. The DKAM should restart and reach a Running state after this workaround.
Option 1
Remove the filesystem contents within the OpenEBS local directory using the following command. This will delete all files and directories under /var/openebs/local/:
rm -rf /var/openebs/local/*
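Because the command above is destructive, it may be worth archiving the directory first. The sketch below shows one way to do that with tar and only clears the directory if the archive was written successfully. The backup location /var/backups and the function names are assumptions for illustration, not part of the documented procedure.

```shell
# Sketch: archive the OpenEBS local directory, then clear it.
OPENEBS_DIR="${OPENEBS_DIR:-/var/openebs/local}"
BACKUP_DIR="${BACKUP_DIR:-/var/backups}"   # hypothetical backup location

# Hypothetical helper: writes a timestamped tar.gz of OPENEBS_DIR into
# BACKUP_DIR and prints the archive path.
backup_openebs_local() {
  archive="$BACKUP_DIR/openebs-local-$(date +%Y%m%d-%H%M%S).tar.gz"
  mkdir -p "$BACKUP_DIR" || return 1
  # -C changes into the directory so the archive holds relative paths.
  tar -czf "$archive" -C "$OPENEBS_DIR" . || return 1
  printf '%s\n' "$archive"
}

# Hypothetical helper: clears the directory only if the backup succeeded.
backup_and_clear_openebs_local() {
  backup_openebs_local || return 1
  # ${OPENEBS_DIR:?} aborts if the variable is empty, so this can never
  # expand to "rm -rf /*".
  rm -rf "${OPENEBS_DIR:?}"/*
}
```

Both functions need to run on the node that hosts the directory, with sufficient privileges to read it and to write to the backup location.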
Note
This workaround involves operations that can lead to data loss. Ensure that data is backed up appropriately before proceeding. After applying a workaround, restarting the DKAM service re-initiates the PVC acquisition process.
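One possible way to trigger that restart is to delete the DKAM pod so that it is recreated. The sketch below assumes the pod is managed by a controller such as a Deployment (the pod name in the sample output suggests this, but it is not confirmed here) and that matching pods by the substring dkam is sufficient; the function names are hypothetical.

```shell
# Sketch: restart DKAM by deleting its pod(s) in the namespace.
NAMESPACE="${NAMESPACE:-orch-infra}"   # namespace from the commands above

# Hypothetical helper: prints pod names (as pod/<name>) containing "dkam".
dkam_pod_names() {
  kubectl get pods -n "$NAMESPACE" -o name | grep dkam
}

# Hypothetical helper: deletes each DKAM pod; the owning controller is
# assumed to create a replacement.
restart_dkam() {
  for pod in $(dkam_pod_names); do
    kubectl delete -n "$NAMESPACE" "$pod"
  done
}
```

After running restart_dkam, the pod and PVC checks from the earlier section can be repeated to confirm that the pod reaches the Running state.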