Issues This KB Resolves
- A pod is stuck Pending due to CPU/memory constraints on the node that holds the pod's pinned volume, e.g.:
  0/5 nodes are available: 1 Insufficient cpu, 4 node(s) had volume node affinity conflict.
- A user needs to migrate a single pod to another node.
- A notebook needs to be migrated to another node.
- A notebook cannot be scheduled due to capacity issues.
- A notebook's previous node was scaled down or drained.
Rok is a powerful tool for snapshotting and rescheduling pods.
See our drain documentation to learn how to safely drain nodes. Rok can schedule volume-backed pods across availability zones, clouds, or even to an on-prem environment, as long as they remain within a single cluster with access to object storage. For migrations across clusters or environments, Rok-Registry is the tool of choice.
How does this work?
The csi-controller checks for *:NoSchedule taints to determine whether a PVC needs unpinning. The controller moves the PersistentVolume (PV) to a different node by removing the nodeAffinity for that PV, effectively allowing Kubernetes to use that volume from any other node. When Kubernetes tries to mount the volume on a new node (the ControllerPublishVolume CSI call), Rok recreates the volume on that node from the latest available snapshot and binds the PV to the new node (re-adding nodeAffinity). That means the SAME volume, from a Kubernetes perspective, is now bound to a new node even though the storage is local. No more attaching and reattaching volumes from external storage. We still recommend you bind node pools to a single availability zone. As of EKF 1.5, a capacity-aware scheduler prevents a pod from being moved to a node without the expected capacity.
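You can check whether a node carries a taint that would trigger unpinning; a quick sketch, where `<node-name>` is a placeholder for your node:

```shell
# Show the node's taints; any taint with effect NoSchedule matches the
# *:NoSchedule pattern the csi-controller looks for.
kubectl describe node <node-name> | grep -A3 'Taints:'
```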
To migrate a pod, cordon its current node, delete the pod so it reschedules elsewhere, then uncordon the node:
kubectl cordon <node-name>
kubectl delete pod <pod-name> -n <namespace>
kubectl uncordon <node-name>
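After the delete, you can watch the replacement pod get scheduled, assuming the pod is managed by a controller that recreates it (notebooks, for example, are backed by a StatefulSet); the names below are placeholders:

```shell
# The NODE column should eventually show a node other than the cordoned one.
kubectl get pod <pod-name> -n <namespace> -o wide -w
```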
Alternate Notebook Procedure
If the pod is a notebook and the above process doesn't work, you can use Rok's snapshot capabilities to move the notebook. Note that the docs referenced below are from an earlier release, but the process has not changed. If you run into any issues, please contact Arrikto support.
- Take a snapshot of the notebook. https://docs.arrikto.com/release-1.4/user/rok/snapshot-notebook.html
- Create a new notebook from this snapshot. https://docs.arrikto.com/release-1.4/user/rok/present-notebook.html
- Delete the old notebook via the Kubeflow dashboard.
Pod is UNABLE to schedule
When attempting to determine why a pod cannot schedule, look at the pod description with a kubectl describe pod <pod-name> command; this works even while the pod is Pending. Occasionally (before the 1.5 release) pods can be scheduled onto nodes without the appropriate amount of storage capacity. This can also happen if a user spins up a notebook and then stops it: if the node the notebook WAS on is scaled away or drained, the notebook will need to be moved if it hasn't been already. Make sure to confirm the requested ephemeral-storage is not saturating the cluster.
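Both checks can be run from the command line; the namespace and pod names below are placeholders:

```shell
# Scheduling failures appear under Events at the bottom of the output, e.g.
# "0/5 nodes are available: 1 Insufficient cpu, 4 node(s) had volume node affinity conflict."
kubectl describe pod <pod-name> -n <namespace>

# List each pod's requested ephemeral-storage in the namespace to spot saturation.
kubectl get pods -n <namespace> \
  -o custom-columns='NAME:.metadata.name,EPHEMERAL-STORAGE:.spec.containers[*].resources.requests.ephemeral-storage'
```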
If you do NOT have enough capacity on the node, check whether the PersistentVolume is pinned with a node affinity:
kubectl get pv <pv-name> -o yaml
The nodeAffinity will show you the node the pod is expected to be scheduled on, as well as where the volume SHOULD live.
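The relevant section of the PV output looks something like the excerpt below; the label key and node name are illustrative and may differ in your cluster:

```yaml
# Excerpt from `kubectl get pv <pv-name> -o yaml`: the volume is pinned
# to the node named in the match expression.
spec:
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - ip-10-0-1-23.ec2.internal   # illustrative node name
```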
You will need to:
- Create a new notebook from a snapshot: https://docs.arrikto.com/release-1.4/user/rok/present-notebook.html
- Delete the old notebook via the Kubeflow dashboard.
When rescheduling pods, Rok uses the latest available snapshot. During a drain operation the snapshot is taken from the volume living on the node, but in this case the volume hasn't been hydrated yet, so Rok uses the last snapshot it took. Depending on when that snapshot was taken, and whether nodes were protected from unsafe operations, the volume will either be current or potentially missing some data. We highly recommend not deleting nodes for any reason, and following our unsafe operations prevention guide.