Learn to Run - Platform Configuration - Set Node Roles

Note: Always refer to the official documentation. This is just a students' guide.

The following node roles can be configured on the cluster:

  1. System node: Reserved for Run:ai system-level services.

  2. GPU Worker node: Dedicated for GPU-based workloads.

  3. CPU Worker node: Used for CPU-only workloads.

Pre-reqs

  1. Ensure that scheduling restrictions are enabled in the cluster.

Edit the runaiconfig file to set global.nodeAffinity.restrictScheduling to true.

kubectl edit runaiconfig runai -n runai
# Add the following field:
#     global.nodeAffinity.restrictScheduling: true
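For scripted setups, the same flag can be set without opening an editor. The following is a minimal sketch, assuming the field sits under `spec` in the runaiconfig resource. That path and the helper names `restrict_scheduling_patch` and `enable_restrict_scheduling` are assumptions, not taken from this guide, so check the official documentation for your version before relying on it.

```shell
# Hypothetical helper: prints the JSON merge patch that turns the flag on.
# Assumption: the setting lives at spec.global.nodeAffinity.restrictScheduling.
restrict_scheduling_patch() {
  printf '%s' '{"spec":{"global":{"nodeAffinity":{"restrictScheduling":true}}}}'
}

# Applies the patch to the runaiconfig named "runai" in the "runai" namespace.
# KUBECTL can be overridden (for example KUBECTL=echo) to preview the command.
enable_restrict_scheduling() {
  "${KUBECTL:-kubectl}" patch runaiconfig runai -n runai \
    --type merge -p "$(restrict_scheduling_patch)"
}
```

Running `enable_restrict_scheduling` against a live cluster applies the patch. Setting `KUBECTL=echo` first prints the command instead of running it.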
  2. Label the node to reflect the role:
# List the nodes
kubectl get nodes
# Choose the node to restrict to CPU-only workloads
kubectl label nodes <node-name> node-role.kubernetes.io/runai-cpu-worker=true
  3. Check that the label has been applied:
kubectl get nodes <node-name> --show-labels
  4. Remove the label to reset the node's role:
kubectl label nodes <node-name> node-role.kubernetes.io/runai-cpu-worker-
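The label, check, and reset steps above can be wrapped in small shell functions so they are easy to repeat across nodes. This is only a sketch: the function names and the `KUBECTL` override are hypothetical conveniences, and only the CPU worker label key comes from this guide.

```shell
# Label key used in this guide for CPU-only worker nodes.
CPU_ROLE_LABEL="node-role.kubernetes.io/runai-cpu-worker"

# Hypothetical wrappers around the commands above.
# KUBECTL can be overridden (for example KUBECTL=echo) to preview each command.

# Mark a node as a CPU worker.
set_cpu_role() {
  "${KUBECTL:-kubectl}" label nodes "$1" "${CPU_ROLE_LABEL}=true"
}

# Show all labels on a node to confirm the role label is present.
show_node_labels() {
  "${KUBECTL:-kubectl}" get nodes "$1" --show-labels
}

# List every node that currently carries the CPU worker label.
list_cpu_nodes() {
  "${KUBECTL:-kubectl}" get nodes -l "${CPU_ROLE_LABEL}=true"
}

# Remove the role label (the trailing "-" deletes the key).
reset_cpu_role() {
  "${KUBECTL:-kubectl}" label nodes "$1" "${CPU_ROLE_LABEL}-"
}
```

For example, `set_cpu_role <node-name>` followed by `list_cpu_nodes` labels a node and then confirms it appears in the filtered list.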