Upgrading The Cluster¶
Note - Always refer to documentation - this is just a students' guide
Pre-Reqs¶
Set some ENVs for your cluster¶
- Set your domain
- Change to the Artifacts directory
Back up your runai-backend
values¶
- Recover values with helm:
- Run the upgrade to the new version
helm upgrade runai-backend -n runai-backend runai-backend/control-plane \
--version "~2.21" -f runai_control_plane_values.yaml --reset-then-reuse-values
Observe the Backend Pods¶
- In your terminal, launch K9S
- Set the namespace
Install the Cluster¶
- When all pods are healthy in the
runai-backend
namespace: - Go to the Resources / Clusters page
- Check the box next to the cluster you want to upgrade and select "GET INSTALLATION INSTRUCTIONS"
- Deploy to the same cluster as the control plane and set relevant matching version
- Copy the installation instructions to deploy
- Click "done"
Observe the Cluster Pods¶
- In your terminal, launch K9S
- Set the namespace
Check for the correct nodes and GPUs¶
- Check the Resources / Clusters tab to ensure both the backend and cluster are at the correct version
- Visit the nodes and page to ensure that all expected GPUs are registered
- View the dashboards to ensure that metrics are working as expected