Kubernetes Survival Guide
A quick reference to the essential kubectl commands used during the Run:AI course, aimed at technical users who need to find their way around a Kubernetes cluster.
Basic kubectl Commands
Cluster and Context
# Check cluster connection
kubectl cluster-info
# View nodes
kubectl get nodes
# Check current context
kubectl config current-context
Namespaces
# List namespaces
kubectl get namespaces
kubectl get ns # short form
# Set default namespace
kubectl config set-context --current --namespace=<namespace>
# List pods in all namespaces
kubectl get pods --all-namespaces
kubectl get pods -A # short form
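A changed default namespace is easy to forget about later in a session. As a minimal sketch, the helper below switches the namespace of the current context and then prints the value now stored in the kubeconfig so the change can be confirmed. The function name `kns` is hypothetical and not part of kubectl; both kubectl invocations inside it are standard `kubectl config` subcommands.

```shell
# kns: hypothetical helper, not a kubectl built-in.
# Usage: kns <namespace>
kns() {
  if [ -z "$1" ]; then
    echo "usage: kns <namespace>" >&2
    return 1
  fi
  # Set the default namespace on the current context
  kubectl config set-context --current --namespace="$1" || return 1
  # Print the namespace now recorded for the current context
  kubectl config view --minify --output 'jsonpath={..namespace}'
  echo
}
```

For a one-off command it is usually simpler to leave the default alone and pass `-n <namespace>` instead, for example `kubectl get pods -n runai-system`.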
Working with Pods
# List pods
kubectl get pods
kubectl get pods -o wide
# Describe pod
kubectl describe pod <pod-name>
# View logs
kubectl logs <pod-name>
kubectl logs -f <pod-name> # follow logs
# Execute commands
kubectl exec -it <pod-name> -- /bin/bash # interactive shell; try /bin/sh if the image has no bash
kubectl exec <pod-name> -- <command>
# Copy files (kubectl cp needs the tar binary inside the container)
kubectl cp <pod>:<path> <local-path> # pod to local
kubectl cp <local-path> <pod>:<path> # local to pod
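When a namespace holds many pods, it can help to list only the ones that are not running. The sketch below wraps `kubectl get pods` with a field selector on `status.phase`; the function name `kpending` is hypothetical. Note that it also shows pods that finished normally (phase `Succeeded`), not only broken ones.

```shell
# kpending: hypothetical helper, not a kubectl built-in.
# Lists pods whose phase is not Running (Pending, Succeeded, Failed, Unknown).
# Usage: kpending            # current namespace
#        kpending -A         # all namespaces
#        kpending -n <ns>    # one namespace
kpending() {
  # Extra arguments are passed straight through to kubectl
  kubectl get pods '--field-selector=status.phase!=Running' -o wide "$@"
}
```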
Other Resources
# Deployments
kubectl get deployments
kubectl describe deployment <name>
kubectl scale deployment <name> --replicas=3
# Services
kubectl get services
kubectl describe service <name>
# ConfigMaps and Secrets
kubectl get configmaps
kubectl get secrets
kubectl describe secret <name>
# Events (for troubleshooting)
kubectl get events --sort-by=.metadata.creationTimestamp
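On a busy cluster the full event list is noisy. The sketch below narrows it down with field selectors, which `kubectl get events` supports for fields such as `type` and `involvedObject.name`. The function names `kwarn` and `kpodevents` are hypothetical.

```shell
# kwarn: hypothetical helper that lists only Warning events, oldest first.
# Usage: kwarn [kubectl options, e.g. -n <namespace> or -A]
kwarn() {
  kubectl get events --field-selector type=Warning \
    --sort-by=.metadata.creationTimestamp "$@"
}

# kpodevents: hypothetical helper that lists events attached to one object name.
# Usage: kpodevents <pod-name> [kubectl options]
kpodevents() {
  if [ -z "$1" ]; then
    echo "usage: kpodevents <pod-name> [kubectl options]" >&2
    return 1
  fi
  local pod="$1"
  shift
  kubectl get events --field-selector "involvedObject.name=$pod" \
    --sort-by=.metadata.creationTimestamp "$@"
}
```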
Port Forwarding
# Forward port to pod
kubectl port-forward pod/<pod-name> <local-port>:<pod-port>
# Forward port to service
kubectl port-forward service/<service-name> <local-port>:<service-port>
# Example: Access Run:AI UI
kubectl port-forward -n runai-system service/runai-cluster-service 8080:80
Resource Monitoring
# Resource usage (requires the Metrics Server to be running in the cluster)
kubectl top nodes
kubectl top pods
# GPU resources
kubectl describe node <node-name> | grep nvidia.com/gpu
# Persistent volumes
kubectl get pv
kubectl get pvc
Run:AI Specific
# Run:AI system components
kubectl get pods -n runai-system
# Project workloads
kubectl get pods -n runai-<project-name>
# GPU capacity per node (quote the expression so the shell keeps the backslash)
kubectl get nodes -o custom-columns='NAME:.metadata.name,GPU:.status.capacity.nvidia\.com/gpu'
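For a single node, the same GPU fields can be read with a JSONPath expression. The sketch below prints both `capacity` (what the node reports) and `allocatable` (what the scheduler can hand to pods); neither value shows how many GPUs are currently requested by running workloads. The function name `kgpu` is hypothetical, and the dot in `nvidia.com/gpu` is escaped as `\.` inside single quotes so that kubectl treats it as part of the key.

```shell
# kgpu: hypothetical helper, not a kubectl or Run:AI built-in.
# Usage: kgpu <node-name>
kgpu() {
  if [ -z "$1" ]; then
    echo "usage: kgpu <node-name>" >&2
    return 1
  fi
  # Text outside the braces is printed literally by the jsonpath output format
  kubectl get node "$1" -o \
    jsonpath='{.metadata.name}{"\t"}capacity={.status.capacity.nvidia\.com/gpu}{"\t"}allocatable={.status.allocatable.nvidia\.com/gpu}{"\n"}'
}
```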
Troubleshooting
# Check pod status and events
kubectl describe pod <pod-name>
# View previous container logs
kubectl logs <pod-name> --previous
# Check resource constraints
kubectl describe node <node-name>
# Test connectivity from pod
kubectl exec -it <pod-name> -- ping <target>
kubectl exec -it <pod-name> -- nslookup kubernetes.default
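The first three checks above are usually run together, so they can be bundled. The sketch below is one possible way to do that; the function name `kdiag` and the 50-line tail are arbitrary choices, and the fallback message only guesses at why `--previous` failed (most often the container has not restarted).

```shell
# kdiag: hypothetical helper that gathers first-look data for one pod.
# Usage: kdiag <pod-name> [kubectl options, e.g. -n runai-<project-name>]
kdiag() {
  if [ -z "$1" ]; then
    echo "usage: kdiag <pod-name> [kubectl options]" >&2
    return 1
  fi
  local pod="$1"
  shift
  echo "== describe =="
  kubectl describe pod "$pod" "$@"
  echo "== current logs (last 50 lines) =="
  kubectl logs "$pod" --tail=50 "$@"
  echo "== previous container logs (last 50 lines) =="
  kubectl logs "$pod" --previous --tail=50 "$@" \
    || echo "(no previous container logs available)"
}
```

Redirecting the output to a file, for example `kdiag <pod-name> -n runai-<project-name> > pod-report.txt`, gives something that can be attached to a support request.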
Useful Aliases
Add to ~/.bashrc or ~/.zshrc:
alias k='kubectl'
alias kgp='kubectl get pods'
alias kgs='kubectl get services'
alias kdp='kubectl describe pod'
alias kl='kubectl logs'
alias kex='kubectl exec -it'
Quick Reference
Task | Command |
---|---|
List pods | kubectl get pods |
Pod details | kubectl describe pod <name> |
Pod logs | kubectl logs <name> |
Shell access | kubectl exec -it <name> -- /bin/bash |
Port forward | kubectl port-forward <name> 8080:80 |
Check events | kubectl get events |
Node status | kubectl get nodes |
All namespaces | kubectl get pods -A |