Kubernetes Survival Guide

A quick reference to the essential kubectl commands used during the Run:AI course. It is aimed at technical users who need to find their way around a Kubernetes cluster.

Basic kubectl Commands

Cluster and Context

# Check cluster connection
kubectl cluster-info

# View nodes
kubectl get nodes

# Check current context
kubectl config current-context

Namespaces

# List namespaces
kubectl get namespaces
kubectl get ns

# Set default namespace
kubectl config set-context --current --namespace=<namespace>

# List pods across all namespaces
kubectl get pods --all-namespaces
kubectl get pods -A
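
The set-context command above changes the default namespace silently, so it can help to print the result back. The following is a minimal sketch; `kns` is a hypothetical helper name, not a kubectl built-in.

```shell
# Hypothetical helper (kns is not part of kubectl): set the default
# namespace for the current context, then print it back as confirmation.
kns() {
  if [ -z "${1:-}" ]; then
    echo "usage: kns <namespace>" >&2
    return 1
  fi
  kubectl config set-context --current --namespace="$1" >/dev/null &&
    kubectl config view --minify --output 'jsonpath={..namespace}' &&
    echo
}
```

Usage: `kns runai-<project-name>`. For a one-off command, passing `-n <namespace>` leaves the context untouched.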

Working with Pods

# List pods
kubectl get pods
kubectl get pods -o wide

# Describe pod
kubectl describe pod <pod-name>

# View logs
kubectl logs <pod-name>
kubectl logs -f <pod-name>  # follow logs

# Execute commands
kubectl exec -it <pod-name> -- /bin/bash
kubectl exec <pod-name> -- <command>

# Copy files
kubectl cp <pod>:<path> <local-path>
kubectl cp <local-path> <pod>:<path>
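
Many container images ship without /bin/bash, in which case the exec command above fails. A possible workaround, sketched here under the assumption that the image at least provides `sh`, is to try bash and fall back to sh. `kshell` is a hypothetical name chosen for this example.

```shell
# Hypothetical helper: open an interactive shell in a pod, preferring bash
# and falling back to sh. Assumes the container image provides `sh`.
kshell() {
  if [ -z "${1:-}" ]; then
    echo "usage: kshell <pod-name> [kubectl flags...]" >&2
    return 1
  fi
  _pod=$1
  shift
  # Remaining arguments (for example -n <namespace> or -c <container>)
  # are passed to kubectl before the `--` separator.
  kubectl exec -it "$_pod" "$@" -- \
    sh -c 'command -v bash >/dev/null 2>&1 && exec bash || exec sh'
}
```

Usage: `kshell <pod-name> -n <namespace>`.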

Other Resources

# Deployments
kubectl get deployments
kubectl describe deployment <name>
kubectl scale deployment <name> --replicas=3

# Services
kubectl get services
kubectl describe service <name>

# ConfigMaps and Secrets
kubectl get configmaps
kubectl get secrets
kubectl describe secret <name>

# Events (for troubleshooting)
kubectl get events --sort-by=.metadata.creationTimestamp
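
Scaling a deployment returns immediately, before the new replicas are ready. As a sketch, the scale command can be paired with `kubectl rollout status` to block until the rollout completes. The function name and the 120s default timeout are assumptions made for this example.

```shell
# Hypothetical helper: scale a deployment, then wait until the rollout
# finishes or the timeout expires (the 120s default is an arbitrary choice).
scale_and_wait() {
  if [ "$#" -lt 2 ]; then
    echo "usage: scale_and_wait <deployment> <replicas> [timeout]" >&2
    return 1
  fi
  kubectl scale deployment "$1" --replicas="$2" &&
    kubectl rollout status "deployment/$1" --timeout="${3:-120s}"
}
```

Usage: `scale_and_wait <name> 3` or `scale_and_wait <name> 3 5m`.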

Port Forwarding

# Forward port to pod
kubectl port-forward pod/<pod-name> <local-port>:<pod-port>

# Forward port to service
kubectl port-forward service/<service-name> <local-port>:<service-port>

# Example: Access Run:AI UI
kubectl port-forward -n runai-system service/runai-cluster-service 8080:80
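
`kubectl port-forward` runs in the foreground until interrupted. For scripted checks, one option is to start it in the background, run a single command against the local port, and then stop the tunnel. This is a rough sketch: `with_port_forward` and the `PF_WAIT` variable are hypothetical names, and the fixed delay is a crude stand-in for a real readiness check.

```shell
# Hypothetical helper: run one command while a port-forward is active,
# then tear the tunnel down and return the command's exit status.
with_port_forward() {
  if [ "$#" -lt 3 ]; then
    echo "usage: with_port_forward <target> <local:remote> <command...>" >&2
    return 1
  fi
  _pf_target=$1
  _pf_ports=$2
  shift 2
  kubectl port-forward "$_pf_target" "$_pf_ports" >/dev/null 2>&1 &
  _pf_pid=$!
  # Crude wait for the tunnel to come up; PF_WAIT (seconds) is an assumption.
  sleep "${PF_WAIT:-2}"
  _pf_status=0
  "$@" || _pf_status=$?
  # Stop the background port-forward and reap it quietly.
  kill "$_pf_pid" 2>/dev/null || true
  wait "$_pf_pid" 2>/dev/null || true
  return "$_pf_status"
}
```

Usage: `with_port_forward service/<service-name> 8080:80 curl -sS http://localhost:8080/`.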

Resource Monitoring

# Resource usage
kubectl top nodes
kubectl top pods

# GPU resources
kubectl describe node <node-name> | grep nvidia.com/gpu

# Persistent volumes
kubectl get pv
kubectl get pvc

Run:AI Specific

# Run:AI system components
kubectl get pods -n runai-system

# Project workloads
kubectl get pods -n runai-<project-name>

# GPU capacity per node (quote the expression so the shell keeps the \. escapes)
kubectl get nodes -o 'custom-columns=NAME:.metadata.name,GPU:.status.capacity.nvidia\.com/gpu'
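
To get a single cluster-wide number, the per-node GPU column can be summed with awk. This is a minimal sketch, assuming the GPUs are exposed under the `nvidia.com/gpu` resource name. `total_gpus` is a hypothetical helper, and nodes without GPUs (shown as `<none>`) are skipped.

```shell
# Hypothetical helper: sum the nvidia.com/gpu capacity over all nodes.
total_gpus() {
  kubectl get nodes --no-headers \
    -o 'custom-columns=NAME:.metadata.name,GPU:.status.capacity.nvidia\.com/gpu' |
    # Only numeric values are added; "<none>" rows are ignored.
    awk '$2 ~ /^[0-9]+$/ { total += $2 } END { print total + 0 }'
}
```

This reports capacity, not what is currently free or allocated to workloads.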

Troubleshooting

# Check pod status and events
kubectl describe pod <pod-name>

# View previous container logs
kubectl logs <pod-name> --previous

# Check resource constraints
kubectl describe node <node-name>

# Test connectivity from pod
kubectl exec -it <pod-name> -- ping <target>
kubectl exec -it <pod-name> -- nslookup kubernetes.default
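
The first checks above can be bundled into one pass over a failing pod. The sketch below is one possible arrangement; `triage_pod` is a hypothetical name, and the 50-line log tail is an arbitrary choice.

```shell
# Hypothetical helper: collect the usual first-look diagnostics for one pod.
triage_pod() {
  if [ -z "${1:-}" ]; then
    echo "usage: triage_pod <pod-name> [kubectl flags...]" >&2
    return 1
  fi
  _pod=$1
  shift
  echo "== describe =="
  kubectl describe pod "$_pod" "$@"
  echo "== previous logs (last 50 lines) =="
  # --previous fails if the container has never restarted, so report that.
  kubectl logs "$_pod" --previous --tail=50 "$@" ||
    echo "(no previous container instance)"
  echo "== events =="
  kubectl get events --field-selector "involvedObject.name=$_pod" \
    --sort-by=.metadata.creationTimestamp "$@"
}
```

Usage: `triage_pod <pod-name> -n runai-<project-name>`.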

Useful Aliases

Add to ~/.bashrc or ~/.zshrc:

alias k='kubectl'
alias kgp='kubectl get pods'
alias kgs='kubectl get services'
alias kdp='kubectl describe pod'
alias kl='kubectl logs'
alias kex='kubectl exec -it'

Quick Reference

| Task | Command |
|------|---------|
| List pods | `kubectl get pods` |
| Pod details | `kubectl describe pod <name>` |
| Pod logs | `kubectl logs <name>` |
| Shell access | `kubectl exec -it <name> -- /bin/bash` |
| Port forward | `kubectl port-forward <name> 8080:80` |
| Check events | `kubectl get events` |
| Node status | `kubectl get nodes` |
| All namespaces | `kubectl get pods -A` |