Kubernetes Survival Guide

A quick reference to the essential kubectl commands used during the Run:AI course, written for technical users who need to find their way around a Kubernetes cluster.

Basic kubectl Commands

Cluster and Context

# Check cluster connection
kubectl cluster-info

# View nodes
kubectl get nodes

# Check current context
kubectl config current-context

Namespaces

# List namespaces
kubectl get namespaces
kubectl get ns

# Set default namespace
kubectl config set-context --current --namespace=<namespace>

# List pods across all namespaces
kubectl get pods --all-namespaces
kubectl get pods -A
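
After changing the default namespace, it can help to confirm what the current context now points at. The sketch below is one way to do that; `current_ns` is a hypothetical helper name, not part of kubectl, and it assumes that an empty result means the context has no namespace set (in which case kubectl uses `default`).

```shell
# Hypothetical helper: print the namespace the current context defaults to.
current_ns() {
  local ns
  # --minify limits the output to the current context only.
  ns="$(kubectl config view --minify --output 'jsonpath={..namespace}')"
  # No namespace on the context means kubectl falls back to "default".
  echo "${ns:-default}"
}
```

Running `current_ns` right after `kubectl config set-context --current --namespace=<namespace>` should echo the namespace that was just set.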

Working with Pods

# List pods
kubectl get pods
kubectl get pods -o wide

# Describe pod
kubectl describe pod <pod-name>

# View logs
kubectl logs <pod-name>
kubectl logs -f <pod-name>  # follow logs

# Execute commands
kubectl exec -it <pod-name> -- /bin/bash
kubectl exec <pod-name> -- <command>

# Copy files
kubectl cp <pod>:<path> <local-path>
kubectl cp <local-path> <pod>:<path>
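
Minimal container images often ship without bash, in which case the `/bin/bash` form of `kubectl exec` fails. The following is a sketch of a fallback, assuming the container at least has `sh` on its PATH; `pod_shell` is a hypothetical helper name.

```shell
# Hypothetical helper: open an interactive shell in a pod,
# using bash when the image has it and plain sh otherwise.
pod_shell() {
  local pod="$1"
  kubectl exec -it "$pod" -- sh -c \
    'if command -v bash >/dev/null 2>&1; then exec bash; else exec sh; fi'
}
```

Usage would be `pod_shell <pod-name>`. The check runs inside the container, so exiting the shell with a non-zero status does not trigger a second attempt.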

Other Resources

# Deployments
kubectl get deployments
kubectl describe deployment <name>
kubectl scale deployment <name> --replicas=3

# Services
kubectl get services
kubectl describe service <name>

# ConfigMaps and Secrets
kubectl get configmaps
kubectl get secrets
kubectl describe secret <name>

# Events (for troubleshooting)
kubectl get events --sort-by=.metadata.creationTimestamp
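
On a busy namespace the full event list is noisy. As a sketch, events can be narrowed to a single pod with a field selector; `pod_events` is a hypothetical helper name, and the example assumes the pod lives in the current default namespace.

```shell
# Hypothetical helper: show only the events that refer to one pod, oldest first.
pod_events() {
  local pod="$1"
  kubectl get events \
    --field-selector "involvedObject.kind=Pod,involvedObject.name=$pod" \
    --sort-by=.metadata.creationTimestamp
}
```

Call it as `pod_events <pod-name>`.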

Port Forwarding

# Forward port to pod
kubectl port-forward pod/<pod-name> <local-port>:<pod-port>

# Forward port to service
kubectl port-forward service/<service-name> <local-port>:<service-port>

# Example: Access Run:AI UI
kubectl port-forward -n runai-system service/runai-cluster-service 8080:80

Resource Monitoring

# Resource usage (requires metrics-server in the cluster)
kubectl top nodes
kubectl top pods

# GPU resources
kubectl describe node <node-name> | grep nvidia.com/gpu

# Persistent volumes
kubectl get pv
kubectl get pvc

Run:AI Specific

# Run:AI system components
kubectl get pods -n runai-system

# Project workloads
kubectl get pods -n runai-<project-name>

# GPU allocation on nodes
kubectl get nodes -o custom-columns='NAME:.metadata.name,GPU:.status.capacity.nvidia\.com/gpu'

Troubleshooting

# Check pod status and events
kubectl describe pod <pod-name>

# View previous container logs
kubectl logs <pod-name> --previous

# Check resource constraints
kubectl describe node <node-name>

# Test connectivity from pod (the image must include ping and nslookup)
kubectl exec <pod-name> -- ping -c 3 <target>
kubectl exec -it <pod-name> -- nslookup kubernetes.default
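
When asking for help with a failing pod, it is often easier to capture the output of the commands above in one go. The following is a minimal sketch; `pod_diag` and the output file names are hypothetical choices, and it assumes a single-container pod in the current default namespace.

```shell
# Hypothetical helper: save describe output and logs for one pod into a directory.
pod_diag() {
  local pod="$1"
  local out="${2:-./diag-$pod}"
  mkdir -p "$out"
  kubectl describe pod "$pod" > "$out/describe.txt"
  kubectl logs "$pod" > "$out/logs.txt"
  # --previous fails when the container has never restarted; ignore that case.
  kubectl logs "$pod" --previous > "$out/logs-previous.txt" 2>/dev/null || true
}
```

For example, `pod_diag <pod-name>` writes three files under `./diag-<pod-name>`, which can then be attached to a support request.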

Useful Aliases

Add to ~/.bashrc or ~/.zshrc:

alias k='kubectl'
alias kgp='kubectl get pods'
alias kgs='kubectl get services'
alias kdp='kubectl describe pod'
alias kl='kubectl logs'
alias kex='kubectl exec -it'

Quick Reference

Task            Command
List pods       kubectl get pods
Pod details     kubectl describe pod <name>
Pod logs        kubectl logs <name>
Shell access    kubectl exec -it <name> -- /bin/bash
Port forward    kubectl port-forward <name> 8080:80
Check events    kubectl get events
Node status     kubectl get nodes
All namespaces  kubectl get pods -A