Fenil Sonani

Kubernetes Complete Guide: Architecture, Concepts & Practical Implementation

Kubernetes has become the de facto standard for container orchestration, letting organizations deploy, scale, and manage containerized applications across clusters of machines. This guide moves from core concepts and cluster setup to production concerns such as networking, storage, scheduling, autoscaling, and monitoring.

Table of Contents

  1. What is Kubernetes and Why Use It?
  2. Kubernetes Architecture Deep Dive
  3. Core Kubernetes Objects
  4. Setting Up Your First Cluster
  5. Working with Pods
  6. Deployments and ReplicaSets
  7. Services and Networking
  8. ConfigMaps and Secrets
  9. Storage and Persistent Volumes
  10. Advanced Scheduling and Resource Management
  11. Monitoring and Logging
  12. Security Best Practices
  13. Production Deployment Patterns

What is Kubernetes and Why Use It?

Kubernetes (often abbreviated as K8s) is an open-source container orchestration platform that automates the deployment, scaling, and management of containerized applications. Google open-sourced it in 2014, and the project is now hosted by the Cloud Native Computing Foundation (CNCF).

Key Benefits of Kubernetes

🚀 Container Orchestration at Scale

  • Automated deployment and scaling of containers
  • Self-healing capabilities with automatic restart and replacement
  • Load balancing and service discovery
  • Rolling updates and rollbacks

📈 Resource Optimization

  • Efficient resource utilization across clusters
  • Automatic bin packing based on resource requirements
  • Horizontal and vertical pod autoscaling
  • Multi-tenancy support

🔄 Developer Productivity

  • Declarative configuration management
  • Infrastructure as Code principles
  • Consistent development and production environments
  • Simplified application deployment workflows

🏢 Enterprise Features

  • Multi-cloud and hybrid cloud support
  • Built-in security features and RBAC
  • Extensible architecture with custom resources
  • Rich ecosystem of tools and integrations

When to Use Kubernetes

✅ Ideal Use Cases:

  • Microservices architectures
  • Applications requiring high availability
  • Multi-environment deployments (dev, staging, prod)
  • Applications with varying load patterns
  • Container-native applications
  • Teams practicing DevOps and CI/CD

❌ Consider Alternatives For:

  • Simple, single-container applications
  • Legacy monolithic applications
  • Small teams without Kubernetes expertise
  • Applications with minimal scaling requirements

Kubernetes Architecture Deep Dive

Understanding Kubernetes architecture is crucial for effective cluster management and troubleshooting. Let's explore the key components and their interactions.

Cluster Architecture Overview

[Diagram: Kubernetes cluster architecture, showing the control plane components and the worker node components described in the following sections]

Control Plane (Master Node) Components

API Server (kube-apiserver)

  • Central management component
  • RESTful API for all cluster operations
  • Authentication and authorization
  • Validation and admission control

etcd

  • Distributed key-value store
  • Stores all cluster configuration and state
  • Provides consistency and high availability
  • Backup and restore capabilities

Scheduler (kube-scheduler)

  • Assigns pods to worker nodes
  • Considers resource requirements, constraints, and policies
  • Implements scheduling algorithms and priorities
  • Handles node affinity and anti-affinity rules

Controller Manager (kube-controller-manager)

  • Runs controller processes
  • Maintains desired state of the cluster
  • Handles node lifecycle, replication, and endpoints
  • Implements self-healing capabilities

Worker Node Components

kubelet

  • Primary node agent
  • Communicates with API server
  • Manages pod lifecycle on the node
  • Reports node and pod status

kube-proxy

  • Network proxy and load balancer
  • Implements Kubernetes Service abstraction
  • Manages network rules and routing
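  • Routes traffic sent to Service virtual IPs to backend pods (DNS-based service discovery itself comes from the cluster DNS add-on, typically CoreDNS)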

Container Runtime

  • Runs containers through a CRI-compatible runtime (containerd, CRI-O; Docker Engine needs the cri-dockerd adapter since Kubernetes 1.24)
  • Pulls container images
  • Manages container lifecycle
  • Provides container isolation

Kubernetes Networking Model

Kubernetes implements a flat networking model with these requirements:

  1. Pod-to-Pod Communication: All pods can communicate without NAT
  2. Node-to-Pod Communication: Nodes can communicate with all pods
  3. Pod IP Addressing: Each pod gets a unique IP address
  4. Service Abstraction: Services provide stable endpoints for pods

Core Kubernetes Objects

Kubernetes uses a declarative API with various object types to represent your desired cluster state.

Fundamental Objects

Pods

  • Smallest deployable unit
  • One or more containers sharing network and storage
  • Ephemeral and replaceable
  • Scheduled on worker nodes

Services

  • Stable network endpoint for pods
  • Load balancing and service discovery
  • Types: ClusterIP, NodePort, LoadBalancer, ExternalName
  • Selector-based pod targeting

Volumes

  • Storage that the containers in a pod can mount
  • Shared between containers in a pod
  • Various types: emptyDir, hostPath, persistentVolumeClaim
  • Lifecycle tied to the pod, except for data backed by PersistentVolumes

Namespaces

  • Virtual cluster isolation
  • Resource organization and multi-tenancy
  • RBAC and resource quota boundaries
  • Built-in namespaces (default, kube-system, kube-public) plus user-defined ones

Workload Objects

Deployments

  • Declarative pod management
  • Rolling updates and rollbacks
  • Replica management
  • Self-healing capabilities

ReplicaSets

  • Ensures desired number of pod replicas
  • Usually managed by Deployments
  • Pod template and selector specification
  • Horizontal scaling capabilities

DaemonSets

  • Runs pods on every node (or subset)
  • System-level services and monitoring
  • Automatic scheduling on new nodes
  • Use cases: logging, monitoring, networking

StatefulSets

  • Manages stateful applications
  • Stable network identities and storage
  • Ordered deployment and scaling
  • Persistent volume claims per replica

Jobs and CronJobs

  • Batch and scheduled workloads
  • Run-to-completion semantics
  • Parallel job execution
  • Automated cleanup policies

Setting Up Your First Cluster

Let's get hands-on with Kubernetes by setting up different types of clusters for various use cases.

Local Development Setup

1. Using minikube

# Install minikube
curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
sudo install minikube-linux-amd64 /usr/local/bin/minikube

# Start minikube cluster
minikube start --driver=docker --memory=4096 --cpus=2

# Verify cluster status
kubectl cluster-info
kubectl get nodes

# Enable useful addons
minikube addons enable dashboard
minikube addons enable metrics-server
minikube addons enable ingress

# Access Kubernetes dashboard
minikube dashboard

2. Using kind (Kubernetes in Docker)

# Install kind
curl -Lo ./kind https://kind.sigs.k8s.io/dl/v0.20.0/kind-linux-amd64
chmod +x ./kind
sudo mv ./kind /usr/local/bin/kind

# Create cluster configuration
cat > kind-config.yaml << EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  kubeadmConfigPatches:
  - |
    kind: InitConfiguration
    nodeRegistration:
      kubeletExtraArgs:
        node-labels: "ingress-ready=true"
  extraPortMappings:
  - containerPort: 80
    hostPort: 80
    protocol: TCP
  - containerPort: 443
    hostPort: 443
    protocol: TCP
- role: worker
- role: worker
EOF

# Create cluster
kind create cluster --config kind-config.yaml --name dev-cluster

# Verify cluster
kubectl cluster-info --context kind-dev-cluster

3. Installing kubectl

# Download kubectl
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"

# Install kubectl
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl

# Verify installation
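kubectl version --client
# Client and server versions (needs a reachable cluster).
# The --short flag was removed in kubectl 1.28; the default output is now the short form.
kubectl version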

# Configure auto-completion
echo 'source <(kubectl completion bash)' >>~/.bashrc
echo 'alias k=kubectl' >>~/.bashrc
echo 'complete -o default -F __start_kubectl k' >>~/.bashrc
source ~/.bashrc

Production Cluster Setup

1. Using kubeadm (Self-managed)

# Prepare all nodes (master and workers)
cat > setup-node.sh << 'EOF'
#!/bin/bash

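# Disable swap
sudo swapoff -a
sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab

# Load kernel modules and enable IP forwarding (checked by kubeadm preflight)
sudo modprobe overlay
sudo modprobe br_netfilter
printf 'net.ipv4.ip_forward = 1\nnet.bridge.bridge-nf-call-iptables = 1\n' | sudo tee /etc/sysctl.d/k8s.conf
sudo sysctl --system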

# Install container runtime (containerd)
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
sudo apt-get update
sudo apt-get install -y containerd.io

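# Configure containerd to use the systemd cgroup driver (the kubelet default under kubeadm)
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml > /dev/null
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
sudo systemctl restart containerd
sudo systemctl enable containerd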

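# Install kubelet, kubeadm, kubectl from pkgs.k8s.io
# (the legacy apt.kubernetes.io repository has been shut down; v1.28 is an example minor version)
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.28/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.28/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list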
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl

# Enable kubelet
sudo systemctl enable kubelet
EOF

chmod +x setup-node.sh
./setup-node.sh

# Initialize master node
sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=<MASTER_IP>

# Configure kubectl for your user
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# Install pod network (Flannel)
kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml

# Join worker nodes (run on each worker)
sudo kubeadm join <MASTER_IP>:6443 --token <TOKEN> --discovery-token-ca-cert-hash sha256:<HASH>

2. Using Managed Services

# AWS EKS
eksctl create cluster \
  --name production-cluster \
  --version 1.28 \
  --region us-west-2 \
  --nodegroup-name workers \
  --node-type t3.medium \
  --nodes 3 \
  --nodes-min 1 \
  --nodes-max 4 \
  --managed

# Google GKE
gcloud container clusters create production-cluster \
  --zone us-central1-a \
  --machine-type e2-medium \
  --num-nodes 3 \
  --enable-autoscaling \
  --min-nodes 1 \
  --max-nodes 10

# Azure AKS
az aks create \
  --resource-group myResourceGroup \
  --name production-cluster \
  --node-count 3 \
  --enable-addons monitoring \
  --generate-ssh-keys

Working with Pods

Pods are the fundamental execution unit in Kubernetes. Let's explore how to create, manage, and troubleshoot pods effectively.

Basic Pod Operations

1. Creating Your First Pod

# simple-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-first-pod
  labels:
    app: web
    environment: development
spec:
  containers:
  - name: web-server
    image: nginx:1.21
    ports:
    - containerPort: 80
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"

# Create the pod
kubectl apply -f simple-pod.yaml

# Verify pod creation
kubectl get pods
kubectl get pod my-first-pod -o wide

# Check pod details
kubectl describe pod my-first-pod

# View pod logs
kubectl logs my-first-pod

# Execute commands in pod
kubectl exec -it my-first-pod -- /bin/bash

# Delete the pod
kubectl delete pod my-first-pod

2. Multi-Container Pod Example

# multi-container-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: multi-container-pod
spec:
  containers:
  - name: web-server
    image: nginx:1.21
    ports:
    - containerPort: 80
    volumeMounts:
    - name: shared-data
      mountPath: /usr/share/nginx/html
  
  - name: content-puller
    image: alpine/git
    command: ["/bin/sh"]
    args:
    - -c
    - |
      while true; do
        git clone https://github.com/user/content.git /tmp/content
        cp -r /tmp/content/* /data/
        rm -rf /tmp/content
        sleep 300
      done
    volumeMounts:
    - name: shared-data
      mountPath: /data
      
  volumes:
  - name: shared-data
    emptyDir: {}

3. Pod with Init Containers

# pod-with-init.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-init
spec:
  initContainers:
  - name: init-database
    image: postgres:13
    command: ['sh', '-c']
    args:
    - |
      until pg_isready -h database-service -p 5432; do
        echo "Waiting for database..."
        sleep 2
      done
      echo "Database is ready!"
      
  - name: init-migrations
    image: migrate/migrate
    command: ['migrate']
    args: ['-path', '/migrations', '-database', 'postgres://user:pass@database-service/db', 'up']
    
  containers:
  - name: app
    image: my-app:latest
    ports:
    - containerPort: 3000

Advanced Pod Configuration

1. Pod Security Context

# secure-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: secure-pod
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: app
    image: my-app:latest
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      runAsNonRoot: true
      capabilities:
        drop:
        - ALL
        add:
        - NET_BIND_SERVICE
    volumeMounts:
    - name: temp-storage
      mountPath: /tmp
  volumes:
  - name: temp-storage
    emptyDir: {}

2. Pod with Probes

# pod-with-probes.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-probes
spec:
  containers:
  - name: app
    image: my-app:latest
    ports:
    - containerPort: 8080
    
    # Startup probe - for slow-starting containers
    startupProbe:
      httpGet:
        path: /startup
        port: 8080
      failureThreshold: 30
      periodSeconds: 10
      
    # Readiness probe - determines if pod is ready for traffic
    readinessProbe:
      httpGet:
        path: /ready
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 5
      
    # Liveness probe - determines if pod is healthy
    livenessProbe:
      httpGet:
        path: /health
        port: 8080
      initialDelaySeconds: 15
      periodSeconds: 20
      failureThreshold: 3
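
The probe settings above translate into concrete time budgets. As a rough sketch that ignores probe timeouts and scheduling jitter, the startup probe allows up to failureThreshold × periodSeconds for the application to come up, and the liveness probe needs failureThreshold consecutive failures before the kubelet restarts the container. The function names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers: estimate probe time budgets from pod-with-probes.yaml.

# Longest time (seconds) the container may take to start
# before the startup probe gives up and the container is restarted.
startup_budget() {
  failure_threshold=$1
  period_seconds=$2
  echo $(( failure_threshold * period_seconds ))
}

# Approximate time (seconds) of continuous failure before the
# liveness probe triggers a restart.
liveness_window() {
  failure_threshold=$1
  period_seconds=$2
  echo $(( failure_threshold * period_seconds ))
}

echo "startup budget: $(startup_budget 30 10)s"    # 30 * 10 = 300s
echo "liveness window: $(liveness_window 3 20)s"   # 3 * 20 = 60s
```

With these values a slow application gets about five minutes to start, while a hung application is restarted after roughly one minute.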

Deployments and ReplicaSets

Deployments provide declarative updates for pods and ReplicaSets, making them the preferred way to manage stateless applications.

Basic Deployment Operations

1. Creating a Deployment

# nginx-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.21
        ports:
        - containerPort: 80
        resources:
          requests:
            memory: "64Mi"
            cpu: "250m"
          limits:
            memory: "128Mi"
            cpu: "500m"

# Deploy the application
kubectl apply -f nginx-deployment.yaml

# Check deployment status
kubectl get deployments
kubectl get replicasets
kubectl get pods

# Scale the deployment
kubectl scale deployment nginx-deployment --replicas=5

# Check rollout status
kubectl rollout status deployment nginx-deployment

2. Rolling Updates and Rollbacks

# Update the deployment image
kubectl set image deployment/nginx-deployment nginx=nginx:1.22

# Monitor rollout
kubectl rollout status deployment/nginx-deployment

# Check rollout history
kubectl rollout history deployment/nginx-deployment

# Rollback to previous version
kubectl rollout undo deployment/nginx-deployment

# Rollback to specific revision
kubectl rollout undo deployment/nginx-deployment --to-revision=2

# Pause and resume rollout
kubectl rollout pause deployment/nginx-deployment
kubectl rollout resume deployment/nginx-deployment
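
nginx-deployment.yaml does not set a strategy, so the rollout uses the Deployment defaults of maxSurge: 25% and maxUnavailable: 25%. Kubernetes rounds the surge up and the unavailable count down when converting percentages to pod counts. The sketch below works out those numbers; the function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers: convert rolling-update percentages into pod counts.

# maxSurge is rounded up: ceil(replicas * percent / 100)
max_surge() {
  echo $(( ($1 * $2 + 99) / 100 ))
}

# maxUnavailable is rounded down: floor(replicas * percent / 100)
max_unavailable() {
  echo $(( $1 * $2 / 100 ))
}

for replicas in 3 5; do
  echo "replicas=$replicas surge=$(max_surge "$replicas" 25) unavailable=$(max_unavailable "$replicas" 25)"
done
# replicas=3 surge=1 unavailable=0
# replicas=5 surge=2 unavailable=1
```

With 3 replicas the rollout therefore runs at most 4 pods at once and never drops below 3 available pods.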

3. Advanced Deployment Strategies

# blue-green-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-blue
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
      version: blue
  template:
    metadata:
      labels:
        app: my-app
        version: blue
    spec:
      containers:
      - name: app
        image: my-app:v1.0
        ports:
        - containerPort: 8080
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-green
spec:
  replicas: 0  # Initially no replicas
  selector:
    matchLabels:
      app: my-app
      version: green
  template:
    metadata:
      labels:
        app: my-app
        version: green
    spec:
      containers:
      - name: app
        image: my-app:v2.0
        ports:
        - containerPort: 8080
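
The two Deployments above only form a blue-green setup once a Service decides which color receives traffic. A minimal sketch follows, assuming a Service named my-app on port 80 (neither appears in the manifests above) whose selector includes the version label. The script only generates the manifests locally; the kubectl steps are listed as comments because they need a running cluster.

```shell
#!/bin/sh
# Write a Service that initially routes to the blue Deployment.
# The Service name "my-app" and port 80 are assumptions for illustration.
cat > app-service.yaml << 'EOF'
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app
    version: blue
  ports:
  - port: 80
    targetPort: 8080
EOF

# Produce the cut-over manifest by pointing the selector at green.
sed 's/version: blue/version: green/' app-service.yaml > app-service-green.yaml

# Typical sequence against a real cluster (not executed here):
#   kubectl apply -f app-service.yaml
#   kubectl scale deployment app-green --replicas=3
#   kubectl rollout status deployment/app-green
#   kubectl apply -f app-service-green.yaml   # traffic now goes to green
#   kubectl scale deployment app-blue --replicas=0
```

Because the switch is a single selector change, rolling back means re-applying app-service.yaml while the blue pods are still running.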

4. Canary Deployment

# canary-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-stable
spec:
  replicas: 9  # ~90% of traffic, assuming one Service selects app: my-app across both tracks
  selector:
    matchLabels:
      app: my-app
      track: stable
  template:
    metadata:
      labels:
        app: my-app
        track: stable
    spec:
      containers:
      - name: app
        image: my-app:v1.0
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-canary
spec:
  replicas: 1  # ~10% of traffic, under the same assumption
  selector:
    matchLabels:
      app: my-app
      track: canary
  template:
    metadata:
      labels:
        app: my-app
        track: canary
    spec:
      containers:
      - name: app
        image: my-app:v2.0
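
The traffic percentages in canary-deployment.yaml hold only if a single Service selects app: my-app without the track label, so pods from both Deployments become endpoints, and if requests spread roughly evenly across pods. Under those assumptions the canary share follows from the replica counts. The helper below is a hypothetical illustration using integer arithmetic.

```shell
#!/bin/sh
# Hypothetical helper: approximate canary traffic share (percent, truncated)
# from the replica counts of the stable and canary Deployments.
canary_share() {
  stable=$1
  canary=$2
  echo $(( canary * 100 / (stable + canary) ))
}

echo "9 stable + 1 canary: $(canary_share 9 1)%"   # 10%
echo "8 stable + 2 canary: $(canary_share 8 2)%"   # 20%
```

Finer-grained splits than replica ratios allow usually need a service mesh or an ingress controller with weighted routing.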

Services and Networking

Services provide stable network endpoints for your pods and enable communication between different parts of your application.

Service Types and Use Cases

1. ClusterIP Service (Default)

# clusterip-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  type: ClusterIP
  selector:
    app: nginx
  ports:
  - port: 80
    targetPort: 80
    protocol: TCP

2. NodePort Service

# nodeport-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-nodeport
spec:
  type: NodePort
  selector:
    app: nginx
  ports:
  - port: 80
    targetPort: 80
    nodePort: 30080
    protocol: TCP
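
The nodePort value above must fall inside the cluster's NodePort range, which is 30000-32767 unless the API server was started with a different --service-node-port-range. A small sketch of that check, with a hypothetical function name and the default range assumed:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the port lies in the default NodePort range.
valid_node_port() {
  [ "$1" -ge 30000 ] && [ "$1" -le 32767 ]
}

for port in 30080 8080; do
  if valid_node_port "$port"; then
    echo "$port: inside the default NodePort range"
  else
    echo "$port: outside the default NodePort range"
  fi
done
```

If nodePort is omitted from the manifest, Kubernetes picks a free port from the range automatically.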

3. LoadBalancer Service

# loadbalancer-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-loadbalancer
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
spec:
  type: LoadBalancer
  selector:
    app: nginx
  ports:
  - port: 80
    targetPort: 80
    protocol: TCP

4. Headless Service

# headless-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-headless
spec:
  clusterIP: None  # Makes it headless
  selector:
    app: nginx
  ports:
  - port: 80
    targetPort: 80

Ingress and Advanced Networking

1. Basic Ingress Configuration

# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
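  annotations:
    # No rewrite-target annotation: a blanket rewrite to / would also turn /api/... requests into /
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: nginx  # assumes the ingress-nginx controller is installed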
  tls:
  - hosts:
    - myapp.example.com
    secretName: myapp-tls
  rules:
  - host: myapp.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: nginx-service
            port:
              number: 80
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 3000

2. Network Policies

# network-policy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: web-netpol
spec:
  podSelector:
    matchLabels:
      app: web
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
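    # Separate list entries are ORed: frontend pods in this namespace, or any pod in the production namespace
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: production  # label set automatically on every namespace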
    ports:
    - protocol: TCP
      port: 80
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: database
    ports:
    - protocol: TCP
      port: 5432
  - to: []  # Allow DNS lookups to any destination
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53

3. Service Mesh with Istio

# virtual-service.yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:  # required: the destination hosts these routing rules apply to
  - reviews
  http:
  - match:
    - headers:
        end-user:
          exact: jason
    route:
    - destination:
        host: reviews
        subset: v2
  - route:
    - destination:
        host: reviews
        subset: v1
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2

ConfigMaps and Secrets

Kubernetes provides ConfigMaps and Secrets to manage configuration data and sensitive information separately from your application code.

Working with ConfigMaps

1. Creating ConfigMaps

# app-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  database_url: "postgresql://localhost:5432/mydb"
  log_level: "info"
  max_connections: "100"
  feature_flags: |
    {
      "new_ui": true,
      "beta_features": false,
      "analytics": true
    }
  nginx.conf: |
    server {
        listen 80;
        server_name localhost;
        location / {
            root /usr/share/nginx/html;
            index index.html;
        }
    }

# Create ConfigMap from command line
kubectl create configmap app-config \
  --from-literal=database_url=postgresql://localhost:5432/mydb \
  --from-literal=log_level=info \
  --from-file=nginx.conf=./nginx.conf

# Create from directory
kubectl create configmap web-config --from-file=./config-dir/

2. Using ConfigMaps in Pods

# pod-with-configmap.yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-pod
spec:
  containers:
  - name: app
    image: my-app:latest
    
    # Environment variables from ConfigMap
    env:
    - name: DATABASE_URL
      valueFrom:
        configMapKeyRef:
          name: app-config
          key: database_url
    - name: LOG_LEVEL
      valueFrom:
        configMapKeyRef:
          name: app-config
          key: log_level
    
    # All keys as environment variables
    envFrom:
    - configMapRef:
        name: app-config
    
    # Mount as volume
    volumeMounts:
    - name: config-volume
      mountPath: /etc/config
    - name: nginx-config
      mountPath: /etc/nginx/nginx.conf
      subPath: nginx.conf
      
  volumes:
  - name: config-volume
    configMap:
      name: app-config
  - name: nginx-config
    configMap:
      name: app-config

Managing Secrets

1. Creating Secrets

# database-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: database-secret
type: Opaque
data:
  username: YWRtaW4=  # base64-encoded 'admin'
  password: MWYyZDFlMmU2N2Rm  # base64-encoded '1f2d1e2e67df' (encoding, not encryption)

# Alternatively, create the secret from the command line
kubectl create secret generic database-secret \
  --from-literal=username=admin \
  --from-literal=password=secretpassword

# Create TLS secret
kubectl create secret tls tls-secret \
  --cert=path/to/tls.cert \
  --key=path/to/tls.key

# Create Docker registry secret
kubectl create secret docker-registry regcred \
  --docker-server=my-registry.com \
  --docker-username=myuser \
  --docker-password=mypassword \
  --docker-email=<EMAIL>
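
The data values in database-secret.yaml are base64-encoded, which is an encoding rather than encryption: anyone who can read the Secret can decode it. A minimal sketch of producing and checking such values follows. The function names are hypothetical, and --decode assumes GNU coreutils or macOS base64.

```shell
#!/bin/sh
# Hypothetical helpers for the values in database-secret.yaml.

# printf avoids the trailing newline that echo would add to the encoded value.
encode_secret() {
  printf '%s' "$1" | base64
}

decode_secret() {
  printf '%s' "$1" | base64 --decode
}

encode_secret 'admin'               # YWRtaW4=
echo
decode_secret 'MWYyZDFlMmU2N2Rm'    # 1f2d1e2e67df
echo
```

kubectl create secret generic does this encoding for you, which is why the command-line form takes plain-text literals.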

2. Using Secrets in Pods

# pod-with-secrets.yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-pod
spec:
  containers:
  - name: app
    image: my-app:latest
    
    # Environment variables from Secret
    env:
    - name: DB_USERNAME
      valueFrom:
        secretKeyRef:
          name: database-secret
          key: username
    - name: DB_PASSWORD
      valueFrom:
        secretKeyRef:
          name: database-secret
          key: password
    
    # Mount secret as volume
    volumeMounts:
    - name: secret-volume
      mountPath: /etc/secrets
      readOnly: true
      
  # Use secret for pulling images
  imagePullSecrets:
  - name: regcred
  
  volumes:
  - name: secret-volume
    secret:
      secretName: database-secret
      defaultMode: 0400  # Read-only for owner

Storage and Persistent Volumes

Kubernetes provides several storage options for stateful applications, from temporary storage to persistent volumes that survive pod restarts.

Storage Types

1. Ephemeral Storage

# ephemeral-storage.yaml
apiVersion: v1
kind: Pod
metadata:
  name: ephemeral-storage-pod
spec:
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - name: cache-volume
      mountPath: /tmp/cache
    - name: config-volume
      mountPath: /etc/config
      
  volumes:
  # Temporary storage (deleted with pod)
  - name: cache-volume
    emptyDir:
      sizeLimit: 1Gi
      
  # Host path (not recommended for production)
  - name: config-volume
    hostPath:
      path: /host/config
      type: Directory

2. Persistent Volumes and Claims

# persistent-volume.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mysql-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: fast-ssd
  hostPath:
    path: /data/mysql
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: fast-ssd

3. Dynamic Storage Provisioning

# storage-class.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com  # AWS EBS CSI driver; the in-tree kubernetes.io/aws-ebs provisioner was removed in Kubernetes 1.27
parameters:
  type: gp3
  iops: "3000"
  throughput: "125"
reclaimPolicy: Delete
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 5Gi

StatefulSets for Stateful Applications

# mysql-statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql-headless
  replicas: 3
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:8.0
        env:
        - name: MYSQL_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-secret
              key: root-password
        ports:
        - containerPort: 3306
        volumeMounts:
        - name: mysql-storage
          mountPath: /var/lib/mysql
        - name: mysql-config
          mountPath: /etc/mysql/conf.d
      volumes:
      - name: mysql-config
        configMap:
          name: mysql-config
  volumeClaimTemplates:
  - metadata:
      name: mysql-storage
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: fast-ssd
      resources:
        requests:
          storage: 10Gi
---
apiVersion: v1
kind: Service
metadata:
  name: mysql-headless
spec:
  clusterIP: None
  selector:
    app: mysql
  ports:
  - port: 3306
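
The stable network identity of a StatefulSet comes from predictable pod names (mysql-0, mysql-1, ...) combined with the headless Service. The sketch below lists the DNS names each replica would get, assuming the default namespace and the default cluster domain cluster.local; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the per-pod DNS names of a StatefulSet.
# Arguments: statefulset name, headless service name, replicas, [namespace]
statefulset_dns_names() {
  name=$1
  service=$2
  replicas=$3
  namespace=${4:-default}
  i=0
  while [ "$i" -lt "$replicas" ]; do
    echo "${name}-${i}.${service}.${namespace}.svc.cluster.local"
    i=$(( i + 1 ))
  done
}

statefulset_dns_names mysql mysql-headless 3
# mysql-0.mysql-headless.default.svc.cluster.local
# mysql-1.mysql-headless.default.svc.cluster.local
# mysql-2.mysql-headless.default.svc.cluster.local
```

Clients that must reach one specific replica, such as a replication primary, can use these names instead of the Service name.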

Advanced Scheduling and Resource Management

Kubernetes provides sophisticated scheduling capabilities to optimize resource utilization and meet application requirements.

Resource Requests and Limits

1. Resource Management

# resource-management.yaml
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
spec:
  containers:
  - name: app
    image: my-app:latest
    resources:
      requests:
        memory: "256Mi"
        cpu: "250m"
        ephemeral-storage: "1Gi"
      limits:
        memory: "512Mi"
        cpu: "500m"
        ephemeral-storage: "2Gi"
    
    # QoS class will be "Burstable"

2. Quality of Service Classes

# Guaranteed QoS
apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-pod
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        memory: "200Mi"
        cpu: "200m"
      limits:
        memory: "200Mi"  # Same as requests
        cpu: "200m"      # Same as requests
---
# BestEffort QoS
apiVersion: v1
kind: Pod
metadata:
  name: besteffort-pod
spec:
  containers:
  - name: app
    image: nginx
    # No resource requests or limits
---
# Burstable QoS
apiVersion: v1
kind: Pod
metadata:
  name: burstable-pod
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        memory: "100Mi"
      limits:
        memory: "200Mi"  # Different from requests
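
The three manifests above follow a simple rule that can be written down as a decision procedure. The sketch below handles a single container and compares quantities as plain strings, so "200Mi" and "0.2Gi" would not be treated as equal. It is an illustration of the rule, not the kubelet's implementation, and the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: classify a single-container pod's QoS class.
# Arguments: cpu request, cpu limit, memory request, memory limit ("" = unset)
qos_class() {
  cpu_req=$1
  cpu_lim=$2
  mem_req=$3
  mem_lim=$4

  # Nothing set at all: BestEffort.
  if [ -z "$cpu_req$cpu_lim$mem_req$mem_lim" ]; then
    echo BestEffort
    return
  fi

  # A request that is left unset defaults to the limit.
  if [ -z "$cpu_req" ]; then cpu_req=$cpu_lim; fi
  if [ -z "$mem_req" ]; then mem_req=$mem_lim; fi

  # Guaranteed needs CPU and memory limits that equal the requests.
  if [ -n "$cpu_lim" ] && [ -n "$mem_lim" ] &&
     [ "$cpu_req" = "$cpu_lim" ] && [ "$mem_req" = "$mem_lim" ]; then
    echo Guaranteed
  else
    echo Burstable
  fi
}

qos_class 200m 200m 200Mi 200Mi   # Guaranteed (guaranteed-pod)
qos_class "" "" "" ""             # BestEffort (besteffort-pod)
qos_class "" "" 100Mi 200Mi       # Burstable  (burstable-pod)
```

The class matters under node memory pressure: BestEffort pods are evicted first and Guaranteed pods last.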

Advanced Scheduling

1. Node Affinity and Anti-Affinity

# node-affinity.yaml
apiVersion: v1
kind: Pod
metadata:
  name: node-affinity-pod
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/arch
            operator: In
            values:
            - amd64
          - key: node-type
            operator: In
            values:
            - compute-optimized
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        preference:
          matchExpressions:
          - key: zone
            operator: In
            values:
            - us-west-2a
  containers:
  - name: app
    image: my-app:latest

2. Pod Affinity and Anti-Affinity

# pod-affinity.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-affinity-demo
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - database
        topologyKey: kubernetes.io/hostname
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: app
              operator: In
              values:
              - web
          topologyKey: kubernetes.io/hostname
  containers:
  - name: app
    image: my-app:latest

3. Taints and Tolerations

# Taint a node
kubectl taint nodes node1 key1=value1:NoSchedule
kubectl taint nodes node1 dedicated=gpu:NoSchedule

# Remove taint
kubectl taint nodes node1 key1=value1:NoSchedule-

# tolerations.yaml
apiVersion: v1
kind: Pod
metadata:
  name: toleration-pod
spec:
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"
  - key: "experimental"
    operator: "Exists"
    effect: "NoExecute"
    tolerationSeconds: 3600
  containers:
  - name: app
    image: gpu-app:latest
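
A pod can be scheduled onto a tainted node only if it tolerates every taint on that node. The matching rule for a single taint and toleration can be sketched as below; this is a simplified illustration with a hypothetical function name, not the scheduler's code.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a toleration matches a taint.
# Arguments: taint key, taint value, taint effect,
#            toleration key, operator, value, effect ("" = unset)
tolerates() {
  taint_key=$1
  taint_value=$2
  taint_effect=$3
  tol_key=$4
  tol_operator=$5
  tol_value=$6
  tol_effect=$7

  # An unset toleration effect matches every effect.
  if [ -n "$tol_effect" ] && [ "$tol_effect" != "$taint_effect" ]; then
    return 1
  fi
  # An unset key with operator Exists matches every taint key.
  if [ -z "$tol_key" ] && [ "$tol_operator" = "Exists" ]; then
    return 0
  fi
  [ "$tol_key" = "$taint_key" ] || return 1
  case $tol_operator in
    Exists) return 0 ;;
    Equal)  [ "$tol_value" = "$taint_value" ] ;;
    *)      return 1 ;;
  esac
}

# The first toleration in tolerations.yaml against both taints placed on node1:
if tolerates dedicated gpu NoSchedule dedicated Equal gpu NoSchedule; then
  echo "dedicated=gpu:NoSchedule is tolerated"
fi
if ! tolerates key1 value1 NoSchedule dedicated Equal gpu NoSchedule; then
  echo "key1=value1:NoSchedule is not tolerated"
fi
```

This is why toleration-pod can land on node1 only after the key1=value1 taint has been removed: one unmatched NoSchedule taint is enough to keep the pod off the node.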

Horizontal Pod Autoscaler

# hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
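  # Custom per-pod metric: needs a metrics adapter (e.g. prometheus-adapter) that serves the custom metrics API
  - type: Pods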
    pods:
      metric:
        name: requests_per_second
      target:
        type: AverageValue
        averageValue: "100"
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 50
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
      - type: Pods
        value: 4
        periodSeconds: 60
      selectPolicy: Max
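
For each metric the autoscaler computes desiredReplicas = ceil(currentReplicas × currentValue / targetValue), takes the largest result across metrics, and clamps it to minReplicas and maxReplicas. The sketch below applies that formula to one metric using integer arithmetic. It ignores the controller's tolerance band and the behavior policies above, and the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: HPA replica calculation for a single metric.
# Arguments: current replicas, current metric value, target value, min, max
desired_replicas() {
  current=$1
  metric=$2
  target=$3
  min=$4
  max=$5
  # Integer form of ceil(current * metric / target)
  desired=$(( (current * metric + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

desired_replicas 4 90 70 2 10    # 6  (CPU at 90% against a 70% target)
desired_replicas 4 35 70 2 10    # 2  (load halves, scale down)
desired_replicas 8 140 70 2 10   # 10 (16 wanted, capped by maxReplicas)
```

The behavior section then limits how fast the replica count may move toward this value, for example removing at most 50% of pods per minute when scaling down.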

Monitoring and Logging

Effective monitoring and logging are crucial for maintaining healthy Kubernetes clusters and applications.

Prometheus and Grafana Setup

1. Prometheus Configuration

# prometheus.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      containers:
      - name: prometheus
        image: prom/prometheus:latest
        ports:
        - containerPort: 9090
        volumeMounts:
        - name: prometheus-config
          mountPath: /etc/prometheus
        - name: prometheus-storage
          mountPath: /prometheus
        args:
        - --config.file=/etc/prometheus/prometheus.yml
        - --storage.tsdb.path=/prometheus
        - --web.console.libraries=/usr/share/prometheus/console_libraries
        - --web.console.templates=/usr/share/prometheus/consoles
        - --storage.tsdb.retention.time=15d
        - --web.enable-lifecycle
      volumes:
      - name: prometheus-config
        configMap:
          name: prometheus-config
      - name: prometheus-storage
        persistentVolumeClaim:
          claimName: prometheus-pvc
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      evaluation_interval: 15s
    
    rule_files:
      - "rules/*.yml"
    
    scrape_configs:
    - job_name: 'kubernetes-apiservers'
      kubernetes_sd_configs:
      - role: endpoints
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        action: keep
        regex: default;kubernetes;https
    
    - job_name: 'kubernetes-nodes'
      kubernetes_sd_configs:
      - role: node
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
    
    - job_name: 'kubernetes-pods'
      kubernetes_sd_configs:
      - role: pod
      relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: kubernetes_pod_name
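
The `__address__` rewrite in the `kubernetes-pods` job is the least obvious rule in this config. The sketch below reproduces it in plain JavaScript under two assumptions taken from Prometheus relabeling semantics: source label values are joined with `;`, and the regex is anchored at both ends. The function name `rewriteAddress` and the sample addresses are hypothetical.

```javascript
// Prometheus anchors relabel regexes at both ends, so the pattern is wrapped in ^(?:...)$.
const pattern = new RegExp('^(?:' + '([^:]+)(?::\\d+)?;(\\d+)' + ')$');

function rewriteAddress(address, portAnnotation) {
  // source_labels values are joined with ';' (the default separator) before matching.
  const joined = `${address};${portAnnotation}`;
  const match = pattern.exec(joined);
  // When the regex does not match, a replace action leaves the target label unchanged.
  if (!match) return address;
  // replacement: $1:$2 keeps the host and swaps in the annotated port.
  return `${match[1]}:${match[2]}`;
}

console.log(rewriteAddress('10.1.2.3:8080', '9102')); // 10.1.2.3:9102
console.log(rewriteAddress('10.1.2.3', '9102'));      // 10.1.2.3:9102
console.log(rewriteAddress('10.1.2.3:8080', ''));     // 10.1.2.3:8080 (no annotation, unchanged)
```

In other words, a pod opts in with `prometheus.io/scrape: "true"` and can redirect the scrape to another port with `prometheus.io/port`, without changing its container port list.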

2. Application Metrics Integration

// Node.js application with Prometheus metrics
const express = require('express');
const promClient = require('prom-client');

const app = express();

// Create a Registry to register the metrics
const register = new promClient.Registry();

// Add default metrics
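promClient.collectDefaultMetrics({
  // Static labels attached to every default metric
  // (recent prom-client releases no longer accept a `timeout` option)
  labels: { app: 'my-nodejs-app' },
  gcDurationBuckets: [0.001, 0.01, 0.1, 1, 2, 5],
  register
});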

// Custom metrics
const httpRequestsTotal = new promClient.Counter({
  name: 'http_requests_total',
  help: 'Total number of HTTP requests',
  labelNames: ['method', 'route', 'status_code'],
  registers: [register]
});

const httpRequestDuration = new promClient.Histogram({
  name: 'http_request_duration_seconds',
  help: 'Duration of HTTP requests in seconds',
  labelNames: ['method', 'route', 'status_code'],
  buckets: [0.1, 0.3, 0.5, 0.7, 1, 3, 5, 7, 10],
  registers: [register]
});

// Middleware to collect metrics
app.use((req, res, next) => {
  const start = Date.now();
  
  res.on('finish', () => {
    const duration = (Date.now() - start) / 1000;
    const route = req.route ? req.route.path : req.path;
    
    httpRequestsTotal.inc({
      method: req.method,
      route,
      status_code: res.statusCode
    });
    
    httpRequestDuration.observe({
      method: req.method,
      route,
      status_code: res.statusCode
    }, duration);
  });
  
  next();
});

// Metrics endpoint
app.get('/metrics', async (req, res) => {
  res.set('Content-Type', register.contentType);
  res.end(await register.metrics());
});

// Health check endpoint
app.get('/health', (req, res) => {
  res.json({ status: 'healthy', timestamp: new Date().toISOString() });
});

app.listen(3000, () => {
  console.log('Server running on port 3000');
});
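
The `buckets` array passed to the histogram is easier to tune once you see how observations land in it. Prometheus histogram buckets are cumulative: each observation is counted in every bucket whose upper bound (`le`) it does not exceed, plus the implicit `+Inf` bucket. The following dependency-free sketch mimics that bookkeeping with hypothetical request durations; it is an illustration, not prom-client's internal implementation.

```javascript
// Same upper bounds as the http_request_duration_seconds histogram above.
const buckets = [0.1, 0.3, 0.5, 0.7, 1, 3, 5, 7, 10];

function observeAll(durations) {
  // One counter per upper bound, plus the implicit +Inf bucket.
  const counts = new Map([...buckets, Infinity].map((le) => [le, 0]));
  for (const d of durations) {
    for (const le of counts.keys()) {
      // Buckets are cumulative: an observation counts toward every bound it fits under.
      if (d <= le) counts.set(le, counts.get(le) + 1);
    }
  }
  return counts;
}

// Hypothetical request durations in seconds.
const counts = observeAll([0.05, 0.42, 0.42, 2.5, 12]);

for (const [le, count] of counts) {
  console.log(`le=${le === Infinity ? '+Inf' : le} count=${count}`);
}
```

Here the 12-second request is visible only in `+Inf`, which suggests the highest finite bound should sit above the slowest latency you still want to measure with any resolution.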

Centralized Logging with Elasticsearch and Fluent Bit

1. Elasticsearch Deployment

# elasticsearch.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch
spec:
  serviceName: elasticsearch
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      containers:
      - name: elasticsearch
        image: docker.elastic.co/elasticsearch/elasticsearch:8.8.0
        env:
        - name: cluster.name
          value: "elasticsearch"
        - name: node.name
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: discovery.seed_hosts
          value: "elasticsearch-0.elasticsearch,elasticsearch-1.elasticsearch,elasticsearch-2.elasticsearch"
        - name: cluster.initial_master_nodes
          value: "elasticsearch-0,elasticsearch-1,elasticsearch-2"
        - name: ES_JAVA_OPTS
          value: "-Xms1g -Xmx1g"
        ports:
        - containerPort: 9200
        - containerPort: 9300
        volumeMounts:
        - name: elasticsearch-data
          mountPath: /usr/share/elasticsearch/data
  volumeClaimTemplates:
  - metadata:
      name: elasticsearch-data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: fast-ssd
      resources:
        requests:
          storage: 10Gi

2. Fluent Bit DaemonSet

# fluent-bit.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: logging
spec:
  selector:
    matchLabels:
      app: fluent-bit
  template:
    metadata:
      labels:
        app: fluent-bit
    spec:
      serviceAccountName: fluent-bit
      containers:
      - name: fluent-bit
        image: fluent/fluent-bit:2.1.4
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: fluent-bit-config
          mountPath: /fluent-bit/etc/
        env:
        - name: FLUENT_ELASTICSEARCH_HOST
          value: "elasticsearch"
        - name: FLUENT_ELASTICSEARCH_PORT
          value: "9200"
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: fluent-bit-config
        configMap:
          name: fluent-bit-config
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: logging
data:
  fluent-bit.conf: |
    [SERVICE]
        Flush         1
        Log_Level     info
        Daemon        off
        Parsers_File  parsers.conf
        HTTP_Server   On
        HTTP_Listen   0.0.0.0
        HTTP_Port     2020

    [INPUT]
        Name              tail
        Path              /var/log/containers/*.log
        Parser            docker
        Tag               kube.*
        Refresh_Interval  5
        Mem_Buf_Limit     50MB
        Skip_Long_Lines   On

    [FILTER]
        Name                kubernetes
        Match               kube.*
        Kube_URL            https://kubernetes.default.svc:443
        Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
        Kube_Tag_Prefix     kube.var.log.containers.
        Merge_Log           On
        Keep_Log            Off
        K8S-Logging.Parser  On
        K8S-Logging.Exclude On

    [OUTPUT]
        Name            es
        Match           kube.*
        Host            ${FLUENT_ELASTICSEARCH_HOST}
        Port            ${FLUENT_ELASTICSEARCH_PORT}
        Index           kubernetes_cluster
        Suppress_Type_Name On
        Logstash_Format On
        Replace_Dots    On
        Retry_Limit     False

  parsers.conf: |
    [PARSER]
        Name   docker
        Format json
        Time_Key time
        Time_Format %Y-%m-%dT%H:%M:%S.%L
        Time_Keep   On

Security Best Practices

Security should be built into every layer of your Kubernetes deployment. Let's explore comprehensive security practices.

RBAC (Role-Based Access Control)

1. Service Accounts and Roles

# rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-service-account
  namespace: production
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: production
  name: app-role
rules:
- apiGroups: [""]
  resources: ["pods", "configmaps", "secrets"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["get", "list", "watch", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: app-role-binding
  namespace: production
subjects:
- kind: ServiceAccount
  name: app-service-account
  namespace: production
roleRef:
  kind: Role
  name: app-role
  apiGroup: rbac.authorization.k8s.io
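
RBAC rules are purely additive: a request is allowed if any rule bound to the subject matches its API group, resource, and verb, and there is no way to express a deny. The following sketch evaluates the `app-role` rules above under that model. The helper `isAllowed` is hypothetical and deliberately simplified (it ignores `resourceNames`, subresources, and non-resource URLs).

```javascript
// Rules copied from app-role above.
const appRoleRules = [
  { apiGroups: [''], resources: ['pods', 'configmaps', 'secrets'], verbs: ['get', 'list', 'watch'] },
  { apiGroups: ['apps'], resources: ['deployments'], verbs: ['get', 'list', 'watch', 'update', 'patch'] },
];

// '*' is the RBAC wildcard for groups, resources, and verbs.
const matches = (allowed, value) => allowed.includes('*') || allowed.includes(value);

function isAllowed(rules, { apiGroup, resource, verb }) {
  // Any single matching rule grants the request; there are no deny rules.
  return rules.some((rule) =>
    matches(rule.apiGroups, apiGroup) &&
    matches(rule.resources, resource) &&
    matches(rule.verbs, verb));
}

console.log(isAllowed(appRoleRules, { apiGroup: '', resource: 'pods', verb: 'get' }));            // true
console.log(isAllowed(appRoleRules, { apiGroup: '', resource: 'pods', verb: 'delete' }));         // false
console.log(isAllowed(appRoleRules, { apiGroup: 'apps', resource: 'deployments', verb: 'patch' })); // true
```

Against a real cluster, `kubectl auth can-i` with `--as=system:serviceaccount:production:app-service-account` answers the same question using the API server's own authorizer.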

2. ClusterRole for Cross-Namespace Access

# cluster-rbac.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: monitoring-cluster-role
rules:
- apiGroups: [""]
  resources: ["nodes", "nodes/proxy", "services", "endpoints", "pods"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["networking.k8s.io"]
  resources: ["ingresses"]
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: monitoring-cluster-role-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: monitoring-cluster-role
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: monitoring

Pod Security Standards

1. Pod Security Policy (Removed) / Pod Security Standards

# pod-security-policy.yaml
# PodSecurityPolicy was deprecated in v1.21 and removed in v1.25; apply this only on older clusters.
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted-psp
spec:
  privileged: false
  allowPrivilegeEscalation: false
  requiredDropCapabilities:
    - ALL
  volumes:
    - 'configMap'
    - 'emptyDir'
    - 'projected'
    - 'secret'
    - 'downwardAPI'
    - 'persistentVolumeClaim'
  runAsUser:
    rule: 'MustRunAsNonRoot'
  seLinux:
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'RunAsAny'
  fsGroup:
    rule: 'RunAsAny'

2. Security Contexts and Admission Controllers

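# secure-pod.yaml
# Pod Security Admission labels are set on the Namespace, not on individual Pods
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
---
apiVersion: v1
kind: Pod
metadata:
  name: secure-pod
  namespace: production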
spec:
  serviceAccountName: app-service-account
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    runAsGroup: 1000
    fsGroup: 1000
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: app
    image: my-secure-app:latest
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      runAsNonRoot: true
      runAsUser: 1000
      capabilities:
        drop:
        - ALL
        add:
        - NET_BIND_SERVICE
    volumeMounts:
    - name: tmp-volume
      mountPath: /tmp
    - name: cache-volume
      mountPath: /app/cache
  volumes:
  - name: tmp-volume
    emptyDir: {}
  - name: cache-volume
    emptyDir: {}
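
To show what the `restricted` Pod Security Standard looks for in a manifest like this one, the sketch below checks a Pod spec against a handful of its rules (privilege escalation, non-root, dropped capabilities, seccomp). It is a partial, illustrative helper with a hypothetical name, `restrictedViolations`; the real admission controller checks more fields (for example volume types and host namespaces) and should be treated as the source of truth.

```javascript
// Partial check of a Pod spec against a few "restricted" profile rules.
const ALLOWED_SECCOMP = ['RuntimeDefault', 'Localhost'];

function restrictedViolations(podSpec) {
  const violations = [];
  const podCtx = podSpec.securityContext || {};
  for (const container of podSpec.containers) {
    const ctx = container.securityContext || {};
    const caps = ctx.capabilities || {};
    const report = (msg) => violations.push(`${container.name}: ${msg}`);

    if (ctx.allowPrivilegeEscalation !== false) report('allowPrivilegeEscalation must be false');
    // Container-level settings take precedence over pod-level ones.
    if ((ctx.runAsNonRoot ?? podCtx.runAsNonRoot) !== true) report('runAsNonRoot must be true');
    if (!(caps.drop || []).includes('ALL')) report('capabilities.drop must include ALL');
    const extra = (caps.add || []).filter((cap) => cap !== 'NET_BIND_SERVICE');
    if (extra.length > 0) report(`capabilities.add may only contain NET_BIND_SERVICE, found ${extra.join(', ')}`);
    const seccomp = (ctx.seccompProfile || podCtx.seccompProfile || {}).type;
    if (!ALLOWED_SECCOMP.includes(seccomp)) report('seccompProfile.type must be RuntimeDefault or Localhost');
  }
  return violations;
}

// Mirrors the securityContext settings of secure-pod above.
const securePod = {
  securityContext: { runAsNonRoot: true, seccompProfile: { type: 'RuntimeDefault' } },
  containers: [{
    name: 'app',
    securityContext: {
      allowPrivilegeEscalation: false,
      runAsNonRoot: true,
      capabilities: { drop: ['ALL'], add: ['NET_BIND_SERVICE'] },
    },
  }],
};

console.log(restrictedViolations(securePod)); // []
```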

Secrets Management with External Systems

1. External Secrets Operator

# external-secret.yaml
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: vault-backend
  namespace: production
spec:
  provider:
    vault:
      server: "https://vault.example.com"
      path: "secret"
      version: "v2"
      auth:
        kubernetes:
          mountPath: "kubernetes"
          role: "demo-role"
          serviceAccountRef:
            name: "external-secrets-sa"
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: app-secrets
  namespace: production
spec:
  refreshInterval: 15s
  secretStoreRef:
    name: vault-backend
    kind: SecretStore
  target:
    name: app-secrets
    creationPolicy: Owner
  data:
  - secretKey: database-password
    remoteRef:
      key: database
      property: password
  - secretKey: api-key
    remoteRef:
      key: external-api
      property: key

Production Deployment Patterns

Let's explore proven patterns for deploying applications reliably in production Kubernetes environments.

GitOps with ArgoCD

1. ArgoCD Application

# argocd-application.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: web-app-production
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/company/web-app-manifests
    targetRevision: HEAD
    path: overlays/production
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
      allowEmpty: false
    syncOptions:
    - CreateNamespace=true
    - ApplyOutOfSyncOnly=true
    retry:
      limit: 5
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 3m
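
The `retry` block describes an exponential backoff: each failed sync waits `duration * factor^attempt`, capped at `maxDuration`, for at most `limit` attempts. The sketch below works out the resulting schedule in seconds, assuming exactly those semantics; the function name `retryDelays` is hypothetical and the durations are written as plain numbers rather than parsed from strings such as `5s`.

```javascript
// Delays in seconds for an exponential backoff capped at maxDuration.
function retryDelays({ limit, duration, factor, maxDuration }) {
  const delays = [];
  for (let attempt = 0; attempt < limit; attempt += 1) {
    delays.push(Math.min(duration * factor ** attempt, maxDuration));
  }
  return delays;
}

// Values from the Application above: 5s base, factor 2, 3m cap, 5 attempts.
const delays = retryDelays({ limit: 5, duration: 5, factor: 2, maxDuration: 180 });

console.log(delays); // [ 5, 10, 20, 40, 80 ]
```

With these settings the cap never applies; it would only start to matter from the sixth attempt onward, where the uncapped delay reaches 160 seconds and then exceeds 3 minutes.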

2. Kustomization Structure

# base/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
- deployment.yaml
- service.yaml
- configmap.yaml

commonLabels:
  app: web-app
  version: v1.0.0

images:
- name: web-app
  newTag: latest

# overlays/production/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

namespace: production

resources:
- ../../base
- ingress.yaml
- hpa.yaml

# `patches` supersedes the deprecated `patchesStrategicMerge` field
patches:
- path: deployment-patch.yaml

replicas:
- name: web-app-deployment
  count: 3

images:
- name: web-app
  newTag: v1.2.3

configMapGenerator:
- name: app-config
  files:
  - config.properties
  behavior: replace

Multi-Environment Management

1. Environment-Specific Configurations

# Directory structure
k8s-manifests/
├── base/
│   ├── deployment.yaml
│   ├── service.yaml
│   └── kustomization.yaml
├── environments/
│   ├── development/
│   │   ├── kustomization.yaml
│   │   └── patches/
│   ├── staging/
│   │   ├── kustomization.yaml
│   │   └── patches/
│   └── production/
│       ├── kustomization.yaml
│       └── patches/
└── components/
    ├── monitoring/
    ├── security/
    └── networking/

2. Helm Charts for Complex Applications

# Chart.yaml
apiVersion: v2
name: web-app
description: A production-ready web application
type: application
version: 0.1.0
appVersion: "1.0.0"

dependencies:
- name: postgresql
  version: 12.1.2
  repository: https://charts.bitnami.com/bitnami
  condition: postgresql.enabled
- name: redis
  version: 17.3.7
  repository: https://charts.bitnami.com/bitnami
  condition: redis.enabled

# values.yaml
replicaCount: 3

image:
  repository: my-registry.com/web-app
  pullPolicy: IfNotPresent
  tag: ""

service:
  type: ClusterIP
  port: 80

ingress:
  enabled: true
  className: "nginx"
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
  hosts:
    - host: app.example.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: app-tls
      hosts:
        - app.example.com

autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70

resources:
  limits:
    cpu: 500m
    memory: 512Mi
  requests:
    cpu: 250m
    memory: 256Mi

postgresql:
  enabled: true
  auth:
    existingSecret: postgres-secret
  primary:
    persistence:
      enabled: true
      size: 8Gi

redis:
  enabled: true
  auth:
    enabled: true
    existingSecret: redis-secret

# values-production.yaml
replicaCount: 5

image:
  tag: "v1.2.3"

resources:
  limits:
    cpu: 1000m
    memory: 1Gi
  requests:
    cpu: 500m
    memory: 512Mi

autoscaling:
  minReplicas: 5
  maxReplicas: 20

postgresql:
  primary:
    persistence:
      size: 100Gi
    resources:
      limits:
        cpu: 2000m
        memory: 2Gi
      requests:
        cpu: 1000m
        memory: 1Gi

redis:
  master:
    resources:
      limits:
        cpu: 500m
        memory: 512Mi
      requests:
        cpu: 250m
        memory: 256Mi
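
`values-production.yaml` only lists the keys that differ from `values.yaml` because Helm layers value files: maps are merged key by key, while scalars and lists from the later file replace the earlier ones. The sketch below imitates that layering on a trimmed-down copy of the two files. `mergeValues` is a hypothetical helper, not Helm's code, and it leaves out details such as `null` being used to delete a key.

```javascript
const isMap = (v) => v !== null && typeof v === 'object' && !Array.isArray(v);

function mergeValues(base, override) {
  const out = { ...base };
  for (const [key, value] of Object.entries(override)) {
    // Maps are merged recursively; scalars and lists are replaced outright.
    out[key] = isMap(value) && isMap(base[key]) ? mergeValues(base[key], value) : value;
  }
  return out;
}

// Trimmed-down excerpts of values.yaml and values-production.yaml.
const defaults = {
  replicaCount: 3,
  image: { repository: 'my-registry.com/web-app', pullPolicy: 'IfNotPresent', tag: '' },
  autoscaling: { enabled: true, minReplicas: 3, maxReplicas: 10, targetCPUUtilizationPercentage: 70 },
  resources: { limits: { cpu: '500m', memory: '512Mi' }, requests: { cpu: '250m', memory: '256Mi' } },
};
const production = {
  replicaCount: 5,
  image: { tag: 'v1.2.3' },
  autoscaling: { minReplicas: 5, maxReplicas: 20 },
  resources: { limits: { cpu: '1000m', memory: '1Gi' }, requests: { cpu: '500m', memory: '512Mi' } },
};

const merged = mergeValues(defaults, production);

console.log(merged.image);       // repository and pullPolicy kept, tag overridden
console.log(merged.autoscaling); // enabled and CPU target kept, replica bounds overridden
```

In practice the chart's own `values.yaml` is loaded automatically and the override is supplied with `-f values-production.yaml` on `helm install` or `helm upgrade`.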

Disaster Recovery and Backup

1. Velero Backup Configuration

# velero-backup.yaml
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: production-backup
  namespace: velero
spec:
  includedNamespaces:
  - production
  - monitoring
  excludedResources:
  - events
  - events.events.k8s.io
  ttl: 720h0m0s
  storageLocation: default
  volumeSnapshotLocations:
  - default
---
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: production-daily-backup
  namespace: velero
spec:
  schedule: "0 2 * * *"  # Daily at 2 AM
  template:
    includedNamespaces:
    - production
    ttl: 168h0m0s  # 7 days

2. Database Backup Job

# database-backup-job.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: postgres-backup
  namespace: production
spec:
  schedule: "0 3 * * *"  # Daily at 3 AM
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: postgres-backup
            image: postgres:13
            env:
            - name: PGPASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-secret
                  key: password
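            command:
            - /bin/bash
            - -c
            - |
              set -euo pipefail
              # Compute the file name once so the dump and the upload refer to the same file
              BACKUP_FILE="/backup/backup-$(date +%Y%m%d-%H%M%S).sql"
              pg_dump -h postgres-service -U postgres -d myapp > "$BACKUP_FILE"
              # Upload to S3 or other storage
              # (the stock postgres image does not include the AWS CLI; use an image that bundles it)
              aws s3 cp "$BACKUP_FILE" s3://my-backups/postgres/
              # Cleanup old local backups
              find /backup -name "backup-*.sql" -mtime +7 -delete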
            volumeMounts:
            - name: backup-storage
              mountPath: /backup
          volumes:
          - name: backup-storage
            persistentVolumeClaim:
              claimName: backup-pvc
          restartPolicy: OnFailure

Conclusion

Kubernetes is a powerful platform that enables modern application deployment and management at scale. This comprehensive guide has covered the essential concepts, practical implementations, and best practices needed to effectively use Kubernetes in production environments.

Key Takeaways

🏗️ Architecture Understanding

  • Control plane and worker node components
  • Networking model and service discovery
  • Storage abstraction and persistence
  • Security boundaries and isolation

📦 Core Concepts Mastery

  • Pods, Deployments, and Services
  • ConfigMaps and Secrets management
  • Resource management and scheduling
  • Monitoring and observability

🔒 Security Best Practices

  • RBAC and service accounts
  • Pod security standards
  • Network policies and segmentation
  • Secrets management with external systems

🚀 Production Readiness

  • GitOps and CI/CD integration
  • Multi-environment management
  • Disaster recovery and backup strategies
  • Performance optimization and scaling

Next Steps in Your Kubernetes Journey

  1. Hands-On Practice: Set up your own cluster and deploy real applications
  2. Advanced Topics: Explore service mesh, custom controllers, and operators
  3. Cloud-Native Ecosystem: Learn about Helm, ArgoCD, Istio, and other CNCF projects
  4. Certification: Consider pursuing CKA, CKAD, or CKS certifications
  5. Community Engagement: Join Kubernetes communities and contribute to open source

Recommended Learning Path

Beginner → Intermediate

  • Master basic objects and operations
  • Understand networking and storage
  • Learn debugging and troubleshooting
  • Practice with real applications

Intermediate → Advanced

  • Implement production security practices
  • Set up monitoring and logging
  • Learn cluster administration
  • Explore advanced scheduling and resource management

Advanced → Expert

  • Develop custom controllers and operators
  • Contribute to Kubernetes ecosystem
  • Design large-scale architectures
  • Mentor others in the community

Remember, Kubernetes is a journey, not a destination. The platform continues to evolve rapidly, so staying current with new features, best practices, and community developments is essential for long-term success.

Happy orchestrating! ⚓🚀
