Kubernetes Architecture and Components Guide

The document provides a comprehensive overview of Kubernetes architecture, components, and functionalities, including the roles of the control plane, worker nodes, and various Kubernetes objects like Pods, Deployments, and Services. It explains key concepts such as kubelet, kube-proxy, and the significance of etcd, along with deployment strategies like rolling updates and canary deployments. Additionally, it covers networking aspects such as Ingress and Network Policies, emphasizing the importance of managing traffic and communication within a Kubernetes cluster.

Uploaded by

arungc2911
Copyright
© All Rights Reserved

Kubernetes

1. Kubernetes Architecture & Components

1️⃣ Explain the architecture of Kubernetes.

Kubernetes follows a master–worker architecture.

 Control Plane (Master Node): Manages the cluster and makes global decisions — includes API
Server, Controller Manager, Scheduler, and etcd.

 Worker Nodes: Run your containerized workloads. Each node runs kubelet, kube-proxy, and the
container runtime.

 The API server is the entry point for all commands. Scheduler assigns Pods to nodes, and kubelet
ensures containers run as desired.

2️⃣ What are the core components of the Kubernetes control plane?

 API Server: The main entry point for all operations (kubectl, UI, REST).

 etcd: Distributed key-value store for cluster state and configuration.

 Controller Manager: Monitors cluster state and takes corrective actions (e.g., ensure desired
replicas).

 Scheduler: Assigns Pods to nodes based on resources, affinity, and constraints.

 Cloud Controller Manager (optional): Integrates with cloud-specific APIs like Azure, AWS, etc.

3️⃣ What is the role of etcd in Kubernetes?

 It stores all cluster data: Pods, ConfigMaps, Secrets, node info, etc.

 It is the single source of truth for Kubernetes.

 It is a distributed key-value store that uses the Raft consensus algorithm, including leader election, to keep data consistent across members.

 Losing etcd can lead to loss of cluster state, so backups are critical.

4️⃣ What happens if the control plane goes down?

 The existing workloads continue to run, as kubelets manage local Pods.

 However, no new Pods or scaling/deployments can happen since the API server and scheduler
are unavailable.

 Once the control plane recovers, its controllers reconcile the actual state with the desired state stored in etcd.


5️⃣ What happens if the kubelet service goes down on a node?

 That node becomes NotReady after a grace period.

 The control plane marks the Pods on that node as Unknown and, after the eviction timeout (about five minutes by default), controllers create replacements on healthy nodes.

 The kubelet must be running for Pod health checks and node-heartbeats.

6️⃣ What is the role of kube-proxy? What if it fails?

 kube-proxy maintains network rules to route traffic to correct Pods via Services (ClusterIP,
NodePort, etc.).

 If it fails, network traffic between Services and Pods may break, but existing connections might
still work temporarily depending on the mode (iptables/IPVS).

7️⃣ What is kubeadm?

 kubeadm is a tool to bootstrap Kubernetes clusters easily.

 It helps initialize control plane (kubeadm init) and join worker nodes (kubeadm join).

 It simplifies manual cluster setup steps (certs, tokens, kubeconfig).

8️⃣ What is CNI and CRI?

 CNI (Container Network Interface): Manages Pod networking — defines how Pods get IPs and
communicate (e.g., Flannel, Calico, Azure CNI).

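 CRI (Container Runtime Interface): Lets the kubelet work with any compliant container runtime, such as containerd or CRI-O. Docker Engine has needed the cri-dockerd adapter since dockershim was removed in Kubernetes 1.24.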

9️⃣ What are Init Containers and Sidecar containers?

 Init Containers: Run before main containers start. Used for setup tasks (e.g., fetch config, wait for
DB).

 Sidecar Containers: Run alongside the main app to extend functionality (e.g., logging,
monitoring, proxy).
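
Both patterns can appear in the same Pod. The manifest below is a minimal sketch, not taken from the original: the Pod name, the `db` Service name, and the image tags are all assumptions chosen for illustration. The init container only waits for the `db` name to resolve in cluster DNS, which is a weaker check than a real database connection.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-with-helpers          # hypothetical name
spec:
  initContainers:
    - name: wait-for-db
      image: busybox:1.28
      # Blocks Pod startup until the assumed "db" Service name resolves in DNS.
      command: ["sh", "-c", "until nslookup db; do echo waiting for db; sleep 2; done"]
  containers:
    - name: app
      image: nginx:1.27
      volumeMounts:
        - name: logs
          mountPath: /var/log/nginx
    - name: log-tailer            # sidecar: streams the app's access log to stdout
      image: busybox:1.28
      command: ["sh", "-c", "tail -F /var/log/nginx/access.log"]
      volumeMounts:
        - name: logs
          mountPath: /var/log/nginx
  volumes:
    - name: logs
      emptyDir: {}                # shared scratch volume between app and sidecar
```

The init container must exit successfully before `app` and `log-tailer` start; the two main containers then run side by side and share the `logs` volume.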

🔟 What is the difference between DaemonSet, Deployment, and StatefulSet?

| Type | Purpose | Example |
|------|---------|---------|
| Deployment | Manages stateless Pods with replicas. | Web apps |
| StatefulSet | Manages stateful apps; maintains identity & order. | Databases |
| DaemonSet | Ensures one Pod per node. | Log collectors, monitoring agents |

11️⃣ What is the role of API server, controller manager, and scheduler?

 API Server: Front-end for cluster management — validates and serves API requests.

 Controller Manager: Watches API server and ensures desired state (e.g., keeps 3 replicas if
desired).

 Scheduler: Assigns Pods to nodes based on available resources, affinity, taints/tolerations.

12️⃣ Explain worker node components.

Each worker node has:

 Kubelet: Talks to API server and ensures containers are running as specified.

 Kube-proxy: Manages network routing for Services.

 Container runtime: Actually runs containers (e.g., containerd, CRI-O).

 Pods: The actual workloads.

13️⃣ What is a node pool?

 A node pool is a group of worker nodes within a cluster that share the same configuration (size,
OS, scaling rules).

 In AKS, node pools allow mixing Linux and Windows nodes or different VM sizes.

14️⃣ How many clusters have you created and managed?

(Example answer for interviews)


“I’ve managed around 3 to 4 clusters — one for dev, one for staging, and one for production — each with
separate node pools and autoscaling enabled.”

15️⃣ Explain AKS architecture.

 Azure Kubernetes Service (AKS) is a managed Kubernetes service: Azure hosts and operates the control plane components.

 You only manage worker nodes (agent pools).

 It integrates with Azure AD, Key Vault, Monitor, Azure CNI, and Load Balancer for networking.

 Provides built-in scaling, monitoring, and upgrades.


16️⃣ What are the key components you worked on in AKS?

Possible answer:
“I worked on managing node pools, deploying applications using Helm, configuring Ingress with NGINX,
enabling autoscaling, integrating Azure Key Vault secrets via CSI driver, and monitoring with Prometheus
& Grafana.”

🧱 2. Pods, Deployments & ReplicaSets

1️⃣ What is a Pod?

A Pod is the smallest deployable unit in Kubernetes. It can contain one or more containers that share a network namespace (one IP address and port space) and any volumes defined in the Pod spec.
It represents a single instance of a running application.

2️⃣ What are the different Pod statuses?

 Pending – waiting to be scheduled or image pulling.

 Running – containers are running.

 Succeeded – completed successfully (for jobs).

 Failed – containers exited with errors.

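 CrashLoopBackOff – container keeps crashing and the kubelet is backing off between restarts (a container state shown in the STATUS column, not a Pod phase).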

 Unknown – node unreachable.

3️⃣ Difference between Pod, ReplicaSet, and Deployment.

| Object | Purpose |
|--------|---------|
| Pod | Basic unit running a container. |
| ReplicaSet | Ensures desired number of Pod replicas are running. |
| Deployment | Manages ReplicaSets and allows rolling updates & rollbacks. |

4️⃣ What happens internally when you create a Pod?

1. The manifest is sent to API Server.

2. API Server stores it in etcd.

3. Scheduler picks a suitable node.


4. Kubelet on that node pulls the image and starts the container.

5. Pod status updates back to API server.

5️⃣ What is CrashLoopBackOff error and how do you troubleshoot it?

It means the container keeps crashing repeatedly.


Troubleshooting:

 Check logs: kubectl logs pod-name --previous

 Describe the pod: kubectl describe pod pod-name

 Check readiness/liveness probes and app-level errors.

6️⃣ What are common reasons a Pod fails to start?

 Image not found or wrong image name.

 Insufficient node resources (CPU/memory).

 Missing ConfigMap/Secret/volume.

 Network or permission issue.

 Failing liveness probe.

7️⃣ How do you troubleshoot a Pod in Pending state?

 Check scheduling events with kubectl describe pod <pod> (look for FailedScheduling) and node capacity with kubectl describe node <node>.

 See if there are taints preventing scheduling.

 Check storage/PVC binding.

 Check image pull secrets or scheduling constraints.

8️⃣ What are the commands to check Pod logs and live logs?

kubectl logs <pod-name>

kubectl logs -f <pod-name> # live logs

kubectl logs <pod> -c <container-name>

9️⃣ What happens when a Pod dies?

 If managed by a Deployment or ReplicaSet → it’s automatically recreated.

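 If standalone → the kubelet can restart its containers on the same node according to restartPolicy, but if the Pod is deleted or its node fails, nothing recreates it.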


🔟 How can you schedule a Pod on a specific node?

 Use nodeSelector, affinity, or taints/tolerations.


Example:

nodeSelector:
  disktype: ssd

11️⃣ How do you debug a Pod issue?

 kubectl describe pod <pod> → see events & reasons.

 kubectl logs <pod> → see app logs.

 kubectl exec -it <pod> -- /bin/bash → enter Pod.

 Check node status & resources.

 Use kubectl get events.

12️⃣ How do you enter a running Pod?

kubectl exec -it <pod-name> -- /bin/bash

or /bin/sh depending on the image.

13️⃣ What are Init containers used for?

They perform setup or preconditions before main containers start — e.g., wait for a DB, or load configs.
They always run sequentially before app containers.

14️⃣ Can a Pod automatically restart?

Yes — depending on restartPolicy:

 Always (the default, and the only value a Deployment's Pod template accepts)

 OnFailure

 Never

15️⃣ Write a simple Pod manifest YAML.

apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
    - name: nginx
      image: nginx:latest
      ports:
        - containerPort: 80

16️⃣ How do you attach a volume to a Pod?

apiVersion: v1
kind: Pod
metadata:
  name: volume-pod
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: data-vol
          mountPath: /usr/share/nginx/html
  volumes:
    - name: data-vol
      persistentVolumeClaim:
        claimName: pvc-demo

3. Deployments, Scaling & Strategies

1️⃣ What is a Deployment in Kubernetes?

A Deployment is a higher-level controller that manages ReplicaSets and ensures the desired number of
Pods are running and updated gradually.
It supports rolling updates, rollbacks, and scaling of applications easily.
👉 You declare the desired state in YAML, and the Deployment controller reconciles it automatically.

2️⃣ What resources get created when you apply a Deployment YAML?

When you apply a Deployment:

 A Deployment object is created.

 It creates a ReplicaSet.

 That ReplicaSet creates the required Pods.


So the chain is: Deployment → ReplicaSet → Pods.
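
A minimal Deployment that produces this chain might look like the sketch below. The name `web`, the `app: web` label, and the image tag are illustrative assumptions, not values from the original.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                 # hypothetical name
spec:
  replicas: 3               # the ReplicaSet keeps three Pods running
  selector:
    matchLabels:
      app: web              # must match the Pod template labels below
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.27
          ports:
            - containerPort: 80
```

After applying it, `kubectl get deploy,rs,pods -l app=web` should list one Deployment, one ReplicaSet, and three Pods.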

3️⃣ Difference between Deployment and StatefulSet.

| Feature | Deployment | StatefulSet |
|---------|------------|-------------|
| Nature | Stateless apps | Stateful apps |
| Pod names | Random | Sequential & predictable |
| Storage | Shared or ephemeral | Dedicated persistent volume per Pod |
| Scaling | Easy & parallel | Sequential (maintains order) |
| Example | Web servers, APIs | Databases, Kafka, Redis |

4️⃣ What is a DaemonSet and its use case?

A DaemonSet ensures that one copy of a Pod runs on every node (or selected nodes).
Used for:

 Log collectors (Fluentd, Datadog agents)

 Monitoring agents (Prometheus Node Exporter)

 Network proxies
If a new node is added, a Pod is automatically scheduled on it.
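
As an illustration of the monitoring-agent case, here is a hedged sketch of a DaemonSet for Prometheus Node Exporter. The `monitoring` namespace, the labels, and the image tag are assumptions; a production setup would usually add host mounts and resource limits.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: monitoring           # assumed namespace; it must already exist
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      tolerations:
        # Lets the agent also run on control plane nodes that carry this taint.
        - key: node-role.kubernetes.io/control-plane
          operator: Exists
          effect: NoSchedule
      containers:
        - name: node-exporter
          image: prom/node-exporter:v1.8.1   # assumed tag
          ports:
            - containerPort: 9100
```

There is no `replicas` field: the number of Pods follows the number of matching nodes.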

5️⃣ What deployment strategies have you used (rolling update, blue-green, canary)?

I’ve mainly used:

 Rolling update – default in Kubernetes, gradually replaces Pods.

 Blue-Green – two environments (Blue=live, Green=new); switch traffic post-validation.

 Canary – small % of traffic sent to new version before full rollout.

6️⃣ How do you configure rolling updates?


In your Deployment YAML:

strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1
    maxUnavailable: 0

 maxSurge – extra Pods allowed during update

 maxUnavailable – number of Pods that can be unavailable


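With maxUnavailable: 0, the rollout never drops below the desired replica count, which avoids downtime as long as readiness probes accurately report when new Pods can serve traffic.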

7️⃣ What is zero downtime deployment? How do you achieve it?

It means application remains available during updates.


Achieved by:

 Using RollingUpdate strategy

 Setting readiness probes properly

 Avoiding simultaneous Pod terminations

 Using HPA for scaling under load
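
The first two points can be combined in a single Deployment. The following is a sketch under assumptions: the name `api`, the probe path `/`, and the timing values are placeholders that would need tuning for a real application.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api                       # hypothetical name
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1                 # at most one extra Pod during the rollout
      maxUnavailable: 0           # never go below four Pods
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: nginx:1.27
          ports:
            - containerPort: 80
          readinessProbe:         # a Pod receives Service traffic only after this passes
            httpGet:
              path: /             # assumed health endpoint
              port: 80
            initialDelaySeconds: 5
            periodSeconds: 5
            failureThreshold: 3
```

Because an old Pod is removed only after a new one reports ready, capacity stays constant throughout the update.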

8️⃣ Explain Blue-Green deployment.

 Maintain two identical environments:

o Blue → current version (live)

o Green → new version (staging)

 Deploy to Green, test, and switch traffic via Service or Ingress.

 Rollback = switch back to Blue instantly.


Used for safe production rollouts.
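
One common way to implement the traffic switch is a Service whose selector includes a version label. The sketch below assumes two Deployments already exist whose Pods carry `app: web` plus `version: blue` or `version: green`; those labels and the Service name are illustrative.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web                 # hypothetical name
spec:
  selector:
    app: web
    version: blue           # change to "green" to cut traffic over; revert to roll back
  ports:
    - port: 80
      targetPort: 80
```

The cutover is then a one-field change, for example `kubectl patch service web -p '{"spec":{"selector":{"app":"web","version":"green"}}}'`.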

9️⃣ Explain Canary deployment.

 Gradually release new version to a small percentage of users (say 10%), observe performance.

 If healthy, roll out to 100%.

 Tools: Istio, Argo Rollouts, or custom CI/CD logic.


Great for detecting regressions early.

🔟 How do you rollback a failed Deployment?

kubectl rollout undo deployment <deployment-name>

or check history:

kubectl rollout history deployment <deployment-name>

Then roll back to a specific revision with --to-revision=<revision> if needed.

11️⃣ How do you scale Pods up/down?

 Manually:

 kubectl scale deployment nginx --replicas=5

 Automatically: using Horizontal Pod Autoscaler (HPA).

12️⃣ What is Horizontal Pod Autoscaler (HPA) and Vertical Pod Autoscaler (VPA)?

 HPA: Scales the number of Pods based on CPU/memory or custom metrics.

 VPA: Adjusts CPU and memory requests/limits for Pods automatically; in its automatic update modes it typically evicts and recreates Pods to apply new values.
HPA handles quantity, VPA handles capacity.
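
A CPU-based HPA could be declared as in the sketch below. The target Deployment name `web` and the thresholds are assumptions; the cluster also needs a metrics source such as metrics-server, and the target containers must declare CPU requests for utilization to be computed.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web               # hypothetical Deployment to scale
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out when average CPU exceeds 70% of requests
```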

13️⃣ Explain event-driven scaling (KEDA).

KEDA (Kubernetes Event-driven Autoscaling) allows scaling based on external events, e.g.:

 Azure Service Bus queue length

 Kafka lag

 Prometheus metrics
It works alongside HPA by feeding external metrics.
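
For the Kafka lag case, a KEDA ScaledObject might look like the following sketch. It assumes KEDA is installed in the cluster; the Deployment name, broker address, consumer group, and topic are hypothetical.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: orders-consumer-scaler
spec:
  scaleTargetRef:
    name: orders-consumer         # hypothetical Deployment in the same namespace
  minReplicaCount: 0              # scale to zero when the topic is idle
  maxReplicaCount: 20
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka.example.svc:9092   # assumed broker address
        consumerGroup: orders-group
        topic: orders
        lagThreshold: "50"        # target lag per replica
```

KEDA creates and manages the underlying HPA for the target, so a separate HPA on the same Deployment should be avoided.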

14️⃣ How do you handle dependency order (e.g., DB before API)?

Use:

 Init containers to wait for DB connection.

 Readiness probes to delay traffic until app is ready.

 Or define dependsOn logic in CI/CD pipeline (DB → API → frontend).

15️⃣ How do you upgrade your Kubernetes or AKS cluster with minimal downtime?

 Use node pool upgrades one node at a time.

 Ensure PodDisruptionBudgets (PDB) are configured.


 Use rolling updates for Deployments.

 Take etcd snapshots (for self-managed clusters).

 Run upgrade in staging first.
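
The PodDisruptionBudget mentioned above can be sketched as follows, assuming the application Pods are labeled `app: web` and run at least three replicas.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  minAvailable: 2           # node drains wait rather than evict below two ready Pods
  selector:
    matchLabels:
      app: web              # assumed Pod label
```

During a node pool upgrade, each drain respects this budget, so a voluntary eviction is refused if it would leave fewer than two matching Pods available.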

🌐 4. Services, Ingress & Networking

1️⃣ What are Services in Kubernetes?

A Service provides a stable network identity and endpoint to access Pods.


Since Pods are ephemeral, Services ensure consistent access via DNS.

2️⃣ Difference between ClusterIP, NodePort, LoadBalancer, and ExternalName.

| Type | Purpose |
|------|---------|
| ClusterIP | Default; internal access only. |
| NodePort | Exposes service on each node’s port (30000–32767). |
| LoadBalancer | Creates an external load balancer (Azure LB). |
| ExternalName | Maps service to an external DNS name. |

3️⃣ Why do we need a Service for Pods?

Pods have dynamic IPs; Services provide a stable endpoint with load balancing across Pods using
selectors.

4️⃣ What is an Ingress? Why is it needed?

Ingress manages external HTTP/S traffic and routes it to internal Services.


Instead of creating multiple LoadBalancers, you define path-based or host-based rules using a single
public endpoint.

5️⃣ What is an Ingress Controller?

It’s a Pod that implements Ingress rules, e.g. NGINX, Traefik, Azure Application Gateway Ingress
Controller (AGIC).
It watches Ingress resources and configures routing automatically.

6️⃣ How do you configure Ingress rules?

Example:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
spec:
  rules:
    - host: [Link]
      http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: backend-svc
                port:
                  number: 80

7️⃣ How does Ingress differ from a Load Balancer?

 LoadBalancer exposes a single service externally.

 Ingress routes multiple services through a single entry point (via controller).
Ingress offers more flexibility for routing, SSL, and rewrites.

8️⃣ What is Egress?

Egress controls outbound traffic from Pods to external destinations.


You can manage it using Network Policies or Azure Firewall / NAT Gateway in AKS.

9️⃣ What are Network Policies?

They define how Pods communicate with each other and external endpoints.
You can restrict ingress/egress traffic using labels and namespaces.

🔟 How do you restrict communication between Pods or namespaces?

 Apply NetworkPolicies with specific selectors.


 Deny all by default, then whitelist required flows.
Example:

policyTypes: ["Ingress"]
ingress:
  - from:
      - podSelector:
          matchLabels:
            app: frontend
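
The "deny all, then allow" approach needs two complete policies. The sketch below assumes a `dev` namespace, `app: frontend` and `app: backend` labels, and port 8080; it also assumes the cluster's CNI plugin enforces NetworkPolicies, which not every plugin does.

```yaml
# 1) Deny all ingress to every Pod in the namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: dev                  # hypothetical namespace
spec:
  podSelector: {}                 # empty selector matches all Pods
  policyTypes:
    - Ingress
---
# 2) Allow only frontend Pods to reach backend Pods on one port.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: dev
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080              # assumed backend port
```

Policies are additive, so the second policy opens exactly one flow on top of the default deny.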

11️⃣ How do you enable SSL/TLS in Kubernetes?

 Create a TLS Secret:

kubectl create secret tls my-tls --cert=[Link] --key=[Link]

 Reference it in your Ingress YAML:

tls:
  - hosts:
      - [Link]
    secretName: my-tls

12️⃣ What are common network issues and how do you troubleshoot them?

 Pods can’t reach each other: check CNI plugin.

 Service unreachable: verify selectors, endpoints, and kube-proxy.

 DNS issues: check CoreDNS logs.

 External access: verify Ingress or LB rules.


Commands:

kubectl get svc,ep

kubectl logs -n kube-system -l k8s-app=kube-dns

13️⃣ How do you achieve path-based routing using Ingress?

Use multiple path rules in Ingress:

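paths:
  - path: /api
    pathType: Prefix        # pathType is required in networking.k8s.io/v1
    backend:
      service:
        name: backend-svc
        port:
          number: 80
  - path: /web
    pathType: Prefix
    backend:
      service:
        name: frontend-svc
        port:
          number: 80        # assumed port; the original omits it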

→ /api → backend, /web → frontend.

14️⃣ How does Kubernetes allocate IPs to Pods and Services?

 Pods: From CNI plugin subnet (e.g., Azure CNI or kubenet).

 Services: From Service CIDR configured in cluster.


Both are defined during cluster setup.

15️⃣ What happens if DNS fails inside the cluster?

 Pods won’t resolve service names → communication breaks.

 You can debug via:

 kubectl exec -it <pod> -- nslookup [Link]

 Check CoreDNS logs or restart coredns Pods.

16️⃣ What is node affinity and pod affinity?

 Node Affinity: Schedule Pods on specific nodes using labels (e.g., disktype=ssd).

 Pod Affinity: Schedule Pods near other Pods.

 Anti-affinity: Spread Pods across nodes for HA.
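
Node affinity and Pod anti-affinity can be combined in one spec. This is a sketch under assumptions: nodes are labeled `disktype=ssd` and the Pods carry an `app: web` label; both are illustrative.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-ssd
  labels:
    app: web
spec:
  affinity:
    nodeAffinity:
      # Hard rule: only schedule onto nodes labeled disktype=ssd.
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: disktype
                operator: In
                values: ["ssd"]
    podAntiAffinity:
      # Soft rule: prefer nodes that do not already run an app=web Pod.
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                app: web
            topologyKey: kubernetes.io/hostname
  containers:
    - name: web
      image: nginx:1.27
```

Using the `preferred` form for anti-affinity keeps Pods schedulable even when there are fewer nodes than replicas.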

17️⃣ What is taint and toleration?

 Taint: Marks a node as “restricted.”

 Toleration: Allows Pods to be scheduled on tainted nodes.


Example: run critical workloads on dedicated nodes.

18️⃣ How do you prevent Pods from being scheduled on specific nodes?

 Apply taints:

 kubectl taint nodes node1 key=value:NoSchedule


 Or use nodeSelector/affinity rules to control placement.
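
A Pod that is still allowed onto the node tainted with `key=value:NoSchedule` needs a matching toleration. The sketch below uses a hypothetical Pod name and image.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: tolerant-pod        # hypothetical name
spec:
  tolerations:
    - key: "key"            # matches the taint applied with kubectl taint
      operator: "Equal"
      value: "value"
      effect: "NoSchedule"
  containers:
    - name: app
      image: nginx:1.27
```

A toleration only permits scheduling on the tainted node; to pin the Pod there, pair it with a nodeSelector or node affinity rule.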

19️⃣ How do you troubleshoot if service cannot reach a Pod?

 Check service endpoints: kubectl get ep <svc>

 Verify selectors match Pod labels.

 Check kube-proxy logs.

 Ensure Pod is ready (readiness probe passing).

 Use curl <pod-ip>:<port> from another Pod to test.

5. Configurations, Secrets & Security

What is a ConfigMap?

 A ConfigMap stores non-confidential configuration data in key-value pairs.

 Example: app URLs, feature flags, environment variables.

 Mounted as environment variables or files inside Pods.
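
Both consumption styles are shown in the sketch below. The ConfigMap name, its keys, and the URL are illustrative assumptions; the busybox container prints what it received and then sleeps.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  APP_URL: "https://app.example.com"   # hypothetical value
  FEATURE_X_ENABLED: "true"
---
apiVersion: v1
kind: Pod
metadata:
  name: config-demo
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "env | grep -E 'APP_URL|FEATURE_X' && ls /etc/app && sleep 3600"]
      envFrom:
        - configMapRef:
            name: app-config      # every key becomes an environment variable
      volumeMounts:
        - name: config
          mountPath: /etc/app     # every key becomes a file in this directory
  volumes:
    - name: config
      configMap:
        name: app-config
```

Mounted files are refreshed when the ConfigMap changes, while environment variables are fixed at container start.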

What is a Secret? Difference between ConfigMap and Secret.

 Secret stores sensitive data like passwords, tokens, certificates.

 Data is Base64-encoded, which is an encoding and not encryption.

 Difference:

| Feature | ConfigMap | Secret |
|---------|-----------|--------|
| Data type | Non-sensitive | Sensitive |
| Encoding | Plain text | Base64 |
| Security | Not encrypted | Can be integrated with Key Vault/Vault |

How do you rotate Secrets in Kubernetes?

 Use SecretProviderClass with external secret stores (e.g., Azure Key Vault).

 Rotate at source → the driver syncs the new value automatically when its secret rotation feature is enabled.

 Optionally use a sidecar or reloader (e.g., stakater/reloader) to restart pods when secrets
change.

How do you secure Secrets?

 Enable encryption at rest using KMS providers.


 Use Azure Key Vault CSI Driver for secret injection.

 Limit access via RBAC and namespace isolation.

 Avoid committing secrets to Git repos.

How do you integrate Azure Key Vault with AKS?

 Use Azure Key Vault Provider for Secrets Store CSI Driver.

 Steps:

1. Create a Managed Identity and grant access in Key Vault.

2. Install CSI driver using Helm.

3. Create a SecretProviderClass manifest linking Key Vault secrets.

4. Mount them into Pods as files; to expose them as env variables, also sync them to a Kubernetes Secret via secretObjects.

What is a SecretProviderClass?

 Custom resource that defines how secrets are pulled from Key Vault.

 Example:

apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: azure-kv-provider
spec:
  provider: azure
  parameters:
    keyvaultName: my-keyvault
    tenantId: <tenant-id>   # placeholder; the Azure provider requires the vault's tenant ID
    objects: |
      array:
        - |
          objectName: dbPassword
          objectType: secret

How do you manage RBAC in Kubernetes?

 RBAC (Role-Based Access Control) manages who can do what in the cluster.

 Objects:
o Role → permissions in a namespace.

o ClusterRole → cluster-wide permissions.

o RoleBinding / ClusterRoleBinding → attach roles to users/service accounts.

Difference between RoleBinding and ClusterRoleBinding

| Type | Scope | Example |
|------|-------|---------|
| RoleBinding | Namespace-specific | Developer can list pods in dev namespace |
| ClusterRoleBinding | Cluster-wide | Admin can access all namespaces |
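
The RoleBinding example in the table could be expressed as the sketch below. The user name `jane@example.com` and the object names are hypothetical; how user names map to identities depends on the cluster's authentication setup.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: dev
  name: pod-reader
rules:
  - apiGroups: [""]               # "" is the core API group
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: dev
  name: read-pods
subjects:
  - kind: User
    name: jane@example.com        # hypothetical user
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```

The result can be checked with `kubectl auth can-i list pods --as jane@example.com -n dev`.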

How do you restrict access between namespaces?

 Use NetworkPolicies to limit pod communication.

 Use RBAC to limit API access per namespace.

 Apply namespace-level resource quotas.

What is authentication vs authorization in Kubernetes?

 Authentication: Verify who you are (user, service account, Azure AD identity).

 Authorization: Verify what you can do (via roles, bindings).

How do you secure workloads in AKS?

 Use Managed Identity for Pods.

 Enable Azure Policy for AKS to enforce compliance.

 Enable network policies, encryption at rest, image scanning, and RBAC.

Best Practices to Secure Cluster & Apps

✅ Enable RBAC & audit logs


✅ Restrict privileges (no root containers)
✅ Use private container registry (ACR)
✅ Apply Pod Security Standards
✅ Regularly rotate secrets
✅ Enable TLS between services

💾 6. Storage & Persistence


What is PersistentVolume (PV) and PersistentVolumeClaim (PVC)?

 PV: A piece of storage provisioned in cluster (Azure Disk/File).

 PVC: A request for storage by user/pod.

 PVC binds to an available PV.

Difference between Volume, PV, and PVC

| Concept | Scope | Provisioned By |
|---------|-------|----------------|
| Volume | Ephemeral | Pod definition |
| PV | Cluster-wide resource | Admin |
| PVC | User’s claim for PV | Developer |

What is a StorageClass?

 Defines how storage is provisioned dynamically.

 Example: Azure Disk (Premium_LRS, StandardSSD_LRS).
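
A StorageClass for Premium SSD disks and a claim that uses it might look like the sketch below. It assumes the Azure Disk CSI driver is available (as it is on current AKS clusters); the class name and size are illustrative, and `pvc-demo` matches the claim name used in the earlier volume example.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: premium-ssd               # hypothetical class name
provisioner: disk.csi.azure.com
parameters:
  skuName: Premium_LRS
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer   # create the disk where the Pod is scheduled
allowVolumeExpansion: true
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-demo
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: premium-ssd
  resources:
    requests:
      storage: 10Gi
```

With dynamic provisioning, the PV is created automatically once a Pod that uses the claim is scheduled.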

How do you restore data in Kubernetes?

 Use Velero or Azure Backup for AKS.

 Backup PV snapshots → restore via new PVCs.

How do you mount a Storage Account to AKS?

 Use AzureFile CSI driver:

1. Create storage account & share.

2. Create secret with storage keys.

3. Define PersistentVolume and PersistentVolumeClaim.

4. Mount in pod spec.

How do you fetch credentials from Key Vault to access storage?

 Use CSI driver + SecretProviderClass to inject credentials as env vars or files.

How do you handle database persistence and replication?

 Use StatefulSets with PVC templates.


 Configure DB replication (e.g., primary/replica pattern).

 Back up volumes using snapshots.
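
The first point can be sketched as a StatefulSet with a headless Service and a volume claim template. The names, the PostgreSQL image, the Secret `db-secret`, and the volume size are assumptions; the manifest gives each replica stable identity and storage only, and replication itself still has to be configured in the database.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: db
spec:
  clusterIP: None                 # headless: gives Pods stable DNS names (db-0.db, db-1.db, ...)
  selector:
    app: db
  ports:
    - port: 5432
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db
  replicas: 3
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: postgres
          image: postgres:16
          env:
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: db-secret         # hypothetical Secret; must exist
                  key: password
            - name: PGDATA
              value: /var/lib/postgresql/data/pgdata
          ports:
            - containerPort: 5432
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 20Gi                 # one PVC per replica: data-db-0, data-db-1, ...
```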

How do you access or download logs stored in PVC?

 Attach the same PVC to a temporary pod and kubectl cp or kubectl exec to read logs.

🚀 7. CI/CD Integration & Deployment Automation

How do you connect AKS with CI/CD pipeline (Azure DevOps/Jenkins)?

 Use service connection (Azure Resource Manager type).

 Pipeline authenticates via Service Principal or Managed Identity.

 Use kubectl or helm tasks to deploy manifests.

What are the stages in CI/CD for a Kubernetes app?

1. Build – build image & run unit tests.

2. Scan – code + image scan (SonarQube, Trivy).

3. Push – push image to ACR.

4. Deploy – apply YAML/Helm to AKS.

5. Verify – run smoke tests, health checks.

How do you deploy to AKS via pipeline YAML?

Example:

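- task: Kubernetes@1   # the kubectl task is published as Kubernetes@1
  inputs:
    connectionType: Azure Resource Manager
    azureSubscriptionEndpoint: $(azureConnection)
    azureResourceGroup: $(resourceGroup)   # hypothetical variable
    kubernetesCluster: $(clusterName)      # hypothetical variable
    namespace: dev
    command: apply
    useConfigurationFile: true
    configuration: manifests/[Link]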

How do you pass parameters from templates to pipelines?


 Use variables and templates:

parameters:
  - name: env
    type: string

steps:
  - script: echo "Deploying to ${{ parameters.env }}"

How do you handle dependencies in CI/CD pipelines?

 Define stage dependencies using dependsOn.

 Add pre-deployment approvals for staging → production flow.
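
In Azure Pipelines YAML, that ordering might be sketched as below. The stage, job, and environment names are hypothetical, and the `echo` steps stand in for real build and deploy tasks. Approvals are not written in the YAML itself; they are configured as checks on the referenced environment.

```yaml
stages:
  - stage: Build
    jobs:
      - job: BuildImage
        steps:
          - script: echo "build and push image"

  - stage: DeployStaging
    dependsOn: Build                    # runs only after Build completes
    jobs:
      - deployment: Staging
        environment: staging
        strategy:
          runOnce:
            deploy:
              steps:
                - script: echo "deploy to staging"

  - stage: DeployProd
    dependsOn: DeployStaging
    condition: succeeded()              # skip production if staging failed
    jobs:
      - deployment: Prod
        environment: production         # approvals/checks attach to this environment
        strategy:
          runOnce:
            deploy:
              steps:
                - script: echo "deploy to production"
```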

How do you ensure zero downtime deployment via CI/CD?

 Use Rolling updates or Blue-Green strategy in Deployment YAML.

 Use readiness probes to avoid routing to unready pods.

What is the flow from commit to production deployment?

1. Developer commits → triggers CI.

2. Build image → push to ACR.

3. Run SonarQube & Trivy scans.

4. Deploy manifests/Helm to AKS.

5. Post-deployment validation.

How do you handle versioning in pipeline builds?

 Use build IDs or Git commit hash for Docker image tags:

docker build -t myapp:${{ [Link] }} .

8. Monitoring & Troubleshooting

Which tools do you use to monitor Kubernetes (Prometheus, Grafana, etc.)?

I’ve primarily used Prometheus and Grafana for metrics, ELK / EFK stacks for logs, and Azure Monitor +
Log Analytics + Container Insights for cluster-level visibility.
Prometheus scrapes metrics from Kube components, and Grafana visualizes them in dashboards.

How do Prometheus and Grafana work together?


 Prometheus: Collects and stores time-series metrics (CPU, memory, latency, etc.).

 Grafana: Connects to Prometheus as a data source and visualizes the metrics using dashboards
and alerts.

What metrics do you monitor for Pods and nodes?

 Pod level: CPU/memory usage, restarts, latency, response time, and pod health.

 Node level: CPU/memory utilization, disk I/O, network usage, node pressure.

 Cluster level: API server latency, scheduler queue time, etc.

How do you receive alerts from Kubernetes or Azure Monitor?

 Prometheus Alertmanager → Slack, Teams, or email.

 Azure Monitor alerts → Action Groups for notifications or Logic Apps for automation.
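
On the Prometheus side, an alert starts as a rule that Alertmanager later routes to a receiver. The rule file below is a sketch: it assumes kube-state-metrics is installed (it exposes `kube_pod_container_status_restarts_total`), and the threshold and names are illustrative.

```yaml
groups:
  - name: pod-health
    rules:
      - alert: PodRestartingFrequently
        # Fires when a container restarted more than 3 times in 15 minutes.
        expr: increase(kube_pod_container_status_restarts_total[15m]) > 3
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Container {{ $labels.container }} in {{ $labels.namespace }}/{{ $labels.pod }} is restarting frequently"
```

Alertmanager then matches on labels such as `severity` to decide whether the notification goes to Slack, Teams, or email.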

What is the process to troubleshoot CrashLoopBackOff?

1. kubectl describe pod <pod> → check reason.

2. kubectl logs <pod> --previous → check last run logs.

3. Common causes: misconfiguration, missing dependency, wrong env vars, or liveness probe failure.

4. Fix the error → re-deploy.

What if Pods fail due to insufficient resources?

 Check with kubectl describe node and kubectl top nodes.

 Increase node size, add autoscaling, or update resources.requests/limits in the Deployment.
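
Requests and limits are set per container. The values in this sketch are illustrative assumptions, not recommendations.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: sized-pod           # hypothetical name
spec:
  containers:
    - name: app
      image: nginx:1.27
      resources:
        requests:           # what the scheduler reserves on the node
          cpu: 250m
          memory: 256Mi
        limits:             # hard ceiling; exceeding the memory limit gets the container OOM-killed
          cpu: 500m
          memory: 512Mi
```

The scheduler places Pods by requests, so requests that are too high cause Pending Pods, while limits that are too low cause throttling or restarts.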

What steps if API server is slow or unresponsive?

 Check etcd health and API server logs.

 Verify control plane resource usage.

 Reduce frequent API calls from controllers or external integrations.

How do you debug ImagePullBackOff errors?

 Check image name/tag in YAML.

 Verify registry credentials (imagePullSecrets).


 Ensure ACR permissions (Managed Identity or Service Principal).

 Try kubectl describe pod for error details.

How do you troubleshoot high latency or 500 errors?

 Check app logs & ingress controller metrics.

 Validate service routing and DNS resolution.

 Use kubectl exec + curl to test internal communication.

 Review HPA configuration and pod resource limits.

What are common Kubernetes troubleshooting issues you’ve faced?

 Pod pending (no resources), CrashLoopBackOff, imagePull errors, DNS failures, misconfigured Ingress, nodes in NotReady state, and Pods left unschedulable by node taints.

What was a recent real-world issue you solved in production?

Example: One service kept restarting due to memory limits. I checked kubectl describe pod, increased
the memory limit from 256Mi to 512Mi, and stabilized the service. Later, tuned autoscaling and resource
requests.

What commands do you commonly use for debugging?

kubectl get pods -A, kubectl describe pod, kubectl logs, kubectl exec, kubectl top nodes/pods, kubectl
get events, and kubectl port-forward.

🧠 9. Helm, Kustomize & Templates

What is Helm and how does it work?

Helm is a package manager for Kubernetes. It bundles YAML manifests into reusable charts for easy
versioning, upgrades, and rollbacks.

How do you install Helm?

 On CLI:

 curl [Link] | bash

 Add repo: helm repo add bitnami [Link]

What is a Helm Chart?


A Helm Chart is a collection of templates + values + metadata that defines how to deploy an app (like a
package).

Structure:

Chart.yaml

values.yaml

templates/

How do you use a single Helm repo for multiple environments?

Use different values files:

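helm install myapp <chart> -f [Link]

helm install myapp <chart> -f [Link]

Here <chart> is the chart path or repository reference, which helm install requires.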

What are key Helm commands you’ve used?

helm install, helm upgrade, helm rollback, helm list, helm status, helm uninstall, helm repo add.

Difference between Helm and Kustomize

| Feature | Helm | Kustomize |
|---------|------|-----------|
| Template engine | Yes (Go templates) | No templates |
| Values handling | values.yaml | Patches/overlays |
| Package reuse | Charts | YAML overlays |
| Complexity | High | Simple |
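
To make the Kustomize column concrete, here is a sketch of a production overlay. The directory layout (`base/` and `overlays/prod/`), the `myapp` names, and the tag are assumptions.

```yaml
# overlays/prod/kustomization.yaml (hypothetical path)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base                # plain manifests shared by all environments
namePrefix: prod-
images:
  - name: myapp               # image name as written in the base manifests
    newTag: "1.4.2"
patches:
  - target:
      kind: Deployment
      name: myapp
    patch: |-
      - op: replace
        path: /spec/replicas
        value: 5
```

Applying it with `kubectl apply -k overlays/prod` renders the base plus these changes, with no template syntax in the manifests themselves.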

How do you handle environment values in Helm?

Use [Link], [Link], etc., and specify with -f flag.


Also use templating for dynamic values like image tags.

Have you written custom Helm charts?

Yes, for microservices with configurable replicas, environment variables, and ingress.
Used _helpers.tpl for label consistency and versioning.

How do you rollback using Helm?

helm rollback <release_name> <revision_number>


Helm keeps release history, so you can roll back to any stored revision.

🧩 10. Advanced Topics & Scenarios

What is Istio and how does it work?

Istio is a service mesh that manages microservice traffic, security, and observability.
It uses a sidecar proxy (Envoy) to intercept all service-to-service traffic.

Difference between Kong and Istio

 Kong → API Gateway for north-south (external) traffic.

 Istio → Service mesh for east-west (internal) traffic.

What are custom resources (CRDs)?

Custom Resource Definitions allow extending Kubernetes API with custom objects — e.g., KafkaTopic,
HelmRelease.

What is a Spinner (AKS auto-scaler)?

"Spinner" is not a standard Kubernetes or AKS term; here it refers to the Cluster Autoscaler, which adds nodes when Pods are pending for lack of capacity and removes nodes that stay underutilized.

How do you detect drift in cluster configuration (manual changes)?

 Use GitOps tools like ArgoCD or Flux to compare live vs Git state.

 Use kubectl diff or terraform plan for infra drift detection.

How do you achieve zero downtime for critical applications?

 Rolling updates with readiness probes.

 PreStop hooks before termination.

 Blue-Green or Canary strategies during release.
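
A preStop hook combined with a readiness probe might be sketched as follows. The 10-second sleep and the 45-second grace period are assumed values; the idea is to give load balancers and endpoints time to stop sending traffic before the process receives SIGTERM.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: graceful-web              # hypothetical name
spec:
  terminationGracePeriodSeconds: 45   # must cover the preStop delay plus app shutdown
  containers:
    - name: web
      image: nginx:1.27
      lifecycle:
        preStop:
          exec:
            # Runs before SIGTERM is sent; delays shutdown while endpoints are removed.
            command: ["sh", "-c", "sleep 10"]
      readinessProbe:
        httpGet:
          path: /
          port: 80
        periodSeconds: 5
```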

How do you ensure microservices communicate securely?

 Use mTLS via Istio or Linkerd.

 Enforce NetworkPolicies.

 Use private DNS & internal load balancers.
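
With Istio, mTLS can be enforced per namespace through a PeerAuthentication resource. The sketch below assumes Istio is installed with sidecar injection enabled and uses a hypothetical `payments` namespace.

```yaml
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: payments       # hypothetical namespace
spec:
  mtls:
    mode: STRICT            # reject plaintext traffic to workloads in this namespace
```

Rolling this out usually starts in `PERMISSIVE` mode, since `STRICT` breaks callers that do not yet have a sidecar.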


How do you handle multi-cluster or hybrid AKS environments?

 Use Azure Arc for Kubernetes or Rancher for multi-cluster management.

 Centralized policy + monitoring via Azure Monitor & Azure Policy.

What are the security recommendations for deploying a banking app in K8s?

✅ Use managed identity for secrets access.


✅ Encrypt at rest & in transit.
✅ Use RBAC + Azure Policy.
✅ Private cluster (no public API).
✅ Enable WAF & TLS everywhere.

What steps do you take before upgrading an AKS cluster?

1. Back up cluster resources and persistent volumes (e.g., with Velero); etcd itself is managed by Azure in AKS.

2. Test upgrade in staging.

3. Drain one node and test workloads.

4. Perform rolling upgrade.

How do you backup AKS before upgrade?

Use Velero or Azure Backup for AKS to back up cluster objects and persistent volumes.

How do you troubleshoot if application is not accessible externally?

 Check Ingress or LoadBalancer status.

 Verify DNS and firewall rules.

 Ensure target port and service selector match.

 Inspect ingress logs and pod readiness.

What are ways to expose services with a single public IP?

 Use Ingress Controller with multiple host/path rules.

 Use Azure Application Gateway Ingress Controller (AGIC) for advanced routing.

What are your cost-optimization strategies in AKS?


 Use Cluster Autoscaler + Spot nodes.

 Right-size pods and nodes.

 Use Azure Reservations for long-term cost efficiency.

 Schedule non-prod clusters to auto-shutdown.

What are real-time production challenges you faced?

Examples:

 Node pool scaling delay due to quota limit.

 Pod restarts due to resource misconfigurations.

 DNS resolution issues in private clusters fixed by CoreDNS patch.
