Microservices, System Design &
Design Patterns
Interview Preparation Guide
Revised & Enhanced for 4+ Years Experience
Covers: Microservices • System Design • HLD • SOLID Principles • Design Patterns
SECTION 1: MICROSERVICES
1.1 Microservice Basics
Q: What is a Microservice?
A Microservice is an architectural style where an application is built as a collection of small,
independently deployable services, each owning its own data and running in its own process. Services
communicate over lightweight protocols like HTTP/REST or messaging queues.
[4yr] How did you break a monolith in your last project? Which bounded context did you identify first? How did you manage
shared libraries?
Q: Monolithic vs Microservice — key differences?
Monolith: single deployable unit, shared DB, simple to develop initially but hard to scale independently.
Microservice: separate deployables, database-per-service, independent scaling and deployment, but
complex distributed coordination.
• Deployment: Monolith = one release train; Microservice = independent CI/CD per service
• Fault isolation: in a monolith, one bug can crash the whole process; with microservices, a failure can be contained to one service
• Tech stack: Monolith = uniform; Microservice = polyglot allowed
[4yr] What challenges did you face during a monolith-to-microservice migration? How did you handle shared database
tables?
Q: Is Microservice an Architectural Pattern?
Yes. Microservice architecture is an architectural style (not just a design pattern). It defines how system
components are organized and interact — it is a higher-order decision than a design pattern, which
operates at code/class level.
Q: Difference between Architectural Pattern and Design Pattern?
Architectural Pattern: high-level structure of the entire system (e.g., Microservices, MVC, Event-Driven).
Design Pattern: reusable solution to a recurring code-level problem (e.g., Singleton, Factory, Observer).
Architectural patterns guide system organization; design patterns guide implementation.
[4yr] With 4 years of experience, you should be able to discuss why you chose a specific architectural pattern for a use case,
not just name it.
Q: What is a Distributed System?
A system where components run on multiple networked computers and communicate by passing
messages. Key challenges: network latency, partial failures, consistency, and ordering of events.
Microservices are inherently distributed systems.
Q: Advantages of Microservices?
• Independent deployability and scalability per service
• Fault isolation: one service failing need not take down the others, provided callers use timeouts and circuit breakers
• Technology flexibility (polyglot persistence and programming)
• Smaller, focused teams (Conway's Law alignment)
• Faster release cycles with CI/CD pipelines
[4yr] Give a real example: 'We scaled only the payment service during sale season without touching other services.'
Q: Disadvantages / Challenges of Microservices?
• Distributed system complexity: network failures, latency, partial failures
• Data consistency across services (no ACID across boundaries)
• Service discovery and inter-service communication overhead
• Operational complexity: requires container orchestration, centralized logging, distributed tracing
• Testing complexity: integration and contract testing is harder
[4yr] Interviewers love: 'We used the Saga pattern to handle distributed transactions. Let me walk you through the flow.'
Q: When to choose Monolithic vs Microservices?
Choose Monolith when: team is small, product is early-stage/startup, domain is not well understood yet.
Choose Microservices when: team is large, bounded contexts are clear, you need independent scaling,
or deployment frequency matters.
Note: 4yr expectation: Don't say 'Microservices are always better.' Show trade-off thinking.
1.2 Load Balancing
Q: What is a Load Balancer and why is it important in Microservices?
A load balancer distributes incoming network traffic across multiple service instances to ensure no
single instance is overwhelmed. It improves availability, throughput, and fault tolerance.
[4yr] 4yr question: 'How did you configure load balancing in your Kubernetes cluster?' or 'Did you use client-side or
server-side LB?'
Q: Types of Load Balancers?
• Layer 4 (Transport Layer): routes based on IP/TCP. Fast but no application awareness.
• Layer 7 (Application Layer): routes based on HTTP headers, URL, cookies. Used in API
Gateways.
• Software LB: NGINX, HAProxy. Hardware LB: F5.
• Client-side LB: Spring Cloud LoadBalancer (the successor to Netflix Ribbon, which is in maintenance mode). The consumer picks the instance.
• Server-side LB: AWS ALB, Kubernetes Service — infrastructure decides.
Q: Load Balancing Algorithms?
• Round Robin: requests distributed evenly in sequence
• Weighted Round Robin: higher-capacity instances get more requests
• Least Connections: route to instance with fewest active connections
• IP Hash / Sticky Sessions: same client always goes to same instance
• Random: random selection (rarely used)
[4yr] Say: 'For stateful sessions we used sticky sessions, but for stateless services round-robin worked best.'
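To make two of the algorithms above concrete, here is a minimal in-memory sketch of Round Robin and Least Connections. The class names `RoundRobinBalancer` and `LeastConnectionsBalancer` are hypothetical and do not come from any library. Real load balancers also handle health checks and instance churn, which this sketch leaves out.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: hands out instances in a fixed rotation.
class RoundRobinBalancer {
    private final List<String> instances;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinBalancer(List<String> instances) {
        this.instances = List.copyOf(instances);
    }

    String next() {
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(counter.getAndIncrement(), instances.size());
        return instances.get(index);
    }
}

// Hypothetical helper: picks the instance with the fewest in-flight calls.
class LeastConnectionsBalancer {
    private final Map<String, AtomicInteger> active = new ConcurrentHashMap<>();

    LeastConnectionsBalancer(List<String> instances) {
        instances.forEach(i -> active.put(i, new AtomicInteger()));
    }

    // Chooses the least-loaded instance and counts the new call against it.
    String acquire() {
        String chosen = active.entrySet().stream()
                .min((a, b) -> Integer.compare(a.getValue().get(), b.getValue().get()))
                .orElseThrow()
                .getKey();
        active.get(chosen).incrementAndGet();
        return chosen;
    }

    // Must be called when the call finishes so the count stays accurate.
    void release(String instance) {
        active.get(instance).decrementAndGet();
    }
}
```

Round Robin needs no feedback from the instances, which is why it suits stateless services with similar request costs. Least Connections needs the caller to report completion, so it pays off mainly when request durations vary widely.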
1.3 Decomposition Design Patterns
Q: What is Domain-Driven Design (DDD) and how does it relate to Microservices?
DDD is an approach to software design where the model reflects the business domain. In
Microservices, DDD's Bounded Context maps naturally to a microservice — each service owns one
bounded context, its own data model, and its own ubiquitous language.
• Bounded Context = a defined boundary within which a model is consistent
• Aggregate = cluster of objects treated as a unit for data changes
• Context Map = relationships between bounded contexts (shared kernel, anti-corruption layer, etc.)
[4yr] 4yr: 'Walk me through how you identified bounded contexts in your system. Did you use Event Storming?'
Q: What is the Decomposition Pattern?
Decompose by Business Capability: each service aligns to a business function (e.g., Order Service,
Inventory Service). Decompose by Subdomain (DDD): based on domain analysis. Both ensure high
cohesion and low coupling between services.
Q: What is the Strangler Fig Pattern?
A migration strategy where new microservices gradually replace parts of the monolith. An API facade
(or Gateway) sits in front, routing calls to either the monolith or the new service. Over time, the monolith
is 'strangled' as services take over functionality.
• Benefits: low risk, incremental migration, rollback possible
• Challenges: facade becomes a bottleneck; dual maintenance cost; data synchronization
[4yr] Real example expected: 'We used Strangler for our billing module. We built the new billing microservice, routed 10%
traffic via feature flag, then gradually increased to 100% over 3 months.'
Q: What is the Sidecar Pattern?
A sidecar is a separate container/process deployed alongside the main service container in the same
pod. It handles cross-cutting concerns like logging, monitoring, service mesh proxy (Envoy), config
reloading. The main service stays focused on business logic.
• Examples: Envoy proxy sidecar in Istio, Fluentd sidecar for log shipping
[4yr] 4yr: 'We used Istio with Envoy sidecars for mutual TLS between services and traffic observability without modifying
application code.'
1.4 Database Design Patterns
Q: What are Database Patterns in Microservices?
• Database Per Service: each microservice owns its DB — enables loose coupling
• Shared Database: multiple services share a DB — simpler but creates coupling (anti-pattern)
• CQRS: separate read and write models
• Event Sourcing: store state changes as events rather than current state
• Saga: manage distributed transactions across services
[4yr] 'We used database-per-service with PostgreSQL for writes and read-replicas for queries in our order service.'
Q: What is a Distributed Transaction and why is it hard?
A distributed transaction spans multiple services/databases and must maintain ACID properties across
all. Traditional 2PC (two-phase commit) works but is blocking and introduces availability risks. In
microservices, we favor eventual consistency via Saga pattern instead.
Q: What is the Saga Pattern?
Saga breaks a distributed transaction into a sequence of local transactions, each with a compensating
transaction for rollback. Two styles:
• Choreography: services publish/subscribe to events — no central coordinator. Simple but hard to
track.
• Orchestration: a central Saga Orchestrator (e.g., a dedicated service or state machine) directs the
flow. Easier to visualize but adds coupling to the orchestrator.
Note: Compensating transactions undo previous steps on failure (e.g., refund payment if inventory fails).
[4yr] 4yr: 'In our order service, we used an orchestrator-based Saga: Order Service coordinates Payment → Inventory →
Shipping. On shipping failure, it triggers compensation — reverting inventory and refunding payment.'
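The orchestration flow can be sketched in a few lines. This is an illustrative in-memory version only: `SagaStep` and `SagaOrchestrator` are hypothetical names, steps are plain `Runnable`s rather than remote calls, and a production orchestrator would persist saga state so it can resume after a crash.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// One saga step: a local transaction plus the compensation that undoes it.
class SagaStep {
    final String name;
    final Runnable action;
    final Runnable compensation;

    SagaStep(String name, Runnable action, Runnable compensation) {
        this.name = name;
        this.action = action;
        this.compensation = compensation;
    }
}

// Hypothetical in-memory orchestrator (assumption: no persistence, no retries).
class SagaOrchestrator {
    private final List<SagaStep> steps = new ArrayList<>();

    SagaOrchestrator step(String name, Runnable action, Runnable compensation) {
        steps.add(new SagaStep(name, action, compensation));
        return this;
    }

    // Returns true if every step committed, false if the saga was compensated.
    boolean execute() {
        Deque<SagaStep> completed = new ArrayDeque<>();
        for (SagaStep step : steps) {
            try {
                step.action.run();
                completed.push(step);
            } catch (RuntimeException failure) {
                // Undo the completed steps in reverse order (last committed first).
                while (!completed.isEmpty()) {
                    completed.pop().compensation.run();
                }
                return false;
            }
        }
        return true;
    }
}
```

With Payment, Inventory, and Shipping registered in that order, a Shipping failure runs the Inventory compensation first and the Payment refund second. The failed step's own compensation is not run, because its local transaction never committed.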
Q: Choreography vs Orchestration — when to choose?
• Choreography: simple flows, few services, want decoupling. Risk: event spaghetti at scale.
• Orchestration: complex flows, multiple services, need central visibility and error handling.
Q: What is Two-Phase Commit (2PC)?
2PC is a distributed protocol with two phases — Prepare (all participants vote yes/no) and
Commit/Abort. The coordinator commits only if all vote yes. Problem: blocking protocol — if coordinator
fails, participants stay locked.
• Saga vs 2PC: Saga uses eventual consistency with compensations; 2PC enforces strict atomicity
but sacrifices availability.
[4yr] When to use 2PC: only when strict ACID is non-negotiable AND all participants support XA (rare in microservices).
Generally avoid.
Q: What is CQRS Pattern?
Command Query Responsibility Segregation separates write operations (Commands) from read
operations (Queries) using different models — often different data stores. Writes go to a normalized
store; reads go to a denormalized, query-optimized projection.
• Command side: handles business logic, strong consistency
• Query side: read-optimized projections, eventual consistency, can use Elasticsearch or Redis
• Event sourcing often paired with CQRS to rebuild read models from event log
[4yr] 'We used CQRS for our product catalog: writes go to PostgreSQL, an event triggers a projection update to
Elasticsearch for fast full-text search queries.'
Q: What is Event Sourcing?
Instead of storing current state, event sourcing stores a log of all state-changing events. The current
state is rebuilt by replaying events. Benefits: full audit log, ability to reconstruct past states, natural fit for
CQRS.
[4yr] 4yr: Know the tradeoffs — event replay time, schema evolution of events, snapshot strategy for performance.
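A small sketch of the replay idea, assuming a bank-account domain with two event types. `AccountEvent` and `EventSourcedAccount` are hypothetical names, and the log is an in-memory list rather than a durable event store.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical event for a bank account; amounts are in cents.
class AccountEvent {
    enum Type { DEPOSITED, WITHDRAWN }

    final Type type;
    final long amountCents;

    AccountEvent(Type type, long amountCents) {
        this.type = type;
        this.amountCents = amountCents;
    }
}

class EventSourcedAccount {
    // The append-only event log is the source of truth; no balance field is stored.
    private final List<AccountEvent> log = new ArrayList<>();

    void deposit(long cents) {
        log.add(new AccountEvent(AccountEvent.Type.DEPOSITED, cents));
    }

    void withdraw(long cents) {
        if (cents > balance()) {
            throw new IllegalArgumentException("insufficient funds");
        }
        log.add(new AccountEvent(AccountEvent.Type.WITHDRAWN, cents));
    }

    // Current state is derived by replaying the whole log.
    long balance() {
        return balanceAfter(log.size());
    }

    // Replays only the first eventCount events, which reconstructs a past state.
    long balanceAfter(int eventCount) {
        long balance = 0;
        for (AccountEvent event : log.subList(0, eventCount)) {
            balance += event.type == AccountEvent.Type.DEPOSITED
                    ? event.amountCents
                    : -event.amountCents;
        }
        return balance;
    }
}
```

Because every read replays the log, cost grows with the number of events. That is the reason for the snapshot strategy mentioned above: store the state at event N and replay only what came after it.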
Q: What is Eventual Consistency?
In a distributed system, after a write, all replicas will eventually converge to the same value (provided no
new writes arrive), but not immediately. Acceptable for many use cases (social feeds, search indexes) but not
for financial transactions needing strong consistency.
Q: Two microservices using the same database — recommended approach?
Refactor to database-per-service. If migration is not yet possible: use an anti-corruption layer (ACL),
introduce an API contract between services, and plan gradual data ownership migration. Never let two
services directly write to shared tables.
1.5 Communication Patterns
Q: What is Synchronous Communication in Microservices?
Caller waits for the response before continuing. Implemented via REST (HTTP/HTTPS), gRPC, or
GraphQL. Suitable for real-time queries where the result is needed immediately.
[4yr] 4yr: 'We used Feign Client with circuit breaker and retry for synchronous REST calls between services.'
Q: What is GraphQL and when to use it?
GraphQL is a query language for APIs that lets clients request exactly the data they need. Reduces
over-fetching and under-fetching. A single GraphQL endpoint replaces multiple REST endpoints.
• Use when: frontend needs flexible queries, multiple client types (mobile/web) need different data
shapes
• Avoid when: simple CRUD APIs, streaming, file uploads, strong HTTP caching is needed
[4yr] 4yr: Know N+1 problem in GraphQL and how DataLoader solves it via batching.
Q: What is gRPC?
gRPC is a high-performance RPC framework using Protocol Buffers (protobuf) for serialization over
HTTP/2. Faster than REST for internal service-to-service calls due to binary protocol, multiplexed
streams, and strong typing via proto contracts.
• Best for: high-throughput internal microservice calls, polyglot environments, streaming
• Not ideal for: browser-to-server (use gRPC-Web), when JSON debugging is important
[4yr] 4yr: 'We replaced REST with gRPC for our real-time location tracking service — 40% latency reduction due to binary
serialization.'
Q: What is Asynchronous Communication?
Caller sends a message and does not wait for response. The producer and consumer are decoupled in
time. Implemented via message brokers: Kafka, RabbitMQ, ActiveMQ. Best for event-driven flows,
background processing, and when loose coupling matters.
Q: Sync vs Async — when to use?
• Sync: user-facing requests needing immediate response (login, payment status check)
• Async: background tasks, notifications, event broadcasting, high-throughput data pipelines
• Async improves resilience: producer continues even if consumer is down
[4yr] 'Order placement was sync (confirm to user), but order processing (inventory, email) was async via Kafka.'
1.6 Message Brokers & Streaming (Kafka)
Q: How does Apache Kafka work and why is it fast?
Kafka is a distributed, partitioned, replicated log. Producers write to topic partitions sequentially
(append-only). Consumers read from partitions using offsets. Speed comes from:
• Sequential disk I/O (append-only log) — HDD sequential writes are fast
• Zero-copy transfer: OS sends data directly from page cache to network socket
• Batch processing: producers and consumers batch messages
• Partitioning: horizontal parallelism
[4yr] 4yr: Know partition key selection, consumer group rebalancing, and at-least-once vs exactly-once semantics.
Q: Explain Kafka Architecture?
• Broker: Kafka server storing log partitions
• Topic: logical channel; divided into partitions for parallelism
• Partition: ordered, append-only log (stored on disk as segment files); unit of parallelism and replication
• Producer: writes messages with a key (determines partition)
• Consumer Group: consumers in a group share partitions — each partition assigned to one
consumer
• ZooKeeper/KRaft: manages broker metadata and leader election
• Offset: position of consumer in partition log — stored in __consumer_offsets topic
[4yr] 4yr: 'We had 12 partitions and 6 consumer instances in a group — 2 partitions per consumer. When we added 3 more
instances, Kafka rebalanced automatically.'
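The two mappings described above (key to partition, partition to consumer) can be illustrated without a Kafka client. This is a simplified sketch under stated assumptions: `PartitionSketch` is a hypothetical class, the hash is a stand-in for the murmur2 hash that Kafka's default partitioner applies to the key bytes, and the assignment is a plain round-robin rather than one of Kafka's real assignors.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PartitionSketch {
    // Same key always maps to the same partition, which preserves per-key ordering.
    // Assumption: Arrays.hashCode stands in for Kafka's murmur2 hash.
    static int partitionFor(String key, int partitionCount) {
        int hash = Arrays.hashCode(key.getBytes(StandardCharsets.UTF_8));
        return Math.floorMod(hash, partitionCount);
    }

    // Spreads partitions over the consumers of one group, one owner per partition.
    static Map<String, List<Integer>> assign(int partitionCount, List<String> consumers) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        consumers.forEach(c -> assignment.put(c, new ArrayList<>()));
        for (int p = 0; p < partitionCount; p++) {
            assignment.get(consumers.get(p % consumers.size())).add(p);
        }
        return assignment;
    }
}
```

With 12 partitions and 6 consumers, each consumer owns 2 partitions. With 9 consumers the split is uneven (three consumers own 2, six own 1), and any consumer beyond the 12th would sit idle, since a partition has at most one owner per group.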
Q: Kafka vs RabbitMQ vs Pub/Sub — when to use?
• Kafka: high-throughput event streaming, log replay, event sourcing, audit logs
• RabbitMQ: task queues, complex routing, priority queues, request-reply patterns
• Google Pub/Sub: managed, serverless, GCP ecosystem integration
[4yr] 'We used Kafka for event streaming (order events) and RabbitMQ for task distribution (email notifications with
retry/DLQ).'
Q: Message Queues vs Message Broker — difference?
Message Queue: point-to-point, one consumer gets the message. Message Broker: intermediary that
routes messages to multiple subscribers using pub/sub or routing rules. Kafka is a log-based broker.
RabbitMQ is a traditional AMQP broker.
1.7 API Gateway
Q: What is API Gateway and why use it?
API Gateway is the single entry point for all client requests in a microservice architecture. It handles
cross-cutting concerns so individual services don't have to.
• Authentication & Authorization
• Rate Limiting & Throttling
• Load Balancing & Routing
• Request/Response transformation
• SSL termination
• Caching & Logging
[4yr] 4yr: 'We implemented Spring Cloud Gateway with custom filters for JWT validation, rate limiting via Redis, and request
logging.'
Q: API Gateway vs Load Balancer — difference?
• Load Balancer: L4/L7 traffic distributor; no business logic
• API Gateway: L7 with full request awareness — auth, routing by URL, aggregation, transformation
Can API Gateway replace Load Balancer? Partially. Gateway handles L7 LB but for raw TCP/UDP or
between services at L4, a dedicated LB is still needed.
Q: How does API Gateway handle millions of requests without being a bottleneck?
• Horizontal scaling: multiple gateway instances behind a cloud LB
• Non-blocking I/O: reactive (Spring WebFlux / Netty) handles many concurrent requests
• Caching: frequently accessed responses cached at gateway
• Async processing: offload heavy tasks to backend asynchronously
[4yr] 'Our gateway ran 10 instances, each handling ~50k req/s via reactive Netty. Rate limiting was distributed via Redis
cluster.'
Q: API Composition / Aggregator Pattern?
API Composition calls multiple downstream services and merges results into one response. The
Aggregator pattern is similar but focuses on parallel calls and result combination. Different from
GraphQL which delegates aggregation to a query resolver.
[4yr] 'Our product page API called Catalog, Inventory, Reviews, and Recommendations in parallel using CompletableFuture,
aggregated at the gateway.'
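A minimal sketch of parallel aggregation with `CompletableFuture`. The `ProductPageAggregator` class and its three suppliers are hypothetical stand-ins for remote service clients. The choice to degrade gracefully on inventory and reviews failures, but not on catalog failures, is an assumption made for illustration.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

class ProductPageAggregator {
    private final Supplier<String> catalogCall;
    private final Supplier<String> inventoryCall;
    private final Supplier<String> reviewsCall;

    ProductPageAggregator(Supplier<String> catalogCall,
                          Supplier<String> inventoryCall,
                          Supplier<String> reviewsCall) {
        this.catalogCall = catalogCall;
        this.inventoryCall = inventoryCall;
        this.reviewsCall = reviewsCall;
    }

    Map<String, String> load() {
        // All three calls start immediately and run on the common pool in parallel.
        CompletableFuture<String> catalog = CompletableFuture.supplyAsync(catalogCall);
        // Optional sections fall back to a placeholder instead of failing the page.
        CompletableFuture<String> inventory = CompletableFuture.supplyAsync(inventoryCall)
                .exceptionally(error -> "unavailable");
        CompletableFuture<String> reviews = CompletableFuture.supplyAsync(reviewsCall)
                .exceptionally(error -> "unavailable");

        // join() rethrows if the catalog call failed, since the page is useless without it.
        CompletableFuture.allOf(catalog, inventory, reviews).join();
        return Map.of(
                "catalog", catalog.join(),
                "inventory", inventory.join(),
                "reviews", reviews.join());
    }
}
```

Total latency is roughly that of the slowest call rather than the sum of all calls. In a real gateway each future would also carry a timeout (for example via `orTimeout`), which this sketch omits.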
1.8 Service Discovery & Registry
Q: What is Service Discovery?
In dynamic environments (Kubernetes, cloud), service instances have ephemeral IPs. Service
Discovery allows services to find each other without hardcoded addresses. Two types:
• Client-Side Discovery: client queries registry (Eureka), selects instance, calls directly. More
control; requires a client-side LB (Ribbon, or Spring Cloud LoadBalancer in current Spring Cloud releases).
• Server-Side Discovery: client calls a router (AWS ALB, K8s Service); router queries registry.
Simpler clients; more infrastructure.
[4yr] 4yr: 'We migrated from Eureka (client-side) to Kubernetes Service DNS (server-side) when we moved to K8s.'
Q: How Eureka Server works internally?
• Services register on startup, send heartbeats every 30 seconds
• Eureka evicts an instance if no heartbeat arrives for 90 seconds (3 missed intervals, the default lease expiration)
• Peer-to-peer replication between Eureka server instances
• Clients cache the registry locally and refresh every 30 seconds
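• Self-preservation mode: if heartbeat renewals drop below 85% of the expected rate (the default threshold), Eureka
stops evicting instances, on the assumption that a network partition is the cause rather than a mass failure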
[4yr] 'We saw false evictions during GC pauses. We tuned Eureka's eviction threshold and added health check endpoints.'
1.9 Deployment Patterns
Q: End-to-End Deployment Steps in Microservices?
• 1. Developer pushes code → triggers CI pipeline (GitHub Actions / Jenkins)
• 2. Build: Maven/Gradle build → Docker image created
• 3. Test: Unit, integration, contract tests (Pact)
• 4. Push image to registry (ECR / DockerHub)
• 5. CD pipeline: deploy to K8s (kubectl apply / Helm chart)
• 6. K8s does rolling update, health checks pass, old pods terminated
• 7. Smoke tests run post-deploy; alerts configured
[4yr] 4yr: 'We used GitOps with ArgoCD — every merge to main auto-synced the K8s cluster from Helm charts in Git.'
Q: Types of Deployment Strategies?
• Rolling Update: replace instances one by one. Zero downtime, simple. Risk: mixed versions
coexist.
• Blue-Green: two identical environments. Switch traffic all at once. Safe rollback; expensive
(double infra).
• Canary: route small % of traffic to new version, ramp up. Gradual risk; requires feature flags /
traffic splitting.
• A/B Testing: route based on user attributes for feature experiments
• Recreate: shut down old, start new. Downtime, only for non-critical.
[4yr] 'We used canary deployments via Istio traffic splitting — 5% → 20% → 100% over 2 days, with automatic rollback on
error spike.'
Q: Docker vs Kubernetes — relationship?
Docker packages applications into container images. Kubernetes orchestrates running many containers
at scale: scheduling, scaling, self-healing, networking, and service discovery. Docker runs single
containers; K8s manages clusters of containers.
• K8s components: API Server, etcd, Scheduler, Controller Manager (control plane, formerly called the master node)
• Worker Node: kubelet, kube-proxy, container runtime
• Pod: smallest unit — one or more containers sharing network/storage
• Deployment, Service, ConfigMap, Ingress are key K8s resources
1.10 Logging, Monitoring & Observability
Q: What is the Observability Design Pattern?
Observability means understanding system internal state from external outputs. Three pillars:
• Logs: structured events per service (ELK Stack: Elasticsearch, Logstash, Kibana)
• Metrics: time-series data (Prometheus + Grafana)
• Traces: end-to-end request flows across services (Zipkin, Jaeger, OpenTelemetry)
[4yr] 4yr: 'We instrumented Spring Boot services with Micrometer → Prometheus → Grafana dashboards. For tracing, Spring
Cloud Sleuth injected trace/span IDs, shipped to Jaeger.'
Q: What is Distributed Tracing?
A technique to track a request as it flows across multiple microservices. Each request gets a unique
Trace ID; each service leg gets a Span ID. Tools: Zipkin, Jaeger, OpenTelemetry. Correlation ID must
be propagated through all service calls and logged.
Q: What is Centralized Logging?
All microservices ship logs to a central system. Correlation ID (request ID) ties logs from different
services for the same request. ELK Stack: Logstash collects, Elasticsearch stores/indexes, Kibana
visualizes.
[4yr] 4yr: 'We used Fluentd sidecar → Elasticsearch. Added correlation ID in Spring via MDC (Mapped Diagnostic Context),
propagated via HTTP headers.'
1.11 Resilience, Fault Tolerance & Circuit Breaker
Q: What is Resilience and how do you achieve it?
Resilience is the ability of a system to handle failures gracefully and recover. Achieved through:
• Circuit Breaker: stop calling a failing service
• Retry with exponential backoff: retry failed calls with increasing delay
• Timeout: don't wait indefinitely for a response
• Bulkhead: isolate resources per service to prevent cascading failures
• Fallback: return cached/default response when service is down
• Rate Limiting: protect services from overload
[4yr] 'We used Resilience4j in Spring Boot for circuit breaker, retry, bulkhead, and time limiter — all configured via
[Link].'
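Retry with exponential backoff, from the list above, can be sketched in plain Java. `RetrySketch` is a hypothetical helper and not the Resilience4j API. It doubles the delay on each attempt up to a cap, and leaves out the random jitter that production setups usually add so that clients do not retry in lockstep.

```java
import java.util.function.Supplier;

class RetrySketch {
    // Delay before the retry that follows attempt number `attempt` (1-based):
    // base * 2^(attempt - 1), capped at maxMillis.
    static long backoffMillis(int attempt, long baseMillis, long maxMillis) {
        long delay = baseMillis << Math.min(attempt - 1, 20); // shift is bounded to avoid overflow
        return Math.min(delay, maxMillis);
    }

    static <T> T callWithRetry(Supplier<T> call, int maxAttempts, long baseMillis, long maxMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxAttempts) {
                    sleep(backoffMillis(attempt, baseMillis, maxMillis));
                }
            }
        }
        throw last; // every attempt failed, so surface the final error
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            throw new IllegalStateException("interrupted while backing off", e);
        }
    }
}
```

Retries are only safe for idempotent operations. Retrying a non-idempotent call such as a payment needs an idempotency key on the server side.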
Q: What is the Circuit Breaker Pattern?
Inspired by electrical circuit breakers. When failures exceed a threshold, the circuit 'opens' and calls are
immediately rejected (fail-fast). After a configured wait, it goes to 'half-open' to test recovery.
• CLOSED: normal operation, calls pass through, failures counted
• OPEN: circuit tripped, calls rejected immediately, no load on failing service
• HALF-OPEN: limited calls allowed to test if service recovered
[4yr] 4yr: 'When our payment gateway had issues, the circuit breaker opened after 50% failure rate in 10s window, served
cached last-known status, alerted PagerDuty.'
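The three states can be sketched as a small state machine. This is a simplified illustration, not Resilience4j: `SimpleCircuitBreaker` is a hypothetical class, it trips on a count of consecutive failures rather than a failure rate over a sliding window, and the current time is passed in as a parameter so the behavior can be checked without sleeping.

```java
import java.util.function.Supplier;

class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long openMillis;
    private State state = State.CLOSED;
    private int consecutiveFailures;
    private long openedAt;

    SimpleCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    // OPEN turns into HALF_OPEN once the configured wait has elapsed.
    synchronized State state(long nowMillis) {
        if (state == State.OPEN && nowMillis - openedAt >= openMillis) {
            state = State.HALF_OPEN;
        }
        return state;
    }

    synchronized <T> T call(Supplier<T> action, Supplier<T> fallback, long nowMillis) {
        if (state(nowMillis) == State.OPEN) {
            return fallback.get(); // fail fast: the failing service is not called at all
        }
        try {
            T result = action.get();
            consecutiveFailures = 0;
            state = State.CLOSED; // a success in HALF_OPEN closes the circuit again
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            // A failed trial call in HALF_OPEN reopens immediately.
            if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
                state = State.OPEN;
                openedAt = nowMillis;
            }
            return fallback.get();
        }
    }
}
```

For simplicity this sketch serializes all calls through one lock and lets every call through in HALF_OPEN. Library implementations avoid the global lock and limit the number of trial calls.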
Q: What is Bulkhead Pattern?
Bulkhead isolates thread pools (or connection pools) per downstream service. If one service is slow, its
thread pool gets exhausted but does not consume threads meant for other services — preventing a
slow service from starving the whole system.
[4yr] 'We configured a Resilience4j bulkhead for the inventory service (maxConcurrentCalls=20 on the semaphore variant; the
thread-pool variant is sized with maxThreadPoolSize instead) so slow responses there couldn't block order service threads.'
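The isolation idea can be shown with the simpler semaphore-based variant, which caps concurrent calls instead of giving the downstream service its own thread pool. `SemaphoreBulkhead` here is a hypothetical class written for illustration, not the Resilience4j type.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

class SemaphoreBulkhead {
    private final Semaphore permits;

    SemaphoreBulkhead(int maxConcurrentCalls) {
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    // Runs the action only if a permit is free; otherwise rejects at once
    // instead of queueing, so callers never pile up behind a slow dependency.
    <T> T call(Supplier<T> action, Supplier<T> rejected) {
        if (!permits.tryAcquire()) {
            return rejected.get();
        }
        try {
            return action.get();
        } finally {
            permits.release(); // always return the permit, even when the action throws
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}
```

One bulkhead instance per downstream dependency is the usual arrangement: if the inventory bulkhead is full, calls to other services still have their own permits.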
Q: Rate Limiting — algorithms?
• Token Bucket: tokens added at fixed rate; each request consumes a token. Allows bursts.
• Leaky Bucket: requests processed at fixed rate; excess queued or dropped. Smooth output.
• Fixed Window Counter: count requests per time window. Simple but edge-case burst.
• Sliding Window Log: precise but memory-intensive.
• Sliding Window Counter: hybrid — efficient and accurate.
[4yr] 'We implemented token bucket rate limiting at API Gateway using Redis atomic operations (Lua script) — allows burst
but caps sustained rate.'
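A single-process sketch of the token bucket algorithm. `TokenBucket` is a hypothetical class; it adds one token per fixed interval and takes the current time as a parameter so the refill logic is easy to verify. A distributed limiter would keep the same two fields (token count and last refill time) in a shared store and update them atomically.

```java
class TokenBucket {
    private final long capacity;
    private final long refillIntervalMillis; // one token is added per interval
    private long tokens;
    private long lastRefillMillis;

    TokenBucket(long capacity, long refillIntervalMillis, long nowMillis) {
        this.capacity = capacity;
        this.refillIntervalMillis = refillIntervalMillis;
        this.tokens = capacity; // start full so an initial burst is allowed
        this.lastRefillMillis = nowMillis;
    }

    synchronized boolean tryConsume(long nowMillis) {
        long newTokens = (nowMillis - lastRefillMillis) / refillIntervalMillis;
        if (newTokens > 0) {
            tokens = Math.min(capacity, tokens + newTokens);
            // Advance by whole intervals only, so partial progress toward the
            // next token is not thrown away.
            lastRefillMillis += newTokens * refillIntervalMillis;
        }
        if (tokens > 0) {
            tokens--;
            return true;
        }
        return false;
    }
}
```

The capacity sets the largest burst and the refill interval sets the sustained rate. A bucket with capacity 2 and a 1000 ms interval admits two back-to-back requests, then one per second.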
1.12 Security in Microservices
Q: Security Challenges in Microservices?
• Service-to-service authentication (mutual TLS, JWT, API keys)
• Enlarged attack surface — each service is an endpoint
• Centralized auth vs per-service auth decision
• Token propagation across service calls
• Secret management (credentials must not be in code)
Q: How do you secure microservices? (OAuth2, JWT, mTLS)
• OAuth2 + JWT: API Gateway validates JWT; services trust gateway or validate themselves
• Mutual TLS (mTLS): both client and server present certificates — used in service mesh (Istio)
• Role-Based Access Control (RBAC): roles in JWT claims; services check scope/role
• Secret Management: HashiCorp Vault, AWS Secrets Manager — never hardcode secrets
• API Gateway as security perimeter: auth/authz at gateway, services trust internal network
[4yr] 4yr: 'Our public APIs went through Kong Gateway for JWT validation. Internal services used Istio mTLS — no app-level
auth code needed.'
1.13 Configuration Management
Q: How do you manage configuration in distributed systems?
Use a Centralized Config Server (Spring Cloud Config) backed by a Git repository. Services fetch
config on startup. For runtime changes, use @RefreshScope with the Spring Boot Actuator refresh endpoint
(POST /actuator/refresh), or Spring Cloud Bus to broadcast the refresh to all instances.
• Profiles: dev / qa / stage / prod configs in separate branches or files
• Secrets: externalize via Vault integration, not in Git
[4yr] 4yr: 'We used Spring Cloud Config with Git backend. Config changes triggered a Kafka event (Spring Cloud Bus), all
service instances refreshed without restart.'
1.14 Caching in Microservices
Q: Local Cache vs Distributed Cache?
• Local Cache: in-process (Caffeine, Guava Cache) — ultra fast, no network. Risk: each instance
has its own cache; updates don't propagate instantly.
• Distributed Cache: Redis, Memcached — shared across all instances. Consistent but adds
network hop.
[4yr] 'We used Caffeine as L1 cache (TTL 30s) and Redis as L2 cache (TTL 5min) for product catalog reads.'
Q: Cache Strategies — Aside, Write-Through, Write-Behind?
• Cache-Aside (Lazy Loading): app checks cache first, on miss fetches from DB and populates
cache. App controls caching. Risk: cache miss storm (thundering herd).
• Write-Through: every write goes to the cache AND the DB synchronously, as one operation. Consistent; write latency added.
• Write-Behind (Write-Back): write to cache first, DB updated asynchronously. Fast writes; risk of
data loss on crash.
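Cache-Aside is the strategy most often asked about, so here is a minimal in-process sketch. `CacheAside` is a hypothetical class and the `loader` function stands in for the database read. Eviction and TTL are left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class CacheAside<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // stands in for the database read

    CacheAside(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        // On a miss, load from the source and populate the cache.
        // ConcurrentHashMap runs the loader at most once per missing key, so
        // concurrent misses for the same key do not all hit the database.
        return cache.computeIfAbsent(key, loader);
    }

    // Called after the underlying record changes, so the next read reloads it.
    void invalidate(K key) {
        cache.remove(key);
    }
}
```

The single-load behavior only protects one JVM. With a shared cache such as Redis, many instances can still miss at the same time, which is the thundering herd risk noted above.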
Q: Cache Invalidation — challenges?
One of the hardest problems in CS. Strategies: TTL (simple, may serve stale), Event-Based Invalidation
(event on update triggers cache eviction), Write-Through (always consistent).
[4yr] 4yr: 'We published a cache-invalidation event on Kafka when product price changed. All services listening evicted that
cache key immediately.'
1.15 Feign Client, Spring Cloud & Service Mesh
Q: What is Feign Client and how does it work internally?
Feign Client (Spring Cloud OpenFeign) is a declarative REST client. You define an interface with @FeignClient and
annotations matching REST endpoints. Spring generates a proxy at runtime. Feign integrates with Eureka (service
discovery), Spring Cloud LoadBalancer (which replaced Ribbon for load balancing), and Resilience4j (circuit breaker).
• Internally: creates a JDK dynamic proxy → builds HTTP request from annotations → sends via
OkHttp/Apache → deserializes response
[4yr] 4yr: 'We configured Feign with interceptor for JWT token propagation, retry with exponential backoff, and Resilience4j
circuit breaker as fallback factory.'
Q: Feign vs RestTemplate vs WebClient?
• Feign: declarative, clean, auto-integrates with service discovery. Best for most cases.
• RestTemplate: imperative, synchronous, in maintenance mode (no new features, but not formally deprecated). Fine for legacy code.
• WebClient: reactive, non-blocking, needed for async/streaming or high concurrency.
[4yr] 'We migrated from RestTemplate to Feign for most services. Used WebClient for our streaming analytics service.'
Q: What is Service Mesh?
A Service Mesh (Istio, Linkerd) adds an infrastructure layer for service-to-service communication
without changing app code. Via sidecar proxies (Envoy), it provides:
• mTLS between all services automatically
• Traffic management (retries, timeouts, circuit breaking at infra level)
• Distributed tracing (span propagation)
• Canary deployments / traffic splitting
[4yr] 4yr: 'We deployed Istio in our K8s cluster. mTLS was automatic, so services needed no service-to-service authentication
code of their own (end-user JWTs were still validated at the edge). Grafana showed service-to-service latency per route.'
Q: Service Mesh vs API Gateway — difference?
API Gateway: north-south traffic (client to services), user-facing concerns — auth, rate limiting, SSL.
Service Mesh: east-west traffic (service to service), infra concerns — mTLS, observability, traffic
management. They are complementary, not alternatives.
SECTION 2: MICROSERVICE SCENARIOS (Interview Scenarios)
Key Scenarios & Expected Answers
Q: Monolith to Microservice Migration
Use Strangler Fig Pattern. Identify bounded contexts via DDD. Extract one service at a time. Use API
Gateway as facade. Sync data via events during migration. Start with lowest-risk service.
Q: Bank Account Transfer Across Two Databases
Use Saga Pattern (orchestration). Step 1: Debit account A (local TX). Step 2: Credit account B (local
TX). Compensation: re-credit A if step 2 fails. Use idempotency keys to prevent double debit on retry.
Q: Authentication Token Refresh in Streaming App
Use refresh token stored in HttpOnly cookie. Access token short-lived (15min). On 401, client calls
/auth/refresh with refresh token. If refresh token expired, force re-login. Rotate refresh tokens on each
use (rolling refresh tokens).
Q: Booking at Same Time (BookMyShow / Hotel)
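Use optimistic locking (version column) or pessimistic locking (row lock) for seat reservation. Alternatively, hold the
seat with a short-lived Redis lock (SET with NX and an expiry) so competing requests fail fast. An idempotency key
prevents double booking on retry. Use a Saga to refund the payment if the seat was confirmed by someone else.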
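Optimistic locking can be illustrated in memory with a compare-and-set. In this sketch `SeatSketch` and `Seat` are hypothetical names, and the `AtomicReference` plays the role of the database row: `book` succeeds only if the row is still the exact snapshot the caller read, which mirrors an UPDATE guarded by `WHERE version = ?`.

```java
import java.util.concurrent.atomic.AtomicReference;

class SeatSketch {
    // Immutable snapshot of the seat row; `version` mimics the version column.
    static final class Seat {
        final String bookedBy;
        final long version;

        Seat(String bookedBy, long version) {
            this.bookedBy = bookedBy;
            this.version = version;
        }
    }

    private final AtomicReference<Seat> row = new AtomicReference<>(new Seat(null, 0));

    Seat read() {
        return row.get();
    }

    // Succeeds only if nobody changed the row since `expected` was read.
    boolean book(String user, Seat expected) {
        if (expected.bookedBy != null) {
            return false; // seat was already taken when the caller read it
        }
        return row.compareAndSet(expected, new Seat(user, expected.version + 1));
    }
}
```

When two users read the same free seat, the first `book` call wins and bumps the version. The second call fails because its snapshot is stale, and the caller can then show "seat no longer available" instead of double booking.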
Q: Zero Downtime Deployment
Blue-Green or Canary deployment. Backward-compatible API changes (never break consumers).
Feature flags to toggle new behavior. DB migrations: additive only (new columns, never remove
immediately).
Q: API is Failing — Debugging Steps
1. Check circuit breaker state. 2. Distributed trace (Jaeger/Zipkin) to find which span is slow/failing. 3.
Check logs with correlation ID. 4. Check downstream service health. 5. Check DB connection pool. 6.
Check memory/CPU metrics in Grafana.
Q: Handling Traffic Spikes
Auto-scaling (K8s HPA on CPU/RPS). Rate limiting at Gateway. Queue requests via Kafka (async
buffer). Circuit breaker to shed load gracefully. CDN for static content. Pre-warm instances before
known events.
Q: Real-Time Fraud Detection
Kafka stream of transactions → Flink/Spark Streaming for pattern detection → fraud score computed →
if score > threshold, publish fraud event → Payment Service subscribes and blocks transaction.
Q: Preventing Double Payment (Idempotency)
Generate idempotency key on client (UUID). Send with payment request. Server stores key in
Redis/DB. If same key received again, return cached result without processing twice.
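The server side of this flow fits in a few lines. `IdempotentPayments` is a hypothetical class, and the in-memory map stands in for the Redis or database table that would hold keys (usually with an expiry) in a real service.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

class IdempotentPayments {
    // Maps idempotency key -> result of the first successful processing.
    private final Map<String, String> resultsByKey = new ConcurrentHashMap<>();

    String process(String idempotencyKey, Supplier<String> charge) {
        // The charge runs at most once per key; a retry with the same key gets
        // the stored result back instead of charging again.
        return resultsByKey.computeIfAbsent(idempotencyKey, key -> charge.get());
    }
}
```

If the charge throws, nothing is stored, so a later retry with the same key is processed normally. Only completed payments are replayed from the store.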
Q: Read-Heavy Service Optimization
Add Redis caching layer. Use read replicas for DB. CQRS: separate read model (Elasticsearch or
Redis). CDN for API responses where appropriate. Pagination + lazy loading.
SECTION 3: SYSTEM DESIGN CONCEPTS
3.1 Fundamentals
Q: What is System Design?
System design is the process of defining architecture, components, interfaces, and data flows to satisfy
specified requirements (functional and non-functional). It involves trade-off decisions around scalability,
availability, consistency, latency, and cost.
Q: HTTP vs HTTPS vs HTTP/2?
• HTTP: plaintext, no encryption. HTTP/1.0 opens a new connection per request; HTTP/1.1 reuses connections (keep-alive) but serves one request at a time per connection
• HTTPS: HTTP over TLS — encrypted, authenticated, integrity-protected
• HTTP/2: multiplexed streams over one TCP connection, header compression (HPACK), binary framing, and server
push (rarely used now; Chrome has removed support). Cuts latency by removing per-request connection setup and request queueing.
[4yr] 4yr: 'We enabled HTTP/2 on our API Gateway — reduced connection overhead by 60% for mobile clients.'
Q: What is TLS/SSL?
TLS (Transport Layer Security) encrypts data in transit. Handshake: client sends hello → server sends
certificate → client verifies against CA → session key negotiated → encrypted channel established.
SSL is the deprecated predecessor.
Q: What happens when you enter [Link]?
• 1. Browser checks local DNS cache, OS cache
• 2. DNS resolver queries root → TLD → authoritative DNS → returns IP
• 3. TCP 3-way handshake with Google's IP
• 4. TLS handshake (for HTTPS)
• 5. HTTP GET request sent
• 6. Request lands on a nearby edge location (chosen earlier by Anycast or GeoDNS); its load balancer
forwards to a backend server
• 7. Response returned, HTML rendered, sub-resources fetched
3.2 Database Design
Q: SQL vs NoSQL — when to choose?
• SQL (PostgreSQL, MySQL): structured data, complex joins, ACID transactions, financial data
• Document DB (MongoDB): flexible schema, nested documents, catalogs, user profiles
• Wide-column store (Cassandra, HBase): wide rows, high write throughput, time-series, IoT
• Key-Value (Redis, DynamoDB): ultra-fast lookups, sessions, caching, leaderboards
• Data Warehouse (Redshift, BigQuery): analytical queries, aggregations, BI reports
[4yr] 'For our ride-hailing app: user profiles in MongoDB, location in Cassandra (high write rate), payments in PostgreSQL
(ACID), cache in Redis.'
Q: What is CAP Theorem?
In a distributed system, you can guarantee at most 2 of 3: Consistency (every read returns latest write),
Availability (every request gets a response), Partition Tolerance (system works despite network
partition). Since P is mandatory in real distributed systems, the real choice during a partition is CP or
AP; CA is only achievable on a single node.
• CP systems: ZooKeeper, HBase, MongoDB (with strong consistency) — bank transactions
• AP systems: Cassandra, CouchDB, DynamoDB — high availability, eventual consistency
[4yr] 'We used Cassandra (AP) for our event store and PostgreSQL (CP) for financial records: different consistency
needs, different databases.'
Q: Database Sharding vs Partitioning?
• Partitioning: splitting a table within a single DB instance (horizontal = row-based, vertical =
column-based)
• Sharding: distributing partitions across multiple DB servers. Horizontal scaling of data.
• Sharding strategies: Range (by date, ID range), Hash (hash of key), Directory (lookup table)
• Challenges: cross-shard joins, rebalancing, hot shards, distributed transactions
[4yr] Know resharding challenges and consistent hashing as a solution to avoid mass key migration.
Q: What is Database Indexing?
An index is a data structure (B-Tree, Hash, GiST) that speeds up reads at the cost of write overhead and
storage. Composite indexes: column order matters; the query must filter on the leading column(s) for the
index to be used efficiently. Covering index: index includes all columns needed by query (index-only scan).
[4yr] 'We identified slow queries with EXPLAIN ANALYZE, added a composite index on (user_id, created_at DESC);
the query went from 8s to 50ms.'
Q: Master-Slave (Primary-Replica) Architecture?
Master (primary) handles writes; replicas handle reads (read scaling). Replication can be synchronous
(strong consistency, higher write latency) or asynchronous (eventual consistency, lower latency, possible
loss of recent writes on failover). On master failure, promote a replica (failover).
[4yr] Know replication lag issues, the read-your-writes problem, and how to handle it (sticky sessions or read from master).
3.3 Communication & APIs
Q: How to design a REST API?
• Use nouns for resources: /orders, /users/{id}/orders
• Use HTTP verbs correctly: GET (read), POST (create), PUT/PATCH (update), DELETE
• Versioning: /v1/orders or via Accept header
• Pagination: cursor-based (for large datasets) or offset-based
• Error responses: standard HTTP codes + structured error body with code/message
• Idempotency: PUT and DELETE must be idempotent; for POST, accept a client-supplied idempotency key
[4yr] 'We used cursor-based pagination for our news feed because offset pagination skips or repeats rows when rows are
inserted between page fetches.'
Q: WebSocket vs Webhook vs Server-Sent Events (SSE)?
• WebSocket: full-duplex, persistent connection, bidirectional. Use for chat, live collaboration,
gaming.
• Webhook: HTTP callback — server POSTs to your URL when event occurs. One-way, event-
driven. Use for payment notifications, GitHub webhooks.
• SSE: server pushes events to client over a long-lived HTTP response. One-way (server to client); the client
sends data back via ordinary HTTP requests. Use for live dashboards, news feed updates.
[4yr] 'We used WebSocket for live chat, SSE for order status updates, and Webhooks for payment provider callbacks.'
Q: What is Polling and its types?
• Short Polling: client polls every N seconds. Simple but wasteful if no updates.
• Long Polling: client holds connection open; server responds when event occurs or timeout.
Reduces unnecessary calls.
• Prefer WebSocket or SSE over polling for real-time requirements.
Q: Proxy vs Reverse Proxy?
• Forward Proxy: sits between client and internet — hides client identity. Used for content filtering,
VPNs.
• Reverse Proxy: sits between client and servers — hides server identity. NGINX as reverse proxy:
load balancing, SSL termination, caching.
3.4 Performance & Scalability
Q: Horizontal vs Vertical Scaling?
• Vertical (Scale Up): add CPU/RAM to existing server. Limit: max hardware; single point of failure.
• Horizontal (Scale Out): add more servers. No theoretical limit; requires stateless services and
load balancer.
[4yr] 'We scaled our API layer horizontally (K8s HPA) but kept the database vertical until we hit limits — then sharded.'
Q: Latency vs Throughput?
Latency: time to complete one request (lower = better). Throughput: number of requests/second (higher
= better). They can conflict: batching increases throughput but also increases latency per item. Optimize
based on SLA: user-facing = optimize latency; batch processing = optimize throughput.
Q: What is Consistent Hashing and where is it used?
Consistent hashing maps keys and nodes onto a hash ring; each key belongs to the first node clockwise
from its position. When a node is added/removed, only the keys between it and its neighbor are
remapped, minimizing data movement. Used in: distributed caches (Memcached clients using Ketama),
load balancers (sticky sessions), database sharding (Cassandra, Dynamo-style stores), DHT (Distributed
Hash Tables like Chord).
• Virtual nodes: each physical node gets multiple positions on the ring for uniform distribution
[4yr] 'Redis Cluster uses a related but different scheme: 16384 fixed hash slots (CRC16 of the key mod 16384). Adding a
node moves only the slots assigned to it, not all keys.'
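A hash ring with virtual nodes can be sketched with a sorted map. This is a minimal illustration, assuming MD5 as the hash function and a "node#i" naming scheme for virtual nodes; both are choices made for this example, and the class name ConsistentHashRing is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

class ConsistentHashRing {
    // Ring position -> physical node name; TreeMap keeps positions sorted.
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int virtualNodes;

    ConsistentHashRing(int virtualNodes) {
        this.virtualNodes = virtualNodes;
    }

    void addNode(String node) {
        for (int i = 0; i < virtualNodes; i++) ring.put(hash(node + "#" + i), node);
    }

    void removeNode(String node) {
        for (int i = 0; i < virtualNodes; i++) ring.remove(hash(node + "#" + i));
    }

    // Walk clockwise: first virtual node at or after the key's position, wrapping around.
    String nodeFor(String key) {
        if (ring.isEmpty()) throw new IllegalStateException("no nodes on the ring");
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(key));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    // First 8 bytes of the MD5 digest as a long (assumed hash; any uniform hash works).
    private static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) h = (h << 8) | (d[i] & 0xff);
            return h;
        } catch (NoSuchAlgorithmException ex) {
            throw new IllegalStateException(ex);
        }
    }
}
```

Removing a node only reassigns the keys that node owned; keys owned by the remaining nodes keep their placement, which is the property that avoids mass key migration.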
Q: What is CDN and how does it work?
Content Delivery Network caches content at edge servers geographically close to users. Request goes
to nearest PoP (Point of Presence). On cache hit: served instantly. On miss: fetched from origin,
cached at edge.
• Pull CDN: origin fetched on first miss (Cloudflare, Akamai)
• Push CDN: you push content to CDN proactively (good for static assets)
[4yr] 'We used CloudFront for static assets and API caching. 80% of GET requests served from edge — reduced API load
significantly.'
Q: Stateless vs Stateful Architecture?
• Stateless: no session stored on server — each request carries all needed context (JWT). Scales
horizontally trivially.
• Stateful: server stores session state — requires sticky sessions or distributed session store
(Redis).
Common patterns for stateless: JWT tokens, externalizing state to Redis, event sourcing.
Q: What is Serverless Architecture?
Functions-as-a-Service (AWS Lambda, Azure Functions). Auto-scales to zero, pay-per-execution. Best
for: event-driven functions, background jobs, infrequent workloads. Not ideal for: long-running
processes, latency-sensitive APIs (cold start), stateful operations.
Q: What is Elasticsearch and Full-Text Search?
Elasticsearch is a distributed search engine built on Apache Lucene. Stores data in inverted index:
maps terms to documents. Supports full-text search, fuzzy matching, relevance scoring, aggregations.
• Full-text search: tokenize text → build inverted index → score via TF-IDF / BM25
• Fuzzy search: Levenshtein distance for typo tolerance
• Geospatial: geo-point type for location queries
[4yr] 'We sync product data from PostgreSQL to Elasticsearch via Debezium CDC → Kafka → Elasticsearch connector.
Product search runs on ES, orders on Postgres.'
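The inverted index idea (terms mapped to the documents that contain them) can be shown with a toy version. This is a minimal sketch with AND-only queries and no scoring; the class InvertedIndex and its tokenizer are hypothetical and far simpler than what Lucene does.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class InvertedIndex {
    // term -> sorted set of document IDs containing that term (the postings list)
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    void add(int docId, String text) {
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        }
    }

    // AND query: documents that contain every term of the query.
    Set<Integer> search(String query) {
        Set<Integer> result = null;
        for (String term : tokenize(query)) {
            Set<Integer> docs = postings.getOrDefault(term, Collections.<Integer>emptySet());
            if (result == null) result = new TreeSet<>(docs);
            else result.retainAll(docs);
        }
        return result == null ? new TreeSet<Integer>() : result;
    }

    // Lowercase and split on anything that is not a letter or digit.
    private static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) terms.add(t);
        }
        return terms;
    }
}
```

A lookup only touches the postings lists of the query terms, which is why search cost depends on the number of matching documents rather than the size of the whole corpus.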
3.5 High-Level Design Approach
Q: How to approach a System Design Interview?
• 1. Clarify requirements: functional (what it does) and non-functional (scale, latency, availability)
• 2. Estimate scale: users, requests/sec, data size, bandwidth
• 3. Define API: endpoints, request/response schema
• 4. High-level architecture: draw components and data flows
• 5. Database design: schema, choice of DB type
• 6. Deep dive: bottleneck areas (cache, LB, async queues)
• 7. Address non-functional: scaling, fault tolerance, consistency
[4yr] Always: mention trade-offs. Say 'We could do X but Y is better here because...' Interviewers reward trade-off thinking.
SECTION 4: HIGH LEVEL DESIGNS (HLD)
URL Shortener (TinyURL)
• Requirements: shorten URL, redirect, analytics. Scale: 100M URLs, 10B redirects/day.
• API: POST /shorten → short code. GET /{code} → 301/302 redirect.
• ID Generation: Base62 encode auto-increment ID (collision-free), or MD5 hash (first 7 chars) with a
collision check and retry.
• DB: URLs in MySQL (write once read many). short_code → original_url mapping.
• Cache: Redis for hot short codes (90% traffic served from cache).
• CDN: Not applicable (redirect must be tracked). Use 302 (not 301) if analytics needed, because browsers
cache a 301 and later clicks never reach the server.
[4yr] Custom aliases, expiry TTL, click analytics via Kafka consumer.
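Base62 encoding of an auto-increment ID, as mentioned in the ID Generation bullet, can be sketched as follows. The alphabet order (digits, uppercase, lowercase) is an assumption for this example; any fixed ordering of the 62 characters works. Seven characters cover 62^7, roughly 3.5 trillion, distinct codes.

```java
class Base62 {
    // Assumed alphabet order: digits, then uppercase, then lowercase.
    private static final String ALPHABET =
            "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    // Converts a non-negative numeric ID into its Base62 short code.
    static String encode(long id) {
        if (id < 0) throw new IllegalArgumentException("id must be non-negative");
        if (id == 0) return "0";
        StringBuilder sb = new StringBuilder();
        while (id > 0) {
            sb.append(ALPHABET.charAt((int) (id % 62)));
            id /= 62;
        }
        return sb.reverse().toString(); // digits were produced least-significant first
    }

    // Converts a short code back to the numeric ID used as the database key.
    static long decode(String code) {
        long id = 0;
        for (char c : code.toCharArray()) {
            int digit = ALPHABET.indexOf(c);
            if (digit < 0) throw new IllegalArgumentException("invalid character: " + c);
            id = id * 62 + digit;
        }
        return id;
    }
}
```

Sequential IDs make codes guessable; if that matters, the ID can be scrambled before encoding, which is outside this sketch.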
Rate Limiter
• Token Bucket algorithm with Redis. Each API key has a bucket in Redis (counter + timestamp).
• Lua script for atomic check-and-decrement (prevents race conditions).
• Config: per-user, per-route, per-IP limits stored in Redis Hash.
• Response headers: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset.
[4yr] Distributed rate limiting across gateway instances via shared Redis cluster. Handle Redis failure: fail-open (allow) or
fail-closed (block).
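The token bucket logic itself is small. Below is a minimal single-process sketch, assuming time is passed in explicitly so the behavior is deterministic; synchronized plays the role the Lua script plays in Redis (refill, check, and decrement as one atomic step). The class name TokenBucket and its parameters are hypothetical.

```java
class TokenBucket {
    private final long capacity;
    private final double refillPerMillis;
    private double tokens;
    private long lastRefillMillis;

    TokenBucket(long capacity, double refillPerSecond, long nowMillis) {
        this.capacity = capacity;
        this.refillPerMillis = refillPerSecond / 1000.0;
        this.tokens = capacity;          // bucket starts full
        this.lastRefillMillis = nowMillis;
    }

    // Refill based on elapsed time, then try to take one token.
    synchronized boolean tryAcquire(long nowMillis) {
        long elapsed = Math.max(0, nowMillis - lastRefillMillis);
        tokens = Math.min(capacity, tokens + elapsed * refillPerMillis);
        lastRefillMillis = Math.max(lastRefillMillis, nowMillis);
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;   // request allowed
        }
        return false;      // request rejected (caller would typically respond with HTTP 429)
    }
}
```

The capacity bounds the burst size while the refill rate bounds the sustained rate, which is the main reason to prefer this over a fixed-window counter.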
Notification System
• Components: Notification Service receives trigger events, fanout logic, channel adapters
(Email/SMS/Push).
• Queue per channel: Kafka topics for Email, SMS, Push with different consumer groups.
• Retry with exponential backoff + Dead Letter Queue for failed notifications.
• User preferences: store notification opt-in/out per channel per type.
• Template engine: message templates with variable substitution.
[4yr] Priority queues (transactional > marketing). Idempotency to prevent duplicate sends. APNs/FCM for push, SES for
email, Twilio for SMS.
Payment System
• Idempotency: unique payment ID per transaction to prevent double charge.
• Saga Orchestration: Payment → Inventory → Fulfillment with compensation.
• Two-phase: reserve funds → capture on fulfillment success.
• Database: PostgreSQL with ACID. Event outbox pattern for reliable event publishing.
• PCI DSS compliance: no raw card data stored; use payment gateway tokenization.
[4yr] Reconciliation jobs to detect discrepancies. Audit trail for every state change. Distributed lock to prevent concurrent
payment on same order.
Hotel / Ticket Booking System
• Inventory: use Redis for real-time seat/room availability (fast reads).
• Reservation: pessimistic lock on seat/room record for short window during checkout.
• Saga: Reserve → Payment → Confirm. On payment failure → release reservation.
• Overbooking prevention: atomic decrement in Redis; DB as source of truth.
[4yr] Waitlist feature with event-driven release notification. Show only N% seats to manage false scarcity UX. Separate read
model for search.
Design Instagram / Social Feed
• Post Service: stores posts in Cassandra (wide rows by user_id + created_at).
• Follow Graph: stored in graph DB (Neo4j) or denormalized in Cassandra.
• Feed Generation: Fan-out on write (push to followers' feed cache on post). For celebrities (10M
followers), use fan-out on read (pull and merge).
• Feed stored in Redis sorted set (score = timestamp) per user.
• Media: images stored in S3, served via CloudFront CDN.
[4yr] Hybrid approach: push for < 1M followers, pull for celebrities. Rank feed by ML score, not just time.
Design Uber / Ride Hailing
• Driver location: drivers send GPS every 5s → Location Service → Redis Geo (geospatial index).
• Rider requests trip → Matching Service finds nearest available drivers within radius (Redis
GEOSEARCH; GEORADIUS is deprecated since Redis 6.2).
• Trip matching: distributed lock prevents two riders matching same driver.
• Payment: Saga — trip end → charge → update driver earnings.
• Surge pricing: demand/supply ratio per geohash cell → dynamic pricing engine.
[4yr] Geohash partitioning of driver locations for efficient range queries. ETA computed via road graph (OSRM). Driver state
machine (AVAILABLE → MATCHED → ON_TRIP).
Chat System (WhatsApp / Messenger)
• WebSocket for persistent connection between client and chat server.
• Message routing: client → WebSocket server → message broker (Kafka) → recipient's server →
WebSocket to recipient.
• Offline messages: stored in Cassandra (user_id + conversation_id + timestamp). Delivered on
reconnect.
• Message ordering: Snowflake ID (time-ordered, globally unique) for consistent ordering.
• Group chat: fanout to all group members via Kafka topic per group.
[4yr] End-to-end encryption: Signal Protocol. Message delivery receipts (sent/delivered/read). Media via S3 pre-signed URL.
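A Snowflake-style ID generator, as named in the message ordering bullet, can be sketched as below. The bit layout (timestamp, 10 worker bits, 12 sequence bits) follows the commonly described Twitter Snowflake layout; the custom epoch value and the class name are assumptions for this example.

```java
class SnowflakeIdGenerator {
    private static final long EPOCH_MILLIS = 1_600_000_000_000L; // arbitrary custom epoch (assumption)
    private static final int WORKER_BITS = 10;
    private static final int SEQUENCE_BITS = 12;
    private static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;

    private final long workerId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    SnowflakeIdGenerator(long workerId) {
        if (workerId < 0 || workerId >= (1L << WORKER_BITS)) {
            throw new IllegalArgumentException("workerId out of range");
        }
        this.workerId = workerId;
    }

    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp) throw new IllegalStateException("clock moved backwards");
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {
                // Sequence exhausted within this millisecond: spin until the next one.
                do {
                    now = System.currentTimeMillis();
                } while (now <= lastTimestamp);
            }
        } else {
            sequence = 0;
        }
        lastTimestamp = now;
        // Layout: [timestamp delta | worker id | per-millisecond sequence]
        return ((now - EPOCH_MILLIS) << (WORKER_BITS + SEQUENCE_BITS))
                | (workerId << SEQUENCE_BITS)
                | sequence;
    }
}
```

IDs from one generator are strictly increasing; across workers they are only roughly time-ordered, because each machine uses its own clock.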
Netflix / OTT Video Streaming
• Upload: video → S3 → transcoding job (FFmpeg) → multiple resolutions (240p → 4K) → CDN.
• Streaming: HLS (HTTP Live Streaming) — segment video into chunks. Client downloads chunks
adaptively based on bandwidth.
• CDN: video chunks cached at edge. 95%+ of traffic served from edge.
• Recommendation: collaborative filtering or ML model on watch history.
• Billing: subscription via Saga (subscribe → payment auth → activate).
[4yr] DASH vs HLS. DRM (Widevine, FairPlay) for content protection. Analytics: viewing duration, quality drops, buffering
events.
Google Search / Autocomplete
• Indexing: crawler → HTML parser → inverted index builder → index shards by term hash.
• Search: query → tokenize → inverted index lookup → ranking (PageRank + TF-IDF + ML) → top
results.
• Autocomplete: Trie data structure on popular search terms. Backed by prefix search in
Elasticsearch.
[4yr] Bloom filter to skip non-existent terms. Distributed Trie sharded by prefix. Real-time trending via time-windowed
counters.
SECTION 5: REAL-WORLD SYSTEM DESIGN SCENARIOS
Q: User clicks 'Pay Now' twice — prevent double payment?
Idempotency key (UUID) sent with payment request. Server stores key in Redis with short TTL. Second
request with same key returns cached response without reprocessing. DB-level unique constraint on
idempotency_key as backup.
Q: How does Flipkart survive Big Billion Days?
Pre-scaling: K8s nodes scaled up before event. CDN caches static pages. Async checkout via Kafka
queue. Inventory held in Redis (fast atomic decrement). Circuit breakers on non-critical services. DB
read replicas for product queries. Flash sale window limits concurrent users via virtual queue (waiting
room).
Q: How Zomato handles millions of orders?
Order Service publishes OrderPlaced event to Kafka. Restaurant, Delivery, Notification services are
Kafka consumers. Saga manages restaurant accept → rider assigned → delivery confirmed. Redis for
restaurant availability and ETA.
Q: How BookMyShow handles simultaneous booking?
Pessimistic lock on seat row during checkout (short TTL, e.g., 10 min hold). Saga: reserve → payment
→ confirm. Idempotency to prevent double booking on retry. Redis atomic operations for inventory
count.
Q: How WhatsApp maintains message order in group?
Server assigns monotonically increasing sequence number per conversation. Clients buffer out-of-order
messages and reorder before display. Lamport timestamps or hybrid logical clocks for distributed
ordering.
Q: How WhatsApp shows 'typing...' indicator?
Client sends typing event via WebSocket to server every 3s while typing. Server fans out to group
members' WebSocket connections. Typing indicator cleared after 5s of no event (TTL).
Q: How Hotstar scales for India vs Pakistan cricket?
Pre-warm auto-scaling. CDN takes 95%+ HLS chunk requests. Adaptive bitrate streaming reduces
bandwidth demand. Viewer count peaks managed via WebSocket load balancers with connection
limits. Chaos testing before major events.
Q: How Uber updates driver location without refresh?
Driver app sends location via WebSocket every 5s. Location Service writes to Redis Geo index. Rider
app receives nearby driver positions via SSE or long polling. Map updated in real time on client.
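Q: How Elasticsearch works in milliseconds?
Inverted index: terms → document list, so a query touches only matching documents. Term statistics for
relevance scoring are stored in the index; BM25 scores are computed at query time for matching
documents only. Shards enable parallel query execution. Index segments are served from the OS
file-system cache. Results from shards are merged and top-K returned.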
Q: How YouTube handles millions of uploads?
Upload → object storage such as S3 (chunked multipart). Transcoding job queue (Kafka) → distributed
FFmpeg workers convert to multiple resolutions. Metadata stored in a horizontally scalable database
(e.g., Spanner or sharded MySQL). CDN caches popular videos. Long-tail content cached on demand.
SECTION 6: SOLID PRINCIPLES
S — Single Responsibility Principle (SRP)
A class should have only one reason to change — only one responsibility. Violation: a class that
handles both business logic AND database operations AND email sending. Fix: separate into
OrderService, OrderRepository, EmailNotifier.
[4yr] 'Our Invoice class was doing PDF generation + emailing + DB save. We split into InvoiceGenerator,
InvoicePersister, InvoiceEmailer, each testable independently.'
O — Open/Closed Principle (OCP)
Software entities should be open for extension, closed for modification. Add new behavior via new
classes/interfaces, not by editing existing code. Use Strategy, Template Method, or Decorator patterns.
Example: adding new payment methods without touching existing PaymentProcessor.
[4yr] 'We used Strategy pattern for discount calculation: adding a FestiveDiscountStrategy never touched existing
code, just registered a new bean.'
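The discount example can be sketched without Spring. This is a minimal illustration, assuming a plain registry map takes the place of bean registration; DiscountStrategy, NoDiscount, DiscountCalculator, and the 20% rate are hypothetical names and values.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

interface DiscountStrategy {
    long apply(long priceInCents);
}

class NoDiscount implements DiscountStrategy {
    public long apply(long priceInCents) {
        return priceInCents;
    }
}

// New behavior arrives as a new class; DiscountCalculator below is never edited.
class FestiveDiscountStrategy implements DiscountStrategy {
    public long apply(long priceInCents) {
        return priceInCents * 80 / 100; // 20% off (illustrative rate)
    }
}

class DiscountCalculator {
    private final Map<String, DiscountStrategy> strategies = new ConcurrentHashMap<>();

    // Stand-in for registering a new Spring bean.
    void register(String name, DiscountStrategy strategy) {
        strategies.put(name, strategy);
    }

    long priceAfterDiscount(String name, long priceInCents) {
        return strategies.getOrDefault(name, new NoDiscount()).apply(priceInCents);
    }
}
```

The calculator is closed for modification (no if-else to extend) and open for extension (register another strategy).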
L — Liskov Substitution Principle (LSP)
Subtypes must be substitutable for their base types without breaking the program. Violation: Square
extends Rectangle but setWidth() and setHeight() break area assumptions. Fix: don't force inheritance
where behavior changes.
[4yr] Know the classic Rectangle/Square violation and how to fix it via composition or interface segregation.
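One common way to resolve the Rectangle/Square problem is to drop the inheritance and make both shapes immutable implementations of a shared interface. This is a sketch of that approach; the Shape interface and the withWidth/withSide methods are hypothetical names for this example.

```java
interface Shape {
    long area();
}

// Immutable: no setWidth()/setHeight() whose contract a subtype could break.
final class Rectangle implements Shape {
    private final long width;
    private final long height;

    Rectangle(long width, long height) {
        this.width = width;
        this.height = height;
    }

    public long area() {
        return width * height;
    }

    // "Changing" a dimension returns a new object instead of mutating.
    Rectangle withWidth(long newWidth) {
        return new Rectangle(newWidth, height);
    }
}

// Square is a sibling of Rectangle, not a subclass, so it never inherits setters it cannot honor.
final class Square implements Shape {
    private final long side;

    Square(long side) {
        this.side = side;
    }

    public long area() {
        return side * side;
    }

    Square withSide(long newSide) {
        return new Square(newSide);
    }
}
```

Code written against Shape works with either type, and nothing that holds a Rectangle can be handed a Square that silently changes both dimensions.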
I — Interface Segregation Principle (ISP)
Clients should not be forced to depend on methods they don't use. Split large interfaces into smaller,
focused ones. Example: don't put print(), fax(), scan() in one Printer interface — devices that only print
are forced to stub fax/scan.
[4yr] 'We split our UserService interface into UserAuthService, UserProfileService, UserPreferenceService, so each
REST controller depends only on what it needs.'
D — Dependency Inversion Principle (DIP)
High-level modules should not depend on low-level modules — both should depend on abstractions.
Inject dependencies via interfaces, not concrete classes. Spring's IoC container is built on DIP.
@Service + @Autowired by interface = DIP in practice.
[4yr] 'Our OrderService depends on NotificationPort interface, not EmailService directly, so we can swap in SmsService or
WhatsAppService without touching OrderService.'
SECTION 7: DESIGN PATTERNS
7.1 Creational Patterns
Q: Singleton Pattern — what, when, how, thread-safety?
Ensures only one instance of a class exists. Use for: shared resource (DB connection pool, config,
logger). Implementation options:
• Eager initialization: instance created at class load. Thread-safe, but always created.
• Lazy initialization with synchronized: thread-safe but slow (synchronized on every call).
• Double-Checked Locking with volatile: take the lock only while the instance is still null, so calls after
initialization skip synchronization.
• Enum Singleton: simplest, thread-safe, serialization-safe. Preferred by Effective Java.
• Bill Pugh (static inner class): class loaded lazily by JVM, thread-safe via class loader.
[4yr] How to break Singleton: Reflection (access private constructor), Serialization (creates new object), Cloning. Fix: throw
exception in constructor if instance exists, use readResolve(), don't implement Cloneable.
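Three of the implementation options listed above, side by side. This is a minimal sketch; the class names are hypothetical and the singletons carry no state.

```java
// Enum singleton: the JVM guarantees one instance, including across serialization.
enum EnumSingleton {
    INSTANCE
}

// Bill Pugh holder idiom: Holder is not loaded until getInstance() is first called.
final class HolderSingleton {
    private HolderSingleton() { }

    private static final class Holder {
        static final HolderSingleton INSTANCE = new HolderSingleton();
    }

    static HolderSingleton getInstance() {
        return Holder.INSTANCE;
    }
}

// Double-checked locking: volatile ensures other threads never see a partially built object.
final class DclSingleton {
    private static volatile DclSingleton instance;

    private DclSingleton() { }

    static DclSingleton getInstance() {
        DclSingleton local = instance;            // one volatile read on the fast path
        if (local == null) {
            synchronized (DclSingleton.class) {
                local = instance;                 // re-check under the lock
                if (local == null) {
                    local = new DclSingleton();
                    instance = local;
                }
            }
        }
        return local;
    }
}
```

In application code managed by Spring, a singleton-scoped bean usually replaces all of these hand-written forms.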
Q: Factory Pattern?
Hides object creation behind a creation method so callers depend on an interface, not on concrete classes.
In the GoF Factory Method form, subclasses decide which class to instantiate. Use when: object creation
logic is complex, creation depends on runtime type. Example:
[Link]('circle') returns Circle instance.
[4yr] 'We used Factory pattern for our payment gateway adapters: [Link](type) returns
StripeGateway, RazorpayGateway, or PayPalGateway.'
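A simple factory for the gateway example might look like this. It is a sketch under assumptions: the factory class name PaymentGatewayFactory, the create method, and the name() method are hypothetical; only the three gateway class names come from the text above.

```java
import java.util.Locale;

interface PaymentGateway {
    String name();
}

class StripeGateway implements PaymentGateway {
    public String name() { return "stripe"; }
}

class RazorpayGateway implements PaymentGateway {
    public String name() { return "razorpay"; }
}

class PayPalGateway implements PaymentGateway {
    public String name() { return "paypal"; }
}

class PaymentGatewayFactory {
    // Callers pass a type string and receive the interface, never a concrete class.
    static PaymentGateway create(String type) {
        switch (type.toLowerCase(Locale.ROOT)) {
            case "stripe":
                return new StripeGateway();
            case "razorpay":
                return new RazorpayGateway();
            case "paypal":
                return new PayPalGateway();
            default:
                throw new IllegalArgumentException("unknown gateway type: " + type);
        }
    }
}
```

The switch is the one place that knows the concrete classes; replacing it with a registry map would also remove the need to edit the factory when a gateway is added.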
Q: Abstract Factory Pattern?
Factory of factories. Creates families of related objects without specifying concrete classes. Use when:
system must be independent of how products are created and composed. Example:
UIComponentFactory for Windows vs Mac — each returns themed Button, Checkbox, TextInput.
Q: Builder Pattern?
Separates complex object construction from its representation. Use when: objects with many optional
parameters (avoid telescoping constructors). Example: StringBuilder, Lombok @Builder,
[Link].
[4yr] 'We used Lombok @Builder for our NotificationRequest — optional fields like attachments, cc, priority without 10
constructor overloads.'
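A hand-written builder shows roughly what a generated one provides. This is a minimal sketch, not Lombok's actual output; the field names (recipient, priority, cc, attachments) are taken loosely from the example above and the "NORMAL" default is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class NotificationRequest {
    final String recipient;
    final String priority;
    final List<String> cc;
    final List<String> attachments;

    private NotificationRequest(Builder b) {
        this.recipient = b.recipient;
        this.priority = b.priority;
        this.cc = Collections.unmodifiableList(new ArrayList<>(b.cc));
        this.attachments = Collections.unmodifiableList(new ArrayList<>(b.attachments));
    }

    static Builder builder(String recipient) {
        return new Builder(recipient);
    }

    static final class Builder {
        private final String recipient;                   // required
        private String priority = "NORMAL";               // optional, with a default
        private final List<String> cc = new ArrayList<>();
        private final List<String> attachments = new ArrayList<>();

        private Builder(String recipient) {
            if (recipient == null || recipient.isEmpty()) {
                throw new IllegalArgumentException("recipient is required");
            }
            this.recipient = recipient;
        }

        Builder priority(String value) { this.priority = value; return this; }
        Builder cc(String address) { this.cc.add(address); return this; }
        Builder attachment(String path) { this.attachments.add(path); return this; }

        NotificationRequest build() { return new NotificationRequest(this); }
    }
}
```

Usage reads as one fluent expression, for example NotificationRequest.builder("a@example.com").priority("HIGH").cc("b@example.com").build(), and the built object is immutable.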
Q: Prototype Pattern?
Create new objects by copying existing ones (cloning). Use when: object creation is expensive and a
similar object already exists. Implement Cloneable or copy constructor. Deep copy vs shallow copy is a
key interview point.
Q: Differences between Creational Patterns?
• Singleton: one instance globally
• Factory: decide which class to instantiate at runtime
• Abstract Factory: decide which family of classes to instantiate
• Builder: construct complex object step-by-step
• Prototype: clone an existing object
7.2 Structural Patterns
Q: Adapter Pattern?
Converts one interface to another expected by the client. Like a plug adapter. Use when: integrating
legacy code or third-party library with different interface. Example: adapting a CSV data reader to work
with a system expecting JSON.
[4yr] 'We used Adapter pattern to wrap legacy payment API response into our domain model without changing the legacy
code.'
Q: Proxy Pattern?
A substitute that controls access to another object. Types: Virtual Proxy (lazy loading), Protection Proxy
(access control), Remote Proxy (remote object), Logging Proxy (AOP-style). Spring AOP uses dynamic
proxies (JDK or CGLIB) for @Transactional, @Cacheable, @Async.
[4yr] 'Spring's @Transactional works via a proxy (JDK dynamic proxy or CGLIB) that wraps your method in
begin/commit/rollback. Know why calling a @Transactional method from within the same class bypasses it: the call goes
to this, not through the proxy.'
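The self-invocation behavior can be reproduced with a plain JDK dynamic proxy, no Spring required. This is an illustrative sketch: OrderOperations, PlainOrderService, and CallRecordingProxy are hypothetical names, and the handler records method names where Spring would start or join a transaction.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.List;

interface OrderOperations {
    void placeOrder();
    void audit();
}

class PlainOrderService implements OrderOperations {
    public void placeOrder() {
        audit(); // self-invocation: dispatched on 'this', so the proxy never sees it
    }

    public void audit() { }
}

class CallRecordingProxy {
    // Wraps target in a JDK dynamic proxy that records every intercepted method name.
    static OrderOperations wrap(OrderOperations target, List<String> intercepted) {
        InvocationHandler handler = (proxy, method, args) -> {
            intercepted.add(method.getName());   // where an interceptor would begin a transaction
            return method.invoke(target, args);  // delegate to the real object
        };
        return (OrderOperations) Proxy.newProxyInstance(
                OrderOperations.class.getClassLoader(),
                new Class<?>[] { OrderOperations.class },
                handler);
    }
}
```

Calling placeOrder() through the proxy records only "placeOrder"; the nested audit() call is invisible to the handler. Typical workarounds are moving the inner method to another bean or calling it through an injected reference to the proxy.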
Q: Decorator Pattern?
Adds behavior to objects dynamically without modifying their class. Wraps the original object. Java I/O
streams are the classic example: BufferedInputStream wraps FileInputStream to add buffering;
GZIPInputStream wraps that to add decompression.
[4yr] 'We used Decorator to add retry and logging to our HTTP client without modifying the client class.'
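Retry and logging decorators around a client can be sketched as follows. SimpleHttpClient is a hypothetical one-method interface invented for this example (not java.net.http.HttpClient), and failures are modeled as unchecked exceptions to keep the sketch short.

```java
import java.util.List;

interface SimpleHttpClient {
    String get(String url);
}

// Decorator 1: retries the wrapped client up to maxAttempts times.
final class RetryingClient implements SimpleHttpClient {
    private final SimpleHttpClient delegate;
    private final int maxAttempts;

    RetryingClient(SimpleHttpClient delegate, int maxAttempts) {
        if (maxAttempts < 1) throw new IllegalArgumentException("maxAttempts must be >= 1");
        this.delegate = delegate;
        this.maxAttempts = maxAttempts;
    }

    public String get(String url) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return delegate.get(url);
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last; // all attempts failed
    }
}

// Decorator 2: records each call before delegating.
final class LoggingClient implements SimpleHttpClient {
    private final SimpleHttpClient delegate;
    private final List<String> log;

    LoggingClient(SimpleHttpClient delegate, List<String> log) {
        this.delegate = delegate;
        this.log = log;
    }

    public String get(String url) {
        log.add("GET " + url);
        return delegate.get(url);
    }
}
```

Because every decorator implements the same interface, they stack in any order: new LoggingClient(new RetryingClient(realClient, 3), log) logs once per logical call, while the reverse order would log every attempt.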
Q: Facade Pattern?
Provides a simplified interface to a complex subsystem. Reduces coupling between client and
subsystem. Example: [Link]() coordinates TV, Amplifier, Bluray, Lights
without exposing individual APIs.
[4yr] 'Our [Link]() coordinates InventoryService, PaymentService, NotificationService — one call for the
client.'
Q: Flyweight Pattern?
Shares common state (intrinsic) among many objects to reduce memory. Extrinsic state passed per-
use. Example: in a text editor, character objects share font/style (intrinsic) and only store position
(extrinsic). Java String Pool is flyweight.
Q: Composite Pattern?
Treats individual objects and compositions uniformly via a common interface. Tree structures: File
System (File and Directory both implement FileSystemNode). Client treats files and folders the same
way.
Q: Bridge Pattern?
Decouples abstraction from implementation so they can vary independently. Instead of one subclass per
remote/device combination (N remotes × M devices classes) → Bridge: RemoteControl has-a Device
interface, TV and Radio implement Device.
Q: Differences between Structural Patterns?
• Adapter: makes incompatible interfaces work together
• Facade: simplifies complex subsystem
• Decorator: adds behavior dynamically
• Proxy: controls access to object
• Bridge: separates abstraction from implementation
• Flyweight: shares state to reduce memory
• Composite: uniform treatment of leaf and composite objects
7.3 Behavioral Patterns
Q: Observer Pattern?
Define one-to-many dependency: when subject state changes, all observers are notified. Use for:
event systems, MVC (Model notifies View), DOM events. Spring ApplicationEvent and @EventListener
are in-process observer implementations; Kafka consumers apply the same idea across processes (pub/sub).
[4yr] 'We used Spring Events (ApplicationEventPublisher) for in-process events like OrderPlaced → InventoryReserved,
AnalyticsUpdated — all in same service.'
Q: Strategy Pattern?
Defines a family of algorithms, encapsulates each, and makes them interchangeable. Eliminates if-else
chains for algorithm selection. Use when: multiple variants of a behavior exist and you want to switch at
runtime.
[4yr] 'PaymentStrategy: StripeStrategy, PayPalStrategy, WalletStrategy — selected at runtime based on user's payment
method. No if-else in PaymentService.'
Q: Command Pattern?
Encapsulates a request as an object. Supports undo/redo, queuing, logging. Use for: undo/redo
operations, transaction scripts, job queues. Example: TextEditor commands (TypeCommand,
DeleteCommand, BoldCommand) all implement [Link]() / undo().
Q: State Pattern?
Allows an object to alter its behavior when its internal state changes. Appears to change class. Use for:
state machines. Example: TrafficLight (Red→Green→Yellow→Red). Order state machine
(PLACED→CONFIRMED→SHIPPED→DELIVERED).
[4yr] 'We modeled the Driver state machine using State pattern: AVAILABLE, MATCHED, ON_TRIP, OFFLINE — each state
handles ride-request events differently.'
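The driver state machine can be sketched with an enum whose constants override only the events they react to. The four state names come from the example above; the event method names (onRideRequest, onPickup, onTripEnd, onGoOffline, onGoOnline) and the allowed transitions are assumptions for illustration.

```java
enum DriverState {
    AVAILABLE {
        @Override DriverState onRideRequest() { return MATCHED; }
        @Override DriverState onGoOffline() { return OFFLINE; }
    },
    MATCHED {
        @Override DriverState onPickup() { return ON_TRIP; }
    },
    ON_TRIP {
        @Override DriverState onTripEnd() { return AVAILABLE; }
    },
    OFFLINE {
        @Override DriverState onGoOnline() { return AVAILABLE; }
    };

    // Default behavior: an event that is not valid in the current state is ignored.
    DriverState onRideRequest() { return this; }
    DriverState onPickup() { return this; }
    DriverState onTripEnd() { return this; }
    DriverState onGoOffline() { return this; }
    DriverState onGoOnline() { return this; }
}
```

Each state decides how to handle an event, so there is no central switch over the current state; a ride request reaching a MATCHED or ON_TRIP driver simply leaves the state unchanged.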
Q: Chain of Responsibility Pattern?
Passes request along a chain of handlers. Each handler decides to process or pass. Use for:
middleware chains, filter chains, approval workflows. Servlet Filter chain in Java, Spring Security filter
chain, logging handlers.
[4yr] 'Our API Gateway implemented a filter chain: AuthFilter → RateLimitFilter → LoggingFilter → RoutingFilter. Each
filter passes to next or short-circuits.'
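A filter chain in this style can be sketched in plain Java. The types GatewayFilter and FilterChain here are hypothetical (they are not the Spring Cloud Gateway interfaces), requests are modeled as a simple map, and responses as status strings.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

interface GatewayFilter {
    String handle(Map<String, String> request, FilterChain chain);
}

final class FilterChain {
    private final List<GatewayFilter> filters;
    private final int index;

    FilterChain(GatewayFilter... filters) {
        this(Arrays.asList(filters), 0);
    }

    private FilterChain(List<GatewayFilter> filters, int index) {
        this.filters = filters;
        this.index = index;
    }

    // Hands the request to the next filter, or routes it when the chain is exhausted.
    String proceed(Map<String, String> request) {
        if (index == filters.size()) return "200 routed to " + request.get("path");
        return filters.get(index).handle(request, new FilterChain(filters, index + 1));
    }
}

class AuthFilter implements GatewayFilter {
    public String handle(Map<String, String> request, FilterChain chain) {
        if (!request.containsKey("token")) return "401 unauthorized"; // short-circuit
        return chain.proceed(request);
    }
}

class RateLimitFilter implements GatewayFilter {
    private final int maxRequests;
    private int seen = 0;

    RateLimitFilter(int maxRequests) {
        this.maxRequests = maxRequests;
    }

    public synchronized String handle(Map<String, String> request, FilterChain chain) {
        if (++seen > maxRequests) return "429 too many requests"; // short-circuit
        return chain.proceed(request);
    }
}
```

Each handler either returns a response itself or passes the request on, so adding or reordering filters only changes how the chain is assembled.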
Q: Template Method Pattern?
Defines skeleton of algorithm in base class, defers some steps to subclasses. Avoids code duplication.
Use when: steps of algorithm are fixed but implementations differ. Spring's JdbcTemplate and
RestTemplate follow the same idea with callbacks instead of subclasses: the template handles
connection/error boilerplate, you provide query logic.
Q: Iterator Pattern?
Provides a way to access elements of a collection sequentially without exposing underlying
representation. Java Iterator, Iterable, for-each loop. Custom iterators for tree traversal, paginated API
results.
SECTION 8: DESIGN PATTERN INTERVIEW QUESTIONS (Scenario-Based)
Q: Design a logging system that supports multiple log levels and outputs (file, console, DB)?
Chain of Responsibility for log level filtering (ERROR handler → WARN handler → INFO). Strategy for
log output (FileOutput, ConsoleOutput, DBOutput). Singleton for Logger instance. Builder for LogEntry
construction.
Q: Design a notification system supporting Email, SMS, Push?
Strategy pattern for notification channel (EmailStrategy, SmsStrategy, PushStrategy). Factory to create
the right strategy. Observer to decouple event source from notification trigger. Decorator to add retry
behavior.
Q: Design a payment processing system?
Strategy for payment method. Factory for gateway selection. Facade for PlaceOrder. State machine for
payment lifecycle (INITIATED → AUTHORIZED → CAPTURED → REFUNDED).
Q: How to implement undo/redo in a text editor?
Command pattern. Each action is a Command object with execute() and undo(). Maintain two stacks:
undo stack and redo stack. On execute: push to undo stack and clear the redo stack. On undo: pop from
undo → undo() → push to redo stack. On redo: pop from redo → execute() → push to undo stack.
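The two-stack approach can be sketched as below. This is a minimal illustration, assuming the document is a StringBuilder and typing is the only command; Command, TypeCommand, and CommandHistory are names chosen for this example. A new command clears the redo stack, which is the usual editor behavior.

```java
import java.util.ArrayDeque;
import java.util.Deque;

interface Command {
    void execute();
    void undo();
}

// Appends text to the document; undo removes exactly what was appended.
class TypeCommand implements Command {
    private final StringBuilder document;
    private final String text;

    TypeCommand(StringBuilder document, String text) {
        this.document = document;
        this.text = text;
    }

    public void execute() {
        document.append(text);
    }

    public void undo() {
        document.setLength(document.length() - text.length());
    }
}

class CommandHistory {
    private final Deque<Command> undoStack = new ArrayDeque<>();
    private final Deque<Command> redoStack = new ArrayDeque<>();

    void execute(Command command) {
        command.execute();
        undoStack.push(command);
        redoStack.clear(); // a fresh action invalidates the redo history
    }

    boolean undo() {
        Command command = undoStack.poll(); // null when there is nothing to undo
        if (command == null) return false;
        command.undo();
        redoStack.push(command);
        return true;
    }

    boolean redo() {
        Command command = redoStack.poll();
        if (command == null) return false;
        command.execute();
        undoStack.push(command);
        return true;
    }
}
```

Delete or formatting commands would follow the same shape, each storing whatever it needs to reverse itself.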
Q: Design a cache with LRU eviction?
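LinkedHashMap with accessOrder=true and removeEldestEntry() overridden (Java built-in LRU). Or:
HashMap + Doubly Linked List for O(1) get and put. Decorator pattern to add cache behavior to existing
service. Thread safety: guard both get and put with one exclusive lock (in access-order mode get()
reorders entries, so a read lock is not enough), e.g., synchronized methods or Collections.synchronizedMap.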
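The LinkedHashMap approach fits in a few lines. This is a minimal sketch; the wrapper class name LruCache is an assumption, and synchronized methods are used for thread safety because get() also modifies the access order.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LruCache<K, V> {
    private final Map<K, V> map;

    LruCache(final int capacity) {
        // accessOrder=true: iteration order runs from least to most recently accessed.
        this.map = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity; // evict the least recently used entry
            }
        };
    }

    // get() moves the entry to the most-recently-used position, so it needs the same lock as put().
    synchronized V get(K key) {
        return map.get(key);
    }

    synchronized void put(K key, V value) {
        map.put(key, value);
    }

    synchronized int size() {
        return map.size();
    }
}
```

A single lock limits throughput under heavy contention; that trade-off is where the HashMap plus doubly linked list design or a dedicated caching library comes in.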
Q: Design a plugin system where new implementations can be added without modifying existing
code?
OCP + Strategy + Factory. Define Plugin interface. Use ServiceLoader (Java SPI) or Spring
@ConditionalOnProperty for plugin registration. Factory instantiates correct plugin. New plugin = new
JAR, no core code change.
SECTION 9: KAFKA DEEP DIVE QUESTIONS
Q: What is consumer group rebalancing and when does it happen?
Rebalancing reassigns partitions to consumers in a group when a consumer joins, leaves, or fails
(missed heartbeat). During rebalancing, consumption pauses (stop-the-world). To minimize: use sticky
partition assignor, increase [Link], use incremental cooperative rebalancing (Kafka 2.4+).
Q: At-least-once vs exactly-once semantics in Kafka?
At-least-once: default. Consumer commits offset after processing. On crash, may reprocess last
messages. Make consumer idempotent to handle duplicates. Exactly-once: idempotent producer +
Kafka transactions + consumers reading with isolation.level=read_committed. Producer writes with a
transactional.id, consumer reads only committed records. Higher latency; the guarantee covers
Kafka-to-Kafka flows, so external side effects still need idempotency.
Q: What is Kafka's retention and compaction?
Retention: delete segments after time ([Link]) or size ([Link]). Log
Compaction: retain only latest value per key (useful for changelog/event sourcing). Compacted topics
keep last record per key indefinitely.
Q: How do you handle poison messages (messages that always fail processing)?
After max retries, move to Dead Letter Topic (DLT). Consumer error handler publishes to DLT. Monitor
DLT, alert ops, and replay/fix manually or via DLT consumer.
Q: How to ensure ordering in Kafka?
Ordering guaranteed within a partition. Use the same key for messages that must be ordered (e.g.,
userId as key → all user events go to same partition → ordered). Cross-partition ordering not
guaranteed — design to avoid it.
Q: Kafka vs database as event store?
Kafka: time-limited retention (default), optimized for streaming, not queryable by arbitrary fields, no
random access by event ID. Event DB (EventStoreDB): permanent, queryable, supports snapshots.
Use Kafka for streaming pipelines; use EventStoreDB for long-term event sourcing.
SECTION 10: KEY TIPS FOR 4-YEAR EXPERIENCE INTERVIEWS
At 4 years, interviewers expect you to go beyond definitions and demonstrate real project
decisions, trade-offs, and war stories from production.
1. Always frame answers with STAR: Situation → Task → Action → Result. Attach numbers
('reduced latency by 40%', '10k TPS').
2. Know the WHY behind every pattern choice: 'We chose Saga over 2PC because our services
couldn't support XA transactions and we needed availability over strict consistency.'
3. Be ready for follow-up questions: 'How would you handle a failure in step 3 of your Saga?' or
'What if Redis goes down?'
4. Understand trade-offs: 'CQRS adds operational complexity — we only used it where query
patterns differed significantly from write models.'
5. Know your tools deeply: Spring Cloud Gateway filters, Kafka consumer group rebalancing,
Resilience4j configuration, K8s HPA triggers.
6. Practice system design out loud. Draw diagrams. Name components. State assumptions.
7. Show ownership: 'I led the migration of X service from REST to Kafka' shows you can drive
changes, not just implement assigned tasks.
8. Discuss failures too: 'We had a cascading failure because our circuit breaker threshold was too
high — we tuned it and added bulkhead isolation.'
9. Know Spring Boot internals: @Transactional proxy, how @Cacheable works, bean scopes,
Spring Security filter chain.
10. For HLD: Start with functional requirements, estimate scale (users, RPS, storage), then draw
architecture. Always mention bottlenecks and how you'd address them.
Best of luck in your interviews! You've got this.