Gateway Routing & Gateway Offloading
Use the gateway as a smart router (path-based and header-based routing) and offload cross-cutting concerns such as auth, rate limiting, SSL termination, and compression.
Gateway Routing
Gateway Routing treats the gateway as an intelligent reverse proxy that dispatches requests to the correct backend based on request attributes. Rather than clients knowing each service's address, a single gateway endpoint handles all traffic and routes it based on configurable rules. This is sometimes called the Gateway as a Router pattern to distinguish it from aggregation.
Routing Strategies
| Strategy | Route Decision Based On | Example |
|---|---|---|
| Path-based routing | URL path prefix or exact match | `/api/users/*` → User Service, `/api/orders/*` → Order Service |
| Header-based routing | HTTP request header value | `X-API-Version: v2` → new service version |
| Host-based routing | HTTP `Host` header / subdomain | `mobile.api.example.com` → Mobile BFF |
| Method-based routing | HTTP method (GET, POST, DELETE) | GET `/items` → read replica, POST → write primary |
| Query param routing | URL query parameter value | `?region=eu` → EU cluster |
| Weighted routing | Percentage split | 10% → canary service, 90% → stable service |
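Several of these strategies map directly onto gateway route rules. As an illustrative sketch in Kong's declarative config format (the service names, hosts, and upstream URLs below are hypothetical):

```yaml
# Hypothetical Kong declarative config illustrating three routing strategies.
_format_version: "3.0"
services:
  - name: user-service
    url: http://users:8080
    routes:
      - name: users-by-path            # path-based routing
        paths: ["/api/users"]
  - name: user-service-v2
    url: http://users-v2:8080
    routes:
      - name: users-v2-by-header       # header-based routing
        paths: ["/api/users"]
        headers:
          X-API-Version: ["v2"]        # only v2 callers reach the new version
  - name: mobile-bff
    url: http://mobile-bff:8080
    routes:
      - name: mobile-by-host           # host-based routing
        hosts: ["mobile.api.example.com"]
```

Note that the two `/api/users` routes overlap on path; the header match on the v2 route is what disambiguates them, with the header-constrained route taking priority.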
Gateway Offloading
Gateway Offloading moves cross-cutting concerns that would otherwise be duplicated across every microservice into the gateway. Each service team no longer needs to implement authentication middleware, TLS configuration, compression, or distributed tracing individually. This is DRY (Don't Repeat Yourself) applied at the infrastructure level.
- SSL/TLS Termination — Decrypt HTTPS at the gateway; services communicate over HTTP internally on a trusted network.
- Authentication & Authorization — Validate JWTs or API keys centrally; inject `X-User-Id` headers for services.
- Rate Limiting — Enforce per-API-key, per-IP, or per-endpoint quotas at the edge.
- Request/Response Compression — Gzip or Brotli compress responses without each service implementing it.
- Distributed Tracing — Inject `X-Trace-Id` headers for correlation across services (OpenTelemetry, Zipkin).
- Caching — Cache `GET` responses with appropriate TTLs to reduce backend load.
- CORS Headers — Add `Access-Control-Allow-Origin` headers centrally instead of in every service.
- IP Allow/Deny Lists — Block or allow traffic by IP range before requests reach any service.
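Many of these concerns correspond to off-the-shelf gateway plugins. A hedged sketch using Kong's declarative config, with plugins applied globally (the limits, origins, and CIDR ranges are placeholder values, not recommendations):

```yaml
# Hypothetical Kong declarative config: cross-cutting concerns as global plugins.
_format_version: "3.0"
plugins:
  - name: rate-limiting          # quotas enforced at the edge
    config:
      minute: 100                # placeholder: 100 requests/minute
      policy: local
  - name: cors                   # centralized CORS headers
    config:
      origins: ["https://app.example.com"]
  - name: ip-restriction         # IP allow/deny lists before any service is hit
    config:
      deny: ["10.20.0.0/16"]
  - name: jwt                    # token validation before requests reach services
```

Because these plugins are declared once at the gateway, no individual service needs its own rate limiter, CORS middleware, or JWT validation code, which is exactly the DRY argument above.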
Zero-Trust vs Perimeter Security
Offloading auth to the gateway is a perimeter security model. In a zero-trust architecture, services also validate tokens internally, even after the gateway has checked them. For high-security systems, layer both: the gateway validates that the token is genuine, and individual services validate permissions for their specific resources.
Canary Deployments via Weighted Routing
Gateway Routing enables canary releases: deploying a new service version and gradually shifting a percentage of traffic to it. At 1% traffic, monitor error rates and latency. If metrics are healthy, shift to 10%, then 50%, then 100%. The gateway configuration (no code change) controls the split. This is how AWS CodeDeploy, Argo Rollouts, and Istio implement canary deployments in practice.
The split is expressed in gateway config. In Kong, weighted routing is done with an upstream whose targets carry weights; the load balancer then distributes traffic proportionally:

```yaml
# Kong declarative config: weighted canary routing for the Order service.
# The upstream's target weights split traffic 90% stable / 10% canary.
_format_version: "3.0"
upstreams:
  - name: order-service-upstream
    targets:
      - target: orders-v1:8080       # stable version
        weight: 90
      - target: orders-v2:8080       # canary version
        weight: 10
services:
  - name: order-service
    host: order-service-upstream     # resolved through the weighted upstream
    routes:
      - name: orders
        paths: ["/api/orders"]
```

Advancing the canary is just an edit to the two `weight` values, with no code change or redeploy.
Interview Tip
If an interviewer asks 'How do you deploy a new version of a service with zero downtime?', mention gateway-based weighted routing (canary deployments) as the mechanism. Then show you understand the trade-offs: the gateway must support weighted routing (not all gateways do by default), and you need observability to know when to advance the canary percentage. Tools like Istio, Envoy, and AWS ALB weighted target groups support this natively.