
Serverless Architecture

Functions-as-a-Service and managed services: AWS Lambda, cold starts, execution limits, cost models, and when serverless is (and isn't) the answer.


What Is Serverless?

Serverless is a cloud execution model where the cloud provider dynamically manages server infrastructure. Developers deploy functions or application logic; the provider handles provisioning, scaling, patching, and billing — charging only for actual compute time consumed, not for idle servers. The term 'serverless' is a misnomer — there are still servers, but they are fully abstracted away from the developer.

Serverless encompasses two related concepts: Functions-as-a-Service (FaaS) — event-triggered, short-lived compute functions (AWS Lambda, Google Cloud Functions, Azure Functions) — and Backend-as-a-Service (BaaS) — managed services that replace entire backend components (Auth0 for authentication, Firebase for real-time databases, AWS DynamoDB for managed data storage).

Serverless pattern: API Gateway routes to Lambda functions; managed services handle data and messaging
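To make the FaaS side concrete, here is a minimal sketch of a Lambda handler behind an API Gateway proxy integration. The event shape follows API Gateway's proxy format; the greeting logic and parameter name are illustrative only:

```python
import json

def handler(event, context):
    """Minimal Lambda handler for an API Gateway proxy integration (sketch).

    API Gateway delivers the HTTP request as a JSON event; the returned dict
    is mapped back to an HTTP response by the gateway.
    """
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"hello, {name}"}),
    }
```

Locally you can invoke it with a dict that mimics the event; in production the provider supplies `event` and `context`.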

Key Characteristics

| Characteristic | Detail |
| --- | --- |
| No server management | Provider handles OS, runtime updates, security patches, capacity planning |
| Pay-per-execution | Billed per invocation and per GB-second of compute (not idle time) |
| Auto-scaling | Scales from 0 to thousands of concurrent executions automatically |
| Event-triggered | Functions are invoked by events: HTTP requests, queue messages, file uploads, timer schedules |
| Stateless | Each invocation is independent; state must be stored externally (DynamoDB, Redis, S3) |
| Execution limits | AWS Lambda: 15 min max, 10 GB memory, 512 MB–10 GB ephemeral storage |
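The stateless row is worth emphasizing: anything that must survive between invocations has to live in an external store. The sketch below fakes that store with an in-memory class so it runs anywhere; a real function would use a DynamoDB or Redis client instead (the `ExternalStore` name and interface are invented for illustration):

```python
class ExternalStore:
    """In-memory stand-in for an external store such as DynamoDB or Redis."""
    def __init__(self):
        self._data = {}

    def incr(self, key: str) -> int:
        self._data[key] = self._data.get(key, 0) + 1
        return self._data[key]

store = ExternalStore()  # in production: a client for a real external service

def handler(event, context):
    # Each invocation is independent, so the per-user request counter is kept
    # in the external store rather than in function-local or instance state.
    count = store.incr(event["user_id"])
    return {"statusCode": 200, "count": count}
```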

Cold Starts

The most significant performance characteristic of FaaS is the cold start: when a function has not been invoked recently, the provider must spin up a new execution environment (download the function package, initialize the runtime, execute any global/module-level code). This adds latency ranging from ~100ms (for Node.js/Python) to several seconds (for JVM-based runtimes like Java or Scala).

ℹ️

Cold Start Mitigation Strategies

1. Use 'warm-up' scheduled pings that invoke the function every few minutes.
2. Enable AWS Lambda Provisioned Concurrency (keeps N instances warm; costs money even at idle).
3. Choose a lightweight runtime (Node.js and Python start faster than Java/Kotlin).
4. Minimize function package size and avoid heavy initialization in the global scope.
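Point 4 can be made concrete with lazy, cached initialization: module-level code re-runs on every cold start, so deferring expensive setup until first use keeps cold starts short, while warm invocations still reuse the cached object. All names here are illustrative:

```python
import time

_heavy_resource = None  # module-level cache; survives across warm invocations

def _load_heavy_resource():
    # Stand-in for expensive setup: a DB connection pool, ML model, SDK client.
    time.sleep(0.01)
    return {"ready": True}

def get_resource():
    """Initialize lazily so only the first invocation pays the setup cost."""
    global _heavy_resource
    if _heavy_resource is None:
        _heavy_resource = _load_heavy_resource()
    return _heavy_resource

def handler(event, context):
    resource = get_resource()  # cached after the first call
    return {"statusCode": 200, "ready": resource["ready"]}
```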

When Serverless Excels

  • Spiky or unpredictable traffic: A function that handles 0 requests at 3 AM and 50,000 requests during a flash sale costs nothing at 3 AM.
  • Event-driven pipelines: Image resizing on S3 upload, log processing from CloudWatch, nightly data exports — triggered, short-lived, perfectly suited.
  • Rapid prototyping and MVPs: No infrastructure to set up; focus on business logic.
  • Microservices with infrequent invocations: Functions that are called rarely don't need a persistent server consuming resources.
  • Webhook handlers: Third-party webhook endpoints (Stripe payment events, GitHub webhooks) are an ideal Lambda use case.
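A webhook handler mostly amounts to "verify the signature, then process or enqueue". The sketch below uses a generic HMAC-SHA256 scheme with an invented `x-signature` header; real providers such as Stripe and GitHub each define their own header name and signing format, so consult their documentation:

```python
import hashlib
import hmac
import json

SECRET = b"whsec_example"  # placeholder; load the real secret from configuration

def verify_signature(payload: bytes, signature_hex: str) -> bool:
    """Constant-time check of an HMAC-SHA256 signature over the raw body."""
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)

def handler(event, context):
    body = event["body"].encode()
    signature = (event.get("headers") or {}).get("x-signature", "")
    if not verify_signature(body, signature):
        return {"statusCode": 400, "body": "invalid signature"}
    payload = json.loads(body)
    # Real code would enqueue the event or update state here.
    return {"statusCode": 200,
            "body": json.dumps({"received": payload.get("type")})}
```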

When Serverless Is a Poor Fit

  • Long-running processes: Lambda's 15-minute limit makes it unsuitable for video encoding, ML training, or ETL jobs that run for hours.
  • Latency-sensitive real-time systems: Cold starts introduce unpredictable latency spikes unacceptable for sub-100ms SLA requirements.
  • High-throughput, steady-state workloads: At sustained high request rates, per-invocation billing exceeds the cost of a dedicated EC2 instance.
  • Stateful protocols: WebSockets and long-lived connections are awkward — you need AWS API Gateway WebSocket support or a separate WebSocket server.
  • Vendor lock-in concerns: Lambda functions tied to AWS-specific triggers (S3 events, DynamoDB streams) are difficult to port to another cloud.

Cost Model Comparison

📌

Cost Break-Even Example

AWS Lambda costs approximately $0.20 per 1M invocations plus $0.0000166667 per GB-second. A 512 MB function running for 200 ms consumes 0.1 GB-seconds, about $0.0000016667 per invocation. At 10 million invocations/month, that's about $18.67 ($16.67 compute + $2.00 request fees). A t3.medium EC2 instance ($0.0416/hr) costs about $30/month running continuously. For 10M requests/month, serverless is cheaper — but at 500M requests/month, a dedicated fleet is significantly cheaper.
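The arithmetic above can be reproduced with a small helper. The rates are hardcoded from the example; always check current AWS pricing before relying on them:

```python
LAMBDA_PER_MILLION_REQUESTS = 0.20   # USD, rate from the example above
LAMBDA_PER_GB_SECOND = 0.0000166667  # USD, rate from the example above

def lambda_monthly_cost(invocations: int, memory_gb: float,
                        duration_s: float) -> float:
    """Approximate monthly Lambda bill: request fees plus GB-second compute."""
    request_cost = invocations / 1_000_000 * LAMBDA_PER_MILLION_REQUESTS
    compute_cost = invocations * memory_gb * duration_s * LAMBDA_PER_GB_SECOND
    return request_cost + compute_cost

# 512 MB function running 200 ms, 10M invocations/month:
print(round(lambda_monthly_cost(10_000_000, 0.5, 0.2), 2))  # ≈ 18.67
```

Plugging in 500M invocations/month shows the crossover: the Lambda bill climbs well past the cost of an always-on instance.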

💡

Interview Tip

In system design interviews, mention serverless when the problem involves: event-driven triggers (file uploads, queue messages), spiky traffic, or a desire to minimize operational overhead. Always follow up by addressing its limitations: cold starts for latency-sensitive paths, execution time limits for long-running work, and cost at high sustained throughput. This trade-off awareness is what separates strong candidates.

📝

Knowledge Check

4 questions

Test your understanding of this lesson. Score 70% or higher to complete.
