Serverless API Design with Lambda and DynamoDB

Rutagon builds serverless APIs that handle production traffic without provisioned servers, without capacity planning guesswork, and without the operational overhead of managing container orchestration for straightforward API workloads. Our serverless API design with Lambda powers core backend services — including our production SaaS platform where 25+ AWS services work together to serve iOS, Android, and web clients through 24 Lambda functions backed by Aurora Serverless v2 (PostgreSQL).

This is not a tutorial on getting started with Lambda. This is how we design serverless APIs that perform at scale: single-table DynamoDB patterns that eliminate joins, Lambda function architectures that minimize cold starts, API Gateway configurations that enforce security at the edge, and cost optimization strategies that keep serverless economical as traffic grows.

Single-Table DynamoDB Design

DynamoDB is not a relational database, and treating it like one — one table per entity, scan-heavy queries, filter expressions as a substitute for joins — leads to expensive, slow APIs.

Rutagon uses single-table design. All entities for a given service share one table, with carefully designed partition keys and sort keys that support every access pattern the API requires.

# Entity key patterns for a property management API
KEY_PATTERNS = {
    "property":         {"pk": "PROP#{property_id}",    "sk": "METADATA"},
    "property_images":  {"pk": "PROP#{property_id}",    "sk": "IMG#{image_id}"},
    "property_history": {"pk": "PROP#{property_id}",    "sk": "HIST#{timestamp}"},
    "user":             {"pk": "USER#{user_id}",        "sk": "PROFILE"},
    "user_properties":  {"pk": "USER#{user_id}",        "sk": "PROP#{property_id}"},
    "listing_by_zip":   {"pk": "ZIP#{zip_code}",        "sk": "PROP#{property_id}"},
}
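
As a minimal sketch of how templates like these might be filled in with concrete identifiers, each pattern can be passed through `str.format`. The `build_key` helper is hypothetical and shown only for illustration; it is not part of the service code.

```python
# Subset of the key templates above, repeated so the sketch is self-contained.
KEY_PATTERNS = {
    "property":        {"pk": "PROP#{property_id}", "sk": "METADATA"},
    "property_images": {"pk": "PROP#{property_id}", "sk": "IMG#{image_id}"},
    "user_properties": {"pk": "USER#{user_id}",     "sk": "PROP#{property_id}"},
}

def build_key(entity: str, **ids: str) -> dict:
    """Hypothetical helper: return the pk/sk pair for an entity.

    Raises KeyError if the entity is unknown or a required id is missing.
    """
    pattern = KEY_PATTERNS[entity]
    return {name: template.format(**ids) for name, template in pattern.items()}

image_key = build_key("property_images", property_id="123", image_id="a1")
```

Centralizing key construction in one place keeps the prefixes consistent between writers and readers of the table.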

The partition key (pk) groups related data for efficient queries. The sort key (sk) enables range queries within a partition. To retrieve a property and all its images in a single query:

import boto3
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("property-service")

def get_property_with_images(property_id: str) -> dict:
    # A key condition takes one partition-key equality plus at most one
    # sort-key condition, and conditions cannot be joined with OR, so this
    # reads the property's whole item collection and splits it in code.
    # In practice, we use two queries or a batch; shown simplified.
    # Results over 1 MB are paginated via LastEvaluatedKey (not handled here).
    response = table.query(
        KeyConditionExpression=Key("pk").eq(f"PROP#{property_id}"),
    )

    items = response["Items"]
    metadata = next((i for i in items if i["sk"] == "METADATA"), None)
    images = [i for i in items if i["sk"].startswith("IMG#")]

    return {"property": metadata, "images": images}

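Lookups such as all properties for a given user, or all listings in a zip code, are served directly from the base table, because the USER# and ZIP# partitions already hold one item per property. A Global Secondary Index (GSI) with sk as partition key and pk as sort key covers the inverse direction, for example finding every user or zip-code partition that references a given property, without a second table or a scan. The zip-code lookup against the base table looks like this: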

def get_listings_by_zip(zip_code: str, limit: int = 20) -> list:
    response = table.query(
        KeyConditionExpression=Key("pk").eq(f"ZIP#{zip_code}"),
        ScanIndexForward=False,
        Limit=limit,
    )
    return response["Items"]
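
A query against the inverted GSI described above might look like the following sketch. The index name `inverted-index` and the `inverted_lookup_kwargs` helper are assumptions made for illustration. The function only builds the arguments for `table.query`, using DynamoDB's string expression syntax, so it can be inspected without an AWS connection.

```python
def inverted_lookup_kwargs(property_id: str, pk_prefix: str = "USER#") -> dict:
    """Build query arguments for items whose sort key references a property.

    Assumes a GSI named "inverted-index" (hypothetical) with sk as its
    partition key and pk as its sort key.
    """
    return {
        "IndexName": "inverted-index",
        # Attribute-name placeholders avoid any clash with reserved words.
        "KeyConditionExpression": "#sk = :sk AND begins_with(#pk, :prefix)",
        "ExpressionAttributeNames": {"#sk": "sk", "#pk": "pk"},
        "ExpressionAttributeValues": {
            ":sk": f"PROP#{property_id}",
            ":prefix": pk_prefix,
        },
    }

# Usage (requires AWS credentials and the table object defined earlier):
# owners = table.query(**inverted_lookup_kwargs("123"))["Items"]
```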

This single-table approach means the property service reads from one table with predictable, consistent latency regardless of data volume. No joins, no cross-table transactions for read operations, no scan-and-filter anti-patterns.

Lambda Function Architecture

Lambda function design directly impacts cold start time, maintainability, and cost. We follow several principles:

One function per route group, not one per endpoint. A single Lambda handles all /properties/* routes, dispatched internally. This reduces the number of cold start targets while keeping functions focused.

import json
import logging
from typing import Any

logger = logging.getLogger(__name__)

class ValidationError(Exception):
    """Raised by route handlers when the request payload is invalid."""

def handler(event: dict, context: Any) -> dict:
    http_method = event["httpMethod"]
    path = event["resource"]  # route template, e.g. "/properties/{id}"

    # Route functions (list_properties, etc.) are defined elsewhere in the module
    routes = {
        ("GET", "/properties"): list_properties,
        ("GET", "/properties/{id}"): get_property,
        ("POST", "/properties"): create_property,
        ("PUT", "/properties/{id}"): update_property,
        ("DELETE", "/properties/{id}"): delete_property,
    }

    route_handler = routes.get((http_method, path))
    if not route_handler:
        return response(404, {"error": "Not found"})

    try:
        return route_handler(event)
    except ValidationError as e:
        return response(400, {"error": str(e)})
    except PermissionError:
        return response(403, {"error": "Forbidden"})
    except Exception:
        # Log the full traceback; return a generic message to the client
        logger.exception("Unhandled error")
        return response(500, {"error": "Internal server error"})

def response(status_code: int, body: dict) -> dict:
    return {
        "statusCode": status_code,
        "headers": {
            "Content-Type": "application/json",
            "Access-Control-Allow-Origin": "*",
        },
        "body": json.dumps(body),
    }

Initialize outside the handler. DynamoDB clients, configuration, and shared resources are initialized at module level so they persist across warm invocations.

import boto3

# Initialized once per container, reused across invocations
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("property-service")

Keep dependencies minimal. Every megabyte of deployment package adds to cold start time. We audit dependencies aggressively — the AWS SDK is available in the Lambda runtime, so we do not bundle it. Utility libraries are evaluated by size before inclusion.

API Gateway Patterns

API Gateway sits in front of Lambda functions and handles concerns that do not belong in application code: authentication, rate limiting, request validation, and CORS.

Request validation at the API Gateway level rejects malformed requests before they invoke a Lambda function, saving cost and reducing attack surface:

# OpenAPI spec fragment for API Gateway validation
paths:
  /properties:
    post:
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [address, price, userId]
              properties:
                address:
                  type: string
                  minLength: 5
                  maxLength: 200
                price:
                  type: number
                  minimum: 0
                userId:
                  type: string
                  pattern: "^[a-zA-Z0-9-]+$"

Usage plans and API keys enforce rate limits per client. Government APIs get dedicated usage plans with higher thresholds and stricter monitoring. Commercial APIs scale with tiered plans.

Lambda authorizers (formerly called custom authorizers) validate JWTs and inject user context into the Lambda event under requestContext.authorizer, keeping authentication logic centralized and consistent across all endpoints.
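
A minimal sketch of a token-based authorizer is shown below. The response shape (principalId, policyDocument, context) is what API Gateway expects from an authorizer function; the `verify_token` callable is a hypothetical stand-in for whatever JWT validation is actually in use, and `make_authorizer` is an illustrative name rather than our production code.

```python
from typing import Callable, Optional

def build_policy(principal_id: str, effect: str, method_arn: str,
                 context: Optional[dict] = None) -> dict:
    """Build the response API Gateway expects from a Lambda authorizer."""
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [{
                "Action": "execute-api:Invoke",
                "Effect": effect,
                "Resource": method_arn,
            }],
        },
        # Context values must be strings, numbers, or booleans.
        "context": context or {},
    }

def make_authorizer(verify_token: Callable[[str], dict]) -> Callable[[dict, object], dict]:
    """Wrap a token verifier in an authorizer handler.

    verify_token is a hypothetical callable that validates a JWT and returns
    its claims, raising ValueError when the token is invalid.
    """
    def handler(event: dict, context: object) -> dict:
        token = event.get("authorizationToken", "").removeprefix("Bearer ")
        try:
            claims = verify_token(token)
        except ValueError:
            return build_policy("anonymous", "Deny", event["methodArn"])
        # The userId entry is passed through to the backend Lambda's event.
        return build_policy(claims["sub"], "Allow", event["methodArn"],
                            {"userId": claims["sub"]})
    return handler
```

Passing the verifier in as an argument keeps the policy-building logic testable without real tokens or signing keys.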

Cold Start Mitigation

Cold starts are the primary performance concern with Lambda. A cold start occurs when AWS provisions a new execution environment — downloading the deployment package, initializing the runtime, and executing module-level code.

Our cold start mitigation strategy:

Provisioned concurrency for latency-sensitive endpoints. Our SaaS platform's core API endpoints maintain provisioned concurrency during business hours, eliminating cold starts for the majority of requests.
Minimal deployment packages. Python Lambda functions use Lambda layers for shared dependencies and keep the function package under 5MB. TypeScript functions are bundled with esbuild, tree-shaken, and minified.
ARM64 architecture. Graviton2-based Lambda functions cost about 20% less per GB-second than x86 and typically start faster. All new functions deploy on arm64.
Connection reuse. HTTP keep-alive is enabled for the AWS SDK, and DynamoDB connections are reused across invocations.

import boto3
from botocore.config import Config

# Created at module level so warm invocations reuse the client and its connections
config = Config(
    retries={"max_attempts": 3, "mode": "adaptive"},
    tcp_keepalive=True,
)
dynamodb = boto3.resource("dynamodb", config=config)

Cost Optimization at Scale

Serverless is cost-effective at low to moderate scale but requires attention at high volume. Our approach:

DynamoDB on-demand capacity for unpredictable workloads, provisioned capacity with auto-scaling for steady-state workloads. For relational data with complex queries, Aurora Serverless v2 (PostgreSQL) provides automatic scaling with the familiarity of SQL — which is the approach used in our production SaaS platform.

Lambda power tuning — we run AWS Lambda Power Tuning to find the optimal memory configuration for each function. A function configured with 512MB might execute faster and cheaper at 1024MB because the additional CPU allocation reduces execution time enough to offset the higher per-millisecond cost.
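
The arithmetic behind that trade-off can be sketched as follows. The durations and the per-GB-second price are illustrative assumptions, not measurements from our functions, and the per-request charge is left out because it does not depend on memory size.

```python
# Illustrative price per GB-second (assumption; varies by region and architecture).
PRICE_PER_GB_SECOND = 0.0000166667

def invocation_cost(memory_mb: int, duration_ms: float) -> float:
    """Duration cost of one invocation: memory (GB) x duration (s) x price."""
    return (memory_mb / 1024) * (duration_ms / 1000) * PRICE_PER_GB_SECOND

# Hypothetical measurements: doubling memory cuts duration by more than half.
cost_512 = invocation_cost(512, 800)     # 0.5 GB for 0.80 s = 0.40 GB-seconds
cost_1024 = invocation_cost(1024, 350)   # 1.0 GB for 0.35 s = 0.35 GB-seconds
```

With these numbers the 1024MB configuration is both faster and cheaper; if the duration only fell to 450 ms, the larger setting would be faster but more expensive, which is why each function is measured rather than guessed.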

API Gateway caching for GET endpoints that return data changing infrequently. Property metadata cached for 60 seconds reduces Lambda invocations by 80% for popular listings.

Response compression at the API Gateway level reduces data transfer costs and improves client-perceived latency.

For how we manage the underlying infrastructure with Terraform, see Terraform multi-account AWS patterns. For how security scanning integrates with our deployment pipeline, see security compliance in CI/CD.

Frequently Asked Questions

When should you choose serverless over containers?

Serverless excels for API workloads with variable traffic, event-driven processing, and services where you want zero infrastructure management. Containers are better for long-running processes, workloads that need persistent connections (WebSockets), or applications that exceed Lambda's 15-minute timeout. Rutagon uses both — serverless APIs for CRUD operations and event processing, containers for real-time features and background workers.

How does single-table DynamoDB design handle evolving access patterns?

New access patterns are supported by adding GSIs or adjusting key patterns. The critical discipline is designing keys around access patterns from the start — not around entity relationships. When we add a new query pattern, we evaluate whether existing keys support it, whether a new GSI is needed, or whether the pattern warrants a separate table. Most access patterns fit within two to three GSIs.

What is the real cost of Lambda cold starts?

For Python functions with minimal dependencies, cold starts typically add 200-400ms. Requests served by provisioned concurrency skip the cold start entirely; traffic beyond the provisioned level falls back to on-demand environments and can still cold start. The cost of provisioned concurrency is approximately $0.015 per GB-hour, which is trivial for business-critical endpoints. We apply provisioned concurrency selectively to latency-sensitive paths, not universally.
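
As a rough worked example of that rate, the sketch below estimates the monthly charge for keeping environments warm during business hours. The instance count, memory size, and schedule are hypothetical; the $0.015 per GB-hour figure is the approximate rate quoted above and excludes invocation and duration charges.

```python
RATE_PER_GB_HOUR = 0.015  # approximate rate quoted above

def provisioned_cost(instances: int, memory_gb: float, hours: float) -> float:
    """Charge for keeping `instances` environments provisioned for `hours`."""
    return instances * memory_gb * hours * RATE_PER_GB_HOUR

# Hypothetical schedule: 5 environments at 1 GB, 10 hours a day, 22 working days.
monthly = provisioned_cost(instances=5, memory_gb=1.0, hours=10 * 22)
```

Under these assumptions the provisioned-concurrency charge comes to about $16.50 per month.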

How do you handle transactions across multiple DynamoDB items?

DynamoDB TransactWriteItems supports ACID transactions across up to 100 items within the same table or across tables. We use transactions for operations that must be atomic — creating a property listing while updating the user's property count, for example. For operations where eventual consistency is acceptable, we use DynamoDB Streams to trigger downstream updates asynchronously.
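
The listing example could be expressed roughly as below. This is a sketch, not our production code: the `property_count` attribute and the helper name are assumptions, and the function only builds the `TransactItems` list in the low-level client's attribute-value format so it can be examined without calling AWS.

```python
def build_create_listing_transaction(table_name: str, user_id: str,
                                     property_id: str) -> list:
    """Build TransactItems for the low-level client's transact_write_items."""
    return [
        {
            "Put": {
                "TableName": table_name,
                "Item": {
                    "pk": {"S": f"PROP#{property_id}"},
                    "sk": {"S": "METADATA"},
                },
                # Fail the whole transaction if the property already exists.
                "ConditionExpression": "attribute_not_exists(pk)",
            }
        },
        {
            "Update": {
                "TableName": table_name,
                "Key": {
                    "pk": {"S": f"USER#{user_id}"},
                    "sk": {"S": "PROFILE"},
                },
                # property_count is a hypothetical attribute name.
                "UpdateExpression": "ADD property_count :one",
                "ExpressionAttributeValues": {":one": {"N": "1"}},
            }
        },
    ]

# Usage (requires boto3 and AWS credentials):
# boto3.client("dynamodb").transact_write_items(
#     TransactItems=build_create_listing_transaction("property-service", "u1", "123"))
```

If either operation fails, including the condition check on the Put, neither write is applied.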

How do you test serverless APIs locally?

We use SAM CLI for local Lambda invocation and DynamoDB Local for database operations during development. Integration tests run against a dedicated AWS account with real services — not mocks. The cost of running integration tests against real Lambda and DynamoDB is negligible, and it catches issues that local emulation misses.
