Real-time data isn't a luxury anymore — it's a competitive requirement. Organizations that make decisions on yesterday's data are making yesterday's decisions. But building real-time dashboards that actually work at scale requires more than plugging a charting library into a database query.
At Rutagon, we've architected data pipeline systems that process millions of events daily and surface them in dashboards with sub-second latency. This article covers the production architecture patterns we use on AWS: ingestion with Kinesis, transformation with Glue, querying with Athena, and visualization with QuickSight — along with the infrastructure patterns from our work with serverless APIs and multi-account Terraform.
Architecture Overview
A real-time dashboard system has four distinct layers, each with different latency, throughput, and cost characteristics:
┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│  Ingestion  │───▶│ Processing  │───▶│   Storage   │───▶│   Query &   │
│  (Kinesis)  │    │   (Glue)    │    │ (S3 + DDB)  │    │   Display   │
└─────────────┘    └─────────────┘    └─────────────┘    │(Athena + QS)│
                                                         └─────────────┘

The separation matters. Each layer scales independently, fails independently, and costs differently. Coupling them — the default approach most teams take — creates a system that can't scale any one dimension without scaling all of them.
Data Ingestion with Kinesis
Amazon Kinesis Data Streams handles the ingestion layer. It's a managed streaming service that ingests data at any scale with millisecond latency. For enterprise dashboards, we typically see two ingestion patterns:
Direct Producer Pattern
Applications send events directly to Kinesis using the Kinesis Producer Library (KPL) or the AWS SDK:
import boto3
import json
from datetime import datetime, timezone

kinesis = boto3.client('kinesis', region_name='us-west-2')

def publish_event(stream_name: str, event_type: str, payload: dict):
    event = {
        'event_type': event_type,
        'timestamp': datetime.now(timezone.utc).isoformat(),
        'payload': payload,
        'source': 'api-gateway',
        'version': '1.0'
    }
    kinesis.put_record(
        StreamName=stream_name,
        Data=json.dumps(event),
        PartitionKey=payload.get('tenant_id', 'default')
    )

The partition key is critical for ordering guarantees. Events with the same partition key are delivered to the same shard in order. For multi-tenant dashboards, the tenant ID is the natural partition key — it ensures each tenant's events are processed sequentially while enabling parallel processing across tenants.
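To see why a shared key preserves order, it helps to know how Kinesis routes records: the partition key is MD5-hashed into a 128-bit integer, and each shard owns a contiguous range of that hash space. This plain-Python sketch approximates the routing for an evenly split stream (the helper name is ours, not part of any SDK):

```python
import hashlib

def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Approximate which shard an evenly split stream assigns to a key.

    Kinesis hashes the partition key with MD5 into a 128-bit integer and
    routes the record to the shard whose hash-key range contains it.
    """
    hash_value = int(hashlib.md5(partition_key.encode('utf-8')).hexdigest(), 16)
    # In an evenly split stream, each shard owns an equal slice of 2**128
    return hash_value * shard_count // 2**128
```

Because the hash is deterministic, every event for a given tenant lands on the same shard — which is exactly what yields per-tenant ordering.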
Kinesis Data Firehose for Batch-Friendly Sources
Not every data source produces real-time events. Many enterprise systems generate data in batches — CSV exports, database dumps, API polling results. Kinesis Data Firehose bridges this gap by accepting records and buffering them before delivery to S3:
firehose = boto3.client('firehose', region_name='us-west-2')

def batch_ingest(delivery_stream: str, records: list[dict]):
    batch = [
        {'Data': json.dumps(record) + '\n'}
        for record in records
    ]
    # PutRecordBatch accepts at most 500 records per call
    for i in range(0, len(batch), 500):
        chunk = batch[i:i + 500]
        firehose.put_record_batch(
            DeliveryStreamName=delivery_stream,
            Records=chunk
        )

Firehose handles buffering, compression, encryption, and delivery to S3 — and with record format conversion enabled, it writes Parquet directly, which is essential for query performance downstream.
Data Transformation with AWS Glue
Raw event data is rarely in the format you need for dashboards. AWS Glue handles the transformation layer — cleaning, enriching, aggregating, and partitioning data for efficient querying.
Streaming ETL with Glue
For real-time dashboards, Glue Streaming ETL reads directly from Kinesis and writes transformed data to S3:
import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext
from pyspark.sql.functions import (
    from_json, col, window, count, avg, year, month, dayofmonth, hour
)
from pyspark.sql.types import StructType, StructField, StringType, TimestampType, DoubleType

sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
args = getResolvedOptions(sys.argv, ['JOB_NAME'])
job.init(args['JOB_NAME'], args)

event_schema = StructType([
    StructField("event_type", StringType()),
    StructField("timestamp", TimestampType()),
    StructField("payload", StructType([
        StructField("tenant_id", StringType()),
        StructField("metric_name", StringType()),
        StructField("metric_value", DoubleType()),
        StructField("dimensions", StringType())
    ]))
])

kinesis_df = spark.readStream \
    .format("kinesis") \
    .option("streamName", "dashboard-events") \
    .option("region", "us-west-2") \
    .option("startingPosition", "TRIM_HORIZON") \
    .load()

parsed_df = kinesis_df \
    .select(from_json(col("data").cast("string"), event_schema).alias("event")) \
    .select("event.*")

aggregated_df = parsed_df \
    .withWatermark("timestamp", "5 minutes") \
    .groupBy(
        window(col("timestamp"), "1 minute"),
        col("payload.tenant_id"),
        col("payload.metric_name")
    ) \
    .agg(
        count("*").alias("event_count"),
        avg("payload.metric_value").alias("avg_value")
    )

# Flatten the window struct and derive the time columns that the S3
# partition layout and the Athena table below expect
output_df = aggregated_df.select(
    col("window.start").alias("window_start"),
    col("window.end").alias("window_end"),
    col("tenant_id"),
    col("metric_name"),
    col("event_count"),
    col("avg_value")
).withColumn("year", year(col("window_start"))) \
 .withColumn("month", month(col("window_start"))) \
 .withColumn("day", dayofmonth(col("window_start"))) \
 .withColumn("hour", hour(col("window_start")))

query = output_df.writeStream \
    .format("parquet") \
    .option("path", "s3://dashboard-data/aggregated/") \
    .option("checkpointLocation", "s3://dashboard-data/checkpoints/") \
    .partitionBy("tenant_id", "year", "month", "day", "hour") \
    .trigger(processingTime="1 minute") \
    .start()

# A streaming query runs until it is stopped; awaitTermination() blocks
# here, so no job.commit() follows it
query.awaitTermination()

This job reads events from Kinesis, parses the JSON payload, aggregates metrics into one-minute windows, and writes partitioned Parquet files to S3. The watermark handles late-arriving data — events that arrive up to 5 minutes late are still included in the correct window.
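The window-plus-watermark semantics can be illustrated without Spark. This plain-Python sketch (our own illustration, not a Spark API) shows how an event is assigned to its tumbling one-minute window and how a five-minute watermark decides whether a late event still counts:

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=1)
WATERMARK_DELAY = timedelta(minutes=5)

def window_start(event_time: datetime) -> datetime:
    """Truncate an event timestamp to its tumbling one-minute window."""
    return event_time.replace(second=0, microsecond=0)

def is_accepted(event_time: datetime, max_event_time_seen: datetime) -> bool:
    """A late event is kept as long as its window hasn't closed.

    Spark drops an event when its window's end falls at or before the
    watermark (the maximum event time seen, minus the allowed delay).
    """
    watermark = max_event_time_seen - WATERMARK_DELAY
    return window_start(event_time) + WINDOW > watermark
```

An event three minutes behind the stream's newest timestamp is still folded into its original window; one ten minutes behind is dropped, because that window has already been finalized and written out.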
Partitioning Strategy
Partitioning is the single most important decision for query performance. A well-partitioned dataset turns a full-table scan into a targeted read of a few files:
s3://dashboard-data/aggregated/
├── tenant_id=tenant-001/
│   ├── year=2026/
│   │   ├── month=03/
│   │   │   ├── day=02/
│   │   │   │   ├── hour=00/
│   │   │   │   │   └── part-00000.parquet
│   │   │   │   ├── hour=01/
│   │   │   │   │   └── part-00000.parquet

Partitioning by tenant first enables row-level security in QuickSight — each tenant only sees their own partition. Time-based sub-partitioning enables efficient time-range queries without scanning irrelevant data.
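A small helper makes the layout concrete — it builds the S3 key prefix for one hourly partition (illustrative only; the real paths are written by the Glue job):

```python
from datetime import datetime

def partition_prefix(tenant_id: str, ts: datetime) -> str:
    """Build the Hive-style S3 key prefix for one hourly partition."""
    return (
        f"tenant_id={tenant_id}/year={ts.year}/"
        f"month={ts.month:02d}/day={ts.day:02d}/hour={ts.hour:02d}/"
    )
```

For example, `partition_prefix('tenant-001', datetime(2026, 3, 2, 1))` produces the `hour=01` path shown in the tree above.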
Querying with Athena
Amazon Athena provides serverless SQL queries over the S3 data lake. For dashboards, Athena serves two roles: ad-hoc exploration and scheduled reporting.
Table Definition
Athena tables are defined in the Glue Data Catalog, which Glue ETL jobs populate automatically:
CREATE EXTERNAL TABLE IF NOT EXISTS dashboard_metrics (
window_start TIMESTAMP,
window_end TIMESTAMP,
metric_name STRING,
event_count BIGINT,
avg_value DOUBLE
)
PARTITIONED BY (
tenant_id STRING,
year INT,
month INT,
day INT,
hour INT
)
STORED AS PARQUET
LOCATION 's3://dashboard-data/aggregated/'
TBLPROPERTIES (
'parquet.compression' = 'SNAPPY',
'projection.enabled' = 'true',
'projection.tenant_id.type' = 'enum',
'projection.tenant_id.values' = 'tenant-001,tenant-002,tenant-003',
'projection.year.type' = 'integer',
'projection.year.range' = '2024,2030',
'projection.month.type' = 'integer',
'projection.month.range' = '1,12',
'projection.day.type' = 'integer',
'projection.day.range' = '1,31',
'projection.hour.type' = 'integer',
'projection.hour.range' = '0,23',
'storage.location.template' =
's3://dashboard-data/aggregated/tenant_id=${tenant_id}/year=${year}/month=${month}/day=${day}/hour=${hour}/'
);
Partition projection eliminates the need for MSCK REPAIR TABLE — Athena infers partitions from the template pattern without scanning S3. This reduces query startup time from seconds to milliseconds for datasets with thousands of partitions.
Optimized Dashboard Queries
Dashboard queries need to be fast and cost-efficient. Athena charges per data scanned, so reducing scan volume is both a performance and cost optimization:
SELECT
    metric_name,
    DATE_TRUNC('hour', window_start) AS hour,
    SUM(event_count) AS total_events,
    -- weight each window by its event count to avoid an average of averages
    SUM(avg_value * event_count) / SUM(event_count) AS avg_metric_value
FROM dashboard_metrics
WHERE tenant_id = 'tenant-001'
  AND year = 2026
  AND month = 3
  AND day = 2
GROUP BY metric_name, DATE_TRUNC('hour', window_start)
ORDER BY hour DESC;

This query scans only one tenant's data for one day — roughly 24 Parquet files instead of the entire dataset. With SNAPPY compression and columnar storage, a day's worth of metrics for a single tenant typically scans under 10 MB.
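When a service needs the same data outside QuickSight — an API endpoint, say — the query can be run through boto3's start-and-poll pattern. A sketch, assuming a workgroup with a configured output location; the helper names are ours:

```python
import time

def rows_to_dicts(result_set: dict) -> list[dict]:
    """Convert an Athena ResultSet (header row followed by data rows) to dicts."""
    header, *rows = result_set['Rows']
    columns = [col['VarCharValue'] for col in header['Data']]
    return [
        dict(zip(columns, (cell.get('VarCharValue') for cell in row['Data'])))
        for row in rows
    ]

def run_athena_query(sql: str, workgroup: str = 'dashboard-queries') -> list[dict]:
    """Start an Athena query, poll until it completes, and return the rows."""
    import boto3  # imported lazily so the parsing helper has no AWS dependency
    athena = boto3.client('athena', region_name='us-west-2')
    query_id = athena.start_query_execution(
        QueryString=sql, WorkGroup=workgroup)['QueryExecutionId']
    while True:
        state = athena.get_query_execution(QueryExecutionId=query_id)[
            'QueryExecution']['Status']['State']
        if state in ('SUCCEEDED', 'FAILED', 'CANCELLED'):
            break
        time.sleep(0.5)
    if state != 'SUCCEEDED':
        raise RuntimeError(f'Athena query ended in state {state}')
    return rows_to_dicts(
        athena.get_query_results(QueryExecutionId=query_id)['ResultSet'])
```

Note that Athena returns every cell as a string; casting to numeric types is the caller's job.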
Visualization with QuickSight
Amazon QuickSight is the visualization layer. It connects directly to Athena (or to SPICE, QuickSight's in-memory cache) and provides interactive dashboards with drill-down, filtering, and alerting.
SPICE for Sub-Second Queries
For dashboards that need sub-second response times, SPICE (Super-fast, Parallel, In-memory Calculation Engine) pre-loads data from Athena on a schedule:
- Hourly refresh for operational dashboards showing today's data
- Daily refresh for trend dashboards showing weekly/monthly patterns
- On-demand refresh triggered by data pipeline completion
SPICE eliminates Athena query latency for interactive dashboard use — users see results in milliseconds rather than seconds.
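The on-demand refresh can be wired to the pipeline itself: when the Glue job finishes, a small Lambda (or the job's final step) starts a SPICE ingestion through the QuickSight API. A sketch, assuming the account and dataset IDs are supplied by the caller:

```python
from datetime import datetime, timezone

def make_ingestion_id(prefix: str = 'pipeline-refresh') -> str:
    """Ingestion IDs must be unique per dataset; a timestamp suffix is enough here."""
    return f"{prefix}-{datetime.now(timezone.utc):%Y%m%d%H%M%S}"

def trigger_spice_refresh(account_id: str, dataset_id: str) -> str:
    """Kick off an on-demand SPICE ingestion and return its ingestion ID."""
    import boto3  # imported lazily so the ID helper has no AWS dependency
    quicksight = boto3.client('quicksight', region_name='us-west-2')
    ingestion_id = make_ingestion_id()
    quicksight.create_ingestion(
        AwsAccountId=account_id,
        DataSetId=dataset_id,
        IngestionId=ingestion_id,
    )
    return ingestion_id
```

The returned ingestion ID can be polled with describe_ingestion to confirm the refresh completed before notifying dashboard users.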
Row-Level Security
Multi-tenant dashboards require strict data isolation. QuickSight's row-level security (RLS) uses a rules dataset that maps users to the tenants they can access:
UserName,tenant_id
user@tenant-001.com,tenant-001
user@tenant-002.com,tenant-002
admin@rutagon.com,tenant-001
admin@rutagon.com,tenant-002
admin@rutagon.com,tenant-003

RLS is enforced at the QuickSight layer, not the Athena layer. This means even if a user constructs a custom Athena query through the QuickSight interface, they can only see data for their authorized tenants.
Real-Time Layer with DynamoDB
For truly real-time metrics — latency under one second from event to display — the Athena/S3 path is too slow. We add a DynamoDB layer for the most recent data:
import boto3
from datetime import datetime, timezone
from decimal import Decimal

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('realtime-metrics')

def update_realtime_metric(tenant_id: str, metric_name: str, value: float):
    table.update_item(
        Key={
            'pk': f'TENANT#{tenant_id}',
            'sk': f'METRIC#{metric_name}#CURRENT'
        },
        UpdateExpression='SET metric_value = :val, updated_at = :ts, '
                         'event_count = if_not_exists(event_count, :zero) + :one',
        ExpressionAttributeValues={
            ':val': Decimal(str(value)),
            ':ts': datetime.now(timezone.utc).isoformat(),
            ':zero': 0,
            ':one': 1
        }
    )

A Lambda function consumes from Kinesis Data Streams and updates DynamoDB in real time. The dashboard front end queries DynamoDB for the current minute's data and Athena (via SPICE) for historical data. The transition between real-time and historical data is seamless to the user.
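The Lambda consumer itself is a thin shim: Kinesis event-source mappings deliver record data base64-encoded, so the handler decodes each record and forwards metric events to update_realtime_metric above. A sketch (the event-source mapping, batching configuration, and error handling are omitted):

```python
import base64
import json

def decode_record(record: dict) -> dict:
    """Kinesis event-source mappings deliver record data base64-encoded."""
    return json.loads(base64.b64decode(record['kinesis']['data']))

def handler(event, context):
    # update_realtime_metric is the function defined in the snippet above
    for record in event['Records']:
        body = decode_record(record)
        payload = body.get('payload', {})
        if 'metric_name' in payload and 'metric_value' in payload:
            update_realtime_metric(
                payload.get('tenant_id', 'default'),
                payload['metric_name'],
                float(payload['metric_value']),
            )
```

Pair the event-source mapping with a dead-letter queue so malformed records don't stall the shard.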
Infrastructure as Code
The entire pipeline is defined in Terraform, following the multi-account patterns we use across production systems:
module "dashboard_pipeline" {
  source = "./modules/data-pipeline"

  stream_name              = "dashboard-events"
  stream_shard_count       = 4
  firehose_buffer_size     = 128
  firehose_buffer_interval = 60
  glue_job_name            = "dashboard-streaming-etl"
  athena_workgroup         = "dashboard-queries"
  quicksight_namespace     = "production"
  s3_bucket_name           = "dashboard-data-${var.environment}"
  dynamodb_table_name      = "realtime-metrics-${var.environment}"

  tags = {
    Environment = var.environment
    Project     = "enterprise-dashboard"
    ManagedBy   = "terraform"
  }
}

Every component — Kinesis streams, Firehose delivery streams, Glue jobs, Athena workgroups, QuickSight datasets, DynamoDB tables — is provisioned, configured, and version-controlled through Terraform. No console clicks, no manual configuration, no configuration drift.
Cost Optimization
Real-time data pipelines can get expensive quickly. Production cost management requires understanding the cost model of each component:
| Component | Cost Driver | Optimization |
|---|---|---|
| Kinesis Data Streams | Shard hours + PUT payload units | Right-size shards, use PutRecords batching |
| Kinesis Firehose | Data ingested (GB) | Compress before ingestion |
| Glue Streaming | DPU hours | Minimize processing window |
| S3 | Storage + requests | Lifecycle policies, Parquet compression |
| Athena | Data scanned (TB) | Partition pruning, columnar format, SPICE |
| QuickSight | Per-user/month + SPICE capacity | Reader roles for view-only users |
| DynamoDB | RCU/WCU or on-demand | TTL for expiring real-time data |
The biggest cost lever is Athena query optimization. A poorly partitioned dataset can cost orders of magnitude more per query than a well-partitioned one — the difference between scanning 10 MB and 1 TB for the same result.
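A quick way to reason about that lever is to price a query directly. Athena bills per byte scanned — this estimator assumes the common $5-per-TB list price (verify for your region) and Athena's 10 MB per-query minimum charge:

```python
def athena_query_cost(bytes_scanned: int, price_per_tb: float = 5.0) -> float:
    """Estimate the cost of one Athena query.

    Assumes the $5/TB list price charged in many regions (check yours);
    Athena rounds small queries up to a 10 MB minimum.
    """
    min_bytes = 10 * 1024 * 1024  # 10 MB minimum charge per query
    terabytes = max(bytes_scanned, min_bytes) / 1024**4
    return terabytes * price_per_tb
```

At these rates, a well-pruned dashboard query scanning a few megabytes costs a fraction of a cent, while a full-table scan over a terabyte costs about five dollars — every time the dashboard refreshes.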
Production Lessons
After building these systems across multiple production deployments, a few lessons stand out:
Late-arriving data is normal. Design for it. Watermarks, event-time processing, and reprocessing pipelines handle the 1-5% of events that arrive after their processing window.
Schema evolution is inevitable. Use Parquet with schema evolution support. Add columns freely; removing or renaming columns requires a migration.
Dashboard load time is user trust. If the dashboard takes more than 2 seconds to load, users will revert to spreadsheets. SPICE, aggressive caching, and pre-aggregation are not optional.
Monitoring the pipeline is as important as monitoring the application. Dead-letter queues, Glue job failure alerts, SPICE refresh monitoring, and data freshness checks should all be in place before the first user sees the dashboard.
How much latency should I expect from a Kinesis-to-QuickSight pipeline?
For the SPICE-cached path (historical data), expect 1-5 seconds for data to appear in dashboards, depending on SPICE refresh frequency. For the real-time DynamoDB path (current data), expect sub-second latency from event ingestion to dashboard display. The Athena direct-query path typically returns results in 2-10 seconds depending on data volume and partition efficiency.
What's the cost of running a real-time dashboard on AWS?
Cost varies dramatically based on data volume. A typical enterprise dashboard ingesting 1 million events per day with 50 dashboard users costs roughly $500-800/month across all services. The largest cost drivers are Kinesis shard hours, Glue DPU hours, and QuickSight user licenses. Proper partitioning and SPICE caching can reduce Athena costs by 90% or more.
Can this architecture handle multi-tenant dashboards?
Yes. The architecture uses tenant-based partitioning in S3, DynamoDB partition keys scoped to tenants, and QuickSight row-level security (RLS) to enforce strict data isolation. Each tenant sees only their own data, and the partition-based storage ensures that tenant isolation doesn't degrade query performance.
Why use Parquet instead of JSON for S3 storage?
Parquet is a columnar format that compresses 5-10x better than JSON and enables Athena to read only the columns needed for a query. A dashboard query that reads 3 columns from a 50-column dataset scans 94% less data in Parquet than in JSON — directly reducing both query time and Athena costs.
How do you handle schema changes in the data pipeline?
Parquet supports additive schema evolution natively — new columns can be added without breaking existing queries or data. The Glue Data Catalog tracks schema versions, and Athena automatically resolves schema differences across Parquet files. For breaking changes (column renames, type changes), we run migration Glue jobs that rewrite affected partitions.