Mission Data Pipelines for Satellite Programs

A satellite is only as useful as the data it returns. The ground segment's mission data pipeline — the software that ingests raw downlink telemetry, deframes and decodes it, calibrates sensor readings, generates data products, and delivers them to end users — is as mission-critical as the spacecraft itself. Without a functioning pipeline, the satellite may be operating perfectly while end users receive nothing useful.

Here's how Rutagon approaches cloud-native mission data pipeline architecture for satellite programs.

What a Mission Data Pipeline Must Do

Satellite mission data pipelines have a defined processing sequence:

Level 0 — Raw data: Raw bitstream received from the spacecraft. Unprocessed, compressed (if onboard compression was applied). Stored immediately — Level 0 is the archive of record.

Level 1 — Telemetry deframed: Raw bits decoded into spacecraft telemetry packets according to the mission's CCSDS (Consultative Committee for Space Data Systems) packet structure. Housekeeping telemetry (spacecraft health and status) separated from mission data telemetry.

Level 1B — Calibrated sensor data: Raw sensor readings converted to physical units using calibration coefficients. For an imaging sensor, this might mean converting digital number counts to radiance values. Calibration correction algorithms run at this stage.

Level 2 — Geophysical data products: Sensor data transformed into scientifically/operationally useful data products — imagery georegistered to Earth coordinates, atmospheric profiles, thermal maps, target detection results.

Level 3+ — Derived products and analysis: Higher-order analysis products computed from Level 2 data — composites, trend analysis, change detection.
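
As an illustration of the Level 1 deframing step above, the sketch below decodes the 6-byte primary header that the CCSDS Space Packet Protocol places at the start of every packet. The class and function names are illustrative assumptions, and a real deframer would also handle frame synchronization, the mission-specific secondary header, and the packet payload.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class PrimaryHeader:
    version: int
    packet_type: int       # 0 = telemetry, 1 = telecommand
    has_secondary: bool    # secondary header flag
    apid: int              # application process identifier (11 bits)
    sequence_flags: int    # 2 bits
    sequence_count: int    # 14 bits
    data_length: int       # bytes in the packet data field


def parse_primary_header(raw: bytes) -> PrimaryHeader:
    """Decode the first 6 bytes of a CCSDS space packet (big-endian)."""
    if len(raw) < 6:
        raise ValueError("CCSDS primary header needs 6 bytes")
    word1, word2, length = struct.unpack(">HHH", raw[:6])
    return PrimaryHeader(
        version=word1 >> 13,
        packet_type=(word1 >> 12) & 0x1,
        has_secondary=bool((word1 >> 11) & 0x1),
        apid=word1 & 0x7FF,
        sequence_flags=word2 >> 14,
        sequence_count=word2 & 0x3FFF,
        data_length=length + 1,  # the field stores the length minus one
    )
```

The APID is what typically lets a deframer route housekeeping packets and mission data packets to separate outputs, although the APID assignments themselves are mission-specific.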

Cloud-Native Pipeline Architecture

Each processing level maps to a pipeline stage. In the cloud-native architecture, the stages chain together as follows:

[Ground Station Downlink] → [Ingest Service (S3 raw bucket)]
     → [Level 0 Archive (S3 + Glacier)]
     → [Deframe Worker (ECS Fargate or Lambda)]
          → [Level 1 Storage (S3 l1-bucket)]
          → [Calibration Worker]
               → [Level 1B Storage (S3 l1b-bucket)]
               → [L2 Processing Worker (compute-intensive — EC2 or EKS)]
                    → [Level 2 Product Storage (S3 l2-bucket)]
                    → [Distribution Service (CloudFront + signed URLs)]

Event-driven triggers (S3 Event Notifications → SQS → Lambda or ECS Task) connect each stage. A new Level 0 file arriving in S3 automatically triggers deframing; a completed Level 1B file triggers Level 2 processing. No polling, no scheduled batch jobs — the pipeline is reactive to data arrival.

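# Lambda handler for Level 0 → Level 1 processing trigger.
# Assumes the S3 Event Notifications → SQS → Lambda path described above,
# so each SQS record body carries one S3 event notification as JSON.
import json
from urllib.parse import unquote_plus

import boto3

# Create the client once per execution environment rather than per record
ecs = boto3.client('ecs')

def handler(event, context):
    for sqs_record in event['Records']:
        s3_event = json.loads(sqs_record['body'])

        # S3 test events carry no 'Records' key, so default to an empty list
        for record in s3_event.get('Records', []):
            bucket = record['s3']['bucket']['name']
            # Object keys arrive URL-encoded in S3 event notifications
            key = unquote_plus(record['s3']['object']['key'])

            # Parse pass metadata from key path convention
            # e.g., raw/2026/180/PASSID123/gs01_pass123.bin
            # (parse_key_metadata is a helper defined elsewhere in the project)
            pass_metadata = parse_key_metadata(key)

            # Submit deframe job to ECS
            response = ecs.run_task(
                cluster='mission-processing',
                taskDefinition='deframe-worker',
                overrides={
                    'containerOverrides': [{
                        'name': 'deframe',
                        'environment': [
                            {'name': 'INPUT_BUCKET', 'value': bucket},
                            {'name': 'INPUT_KEY', 'value': key},
                            {'name': 'OUTPUT_BUCKET', 'value': 'l1-processed'},
                            {'name': 'PASS_ID', 'value': pass_metadata['pass_id']},
                        ]
                    }]
                },
                launchType='FARGATE',
                networkConfiguration={...}  # awsvpc settings elided
            )

            # run_task reports placement failures in the response instead of raising
            if response.get('failures'):
                raise RuntimeError(f"deframe task not started: {response['failures']}")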
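
The handler calls parse_key_metadata without showing it. One possible shape for that helper is sketched below, assuming the key layout in the comment (raw/year/day/pass ID/filename). Reading the third path segment as a day of year is a guess from the example key, and the returned field names other than pass_id are hypothetical.

```python
def parse_key_metadata(key: str) -> dict:
    """Split a Level 0 object key into pass metadata.

    Assumed layout: raw/<year>/<day-of-year>/<pass-id>/<filename>
    """
    parts = key.split("/")
    if len(parts) != 5 or parts[0] != "raw":
        raise ValueError(f"unexpected Level 0 key layout: {key!r}")
    _, year, day_of_year, pass_id, filename = parts
    return {
        "year": int(year),
        "day_of_year": int(day_of_year),
        "pass_id": pass_id,
        "filename": filename,
    }
```

Rejecting keys that do not match the convention keeps malformed uploads from launching a deframe task with empty metadata.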

Handling Pass-Rate Data Volume

Satellite ground contact windows are brief and concentrated. A low-Earth orbit satellite completes a ground station pass in 5-15 minutes, transmitting gigabytes of compressed data during the contact window. Ground software must sustain high throughput during the pass, then transition to steady-state processing during the orbit interval.

For high-throughput downlink ingestion, the ground station interface writes directly to S3 using multipart uploads. Because each part is uploaded independently, the uploader never has to hold a whole pass file in memory, and S3 can verify a checksum on each part for data integrity. For high-rate, low-latency downlinks, a dedicated ground station server with local buffering absorbs the burst rate before forwarding to the cloud.
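
A multipart upload is limited to 10,000 parts, and every part except the last must be at least 5 MiB, so the part size has to grow with the size of the pass file. The sketch below picks a part size under those two limits. The function name and the 64 MiB default are illustrative assumptions, not values from any particular program.

```python
MIN_PART_BYTES = 5 * 1024**2   # S3 minimum for every part except the last
MAX_PARTS = 10_000             # S3 limit on parts per multipart upload


def part_size_for(total_bytes: int, preferred: int = 64 * 1024**2) -> int:
    """Return a part size that keeps the upload within the S3 part limit."""
    # Integer ceiling division: smallest part size that fits in MAX_PARTS parts
    needed = -(-total_bytes // MAX_PARTS)
    return max(MIN_PART_BYTES, preferred, needed)
```

With boto3, the result could be passed as multipart_chunksize in a boto3.s3.transfer.TransferConfig and handed to upload_file through its Config argument.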

Data integrity: Every file at every processing level carries a checksum. Processing workers verify input checksums before processing and generate output checksums. The pipeline tracks checksum verification status, so processing failures caused by data corruption can be distinguished from algorithm errors.
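
A minimal sketch of the verify-before-processing step, assuming SHA-256 as the checksum algorithm (the article does not name one). Raising a dedicated exception type is one way to keep corruption failures separate from algorithm errors in logs and metrics. All names here are hypothetical.

```python
import hashlib
from typing import BinaryIO


class ChecksumMismatch(Exception):
    """Input bytes do not match the recorded checksum (data corruption)."""


def sha256_of(stream: BinaryIO, chunk_size: int = 1024 * 1024) -> str:
    """Hash a stream in chunks so large files never sit fully in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify_input(stream: BinaryIO, expected_sha256: str) -> None:
    """Raise ChecksumMismatch if the stream does not match the expected hash."""
    actual = sha256_of(stream)
    if actual != expected_sha256:
        raise ChecksumMismatch(f"expected {expected_sha256}, got {actual}")
```

A worker would call verify_input on its input object before running any processing, then call sha256_of on its output to record the checksum for the next stage.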

GovCloud Compliance for Defense Satellite Programs

Defense satellite mission data is frequently Controlled Unclassified Information (CUI) or classified. Cloud storage and processing must meet the DoD Impact Level (IL) requirements that apply to the program:

S3 buckets encrypted with KMS customer-managed keys (CMKs); key access logged via CloudTrail
IAM roles with least-privilege access — processing workers assume task roles with access scoped to their specific input/output buckets
VPC-restricted processing workers — no internet access from processing containers
CloudTrail + CloudWatch Logs for all API calls and processing events
All data products tagged with classification marking per the program's data handling plan
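
To make the least-privilege item concrete, the sketch below builds an IAM policy document for one worker's task role, scoped to read from a single input bucket and write to a single output bucket. It assumes the GovCloud ARN partition (aws-us-gov). The function name and the input bucket name in the usage note are hypothetical, and a real policy would also need the matching KMS key permissions.

```python
def worker_policy(input_bucket: str, output_bucket: str) -> dict:
    """IAM policy document allowing read on one bucket and write on another."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadInput",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws-us-gov:s3:::{input_bucket}/*",
            },
            {
                "Sid": "WriteOutput",
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": f"arn:aws-us-gov:s3:::{output_bucket}/*",
            },
        ],
    }
```

For the deframe worker this might be worker_policy("l0-raw", "l1-processed"), so a compromised or misconfigured deframer cannot read or overwrite Level 2 products.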

View Rutagon's cloud capabilities → rutagon.com/government

Frequently Asked Questions

How do you handle processing failures and retries in the pipeline?

Each processing stage has a dead-letter queue (SQS DLQ) for failed processing attempts. After a configurable number of retries (typically 3), failed items are routed to the DLQ for investigation. Pipeline monitoring includes DLQ depth as a key metric — items in the DLQ trigger alerts to the operations team. Failed processing events are logged with input file metadata, error type, and stack trace for forensic analysis.
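
In SQS, the retry limit is expressed as a redrive policy on the source queue. The sketch below builds the queue attributes for that policy, using the typical count of 3 mentioned above. The function name is hypothetical. Note that maxReceiveCount counts deliveries, so a value of 3 means the first attempt plus two retries.

```python
import json


def queue_attributes(dlq_arn: str, max_receive_count: int = 3) -> dict:
    """Attributes that route a message to the DLQ after repeated failures."""
    return {
        # SQS expects the redrive policy as a JSON-encoded string
        "RedrivePolicy": json.dumps({
            "deadLetterTargetArn": dlq_arn,
            "maxReceiveCount": str(max_receive_count),
        })
    }
```

The resulting dictionary can be passed as the Attributes argument to the boto3 SQS create_queue or set_queue_attributes calls.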

What's the storage strategy for long-term mission data archives?

Level 0 data (raw downlink) is stored indefinitely. It is the archive of record from which all derived products can be regenerated if processing algorithms are improved. Rutagon uses S3 lifecycle policies to transition Level 0 data to Glacier Deep Archive after 90 days, reducing long-term storage costs by 80-90% compared with S3 Standard while maintaining retrieval capability, although Deep Archive restores take hours rather than milliseconds. Higher-level data products have shorter retention policies based on mission requirements.
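
The 90-day transition can be expressed as a single lifecycle rule. The sketch below builds that rule in the shape boto3 expects. The function name, rule ID, and "raw/" prefix are assumptions (the prefix follows the key convention in the handler example). The rule has no expiration action because Level 0 is kept indefinitely.

```python
def level0_lifecycle(prefix: str = "raw/", days: int = 90) -> dict:
    """Lifecycle configuration moving Level 0 objects to Deep Archive."""
    return {
        "Rules": [
            {
                "ID": "level0-to-deep-archive",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": days, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    }
```

The dictionary is passed as the LifecycleConfiguration argument of the S3 client's put_bucket_lifecycle_configuration call, alongside the Bucket name.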

Can the pipeline handle multiple spacecraft feeding the same ground system?

Yes — the pipeline architecture is spacecraft-agnostic at the infrastructure level. Each spacecraft has its own processing configuration (CCSDS packet structure definition, calibration coefficients, L2 algorithm parameters), stored as configuration artifacts. The processing workers load the spacecraft-specific configuration based on the pass metadata. Multi-spacecraft pipelines use the same infrastructure with configuration-driven differentiation.
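
A rough sketch of configuration-driven differentiation: workers look up a per-spacecraft record and use it to locate the packet definition, calibration version, and L2 parameters. Every name below is hypothetical, including the spacecraft_id field in the pass metadata and the in-memory table, which stands in for wherever the configuration artifacts are actually stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpacecraftConfig:
    spacecraft_id: str
    packet_definition_key: str   # where the CCSDS packet definition lives
    calibration_version: str
    l2_parameters_key: str


# Hypothetical entries; a real system would load these from stored artifacts
CONFIGS = {
    "SC-A": SpacecraftConfig("SC-A", "config/sc-a/packets.json", "cal-v3", "config/sc-a/l2.json"),
    "SC-B": SpacecraftConfig("SC-B", "config/sc-b/packets.json", "cal-v1", "config/sc-b/l2.json"),
}


def config_for_pass(pass_metadata: dict) -> SpacecraftConfig:
    """Select the processing configuration for the spacecraft in this pass."""
    spacecraft_id = pass_metadata["spacecraft_id"]
    try:
        return CONFIGS[spacecraft_id]
    except KeyError:
        raise LookupError(f"no processing configuration for {spacecraft_id!r}") from None
```

Failing loudly on an unknown spacecraft is a deliberate choice here, since silently falling back to another spacecraft's calibration would produce plausible but wrong products.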

How are calibration updates handled mid-mission?

Calibration updates (new calibration coefficients from periodic calibration campaigns) are versioned configuration artifacts. The pipeline stores calibration version alongside each processed data product. When calibration is updated, affected historical data can be reprocessed through a batch job that retrieves Level 0 or Level 1 data from the archive and reapplies the updated calibration. Reprocessed data products carry the new calibration version identifier.
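
The sketch below shows how a product might carry its calibration version, assuming a simple linear conversion from digital numbers to radiance (radiance = gain × DN + offset). Real sensors often need more elaborate models, and the class, field names, and coefficient values are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Calibration:
    version: str
    gain: float
    offset: float


def calibrate(counts: list[int], cal: Calibration) -> dict:
    """Convert raw counts to radiance and record which calibration was used."""
    radiance = [cal.gain * dn + cal.offset for dn in counts]
    return {"calibration_version": cal.version, "radiance": radiance}


# Reprocessing applies a newer calibration to the same archived counts
counts = [0, 10]
original = calibrate(counts, Calibration("cal-v1", gain=0.5, offset=1.0))
reprocessed = calibrate(counts, Calibration("cal-v2", gain=0.25, offset=2.0))
```

Because the version travels with the product, a consumer can tell original and reprocessed products apart without consulting the processing logs.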

What observability does the pipeline provide for mission operations teams?

Real-time pipeline dashboards show: current pass status (active contact, processing queue depth, last received data), data product availability by orbit (confirming Level 2 products have been delivered for each pass), processing latency (time from Level 0 ingest to Level 2 product availability), and data volume trends. Alerts notify the operations team of processing failures, missed contacts (no data received during an expected pass), and product delivery delays. CloudWatch dashboards with custom metrics from processing workers provide this visibility.
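
As one example of a custom metric, the sketch below builds a CloudWatch metric datum for the Level 0 to Level 2 processing latency. The metric name, dimension, and function name are assumptions. The dimension uses a spacecraft identifier rather than a pass ID to keep the number of distinct metric series small.

```python
from datetime import datetime, timezone


def latency_metric(spacecraft_id: str, ingested_at: datetime, delivered_at: datetime) -> dict:
    """Metric datum for time from Level 0 ingest to Level 2 availability."""
    seconds = (delivered_at - ingested_at).total_seconds()
    return {
        "MetricName": "L0ToL2LatencySeconds",
        "Dimensions": [{"Name": "Spacecraft", "Value": spacecraft_id}],
        "Timestamp": delivered_at,
        "Value": seconds,
        "Unit": "Seconds",
    }
```

A worker would send the datum with the CloudWatch client's put_metric_data call, passing a Namespace of its choosing and MetricData=[datum]. A dashboard widget or alarm can then be built on that metric.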