
Satellite C2 Cloud-Native Architecture

Updated April 2026 · 8 min read

Satellite command and control systems are among the most demanding software environments to modernize. Decades of heritage ground software — often monolithic CORBA architectures or proprietary C++ applications — must be replaced without losing the operational continuity that mission owners depend on. Moving to cloud-native does not relax reliability requirements; a modernized system must meet or exceed them while improving deployment velocity, scalability, and interoperability.

Rutagon's approach to satellite C2 modernization applies the same cloud-native microservices, managed Kubernetes, and DevSecOps pipeline patterns we use in production software delivery — adapted for the unique demands of space operations.

Why Cloud-Native for Satellite C2

Traditional satellite ground systems are purpose-built for single-mission, single-satellite operations. As satellite constellations grow — whether commercial LEO networks or government surveillance assets — monolithic architectures fail to scale. Command queuing for a single satellite is manageable; orchestrating contact windows across 30 satellites with overlapping pass times from multiple ground sites requires dynamic scheduling that a monolith can't handle efficiently.

Cloud-native architecture solves this through horizontal scaling, event-driven processing, and service decomposition. A telemetry processing spike during a critical contact window doesn't take down command uplink capability — they're independent services with independent scaling policies.

Additional drivers for cloud-native satellite C2:

  • Multi-site operation: Cloud-native services run identically across regions, enabling multi-site ground station presence without custom site-specific software
  • COTS innovation rate: AWS, Google Cloud, and Azure add capabilities that custom C2 platforms can't keep pace with — ML, time-series analytics, container orchestration
  • Compliance automation: Government satellite programs increasingly require FedRAMP/DISA IL5 authorization — cloud-native architectures on GovCloud have an established ATO path that custom hardware does not

Reference Architecture: Microservices Decomposition

A cloud-native satellite C2 system decomposes into functional microservices, each independently deployable and scalable:

┌─────────────────────────────────────────────────────────────────┐
│                    Ground System Architecture                    │
├──────────────────┬──────────────────┬──────────────────────────┤
│  Acquisition     │   Processing     │   Operations             │
├──────────────────┼──────────────────┼──────────────────────────┤
│ • Contact Mgmt   │ • Frame Sync     │ • Command Planning       │
│ • Track Service  │ • Demodulation   │ • Sequence Execution     │
│ • Station Ctrl   │ • CCSDS Decode   │ • Anomaly Monitor        │
├──────────────────┴──────────────────┴──────────────────────────┤
│                    Message Bus (Apache Kafka)                    │
├─────────────────────────────────────────────────────────────────┤
│               Time-Series Telemetry Store (InfluxDB / Timestream)│
├─────────────────────────────────────────────────────────────────┤
│               Kubernetes (EKS) — Managed Orchestration           │
└─────────────────────────────────────────────────────────────────┘

Each service publishes and subscribes to a central message bus. Telemetry received from a ground station antenna is published to a raw-frame topic; the frame sync service consumes it, synchronizes frames, and publishes synchronized frames; the CCSDS decode service processes Space Data Link Protocol frames into application packets; the monitoring service consumes application data and drives the operations display.
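As a concrete illustration of the first hop, the frame sync step can be sketched as a pure function that locates CCSDS attached sync markers in a raw byte stream. This is a minimal sketch: the fixed frame length is a mission-specific assumption, and real frame sync also handles bit slips and inverted data.

```python
SYNC_MARKER = b'\x1a\xcf\xfc\x1d'  # CCSDS attached sync marker (CCSDS 131.0-B-4)
FRAME_LENGTH = 1115                # assumed fixed transfer-frame length; mission-specific

def sync_frames(stream: bytes, frame_length: int = FRAME_LENGTH) -> list[bytes]:
    """Locate attached sync markers in a raw byte stream and slice out
    complete fixed-length transfer frames following each marker."""
    frames = []
    i = 0
    while True:
        i = stream.find(SYNC_MARKER, i)
        if i < 0 or i + len(SYNC_MARKER) + frame_length > len(stream):
            break  # no more markers, or a truncated frame at the tail
        start = i + len(SYNC_MARKER)
        frames.append(stream[start:start + frame_length])
        i = start + frame_length  # resume the search after this frame
    return frames
```

Each recovered frame would then be published to the synchronized-frame topic for the decode service to consume.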

Telemetry Ingestion Pipeline

Raw telemetry arrives from the ground station antenna interface at high rates during contact windows. The ingestion pipeline must handle burst loads during contacts and idle gracefully between passes.

from kafka import KafkaProducer
import logging

class TelemetryIngestionService:
    def __init__(self, kafka_bootstrap_servers: list, topic: str):
        self.producer = KafkaProducer(
            bootstrap_servers=kafka_bootstrap_servers,
            value_serializer=lambda v: v  # raw bytes
        )
        self.topic = topic
        self.logger = logging.getLogger(__name__)
    
    async def ingest_frame(self, raw_frame: bytes, source_id: str, timestamp_utc: str):
        """
        Publish a raw TM frame to the ingestion topic with metadata.
        Kafka handles the burst buffering during contact windows.
        """
        try:
            # Validate frame sync marker before publishing
            if not self._validate_frame_marker(raw_frame):
                self.logger.warning(f"Frame sync marker invalid from {source_id}")
                return
            
            metadata = {
                'source_ground_station': source_id,
                'receive_time_utc': timestamp_utc,
                'frame_length': len(raw_frame)
            }
            
            self.producer.send(
                self.topic,
                key=source_id.encode(),
                value=raw_frame,
                headers=[(k, v.encode()) for k, v in metadata.items()]
            )
            
        except Exception as e:
            self.logger.error(f"Telemetry ingestion error: {e}")
            # Dead letter queue for failed frames
            self._send_to_dlq(raw_frame, source_id, str(e))
    
    def _validate_frame_marker(self, frame: bytes) -> bool:
        """Validate CCSDS frame sync marker (publicly defined in CCSDS 131.0-B-4)"""
        SYNC_MARKER = b'\x1a\xcf\xfc\x1d'  # Standard CCSDS sync marker
        return len(frame) > 4 and frame[:4] == SYNC_MARKER

    def _send_to_dlq(self, raw_frame: bytes, source_id: str, error: str) -> None:
        """Publish a failed frame to a dead-letter topic for offline analysis."""
        self.producer.send(
            f"{self.topic}.dlq",
            key=source_id.encode(),
            value=raw_frame,
            headers=[('error', error.encode())]
        )

Kafka provides the burst buffering — during a 10-minute contact window, the antenna interface might deliver data at 100 MB/s. Downstream processing services consume at their own pace without backpressure causing upstream drops.

CCSDS Telemetry Decoding

The Consultative Committee for Space Data Systems (CCSDS) defines the telemetry transfer frame structure used by virtually all satellite programs. The decode layer transforms raw frames into structured transfer frames from which downstream services extract engineering values:

from dataclasses import dataclass
from struct import unpack

@dataclass
class TMTransferFrame:
    spacecraft_id: int
    virtual_channel_id: int
    frame_count: int
    data_field: bytes
    
def decode_tm_frame(raw_frame: bytes) -> TMTransferFrame:
    """
    Decode the CCSDS TM Transfer Frame primary header per CCSDS 132.0-B-2.
    The 6-byte primary header holds: version (2 bits), spacecraft ID
    (10 bits), virtual channel ID (3 bits), OCF flag (1 bit), master
    channel frame count (8 bits), virtual channel frame count (8 bits),
    and frame data field status (16 bits).
    """
    if len(raw_frame) < 6:
        raise ValueError("Frame too short for CCSDS primary header")
    
    # Primary header fields
    identification, counts, _data_field_status = unpack('>HHH', raw_frame[:6])
    
    spacecraft_id = (identification >> 4) & 0x3FF
    virtual_channel_id = (identification >> 1) & 0x7  # bit 0 is the OCF flag
    frame_count = counts & 0xFF  # virtual channel frame count
    
    return TMTransferFrame(
        spacecraft_id=spacecraft_id,
        virtual_channel_id=virtual_channel_id,
        frame_count=frame_count,
        data_field=raw_frame[6:]
    )

The decoded frames flow through subsequent services for mission-specific packet extraction, limit checking, and archival — each running as an independent Kubernetes deployment scaled by Kafka consumer group lag.
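One way to realize lag-based scaling is KEDA's Kafka scaler, which scales a deployment on consumer group lag. The sketch below is illustrative only: the bootstrap servers, consumer group, topic name, and thresholds are assumptions, not values from a specific deployment.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: telemetry-decode-scaler
spec:
  scaleTargetRef:
    name: telemetry-decode-service
  minReplicaCount: 3          # never below the HA baseline
  maxReplicaCount: 20         # cap burst scaling during contact windows
  triggers:
  - type: kafka
    metadata:
      bootstrapServers: kafka.ops.svc:9092   # illustrative address
      consumerGroup: telemetry-decode
      topic: tm.frames.synced                # illustrative topic name
      lagThreshold: "5000"    # target lag per replica before scaling out
```

Between passes, lag falls to zero and the deployment settles back to its minimum replica count.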

Command Planning and Execution

Command uplink architecture requires strict sequencing and verification. Commands must be confirmed delivered, executed, and verified through telemetry. The command execution service implements a state machine:

PLANNED → VALIDATED → QUEUED → UPLINKED → EXECUTED → VERIFIED
                                                         ↓
                                                    ANOMALY (if telemetry doesn't confirm)
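The transitions above can be encoded as a small, explicit state machine so that illegal jumps (say, PLANNED straight to UPLINKED) are rejected at the service boundary. This is a minimal sketch; the CmdState enum and advance helper are illustrative names, not a specific program's API.

```python
from enum import Enum, auto

class CmdState(Enum):
    PLANNED = auto()
    VALIDATED = auto()
    QUEUED = auto()
    UPLINKED = auto()
    EXECUTED = auto()
    VERIFIED = auto()
    ANOMALY = auto()

# Legal transitions; failed uplink or unconfirming telemetry routes to ANOMALY.
TRANSITIONS = {
    CmdState.PLANNED:   {CmdState.VALIDATED},
    CmdState.VALIDATED: {CmdState.QUEUED},
    CmdState.QUEUED:    {CmdState.UPLINKED},
    CmdState.UPLINKED:  {CmdState.EXECUTED, CmdState.ANOMALY},
    CmdState.EXECUTED:  {CmdState.VERIFIED, CmdState.ANOMALY},
}

def advance(current: CmdState, target: CmdState) -> CmdState:
    """Move a command to the next state, rejecting illegal transitions."""
    if target not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target
```

Persisting the state with each command record gives operators an auditable trail from planning through telemetry verification.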

Contact scheduling integrates with JPL HORIZONS-style ephemeris calculations (publicly available) to predict pass times and pre-stage command sequences. The scheduler runs as a Kubernetes CronJob, publishing contact window predictions to the operations queue.

Kubernetes Configuration for Space Operations

Space operations require high availability that standard Kubernetes configurations don't always provide by default:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: telemetry-decode-service
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0  # Zero-downtime: never reduce below target
      maxSurge: 1
  template:
    spec:
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: telemetry-decode-service
      containers:
      - name: telemetry-decode
        resources:
          requests:
            memory: "512Mi"
            cpu: "500m"
          limits:
            memory: "1Gi"
            cpu: "2"
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 15
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5

The topologySpreadConstraints ensure telemetry processing pods are spread across availability zones — a single AZ failure doesn't interrupt contact window processing.

ATO Path for Space Programs

Space Force and DoD satellite programs are increasingly pursuing continuous ATO (cATO) under the DoD DevSecOps Reference Design. Cloud-native ground software on GovCloud has an established path:

  • Architecture on EKS with IL4/IL5 boundary maps to existing DISA PA
  • Container security through Iron Bank and STIG-compliant base images
  • CI/CD pipeline generating ATO evidence on every commit
  • ConMon through AWS Security Hub and Config rules

See our Space Force ground software architecture insights and continuous ATO automation patterns for the full approach.

Rutagon's engineering experience spans commercial aerospace-scale software and defense cloud infrastructure. Contact Rutagon to discuss satellite C2 modernization for your program.

Frequently Asked Questions

Why use cloud-native architecture for satellite ground systems?

Cloud-native ground systems offer horizontal scalability for multi-satellite constellations, rapid deployment via CI/CD rather than site-specific installation, built-in high availability through managed Kubernetes, and an established FedRAMP/DISA authorization path. For programs managing multiple satellites with overlapping contact windows, event-driven microservices scale dynamically in ways that monolithic heritage ground software cannot.

How does CCSDS fit into a cloud-native ground system?

CCSDS (Consultative Committee for Space Data Systems) standards define the framing, encoding, and packet structures used by most satellite telemetry and command links. Cloud-native ground systems implement CCSDS decode as microservices that consume raw frames from ingestion topics and publish decoded packets to downstream processing. The CCSDS layer is isolated in a service — changes to how packets are structured don't require modifying the telemetry archive or the operations display.

What database is appropriate for satellite telemetry archival?

Time-series databases are optimal for telemetry — AWS Timestream (serverless, scales to billions of rows) and InfluxDB are both used in aerospace telemetry systems. For government programs requiring IL4/IL5, AWS Timestream in GovCloud is the most straightforward ATO path. Mission data requiring long-term archival flows to S3 with Glacier lifecycle policies for cost-effective cold storage.

How does a cloud-native ground system handle the multi-site antenna scenario?

Telemetry ingestion services run at each ground site (or connect to commercial SaaS antenna networks via API), publishing raw frames to a central Kafka cluster in the cloud. Geographic separation between ingestion and processing is handled by Kafka's replication — frames are durably buffered even if a processing region has issues. Multiple sites contribute frames for the same satellite, and frame deduplication handles overlapping coverage.
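That deduplication step can be sketched as a bounded LRU window keyed on (spacecraft ID, virtual channel ID, frame count). A bounded window is used rather than an unbounded seen-set because the 8-bit virtual channel frame count wraps; the class and window size here are illustrative assumptions.

```python
from collections import OrderedDict

class FrameDeduplicator:
    """Drop transfer frames already received from another ground site.

    Keys on (spacecraft ID, virtual channel ID, frame count) over a
    bounded LRU window, since the 8-bit VC frame count wraps around.
    """
    def __init__(self, window: int = 10_000):
        self.window = window
        self._seen: OrderedDict = OrderedDict()

    def is_duplicate(self, scid: int, vcid: int, frame_count: int) -> bool:
        key = (scid, vcid, frame_count)
        if key in self._seen:
            return True
        self._seen[key] = None
        if len(self._seen) > self.window:
            self._seen.popitem(last=False)  # evict the oldest entry
        return False
```

A consumer at the head of the processing pipeline checks each decoded frame against the window and drops duplicates before they reach limit checking or archival.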

What's the latency impact of cloud-based telemetry processing?

For most satellite programs, processing latency of 100–500ms is acceptable — telemetry displays don't require sub-millisecond updates. Edge cases: real-time commanding requiring <1 second turnaround from telemetry receipt to command uplink decision may require co-location of the command planning service with the ground station. This is a hybrid architecture where time-critical functions stay near the antenna while analytics and archival run centrally in the cloud.

Discuss your project with Rutagon

Contact Us →
