PostgreSQL High Availability in GovCloud

Updated April 2026 · 7 min read

Relational database resilience in government cloud programs is not optional — NIST 800-53 CP-9 (Information System Backup) and CP-10 (Information System Recovery and Reconstitution) require documented, tested recovery procedures. PostgreSQL on AWS GovCloud provides a path to HA that meets these requirements through managed services, but the configuration decisions matter significantly.

Rutagon runs Aurora Serverless v2 (PostgreSQL-compatible) in production — here's how to architect PostgreSQL HA for federal workloads.

PostgreSQL HA Options on GovCloud

GovCloud offers four PostgreSQL HA paths:

Option                        | RTO      | RPO          | Use Case
RDS PostgreSQL Multi-AZ       | ~1-2 min | ~0           | Steady workloads, predictable size
Aurora PostgreSQL Multi-AZ    | ~30 sec  | ~0           | Variable load, read replicas needed
Aurora Serverless v2          | ~30 sec  | ~0           | Spiky or unpredictable load
Self-managed on EC2 + Patroni | Variable | Configurable | Custom requirements, cost optimization

Rutagon's recommendation for most government programs: Aurora PostgreSQL with Multi-AZ for production workloads, or Aurora Serverless v2 when the workload is variable or the engineering cost of building and operating EC2 clusters is not justified. Self-managed Patroni on EC2 is appropriate when the program needs control over the exact PostgreSQL version or requires extensions that the managed services do not support.

Aurora PostgreSQL Multi-AZ Configuration

resource "aws_rds_cluster" "gov_database" {
  cluster_identifier          = "mission-system-db"
  engine                      = "aurora-postgresql"
  engine_version              = "15.4"
  database_name               = "mission_data"
  master_username             = "dbadmin"  # "admin" is a reserved word for PostgreSQL engines on RDS
  manage_master_user_password = true  # Secrets Manager managed

  # Spread the cluster across three AZs (writer and reader instances are defined below)
  availability_zones = [
    "us-gov-west-1a",
    "us-gov-west-1b",
    "us-gov-west-1c"
  ]

  # FIPS 140-2: KMS CMK encryption required for IL4/IL5
  storage_encrypted = true
  kms_key_id       = aws_kms_key.database_cmk.arn

  # Backup configuration: CP-9 compliance
  backup_retention_period   = 35  # 35 days is Aurora's maximum native retention
  preferred_backup_window   = "03:00-04:00"  # UTC — off-peak
  copy_tags_to_snapshot     = true

  # Deletion protection: prevents accidental data loss
  deletion_protection = true

  # CloudWatch log export: SC-5, AU-14 (Aurora PostgreSQL exports the "postgresql" log type)
  enabled_cloudwatch_logs_exports = ["postgresql"]

  vpc_security_group_ids = [aws_security_group.database.id]
  db_subnet_group_name   = aws_db_subnet_group.private.name

  tags = {
    Classification  = "CUI"
    Environment     = "production"
    BackupFrequency = "continuous"
    RetentionDays   = "35"
  }
}

# Primary writer instance
resource "aws_rds_cluster_instance" "writer" {
  identifier          = "mission-db-writer"
  cluster_identifier  = aws_rds_cluster.gov_database.id
  instance_class      = "db.r6g.xlarge"
  engine              = aws_rds_cluster.gov_database.engine
  publicly_accessible = false
  availability_zone   = "us-gov-west-1a"  # Pinned so the reader lands in a different AZ

  monitoring_interval = 60
  monitoring_role_arn = aws_iam_role.rds_enhanced_monitoring.arn

  performance_insights_enabled          = true
  performance_insights_retention_period = 7
  performance_insights_kms_key_id       = aws_kms_key.database_cmk.arn
}

# Read replica in separate AZ for failover
resource "aws_rds_cluster_instance" "reader" {
  identifier          = "mission-db-reader"
  cluster_identifier  = aws_rds_cluster.gov_database.id
  instance_class      = "db.r6g.xlarge"
  engine              = aws_rds_cluster.gov_database.engine
  publicly_accessible = false

  availability_zone = "us-gov-west-1b"  # Separate AZ from writer
}

The manage_master_user_password = true parameter tells Aurora to store and rotate the master password in Secrets Manager automatically — implementing IA-5(1) without manual credential management. See our secrets management in GovCloud guide for the full credential management pattern.
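As a minimal sketch of consuming that managed secret from application code: the helper below assumes the secret's JSON payload carries at least `username` and `password` keys, and that the host and database name come from application configuration. The function names and sample values are hypothetical and exist only for illustration.

```python
import json


def connection_kwargs(secret_string, host, dbname, port=5432):
    """Merge a Secrets Manager payload with endpoint settings from config.

    Assumption: the secret JSON has "username" and "password" keys.
    """
    secret = json.loads(secret_string)
    missing = {"username", "password"} - secret.keys()
    if missing:
        raise KeyError(f"secret is missing keys: {sorted(missing)}")
    return {
        "host": host,
        "port": port,
        "dbname": dbname,
        "user": secret["username"],
        "password": secret["password"],
        "sslmode": "verify-full",
    }


def fetch_secret_string(secret_arn, region="us-gov-west-1"):
    """Read the current secret value; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the pure helper above has no AWS dependency

    client = boto3.client("secretsmanager", region_name=region)
    return client.get_secret_value(SecretId=secret_arn)["SecretString"]
```

Because the password rotates, fetching the secret when a connection is opened (rather than caching it for the life of the process) avoids authentication failures after a rotation.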

Aurora Serverless v2: Variable Workload HA

Rutagon uses Aurora Serverless v2 in production for workloads with variable load profiles. For government programs with unpredictable query volumes — mission planning bursts, reporting windows — Serverless v2 scales ACU (Aurora Capacity Units) from a minimum to maximum within seconds:

resource "aws_rds_cluster" "serverless_gov" {
  cluster_identifier = "mission-serverless-db"
  engine             = "aurora-postgresql"
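  engine_version     = "15.4"

  master_username             = "dbadmin"
  manage_master_user_password = true  # Secrets Manager managed, as in the provisioned example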

  serverlessv2_scaling_configuration {
    min_capacity = 0.5   # Minimum: reduces cost when idle
    max_capacity = 64    # Maximum: scales to handle peak
  }

  storage_encrypted = true
  kms_key_id       = aws_kms_key.database_cmk.arn

  # Multi-AZ for HA: one writer, one reader
  availability_zones = ["us-gov-west-1a", "us-gov-west-1b"]

  backup_retention_period = 35
  deletion_protection     = true
}

resource "aws_rds_cluster_instance" "serverless_writer" {
  cluster_identifier = aws_rds_cluster.serverless_gov.id
  instance_class     = "db.serverless"
  engine             = aws_rds_cluster.serverless_gov.engine
}

resource "aws_rds_cluster_instance" "serverless_reader" {
  cluster_identifier = aws_rds_cluster.serverless_gov.id
  instance_class     = "db.serverless"
  engine             = aws_rds_cluster.serverless_gov.engine
  availability_zone  = "us-gov-west-1b"  # Different AZ for failover
}

The reader instance in a separate AZ means Aurora can fail over in ~30 seconds — acceptable for most government mission applications, and measurably better than the 1–2 minute RDS Multi-AZ failover.
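During that failover window, new connection attempts fail until the cluster endpoint points at the promoted instance, so the application needs to retry rather than crash. The sketch below shows one way to do that with capped exponential backoff and jitter. `connect_with_retry` and its parameters are hypothetical names, and the 60-second budget is an assumption chosen to sit comfortably above the ~30-second failover figure.

```python
import random
import time


def connect_with_retry(connect, retryable=(Exception,), max_wait=60.0,
                       base_delay=0.5, max_delay=8.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Call connect() until it succeeds or max_wait seconds have elapsed."""
    deadline = clock() + max_wait
    attempt = 0
    while True:
        try:
            return connect()
        except retryable:
            attempt += 1
            # Exponential backoff with full jitter, capped at max_delay
            delay = random.uniform(0, min(max_delay, base_delay * 2 ** attempt))
            if clock() + delay > deadline:
                raise  # give up: the failover budget is exhausted
            sleep(delay)
```

With psycopg2, a call might look like `connect_with_retry(get_gov_db_connection, retryable=(psycopg2.OperationalError,))`, so that only connection-level errors are retried.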

FIPS 140-2 Compliance

GovCloud Aurora automatically uses FIPS 140-2 validated cryptographic modules for data at rest when a CMK is specified. For data in transit:

# Enforce SSL/TLS connection from application
import psycopg2
import boto3
import json

def get_gov_db_connection():
    """Connect to Aurora with TLS enforcement and Secrets Manager credentials"""
    
    # Retrieve credentials from Secrets Manager
    sm = boto3.client('secretsmanager', region_name='us-gov-west-1')
    secret = json.loads(
        sm.get_secret_value(SecretId='/production/database/credentials')['SecretString']
    )
    
    # sslmode=verify-full enforces certificate validation
    # sslrootcert points to the RDS CA certificate bundle for GovCloud
    conn = psycopg2.connect(
        host=secret['host'],
        port=5432,
        database=secret['dbname'],
        user=secret['username'],
        password=secret['password'],
        sslmode='verify-full',
        sslrootcert='/etc/ssl/certs/rds-us-gov-west-1-bundle.pem'
    )
    return conn

The RDS CA certificate for GovCloud is available from the AWS RDS documentation. Download and bundle it in your application container image — sslmode=verify-full ensures the server certificate is validated against this CA.
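To produce evidence that a session is actually encrypted, the application can query PostgreSQL's `pg_stat_ssl` view for its own backend. The following is a sketch under the assumption of a direct connection to the cluster and a DB-API cursor such as the one psycopg2 provides; `assert_session_encrypted` and the allowed-version list are illustrative choices, not a mandated policy.

```python
def assert_session_encrypted(cursor, allowed_versions=("TLSv1.2", "TLSv1.3")):
    """Raise if the current session is not using an allowed TLS version.

    `cursor` is any DB-API cursor on an open PostgreSQL connection.
    """
    cursor.execute(
        "SELECT ssl, version FROM pg_stat_ssl WHERE pid = pg_backend_pid()"
    )
    row = cursor.fetchone()
    if row is None or not row[0]:
        raise RuntimeError("session is not encrypted with TLS")
    if row[1] not in allowed_versions:
        raise RuntimeError(f"unexpected TLS version: {row[1]}")
    return row[1]
```

Running this once at startup and logging the returned version gives a simple, repeatable data point for SC-8 evidence.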

Backup and Recovery Testing: CP-9 and CP-10

NIST CP-9 requires regular backups. CP-10 requires testing recovery. Aurora satisfies CP-9 with continuous backup to S3 and 35-day point-in-time recovery. For CP-10, document and automate recovery testing:

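#!/bin/bash
# Automated quarterly PITR test: restore to a non-production environment
set -euo pipefail

CLUSTER_ID="mission-system-db"
TEST_CLUSTER_ID="mission-system-db-recovery-test-$(date +%Y%m%d)"
# GNU date syntax: restore to a point two hours in the past (UTC)
RESTORE_TO_TIME="$(date -u -d '2 hours ago' '+%Y-%m-%dT%H:%M:%S+00:00')"
REGION="us-gov-west-1"

# Restore the cluster volume to the chosen point in time
aws rds restore-db-cluster-to-point-in-time \
  --source-db-cluster-identifier "$CLUSTER_ID" \
  --db-cluster-identifier "$TEST_CLUSTER_ID" \
  --restore-to-time "$RESTORE_TO_TIME" \
  --vpc-security-group-ids sg-xxxxxx \
  --db-subnet-group-name private-subnet-group \
  --region "$REGION"

# The restore creates only the cluster; add an instance so it accepts connections
aws rds create-db-instance \
  --db-instance-identifier "${TEST_CLUSTER_ID}-writer" \
  --db-cluster-identifier "$TEST_CLUSTER_ID" \
  --db-instance-class db.r6g.xlarge \
  --engine aurora-postgresql \
  --region "$REGION"

# Block until the instance is reachable
aws rds wait db-instance-available \
  --db-instance-identifier "${TEST_CLUSTER_ID}-writer" \
  --region "$REGION"

echo "Recovery test cluster $TEST_CLUSTER_ID created"
echo "Validate application connectivity before deleting"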
Record the RTO/RPO achieved in the test. This documentation becomes CP-10 evidence in your ATO package and satisfies the "test and document" requirement.
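One way to capture those numbers consistently is to compute them from timestamps taken during the test and emit a small JSON record. This is a sketch, not a prescribed evidence format: the field names are hypothetical, and the RPO figure is approximated as the gap between the requested restore time and the newest record found in the restored cluster, which assumes the source database was receiving writes at the time.

```python
import json
from datetime import datetime, timezone


def build_recovery_record(cluster_id, restore_started, validated_at,
                          requested_restore_time, newest_record_time):
    """Summarize one recovery test as a JSON-serializable dict."""
    # RTO: wall-clock time from issuing the restore to validated connectivity
    rto = (validated_at - restore_started).total_seconds()
    # RPO (approximation): data missing relative to the requested restore point
    rpo = max(0.0, (requested_restore_time - newest_record_time).total_seconds())
    return {
        "control": "CP-10",
        "cluster_id": cluster_id,
        "test_date": restore_started.date().isoformat(),
        "observed_rto_seconds": rto,
        "observed_rpo_seconds": rpo,
    }


if __name__ == "__main__":
    utc = timezone.utc
    record = build_recovery_record(
        "mission-system-db",
        restore_started=datetime(2026, 4, 1, 12, 0, 0, tzinfo=utc),
        validated_at=datetime(2026, 4, 1, 12, 25, 30, tzinfo=utc),
        requested_restore_time=datetime(2026, 4, 1, 10, 0, 0, tzinfo=utc),
        newest_record_time=datetime(2026, 4, 1, 9, 59, 58, tzinfo=utc),
    )
    print(json.dumps(record, indent=2))
```

Storing each record alongside the test script's output keeps the quarterly results comparable over time.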

Connection Pooling for Production Workloads

Government applications under load need connection pooling to prevent Aurora from exhausting the connection limit. RDS Proxy provides managed connection pooling:

resource "aws_db_proxy" "main" {
  name                   = "mission-db-proxy"
  debug_logging          = false
  engine_family          = "POSTGRESQL"
  idle_client_timeout    = 1800
  require_tls            = true
  role_arn               = aws_iam_role.db_proxy_role.arn
  vpc_security_group_ids = [aws_security_group.db_proxy.id]
  vpc_subnet_ids         = aws_db_subnet_group.private.subnet_ids

  auth {
    auth_scheme = "SECRETS"
    description = "Aurora credentials from Secrets Manager"
    iam_auth    = "REQUIRED"  # IAM authentication for all proxy connections
    secret_arn  = aws_secretsmanager_secret.db_credentials.arn
  }
}

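iam_auth = "REQUIRED" means every connection to the proxy authenticates with IAM, so the application never handles a database password. Combined with IRSA (IAM Roles for Service Accounts on EKS), the application layer holds zero static credentials; the proxy itself reads the database credentials from Secrets Manager.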
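On the application side, IAM authentication means requesting a short-lived token and passing it as the password. The sketch below uses boto3's `generate_db_auth_token` on an RDS client; the helper name, the `ca_bundle` parameter, and the idea of passing the client in (which keeps the function easy to test) are assumptions made for illustration.

```python
def proxy_connect_kwargs(rds_client, host, user, dbname, ca_bundle,
                         region="us-gov-west-1", port=5432):
    """Build connection settings that use a short-lived IAM token as the password."""
    token = rds_client.generate_db_auth_token(
        DBHostname=host, Port=port, DBUsername=user, Region=region
    )
    return {
        "host": host,
        "port": port,
        "dbname": dbname,
        "user": user,
        "password": token,          # expires after a few minutes; never stored
        "sslmode": "verify-full",   # IAM tokens must travel over TLS
        "sslrootcert": ca_bundle,   # path to a CA bundle that trusts the proxy endpoint
    }
```

A caller would typically create the client with `boto3.client("rds", region_name="us-gov-west-1")` and pass the result to `psycopg2.connect(**kwargs)`, requesting a fresh token for each new connection.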

Related Patterns

Database HA integrates with secrets management in GovCloud for credential handling, infrastructure compliance scanning with Terraform for pre-deploy policy checks, and Terraform state management for GovCloud for safe infrastructure as code.

Rutagon has built PostgreSQL-backed production SaaS platforms handling concurrent users across Aurora Serverless v2 with zero credential exposure. Contact Rutagon to architect database resilience for your federal program.

Frequently Asked Questions

What's the RTO/RPO for Aurora PostgreSQL in GovCloud?

Aurora PostgreSQL Multi-AZ achieves ~30-second RTO (automatic failover to read replica) with near-zero RPO (continuous transaction log replication). Standard RDS Multi-AZ achieves ~1–2 minute RTO. For CP-10 documentation, conduct quarterly recovery tests and record actual observed RTO/RPO rather than citing theoretical values — assessors expect empirical evidence.

Does Aurora in GovCloud meet FIPS 140-2 requirements?

Yes. AWS GovCloud regions use FIPS 140-2 validated hardware and modules for cryptographic operations. Aurora with CMK encryption at rest and TLS for data in transit (enforced with sslmode=verify-full) satisfies SC-8(1) and SC-28 controls. Use a customer-managed KMS key (CMK) — not the AWS-managed default key — for full compliance with SC-12 (key management) requirements.

Should government programs use Aurora Serverless v2 or provisioned Aurora?

Aurora Serverless v2 is appropriate for variable workloads where provisioning a fixed instance size creates either over-provisioning waste or under-provisioning risk. It scales within seconds and maintains HA through the same multi-AZ reader configuration as provisioned Aurora. For steady, predictable workloads, provisioned Aurora is cost-predictable and equally compliant. Rutagon uses Serverless v2 in production and recommends it wherever load is variable or hard to forecast.

How do we handle database schema migrations in a government CI/CD pipeline?

Schema migrations need the same ATO evidence generation as code deployments. Use a migration tool like Flyway or Alembic with sequential versioned migrations checked into source control. Run migrations as a pre-deployment step in the CI/CD pipeline, with migration logs captured as audit evidence. Never apply manual schema changes in production — all changes through the pipeline, all changes logged.
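A pipeline can enforce the "sequential versioned migrations" rule before anything touches the database. The sketch below assumes Flyway-style file names (`V<number>__<description>.sql`) and applies a stricter policy than the tools themselves require: versions must start at 1 with no gaps or duplicates. `validate_migrations` is a hypothetical helper, not part of Flyway or Alembic.

```python
import re

# Flyway-style versioned migration name, e.g. V3__add_audit_table.sql
_PATTERN = re.compile(r"^V(\d+)__\w+\.sql$")


def validate_migrations(filenames):
    """Return versions in order; raise ValueError on bad names, duplicates, or gaps."""
    versions = []
    for name in filenames:
        match = _PATTERN.match(name)
        if match is None:
            raise ValueError(f"not a versioned migration name: {name}")
        versions.append(int(match.group(1)))
    versions.sort()
    expected = list(range(1, len(versions) + 1))
    if versions != expected:
        raise ValueError(f"expected versions {expected}, found {versions}")
    return versions
```

Running this check as an early CI step, and keeping its output with the build logs, adds one more piece of audit evidence for each schema change.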

What's the backup retention period for FedRAMP High government databases?

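The retention period comes from your program's requirements rather than from Aurora. CP-9 requires backups at an organization-defined frequency, and CP-9(1) covers testing those backups for reliability and integrity, so confirm the required retention against your SSP and agency guidance. Aurora's maximum native retention is 35 days. If your program requires longer retention, for example 90 days, configure automated Aurora snapshot export to S3 with a lifecycle policy that keeps the exports for the full period. Exported snapshots are stored as Parquet data rather than as restorable clusters, so retain manual cluster snapshots as well if you need to restore from the archive. Document the arrangement in your SSP: the native 35-day retention plus S3 archival together satisfy the CP-9 requirement.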
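The S3 lifecycle rule for the exported snapshots might look like the following sketch. It builds the dictionary that boto3's `put_bucket_lifecycle_configuration` accepts; the helper name, the rule ID format, and the example prefix are assumptions for illustration, and the 90-day default simply mirrors the retention figure discussed above.

```python
def snapshot_lifecycle_config(prefix, retention_days=90):
    """Build an S3 lifecycle configuration that expires exports after retention_days."""
    if retention_days < 1:
        raise ValueError("retention_days must be at least 1")
    return {
        "Rules": [
            {
                "ID": f"expire-after-{retention_days}-days",
                "Filter": {"Prefix": prefix},   # only objects under this prefix
                "Status": "Enabled",
                "Expiration": {"Days": retention_days},
            }
        ]
    }
```

Applying it would be a call such as `s3.put_bucket_lifecycle_configuration(Bucket=bucket_name, LifecycleConfiguration=snapshot_lifecycle_config("aurora-exports/"))`, where `s3` is a boto3 S3 client and `bucket_name` is your export bucket.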
