Amazon Elastic Kubernetes Service (EKS) on AWS GovCloud is increasingly the standard Kubernetes platform for DoD cloud programs — its managed control plane eliminates master node management, integrates natively with GovCloud IAM and networking, and supports STIG compliance at the node level. But "managed" doesn't mean "secure by default." Production EKS for defense programs requires specific security configuration that goes well beyond the defaults.
Here's Rutagon's production EKS configuration for government cloud deployments.
Terraform: EKS Cluster Baseline
The cluster itself is configured through Terraform from the start. Key non-default settings:
resource "aws_eks_cluster" "main" {
  name     = var.cluster_name
  role_arn = aws_iam_role.cluster.arn
  version  = var.kubernetes_version

  kubernetes_network_config {
    ip_family         = "ipv4"
    service_ipv4_cidr = "10.100.0.0/16"
  }

  # Private endpoint only: STIG K8s-001 (approximately)
  # No public API server endpoint
  vpc_config {
    subnet_ids              = var.private_subnet_ids
    security_group_ids      = [aws_security_group.cluster.id]
    endpoint_private_access = true
    endpoint_public_access  = false # Disable public endpoint
  }

  # Enable all control plane log types (required for audit trail)
  enabled_cluster_log_types = [
    "api", "audit", "authenticator",
    "controllerManager", "scheduler"
  ]

  # Secrets encryption (EKS STIG requirement)
  encryption_config {
    resources = ["secrets"]
    provider {
      key_arn = var.kms_secrets_key_arn
    }
  }

  tags = merge(var.common_tags, {
    "environment" = var.environment
    "il-level"    = var.il_level
  })
}
The private-only API endpoint is a critical STIG requirement: the Kubernetes API is reachable only from inside the VPC, so worker nodes, operator tooling such as kubectl, and CI/CD runners must run within the VPC or connect to it over VPN or Direct Connect.
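As an illustration of what "within the VPC" can mean for a CI/CD runner, the sketch below opens HTTPS from a runner security group to the cluster security group already referenced in the cluster resource. The variable `ci_runner_security_group_id` is a hypothetical input that does not appear in the original configuration; treat this as one possible arrangement rather than Rutagon's actual rule set.

```hcl
# Hypothetical input: security group attached to the in-VPC CI/CD runners
variable "ci_runner_security_group_id" {
  type        = string
  description = "Security group ID of CI/CD runners that need kubectl access"
}

# Allow HTTPS from the runners to the private Kubernetes API endpoint
resource "aws_security_group_rule" "api_from_ci" {
  type                     = "ingress"
  description              = "Kubernetes API access from CI/CD runners"
  from_port                = 443
  to_port                  = 443
  protocol                 = "tcp"
  security_group_id        = aws_security_group.cluster.id
  source_security_group_id = var.ci_runner_security_group_id
}
```

Referencing the runner's security group instead of a CIDR range keeps the rule valid when runner IP addresses change.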
Node Group Configuration
Worker nodes require STIG-compliant configuration at the OS level. Rutagon uses managed node groups with a custom AMI built on a STIG-hardened EKS-optimized Amazon Linux 2 base:
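resource "aws_eks_node_group" "application" {
  cluster_name = aws_eks_cluster.main.name
  # A name prefix (rather than a fixed name) lets create_before_destroy
  # bring up the replacement group without a name collision.
  node_group_name_prefix = "application-${var.environment}-"
  node_role_arn          = aws_iam_role.node.arn
  subnet_ids             = var.private_subnet_ids

  # Pins the AMI release version. If the STIG-hardened AMI is instead supplied
  # as image_id in the launch template, release_version must be omitted.
  release_version = var.stig_ami_release_version

  scaling_config {
    desired_size = var.desired_nodes
    max_size     = var.max_nodes
    min_size     = var.min_nodes
  }

  instance_types = [var.node_instance_type]

  # Launch template carries disk encryption (STIG compliance) and IMDSv2 settings
  launch_template {
    id      = aws_launch_template.node.id
    version = aws_launch_template.node.latest_version
  }

  lifecycle {
    # Let the autoscaler own desired_size after creation
    ignore_changes        = [scaling_config[0].desired_size]
    create_before_destroy = true
  }
}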
resource "aws_launch_template" "node" {
  name_prefix = "eks-node-${var.cluster_name}-"

  block_device_mappings {
    device_name = "/dev/xvda"
    ebs {
      volume_size           = 50
      volume_type           = "gp3"
      encrypted             = true
      kms_key_id            = var.kms_ebs_key_arn
      delete_on_termination = true
    }
  }

  metadata_options {
    http_endpoint               = "enabled"
    http_tokens                 = "required" # IMDSv2: prevents SSRF-based metadata attacks
    http_put_response_hop_limit = 1
  }
}
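IMDSv2 enforcement (http_tokens = "required") is mandatory here: requiring a session token stops SSRF vulnerabilities from being used to read instance metadata and steal the node's IAM credentials. The hop limit of 1 also keeps pods that do not use host networking from reaching the node's metadata endpoint, so workloads obtain AWS credentials through IRSA instead.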
IRSA: Pod-Level IAM Without Long-Lived Credentials
IAM Roles for Service Accounts (IRSA) allows pods to assume IAM roles through OIDC federation — no long-lived IAM credentials in pods. This is fundamental to Rutagon's zero-long-lived-credentials posture:
# OIDC provider for IRSA
data "tls_certificate" "cluster" {
  url = aws_eks_cluster.main.identity[0].oidc[0].issuer
}

resource "aws_iam_openid_connect_provider" "cluster" {
  client_id_list  = ["sts.amazonaws.com"]
  thumbprint_list = [data.tls_certificate.cluster.certificates[0].sha1_fingerprint]
  url             = aws_eks_cluster.main.identity[0].oidc[0].issuer
}

locals {
  # Issuer without the scheme, used as the prefix of the OIDC condition keys
  oidc_issuer = replace(aws_iam_openid_connect_provider.cluster.url, "https://", "")
}

# IAM role for a specific service account
resource "aws_iam_role" "app_service" {
  name = "${var.cluster_name}-app-service"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Principal = {
        Federated = aws_iam_openid_connect_provider.cluster.arn
      }
      Action = "sts:AssumeRoleWithWebIdentity"
      Condition = {
        StringEquals = {
          # Only this namespace/service account pair may assume the role
          "${local.oidc_issuer}:sub" = "system:serviceaccount:${var.namespace}:${var.service_account_name}"
          # Also pin the token audience to STS
          "${local.oidc_issuer}:aud" = "sts.amazonaws.com"
        }
      }
    }]
  })
}
Each microservice gets a dedicated IAM role scoped to exactly the permissions it needs. The trust policy is locked to the specific service account — a compromised pod can't assume a different pod's IAM role.
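To make "scoped to exactly the permissions it needs" concrete, here is a minimal sketch that attaches a read-only S3 policy to the role and creates the annotated service account. The variable `app_bucket_name` and the choice of `s3:GetObject` are hypothetical, and the service account resource assumes the hashicorp/kubernetes provider is configured against the cluster. Neither is shown in the original configuration.

```hcl
# Resolves to "aws-us-gov" in GovCloud, so ARNs are not hard-coded to "aws"
data "aws_partition" "current" {}

# Hypothetical input: the one bucket this service reads from
variable "app_bucket_name" {
  type = string
}

# Least-privilege inline policy: read-only access to objects in a single bucket
resource "aws_iam_role_policy" "app_service_s3_read" {
  name = "s3-read"
  role = aws_iam_role.app_service.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["s3:GetObject"]
      Resource = "arn:${data.aws_partition.current.partition}:s3:::${var.app_bucket_name}/*"
    }]
  })
}

# Service account annotated with the role ARN so EKS injects a web identity token
resource "kubernetes_service_account_v1" "app" {
  metadata {
    name      = var.service_account_name
    namespace = var.namespace
    annotations = {
      "eks.amazonaws.com/role-arn" = aws_iam_role.app_service.arn
    }
  }
}
```

The service account name and namespace must match the values in the role's trust policy exactly, otherwise the AssumeRoleWithWebIdentity call is denied.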
Network Policy Enforcement
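EKS doesn't enforce Kubernetes NetworkPolicies unless the cluster runs a CNI plugin or policy engine that supports them. On GovCloud, the AWS VPC CNI supports network policies natively from version 1.14 onward (on Kubernetes 1.25 or later), but the feature must be enabled in the add-on configuration. Configure default-deny: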
# Default deny all ingress and egress for namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
Then explicitly allow only the required communication paths, including DNS egress to CoreDNS, which a default-deny egress policy also blocks. This supports NIST SC-7 (Boundary Protection) control requirements and the Kubernetes STIG's network isolation controls.
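The following sketch shows one way to enable the VPC CNI policy agent and layer a single allow rule on top of default-deny. It assumes the VPC CNI is managed as an EKS add-on through Terraform and that the hashicorp/kubernetes provider is configured. The `app=frontend` and `app=api` labels and port 8080 are hypothetical placeholders.

```hcl
# Turn on the VPC CNI network policy agent
resource "aws_eks_addon" "vpc_cni" {
  cluster_name = aws_eks_cluster.main.name
  addon_name   = "vpc-cni"
  configuration_values = jsonencode({
    enableNetworkPolicy = "true"
  })
}

# Hypothetical allow rule: pods labeled app=frontend may reach app=api on TCP 8080
resource "kubernetes_network_policy_v1" "allow_frontend_to_api" {
  metadata {
    name      = "allow-frontend-to-api"
    namespace = "production"
  }
  spec {
    pod_selector {
      match_labels = { app = "api" }
    }
    policy_types = ["Ingress"]
    ingress {
      from {
        pod_selector {
          match_labels = { app = "frontend" }
        }
      }
      ports {
        port     = "8080"
        protocol = "TCP"
      }
    }
  }
}
```

Because the default-deny policy also covers egress, the frontend pods would additionally need a matching egress rule toward the api pods before traffic flows.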
Frequently Asked Questions
What Kubernetes version is recommended for DoD GovCloud programs?
Use the latest EKS-supported Kubernetes version that your application components support. EKS keeps each version in standard support for approximately 14 months after it becomes available on EKS, followed by a paid extended-support window. Staying within one or two minor versions of the latest supported release ensures access to security patches and STIG updates. DISA releases EKS STIG updates tracking Kubernetes versions; verify STIG coverage for the specific version before committing to it for a long-running program.
How do you handle EKS node upgrades without downtime?
Use managed node group rolling updates with max_unavailable = 1 in the node group's update_config (maxUnavailable in the EKS API). EKS launches replacement nodes, then cordons and drains the old nodes one at a time before terminating them. Combined with PodDisruptionBudgets on critical workloads (at least 1 replica must stay running during a disruption, which requires two or more replicas so the drain can proceed), node upgrades are typically zero-downtime.
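A minimal sketch of both settings follows. The node group is a stand-alone illustration that reuses the cluster, role, and subnet references from the configuration above; its name and sizes are hypothetical. The PodDisruptionBudget assumes the hashicorp/kubernetes provider and a hypothetical workload labeled `app=api`.

```hcl
# Illustrative node group showing only the arguments relevant to rolling updates
resource "aws_eks_node_group" "rolling_example" {
  cluster_name    = aws_eks_cluster.main.name
  node_group_name = "rolling-example"
  node_role_arn   = aws_iam_role.node.arn
  subnet_ids      = var.private_subnet_ids

  scaling_config {
    desired_size = 3
    max_size     = 4
    min_size     = 2
  }

  # Replace at most one node at a time during version or AMI updates
  update_config {
    max_unavailable = 1
  }
}

# Keep at least one pod of the hypothetical "api" workload running during node drains
resource "kubernetes_pod_disruption_budget_v1" "api" {
  metadata {
    name      = "api-pdb"
    namespace = "production"
  }
  spec {
    min_available = "1"
    selector {
      match_labels = { app = "api" }
    }
  }
}
```

In practice the update_config block would go inside the existing application node group rather than a separate resource.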
What's the STIG compliance burden for EKS?
The Kubernetes STIG (DISA releases separate STIGs for Kubernetes, EKS, and underlying OS) has approximately 90 controls. Using a managed EKS cluster reduces the burden — AWS manages control plane node security; the program is responsible for worker node OS, Kubernetes API configuration, and workload-level controls. Using a STIG-hardened node AMI and Iron Bank images eliminates the majority of worker node and container findings.
Can EKS on GovCloud reach the internet for package downloads?
The private API endpoint setting does not control outbound internet access; that depends on VPC routing. Nodes in private subnets with no route to a NAT Gateway cannot reach the internet. For air-gap requirements, this is intentional: all images must be mirrored to an ECR repository in the same GovCloud account before deployment, and nodes reach ECR, S3, STS, and CloudWatch Logs through VPC endpoints. For programs that need some internet access (pulling from approved external registries, for example), a NAT Gateway combined with restricted egress rules on the node security groups, or an egress proxy or firewall, allows controlled outbound access while maintaining private API endpoint security.
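For the no-internet case, nodes typically reach AWS services through VPC endpoints. The sketch below is one possible set, not a definitive list for every workload. The variables `aws_region`, `vpc_id`, `endpoint_security_group_id`, and `private_route_table_ids` are hypothetical inputs that do not appear in the original configuration.

```hcl
# Hypothetical inputs, not part of the original configuration
variable "aws_region" {
  type    = string
  default = "us-gov-west-1"
}
variable "vpc_id" { type = string }
variable "endpoint_security_group_id" { type = string }
variable "private_route_table_ids" { type = list(string) }

# Interface endpoints that nodes use to pull images, obtain credentials, and ship logs
resource "aws_vpc_endpoint" "interface" {
  for_each            = toset(["ecr.api", "ecr.dkr", "sts", "ec2", "logs"])
  vpc_id              = var.vpc_id
  service_name        = "com.amazonaws.${var.aws_region}.${each.key}"
  vpc_endpoint_type   = "Interface"
  subnet_ids          = var.private_subnet_ids
  security_group_ids  = [var.endpoint_security_group_id]
  private_dns_enabled = true
}

# Gateway endpoint for S3, where ECR stores image layers
resource "aws_vpc_endpoint" "s3" {
  vpc_id            = var.vpc_id
  service_name      = "com.amazonaws.${var.aws_region}.s3"
  vpc_endpoint_type = "Gateway"
  route_table_ids   = var.private_route_table_ids
}
```

The endpoint security group needs to allow inbound HTTPS (TCP 443) from the node security group for the interface endpoints to be usable.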
How does EKS integrate with GovCloud IAM and CloudTrail?
EKS control plane logs, including the audit log, are delivered to CloudWatch Logs once the log types are enabled (as shown in the Terraform configuration above). CloudTrail captures EKS API calls (CreateCluster, DescribeNodegroup, etc.) as management events. Pod access to AWS services (S3, DynamoDB, SSM Parameter Store) is attributed in CloudTrail to the IRSA role session: management events are recorded by default, while S3 object-level and DynamoDB item-level calls are data events that must be enabled on the trail. This combination provides the audit trail required for NIST AU (Audit and Accountability) controls.
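Object-level S3 calls are CloudTrail data events, which a trail records only when it selects them. The sketch below shows one way to do that for a single bucket. The trail name and the variables `trail_bucket_name` and `audited_bucket_arn` are hypothetical, and the destination bucket is assumed to already carry a bucket policy that lets CloudTrail write to it.

```hcl
# Hypothetical inputs: the trail's destination bucket and the application bucket to audit
variable "trail_bucket_name" { type = string }
variable "audited_bucket_arn" { type = string } # e.g. arn:aws-us-gov:s3:::my-app-bucket

resource "aws_cloudtrail" "workload_audit" {
  name                       = "${var.cluster_name}-workload-audit"
  s3_bucket_name             = var.trail_bucket_name
  enable_log_file_validation = true

  event_selector {
    read_write_type           = "All"
    include_management_events = true

    # Object-level S3 calls made by IRSA roles are data events and must be selected explicitly
    data_resource {
      type   = "AWS::S3::Object"
      values = ["${var.audited_bucket_arn}/"]
    }
  }
}
```

The trailing slash on the ARN scopes the selector to all objects in that one bucket, which keeps data event volume and cost bounded.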