
Cloud Sub Knowledge Transfer on Defense Programs

Updated May 2026 · 6 min read

Cloud infrastructure knowledge is often one of the least-documented assets in a defense program. When a cloud engineering sub transitions off a program — through contract end, re-competition, or deliberate transition — the knowledge gap left behind can disrupt program operations, delay ATO renewals, and create security incidents from misconfigured infrastructure operated by teams who don't fully understand it.

Here's how Rutagon approaches knowledge transfer to protect program continuity.

Why Cloud Knowledge Transfer Fails

Cloud infrastructure knowledge exists at three levels, and most programs only document the obvious one:

Level 1 — Procedural documentation: How to perform specific operations (deploy, scale, patch). This is what most runbooks cover.

Level 2 — Architecture decisions: Why the system is designed the way it is — which design alternatives were considered and rejected, what constraints drove specific choices, what breaking changes would invalidate assumptions.

Level 3 — Tribal knowledge: Undocumented operational knowledge — the specific CloudFormation quirk that requires a certain workaround, the IAM permission that was added for an obscure edge case and is now load-bearing, the alert that fires spuriously on Monday mornings because of a scheduled batch job.

Gaps in Level 2 and Level 3 knowledge are what create post-transition problems. A new team that has the runbooks but not the architecture decision records will make changes that violate hidden assumptions, and will not know why things break.

Rutagon's Documentation Standards

Rutagon builds documentation as a continuous deliverable, not an end-of-program activity. Documentation artifacts produced throughout delivery:

Architecture Decision Records (ADRs): Structured documents capturing each significant architecture decision — the context, alternatives considered, decision made, and consequences. ADRs are stored in the same repository as the infrastructure code, versioned alongside it.

# ADR-007: GovCloud region selection for east-coast data residency

## Status: Accepted

## Context
The program requires data residency in the eastern US geographic area.
AWS GovCloud has two regions: us-gov-west-1 (Oregon) and us-gov-east-1 (Virginia).

## Decision
Use us-gov-east-1 (Virginia) as primary region.
Use us-gov-west-1 (Oregon) as disaster recovery region.

## Consequences
- East-1 has a smaller service subset than West-1 — some services must use alternatives
- Lower latency for east-coast operator access
- Cross-region replication cost for DR scenario
- Emergency fallback procedures must account for service differences between east-1 and west-1
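
Because ADRs live in the repository next to the infrastructure code, their structure can be checked automatically. The following is a minimal sketch, not Rutagon's actual tooling: it assumes ADRs are markdown files that use the `##` headings shown in the example above, and the function name `missing_adr_sections` and the list of required sections are illustrative choices.

```python
import re

# Assumed required sections, taken from the headings in the example ADR.
REQUIRED_SECTIONS = ("Status", "Context", "Decision", "Consequences")


def missing_adr_sections(adr_text, required=REQUIRED_SECTIONS):
    """Return the required section names that have no '## <name>' heading."""
    # Match level-2 headings such as '## Context' or '## Status: Accepted'
    # and keep only the text before any colon.
    found = {
        match.group(1).strip().lower()
        for match in re.finditer(r"^##[ \t]+([^:\n]+)", adr_text, flags=re.MULTILINE)
    }
    return [name for name in required if name.lower() not in found]
```

A check like this could run in CI on every change to the ADR directory. A program that also wants a dedicated "Alternatives considered" heading can pass a longer `required` tuple.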

Runbook library: Operational procedures for all routine and emergency operations — deployment procedures, scaling events, security incident response steps, ConMon reporting procedures, ATO evidence collection. Runbooks are living documents, updated when procedures change.

Infrastructure inventory: Auto-generated from Terraform state — all deployed resources, their configuration, and their dependencies. Updated on every deployment.
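
One way to generate such an inventory is to flatten the JSON that `terraform show -json` prints. The sketch below assumes that output has already been saved to a file. The helper names (`iter_resources`, `build_inventory`, `load_inventory`) and the choice of columns are illustrative, not a description of Rutagon's pipeline.

```python
import json


def iter_resources(module):
    """Yield every resource in a module and, recursively, its child modules."""
    yield from module.get("resources", [])
    for child in module.get("child_modules", []):
        yield from iter_resources(child)


def build_inventory(state):
    """Flatten parsed `terraform show -json` output into inventory rows."""
    root = state.get("values", {}).get("root_module", {})
    rows = []
    for resource in iter_resources(root):
        if resource.get("mode") != "managed":
            continue  # skip data sources; they are lookups, not deployed resources
        rows.append({
            "address": resource["address"],
            "type": resource["type"],
            "depends_on": sorted(resource.get("depends_on", [])),
        })
    return sorted(rows, key=lambda row: row["address"])


def load_inventory(path):
    """Read a saved `terraform show -json` file and build the inventory."""
    with open(path, encoding="utf-8") as handle:
        return build_inventory(json.load(handle))
```

Running `terraform show -json > state.json` as a post-deployment step and passing the file to `load_inventory` would keep the inventory in step with each deployment. Full resource configuration is available under each resource's `values` key if more detail is needed.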

Tribal knowledge capture: At quarterly intervals, Rutagon engineers conduct an "undocumented knowledge audit" — team members verbally walk through the system, specifically calling out things they know that aren't in the documentation. These get written into ADRs or runbook addendums.

The Knowledge Transfer Process

When a program transition is planned, Rutagon executes a structured knowledge transfer:

Phase 1 (60 days before transition): Documentation completeness audit. Identify gaps between what's documented and what should be documented. Prioritize by operational risk — Level 3 knowledge that would cause a security incident if lost is highest priority.

Phase 2 (30 days before transition): Shadow operations. Incoming team members or replacement sub personnel shadow Rutagon engineers on all routine operations. They perform the operations themselves, rather than only observing, with Rutagon engineers present for validation.

Phase 3 (transition week): Knowledge transfer sessions. Rutagon engineers present each system component in detail to the incoming team — architecture, operational considerations, known issues, monitoring interpretation, escalation paths. Session recordings retained in the program artifact repository.

Phase 4 (30 days post-transition): Q&A support. Rutagon remains available for questions during the 30 days after handoff. This is standard in Rutagon teaming agreements as a transition support provision.

What Primes Should Build Into Sub Management Plans

Primes should include knowledge transfer provisions in sub management plans from the start of the program, not at transition:

Ongoing documentation requirement: Documentation is a deliverable on a defined cadence (monthly or quarterly), not an optional activity. ADRs are submitted to the prime's document management system as they're created.

Documentation audit at option year renewals: If the prime re-competes sub work at each option year, a documentation audit at option year exercise establishes a baseline that lets the prime evaluate whether knowledge is sufficiently captured to manage a potential transition without program disruption.

Transition support period in teaming agreement: A defined 30-60 day transition support period after sub offboarding, during which the exiting sub provides Q&A support, is standard in well-structured teaming agreements. Failing to include this leaves the prime exposed if a transition becomes contentious.

Explore teaming with Rutagon → rutagon.com/contact

Frequently Asked Questions

How long should a cloud infrastructure knowledge transfer take on a typical defense program?

A small-to-medium cloud program (2-5 engineers, 12-24 months of infrastructure development) typically requires 4-8 weeks for a thorough knowledge transfer. This includes the shadow operations phase and knowledge transfer sessions. Compressing knowledge transfer into 1-2 weeks creates unacceptable continuity risk. Primes should build at least 4-8 weeks of transition time into re-competition schedules.

Who owns documentation on a prime-led program — the prime or the sub?

Work product ownership is defined in the teaming agreement and ultimately governed by the prime's contract data requirements list (CDRL). Typically, the government receives data rights to all documentation produced during the program. As a practical matter, Rutagon produces documentation in the prime's document management system (SharePoint, Confluence, or equivalent) so it remains accessible to the prime regardless of sub transition.

What happens to cloud credentials and access when a sub transitions off?

Access revocation should be immediate and complete. The transition plan includes an access inventory (all credentials, IAM roles, API keys, repository access, network access) and a revocation checklist executed on the transition date. For OIDC-federated access (the Rutagon standard), role trust policies are updated to remove sub personnel; no credentials need to be rotated. For legacy long-lived credentials, rotation is immediate. The prime's security team should verify revocation through an access review within 30 days of transition.
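
For the OIDC-federated case, the trust policy update can be sketched as a pure function over the policy document. This is an illustration under stated assumptions: personnel are identified by exact-match `StringEquals` conditions on a claim key ending in `:sub`, and the function name `remove_subjects` is hypothetical. Conditions that use other operators, such as `StringLike` with wildcards, are left untouched and need manual review.

```python
import copy


def remove_subjects(trust_policy, departing_subjects):
    """Return a copy of an IAM trust policy with the given OIDC subjects removed."""
    departing = set(departing_subjects)
    updated = copy.deepcopy(trust_policy)  # never mutate the caller's policy
    kept_statements = []
    for statement in updated.get("Statement", []):
        keep = True
        # Assumption: subjects are listed under exact-match StringEquals conditions.
        conditions = statement.get("Condition", {}).get("StringEquals", {})
        for key, value in conditions.items():
            if not key.endswith(":sub"):
                continue
            subjects = [value] if isinstance(value, str) else list(value)
            remaining = [s for s in subjects if s not in departing]
            if remaining:
                conditions[key] = remaining
            else:
                # Deleting only the condition would widen access to every
                # identity from the provider, so drop the whole statement.
                keep = False
        if keep:
            kept_statements.append(statement)
    updated["Statement"] = kept_statements
    return updated
```

The design choice that matters here is the empty-list case: when every subject in a statement is departing, the statement is removed rather than left with no subject condition. If no statements remain, the role itself is a candidate for deletion rather than a policy update. Applying the result to the role and confirming it in the follow-up access review are separate steps.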

How do you document undiscovered or informal system dependencies?

Dependency discovery is part of the pre-transition documentation audit. The methods: take the production environment completely offline in a test scenario and see what breaks; review all outbound network connections from infrastructure components; review CloudWatch logs for external call patterns; and interview all engineers on the team about dependencies they've encountered. The audit surfaces undocumented dependencies that were never formally recorded.
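
The outbound-connection review reduces to a comparison between observed destinations and the documented dependency list. The sketch below assumes connection records have already been extracted (for example from flow logs or application logs) into `(component, destination_host)` pairs. That record shape and the function name `undocumented_destinations` are assumptions made for illustration.

```python
from collections import defaultdict


def undocumented_destinations(connections, documented):
    """Map each undocumented destination host to the components that contacted it."""
    known = {host.lower() for host in documented}
    findings = defaultdict(set)
    for component, destination in connections:
        host = destination.lower()  # hostnames are case-insensitive
        if host not in known:
            findings[host].add(component)
    # Sort for stable, reviewable output.
    return {host: sorted(components) for host, components in sorted(findings.items())}
```

Each entry in the result is a candidate for an ADR or runbook addendum, or a connection that should be investigated and removed.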

What if documentation is inadequate when a surprise transition is needed?

Surprise transitions (contract termination, dispute, emergency replacement) are higher-risk knowledge transfer scenarios. In these cases, Rutagon's infrastructure-as-code practices provide a baseline: the Terraform state and code in the repository capture most of the deployed configuration even without explicit documentation. The gap is Level 2 and Level 3 knowledge. For forced transitions, intensive knowledge transfer sessions (daily, over 1-2 weeks) are the fastest path to acceptable continuity risk.
