# Helm Charts for Production Kubernetes Deployments
Production Helm charts for Kubernetes are where many teams struggle — not because Helm is complicated, but because the gap between a working chart and a production-grade chart is filled with edge cases, security requirements, and operational patterns that only emerge under real workloads. A chart that deploys successfully is table stakes. A chart that rolls back cleanly, manages secrets safely, passes security scanning, and templates correctly across environments — that's production-grade.
We deploy to Kubernetes in regulated environments where a bad rollout isn't just an inconvenience — it's a compliance incident. These Helm patterns come from that production experience.
## Chart Structure That Scales
The standard Helm chart structure works for simple applications. Production charts need more:
```text
my-service/
├── Chart.yaml
├── Chart.lock
├── values.yaml
├── values-staging.yaml
├── values-production.yaml
├── templates/
│   ├── _helpers.tpl
│   ├── deployment.yaml
│   ├── service.yaml
│   ├── ingress.yaml
│   ├── hpa.yaml
│   ├── pdb.yaml
│   ├── networkpolicy.yaml
│   ├── serviceaccount.yaml
│   ├── configmap.yaml
│   ├── secret.yaml
│   ├── servicemonitor.yaml
│   └── tests/
│       ├── test-connection.yaml
│       └── test-health.yaml
├── ci/
│   ├── staging-values.yaml
│   └── production-values.yaml
└── README.md
```

The `_helpers.tpl` file is where reusable template functions live. Invest time here — good helpers eliminate duplication and enforce consistency:
```yaml
{{/* templates/_helpers.tpl */}}
{{- define "app.fullname" -}}
{{- printf "%s-%s" .Release.Name .Chart.Name | trunc 63 | trimSuffix "-" -}}
{{- end -}}

{{- define "app.labels" -}}
helm.sh/chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}
app.kubernetes.io/name: {{ .Chart.Name }}
app.kubernetes.io/instance: {{ .Release.Name }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
app.kubernetes.io/component: {{ .Values.component | default "backend" }}
{{- end -}}

{{- define "app.selectorLabels" -}}
app.kubernetes.io/name: {{ .Chart.Name }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end -}}

{{- define "app.securityContext" -}}
runAsNonRoot: true
runAsUser: {{ .Values.securityContext.runAsUser | default 1000 }}
runAsGroup: {{ .Values.securityContext.runAsGroup | default 1000 }}
fsGroup: {{ .Values.securityContext.fsGroup | default 1000 }}
seccompProfile:
  type: RuntimeDefault
{{- end -}}
```

## Values Templating for Multiple Environments
The values.yaml file defines defaults. Environment-specific files override them. The key principle: defaults should be safe for production. Development overrides should loosen constraints, not the other way around.
```yaml
# values.yaml — production-safe defaults
replicaCount: 3

image:
  repository: registry.internal/my-service
  tag: ""  # Set by CI/CD pipeline
  pullPolicy: IfNotPresent

resources:
  requests:
    cpu: 250m
    memory: 256Mi
  limits:
    cpu: 1000m
    memory: 512Mi

autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
  targetMemoryUtilizationPercentage: 80

podDisruptionBudget:
  enabled: true
  minAvailable: 2

networkPolicy:
  enabled: true
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              purpose: ingress
      ports:
        - protocol: TCP
          port: 8080

securityContext:
  runAsUser: 1000
  runAsGroup: 1000
  fsGroup: 1000

serviceAccount:
  create: true
  annotations: {}

probes:
  liveness:
    path: /health/live
    initialDelaySeconds: 10
    periodSeconds: 15
    failureThreshold: 3
  readiness:
    path: /health/ready
    initialDelaySeconds: 5
    periodSeconds: 10
    failureThreshold: 3
  startup:
    path: /health/live
    initialDelaySeconds: 0
    periodSeconds: 5
    failureThreshold: 30
```

```yaml
# values-staging.yaml — overrides for staging
replicaCount: 1

autoscaling:
  enabled: false

podDisruptionBudget:
  enabled: false

resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 256Mi
```

Notice that staging overrides reduce resources and disable HA features. Production defaults are the baseline. This prevents the common mistake of building charts that work in staging but fail in production because someone forgot to configure replicas, PDBs, or resource limits.
## The Deployment Template
The deployment template is the heart of the chart. A production deployment template handles probes, security contexts, resource management, and graceful shutdown:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "app.fullname" . }}
  labels:
    {{- include "app.labels" . | nindent 4 }}
spec:
  {{- if not .Values.autoscaling.enabled }}
  replicas: {{ .Values.replicaCount }}
  {{- end }}
  selector:
    matchLabels:
      {{- include "app.selectorLabels" . | nindent 6 }}
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0
      maxSurge: 1
  template:
    metadata:
      labels:
        {{- include "app.selectorLabels" . | nindent 8 }}
      annotations:
        checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}
    spec:
      serviceAccountName: {{ include "app.fullname" . }}
      securityContext:
        {{- include "app.securityContext" . | nindent 8 }}
      terminationGracePeriodSeconds: 60
      containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          ports:
            - name: http
              containerPort: 8080
              protocol: TCP
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop:
                - ALL
          livenessProbe:
            httpGet:
              path: {{ .Values.probes.liveness.path }}
              port: http
            initialDelaySeconds: {{ .Values.probes.liveness.initialDelaySeconds }}
            periodSeconds: {{ .Values.probes.liveness.periodSeconds }}
            failureThreshold: {{ .Values.probes.liveness.failureThreshold }}
          readinessProbe:
            httpGet:
              path: {{ .Values.probes.readiness.path }}
              port: http
            initialDelaySeconds: {{ .Values.probes.readiness.initialDelaySeconds }}
            periodSeconds: {{ .Values.probes.readiness.periodSeconds }}
            failureThreshold: {{ .Values.probes.readiness.failureThreshold }}
          startupProbe:
            httpGet:
              path: {{ .Values.probes.startup.path }}
              port: http
            initialDelaySeconds: {{ .Values.probes.startup.initialDelaySeconds }}
            periodSeconds: {{ .Values.probes.startup.periodSeconds }}
            failureThreshold: {{ .Values.probes.startup.failureThreshold }}
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
          volumeMounts:
            - name: tmp
              mountPath: /tmp
      volumes:
        - name: tmp
          emptyDir: {}
      {{- with .Values.nodeSelector }}
      nodeSelector:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.tolerations }}
      tolerations:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              {{- include "app.selectorLabels" . | nindent 14 }}
```

Key production details: `maxUnavailable: 0` ensures zero-downtime deploys. The configmap checksum annotation triggers a rollout when configuration changes. `readOnlyRootFilesystem: true` with a writable `/tmp` volume prevents filesystem-based attacks. Topology spread constraints distribute pods across availability zones.
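The checksum mechanism is worth internalizing: the annotation hashes the rendered ConfigMap, so any config change alters the pod template and forces a rolling update. A minimal sketch with plain `sha256sum` (the `LOG_LEVEL` content is illustrative, not from the chart):

```shell
# Hash two versions of the same ConfigMap content, as the
# checksum/config annotation does with the rendered template
old=$(printf 'LOG_LEVEL: info\n' | sha256sum | cut -d ' ' -f 1)
new=$(printf 'LOG_LEVEL: debug\n' | sha256sum | cut -d ' ' -f 1)

# Different content yields a different hash, so the annotation value
# changes, the pod template changes, and a RollingUpdate begins
if [ "$old" != "$new" ]; then
  echo "config change detected"
fi
```

Identical content hashes identically, so unchanged configs never cause spurious rollouts.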
## Rollback Strategies
Helm maintains release history. When a deployment goes wrong, rollback is straightforward:
```shell
# View release history
helm history my-service -n production

# Rollback to previous release (revision 0 means "previous")
helm rollback my-service 0 -n production

# Rollback to specific revision
helm rollback my-service 5 -n production --wait --timeout 5m
```

But rollback is the recovery mechanism. Prevention is better. Implement deployment gates:
### Pre-Upgrade Hooks
Run validation before the upgrade proceeds:
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: {{ include "app.fullname" . }}-pre-upgrade
  annotations:
    helm.sh/hook: pre-upgrade
    helm.sh/hook-weight: "-5"
    helm.sh/hook-delete-policy: before-hook-creation
spec:
  template:
    spec:
      containers:
        - name: db-migration-check
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          command: ["./migrate", "--dry-run"]
      restartPolicy: Never
  backoffLimit: 0
```

If the migration dry-run fails, the upgrade is aborted before any pods are replaced.
### Post-Upgrade Validation
After deployment, validate that the new version is healthy before considering the release successful:
```shell
helm upgrade my-service ./my-service \
  -n production \
  -f values-production.yaml \
  --set image.tag=$IMAGE_TAG \
  --wait \
  --timeout 10m \
  --atomic
```

The `--atomic` flag automatically rolls back if the deployment doesn't become healthy within the timeout. This is essential for CI/CD pipelines where human intervention may not be immediate.
## Secrets Management
Secrets in Helm charts are the most common security mistake. Putting secrets in values.yaml, committing them to Git, or base64-encoding them in templates (which is encoding, not encryption) are all violations.
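Two shell commands make the encoding-vs-encryption point concrete; anyone who can read the manifest can recover the value (the password below is a made-up example):

```shell
# base64 "protects" nothing: it is reversible by design
encoded=$(printf 'db-password-123' | base64)
decoded=$(printf '%s' "$encoded" | base64 -d)
echo "encoded: $encoded"
echo "decoded: $decoded"
```

A Kubernetes Secret's base64 data field exists for binary-safety, not confidentiality; treat it as plaintext in every threat model.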
### External Secrets Operator
The cleanest pattern: secrets live in AWS Secrets Manager or SSM Parameter Store. The External Secrets Operator syncs them into Kubernetes secrets:
```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: {{ include "app.fullname" . }}
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: ClusterSecretStore
  target:
    name: {{ include "app.fullname" . }}-secrets
    creationPolicy: Owner
  data:
    - secretKey: DATABASE_URL
      remoteRef:
        key: /production/my-service/database-url
    - secretKey: API_KEY
      remoteRef:
        key: /production/my-service/api-key
```

Secrets never appear in Helm values, Git repositories, or CI/CD logs. They're fetched at runtime from the secrets manager by the operator running in-cluster.
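The `aws-secrets-manager` store referenced above is defined once per cluster, outside the application chart. A sketch of what that ClusterSecretStore might look like; the region and service-account details are assumptions for illustration, not part of the chart:

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: aws-secrets-manager
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-east-1  # assumed region
      auth:
        jwt:
          serviceAccountRef:
            name: external-secrets       # assumed IRSA-enabled SA
            namespace: external-secrets
```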
### Sealed Secrets as an Alternative
For environments without access to external secrets managers, Sealed Secrets encrypt secrets with a cluster-specific key. The encrypted form is safe to commit to Git:
```shell
# Encrypt a secret for the cluster
kubeseal --format yaml < secret.yaml > sealed-secret.yaml

# The sealed secret can be committed to Git safely
# Only the target cluster can decrypt it
```

Proper secrets management is a core part of container security in production CI/CD — if secrets leak through your Helm charts, all your other security controls are undermined.
## Chart Testing
Untested Helm charts are a deployment risk. Test at multiple levels:
### Template Testing with helm-unittest
Validate that templates render correctly for all value combinations:
```yaml
# tests/deployment_test.yaml
suite: deployment tests
templates:
  - deployment.yaml
tests:
  - it: should set correct replica count
    set:
      replicaCount: 5
      autoscaling.enabled: false
    asserts:
      - equal:
          path: spec.replicas
          value: 5
  - it: should not set replicas when autoscaling is enabled
    set:
      autoscaling.enabled: true
    asserts:
      - isNull:
          path: spec.replicas
  - it: should enforce security context
    asserts:
      - equal:
          path: spec.template.spec.containers[0].securityContext.allowPrivilegeEscalation
          value: false
      - equal:
          path: spec.template.spec.containers[0].securityContext.readOnlyRootFilesystem
          value: true
  - it: should set resource limits
    asserts:
      - isNotNull:
          path: spec.template.spec.containers[0].resources.limits
      - isNotNull:
          path: spec.template.spec.containers[0].resources.requests
```

### Linting and Schema Validation
```shell
# Lint the chart
helm lint ./my-service -f values-production.yaml

# Render templates and validate against Kubernetes schemas
helm template my-service ./my-service -f values-production.yaml | \
  kubeval --strict --kubernetes-version 1.28.0

# Security scanning with Checkov (render to a file first;
# Checkov reads manifests via -f rather than stdin)
helm template my-service ./my-service -f values-production.yaml > rendered.yaml
checkov -f rendered.yaml --framework kubernetes
```

### Integration Testing with ct
Chart Testing (ct) validates charts in a real cluster:
```yaml
# ct.yaml
target-branch: main
chart-dirs:
  - charts
helm-extra-args: --timeout 120s
check-version-increment: true
validate-maintainers: false
```

```shell
ct lint-and-install --config ct.yaml --charts ./my-service
```

This spins up the chart in a test cluster, runs the test hooks, verifies health, and tears down — giving you confidence that the chart actually works, not just that it renders valid YAML.
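The test hooks that ct exercises live in the `templates/tests/` directory shown in the chart layout. A minimal sketch of what `test-connection.yaml` could contain, assuming a Service named after `app.fullname` exposing port 8080 and the readiness path from the values file:

```yaml
# templates/tests/test-connection.yaml — runs on `helm test`
apiVersion: v1
kind: Pod
metadata:
  name: {{ include "app.fullname" . }}-test-connection
  annotations:
    helm.sh/hook: test
spec:
  containers:
    - name: probe
      image: busybox:1.36
      command: ['wget']
      args: ['-qO-', '{{ include "app.fullname" . }}:8080/health/ready']
  restartPolicy: Never
```

The pod succeeds only if the readiness endpoint is reachable through the Service, which catches broken selectors and port mismatches that template tests cannot.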
Our Kubernetes containerization capabilities include comprehensive Helm chart development with testing, security scanning, and multi-environment deployment patterns built in from the start.
## Frequently Asked Questions
### Should I use Helm or Kustomize for production Kubernetes?
Both are viable. Helm excels when you need templating across multiple environments with significantly different configurations, package management with versioned releases, and rollback capabilities. Kustomize excels for simpler overlay-based customization without the complexity of Go templating. Many teams use both: Helm for application charts and Kustomize for environment-specific overlays on top of rendered Helm output.
### How do you handle Helm chart versioning?
Follow semantic versioning. Bump the chart version on every change. Bump appVersion when the application version changes. Use Chart.lock for dependency pinning. In CI/CD, automate version bumping based on commit type: patch for fixes, minor for features, major for breaking changes to values schema.
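Automated version bumping need not involve heavy tooling. A sketch of a CI patch-bump step, assuming a conventional single `version:` line in `Chart.yaml` (the sample file here is a stand-in; real pipelines often use chart-releaser instead):

```shell
# Create a sample Chart.yaml to operate on (stand-in for the real chart)
cat > Chart.yaml <<'EOF'
apiVersion: v2
name: my-service
version: 1.4.2
appVersion: "2.3.1"
EOF

# Read the current chart version and bump the patch component
current=$(awk '/^version:/ {print $2}' Chart.yaml)
next=$(echo "$current" | awk -F. '{printf "%d.%d.%d", $1, $2, $3 + 1}')

# Write it back (GNU sed; macOS needs `sed -i ''`)
sed -i "s/^version: .*/version: $next/" Chart.yaml
grep '^version:' Chart.yaml
```

A minor or major bump changes the second or first component instead; the commit type decides which component the pipeline increments.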
### What's the maximum number of Helm release revisions to keep?
Set --history-max to 10-20 revisions. Each revision stores the complete release manifest, which consumes etcd storage. Keeping too many revisions bloats the cluster. Keeping too few limits your rollback options. Ten revisions gives you a reasonable rollback window without storage concerns.
### How do you manage Helm charts across multiple clusters?
Use a chart repository (ChartMuseum, OCI registry, or S3) to publish versioned charts. Each cluster's deployment configuration references the chart version and provides environment-specific values. GitOps tools like ArgoCD or Flux pull charts from the repository and apply environment values automatically.
### How do you debug a Helm template that isn't rendering correctly?
Use helm template with --debug to see the rendered output. Add --show-only templates/deployment.yaml to isolate a specific template. For complex logic, temporarily add {{- fail (printf "Debug: %v" .Values.someValue) -}} to inspect values at specific points in the template. Remove debug statements before committing.
---
Production Kubernetes deployments deserve production-grade Helm charts. Contact Rutagon about building deployment infrastructure that's tested, secure, and reliable across environments.