Skip to content

Control Plane SLA, Release Channels, & Versioning Policy

Control Plane SLA

Regional Cluster (HA)

Commitment: 99.95% uptime SLA
           = ~22 minutes downtime/year

How achieved:

  • Control plane replicated across 3 zones
  • Automatic failover
  • Load balanced

Zonal Cluster (Standard)

Commitment: 99.5% uptime SLA
           = ~3.6 hours downtime/year

Why lower:

  • Control plane single zone
  • Maintenance windows cause downtime
  • More frequent updates

Release Channels

GKE provides upgrade cadence options:

Rapid Channel

Purpose: Get latest features fastest
Release cadence: Weekly updates
Support window: ~2 weeks per version
Best for: Development, testing, early adopters
Downside: Frequent upgrades, higher instability risk
Purpose: Balance features & stability
Release cadence: Monthly updates
Support window: ~3 months per version
Best for: Most production clusters
Tradeoff: Good for typical needs

Stable Channel

Purpose: Maximum stability, slow rollout
Release cadence: Quarterly updates
Support window: ~6 months per version
Best for: Mission-critical, risk-averse organizations
Downside: Longer lag for fixes/features

Extended Channel (GKE Enterprise)

Purpose: Maximum support window
Support window: 12 months per version
Cadence: Quarterly releases
Best for: Large enterprises needing long support

GKE Versioning Policy

Version Format

Major.Minor.Patch
1.29.5
├─ 1: API major version (rarely changes)
├─ 29: Release version (monthly-ish)
└─ 5: Patch version (security/bug fixes)

n-2 Support Policy

At any time, GKE supports:
├─ Current version (e.g., 1.29)
├─ Previous version (e.g., 1.28)
├─ Previous-previous version (e.g., 1.27)
└─ Older: Not supported

Implication:
├─ Create cluster at version 1.29 → OK
├─ Create cluster at version 1.27 → Might not be available
└─ Cluster at 1.26 → Must upgrade

Auto-Upgrade Windows

GKE can auto-upgrade clusters:

Release: 1.29.1 released
Window: 30-60 days
        (within release channel cadence)

Notification: Email + dashboard

Maintenance window: User can schedule

Auto-upgrade happens

Version Skew Policy

Control Plane ↔ Kubelet Compatibility

Control Plane: 1.29
Kubelets: Can be 1.29, 1.28, 1.27

NEVER: Kubelet newer than control plane
       (kubelet 1.30 with control plane 1.29)

Why This Matters

If control plane runs 1.29:
├─ Might remove v1beta1 API
├─ Kubelet 1.30 expects v1beta1
├─ Communication breaks
└─ Cluster problems

Solution: Always upgrade control plane FIRST
          Then upgrade nodes

GKE Management

GKE manages this automatically:
├─ Control plane upgraded
├─ Waits for nodes to upgrade
├─ Enforces compatibility

Upgrade Process

Standard Cluster Upgrade

Timeline:
T=0: Upgrade initiated
  ├─ Control plane upgraded (5-15 min downtime)
  └─ API requests briefly interrupted

T=15min: Control plane ready
  ├─ Nodes begin upgrade (rolling)
  ├─ Each node: drain pods → upgrade → rejoin
  └─ Takes hours for large clusters

T=hours: Cluster upgraded

Autopilot Cluster Upgrade

Timeline:
T=0: Upgrade initiated
  ├─ Control plane upgraded (zero-downtime)
  ├─ Nodes upgraded automatically
  └─ Workloads continue running (mostly)

T=minutes: Cluster upgraded

Breaking Changes Between Versions

Example: 1.29 → 1.30 Breaking Changes

yaml
# Removed in 1.30:
apiVersion: extensions/v1beta1  # GONE
kind: Ingress

# Use instead:
apiVersion: networking.k8s.io/v1
kind: Ingress

Deprecation Timeline

Version N.M:
  ├─ Feature deprecated (warning in logs)
  
Version N.(M+1):
  ├─ Feature still works (loud warnings)
  
Version N.(M+2):
  └─ Feature REMOVED (fails immediately)

Migration Between Release Channels

bash
# Standard → Rapid
gcloud container clusters update my-cluster \
  --release-channel=rapid

# Takes effect at next upgrade window

# Cannot downgrade channels directly
# Must upgrade first, then downgrade

GKE Versioning & Support

bash
# Check available versions
gcloud container get-server-config --region=us-central1

# Check cluster current version
gcloud container clusters describe my-cluster \
  --format="value(currentMasterVersion)"

# Check upgrade status
gcloud container operations list --cluster=my-cluster

Production Patterns

Pattern: Staged Upgrades

Cluster Fleet:
├─ Dev cluster (Rapid channel) → 1.30 latest
├─ Staging cluster (Regular channel) → 1.29 stable
└─ Prod cluster (Stable channel) → 1.28 proven

Upgrade flow:
Rapid → Regular → Stable (over 3-6 months)

Pattern: Custom Maintenance Windows

bash
# Set maintenance window
gcloud container clusters update my-cluster \
  --maintenance-window-start=2024-01-01T02:00:00Z \
  --maintenance-window-end=2024-01-01T06:00:00Z \
  --maintenance-window-recurrence="FREQ=WEEKLY;BYDAY=SU"

# Upgrades only during this window
# Enables planning around business hours

Reference Documentation


Summary

  • SLA: 99.95% regional, 99.5% zonal
  • Release channels: Rapid (weekly), Regular (monthly), Stable (quarterly)
  • n-2 support: Current + 2 previous versions supported
  • Version skew: Kubelet ≤ control plane
  • Breaking changes: Deprecation timeline spans 2-3 versions
  • Upgrade flow: Control plane → nodes
  • Auto-upgrade: Automatic within maintenance window
  • Production pattern: Multi-channel strategy (Rapid → Regular → Stable)