Alias IP Ranges & GKE Pods — The Secret to VPC-native Networking

Executive Summary

Alias IP ranges = GCP's answer to container networking challenges. Unlike Kubernetes on AWS (which requires flannel/weave overlays or VPC CNI complexity), GCP natively embeds pods into VPC via alias IPs:

✅ Pods get actual VPC IP addresses (10.2.x.x, not 172.17.x.x overlay)
✅ Direct pod-to-pod routing (no encapsulation, no MTU issues)
✅ Firewall rules apply to pods directly (no service mesh overhead)
✅ Performance: native routing latency (no per-packet encapsulation as with an overlay)
❌ Complexity: requires secondary CIDR ranges, anti-spoofing enforcement

What is an Alias IP Range?

Alias IP = an additional internal IP address (or range) assigned to the same network interface.

Without alias IPs:

VM "app-1":
  Primary interface (eth0):
    Primary IP: 10.0.1.5 (VM itself)
    → Can run single application or container with NAT

With alias IPs:

VM "app-1" (same VM, same NIC):
  Primary interface (eth0):
    Primary IP: 10.0.1.5 (VM host)
    Alias IP 1: 10.2.1.0/32 (Pod A)
    Alias IP 2: 10.2.1.1/32 (Pod B)
    Alias IP 3: 10.2.1.2/32 (Pod C)
    → Each container gets routable VPC IP!

GCP automatically installs routes for each alias:

VM "app-1" (10.0.1.5) with aliases:

VPC Route 1: 10.2.1.0/32 → 10.0.1.5 (eth0)
VPC Route 2: 10.2.1.1/32 → 10.0.1.5 (eth0)
VPC Route 3: 10.2.1.2/32 → 10.0.1.5 (eth0)

Packets destined to 10.2.1.0 → delivered to VM (Pod A; Linux routing handles local delivery)
Packets destined to 10.2.1.1 → delivered to VM (Pod B)
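
The route lookup described above can be sketched as a longest-prefix match. This is only an illustrative model, not how Andromeda is implemented; the table contents, the second VM "app-2" with its 10.2.2.0/24 range, and the function name lookup_vm are assumptions made for the example.

```python
import ipaddress

# Hypothetical route table: alias IP or alias range -> owning VM.
# "app-2" and its /24 are invented to show that ranges work the same way.
ALIAS_ROUTES = {
    ipaddress.ip_network("10.0.1.5/32"): "app-1",   # primary IP
    ipaddress.ip_network("10.2.1.1/32"): "app-1",   # Pod B
    ipaddress.ip_network("10.2.1.2/32"): "app-1",   # Pod C
    ipaddress.ip_network("10.2.2.0/24"): "app-2",   # whole alias range
}


def lookup_vm(destination):
    """Return the VM owning the most specific matching route, or None."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in ALIAS_ROUTES if addr in net]
    if not matches:
        return None
    # Longest-prefix match: the most specific network wins.
    best = max(matches, key=lambda net: net.prefixlen)
    return ALIAS_ROUTES[best]


if __name__ == "__main__":
    for dest in ("10.2.1.1", "10.2.2.77", "10.9.9.9"):
        print(dest, "->", lookup_vm(dest))
```

Once the packet reaches the owning VM, delivery to the right container is a local matter for the guest kernel.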

GKE Pod CIDR Architecture

VPC-native Cluster Setup

Create subnet with secondary range:

gcloud compute networks subnets create app-subnet \
  --network=prod \
  --region=us-west1 \
  --range=10.0.0.0/20 \
  --secondary-range pods=10.2.0.0/16,services=10.3.0.0/16

Create GKE cluster:

gcloud container clusters create my-cluster \
  --network=prod \
  --subnetwork=app-subnet \
  --enable-ip-alias \
  --cluster-secondary-range-name pods \
  --services-secondary-range-name services \
  --region=us-west1

What GCP does automatically:
  1. Node pool VMs get primary IPs (10.0.x.x)
  2. Each node gets an alias range for its pods (by default a /24 carved from 10.2.0.0/16)
  3. Each service gets IP (10.3.x.x - internal, not routable externally)
  4. VPC routes automatically created for pod ranges
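
Before creating the subnet, it can help to confirm that the primary and secondary ranges do not overlap and to see how many addresses each one holds. The following is a minimal sketch using Python's standard ipaddress module; the helper name find_overlaps is an assumption, and the CIDRs are the ones from the commands above.

```python
import ipaddress

PRIMARY = ipaddress.ip_network("10.0.0.0/20")
SECONDARY = {
    "pods": ipaddress.ip_network("10.2.0.0/16"),
    "services": ipaddress.ip_network("10.3.0.0/16"),
}


def find_overlaps(primary, secondary):
    """Return (name_a, name_b) pairs, in sorted name order, whose ranges overlap."""
    named = {"primary": primary, **secondary}
    names = sorted(named)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if named[a].overlaps(named[b])
    ]


if __name__ == "__main__":
    for name, net in {"primary": PRIMARY, **SECONDARY}.items():
        print(f"{name}: {net} ({net.num_addresses} addresses)")
    print("overlaps:", find_overlaps(PRIMARY, SECONDARY))
```

An empty overlap list means the three ranges can coexist on one subnet.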

Pod Networking Architecture

GKE Cluster "my-cluster" in subnet "app-subnet"

Node 1 (10.0.1.10):
  ├── Pod A (10.2.1.10)
  ├── Pod B (10.2.1.11)
  └── Pod C (10.2.1.12)

Node 2 (10.0.1.11):
  ├── Pod D (10.2.2.10)
  ├── Pod E (10.2.2.11)
  └── Pod F (10.2.2.12)

Node 3 (10.0.1.12):
  ├── Pod G (10.2.3.10)
  ├── Pod H (10.2.3.11)
  └── Pod I (10.2.3.12)

Service A (10.3.1.1):
  ← LoadBalancer for Pods A, B, C

Service B (10.3.2.1):
  ← LoadBalancer for Pods D, E, F

Traffic Flow: Pod-to-Pod

Pod A (10.2.1.10) → Pod D (10.2.2.10)

Step 1: Pod A sends packet to 10.2.2.10
Step 2: Packet leaves Node 1 (10.0.1.10)
  (Linux routing: destination 10.2.2.10 via default gateway 10.0.0.1)
Step 3: Packet enters GCP SDN (Andromeda)
Step 4: Andromeda looks up route:
  "10.2.2.10/32 → Node 2 (10.0.1.11)"
Step 5: Packet routed to Node 2 primary IP (10.0.1.11)
Step 6: Node 2 kernel receives, recognizes 10.2.2.10 as local alias
Step 7: Packet delivered to Pod D container

Latency: ~100-200 microseconds (native routing)
vs.
Overlay network (flannel): higher, because every packet is encapsulated and decapsulated (the exact penalty depends on the backend and workload)

Critical: Primary vs Alias IP Addressing

Primary IP (VM Host)

Primary IP = VM itself

VM "node-1":
  IP: 10.0.1.10 (primary)
  └── VM kernel, kubelet, system processes
  
Firewall rule "allow-ssh":
  target: tag:kubernetes-node
  → Matches node-1's primary IP (10.0.1.10) ✓

Pod network communication:
  Pod → Pod routing = handled by Andromeda
  Pod → VM access = also handled by Andromeda

Alias IPs (Pods)

Alias IPs = Pods (containers)

Node "node-1" (10.0.1.10):
  Pods using alias IPs:
    Pod A: 10.2.1.10
    Pod B: 10.2.1.11
    Pod C: 10.2.1.12

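Firewall rule "allow-http-pods":
  target: tag:gke-node (network tags attach to node VMs, not to pods)
  destination range: 10.2.0.0/16
  → Matches traffic to pod IPs (10.2.x.x) on the tagged nodes ✓

Key difference:
  Pod IP ≠ VM primary IP
  Rules still target node VMs; IP ranges narrow them to pod traffic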

Firewall Rules with Alias IPs

A firewall rule that targets a VM applies to every IP on that VM, primary and alias alike:

VM "node-1" (Primary: 10.0.1.10) with pods (Alias: 10.2.1.10-12)

Firewall rule "allow-ingress-pods":
  Direction: INGRESS
  Target: tag:gke-node
  Allow: tcp:8080 from 0.0.0.0/0
  
Effect:
  ✓ Allows traffic to 10.0.1.10:8080 (node VM)
  ✓ Allows traffic to 10.2.1.10:8080 (Pod A)
  ✓ Allows traffic to 10.2.1.11:8080 (Pod B)
  ✓ Allows traffic to 10.2.1.12:8080 (Pod C)
  
Why? Because all IPs (primary + aliases) are evaluated against the rule
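
The evaluation above can be modeled in a few lines. This is a simplified sketch, not the real VPC firewall engine: it handles a single allow rule, ignores priorities and deny rules, and the class and function names (VM, IngressRule, ingress_allowed) are assumptions for illustration.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class VM:
    name: str
    tags: set
    primary_ip: str
    alias_ips: list = field(default_factory=list)

    def owns(self, ip):
        """True if ip is the primary IP or one of the alias IPs."""
        return ip == self.primary_ip or ip in self.alias_ips


@dataclass
class IngressRule:
    target_tag: str
    protocol: str
    port: int
    source_range: str


def ingress_allowed(rule, vm, src_ip, dst_ip, protocol, port):
    """True if the allow rule matches a packet addressed to any IP on the VM."""
    return (
        rule.target_tag in vm.tags
        and vm.owns(dst_ip)
        and protocol == rule.protocol
        and port == rule.port
        and ipaddress.ip_address(src_ip) in ipaddress.ip_network(rule.source_range)
    )


NODE_1 = VM("node-1", {"gke-node"}, "10.0.1.10",
            ["10.2.1.10", "10.2.1.11", "10.2.1.12"])
RULE = IngressRule("gke-node", "tcp", 8080, "0.0.0.0/0")

if __name__ == "__main__":
    for dst in ("10.0.1.10", "10.2.1.10", "10.2.1.12"):
        print(dst, ingress_allowed(RULE, NODE_1, "8.8.8.8", dst, "tcp", 8080))
```

The rule never mentions a pod, yet it admits traffic to every alias IP on the tagged node, which is why unscoped node rules can expose pods unintentionally.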

Best Practice: Separate Rules for VMs and Pods

Rule 1: SSH access
  Direction: INGRESS
  Target: tag:kubernetes-node
  Allow: tcp:22 from 10.0.0.0/8 (internal only)
  CIDR: 10.0.x.x (PRIMARY VM network)

Rule 2: Pod ingress
  Direction: INGRESS
  Target: tag:kubernetes-node (pods cannot carry network tags)
  Allow: tcp:8080,8443 from 0.0.0.0/0
  Destination CIDR: 10.2.0.0/16 (ALIAS pod network)

Rule 3: Pod-to-DB
  Direction: INGRESS
  Target: tag:database
  Allow: tcp:3306 from 10.2.0.0/16 (pods only)
  Node VMs (10.0.0.0/20) match no allow rule, so the implied deny-ingress rule blocks them

Anti-spoofing Checks: The Hidden Protection

GCP enforces source IP validation:

Security check on every packet:
  "Is this packet's source IP assigned to this VM?"

Example:
  Pod A (10.2.1.10) sends packet
  ✓ Source IP 10.2.1.10 = alias on node-1 ✓ ALLOWED
  
  Attacker spoofs:
  Packet with source 10.1.1.1 (an address not assigned to this VM)
  ✗ Source IP 10.1.1.1 ≠ any IP on node-1 ✗ DROPPED
  
  Another node spoofs:
  Packet from node-2 (10.0.1.11) with source 10.2.1.10
  ✗ Source 10.2.1.10 belongs to node-1, not node-2 ✗ DROPPED

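Implication: static routes alone cannot carry container traffic while this check is active (it is relaxed only for VMs created with IP forwarding enabled, --can-ip-forward):

❌ INCORRECT (works on AWS only after disabling the source/destination check):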
  Static route: 10.2.0.0/16 → next-hop-instance node-1
  Pod in node-2 sends: packet with source 10.2.1.5
  → Anti-spoofing rejects (pod IP doesn't match node-2)

✅ CORRECT (Alias IP approach):
  Each node has alias IPs for its pods
  Anti-spoofing checks: source IP = alias on sending node ✓
  Andromeda route lookup: 10.2.1.5 → node-1 ✓
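
The source check can be sketched as a lookup against the addresses assigned to the sending node. This is an illustrative model only; the ownership table (one /24 alias range per node) and the function name source_is_valid are assumptions for the example.

```python
import ipaddress

# Hypothetical ownership table: each node's primary IP plus its alias range.
NODE_ADDRESSES = {
    "node-1": ["10.0.1.10/32", "10.2.1.0/24"],
    "node-2": ["10.0.1.11/32", "10.2.2.0/24"],
}


def source_is_valid(sending_node, source_ip):
    """True if source_ip is assigned (as primary or alias) to sending_node."""
    addr = ipaddress.ip_address(source_ip)
    return any(
        addr in ipaddress.ip_network(cidr)
        for cidr in NODE_ADDRESSES.get(sending_node, [])
    )


if __name__ == "__main__":
    checks = [
        ("node-1", "10.2.1.10"),  # Pod A on its own node
        ("node-1", "10.1.1.1"),   # address not assigned to node-1
        ("node-2", "10.2.1.10"),  # node-2 claiming node-1's pod IP
    ]
    for node, src in checks:
        verdict = "ALLOWED" if source_is_valid(node, src) else "DROPPED"
        print(f"{node} sending as {src}: {verdict}")
```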

GKE to Non-GKE Communication

Scenario: GKE Pod → Cloud SQL

GKE Cluster "my-app" (pod CIDR: 10.2.0.0/16)
  ├── Pods: 10.2.1.x, 10.2.2.x, etc.
  └── CloudSQL IP: 10.10.0.5 (Private IP instance)

Pod A (10.2.1.10) connects to CloudSQL (10.10.0.5):

Step 1: Pod A sends to 10.10.0.5:3306
Step 2: Pod-to-CloudSQL routing via VPC
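Step 3: Access control on the Cloud SQL side
  Private IP instances are reached over private services access (VPC peering)
    Pod secondary ranges are subnet routes, so the peering exports them automatically
    ✓ Pod A's source IP 10.2.1.10 arrives unchanged
Step 4: Connection established ✓

Connectivity config:
  # Authorized networks apply only to the public IP path and do not accept
  # private ranges such as 10.2.0.0/16, so they are not used here.
  # PROJECT_ID is a placeholder for the project that owns the "prod" network.
  gcloud sql instances patch cloudsql-instance \
    --network=projects/PROJECT_ID/global/networks/prod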

Scenario: GKE Pod → On-premises Database (Interconnect)

Network layout:
  GKE pods: 10.2.0.0/16
  On-premises network: 192.168.0.0/16
  Interconnect: Google Cloud ↔ customer on-prem

Pod A (10.2.1.10) → On-prem DB (192.168.1.5):

Step 1: Pod A sends to 192.168.1.5
Step 2: VPC routing: 192.168.0.0/16 → Interconnect VLAN
Step 3: Packet exits GCP via Interconnect
Step 4: On-prem firewall checks
  Source: 10.2.1.10 (pod IP, routable)
  ✓ Allowed (assuming on-prem rules permit)
Step 5: Packet reaches on-prem DB

Anti-spoofing benefit:
  On-prem admin knows:
  "10.2.x.x = GCP pod network (verified by GCP anti-spoofing)"
  "Cannot spoof pod IPs from outside GCP"

Advanced: Multiple Alias Ranges Per Node

For advanced networking:

Node with multiple alias ranges:

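# The value is quoted because an unquoted ";" would end the shell command.
# Each named secondary range (pods, services, cache) must already exist on app-subnet.
gcloud compute instances create complex-node \
  --network-interface="subnet=app-subnet,aliases=pods:10.2.0.0/24;services:10.3.0.0/25;cache:10.4.0.0/26"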

Node "complex-node" (Primary: 10.0.1.20):
  Alias range 1: 10.2.0.0/24 (pods)
  Alias range 2: 10.3.0.0/25 (k8s services)
  Alias range 3: 10.4.0.0/26 (cache layer)

VPC automatically creates routes:
  10.2.0.0/24 → 10.0.1.20
  10.3.0.0/25 → 10.0.1.20
  10.4.0.0/26 → 10.0.1.20
  
In-guest config (Linux):
  ip route add to local 10.2.0.0/24 dev eth0 (node accepts traffic for the pod range)
  ip route add to local 10.3.0.0/25 dev eth0 (node accepts traffic for the services range)
  # etc.
  
Then the container orchestrator (kubelet, etc.) assigns IPs from these ranges

Design Pattern: Shared VPC + Alias IPs

For multi-team GKE:

Shared VPC setup:
  Host project: "gke-host"
    VPC: "shared-vpc"
    Subnet "gke-pods":
      Primary: 10.0.0.0/20 (nodes)
      Secondary pod ranges: one per cluster, carved from 10.2.0.0/16 (65,536 addresses)
      
Service projects:
  "team-a-project":
    GKE cluster: team-a-cluster
    Pod CIDR: 10.2.0.0/20 (4,096 pod IPs)
    Assigned nodes in shared subnet ✓
    
  "team-b-project":
    GKE cluster: team-b-cluster
    Pod CIDR: 10.2.16.0/20
    Assigned nodes in shared subnet ✓
    
  "team-c-project":
    GKE cluster: team-c-cluster
    Pod CIDR: 10.2.32.0/20
    Assigned nodes in shared subnet ✓

All clusters on same VPC:
  ✓ Service discovery across clusters
  ✓ Single firewall policy for all pods
  ✓ Cost: no charge for intra-VPC traffic within a zone (cross-zone traffic is billed)
  ✗ Blast radius: misconfigured rule affects all teams
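
Per-team pod ranges have to be valid, non-overlapping CIDR blocks. One way to generate them is to split a parent block into equal subnets, as in this minimal sketch; the parent block, the /20 size, and the function name carve_ranges are assumptions chosen for illustration.

```python
import ipaddress
from itertools import islice


def carve_ranges(parent, new_prefix, names):
    """Assign consecutive, non-overlapping subnets of parent to each name."""
    parent_net = ipaddress.ip_network(parent)
    available = 2 ** (new_prefix - parent_net.prefixlen)
    if len(names) > available:
        raise ValueError(f"{parent} only holds {available} /{new_prefix} blocks")
    blocks = islice(parent_net.subnets(new_prefix=new_prefix), len(names))
    return dict(zip(names, blocks))


if __name__ == "__main__":
    teams = ["team-a-cluster", "team-b-cluster", "team-c-cluster"]
    for team, block in carve_ranges("10.2.0.0/16", 20, teams).items():
        print(f"{team}: {block} ({block.num_addresses} pod IPs)")
```

Because ipaddress.ip_network rejects a CIDR with host bits set, running planned ranges through it is also a cheap way to catch typos before they reach a gcloud command.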

Troubleshooting Alias IP Issues

Issue 1: Pod Cannot Reach External IP

Symptom: Pod to external IP (e.g., 8.8.8.8) fails

Diagnosis:
  1. Check pod source IP: kubectl exec pod -- ip addr
     Shows: 10.2.1.10 ✓
  
  2. Check VPC routes: gcloud compute routes list
     Shows: 0.0.0.0/0 → default-internet-gateway ✓
  
  3. Check firewall rule: gcloud compute firewall-rules list
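     Shows: allow-egress to 0.0.0.0/0 from tag:gke-node ✓
  
  4. Real issue: Source IP 10.2.1.10 = private IP
     Internet destinations are reachable only from a public source IP (via NAT)
     Solution: Create Cloud NAT or use GCE proxy

  # Cloud NAT is configured on a Cloud Router, so create the router first
  gcloud compute routers create pod-nat-router \
    --network=prod \
    --region=us-west1

  gcloud compute routers nats create pod-nat \
    --router=pod-nat-router \
    --auto-allocate-nat-external-ips \
    --nat-all-subnet-ip-ranges \
    --region=us-west1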
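
The reasoning in step 4 amounts to a simple address classification. The sketch below assumes the sending node has no external IP of its own to masquerade behind; the function name needs_nat is hypothetical.

```python
import ipaddress


def needs_nat(source_ip, destination_ip):
    """True when a private source address talks to a non-private destination."""
    src = ipaddress.ip_address(source_ip)
    dst = ipaddress.ip_address(destination_ip)
    return src.is_private and not dst.is_private


if __name__ == "__main__":
    print(needs_nat("10.2.1.10", "8.8.8.8"))      # pod to the internet
    print(needs_nat("10.2.1.10", "10.10.0.5"))    # pod to a private VPC address
    print(needs_nat("10.2.1.10", "192.168.1.5"))  # pod to on-prem over Interconnect
```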

Issue 2: Pod-to-Pod Traffic Blocked

Symptom: Pod A (10.2.1.10) cannot ping Pod D (10.2.2.10)

Diagnosis:
  gcloud compute firewall-rules list --filter="network:prod" \
    --format="table(name,direction,sourceRanges.list(),targetTags.list())"
  
  Shows: No allow rule from 10.2.0.0/16 to 10.2.0.0/16
  
  Firewall rules:
    rule-1: allow ingress to tag:public from 0.0.0.0/0 (port 443)
    rule-2: deny-all (implicit)
  
  Fix: Add explicit allow rule
  
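  # Network tags live on node VMs, so target the node tag; the source range limits the rule to pod traffic
  gcloud compute firewall-rules create allow-pod-internal \
    --network=prod \
    --allow=tcp,udp,icmp \
    --source-ranges=10.2.0.0/16 \
    --target-tags=gke-node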

Issue 3: IP Address Exhaustion in Pod Subnet

Symptom: Cannot schedule new pods, no free IPs

Diagnosis:
  kubectl describe nodes
  Shows: a PodCIDR (a /24 by default) on every node, with no unassigned block left in the pod range
  
  gcloud compute networks subnets describe app-subnet \
    --region=us-west1 \
    --format="table(secondaryIpRanges[].rangeName,secondaryIpRanges[].ipCidrRange)"
  
  Shows: pods=10.2.0.0/16 with 65536 IPs
  Current usage: 256 nodes x one /24 each (GKE default) = all 65536 IPs reserved, even though far fewer pods are running
  
  Fix (immediate): Scale down pods
  Fix (long-term): Add new secondary range
  
  # Cannot resize existing range, must add new one
  gcloud compute networks subnets update app-subnet \
    --region=us-west1 \
    --add-secondary-ranges pods2=10.4.0.0/16
  
  # Create new node pool using new range
  gcloud container node-pools create new-pool \
    --cluster=my-cluster \
    --region=us-west1 \
    --pod-ipv4-range=pods2
  
  # Migrate pods to new pool, drain old pool
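
The capacity of a pod range is easier to plan with a quick calculation. This sketch assumes the common GKE default of one /24 alias block and at most 110 pods per node; both values are parameters, and the function names max_nodes and max_pods are hypothetical.

```python
import ipaddress


def max_nodes(pod_range, per_node_prefix=24):
    """Number of per-node blocks that fit in the pod range."""
    net = ipaddress.ip_network(pod_range)
    if per_node_prefix < net.prefixlen:
        return 0  # the range is smaller than a single per-node block
    return 2 ** (per_node_prefix - net.prefixlen)


def max_pods(pod_range, pods_per_node=110, per_node_prefix=24):
    """Upper bound on schedulable pods for the whole range."""
    return max_nodes(pod_range, per_node_prefix) * pods_per_node


if __name__ == "__main__":
    for cidr in ("10.2.0.0/16", "10.4.0.0/20"):
        print(f"{cidr}: {max_nodes(cidr)} nodes, up to {max_pods(cidr)} pods")
```

Under these assumptions a /16 runs out at 256 nodes, so the node count rather than the pod count is usually the number to watch.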

Conclusion

Alias IP ranges = GCP's elegant solution to container networking:

Advantages over overlays (flannel, weave):

✅ Native VPC routing (fast)
✅ Direct firewall application
✅ IP usage is visible in the VPC (but capacity must be planned upfront)
✅ No MTU complexity

Advantages over static routes:

✅ Anti-spoofing protection (prevents network attacks)
✅ Per-pod routing (no per-node static routes to maintain)
✅ Supported at GCP level (not application-level)

Required planning:

❌ Must allocate secondary ranges before creating the clusters or node pools that use them
❌ Cannot resize secondary ranges (immutable)
❌ Requires IP address management discipline

For GKE production systems, alias IPs are non-negotiable: they're the foundation of secure, performant pod networking.
