Auto-mode vs Custom-mode VPC: Choosing the Right Architecture

Summary

Auto-mode VPC: Easy default, poor for production
Custom-mode VPC: Requires planning, mandatory for production

Feature	Auto-mode	Custom-mode
CIDR Block	Fixed 10.128.0.0/9	Your choice (10.0.0.0/8, etc.)
Subnets Created	Auto-created per region	Manual, on-demand
Subnet CIDRs	10.128.0.0/20 (us-central1), 10.138.0.0/20 (us-west1), etc.	You define
Peering Support	Limited (every auto-mode network uses 10.128.0.0/9, so they overlap each other)	Full
Secondary Ranges	Added by hand after the fact	Planned at subnet creation (needed for GKE)
Production Ready	❌ No	✅ Yes

For production, use custom-mode exclusively.

Auto-mode VPC: "Convenient but Dangerous"

When you create a VPC without specifying a mode:

gcloud compute networks create my-network
# Result: an auto-mode VPC is created
# CIDR: 10.128.0.0/9 (fixed)
# Subnets: Auto-created in ALL regions

GCP automatically creates a subnet in every region from pre-allocated CIDRs:

Auto-mode CIDR Block: 10.128.0.0/9 (8M IPs)
├── us-central1:      10.128.0.0/20 (4096 IPs)
├── europe-west1:     10.132.0.0/20 (4096 IPs)
├── us-west1:         10.138.0.0/20 (4096 IPs)
├── us-east1:         10.142.0.0/20 (4096 IPs)
├── asia-northeast1:  10.146.0.0/20 (4096 IPs)
└── ... (38 more regions)
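The sizes in this layout follow directly from the prefix lengths. The sketch below recomputes them with Python's standard `ipaddress` module; the constant and variable names are illustrative only.

```python
import ipaddress

# Auto-mode block and per-region prefix length, as described above.
AUTO_BLOCK = ipaddress.ip_network("10.128.0.0/9")
REGION_PREFIX = 20

# Total addresses in the /9 block.
block_size = AUTO_BLOCK.num_addresses
# Addresses in a single regional /20 subnet.
subnet_size = 2 ** (32 - REGION_PREFIX)
# Number of /20 slots that fit inside the /9 block.
region_slots = 2 ** (REGION_PREFIX - AUTO_BLOCK.prefixlen)

print(f"{AUTO_BLOCK}: {block_size:,} addresses")
print(f"one /{REGION_PREFIX}: {subnet_size:,} addresses")
print(f"/{REGION_PREFIX} slots in the block: {region_slots:,}")
```

The block is large, but each region only ever receives one small slice of it.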

Problem 1: CIDR Block is Fixed & Shared

Every auto-mode VPC in the organization uses the same 10.128.0.0/9 block:

Organization:
├── Project "team-a"
│   └── VPC "team-a-network" (auto-mode → 10.128.0.0/9)
│
└── Project "team-b"
    └── VPC "team-b-network" (auto-mode → 10.128.0.0/9)

Issue: Both VPCs use exactly the same CIDR!

Implication: Auto-mode VPCs cannot be peered with each other:

gcloud compute networks peerings create team-a-to-b \
  --network=team-a-network \
  --peer-project=team-b \
  --peer-network=team-b-network

# Fails: the subnet ranges of the two networks overlap
# (both draw from 10.128.0.0/9)
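The overlap behind this failure can be checked before attempting a peering. The following is a minimal sketch using Python's `ipaddress` module; the function name `overlapping_pairs` and the sample range lists are hypothetical, with two regional /20 ranges standing in for each network's full subnet list.

```python
import ipaddress

def overlapping_pairs(ranges_a, ranges_b):
    """Return every pair of subnet ranges (one per network) that overlap."""
    nets_a = [ipaddress.ip_network(r) for r in ranges_a]
    nets_b = [ipaddress.ip_network(r) for r in ranges_b]
    return [(str(a), str(b)) for a in nets_a for b in nets_b if a.overlaps(b)]

# Two auto-mode networks carve identical regional subnets out of 10.128.0.0/9.
team_a_auto = ["10.128.0.0/20", "10.132.0.0/20"]
team_b_auto = ["10.128.0.0/20", "10.132.0.0/20"]

# Two custom-mode networks with ranges chosen not to collide.
team_a_custom = ["10.0.0.0/20"]
team_b_custom = ["172.16.0.0/20"]

print(overlapping_pairs(team_a_auto, team_b_auto))      # non-empty: peering is blocked
print(overlapping_pairs(team_a_custom, team_b_custom))  # empty: peering is possible
```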

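Fix: Convert the network to custom mode before peering:

# One-way conversion: auto-mode can become custom-mode, never the reverse
gcloud compute networks update team-a-network --switch-to-custom-subnet-mode

# Inefficient - overlapping subnets must still be removed one by one
# (auto-created subnets carry the same name as the network)
gcloud compute networks subnets delete team-a-network --region us-west1
gcloud compute networks subnets delete team-a-network --region us-central1
# ... repeat for 40+ regions

# The CIDRs are now available for custom allocation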

Problem 2: Fixed Subnet Size (4096 IPs)

Every auto-mode subnet is a /20 (4096 IPs):

Subnet us-west1 (auto-mode): 10.138.0.0/20
├── Usable IPs: 4092 (4096 minus the 4 addresses GCP reserves in each primary range)
├── With GKE: ~4000 pods max (if pod IPs come from this range)
├── Real scenario: 1000 pods = ~2500 IPs used
└── Remaining: ~1500 free IPs

Reality: After 2-3 clusters, the IPs are exhausted!

Scaling issue: An auto-mode subnet can only be expanded as far as a /16, and a range can never be shrunk. If you need more than that:

❌ Cannot: gcloud compute networks subnets expand-ip-range beyond /16
✅ Must: Create a new custom-mode VPC and migrate everything
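The arithmetic behind subnet sizing can be sketched as follows. It assumes four reserved addresses per primary range (network, default gateway, second-to-last, and broadcast), and it tabulates prefix lengths from /20 down to /16 purely for comparison; the names `RESERVED_PER_SUBNET` and `usable_addresses` are illustrative.

```python
import ipaddress

# Assumption: four addresses are reserved in every primary subnet range.
RESERVED_PER_SUBNET = 4

def usable_addresses(cidr: str) -> int:
    """Addresses left for workloads once the reserved ones are removed."""
    net = ipaddress.ip_network(cidr)
    return net.num_addresses - RESERVED_PER_SUBNET

# Compare a /20 with progressively larger ranges starting at the same address.
sizes = {prefix: usable_addresses(f"10.128.0.0/{prefix}") for prefix in range(20, 15, -1)}

for prefix, usable in sizes.items():
    print(f"/{prefix}: {usable:,} usable addresses")
```

Each step down in prefix length doubles the range, so going from /20 to /16 is a sixteenfold increase.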

Problem 3: Secondary Ranges Complexity

Auto-mode does not set up secondary ranges for you. They have to be added by hand:

# Auto-mode subnet (us-west1): 10.138.0.0/20
# (auto-created subnets carry the same name as the network)
gcloud compute networks subnets update team-a-network \
  --region=us-west1 \
  --add-secondary-ranges pods=10.192.0.0/16

Issue: Every range is picked by hand, subnet by subnet: 10.192.0.0/16
        for a US subnet, 10.193.0.0/16 for an EU subnet, etc.

Nothing ties the range names or CIDR blocks to a network-wide plan per use case.

Problem 4: Difficult Migration Path

Migrating from auto-mode to a custom-mode VPC with a new CIDR layout is painful:

Step 1: Create new custom-mode VPC
Step 2: Create subnets with the custom CIDR layout
Step 3: Create disk snapshots of all VMs
Step 4: Create new VMs on the custom-mode VPC
Step 5: Restore data
Step 6: Update DNS/load-balancers
Step 7: Delete old auto-mode VPC
Step 8: Remove the old IP ranges from firewall rules

Duration: Days, not hours
Risk: High (DNS inconsistency, data sync issues)

Custom-mode VPC: Production-Grade

Custom-mode gives you full control:

gcloud compute networks create prod-network \
  --subnet-mode=custom

# No subnets are created automatically!
# You define everything yourself

Benefit 1: CIDR Flexibility

Choose any RFC 1918 range, or public IPs:

Option 1: 10.0.0.0/8 (16M IPs) - for most organizations
Option 2: 172.16.0.0/12 (1M IPs) - for smaller setups
Option 3: 192.168.0.0/16 (65K IPs) - for very small labs
Option 4: Custom public IPs (if you have your own AS number)

Benefit 2: Per-Region Subnet Planning

Custom-mode VPC: 10.0.0.0/8 (GLOBAL)

Regions:
├── Americas:
│   ├── us-west1:   10.0.0.0/14 (262K IPs per region)
│   ├── us-central1: 10.4.0.0/14
│   └── us-east1:    10.8.0.0/14
├── Europe:
│   ├── europe-west1: 10.32.0.0/14
│   └── europe-north1: 10.36.0.0/14
└── Asia:
    ├── asia-northeast1: 10.64.0.0/14
    └── asia-southeast1: 10.68.0.0/14

Flexibility: If us-west1 needs more:

Current: 10.0.0.0/14 (262K)
New request: 10.0.0.0/13 (524K)

Reallocation:
  Step 1: Check for conflicts (10.0.0.0/13 spans 10.0.0.0-10.7.255.255 and overlaps us-central1's 10.4.0.0/14, so the existing range cannot simply grow)
  Step 2: Add a new secondary range from unallocated space instead, for example 10.12.0.0/14
  Step 3: Update GKE cluster to use new range
  Step 4: Monitor, verify
  Step 5: Old range becomes deprecated, unused
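The conflict check in Step 1 can be sketched with Python's `ipaddress` module. The region names and ranges are copied from the regional layout above; the helper names `conflicts` and `first_free` are hypothetical.

```python
import ipaddress

# Regional plan copied from the layout above.
PLAN = {
    "us-west1": "10.0.0.0/14",
    "us-central1": "10.4.0.0/14",
    "us-east1": "10.8.0.0/14",
    "europe-west1": "10.32.0.0/14",
    "europe-north1": "10.36.0.0/14",
    "asia-northeast1": "10.64.0.0/14",
    "asia-southeast1": "10.68.0.0/14",
}

def conflicts(candidate: str, owner: str) -> list[str]:
    """Regions other than the owner whose range overlaps the candidate."""
    cand = ipaddress.ip_network(candidate)
    return [
        region
        for region, cidr in PLAN.items()
        if region != owner and cand.overlaps(ipaddress.ip_network(cidr))
    ]

def first_free(prefix: int, parent: str = "10.0.0.0/8") -> ipaddress.IPv4Network:
    """First block of the given prefix length inside parent that no region uses."""
    taken = [ipaddress.ip_network(cidr) for cidr in PLAN.values()]
    for block in ipaddress.ip_network(parent).subnets(new_prefix=prefix):
        if not any(block.overlaps(t) for t in taken):
            return block
    raise ValueError("no free block of that size")

print(conflicts("10.0.0.0/13", owner="us-west1"))  # ['us-central1']
print(first_free(14))                               # 10.12.0.0/14
```

Growing us-west1 in place collides with its neighbour, so extra capacity has to come from a block the plan has not handed out yet.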

Benefit 3: Secondary Ranges at Design Time

Define everything upfront:

Subnet "us-west1-prod":
  Primary CIDR: 10.0.0.0/20 (VMs)
  Secondary CIDR 1: 10.2.0.0/16 (GKE pods)
  Secondary CIDR 2: 10.3.0.0/16 (GKE services)

Implication: Pod traffic can be handled separately from VM traffic at the network level:

Firewall rule-01: allow ingress tcp:80 from tag:lb to pods (10.2.0.0/16)
Firewall rule-02: allow ingress tcp:3306 from pods (10.2.0.0/16) to tag:db
Firewall rule-03: deny all other to VMs (10.0.0.0/20)
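Because each workload type has its own range, an address alone says what kind of workload it belongs to. The sketch below illustrates that idea with the ranges of subnet "us-west1-prod"; the `RANGES` table and the `classify` helper are hypothetical names, not part of any GCP tooling.

```python
import ipaddress

# Ranges of subnet "us-west1-prod" as defined above.
RANGES = {
    "vms": ipaddress.ip_network("10.0.0.0/20"),
    "pods": ipaddress.ip_network("10.2.0.0/16"),
    "services": ipaddress.ip_network("10.3.0.0/16"),
}

def classify(ip: str) -> str:
    """Return the workload type whose range contains the address."""
    addr = ipaddress.ip_address(ip)
    for name, net in RANGES.items():
        if addr in net:
            return name
    return "unknown"

for ip in ("10.0.1.5", "10.2.44.7", "10.3.0.10", "10.9.9.9"):
    print(ip, "->", classify(ip))
```

The same lookup is what lets a firewall rule target "pods" or "VMs" by CIDR without listing individual machines.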

Benefit 4: Peering Enablement

Multiple custom-mode VPCs can peer freely as long as their ranges do not overlap:

VPC "team-a": 10.0.0.0/8
VPC "team-b": 172.16.0.0/12

Peering "team-a-to-b":
  Network A subnets (10.0.0.0/8) ↔ Network B subnets (172.16.0.0/12)
  Status: ✓ Can peer (no CIDR overlap)
  
Result: Cross-VPC connectivity without public IPs

Detailed Comparison: Auto vs Custom

Scenario: GKE Deployment

Auto-mode Cluster Setup:

1. Create auto-mode VPC
   gcloud compute networks create test-network

2. GKE cluster creation
   gcloud container clusters create my-cluster \
     --network=test-network \
     --region=us-west1
   
   GCP allocates (example values):
   - Cluster nodes use: 10.138.0.0/20 (primary)
   - Pods use: 10.192.0.0/16 (secondary, assigned automatically)
   - Services use: 10.195.0.0/16 (assigned automatically)

3. After 2 months: Running 500 pods
   - Pod CIDR usage: ~500 IPs from 10.192.0.0/16
   - Available: 65K IPs
   - Scale: ~65K pod IPs at most before exhaustion, fewer in practice because GKE hands each node a whole block of pod IPs

4. After 6 months: 10 clusters
   - Total pod CIDR: 10 separate /16 ranges, one per cluster
   - Problem: 10 independently allocated ranges, cannot share across clusters
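A rough capacity estimate for a pod range can be sketched as follows. It assumes a /24 block of pod addresses per node and at most 110 pods per node, which are GKE Standard defaults as far as I know; both are assumptions about the cluster configuration, and clusters that override them will get different numbers. The function name `pod_capacity` is illustrative.

```python
# Assumption: each node receives a /24 block of pod addresses.
DEFAULT_NODE_PREFIX = 24
# Assumption: each node runs at most 110 pods.
DEFAULT_PODS_PER_NODE = 110

def pod_capacity(pod_range_prefix: int,
                 node_prefix: int = DEFAULT_NODE_PREFIX,
                 pods_per_node: int = DEFAULT_PODS_PER_NODE) -> tuple[int, int]:
    """Return (max nodes, max pods) that a pod range of the given prefix supports."""
    if pod_range_prefix > node_prefix:
        raise ValueError("pod range is smaller than one per-node block")
    max_nodes = 2 ** (node_prefix - pod_range_prefix)
    return max_nodes, max_nodes * pods_per_node

nodes, pods = pod_capacity(16)
print(f"/16 pod range: up to {nodes} nodes, {pods:,} pods")
```

Under these assumptions the limit is set by the number of per-node blocks, not by the raw address count of the range.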

Custom-mode Cluster Setup:

1. Create custom-mode VPC
   gcloud compute networks create prod-network --subnet-mode=custom

2. Create subnet with planned secondaries
   gcloud compute networks subnets create us-west1-pods \
     --network=prod-network \
     --region=us-west1 \
     --range=10.0.0.0/20 \
     --secondary-range pods=10.2.0.0/16,services=10.3.0.0/16

3. GKE cluster
   gcloud container clusters create cluster-1 \
     --network=prod-network \
     --subnetwork=us-west1-pods \
     --region=us-west1 \
     --enable-ip-alias \
     --cluster-secondary-range-name pods \
     --services-secondary-range-name services

4. Cluster 2 (different region)
   gcloud compute networks subnets create us-central1-pods \
     --network=prod-network \
     --region=us-central1 \
     --range=10.1.0.0/20 \
     --secondary-range pods=10.4.0.0/16,services=10.5.0.0/16

   gcloud container clusters create cluster-2 \
     --network=prod-network \
     --subnetwork=us-central1-pods \
     --region=us-central1 \
     --enable-ip-alias \
     --cluster-secondary-range-name pods \
     --services-secondary-range-name services

5. Both clusters on same VPC
   - Pod networking: cluster-1 pods ↔ cluster-2 pods (routable directly, subject to firewall rules)
   - Services: service.cluster1 ↔ service.cluster2 (cross-cluster DNS resolution needs additional setup)
   - Single firewall policy governs both

Winner: Custom-mode (multi-cluster support, shared networking)

Scenario: Org-wide Peering

Auto-mode Attempt:

Org "MyCompany" has 3 teams:

Team A: VPC "team-a" (auto-mode)
  CIDR: 10.128.0.0/9 (fixed)

Team B: VPC "team-b" (auto-mode)
  CIDR: 10.128.0.0/9 (SAME!)

Team C: VPC "team-c" (auto-mode)
  CIDR: 10.128.0.0/9 (SAME!)

Peering Attempt:
  team-a ↔ team-b: ❌ FAIL (CIDR overlap)
  team-b ↔ team-c: ❌ FAIL (CIDR overlap)
  
Solution: Convert each VPC to custom mode, then re-address its subnets
  Duration: 1-2 weeks
  Risk: IP conflicts, service interruption

Custom-mode Plan:

Org "MyCompany" with custom-mode:

Team A: VPC "team-a"
  CIDR: 10.0.0.0/12

Team B: VPC "team-b"
  CIDR: 10.16.0.0/12

Team C: VPC "team-c"
  CIDR: 10.32.0.0/12

Peering Matrix:
  team-a ↔ team-b: ✓ OK (10.0.0.0/12 vs 10.16.0.0/12)
  team-b ↔ team-c: ✓ OK (10.16.0.0/12 vs 10.32.0.0/12)
  team-a ↔ team-c: ✓ OK (10.0.0.0/12 vs 10.32.0.0/12)
  
Full mesh: 3 peering connections (each created from both sides; peering is not transitive) → all teams can communicate internally
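The peering matrix can be derived mechanically from the CIDR plan. The sketch below uses Python's `ipaddress` and `itertools`; the team names and ranges come from the two plans above, while `peering_matrix` is a hypothetical helper name.

```python
import ipaddress
from itertools import combinations

CUSTOM_PLAN = {"team-a": "10.0.0.0/12", "team-b": "10.16.0.0/12", "team-c": "10.32.0.0/12"}
AUTO_PLAN = {"team-a": "10.128.0.0/9", "team-b": "10.128.0.0/9", "team-c": "10.128.0.0/9"}

def peering_matrix(vpcs: dict[str, str]) -> dict[tuple[str, str], bool]:
    """Map each unordered VPC pair to True when its ranges do not overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in vpcs.items()}
    return {
        (a, b): not nets[a].overlaps(nets[b])
        for a, b in combinations(sorted(nets), 2)
    }

for label, plan in (("custom", CUSTOM_PLAN), ("auto", AUTO_PLAN)):
    for pair, ok in peering_matrix(plan).items():
        print(label, pair, "OK" if ok else "FAIL (CIDR overlap)")
```

A full mesh of n networks needs n(n-1)/2 pairs, which is where the three connections for three teams come from.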

Migration: Auto → Custom

When you have to migrate:

Option 1: Parallel Migration (Zero-downtime)

Week 1: Plan
  - Allocate new CIDRs for custom-mode VPC
  - Design secondary ranges
  - Review firewall rules

Week 2: Setup
  - Create custom-mode VPC
  - Create subnets with new CIDRs
  - Create disk snapshots

Week 3-4: Dual-run
  - Spin up new VMs on custom-mode VPC
  - Mirror data from auto-mode VMs
  - Run both in parallel

Week 5: DNS Switch
  - Update DNS to point to custom-mode VMs
  - Verify 100% traffic on new VPC
  - Keep auto-mode as fallback

Week 6: Cleanup
  - Delete auto-mode VMs
  - Remove auto-mode VPC
  - Verify no lingering references

Cost: 2x resources for about 3 weeks (weeks 3-5, temporary)
Benefit: Zero downtime

Option 2: Maintenance Window (Faster)

Friday 2am - Sunday 6am maintenance window

Step 1: Snapshot all disks (2 hours)
Step 2: Delete auto-mode VMs, firewall rules, VPC + subnets (30 min)
Step 3: Create custom-mode VPC + subnets (1 hour)
Step 4: Restore VMs from snapshots (2 hours)
Step 5: Verify networking (1 hour)
Step 6: Notify teams, revert if issues (on standby)

Total: ~6-8 hours downtime

Cost: Lower (single set of resources)
Risk: Higher (DNS inconsistency, stale DNS caches)

Best Practices for Custom-mode

1. Design CIDR Layout Before Creating VPC

Template:
VPC: 10.0.0.0/8

Per-region allocation:
├── Region A: 10.0.0.0/11 (2M IPs)
├── Region B: 10.32.0.0/11 (2M IPs)
├── Region C: 10.64.0.0/11 (2M IPs)
├── Region D: 10.96.0.0/11 (2M IPs)
└── Reserve: 10.128.0.0/9 (8M) - future expansion/fallback

Per-subnet allocation (within region; GCP subnets are regional and span all zones):
  Region A (10.0.0.0/11):
  ├── Subnet 1 VMs: 10.0.0.0/14
  ├── Subnet 1 pods: 10.16.0.0/14
  ├── Subnet 2 VMs: 10.4.0.0/14
  ├── Subnet 2 pods: 10.20.0.0/14
  └── ...

Document the layout in a spreadsheet and keep it under version control.
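A layout kept in version control can also be validated automatically. The following is a minimal sketch with Python's `ipaddress` module: it rejects misaligned CIDRs and computes which part of the parent block is still unallocated. The region labels match the template above; `is_aligned` and `leftover` are hypothetical helper names.

```python
import ipaddress

PARENT = ipaddress.ip_network("10.0.0.0/8")
REGIONS = {"A": "10.0.0.0/11", "B": "10.32.0.0/11", "C": "10.64.0.0/11", "D": "10.96.0.0/11"}

def is_aligned(cidr: str) -> bool:
    """True when the address is a valid network address for its prefix length."""
    try:
        ipaddress.ip_network(cidr)  # strict parsing rejects set host bits
        return True
    except ValueError:
        return False

def leftover(parent, allocated):
    """Collapse everything inside parent that the allocated blocks do not cover."""
    nets = [ipaddress.ip_network(cidr) for cidr in allocated]
    for net in nets:
        if not net.subnet_of(parent):
            raise ValueError(f"{net} is outside {parent}")
    finest = max(net.prefixlen for net in nets)
    free = [
        block
        for block in parent.subnets(new_prefix=finest)
        if not any(block.overlaps(net) for net in nets)
    ]
    return list(ipaddress.collapse_addresses(free))

print(is_aligned("10.128.0.0/8"))            # False: 10.128.0.0 is not on a /8 boundary
print(leftover(PARENT, REGIONS.values()))    # the block still free for future use
```

Running a check like this in CI catches an invalid or overlapping entry before it reaches a `gcloud` command.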

2. Reserve Secondary Ranges at Subnet Creation

# Wrong - no secondary ranges
gcloud compute networks subnets create prod-us-west1 \
  --network=prod \
  --region=us-west1 \
  --range=10.0.0.0/20

# Right - define all secondaries upfront (comma-separated NAME=RANGE pairs)
gcloud compute networks subnets create prod-us-west1 \
  --network=prod \
  --region=us-west1 \
  --range=10.0.0.0/20 \
  --secondary-range pods=10.2.0.0/16,services=10.3.0.0/16,canary=10.4.0.0/16

This makes future GKE clusters and canary deployments easier.
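When subnets are created from a documented plan, the command line can be generated rather than typed. The sketch below builds the `--secondary-range` value as comma-separated NAME=RANGE pairs and refuses overlapping ranges; `secondary_range_flag` and `subnet_create_command` are hypothetical helpers that only produce a string and never call `gcloud`.

```python
import ipaddress
import shlex

def secondary_range_flag(ranges: dict[str, str]) -> str:
    """Build the --secondary-range value as NAME=CIDR pairs joined by commas."""
    parsed = {name: ipaddress.ip_network(cidr) for name, cidr in ranges.items()}
    names = list(parsed)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if parsed[first].overlaps(parsed[second]):
                raise ValueError(f"{first} overlaps {second}")
    return ",".join(f"{name}={net}" for name, net in parsed.items())

def subnet_create_command(name, network, region, primary, ranges) -> str:
    """Return a shell-quoted subnet creation command for review or scripting."""
    argv = [
        "gcloud", "compute", "networks", "subnets", "create", name,
        f"--network={network}",
        f"--region={region}",
        f"--range={primary}",
        f"--secondary-range={secondary_range_flag(ranges)}",
    ]
    return shlex.join(argv)

print(subnet_create_command(
    "prod-us-west1", "prod", "us-west1", "10.0.0.0/20",
    {"pods": "10.2.0.0/16", "services": "10.3.0.0/16", "canary": "10.4.0.0/16"},
))
```

Generating the command from data keeps the documented plan and the deployed subnet from drifting apart.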

3. Document Firewall Rules Against CIDR Semantics

# Bad documentation
rule-001: allow egress to 10.0.0.0/8

# Good documentation
rule-001: allow egress to prod-pods-secondary-range (10.2.0.0/16)
          Reason: GKE pods need to reach external APIs
          Exceptions: 10.3.0.0/16 (services) - internal only

# Better - use firewall policies
policy-prod:
  rules:
    - name: allow-pods-egress
      direction: EGRESS
      destination: 10.2.0.0/16
      description: GKE pods (secondary range)

4. Test Peering Before Org-wide Rollout

Pilot phase:
  Team A (10.0.0.0/12) ↔ Team B (10.16.0.0/12) peering

Verification:
  - VM in team-a can ping VM in team-b
  - DNS resolution works across VPCs
  - Firewall rules don't block unexpectedly
  - Routing table consistency after propagation

If issues: Fix before team-c, team-d join

When Auto-mode is Acceptable

Development/Lab environments only:
- Short-lived clusters
- No peering needed
- Rapid prototyping

Single-region GKE clusters (beta only):
- Cluster only uses local subnets
- No cross-region communication
- Acceptable risk for non-production

Educational purposes (tutorials, codelabs):
- Learn GCP basics
- Disposable infrastructure
- Easy cleanup

DO NOT use auto-mode for:

Production systems ❌
Multi-region deployments ❌
Enterprise org-wide peering ❌
Any system requiring IP predictability ❌

Conclusion

Auto-mode VPC = training wheels that must come off.

Custom-mode VPC = required for:

✅ Production systems
✅ Multi-region scaling
✅ Peering/networking
✅ GKE at any scale
✅ Firewall policy consistency

Planning overhead (few hours) << recovery cost of auto-mode mistakes (days).

Always start with custom-mode. Your future self will thank you.
