Cloud DNS Managed Zones: Public vs Private

Why this matters in production

A managed zone is the atomic unit of Cloud DNS: all records, policies, permissions, and logging attach at the zone level. Getting the zone hierarchy wrong at the start has cascading consequences:

  • Blast radius: deleting one zone removes every record in it
  • Permission model: IAM bindings apply at the zone level, not per record
  • Billing & quota: zones are billed individually, and quotas are enforced per zone
  • Split-horizon architecture: public and private zones for the same domain make management more complex
  • Multi-project sharing: done through peering, not through direct zone sharing
  • Operational overhead: synchronization between multiple zones (when needed) and validation

Real-world: a company manages 500+ zones (development, staging, production, external services). Every zone is a risk point. A bad design means outages and costly troubleshooting.

Managed Zone Fundamentals

In Cloud DNS, a managed zone is a collection of DNS records for a specific domain:

Managed Zone "production-example-com" {
  Zone Name: "production-example-com" (resource ID: lowercase letters, digits, hyphens)
  DNS Name: "production.example.com."
  Records: 
    - service.production.example.com A 10.0.1.5
    - api.production.example.com A 10.0.2.100
    - db.production.example.com A 10.0.3.50
  Type: Public | Private
  Visibility: internet-wide | VPC-specific
  Nameservers: ns-cloud-*.googledomains.com (auto-managed)
}

GCP fully manages:

  • Nameserver infrastructure (globally distributed)
  • Zone propagation (automatic)
  • DNSSEC signing (optional)
  • Query handling (anycast routing)

You only manage:

  • Zone creation/deletion
  • Record management (add/update/remove)
  • Access control (IAM)
  • Policies (forwarding, peering, response policies)

Zone Characteristics

Property | Public Zone | Private Zone
Visibility | Internet-wide | VPC-specific
Authoritative Nameservers | GCP global DNS infrastructure | GCP VPC-internal resolvers
External Resolution | Yes (can be resolved from anywhere) | No (only within authorized VPCs)
Use Case | Public domains, website, APIs | Internal services, databases
Nameserver Update | Update domain registrar | No registrar needed
Cost | $0.20/zone/month, $0.40/million queries | Same
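
To make the cost row concrete, here is a minimal sketch that multiplies out the flat rates quoted in the table. The function name `estimate_dns_cost` is hypothetical, and the flat rates are an assumption taken from this article; actual pricing may be tiered by volume.

```shell
# Estimate a monthly Cloud DNS bill from the flat rates quoted in the table.
# Assumption: $0.20 per zone and $0.40 per million queries, with no tiering.
estimate_dns_cost() {
  zones=$1             # number of managed zones
  million_queries=$2   # queries per month, in millions
  awk -v z="$zones" -v q="$million_queries" \
    'BEGIN { printf "%.2f\n", z * 0.20 + q * 0.40 }'
}

# 500 zones serving 50 million queries per month
estimate_dns_cost 500 50
```

At these rates the zone count, not the query volume, dominates the bill for a 500-zone estate, which is one more reason to decide zone boundaries deliberately.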

Public Managed Zones

Purpose

Public managed zones host DNS records for domains that can be resolved from anywhere on the internet.

External User (8.8.8.8 resolver)
  query: api.example.com?
      ↓
  GCP Global DNS Servers (anycast 216.239.32.10)
      ↓
  [Public Zone "example.com"]
      ↓
  Response: 35.201.100.50 (public IP)

Zone Creation Flow

bash
# Step 1: Create public zone (--description is required)
gcloud dns managed-zones create example-zone \
  --description="Public zone for example.com" \
  --dns-name=example.com \
  --visibility=public

# Step 2: Get nameservers
gcloud dns managed-zones describe example-zone --format="value(nameServers)"
# Output (example; the letter in the name varies per zone):
# ns-cloud-a1.googledomains.com.
# ns-cloud-a2.googledomains.com.
# ns-cloud-a3.googledomains.com.
# ns-cloud-a4.googledomains.com.

# Step 3: Update domain registrar
# Log in to the registrar (GoDaddy, Namecheap, etc.)
# Set the nameservers to all four names returned in Step 2

# Step 4: Add records
gcloud dns record-sets transaction start --zone=example-zone
gcloud dns record-sets transaction add 35.201.100.50 \
  --name=api.example.com. \
  --type=A \
  --ttl=300 \
  --zone=example-zone
gcloud dns record-sets transaction execute --zone=example-zone

# Step 5: Verify against one of the zone's own nameservers
dig api.example.com @ns-cloud-a1.googledomains.com
# Should return 35.201.100.50

Naming Constraints

The DNS name of a public zone must follow domain-name rules:

Valid:
- example.com
- api.example.com (subdomain)
- staging-api.example.com
- db-001.staging.example.com

Invalid:
- example (no TLD)
- .example.com (leading dot)

Normalized rather than rejected:
- example.com. (trailing dot; gcloud adds it automatically when omitted)
- EXAMPLE.COM (case does not matter; normalized to lowercase)
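
The naming rules can be checked before calling gcloud. The sketch below is a hypothetical pre-flight helper, not part of gcloud: it only encodes the rules listed here (case folding, trailing dot, no leading dot, at least two labels) and does not attempt full RFC validation.

```shell
# Lowercase a DNS name and make sure it ends with exactly one trailing dot.
normalize_dns_name() {
  name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  name=${name%.}               # drop one trailing dot if present
  printf '%s.\n' "$name"
}

# Return 0 when the name passes the rules listed above, 1 otherwise.
is_valid_dns_name() {
  candidate=${1%.}             # a trailing dot is allowed, so ignore it
  case "$candidate" in
    .*)   return 1 ;;          # leading dot
    *..*) return 1 ;;          # empty label in the middle
    *.*)  return 0 ;;          # at least two labels
    *)    return 1 ;;          # single label, no TLD
  esac
}

normalize_dns_name EXAMPLE.COM
is_valid_dns_name example || echo "example: rejected (no TLD)"
```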

Important: The zone's DNS name must line up with what is registered at the domain registrar. If you have the domain example.com registered, you can create zones for:

  • example.com (root)
  • api.example.com (subdomain zone)
  • staging.example.com (subdomain zone)

A zone for other.com is of no use unless you own other.com: without control of the registrar, nothing delegates to the zone's nameservers.

Production Pattern: Zone Hierarchy

Instead of one large zone, use several smaller zones:

Production setup:
├── example.com (root, minimal records)
│   ├── A 35.201.100.50 (website)
│   └── MX 10 mail.example.com (email)

├── api.example.com (API zone)
│   ├── A 35.201.100.51
│   ├── api-v1.api.example.com A 10.0.1.100
│   └── api-v2.api.example.com A 10.0.2.100

├── cdn.example.com (CDN zone)
│   └── CNAME *.cdn.example.com to CDN provider

└── staging.example.com (staging)
    └── A 35.201.100.52

Benefits:

  • Permission isolation: the api.example.com team manages only the api zone
  • Operational safety: deleting the api.example.com zone does not affect example.com
  • TTL flexibility: api.example.com can use TTL=60 while staging uses TTL=300
  • Quota management: per-zone quota isolation

Private Managed Zones

Purpose

Private managed zones answer DNS queries only from authorized VPCs. Use them for internal-only services.

Internal VM (10.0.1.5 in VPC A)
  query: db.internal.example.com?
      ↓
  GCP VPC Resolver (only within VPC A)
      ↓
  [Private Zone "internal.example.com" bound to VPC A]
      ↓
  Response: 10.0.3.50 (private IP)

External User (8.8.8.8 resolver)
  query: db.internal.example.com?
      ↓
  Cannot resolve (zone not visible)
      ↓
  Response: NXDOMAIN

Zone Creation & VPC Binding

bash
# Step 1: Create private zone (--description is required)
gcloud dns managed-zones create internal-zone \
  --description="Private zone for internal services" \
  --dns-name=internal.example.com \
  --visibility=private \
  --networks=projects/PROJECT_ID/global/networks/VPC_NAME

# Step 2: Add records
gcloud dns record-sets transaction start --zone=internal-zone
gcloud dns record-sets transaction add 10.0.3.50 \
  --name=database.internal.example.com \
  --type=A \
  --ttl=60 \
  --zone=internal-zone
gcloud dns record-sets transaction execute --zone=internal-zone

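# Step 3: Bind to additional VPCs (if needed)
# --networks replaces the whole list, so repeat the existing network too
gcloud dns managed-zones update internal-zone \
  --networks=projects/PROJECT_ID/global/networks/VPC_NAME,projects/PROJECT_ID/global/networks/VPC_NAME_2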
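
Because `--networks` on `update` sets the complete list of bound networks, adding a VPC safely means merging it with the ones already bound. The sketch below shows one way to build that value; `merge_networks` and `print_bind_command` are hypothetical helpers, and the command is only printed, not executed.

```shell
# Merge the current comma-separated network list with one new network,
# dropping empty entries and duplicates while keeping the original order.
merge_networks() {
  printf '%s,%s\n' "$1" "$2" | tr ',' '\n' | awk 'NF && !seen[$0]++' | paste -sd, -
}

# Print (do not run) the update command for review.
# Arguments: zone name, currently bound networks, network to add.
print_bind_command() {
  printf 'gcloud dns managed-zones update %s --networks=%s\n' \
    "$1" "$(merge_networks "$2" "$3")"
}

print_bind_command internal-zone vpc-a vpc-b
```

Printing the command first gives a chance to confirm that no existing binding is about to be dropped.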

VPC Binding Semantics

Private zone attachment has precise semantics:

Zone "db.internal" bound to: [VPC A, VPC B]

VPC A resources: Can resolve db.internal ✓
VPC B resources: Can resolve db.internal ✓
VPC C resources: Cannot resolve db.internal ✗
VPC D (different project): Cannot resolve db.internal ✗
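
The visibility rule above is a plain membership test: a VPC can resolve the zone only if it appears in the zone's binding list. A minimal model, with the hypothetical helper `can_resolve` and made-up network names:

```shell
# Return 0 if the VPC appears in the zone's comma-separated binding list.
can_resolve() {
  case ",$2," in
    *",$1,"*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Zone "db.internal" bound to vpc-a and vpc-b, as in the example above.
for net in vpc-a vpc-b vpc-c; do
  if can_resolve "$net" "vpc-a,vpc-b"; then
    echo "$net: can resolve db.internal"
  else
    echo "$net: zone not visible"
  fi
done
```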

Cross-project binding: Requires DNS peering (a separate feature), not direct zone sharing.

Implication: Multi-VPC Architectures

With Shared VPC (recommended):

Host Project (central DNS)
├── Shared VPC
├── Private Zone "internal.prod.example.com"
│   └── Bound to: Shared VPC (all service projects can resolve)

Service Project A
├── VMs/GKE in shared VPC → Can resolve internal.prod.example.com

Service Project B
├── VMs/GKE in shared VPC → Can resolve internal.prod.example.com

Without Shared VPC (not recommended, complex):

Project A: VPC A
├── Private Zone "a-internal.example.com" (bound to VPC A)
Project B: VPC B
├── Private Zone "b-internal.example.com" (bound to VPC B)

DNS peering is needed for cross-project resolution.

Zone Naming Best Practices

Convention 1: Environment Separation

Production: prod.example.com, api.prod.example.com
Staging: staging.example.com, api.staging.example.com
Development: dev.example.com, api.dev.example.com

Benefit: Clear environment isolation, easy permission boundaries.

Convention 2: Team/Service Separation

Auth team: auth.internal.example.com
Backend team: backend.internal.example.com
Data team: data.internal.example.com
Infra team: infra.internal.example.com

Benefit: Per-team DNS management, clear ownership.

Convention 3: Geographical Zones (Multi-region)

us-central1: us-c1.example.com
us-east1: us-e1.example.com
europe-west1: eu-w1.example.com

Benefit: Region-specific failover, load balancing decisions.

example.com (root)
├── api.prod.example.com (API production, team-agnostic)
├── backend.prod.example.com (Backend services, team-owned)
├── data.prod.example.com (Data services, team-owned)
└── internal.prod.example.com (Internal tools, all teams)

Zone Access Control (IAM)

Permissions attach at the zone level, not per record:

bash
# Grant a team read-only access
gcloud dns managed-zones add-iam-policy-binding api-prod-zone \
  --member=group:backend-team@company.com \
  --role=roles/dns.reader

# Grant full DNS admin access
gcloud dns managed-zones add-iam-policy-binding api-prod-zone \
  --member=group:dns-admins@company.com \
  --role=roles/dns.admin

# Grant record management only (no zone delete).
# There is no predefined record-only role; dnsRecordEditor is a hypothetical
# custom role that you would define in your own project.
gcloud dns managed-zones add-iam-policy-binding api-prod-zone \
  --member=serviceAccount:automation@project.iam.gserviceaccount.com \
  --role=projects/PROJECT_ID/roles/dnsRecordEditor

Key roles:

  • roles/dns.admin: Full control (create/delete/modify zones and records)
  • roles/dns.reader: Read-only access to zones and records
  • Custom record-editor role: Modify records only (cannot delete the zone)

Production pattern:

  • Developers: custom record-editor role (modify records)
  • Platform engineers: dns.admin (manage zones)
  • Automation: custom record-editor role (Terraform)
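
The "modify records but never delete the zone" boundary can be expressed as a custom IAM role. The sketch below only prints a `gcloud iam roles create` command for review. The role ID `dnsRecordEditor`, its title, and the exact permission set are assumptions chosen for illustration, so check them against the permissions your tooling actually needs.

```shell
# Permissions for a record-only role: read zones, manage record sets and
# changes, but nothing that creates, updates, or deletes a managed zone.
record_editor_permissions() {
  printf '%s' "dns.changes.create,dns.changes.get,dns.changes.list,"
  printf '%s' "dns.managedZones.get,dns.managedZones.list,"
  printf '%s' "dns.resourceRecordSets.create,dns.resourceRecordSets.delete,"
  printf '%s\n' "dns.resourceRecordSets.list,dns.resourceRecordSets.update"
}

# Print (do not run) the role-creation command for the given project.
print_role_command() {
  printf 'gcloud iam roles create dnsRecordEditor --project=%s --title="DNS Record Editor" --permissions=%s\n' \
    "$1" "$(record_editor_permissions)"
}

print_role_command my-project
```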

Quotas & Limits

Cloud DNS quotas per zone:

Quota | Limit | Implication
Records per zone | 100,000 | Usually enough; very large deployments should watch it
Record sets per zone | 100,000 | Same as above
Queries per second (QPS) | Burst: 1000 QPS, sustained: burst/zone | Rate limiting if exceeded
Nameserver changes | Propagate in seconds | Immediate effect (unlike traditional DNS)

Example: If one zone hits its rate limit, queries to other zones are not affected (isolation).

Monitoring quota:

bash
# Cloud DNS quotas live under the dns command group, not compute
gcloud dns project-info describe PROJECT_ID

# To monitor query rate:
# Use Cloud Monitoring → DNS metrics

Zone Deletion & Recovery

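Deletion Mechanics

Deleting a zone:

bash
gcloud dns managed-zones delete zone-name
# Fails unless the zone is empty (only the default NS and SOA records remain)
# Once deleted, the zone disappears from the console immediately
# Authoritative answers stop within seconds; cached answers live until their TTL expires
# Cloud Audit Logs keep a trail of the API calls, not a restorable copy of the zone

Recovery:

bash
# Cloud DNS has no undelete: recreate the zone with the same name and restore the records
# (No automatic recovery like Google Cloud Storage)
# A recreated public zone may be assigned different nameservers, so re-check the registrar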

Production implication: Always back up zone records (via Terraform, Ansible, scripts).

bash
# Back up zone records in a format that can be re-imported
gcloud dns record-sets export zone-backup.yaml --zone=zone-name

# Or via Terraform (recommended); dns_name and record names need the trailing dot
resource "google_dns_managed_zone" "prod" {
  name        = "prod-zone"
  dns_name    = "prod.example.com."
  description = "Production zone - IaC managed"
}
resource "google_dns_record_set" "api" {
  name         = "api.prod.example.com."
  type         = "A"
  ttl          = 300
  rrdatas      = ["10.0.1.5"]
  managed_zone = google_dns_managed_zone.prod.name
}
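
For estates with many zones, a per-zone export can be looped over the whole project. This is a minimal sketch, assuming gcloud is installed and authenticated; the function name `backup_all_zones` and the one-file-per-zone layout are illustrative choices, not an established convention.

```shell
# Export every managed zone in a project to its own YAML file.
# Arguments: project ID, output directory.
backup_all_zones() {
  project=$1
  outdir=$2
  mkdir -p "$outdir"
  gcloud dns managed-zones list --project="$project" --format='value(name)' |
  while IFS= read -r zone; do
    [ -n "$zone" ] || continue
    gcloud dns record-sets export "$outdir/$zone.yaml" \
      --zone="$zone" --project="$project"
  done
}
```

A call such as `backup_all_zones my-project ./dns-backups` could run from a scheduled job, with the output directory committed to version control so that changes between runs show up as diffs.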

Common Mistakes

❌ Mistake 1: A single zone for all services

Zone "example.com" contains:
  - api records
  - database records
  - internal tools records
  - staging records

Problem: Cannot isolate permission, cannot delete staging records without affecting production.

Solution: Multiple zones (api.example.com, db.example.com, staging.example.com).

❌ Mistake 2: Public zones for internal services

Private Zone: None
Public Zone "example.com" contains:
  - api.example.com → 10.0.1.5 (PRIVATE IP exposed publicly!)

Problem: Private IPs leak to the internet; the name resolves from inside the network but also from outside.

Solution: Use private zones for internal-only services.
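
This mistake is easy to catch mechanically. The sketch below flags A records that point at RFC 1918 addresses. It assumes a hypothetical input format of one `name type rrdata` triple per line, so real export output would need to be reshaped into that form first.

```shell
# Return 0 if the IPv4 address falls in an RFC 1918 private range
# (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16).
is_private_ipv4() {
  case "$1" in
    10.*|192.168.*) return 0 ;;
    172.*)
      second=${1#172.}          # strip the first octet
      second=${second%%.*}      # keep only the second octet
      [ "$second" -ge 16 ] && [ "$second" -le 31 ] ;;
    *) return 1 ;;
  esac
}

# Read "name type rrdata" lines on stdin and report leaking A records.
lint_public_records() {
  while read -r name type data; do
    if [ "$type" = "A" ] && is_private_ipv4 "$data"; then
      echo "LEAK: $name -> $data"
    fi
  done
}

printf '%s\n' \
  'example.com. A 35.201.100.50' \
  'api.example.com. A 10.0.1.5' | lint_public_records
```

Running a check like this in CI against every public zone turns the rule into something enforced rather than remembered.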

❌ Mistake 3: Not backing up zones

Zone deleted by accident → Cannot recover

Solution: Terraform IaC, automated backups.

❌ Mistake 4: Wrong TTL

database.internal.example.com TTL=0
→ Every query hits authoritative nameserver
→ Latency spike, possible rate limiting

Solution: TTL=300 (5 min) for stable records, TTL=60 for changing records.

Zone Naming & Resolver Behavior

Subdomain Delegation (important for scale)

If the zone "example.com" exists, how does a resolver handle a query for "api.example.com"?

Scenario 1: api.example.com is an A record inside the zone

Zone "example.com" {
  api.example.com A 10.0.1.5
}

Query: api.example.com?
→ Resolved inside the zone → 10.0.1.5

Scenario 2: api.example.com is a separate (delegated) zone

Zone "example.com" {
  api.example.com NS ns-cloud-b1.googledomains.com  # Delegation to the child zone's nameservers
}

Zone "api.example.com" {
  api.example.com A 10.0.1.5
}

Query: api.example.com?
→ example.com zone says: "ask ns-cloud-b1.googledomains.com"
→ ns-cloud-b1 returns: 10.0.1.5

Production impact: Subdomain delegation adds one DNS hop on a cold cache (slightly more latency).

Tradeoff:

  • Single zone: Simpler, faster, but less permission granularity
  • Multiple zones: More complex, slightly more latency, better isolation

Recommendation: Use multiple zones for large organizations and a single zone for small ones.
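
One way to reason about which zone ends up answering a query in the two scenarios is "the most specific zone wins". The sketch below models that as a longest-suffix match over zone names. It is a simplified mental model, not the resolver's actual implementation, and `find_zone` is a hypothetical helper.

```shell
# Print the most specific zone whose name is a suffix of the query name.
# Usage: find_zone QUERY ZONE... (names written without trailing dots)
find_zone() {
  query=$1
  shift
  best=""
  for zone in "$@"; do
    case "$query" in
      "$zone"|*."$zone")
        if [ "${#zone}" -gt "${#best}" ]; then best=$zone; fi ;;
    esac
  done
  printf '%s\n' "$best"
}

find_zone api.example.com example.com                    # Scenario 1: single zone
find_zone api.example.com example.com api.example.com    # Scenario 2: delegated child zone
```

With only the parent zone present the query is answered there; once a child zone exists, it takes over for everything at or below its name.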

Production Patterns

Pattern 1: Service-Oriented Zones

Each service team owns 1 zone:
- auth-team: auth.internal.example.com
- backend-team: api.internal.example.com
- data-team: db.internal.example.com

Benefits:
✓ Clear ownership
✓ Independent modification
✓ Permission isolation
✓ Easy to add/remove team services

Pattern 2: Environment-Separated Zones

prod.example.com (all prod services)
staging.example.com (all staging services)
dev.example.com (all dev services)

Benefits:
✓ Easy environment promotion (copy records)
✓ Different TTLs per environment
✓ Clear audit boundaries

Pattern 3: Hybrid Public-Private

Zone "example.com" (public, minimal)
  - example.com A 35.201.100.50 (website)
  - api.example.com A 35.201.100.51 (public API)

Zone "internal.example.com" (private, internal-only)
  - database.internal.example.com A 10.0.3.50
  - cache.internal.example.com A 10.0.3.51
  - admin.internal.example.com A 10.0.1.100

Benefits:
✓ Clear public/private boundary
✓ Public records cannot accidentally expose private IPs
✓ Different permission models
