
DNS Forwarding: Configuring Upstream Resolvers

Why This Matters

DNS forwarding is the glue layer between Cloud DNS and external nameservers (on-premises DNS, ISP resolvers, third-party DNS services).

Real-world scenarios:

Scenario 1: On-Premises Integration
├── On-prem BIND server has corp.internal zone
├── Cloud DNS doesn't know corp.internal
└── Solution: Forwarding zone → send queries to on-prem BIND

Scenario 2: Legacy System Migration
├── Old DNS server (Infoblox) on 192.168.1.10
├── New Cloud DNS serves external zones only
└── During migration: forwarding zone for the legacy names

Scenario 3: ISP DNS Failover
├── Default: 8.8.8.8 (Google's free DNS)
├── Need: Company-specific upstream (corporate proxy)
└── Solution: Forwarding zone → corporate resolver

Scenario 4: Conditional Routing
├── Queries for *.internal → forward to on-prem
├── Queries for *.example.com → use public DNS
└── Different routing rules per domain

Impact of a wrong implementation:

  • DNS timeouts (forwarding target unreachable)
  • Infinite loops (forwarder forwards back to self)
  • Performance degradation (unnecessary hops)
  • Compliance breaches (queries not logged)

Forwarding Zone Fundamentals

Concept

Forwarding zone = conditional query redirection.

Normal Resolution:
  Query: unknown.domain.com
  └── Cloud DNS checks local zones
      → Not found
      → Forwards to public DNS (default)
      → Returns result

Forwarding Zone:
  Query: corp.internal (matched by forwarding zone)
  └── Cloud DNS checks forwarding zone config
      → Matched!
      → Forwards to specific upstream (192.168.1.10)
      → Returns result

  Query: api.example.com (not matched)
  └── Cloud DNS proceeds normally

Types of Forwarding

Type                | Target           | Use Case            | Example
Internal Forwarding | On-prem DNS      | Hybrid cloud        | corp.internal → 192.168.1.10
External Forwarding | Public DNS       | Conditional routing | private.service → 10.0.1.100
Chained Forwarding  | Another GCP zone | Multi-project       | zone-a → peering to zone-b

Implementation: Internal Forwarding (On-Prem)

Architecture

Cloud DNS (default resolver):
├── Zone corp.internal (forwarding zone)
├── Target: 192.168.1.10 (on-prem BIND)
└── Applies to: VPC queries

VPC Resource (10.0.1.5):
  Query: server.corp.internal
  └── VPC Resolver (169.254.169.254)
      → Check local zones → forwarding zone "corp.internal" matched
      → Forward to 192.168.1.10
      → On-prem BIND responds: 192.168.1.100
      → Result: 192.168.1.100

Setup: GCP Command

bash
# Create the forwarding zone together with its target
# (forwarding targets are set with --forwarding-targets)
gcloud dns managed-zones create onprem-corp \
  --description="Forward corp.internal to on-prem BIND" \
  --dns-name=corp.internal. \
  --visibility=private \
  --networks=projects/PROJECT_ID/global/networks/default \
  --forwarding-targets=192.168.1.10

# Optional: Add secondary/tertiary targets for HA
gcloud dns managed-zones update onprem-corp \
  --forwarding-targets=192.168.1.10,192.168.1.11,192.168.1.12

Setup: Terraform

hcl
resource "google_dns_managed_zone" "onprem_corp" {
  name        = "onprem-corp-zone"
  dns_name    = "corp.internal."
  description = "Forwarding to on-premises BIND"
  visibility  = "private"
  
  private_visibility_config {
    networks_list {
      network_url = google_compute_network.default.id
    }
  }
  
  # Forwarding configuration
  forwarding_config {
    target_nameservers {
      ipv4_address = "192.168.1.10"
    }
    target_nameservers {
      ipv4_address = "192.168.1.11"
    }
  }
}

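# Allow VMs to reach on-prem DNS directly (useful for dig tests from a VM).
# Note: Cloud DNS sends forwarded queries from 35.199.192.0/19, so the
# on-prem firewall must allow that range and route replies back over
# VPN/Interconnect.
resource "google_compute_firewall" "allow_dns_to_onprem" {
  name      = "allow-dns-to-onprem"
  network   = google_compute_network.default.id
  direction = "EGRESS"
  
  allow {
    protocol = "udp"
    ports    = ["53"]
  }
  
  # DNS falls back to TCP for large responses
  allow {
    protocol = "tcp"
    ports    = ["53"]
  }
  
  destination_ranges = ["192.168.1.0/24"]
}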

Failover & Retry Logic

GCP Cloud DNS automatically handles forwarder failover:

Primary: 192.168.1.10 (on-prem BIND 1)
Secondary: 192.168.1.11 (on-prem BIND 2)

Query: server.corp.internal

Step 1: Try primary (192.168.1.10)
  → Timeout/unreachable
  
Step 2: Try secondary (192.168.1.11)
  → Success → Response cached
  
Step 3: Future queries favor the responsive target (Cloud DNS ranks targets by responsiveness; the exact retry schedule toward the primary is internal to the service)

Automatic failover = No manual intervention.
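
The same try-primary-then-secondary idea is useful in a client-side health check. The sketch below is an assumption-laden illustration, not Cloud DNS's internal algorithm: `query_forwarder`, `resolve_with_failover`, and the `QUERY_FN` override are hypothetical helpers, and the script assumes `dig` is installed.

```shell
#!/bin/sh
# Query one forwarder; succeeds only if a non-empty answer comes back.
# +time/+tries bound the wait per target. An empty answer (e.g. NXDOMAIN)
# is treated as a failure here for simplicity.
query_forwarder() {
  out=$(dig +short +time=2 +tries=1 "@$1" "$2" A) || return 1
  [ -n "$out" ] && printf '%s\n' "$out"
}

# Try each forwarder in order and print the first answer with its source.
# QUERY_FN can be set to another function (for example a stub in tests).
resolve_with_failover() {
  name=$1
  shift
  for server in "$@"; do
    if answer=$("${QUERY_FN:-query_forwarder}" "$server" "$name"); then
      printf '%s via %s\n' "$answer" "$server"
      return 0
    fi
    echo "forwarder $server failed, trying next" >&2
  done
  return 1
}
```

From a VM with a route to on-prem, it could be called as `resolve_with_failover server.corp.internal 192.168.1.10 192.168.1.11`. A non-zero exit status means no forwarder answered.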

Advanced Forwarding Patterns

Pattern 1: Conditional Routing

Forward different domains to different upstreams:

bash
# On-prem zone → on-prem DNS
gcloud dns managed-zones create onprem-forward \
  --description="Forward corp.internal to on-prem DNS" \
  --dns-name=corp.internal. \
  --visibility=private \
  --networks=projects/PROJECT_ID/global/networks/default \
  --forwarding-targets=192.168.1.10

# External provider → external resolver
gcloud dns managed-zones create provider-forward \
  --description="Forward external.provider.com to 8.8.8.8" \
  --dns-name=external.provider.com. \
  --visibility=private \
  --networks=projects/PROJECT_ID/global/networks/default \
  --forwarding-targets=8.8.8.8

Result:

Query: server.corp.internal → 192.168.1.10 (on-prem)
Query: api.external.provider.com → 8.8.8.8 (external)
Query: service.example.com → default public DNS

Pattern 2: Chained Forwarding (Multi-Hop)

A forwards to B, B forwards to C:

Cloud DNS (GCP):
  Zone corp.internal → Forward to 192.168.1.10

On-Prem BIND (192.168.1.10):
  Zone corp.internal
  Zone legacy.corp.internal → Forward to 192.168.1.20

Firewall (192.168.1.20):
  Zone legacy.corp.internal

Query Path:

Cloud DNS query: server.legacy.corp.internal
  → Matches forwarding zone corp.internal → Forward to 192.168.1.10 (on-prem BIND)
  → BIND query: server.legacy.corp.internal
  → BIND matches Zone legacy.corp.internal → Forward to 192.168.1.20
  → Firewall responds: 192.168.1.200
  → Response back through chain
  
Total hops: 3
Total latency: ~20-30ms when all hops are nearby (acceptable)

Recommendation: Limit to 2-3 hops max (avoid cascading failures).

Pattern 3: Load Balancing Across Forwarders

Multiple forwarders, GCP load-balances:

bash
gcloud dns managed-zones create balanced-forward \
  --description="Forward distributed.internal to three targets" \
  --dns-name=distributed.internal. \
  --visibility=private \
  --networks=projects/PROJECT_ID/global/networks/default \
  --forwarding-targets=192.168.1.10,192.168.1.11,192.168.1.12

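GCP distributes queries across the targets. The even split below is illustrative; the actual distribution depends on how responsive each target is:

100 queries
  → ~33 to 192.168.1.10
  → ~33 to 192.168.1.11
  → ~34 to 192.168.1.12

Benefit: Load spreads, so a single DNS server does not become the bottleneck.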

Integration with Peering + Forwarding

Hub-Spoke + On-Prem

Hub VPC:
├── Zone corp.internal (forwarding) → 192.168.1.10 (on-prem)
└── Zone gcp.services (local)

Spoke VPC:
└── Query corp.internal
    → Peering zone matches
    → Resolved in the hub VPC's context (via DNS peering)
    → Hub forwarding zone forwards to on-prem
    → Answer returns to spoke
  
Result: Spoke can resolve on-prem domains through hub.

Terraform: Complete Hub-Spoke + On-Prem

hcl
# Hub Project
resource "google_dns_managed_zone" "hub_onprem" {
  project    = "hub-project"
  name       = "hub-onprem"
  dns_name   = "corp.internal."
  visibility = "private"
  
  private_visibility_config {
    networks_list {
      network_url = google_compute_network.hub_vpc.id
    }
  }
  
  forwarding_config {
    target_nameservers {
      ipv4_address = "192.168.1.10"
    }
  }
}

# Inbound server policy: lets on-prem resolvers query Cloud DNS in the hub VPC.
# Cloud DNS allocates the inbound entry-point addresses; they are not set here.
resource "google_dns_policy" "hub" {
  project                   = "hub-project"
  name                      = "hub-inbound"
  enable_inbound_forwarding = true
  
  networks {
    network_url = google_compute_network.hub_vpc.id
  }
}

# Spoke Project
resource "google_dns_managed_zone" "spoke_onprem_peered" {
  project    = "spoke-project"
  name       = "spoke-onprem-peered"
  dns_name   = "corp.internal."
  visibility = "private"
  
  private_visibility_config {
    networks_list {
      network_url = google_compute_network.spoke_vpc.id
    }
  }
  
  peering_config {
    target_network {
      network_url = "projects/hub-project/global/networks/hub-vpc"
    }
  }
}

Troubleshooting Forwarding

Issue 1: Timeout Resolving Forwarded Domain

bash
# From a GCP VM:
nslookup server.corp.internal
# Result: timeout

# Debug:
# 1. Check that the forwarding zone exists and lists the expected targets
gcloud dns managed-zones describe onprem-corp

# 2. Check that the forwarder IP is reachable from a VM
gcloud compute ssh VM --zone=us-central1-a --tunnel-through-iap
nc -zv 192.168.1.10 53  # TCP check; should succeed

# 3. Check that the firewall allows outbound DNS
gcloud compute firewall-rules list --filter="direction:EGRESS"

# 4. Verify the on-prem DNS server is listening (run on the on-prem host)
netstat -an | grep 53
# Should show: 0.0.0.0:53 LISTEN

# 5. Query the forwarder directly from the VM
dig @192.168.1.10 server.corp.internal
# If this works but the nslookup above still times out, check that on-prem
# allows and routes back to 35.199.192.0/19 (Cloud DNS forwarding source range)

Issue 2: Infinite Loop (Forwarder Points Back)

Setup (WRONG):
  Cloud DNS: corp.internal → forward to 192.168.1.10
  On-prem (192.168.1.10): corp.internal → forward to Cloud DNS
  
Result:
  Query loops between Cloud DNS and on-prem
  
Debug:
  tcpdump -i eth0 -nn 'udp port 53'
  See queries going back-and-forth infinitely
  
Solution:
  Change forwarding target OR change on-prem config
  Ensure: Forwarder doesn't point back to self
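
A loop like this can be caught before deployment by walking the documented forwarding chain. The following is a minimal sketch under stated assumptions: `FORWARD_MAP` is a hand-maintained, hypothetical inventory of "resolver, next hop" pairs (it is not exported by Cloud DNS or BIND), and `next_hop` and `check_chain` are illustrative helpers.

```shell
#!/bin/sh
# Hypothetical inventory: "resolver next-hop" per line for one zone.
# A next hop of "-" means the server answers the zone itself.
FORWARD_MAP='cloud-dns 192.168.1.10
192.168.1.10 cloud-dns'

# Print the next hop for a resolver, or "-" if it is not in the map.
next_hop() {
  printf '%s\n' "$FORWARD_MAP" |
    awk -v s="$1" '$1 == s { print $2; found = 1; exit }
                   END { if (!found) print "-" }'
}

# Follow the chain from a starting resolver; fail if any resolver repeats.
check_chain() {
  current=$1
  seen=' '
  while [ "$current" != "-" ]; do
    case "$seen" in
      *" $current "*)
        echo "loop detected at $current"
        return 1
        ;;
    esac
    seen="$seen$current "
    current=$(next_hop "$current")
  done
  echo "no loop"
}

check_chain cloud-dns || echo "fix the forwarding config before deploying"
```

With the map above, the walk revisits `cloud-dns` and reports the loop. Changing the second line to `192.168.1.10 -` models the corrected setup.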

Issue 3: Forwarder Down, Queries Hang

Setup:
  corp.internal → forward to 192.168.1.10 (only target)
  
If 192.168.1.10 is down:
  → All corp.internal queries timeout
  → Applications hang (waiting for DNS)
  
Solution:
  1. Add multiple forwarders
     gcloud dns managed-zones update onprem-corp \
       --forwarding-targets=192.168.1.10,192.168.1.11
  
  2. Monitor forwarder health
     Implement health checks, alerting
  
  3. Plan a failover path
     Cloud DNS does not fall back to public DNS for a forwarding zone when all targets are down, and public DNS could not answer for corp.internal anyway. Keep a standby resolver that can serve the zone.

Performance & Scalability

Latency Impact

Local zone (no forwarding):       ~2ms
Forwarded zone (nearby):          ~5-10ms
Forwarded zone (far):             ~20-50ms
Chained forwarding (2-3 hops):    ~50-100ms

Rule of thumb: If forwarder latency > 100ms, consider local caching or zone replication.
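
To apply the rule of thumb, forwarder latency can be sampled from a VM. This sketch parses the `Query time` line that `dig` prints. `query_time_ms` and `classify_latency` are hypothetical helper names, and the measurement covers the VM-to-forwarder path only, not the path Cloud DNS itself takes.

```shell
#!/bin/sh
# Extract the "Query time" value (in ms) from dig output on stdin.
query_time_ms() {
  awk '/^;; Query time:/ { print $4 }'
}

# Classify a latency (integer ms) against the 100 ms rule of thumb.
classify_latency() {
  if [ "$1" -gt 100 ]; then
    echo "slow: consider caching or zone replication"
  else
    echo "ok"
  fi
}

# Usage (requires dig and a reachable forwarder):
#   ms=$(dig @192.168.1.10 server.corp.internal | query_time_ms)
#   classify_latency "$ms"
```

A single sample is noisy; averaging several runs at different times of day gives a more useful number.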

Query Rate Limits

If 10,000 queries/sec forwarded to on-prem BIND:
  → BIND might be overwhelmed
  
Scaling strategies:
  1. Multiple forwarders (load-balance)
  2. Local caching (reduce upstream load)
  3. Upgrade forwarder hardware
  4. Consider zone replication instead of forwarding

Cache Behavior

Query: server.corp.internal (TTL=300)

First query:
  Cloud DNS → forward to on-prem → response cached (300s)

Subsequent queries (within 300s):
  Cloud DNS → return cached result (no forwarding)

After 300s:
  Cache expires → forward again

Implication: The forwarding cost is paid roughly once per unique record per TTL, not once per query.
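
A quick back-of-the-envelope check of that implication, assuming one shared cache and that every record is requested continuously. Both are simplifications (real caches are distributed, so actual upstream load can be higher). `max_upstream_per_hour` is a hypothetical helper.

```shell
#!/bin/sh
# Upper bound on forwarded queries per hour under the single-cache
# assumption: each unique record is refetched at most once per TTL,
# regardless of the client query rate.
max_upstream_per_hour() {
  unique_records=$1
  ttl_seconds=$2
  echo $(( unique_records * 3600 / ttl_seconds ))
}

max_upstream_per_hour 1000 300   # prints 12000
```

So 1,000 distinct names with a 300-second TTL generate at most about 12,000 forwarded queries per hour (roughly 3 per second), even if clients issue thousands of queries per second.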

Security Considerations

DNSSEC with Forwarding

If forwarding zone uses DNSSEC:
  DNSSEC validation should happen at forwarder (on-prem)
  Cloud DNS passes-through DNSSEC records
  
If forwarding zone doesn't use DNSSEC:
  Risk: Forwarder could be hijacked
  
Recommendation:
  ✓ Use VPN/Interconnect (encrypted channel)
  ✓ Validate forwarder TLS certificate (if HTTPS DNS)

Access Control

bash
# Who can modify forwarding zones?
gcloud dns managed-zones add-iam-policy-binding onprem-corp \
  --member=group:dns-admins@company.com \
  --role=roles/dns.admin

# Restrict to specific team
gcloud dns managed-zones add-iam-policy-binding onprem-corp \
  --member=serviceAccount:automation@project.iam.gserviceaccount.com \
  --role=roles/dns.recordAdmin

Monitoring Forwarded Queries

bash
# Enable logging (the policy is attached to networks with --networks)
gcloud dns policies create forward-log \
  --description="Log forwarded queries" \
  --enable-logging \
  --networks=projects/PROJECT/global/networks/default

# View logs (Cloud DNS query logs use the dns_query resource type)
gcloud logging read 'resource.type="dns_query"' --limit=50

Best Practices

  1. Always add secondary forwarders (HA)
  2. Test connectivity before production deployment
  3. Monitor forwarder health continuously
  4. Limit forwarding hops (max 3)
  5. Use VPN/Interconnect for secure forwarding
  6. Document forwarder targets (which team manages on-prem DNS?)
  7. Implement alerting if forwarder becomes unreachable
  8. Plan for failover (what if on-prem is down?)

Anti-Patterns

  ❌ Single forwarder (no HA)
  ❌ Circular forwarding (A→B→A)
  ❌ Too many hops (performance degradation)
  ❌ No monitoring (blind spot when breaks)
  ❌ Unencrypted forwarding (security risk)
  ❌ No TTL consideration (cache miss optimization)
