DNS Resolution Path: Pod → NodeLocal → Cloud DNS → Upstream

Why This Matters

Understanding the complete DNS resolution path is critical for debugging "service not reachable" issues. An estimated 70% of cluster issues have a DNS component.

Complete Resolution Flow

Step 1: Pod Issues Query

bash
Pod (10.4.1.5):
  curl api.default.svc.cluster.local

  /etc/resolv.conf:
    nameserver 169.254.20.10
    search default.svc.cluster.local svc.cluster.local cluster.local
    options ndots:5 timeout:2 attempts:2

Step 2: ndots Logic (Search Domain)

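With the standard search list for a pod in namespace "default"
(default.svc.cluster.local, svc.cluster.local, cluster.local):

Query: api

Check ndots=5:
  "api" has 0 dots (< 5) → Try search domains first, as-is last

Try sequence:
  1. api.default.svc.cluster.local → FOUND (10.4.0.50)
  2. Stop searching

Query: api.default.svc.cluster.local

Check ndots=5:
  "api.default.svc.cluster.local" has 4 dots (< 5) → Search domains are still tried first

Try sequence:
  1. api.default.svc.cluster.local.default.svc.cluster.local → NXDOMAIN
  2. api.default.svc.cluster.local.svc.cluster.local → NXDOMAIN
  3. api.default.svc.cluster.local.cluster.local → NXDOMAIN
  4. api.default.svc.cluster.local (as-is) → FOUND (10.4.0.50)
  Tip: a trailing dot (api.default.svc.cluster.local.) skips the search list

Query: api.external.example.com

Check ndots=5:
  "api.external.example.com" has 3 dots (< 5) → Search domains first

Try sequence:
  1-3. api.external.example.com.<each search domain> → NXDOMAIN
  4. api.external.example.com (as-is) → Forward to upstream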
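
The ndots rule above can be sketched as a small shell function. This is a minimal illustration, assuming glibc-style resolver behavior; `dns_candidates` is a hypothetical helper, not part of any DNS tool, and it prints the full candidate list even though a real resolver stops at the first answer.

```shell
# Print the names a glibc-style resolver would try, in order.
# Usage: dns_candidates NAME NDOTS SEARCH_DOMAIN...
dns_candidates() {
  name=$1
  ndots=$2
  shift 2
  # A trailing dot marks the name as fully qualified: no search list.
  case $name in
    *.) printf '%s\n' "$name"; return 0 ;;
  esac
  # Count dots by deleting every other character.
  dots=$(( $(printf '%s' "$name" | tr -cd '.' | wc -c) ))
  if [ "$dots" -ge "$ndots" ]; then
    printf '%s\n' "$name"            # enough dots: as-is goes first
  fi
  for domain in "$@"; do
    printf '%s.%s\n' "$name" "$domain"
  done
  if [ "$dots" -lt "$ndots" ]; then
    printf '%s\n' "$name"            # too few dots: as-is goes last
  fi
}
```

For example, `dns_candidates api.external.example.com 5 default.svc.cluster.local svc.cluster.local cluster.local` lists three search-domain candidates before the name itself, which is why external lookups from pods often cost several extra queries.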

Step 3: NodeLocal DNSCache

Pod query reaches: 169.254.20.10:53

NodeLocal cache:
  ├─ Check cache for "api.default.svc.cluster.local"
  ├─ If HIT: Return cached result (no CoreDNS round trip)
  ├─ If MISS: Forward to CoreDNS (10.4.0.11:53)
  │   ├─ CoreDNS resolves
  │   ├─ Return result to NodeLocal
  │   └─ NodeLocal caches result (TTL 30s)
  └─ Return result to pod
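
One rough way to see the cache at work is to watch the answer TTL: a cached answer usually comes back with a TTL that counts down between queries. The sketch below only parses `dig +noall +answer` output; `answer_ttl` is a hypothetical helper and `NODELOCAL_IP` in the usage note is a placeholder.

```shell
# Read `dig +noall +answer` output on stdin and print the TTL
# of the first A record (fields: name, TTL, class, type, address).
answer_ttl() {
  awk '$3 == "IN" && $4 == "A" { print $2; exit }'
}
```

Usage: run `dig +noall +answer api.default.svc.cluster.local @NODELOCAL_IP | answer_ttl` twice a few seconds apart. A lower value on the second run suggests the answer was served from cache.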

Step 4: CoreDNS (If Not Cached)

CoreDNS (10.4.0.11):
  Query: api.default.svc.cluster.local
  
  Corefile rules (in order):
    1. Cache plugin: Check cache
    2. Kubernetes plugin: Look up the Service (in-memory view synced from the Kubernetes API)
       ├─ Service "api" in namespace "default" → ClusterIP 10.4.0.50
    3. Return A 10.4.0.50
    4. Cache result (TTL 30s)
    5. Send to NodeLocal
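
The mapping from query name to Service and namespace can be sketched as follows. It assumes the plain `<service>.<namespace>.svc.<cluster-domain>` layout only (pod and headless-service records use longer names), and `parse_service_name` is a hypothetical helper.

```shell
# Split a Service DNS name into service and namespace.
# Assumes the <service>.<namespace>.svc.<cluster-domain> layout.
parse_service_name() {
  printf '%s\n' "${1%.}" | awk -F. '
    NF >= 4 && $3 == "svc" {
      printf "service=%s namespace=%s\n", $1, $2
      found = 1
    }
    END { if (!found) print "not a service name" }
  '
}
```

The printed pair can be passed straight to `kubectl get svc -n NAMESPACE SERVICE` to confirm that the Service exists.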

Step 5: External Queries (If Needed)

Query: example.com

CoreDNS:
  1. Kubernetes plugin: Not internal → Skip
  2. Forward plugin: Send to upstream
  
Upstream chain:
  ├─ VPC Resolver (169.254.169.254)
  │   ├─ Check private zones (prod.internal.com, etc.)
  │   └─ Not found → Forward to next
  ├─ Cloud DNS (auto-configured)
  │   ├─ Check public zones (example.com)
  │   └─ If not found → Forward to next
  └─ Public DNS (8.8.8.8)
      ├─ Recursive resolution
      └─ Return result
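
The internal/external split can be approximated by a suffix check on the cluster domain. This is a simplified sketch: `route_query` is a hypothetical helper, `cluster.local` is the assumed cluster domain, and a real Corefile may route additional zones (for example reverse lookups).

```shell
# Print "kubernetes" for names under the cluster domain, else "forward".
# Usage: route_query NAME [CLUSTER_DOMAIN]
route_query() {
  name=${1%.}                     # drop one trailing dot, if present
  domain=${2:-cluster.local}      # assumed default cluster domain
  case $name in
    "$domain" | *".$domain") echo kubernetes ;;
    *) echo forward ;;
  esac
}
```

For example, `route_query example.com` prints `forward`, matching the upstream path described above.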

End-to-End Timing

Total latency breakdown for internal query (10.4.1.5):

  1. Pod → NodeLocal (local)         0.1ms
  2. NodeLocal lookup/forward        0.5ms
  3. NodeLocal → CoreDNS             0.2ms
  4. CoreDNS Kubernetes API query    1.5ms
  5. CoreDNS → NodeLocal             0.2ms
  6. NodeLocal → Pod                 0.1ms
  7. Pod processes result            0.5ms
  ────────────────────────────────
  Total (first query):               3.1ms
  Total (cached query, steps 1, 2, 6, 7): 1.2ms
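
When you substitute your own measurements, the per-step figures can be re-added mechanically. A minimal sketch; `sum_latency_ms` is a hypothetical helper and the tab-separated "label, milliseconds" input format is an assumption.

```shell
# Sum per-step latencies given as "label<TAB>milliseconds" lines on stdin.
sum_latency_ms() {
  awk -F '\t' '{ total += $2 } END { printf "%.1f\n", total }'
}
```

Usage: `printf 'Pod to NodeLocal\t0.1\nNodeLocal lookup\t0.5\n' | sum_latency_ms` adds the two listed steps.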

Troubleshooting at Each Layer

Layer 1: Pod /etc/resolv.conf

bash
# Check pod's resolver config
kubectl exec POD_NAME -- cat /etc/resolv.conf

# Expected (with NodeLocal):
#   nameserver 169.254.20.10
#   search default.svc.cluster.local svc.cluster.local cluster.local
#   options ndots:5 timeout:2 attempts:2

# Expected (without NodeLocal):
#   nameserver 10.4.0.11  (CoreDNS)
#   search ...
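
To compare many pods quickly, the two fields that matter most (nameserver and ndots) can be pulled out of the file. A small sketch; `resolv_summary` is a hypothetical helper, and it reports ndots as 1 when the option is absent, which is the resolver default.

```shell
# Print "nameserver=<ip> ndots=<n>" for resolv.conf content on stdin.
resolv_summary() {
  awk '
    $1 == "nameserver" && ns == "" { ns = $2 }   # first nameserver only
    $1 == "options" {
      for (i = 2; i <= NF; i++)
        if ($i ~ /^ndots:/) { split($i, kv, ":"); ndots = kv[2] }
    }
    END { printf "nameserver=%s ndots=%s\n", ns, (ndots == "" ? 1 : ndots) }
  '
}
```

Usage: `kubectl exec POD_NAME -- cat /etc/resolv.conf | resolv_summary`.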

Layer 2: Pod→NodeLocal Connectivity

bash
# From pod
kubectl exec POD_NAME -- nslookup kubernetes.default 169.254.20.10

# If fails: Port 53 blocked or NodeLocal not running
# Check:
kubectl get pods -n kube-system -l k8s-app=node-local-dns

Layer 3: NodeLocal Cache

bash
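# Query through NodeLocal from a client pod on the same node
kubectl exec POD_NAME -- nslookup api.default.svc.cluster.local 169.254.20.10

# Check the NodeLocal Corefile (stored in a ConfigMap)
kubectl get configmap -n kube-system node-local-dns -o yaml

# Logs (written to stdout, not to a file inside the container)
kubectl logs -n kube-system NODELOCAL_POD --tail=100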

Layer 4: CoreDNS Resolution

bash
# Check that the Service exists (run from your workstation;
# the CoreDNS image typically ships without a shell or kubectl)
kubectl get svc -n default api

# Query the CoreDNS service IP directly from a client pod
kubectl exec POD_NAME -- nslookup api.default.svc.cluster.local 10.4.0.11

# Check logs
kubectl logs -n kube-system -l k8s-app=kube-dns --tail=50

Layer 5: Upstream (Cloud DNS)

bash
# From a pod or node with DNS tools installed (test upstream)
nslookup example.com 8.8.8.8
nslookup example.com 169.254.169.254  # VPC resolver

# Find the VPC network the node is attached to
gcloud compute instances describe VM_NAME \
  --zone=us-central1-a \
  --format="value(networkInterfaces[0].network)"

Debugging Tools & Techniques

Tool 1: dig (Direct Query)

bash
# Query specific nameserver
dig api.default.svc.cluster.local @169.254.20.10
dig example.com @8.8.8.8

# Trace the delegation path from the root servers (public names only;
# the root servers know nothing about cluster.local)
dig example.com +trace

# Short answer only
dig +short api.default.svc.cluster.local

Tool 2: nslookup

bash
# Simple query
nslookup api.default.svc.cluster.local

# Query specific nameserver
nslookup api.default.svc.cluster.local 169.254.20.10

# Interactive mode (use a resolver that knows cluster.local)
nslookup
> server 169.254.20.10
> api.default.svc.cluster.local

Tool 3: kubectl debug

bash
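# Debug node DNS
kubectl debug node/NODE_NAME -it --image=ubuntu
  # Inside the debug container: it uses the node's resolver, so use
  # fully qualified names and name the DNS server explicitly
  apt-get update && apt-get install -y dnsutils
  nslookup kubernetes.default.svc.cluster.local 169.254.20.10
  dig @169.254.20.10 api.default.svc.cluster.local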

Tool 4: tcpdump (Packet Capture)

bash
# Capture the first 20 DNS packets on the node (UDP and TCP)
tcpdump -i any -nn -c 20 'port 53'

# Log to file
tcpdump -i eth0 -nn 'udp port 53' -w dns-capture.pcap

# Analyze pcap
tcpdump -r dns-capture.pcap -nn 'udp port 53'

Tool 5: strace (System Call Tracing)

bash
# Trace DNS system calls from an ephemeral debug container
kubectl debug POD_NAME -it --image=ubuntu -- bash
  apt-get update && apt-get install -y strace dnsutils
  strace -e trace=network nslookup api.default.svc.cluster.local

Common Failure Modes

Failure 1: NXDOMAIN (Not Found)

Resolution: api.default → NXDOMAIN

Causes:
  1. Service doesn't exist
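  2. Service in a different namespace (short names resolve only through the pod's own namespace search domain; use api.NAMESPACE)
  3. Pod not using cluster DNS (its resolv.conf lacks the cluster search domains)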

Debug:
  kubectl get svc -n default api
  kubectl get svc -A | grep api
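
It helps to confirm the actual response code, since NXDOMAIN, SERVFAIL, and NOERROR with an empty answer point to different causes. The sketch below extracts the status from the header line of standard `dig` output; `dig_status` is a hypothetical helper.

```shell
# Read full `dig` output on stdin and print the response status
# (for example NOERROR, NXDOMAIN, SERVFAIL).
dig_status() {
  sed -n 's/.*status: \([A-Z]*\),.*/\1/p' | head -n 1
}
```

Usage: `kubectl exec POD_NAME -- dig api.default.svc.cluster.local | dig_status` (assumes `dig` is installed in the pod image).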

Failure 2: Timeout

Resolution: api.default → Timeout (2s)

Causes:
  1. NodeLocal not running
  2. CoreDNS overwhelmed
  3. Upstream resolver unreachable
  4. Firewall blocking port 53

Debug:
  kubectl get pods -n kube-system -l k8s-app=node-local-dns
  kubectl top pods -n kube-system -l k8s-app=kube-dns
  gcloud compute firewall-rules list --filter="allowed.IPProtocol:udp"
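
The resolver options in /etc/resolv.conf bound how long a failing lookup can block. A rough sketch, assuming a glibc-style resolver that sends up to `attempts` queries to each nameserver and waits `timeout` seconds for each; `max_wait_seconds` is a hypothetical helper and other resolvers (for example musl) may behave differently.

```shell
# Rough upper bound, in seconds, for one name that never gets a reply.
# Usage: max_wait_seconds TIMEOUT ATTEMPTS NAMESERVERS
max_wait_seconds() {
  echo $(( $1 * $2 * $3 ))
}
```

With `timeout:2 attempts:2` and a single nameserver, `max_wait_seconds 2 2 1` gives 4, so an application may stall longer than one 2-second timeout before it sees an error.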

Failure 3: Wrong IP Returned

Resolution: api.default → 10.4.0.50 (expected 10.4.0.100)

Causes:
  1. Stale cache (TTL not expired yet)
  2. Load balancer/Ingress IP changed
  3. Wrong service selected

Debug:
  kubectl get svc -n default api -o wide
  dig +short api.default.svc.cluster.local
  Flush cache: Restart NodeLocal/CoreDNS

Performance Profiling

Query Latency Measurement

bash
# Measure resolution time
time nslookup api.default.svc.cluster.local

# Measure from pod (time is a shell keyword, so run it inside a shell)
kubectl exec POD_NAME -- sh -c 'time nslookup api.default.svc.cluster.local'

# Benchmark with multiple queries, using dig's own "Query time" (msec)
for i in {1..100}; do
  dig api.default.svc.cluster.local @169.254.20.10 | awk '/Query time:/ {print $4}'
done | awk '{sum+=$1; n++} END {if (n) print "Average (ms):", sum/n}'
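
Averages hide tail latency, so it can be worth looking at a high percentile as well. A minimal sketch that takes one latency value per line (for example dig's reported query time in msec); `latency_stats` is a hypothetical helper and uses the nearest-rank method for the 95th percentile.

```shell
# Read one latency (ms) per line on stdin; print average and p95.
latency_stats() {
  sort -n | awk '
    { v[NR] = $1; sum += $1 }
    END {
      if (NR == 0) { print "no samples"; exit }
      idx = int((NR * 95 + 99) / 100)   # nearest rank: ceil(0.95 * NR)
      printf "avg=%.2f p95=%s\n", sum / NR, v[idx]
    }
  '
}
```

Pipe the per-query values from a benchmark loop into `latency_stats` in place of a plain average.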

Identify Bottlenecks

If latency high:

  1. Check NodeLocal cache hit rate
     Low hit rate → Increase cache size
  
  2. Check CoreDNS CPU
     High CPU → Scale CoreDNS replicas
  
  3. Check upstream latency
     dig example.com @8.8.8.8 | grep "Query time"
  
  4. Check network latency
     ping -c 5 CoreDNS_Pod_IP
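
For the cache hit rate in item 1, CoreDNS-based components export Prometheus counters that can be turned into a ratio. A sketch under stated assumptions: the metric names `coredns_cache_hits_total` and `coredns_cache_misses_total` come from the CoreDNS cache plugin, the metrics port in the usage note (9253) is the common NodeLocal default and may differ in your cluster, and `cache_hit_ratio` is a hypothetical helper.

```shell
# Read Prometheus text-format metrics on stdin and print the
# cache hit ratio in percent, summed over all label sets.
cache_hit_ratio() {
  awk '
    /^coredns_cache_hits_total/   { hits   += $NF }
    /^coredns_cache_misses_total/ { misses += $NF }
    END {
      total = hits + misses
      if (total == 0) { print "no cache metrics found"; exit }
      printf "%.1f\n", 100 * hits / total
    }
  '
}
```

Usage: `curl -s http://NODE_IP:9253/metrics | cache_hit_ratio`, run from a host that can reach the node's metrics endpoint.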

Best Practices

  1. Understand ndots behavior (search domain logic)
  2. Use NodeLocal DNSCache where available (cache hits skip the CoreDNS round trip)
  3. Monitor at each layer (cache → CoreDNS → upstream)
  4. Document expected IPs (what should api.default resolve to?)
  5. Test resolution after deployments (new service added?)
  6. Keep diagnostics tools available (dig, nslookup on nodes)
  7. Alert on DNS timeouts (early warning of issues)
