
NodeLocal DNSCache: Latency Reduction & Caching Mechanics

Why This Matters

NodeLocal DNSCache runs a local DNS cache on each Kubernetes node, which reduces:

  • Latency from pod to DNS server (1-2ms instead of 5-10ms)
  • Load on centralized CoreDNS (95% of queries hit the local cache)
  • DNS server resource usage

Impact at scale:

10,000 pods × 10 queries/min = 100,000 queries/min ≈ 1,666 QPS

Without NodeLocal:
  All queries go to CoreDNS (bottleneck)

With NodeLocal:
  95% cache hits locally
  Only 83 QPS to CoreDNS (20x reduction)
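
The arithmetic behind these figures can be checked with a short shell sketch. The inputs are illustrative assumptions rather than measurements: a total of about 1,666 QPS corresponds to roughly 10 queries per pod per minute across 10,000 pods, and the 95% hit rate is the figure quoted above. The function names `dns_qps` and `upstream_qps` are hypothetical helpers.

```shell
#!/bin/sh
# Estimate cluster-wide DNS QPS and the share that still reaches CoreDNS.
# All inputs are illustrative assumptions, not measurements.

# dns_qps PODS QUERIES_PER_MIN -> total queries per second (integer, truncated)
dns_qps() {
    echo $(( $1 * $2 / 60 ))
}

# upstream_qps TOTAL_QPS HIT_PERCENT -> QPS that miss the node-local cache
upstream_qps() {
    echo $(( $1 * (100 - $2) / 100 ))
}

total=$(dns_qps 10000 10)
upstream=$(upstream_qps "$total" 95)
echo "total=${total} QPS, to CoreDNS=${upstream} QPS"
```

Changing the hit rate shows how sensitive the upstream load is: at 90% the same cluster would send about twice as many queries to CoreDNS.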

Architecture

Pod Query (10.4.1.5):

/etc/resolv.conf: nameserver 169.254.20.10

NodeLocal DNSCache (169.254.20.10:53, runs on each node)
  ├─ Cache hit: Return cached result (no network hop)
  └─ Cache miss: Forward to CoreDNS (10.4.0.11:53)

      CoreDNS resolves, returns result

      NodeLocal caches result (until its TTL expires)

      Return to pod

Deployment

Enable on GKE Cluster

bash
# New cluster
gcloud container clusters create my-cluster \
  --addons=NodeLocalDNS \
  --zone=us-central1-a

# Existing cluster
gcloud container clusters update my-cluster \
  --update-addons=NodeLocalDNS=ENABLED \
  --zone=us-central1-a

# Verify
kubectl get pods -n kube-system -l k8s-app=node-local-dns
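
The verify step can be tightened by listing only the pods that are not healthy. The sketch below is a minimal filter, assuming the standard `kubectl get pods --no-headers` column order (NAME, READY, STATUS, ...); the function name `not_running` and the sample pod names are hypothetical.

```shell
#!/bin/sh
# not_running: read `kubectl get pods --no-headers` output on stdin and print
# the name and status of every pod whose STATUS column is not "Running".
not_running() {
    awk '$3 != "Running" { print $1, $3 }'
}

# Live usage (requires cluster access):
#   kubectl get pods -n kube-system -l k8s-app=node-local-dns --no-headers | not_running

# Offline demonstration with made-up pod names:
not_running <<'EOF'
node-local-dns-abcde   1/1   Running            0   5d
node-local-dns-fghij   0/1   CrashLoopBackOff   7   5d
EOF
```

An empty result means every node-local-dns pod reports Running; it does not by itself prove that there is one pod per node.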

Configuration

bash
# Check NodeLocal config
kubectl get cm node-local-dns -n kube-system -o yaml

# Default: /etc/coredns/Corefile
# Caches:
#   - Internal services
#   - Negative cache (NXDOMAIN)
#   - External queries

Caching Mechanics

Cache Behavior

Query 1: api.default.svc.cluster.local
  → NodeLocal cache miss
  → Forward to CoreDNS
  → CoreDNS responds: 10.4.0.50
  → NodeLocal caches (TTL=30 by default)
  → Pod gets 10.4.0.50

Query 2 (same pod, T+5s): api.default.svc.cluster.local
  → NodeLocal cache hit
  → Return 10.4.0.50 (from cache)
  → NO CoreDNS query

Query 3 (T+31s, after cache TTL):
  → NodeLocal cache expired
  → Forward to CoreDNS
  → New query cycle

Negative Caching

Query: nonexistent.default.svc.cluster.local
  → CoreDNS returns NXDOMAIN
  → NodeLocal caches NXDOMAIN (TTL=5 by default)

Repeat query within 5 seconds:
  → NodeLocal returns cached NXDOMAIN
  → No CoreDNS involved

Result: Reduced load from erroneous queries
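
The hit/miss timelines above reduce to one comparison: an entry is served from cache while its age is below the TTL. The following sketch models only that rule, using the 30-second and 5-second TTLs quoted above; it is a simplification that ignores capacity limits and any other behavior of the real cache, and `cache_lookup` is a hypothetical helper name.

```shell
#!/bin/sh
# cache_lookup CACHED_AT TTL NOW -> prints "hit" or "miss"
# An entry stored at CACHED_AT is valid while NOW - CACHED_AT < TTL.
cache_lookup() {
    cached_at=$1; ttl=$2; now=$3
    if [ $(( now - cached_at )) -lt "$ttl" ]; then
        echo hit
    else
        echo miss
    fi
}

# Positive answer cached at T+0 with a 30 s TTL
echo "T+5s:  $(cache_lookup 0 30 5)"
echo "T+31s: $(cache_lookup 0 30 31)"
# NXDOMAIN cached at T+0 with a 5 s TTL
echo "NXDOMAIN T+3s: $(cache_lookup 0 5 3)"
echo "NXDOMAIN T+6s: $(cache_lookup 0 5 6)"
```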

Performance Impact

Latency Reduction

Measurement: Pod to K8s service resolution

Without NodeLocal:
  Average: 8.5ms
  P99: 25ms
  P99.9: 50ms

With NodeLocal:
  Average: 2.1ms (75% reduction)
  P99: 5ms (80% reduction)
  P99.9: 12ms (76% reduction)

CoreDNS Load Reduction

Cluster: 10,000 pods
Without NodeLocal:
  CoreDNS QPS: 1,666
  CPU usage: ~4 cores

With NodeLocal:
  CoreDNS QPS: 83 (95% reduction)
  CPU usage: ~0.2 cores
  
Result: Single CoreDNS pod sufficient (vs. 10 pods without)

Deployment Gotchas

Gotcha 1: Host Network Services

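NodeLocal DNSCache runs on the host network (169.254.20.10)
Pods with hostNetwork: true inherit the node's resolver by default, so they bypass cluster DNS

Solution: set dnsPolicy to ClusterFirstWithHostNet
  kubectl apply -f - <<EOF
  apiVersion: v1
  kind: Pod
  metadata:
    name: host-net-example
  spec:
    hostNetwork: true
    dnsPolicy: ClusterFirstWithHostNet  # plain ClusterFirst falls back to Default when hostNetwork is set
    containers:
    - name: app
      image: busybox
      command: ["sleep", "3600"]  # keep the container running
  EOF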

Gotcha 2: Sidecar Containers

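All containers in a pod share one /etc/resolv.conf, so DNS settings cannot differ per container:
  Main container: Uses NodeLocal (169.254.20.10)
  Sidecar: Cannot be given a different resolver through the pod spec

Solution: override DNS for the whole pod, or point the sidecar at its resolver in application config
  apiVersion: v1
  kind: Pod
  metadata:
    name: custom-dns-example
  spec:
    dnsPolicy: "None"  # ignore cluster DNS and use only dnsConfig
    dnsConfig:
      nameservers:
      - 8.8.8.8  # Override for every container in this pod
      - 8.8.4.4
    containers:
    - name: main
      image: busybox
      command: ["sleep", "3600"]
    - name: sidecar
      image: busybox
      command: ["sleep", "3600"]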
Gotcha 3: Search Domain Issues

/etc/resolv.conf with NodeLocal:
  nameserver 169.254.20.10
  search default.svc.cluster.local svc.cluster.local cluster.local
  options ndots:5

Query: api (short name, no matching Service)
  → The pod's resolver (not CoreDNS) appends each search domain in turn:
    1. api.default.svc.cluster.local → NXDOMAIN
    2. api.svc.cluster.local → NXDOMAIN
    3. api.cluster.local → NXDOMAIN
    4. api (as-is) → NXDOMAIN

Result: 4 queries for 1 request (query amplification)

Mitigation:
  Use an FQDN with a trailing dot where possible: api.default.svc.cluster.local.
  (without the trailing dot, a name with fewer than 5 dots still goes through the search list)
  Reduce search domains: remove unnecessary ones
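
The expansion order can be reproduced offline. The sketch below imitates the usual stub-resolver rule (names with fewer dots than ndots go through the search list first, names with a trailing dot skip it); it is an approximation under those assumptions, it does not stop at the first successful answer, and `expand_name` is a hypothetical helper. The three search domains passed in are the standard ones for a pod in the default namespace.

```shell
#!/bin/sh
# expand_name NAME NDOTS SEARCH_DOMAIN...
# Prints, one per line, the candidate names a stub resolver would try in order.
expand_name() {
    name=$1; ndots=$2; shift 2
    # A trailing dot marks the name as fully qualified: no search expansion.
    case $name in
        *.) echo "$name"; return ;;
    esac
    # Count the dots in the name.
    dots_only=$(printf '%s' "$name" | tr -cd '.')
    dots=${#dots_only}
    # With at least NDOTS dots, the name is tried as-is first.
    if [ "$dots" -ge "$ndots" ]; then echo "$name."; fi
    for domain in "$@"; do
        echo "$name.$domain."
    done
    # Otherwise the as-is name is tried last.
    if [ "$dots" -lt "$ndots" ]; then echo "$name."; fi
}

expand_name api 5 default.svc.cluster.local svc.cluster.local cluster.local
```

Running the same function on `api.default.svc.cluster.local.` (with the trailing dot) prints a single candidate, which is why the trailing dot avoids the extra lookups.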

Troubleshooting

Issue 1: Service Unresolvable After NodeLocal Enabled

bash
# 1. Check that the NodeLocal pods are running
kubectl get pods -n kube-system -l k8s-app=node-local-dns

# 2. Check /etc/resolv.conf in the pod
#    Should show: nameserver 169.254.20.10
kubectl exec POD_NAME -- cat /etc/resolv.conf

# 3. Test NodeLocal directly from the node (the busybox image includes nslookup)
kubectl debug node/NODE_NAME -it --image=busybox -- \
  nslookup kubernetes.default.svc.cluster.local 169.254.20.10

# 4. Check CoreDNS logs
kubectl logs -n kube-system -l k8s-app=kube-dns --tail=50

Issue 2: NodeLocal Cache Staleness

Symptom: A new service endpoint does not resolve until the cached entry expires

Debug:
  Check cache TTL
  Default: 30s for Kubernetes records
  
  If service changed:
    T+0: Update
    T+0-30: Old IP cached
    T+30: New IP resolved

Solution:
  1. Reduce the cache TTL in the NodeLocal config (shorter staleness window, more CoreDNS load)
  2. Flush the cache by restarting the pods: kubectl delete pod -n kube-system -l k8s-app=node-local-dns
  3. Use shorter TTLs for frequently changing records

Issue 3: High Memory Usage

NodeLocal cache consuming too much memory:

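Debug:
  1. Check memory usage
     kubectl top pods -n kube-system -l k8s-app=node-local-dns

  2. Monitor cache size and evictions
     The logs do not record evictions; track coredns_cache_entries (and
     coredns_cache_evictions_total, where the CoreDNS version exposes it) from the metrics endpoint

  3. Reduce cache capacity in config
     Lower the CAPACITY value of success/denial in the Corefile cache block (default: 9984 entries each)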

Configuration Examples

Corefile with NodeLocal

corefile
.:53 {
    cache 30
    errors
    health {
        lameduck 5s
    }
    ready
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
        ttl 30
    }
    prometheus :9153
    forward . /etc/resolv.conf {
        max_concurrent 1000
    }
    loop
    reload
    loadbalance
}

Custom Corefile with Forwarding

corefile
.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        ttl 30
    }

    # Custom forwarding for specific domain
    forward example.com 8.8.8.8

    # cache syntax: success|denial CAPACITY [TTL]
    # Cache positive answers for up to 30 seconds, negative answers for up to 5
    cache {
        success 9984 30
        denial 9984 5
    }

    forward . /etc/resolv.conf
    loop
    reload
}

Monitoring

bash
# Monitor cache performance
kubectl top pods -n kube-system -l k8s-app=node-local-dns

# Check cache hit rate
kubectl logs -n kube-system -l k8s-app=node-local-dns | \
  grep "cache" | tail -20

# Alert on NodeLocal unavailability
gcloud alpha monitoring policies create \
  --notification-channels=CHANNEL_ID \
  --display-name="NodeLocal DNSCache down" \
  --condition-display-name="No pod network traffic reported" \
  --condition-filter='resource.type="k8s_pod" AND metric.type="kubernetes.io/pod/network/received_bytes_count"' \
  --if=absent \
  --duration=300s
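
Grepping logs gives only a rough picture of the cache hit rate; the CoreDNS cache plugin also exports Prometheus counters that allow a direct calculation. The sketch below parses Prometheus text-format metrics from stdin and computes the hit percentage from `coredns_cache_hits_total` and `coredns_cache_misses_total`. The function name `cache_hit_ratio` is hypothetical, and the sample numbers and label values are made up for illustration.

```shell
#!/bin/sh
# cache_hit_ratio: read Prometheus text-format metrics on stdin and print the
# cache hit percentage, summed across all label sets of the two counters.
cache_hit_ratio() {
    awk '
        /^coredns_cache_hits_total/   { hits   += $NF }
        /^coredns_cache_misses_total/ { misses += $NF }
        END {
            total = hits + misses
            if (total == 0) { print "no cache traffic"; exit }
            printf "%.1f%%\n", 100 * hits / total
        }
    '
}

# Sample input (made-up numbers) in place of a live scrape:
cache_hit_ratio <<'EOF'
coredns_cache_hits_total{server="dns://10.4.0.10:53",type="success",zones="cluster.local."} 900
coredns_cache_hits_total{server="dns://10.4.0.10:53",type="denial",zones="cluster.local."} 50
coredns_cache_misses_total{server="dns://10.4.0.10:53",zones="cluster.local."} 50
EOF
```

Against a live node, the input would come from the node-local-dns metrics endpoint, for example `curl -s http://NODE_IP:9253/metrics | cache_hit_ratio`. Port 9253 is the value used in the upstream NodeLocal DNSCache manifest and is an assumption here; check the `prometheus` line in the node-local-dns ConfigMap for the actual port.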

Best Practices

  1. Always enable NodeLocal for production (95% query reduction)
  2. Monitor cache metrics (hit rate, memory)
  3. Set appropriate TTLs (balance freshness vs. load)
  4. Document search domain strategy (reduce unnecessary lookups)
  5. Test failover (NodeLocal down → CoreDNS fallback)
  6. Alert on pod startup failures (NodeLocal availability critical)

References