NodeLocal DNSCache: Latency Reduction & Caching Mechanics
Why this matters
NodeLocal DNSCache runs a local DNS cache on each Kubernetes node, which reduces:
- Latency from pod to DNS server (1-2 ms instead of 5-10 ms)
- Load on the centralized CoreDNS (95% of queries hit the local cache)
- DNS server resource usage
Impact at scale:
10,000 pods × 10 queries/min ≈ 1,666 QPS
Without NodeLocal:
  All queries go to CoreDNS (bottleneck)
With NodeLocal:
  95% of queries are cache hits served on the node
  Only about 83 QPS reach CoreDNS (20x reduction)
Architecture
Pod query (pod IP 10.4.1.5):
  ↓
/etc/resolv.conf: nameserver 169.254.20.10 (port 53)
  ↓
NodeLocal DNSCache (169.254.20.10:53, link-local address, runs on each node)
  ├─ Cache hit: return the cached result without leaving the node
  └─ Cache miss: forward to CoreDNS (10.4.0.11:53)
       ↓
     CoreDNS resolves and returns the result
       ↓
     NodeLocal caches the result for its TTL (in seconds)
       ↓
     Return to pod
Deployment
Enable on GKE Cluster
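```bash
# New cluster
gcloud container clusters create my-cluster \
  --addons=NodeLocalDNS \
  --zone=us-central1-a
# Existing cluster (add-ons are changed with --update-addons;
# --enable-ip-alias is a create-time flag and does not belong here)
gcloud container clusters update my-cluster \
  --update-addons=NodeLocalDNS=ENABLED \
  --zone=us-central1-a
# Verify that a node-local-dns pod is running on each node
kubectl get pods -n kube-system -l k8s-app=node-local-dns
```
Configuration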
```bash
# Check NodeLocal config
kubectl get cm node-local-dns -n kube-system -o yaml
# Default Corefile path: /etc/coredns/Corefile
# Caches:
# - Internal services
# - Negative cache (NXDOMAIN)
# - External queries
```
Caching Mechanics
Cache Behavior
Query 1: api.default.svc.cluster.local
  → NodeLocal cache miss
  → Forward to CoreDNS
  → CoreDNS responds: 10.4.0.50
  → NodeLocal caches the answer (TTL=30 by default)
  → Pod gets 10.4.0.50
Query 2 (same pod, T+5s): api.default.svc.cluster.local
  → NodeLocal cache hit
  → Return 10.4.0.50 (from cache)
  → No CoreDNS query
Query 3 (T+31s, after cache TTL):
  → NodeLocal cache entry expired
  → Forward to CoreDNS
  → New query cycle
Negative Caching
Query: nonexistent.default.svc.cluster.local
  → CoreDNS returns NXDOMAIN
  → NodeLocal caches NXDOMAIN (TTL=5 by default)
Repeat query within 5 seconds:
  → NodeLocal returns cached NXDOMAIN
  → No CoreDNS involved
Result: Reduced load from erroneous queries
Performance Impact
Latency Reduction
Measurement: Pod to K8s service resolution
Without NodeLocal:
  Average: 8.5ms
  P99: 25ms
  P99.9: 50ms
With NodeLocal:
  Average: 2.1ms (75% reduction)
  P99: 5ms (80% reduction)
  P99.9: 12ms (76% reduction)
CoreDNS Load Reduction
Cluster: 10,000 pods
Without NodeLocal:
  CoreDNS QPS: 1,666
  CPU usage: ~4 cores
With NodeLocal:
  CoreDNS QPS: 83 (95% reduction)
  CPU usage: ~0.2 cores
Result: A single CoreDNS pod is sufficient for this load (vs. 10 pods without NodeLocal)
Deployment Gotchas
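Gotcha 1: Host Network Services
NodeLocal DNSCache listens on the node itself (169.254.20.10).
Pods with hostNetwork: true use the node's own /etc/resolv.conf by default, so they bypass NodeLocal and cannot resolve cluster-internal names.
Solution: set dnsPolicy to ClusterFirstWithHostNet (plain ClusterFirst falls back to the node's resolver when hostNetwork is true):
```bash
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: host-net-example
spec:
  hostNetwork: true
  dnsPolicy: ClusterFirstWithHostNet
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]   # keep the container running
EOF
```
Gotcha 2: Sidecar Containers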
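If a pod has a sidecar that needs a different resolver, note that DNS settings apply to the whole pod, not to individual containers:
  Main container: uses NodeLocal (169.254.20.10)
  Sidecar: shares the same /etc/resolv.conf, so it may need a resolver configured at the application level
Solution: override the resolver for the whole pod. dnsPolicy: None is required; with the default policy the listed nameservers are only appended to the cluster DNS server.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: custom-dns-example   # hypothetical name
spec:
  dnsPolicy: "None"          # without this, cluster DNS stays first
  dnsConfig:
    nameservers:
    - 8.8.8.8   # Override for this pod (cluster.local names stop resolving)
    - 8.8.4.4
  containers:
  - name: main
    image: busybox   # placeholder image
  - name: sidecar
    image: busybox   # placeholder image
```
Gotcha 3: Search Domain Issues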
/etc/resolv.conf with NodeLocal (pod in namespace default):
  nameserver 169.254.20.10
  search default.svc.cluster.local svc.cluster.local cluster.local
  options ndots:5
Query: api.example.com (external name with fewer than 5 dots)
  → The pod's resolver (not CoreDNS) walks the search domains first:
    1. api.example.com.default.svc.cluster.local → NXDOMAIN
    2. api.example.com.svc.cluster.local → NXDOMAIN
    3. api.example.com.cluster.local → NXDOMAIN
    4. api.example.com (as-is) → resolves
Result: 4 queries for 1 request (query amplification). A short in-namespace name such as api resolves on the first attempt, so the cost falls mainly on external and nonexistent names.
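The expansion can be reproduced offline with a small shell sketch. It is a simplified model of the glibc-style `ndots` rule, not the resolver's actual code: a name with fewer dots than `ndots` is tried against each search domain first and as-is last, a name with at least `ndots` dots is tried as-is first, and a trailing dot skips the search list entirely. The function name `expand_search` and its argument order are made up for illustration.

```bash
# expand_search NAME NDOTS DOMAIN...
# Prints, one per line, the candidate names a stub resolver would try.
# A real resolver stops at the first candidate that resolves.
expand_search() (
  name=$1
  ndots=$2
  shift 2
  case $name in
    *.)
      # Trailing dot: fully qualified, the search list is skipped.
      printf '%s\n' "$name"
      ;;
    *)
      # Count the dots in the name.
      only_dots=$(printf '%s' "$name" | tr -cd '.')
      dots=${#only_dots}
      # Enough dots: the name itself is tried first.
      if [ "$dots" -ge "$ndots" ]; then printf '%s.\n' "$name"; fi
      for domain in "$@"; do
        printf '%s.%s.\n' "$name" "$domain"
      done
      # Too few dots: the name itself is tried last.
      if [ "$dots" -lt "$ndots" ]; then printf '%s.\n' "$name"; fi
      ;;
  esac
)
```

For example, `expand_search api.example.com 5 default.svc.cluster.local svc.cluster.local cluster.local` prints the three search-domain candidates followed by the name itself, while `expand_search api.example.com. 5 cluster.local` prints only the name.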
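Mitigation:
  Use fully qualified names with a trailing dot where possible (for example api.default.svc.cluster.local.), so the search list is skipped
  Reduce search expansion: remove unnecessary search domains or lower ndots through the pod's dnsConfig
Troubleshooting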
Issue 1: Service Unresolvable After NodeLocal Enabled
```bash
# Debug:
# 1. Check that the NodeLocal pods are running
kubectl get pods -n kube-system -l k8s-app=node-local-dns
# 2. Check /etc/resolv.conf in the affected pod
kubectl exec POD_NAME -- cat /etc/resolv.conf
# Should show: nameserver 169.254.20.10
# 3. Test NodeLocal directly from the node
#    (busybox ships nslookup; a stock ubuntu image does not)
kubectl debug node/NODE_NAME -it --image=busybox
# then, inside the debug shell:
nslookup kubernetes.default.svc.cluster.local 169.254.20.10
# 4. Check the CoreDNS logs
kubectl logs -n kube-system -l k8s-app=kube-dns --tail=50
```
Issue 2: NodeLocal Cache Staleness
Symptom: A changed service endpoint does not resolve to its new address until the cached entry expires
Debug:
  Check the cache TTL
  Default: 30s for Kubernetes records
  If a service changed:
    T+0: Update
    T+0 to T+30s: Old IP still served from cache
    T+30s: New IP resolved
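One way to observe this window on a live cluster is to resolve the same name twice, a few seconds apart, from a pod and compare the TTLs: an answer served from a cache typically carries a TTL that has counted down from the original value. The helper below is a sketch that extracts name, remaining TTL, and address from `dig +noall +answer` output read on stdin. The name `answer_ttls` is made up for illustration, and it assumes `dig` is available in the pod image.

```bash
# answer_ttls: reads `dig +noall +answer` output on stdin and prints
# "NAME TTL ADDRESS" for every A or AAAA record in the answer section.
# Answer lines have the form: NAME TTL CLASS TYPE RDATA
answer_ttls() {
  awk '$3 == "IN" && ($4 == "A" || $4 == "AAAA") { print $1, $2, $5 }'
}
```

Usage from inside a pod: `dig +noall +answer api.default.svc.cluster.local | answer_ttls`. The second field is the number of seconds the answer may still be served from cache, which bounds how long an old address can persist.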
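Solution:
  1. Reduce the TTL in the NodeLocal config (shorter staleness window, at the cost of more upstream queries)
  2. Flush the cache by restarting the NodeLocal pods: kubectl delete pod -n kube-system -l k8s-app=node-local-dns (kubectl does not expand a node-local-dns-* glob)
  3. Use a shorter TTL for frequently changing records
Issue 3: High Memory Usage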
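NodeLocal cache consuming too much memory:
Debug:
  1. Check memory usage of the cache pods
     kubectl top pods -n kube-system -l k8s-app=node-local-dns
  2. Monitor cache misses/evictions (the cache plugin reports these through its coredns_cache_* Prometheus metrics, so the log search may return nothing)
     kubectl logs -n kube-system -l k8s-app=node-local-dns | grep evict
  3. Reduce the cache size in the config
     Lower the capacity argument of success and denial in the Corefile cache block (there is no cache_size directive)
Configuration Examples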
Corefile with NodeLocal
```corefile
.:53 {
    cache 30
    errors
    health {
        lameduck 5s
    }
    ready
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
        ttl 30
    }
    prometheus :9153
    forward . /etc/resolv.conf {
        max_concurrent 1000
    }
    loop
    reload
    loadbalance
}
```
Custom Corefile with Forwarding
```corefile
# Custom forwarding for a specific domain (its own server block)
example.com:53 {
    errors
    cache 30
    forward . 8.8.8.8
}
.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        ttl 30
    }
    # One cache block per server. success/denial take CAPACITY [TTL]:
    # positive answers are cached for up to 30s, negative answers for up to 5s
    cache 30 {
        success 9984 30
        denial 9984 5
    }
    forward . /etc/resolv.conf
    loop
    reload
}
```
Monitoring
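```bash
# Monitor resource usage of the cache pods
kubectl top pods -n kube-system -l k8s-app=node-local-dns
# Inspect recent cache-related log lines (hit/miss counters are exposed
# as coredns_cache_* Prometheus metrics rather than in the logs)
kubectl logs -n kube-system -l k8s-app=node-local-dns | \
  grep "cache" | tail -20
# Alert when node-local-dns pods stop reporting network metrics
gcloud alpha monitoring policies create \
  --notification-channels=CHANNEL_ID \
  --display-name="NodeLocal DNSCache down" \
  --condition-display-name="node-local-dns metrics absent" \
  --condition-filter='resource.type="k8s_pod" AND resource.labels.pod_name=starts_with("node-local-dns") AND metric.type="kubernetes.io/pod/network/received_bytes_count"' \
  --if="absent" \
  --duration=300s
```
Best Practices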
- Enable NodeLocal for production clusters (about 95% fewer queries reach CoreDNS at a 95% cache hit rate)
- Monitor cache metrics (hit rate, memory)
- Set appropriate TTLs (balance freshness vs. load)
- Document search domain strategy (reduce unnecessary lookups)
- Test failover (NodeLocal down → CoreDNS fallback)
- Alert on node-local-dns pod startup failures (NodeLocal availability is critical for every pod on the node)
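For the "monitor cache metrics" item, the hit rate can be derived from the Prometheus counters that the CoreDNS cache plugin exports (`coredns_cache_hits_total` and `coredns_cache_misses_total`). The sketch below reads metrics in Prometheus text format on stdin and prints the hit percentage; a second helper turns a hit rate into an estimate of the QPS that still reaches CoreDNS. The function names are made up for illustration, and the host and port of the metrics endpoint are assumptions to check against the `prometheus` line of your node-local-dns Corefile.

```bash
# cache_hit_rate: reads Prometheus text-format metrics on stdin and prints
# the cache hit percentage, or "n/a" when no cache counters are present.
cache_hit_rate() {
  awk '
    /^coredns_cache_hits_total/   { hits   += $NF }
    /^coredns_cache_misses_total/ { misses += $NF }
    END {
      total = hits + misses
      if (total == 0) { print "n/a"; exit }
      printf "%.1f\n", 100 * hits / total
    }'
}

# upstream_qps PODS QUERIES_PER_MIN HIT_PERCENT
# Estimates the queries per second that miss the node-local cache.
upstream_qps() {
  awk -v pods="$1" -v qpm="$2" -v hit="$3" \
    'BEGIN { printf "%.0f\n", pods * qpm / 60 * (100 - hit) / 100 }'
}
```

Usage: `curl -s http://NODE_IP:METRICS_PORT/metrics | cache_hit_rate`, where `NODE_IP` and `METRICS_PORT` are placeholders. Feeding the measured hit rate into `upstream_qps` gives a rough check of whether the assumed cache hit rate holds for a given cluster.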