DNS Query Logging: Detection, Audit & Compliance
Tại sao điều này quan trọng
DNS logs = traffic records không thể bị bypass. Mọi query tới DNS là chứng cứ:
- Data exfiltration detection (queries to c2.attacker.com)
- Access patterns (which services talking to each other)
- Compliance (queries tới blocked domains?)
- Troubleshooting (which queries failed?)
Without DNS logs: Blind spot cho security team.
Enabling Query Logging
Cloud DNS Query Logging
bash
# Enable logging for private zone
gcloud dns policies create prod-log-policy \
--description="Log all DNS queries" \
--enable-logging \
--log-network=projects/PROJECT_ID/global/networks/prod-vpc
# Verify
gcloud dns policies list
# View logs (Cloud Logging)
gcloud logging read "resource.type=dns_query" \
--limit=50 \
--format=jsonGKE CoreDNS Logging
bash
# Enable query logging in CoreDNS via ConfigMap
kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
name: coredns-custom
namespace: kube-system
data:
log.override: |
log {
class all
file /var/log/coredns.log
severity notice
}
EOF
# Restart CoreDNS
kubectl rollout restart deployment coredns -n kube-system
# View logs
kubectl logs -l k8s-app=kube-dns -n kube-system --all-containers=trueLog Analysis
Pattern 1: Exfiltration Detection
bash
# Find queries to external domains (potential data exfiltration)
gcloud logging read "resource.type=dns_query AND jsonPayload.query_type=A" \
--limit=1000 \
--format=json | jq '.[] | select(.jsonPayload.query_name | contains("attacker")) | {domain: .jsonPayload.query_name, timestamp: .timestamp}'
# Alert on suspicious patterns
gcloud logging read "resource.type=dns_query" \
--limit=10000 \
--format=json | jq -s 'group_by(.jsonPayload.source_ip) | map({ip: .[0].jsonPayload.source_ip, unique_queries: (map(.jsonPayload.query_name) | unique | length)}) | sort_by(.unique_queries) | reverse | .[0:10]'Pattern 2: Failed Queries
bash
# Find NXDOMAIN (not found) queries
gcloud logging read "resource.type=dns_query AND jsonPayload.response_code=NXDOMAIN" \
--limit=100 \
--format=json | jq '.[] | {domain: .jsonPayload.query_name, timestamp: .timestamp}'
# Find timeouts
gcloud logging read "resource.type=dns_query AND jsonPayload.response_code=TIMEOUT" \
--limit=100Pattern 3: Query Frequency Analysis
bash
# Top 10 most queried domains (identify hot services)
gcloud logging read "resource.type=dns_query" \
--limit=10000 \
--format=json | jq '[.[].jsonPayload.query_name] | group_by(.) | map({domain: .[0], count: length}) | sort_by(.count) | reverse | .[0:10]'
# Queries per source IP (identify clients)
gcloud logging read "resource.type=dns_query" \
--limit=10000 \
--format=json | jq '[.[].jsonPayload.source_ip] | group_by(.) | map({ip: .[0], queries: length}) | sort_by(.queries) | reverse | .[0:20]'BigQuery Integration
Export Logs to BigQuery
bash
# Create BigQuery dataset
bq mk --dataset PROJECT_ID:dns_logs
# Create sink to export
gcloud logging sinks create dns-logs-sink \
bigquery.googleapis.com/projects/PROJECT_ID/datasets/dns_logs \
--log-filter='resource.type="dns_query"'BigQuery Analysis
sql
-- Top 10 queried domains
SELECT
jsonPayload.query_name as domain,
COUNT(*) as query_count
FROM `project.dns_logs.cloudaudit_googleapis_com_activity`
WHERE TIMESTAMP_DIFF(CURRENT_TIMESTAMP(), timestamp, HOUR) <= 24
GROUP BY domain
ORDER BY query_count DESC
LIMIT 10;
-- Query patterns by source
SELECT
jsonPayload.source_ip as client_ip,
COUNT(DISTINCT jsonPayload.query_name) as unique_domains,
COUNT(*) as total_queries,
ARRAY_AGG(DISTINCT jsonPayload.query_name LIMIT 5) as sample_domains
FROM `project.dns_logs.cloudaudit_googleapis_com_activity`
WHERE TIMESTAMP_DIFF(CURRENT_TIMESTAMP(), timestamp, HOUR) <= 24
GROUP BY client_ip
ORDER BY total_queries DESC
LIMIT 20;
-- Failed queries (NXDOMAIN)
SELECT
jsonPayload.query_name as domain,
jsonPayload.source_ip as client_ip,
COUNT(*) as failure_count
FROM `project.dns_logs.cloudaudit_googleapis_com_activity`
WHERE jsonPayload.response_code = 'NXDOMAIN'
AND TIMESTAMP_DIFF(CURRENT_TIMESTAMP(), timestamp, HOUR) <= 24
GROUP BY domain, client_ip
ORDER BY failure_count DESC;Compliance & Audit
Compliance Scenario 1: HIPAA (Healthcare)
Requirement: Log all DNS queries to ensure no data exfiltration
Implementation:
1. Enable Cloud DNS query logging
2. Export logs to BigQuery (immutable)
3. Set retention: 7 years (HIPAA requirement)
4. Regular audit: Check for unauthorized external services
Query: Find queries to non-approved external services
SELECT * FROM dns_logs
WHERE query_name NOT IN ('approved-vendor-1.com', 'approved-vendor-2.com')
AND query_name LIKE '%.com'
AND TIMESTAMP_DIFF(CURRENT_TIMESTAMP(), timestamp, DAY) <= 1;Compliance Scenario 2: PCI-DSS (Payment)
Requirement: No queries to non-payment-related external services
Implementation:
1. Whitelist approved domains
2. Alert if query to non-whitelisted domain
3. Log all violations
Query: Find violations
SELECT query_name, COUNT(*) as violation_count
FROM dns_logs
WHERE query_name NOT IN (SELECT approved_domain FROM whitelist)
AND query_name LIKE '%.com'
GROUP BY query_name
HAVING COUNT(*) > 10;Alerting
Alert 1: Suspicious Domain Query
bash
# Create alert for queries to known malware domains
gcloud alpha monitoring policies create \
--notification-channels=CHANNEL_ID \
--display-name="Suspicious DNS query detected" \
--condition-threshold-value=1 \
--condition-threshold-filter='resource.type="dns_query" AND jsonPayload.query_name=~"(c2\.|attacker\.|malware\.)"' \
--condition-threshold-duration=60Alert 2: Query Rate Anomaly
bash
# Alert if query rate > 10x normal
gcloud alpha monitoring policies create \
--notification-channels=CHANNEL_ID \
--display-name="Abnormal DNS query volume" \
--condition-threshold-value=10000 \
--condition-threshold-filter='resource.type="dns_query"'Alert 3: Query Failures
bash
# Alert if > 5% of queries fail (NXDOMAIN, TIMEOUT)
gcloud alpha monitoring policies create \
--notification-channels=CHANNEL_ID \
--display-name="High DNS query failure rate" \
--condition-threshold-value=50 \
--condition-threshold-filter='resource.type="dns_query" AND (jsonPayload.response_code="NXDOMAIN" OR jsonPayload.response_code="TIMEOUT")'Privacy & Data Retention
Privacy Considerations
DNS logs contain:
- Domains accessed (can infer user behavior)
- Timestamps (activity patterns)
- Source IPs (which pod/node queried)
Risk: Overly verbose logs could expose sensitive behavior patterns
Mitigation:
1. Restrict access: Only ops/security team
2. Anonymize: Log domain only, not full query details
3. Retention policy: Auto-delete after 30 days (if not compliance requirement)
4. Encryption: Enable encryption at rest (BigQuery CMEK)Data Retention Policy
bash
# Set log retention to 90 days
gcloud logging sinks update dns-logs-sink \
--log-filter='resource.type="dns_query" AND timestamp >= "-P90D"'
# Or via Organization Policy
gcloud resource-manager org-policies create \
--name=projects/PROJECT_ID/policies/constraints/logging.disableLogAnalytics \
--enforcedTroubleshooting
Issue 1: Logs Not Appearing
bash
# Debug:
1. Verify logging enabled
gcloud dns policies list
2. Check IAM permissions
gcloud projects get-iam-policy PROJECT_ID \
--flatten="bindings[].members" \
--filter="bindings.role:roles/logging.*"
3. Check log volume
gcloud logging read "resource.type=dns_query" --limit=10
4. Verify sink working
gcloud logging sinks describe dns-logs-sinkIssue 2: Too Many Logs (Cost)
bash
# Filter to reduce volume
gcloud logging sinks update dns-logs-sink \
--log-filter='resource.type="dns_query" AND severity="ERROR"'Best Practices
- Always enable DNS logging (mandatory for security)
- Export to BigQuery (immutable, compliance)
- Set retention policy (HIPAA/PCI requirements)
- Monitor query patterns (exfiltration detection)
- Alert on anomalies (automated response)
- Regular audit (weekly/monthly review)
- Restrict access (logs contain sensitive patterns)
- Document processes (compliance evidence)