Managed OpenTelemetry & Custom Metrics for HPA

Why this matters in production

The two topics in this file share a common thread: both are pipelines that turn raw telemetry into something actionable. OpenTelemetry is the pipeline that standardizes collection of three signal types (traces, metrics, logs) from an application with a single set of instrumentation, which solves the "one SDK per vendor, one agent per signal" problem. Custom metrics for HPA are the pipeline that turns a metric into an autoscaling decision: "the queue holds 5000 messages" becomes "scale to 20 replicas" (Chapter 9). In both cases observability stops being passive watching and becomes an active control mechanism.

Distributed tracing matters most in microservice architectures: when a request passes through 8 services, metrics tell you "total latency is 800ms" but not which service accounts for 600ms of it. Only a trace answers that, and a trace exists only if the app is instrumented and propagates context. According to the Managed OpenTelemetry for GKE documentation, Managed OpenTelemetry for GKE is Google Cloud's only managed solution for collecting traces on GKE.

Internal model: Managed OpenTelemetry for GKE

Two components

Managed OpenTelemetry for GKE has two parts (per the documentation):

Managed collection: an OpenTelemetry collector runs in-cluster and exposes an OTLP endpoint that workloads push traces, metrics, and logs to in OTLP format, so you do not operate a collector yourself. The endpoint is fixed within the cluster: http://opentelemetry-collector.gke-managed-otel.svc.cluster.local:4318. The collector receives signals and routes them to three Google Cloud Observability services:
- Logs → Cloud Logging
- Metrics → Cloud Monitoring
- Traces → Cloud Trace

Automatic configuration: an Instrumentation custom resource (instrumentations.telemetry.googleapis.com) automatically injects OpenTelemetry environment variables into the workload's containers, so the workload sends signals to the managed collector without a hardcoded endpoint. Per the documentation: "Automatic configuration uses environment variables injected in the workload's container to have the workload send signals to the managed collector."

Instrumentation CRD and signal routing

The Instrumentation resource provides declarative configuration, so you do not manage environment variables by hand. It controls:

- Which Pods/namespaces are targeted for injection.
- Which signal types are collected (logs/metrics/traces).
- OTEL_METRIC_EXPORT_INTERVAL (5–300s): how often metrics are exported.
- OTEL_TRACES_SAMPLER and the sampling ratio (0.0–1.0): the fraction of traces that is sampled.

The key injected environment variables are OTEL_EXPORTER_OTLP_ENDPOINT (pointing at the in-cluster collector), OTEL_TRACES_SAMPLER, OTEL_METRIC_EXPORT_INTERVAL, and OTEL_TRACES_EXPORTER/OTEL_METRICS_EXPORTER/OTEL_LOGS_EXPORTER (which can turn off individual signal types).

Precondition for automatic configuration: the workload must use an OpenTelemetry SDK and read its configuration from the standard OpenTelemetry environment variables. Automatic configuration does not magically instrument an app that has no OTel SDK; it configures an SDK that is already there. Note: injection is not supported for privileged workloads from GKE Autopilot partners.
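To make the environment-variable contract concrete, the sketch below mimics how an OTLP/HTTP exporter typically turns the injected variables into a destination URL, following the general OpenTelemetry convention (a per-signal endpoint is used as is; otherwise the base endpoint gets a v1/<signal> path appended). The function name resolve_otlp_http_endpoint is hypothetical, and this is an illustration of the convention, not the code of any particular SDK or of the managed collector.

```python
import os

# Path appended to the base endpoint for each signal when using OTLP over HTTP.
_SIGNAL_PATHS = {"traces": "v1/traces", "metrics": "v1/metrics", "logs": "v1/logs"}


def resolve_otlp_http_endpoint(signal, env=None):
    """Return the URL an OTLP/HTTP exporter would use for `signal`, or None if disabled.

    `signal` is one of "traces", "metrics", "logs". `env` defaults to os.environ
    and is injectable so the function can be exercised without touching the process.
    """
    env = os.environ if env is None else env
    upper = signal.upper()

    # OTEL_TRACES_EXPORTER / OTEL_METRICS_EXPORTER / OTEL_LOGS_EXPORTER set to
    # "none" switches that signal off.
    if env.get(f"OTEL_{upper}_EXPORTER", "otlp").strip().lower() == "none":
        return None

    # A per-signal endpoint, when present, is used exactly as given.
    specific = env.get(f"OTEL_EXPORTER_OTLP_{upper}_ENDPOINT")
    if specific:
        return specific

    # Otherwise the shared base endpoint gets the signal path appended.
    base = env.get("OTEL_EXPORTER_OTLP_ENDPOINT", "http://localhost:4318")
    return base.rstrip("/") + "/" + _SIGNAL_PATHS[signal]


if __name__ == "__main__":
    injected = {
        "OTEL_EXPORTER_OTLP_ENDPOINT": "http://opentelemetry-collector.gke-managed-otel.svc.cluster.local:4318",
        "OTEL_LOGS_EXPORTER": "none",
    }
    for sig in ("traces", "metrics", "logs"):
        print(sig, "->", resolve_otlp_http_endpoint(sig, injected))
```

An app without an OTel SDK never performs this lookup, which is why injection alone produces no telemetry.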

Why an in-cluster OTLP collector is valuable

The model "apps push OTLP to an in-cluster collector, the collector routes onward" has several advantages over "every app pushes straight to the backend":

- Decouples the app from the backend: the app knows only one fixed OTLP endpoint; changing the backend or adding processing requires no app change.
- One place for shared processing: sampling, batching, and adding resource attributes (consistent cluster/namespace labels, the correlation thread from file 01) are centralized in the collector.
- Lower app overhead: batching and export happen in the collector instead of in every app.

Google-Built OpenTelemetry Collector: when you need control over the collector

If you need complex filtering, transformation, or sampling at the collector tier (tail-based sampling, dropping signals conditionally, routing to multiple destinations), use the Google-Built OpenTelemetry Collector: Google's build of upstream components through a secure supply chain, which you deploy and manage yourself (DaemonSet/Deployment). The trade-off: more flexibility, but you operate the collector. Decision framework: managed collection for most cases (zero ops); the Google-Built collector when you need collector-level control that managed collection does not offer.

Internal model: custom metrics for HPA

By default HPA scales on CPU/memory (Chapter 9), but many workloads need to scale on something else: queue depth (Pub/Sub backlog), request rate, or a business metric. For HPA to read metrics other than CPU/memory, an adapter must implement the Kubernetes Custom Metrics API or External Metrics API. According to the HPA with Managed Service for Prometheus documentation, there are three paths on GKE:

- KEDA: the recommended option and currently the most popular in the Prometheus community; covered in depth in Chapter 9, file 08. KEDA creates an HPA under the hood and provides scalers for Pub/Sub, Prometheus, Cloud Tasks, BigQuery, and more, plus scale-to-zero. This file does not repeat the KEDA material.
- Custom Metrics Stackdriver Adapter: the Google-supported solution. It requires version 0.13.1 or later to query Managed Service for Prometheus metrics. The adapter exposes metrics through the Custom/External Metrics API for HPA to reference.
- Prometheus Adapter: a third-party solution that needs extra configuration to work with managed Prometheus (queries are routed through the frontend UI proxy).

Key constraint: do not run two adapters at once

Per the documentation: "Their resource definitions overlap... running more than one adapter in the same cluster causes errors." Custom Metrics Stackdriver Adapter and Prometheus Adapter both register the same API services (custom.metrics.k8s.io, external.metrics.k8s.io), so running both causes a conflict. Pick one. Decision framework:

- Use the Custom Metrics Stackdriver Adapter if you want a Google-supported solution that reads directly from Cloud Monitoring/Managed Prometheus with little configuration.
- Use the Prometheus Adapter if you already know the prometheus-adapter ecosystem and need flexible rule mapping.
- Use KEDA if you need scale-to-zero or a rich set of event sources (most event-driven cases).

Pods metrics vs External metrics

HPA consumes two kinds of metric:

- Pods metric (type: Pods): a per-Pod metric; HPA averages it across the Pods (for example, requests per second per Pod). Use it when load is spread evenly across Pods.
- External metric (type: External): a metric not tied to any Pod (for example, Pub/Sub queue depth). With an AverageValue target, HPA divides the total by the per-Pod target to get the replica count. Use it for event-driven scaling.
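The arithmetic behind the two metric types can be sketched in a few lines. This is a simplified model of the replica calculation, assuming the standard proportional formula with rounding up; it ignores the tolerance band, stabilization windows, and Pod readiness handling that the real controller applies. The function names and sample numbers are hypothetical.

```python
import math


def replicas_for_pods_metric(current_replicas, per_pod_values, target_average):
    """Pods metric: average the per-Pod values, then scale proportionally."""
    current_average = sum(per_pod_values) / len(per_pod_values)
    return math.ceil(current_replicas * current_average / target_average)


def replicas_for_external_average(total_value, target_average):
    """External metric with an AverageValue target: total divided by the per-Pod target."""
    return math.ceil(total_value / target_average)


def clamp(replicas, min_replicas, max_replicas):
    """Keep the result inside the HPA's minReplicas/maxReplicas bounds."""
    return max(min_replicas, min(max_replicas, replicas))


if __name__ == "__main__":
    # Four Pods averaging 150 req/s against a target of 100 req/s per Pod.
    print(replicas_for_pods_metric(4, [120, 180, 150, 150], 100))
    # A backlog of 5000 messages against a target of 100 messages per Pod.
    print(clamp(replicas_for_external_average(5000, 100), 1, 100))
```

The External case never looks at the current replica count, which is why it suits queue-style signals whose value is independent of how many Pods are running.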

Format note: Prometheus metrics carry the prometheus.googleapis.com prefix, and every / must be changed to | when the metric is referenced in an HPA through the Stackdriver Adapter.
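As a small illustration of that naming rule, the helper below rewrites a Cloud Monitoring metric type into the form used in an HPA manifest. The function name is hypothetical and the sketch does nothing beyond the / to | substitution described above.

```python
def to_adapter_metric_name(monitoring_metric_type):
    """Rewrite a Cloud Monitoring metric type for use in an HPA via the Stackdriver Adapter."""
    return monitoring_metric_type.replace("/", "|")


if __name__ == "__main__":
    print(to_adapter_metric_name("pubsub.googleapis.com/subscription/num_undelivered_messages"))
```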

Common deployment pattern

Both adapters follow the same pattern: (1) enable managed collection to scrape metrics; (2) deploy the adapter; (3) create a PodMonitoring that defines the scrape target; (4) create an HPA that references the exposed metric. This is where file 06 (PodMonitoring) connects directly to Chapter 9 (HPA): file 06 supplies the metric source, this file supplies the metric-to-HPA bridge, and Chapter 9 supplies the control loop.

Internal model: ServiceMonitor/PodMonitor and prometheus-operator compatibility

The prometheus-operator ecosystem uses the ServiceMonitor and PodMonitor CRDs (distinct from GMP's PodMonitoring: the names are similar but they are different CRDs). Many Helm charts and apps ship a ServiceMonitor out of the box. On GKE there are two options:

- GMP self-deployed collection supports a subset of the prometheus-operator CRDs, so existing ServiceMonitor/PodMonitor resources can be reused during migration.
- Or convert ServiceMonitor resources to GMP PodMonitoring when using managed collection.

In practice: when migrating a prometheus-operator stack to GMP, count how many ServiceMonitor resources would need converting; when the number is large, self-deployed collection (keep the operator CRDs, change the backend) is a lower-friction path than managed collection (which requires rewriting them as PodMonitoring).

Production architecture patterns

Auto-instrumentation for golden signals, manual instrumentation for business traces

An effective pattern: use Managed OpenTelemetry automatic configuration, together with the instrumentation the OTel SDK already provides, to get technical golden signals and basic traces with little effort (file 04), then manually instrument the important business spans (for example "charge payment" and "reserve inventory") so that traces carry domain meaning. What to avoid is mixing manual and automatic configuration (as opposed to instrumentation) on the same workload, because automatic injection can overwrite the manual settings.

Tail-based sampling via the Google-Built collector

A pattern for high-traffic systems: head-based sampling (the sampling decision is made when the request starts) misses failed requests when the sample ratio is low. Use the Google-Built collector with tail-based sampling to keep 100% of traces for failed or slow requests and sample normal requests at a low rate, which optimizes the value-to-cost ratio of tracing.
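The difference between the two decision points can be sketched as follows. This is a toy model of the decision logic only, assuming a deterministic ratio check on the low 64 bits of the trace ID; it is not the implementation of any collector processor, and the Span type, function names, and the 500ms threshold are hypothetical.

```python
from dataclasses import dataclass

_MASK_64 = (1 << 64) - 1


@dataclass
class Span:
    trace_id: int
    duration_ms: float
    is_error: bool = False


def ratio_sample(trace_id, ratio):
    """Deterministic ratio sampling keyed on the low 64 bits of the trace ID."""
    return (trace_id & _MASK_64) < int(ratio * (1 << 64))


def head_decision(trace_id, ratio):
    """Head-based: decided at request start, when only the trace ID is known."""
    return ratio_sample(trace_id, ratio)


def tail_decision(spans, ratio, slow_ms):
    """Tail-based: decided after the trace completes, so the outcome can be inspected."""
    if any(s.is_error or s.duration_ms >= slow_ms for s in spans):
        return True  # always keep failed or slow traces
    return ratio_sample(spans[0].trace_id, ratio)  # sample the rest at the low ratio


if __name__ == "__main__":
    unlucky_id = (1 << 64) - 1  # a trace ID that a 1% ratio check rejects
    failed = [Span(unlucky_id, 40.0), Span(unlucky_id, 20.0, is_error=True)]
    print("head keeps failed trace:", head_decision(unlucky_id, 0.01))
    print("tail keeps failed trace:", tail_decision(failed, 0.01, slow_ms=500))
```

The tail decision needs every span of a trace in one place before it can run, which is the reason it lives in a collector rather than in the app.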

One adapter, many HPAs

Pattern: deploy one Custom Metrics Stackdriver Adapter for the whole cluster and let the HPAs of many teams reference metrics through it. Do not let each team deploy its own adapter (that causes API service conflicts). The platform team owns the adapter; each team owns its HPAs.

Real-world scenarios

Scenario 1: 800ms of latency with no obvious source. A checkout flow passes through API gateway → auth → cart → pricing → payment. Metrics show an end-to-end p99 of 800ms, yet every service reports its own p99 below 150ms, and no single service explains the total. Enabling Managed OpenTelemetry with traces shows that each request actually calls the pricing service three times (an N+1 bug) instead of once, at close to 150ms per call. Only a trace exposes the repeated-call pattern; per-service latency metrics cannot show it.
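The kind of analysis a trace enables here can be sketched with a few lines of aggregation. The span data below is hypothetical, chosen only to match the shape of this scenario (sequential calls, each under 150ms, totalling 800ms); in practice the spans would come from a trace viewed or exported from Cloud Trace.

```python
from collections import defaultdict

# Hypothetical spans of one checkout trace: (service, self time in ms),
# with the downstream calls made one after another.
SPANS = [
    ("api-gateway", 70),
    ("auth", 90),
    ("cart", 80),
    ("pricing", 140),
    ("pricing", 140),
    ("pricing", 140),
    ("payment", 140),
]


def summarize(spans):
    """Return {service: (call_count, total_ms)} for a single trace."""
    stats = defaultdict(lambda: [0, 0])
    for service, duration_ms in spans:
        stats[service][0] += 1
        stats[service][1] += duration_ms
    return {service: (count, total) for service, (count, total) in stats.items()}


if __name__ == "__main__":
    summary = summarize(SPANS)
    for service, (count, total) in sorted(summary.items(), key=lambda kv: -kv[1][1]):
        print(f"{service:12s} calls={count} total_ms={total}")
```

Sorting by total time per service puts pricing at the top even though no individual pricing call is slow, which is the signal the per-service p99 hides.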

Scenario 2: scaling on backlog instead of CPU. A worker processes jobs from Pub/Sub. Scaling on CPU is useless because the worker is I/O-bound (CPU stays low even when the backlog is high). Use the Custom Metrics Stackdriver Adapter to expose Pub/Sub num_undelivered_messages as an External metric and let HPA scale on backlog: with a target of 100 messages per Pod, a backlog of 5000 yields 50 Pods. The backlog is processed in time and no longer piles up. (The KEDA Pub/Sub scaler is an alternative; see Chapter 9.)

Common mistakes / anti-patterns

- Running two metric adapters at the same time. Stackdriver Adapter plus Prometheus Adapter conflict on the API services. Do instead: pick one, or use KEDA.
- Assuming automatic configuration replaces the OTel SDK. Automatic configuration only configures an SDK that is already present; an app without an OTel SDK does not start producing traces on its own. Do instead: integrate the OTel SDK into the app (or use a language-specific auto-instrumentation agent).
- Mixing manual and automatic configuration. Automatic injection overwrites manual configuration, and the change is hard to trace. Do instead: choose one mode per workload.
- Sampling traces so low that error traces are lost. Head-based sampling at 1% drops roughly 99% of traces for failed requests. Do instead: use tail-based sampling to keep traces of failed or slow requests.
- Broken context propagation. A single service that does not forward traceparent breaks the trace and destroys its end-to-end value. Do instead: make sure every service propagates context (auto-instrumentation usually handles this).

GCP-native implementation guidance

Example Instrumentation resource (Managed OpenTelemetry, with traces, metrics, and logs enabled and 10% sampling):

yaml

apiVersion: telemetry.googleapis.com/v1alpha1
kind: Instrumentation
metadata:
  name: default-instrumentation
  namespace: production
spec:
  # Injected into workloads in this namespace; configures the SDK through env vars
  sampler:
    type: parentbased_traceidratio
    argument: "0.1"
  exporter:
    metrics: { enabled: true }
    traces:  { enabled: true }
    logs:    { enabled: true }

Example HPA using an External metric through the Custom Metrics Stackdriver Adapter (scaling on Pub/Sub backlog):

yaml

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: worker-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: worker
  minReplicas: 1
  maxReplicas: 100
  metrics:
  - type: External
    external:
      metric:
        name: pubsub.googleapis.com|subscription|num_undelivered_messages
        selector:
          matchLabels:
            resource.labels.subscription_id: my-subscription
      target:
        type: AverageValue
        averageValue: "100"   # 100 undelivered messages per Pod

Official references

- Managed OpenTelemetry for GKE: the two components, the Instrumentation CRD
- Deploy Managed OpenTelemetry: enabling it, version requirements
- Google-Built OpenTelemetry Collector: the self-managed collector
- HPA with Managed Service for Prometheus: the three adapter paths
- Custom Metrics Stackdriver Adapter: the Google adapter
- Cloud Trace overview: distributed tracing
- KEDA on GKE: event-driven autoscaling (Chapter 9)

In summary: OpenTelemetry and custom metrics for HPA are both pipelines that turn telemetry into action. Managed OpenTelemetry for GKE provides an in-cluster OTLP collector plus the Instrumentation CRD to collect traces, metrics, and logs with declarative configuration and route them to Cloud Trace/Monitoring/Logging, and traces are the only tool that shows where latency sits in a chain of microservices. For HPA, choose exactly one adapter (Stackdriver Adapter, Prometheus Adapter, or KEDA; never run two at once) to turn custom metrics into autoscaling decisions, connecting file 06 to the control loop of Chapter 9.
