API Server Request Lifecycle — Authentication → Authorization → Admission → Storage

Complete Request Flow

1. TLS Handshake (once per connection)
   ├─ Certificate validation
   └─ Session key negotiation

2. HTTP Request Parse
   ├─ Header parsing
   └─ Body buffering

3. Authentication
   ├─ Extract credentials (client certificate, bearer token, OIDC ID token)
   ├─ Verify against trusted authorities
   └─ Attach user info (username, groups) to the request

4. Authorization (RBAC/ABAC)
   ├─ Check: can user perform action?
   ├─ Check: on which resources?
   └─ Reject if no permission

5. Admission Control (Mutating)
   ├─ Call mutating webhooks (sequential)
   ├─ Modify object
   └─ Chain mutations

6. Admission Control (Validating)
   ├─ Call validating webhooks (parallel)
   ├─ Validate object structure
   └─ Reject if invalid

7. Resource Quota Check
   ├─ Runs as the ResourceQuota admission plugin, after validating webhooks
   ├─ Count against namespace quota
   └─ Reject if exceeded

8. etcd Write (for write requests: CREATE/UPDATE/PATCH/DELETE)
   ├─ Raft consensus
   ├─ Wait for quorum ack
   └─ Commit to state machine

9. Response Serialization
   ├─ Convert object to JSON/Protobuf
   └─ Add metadata (resourceVersion, etc.)

10. Response Stream
    └─ Send response back to client
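The ordering above can be sketched as a chain of stages in which any stage may reject the request. This is a toy model, not the API server's code: the `Request` type, the `admin`-only write policy, and the status codes are assumptions made for illustration. It also reflects that admission only applies to write requests, so reads skip the admission stages.

```python
from dataclasses import dataclass, field
from typing import Optional

READ_VERBS = {"get", "list", "watch"}


@dataclass
class Request:
    user: Optional[str]  # None models missing or invalid credentials
    verb: str
    resource: str
    obj: dict = field(default_factory=dict)


class Rejected(Exception):
    """Raised by a stage to stop the pipeline with an HTTP-style status code."""

    def __init__(self, code, reason):
        super().__init__(reason)
        self.code = code


def authenticate(req):
    if req.user is None:
        raise Rejected(401, "Unauthorized")


def authorize(req):
    # Hypothetical policy: anyone may read, only "admin" may write.
    if req.verb not in READ_VERBS and req.user != "admin":
        raise Rejected(403, "Forbidden")


def mutate(req):
    # Mutating admission may change the object, here by injecting a label.
    req.obj.setdefault("labels", {})["injected"] = "true"


def validate(req):
    if req.verb == "create" and "name" not in req.obj:
        raise Rejected(422, "name is required")


def handle(req):
    """Run the stages in order and return an HTTP-style status code."""
    try:
        authenticate(req)
        authorize(req)
        if req.verb not in READ_VERBS:  # admission applies to writes only
            mutate(req)
            validate(req)
    except Rejected as rejection:
        return rejection.code
    # Persisting to etcd and serializing the response would happen here.
    return 201 if req.verb == "create" else 200
```

Because the stages run in a fixed order, an unauthenticated request never reaches authorization, and an unauthorized one never triggers a webhook call.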

Latency Breakdown

Typical latency for a write request: ~50-200ms (rough figures, not measurements)

├─ TLS handshake: 5-10ms (first request on a connection only)
├─ HTTP parsing: 1ms
├─ Authentication: 5-10ms
├─ Authorization (RBAC): 1-2ms
├─ Admission webhooks: 10-50ms (often the slowest step)
├─ Resource quota check: 1-2ms
├─ etcd write + quorum: 10-50ms
├─ Serialization: 1ms
└─ Network RTT: 5-10ms

Sum of the itemized components: about 40-135ms. This is the range between the best and worst case of each step, not a measured p50-p99 distribution.
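The per-step figures can be added up directly, which also shows which steps dominate the budget. The numbers below are copied from the breakdown above; the dictionary keys and function names are made up for this sketch.

```python
# (low_ms, high_ms) per step, copied from the breakdown above.
BUDGET_MS = {
    "tls_handshake": (5, 10),  # first request on a connection only
    "http_parsing": (1, 1),
    "authentication": (5, 10),
    "authorization": (1, 2),
    "admission_webhooks": (10, 50),
    "quota_check": (1, 2),
    "etcd_write": (10, 50),
    "serialization": (1, 1),
    "network_rtt": (5, 10),
}


def total_range(budget, skip=()):
    """Sum the low and high ends of every step not listed in `skip`."""
    low = sum(lo for name, (lo, _) in budget.items() if name not in skip)
    high = sum(hi for name, (_, hi) in budget.items() if name not in skip)
    return low, high


def largest_contributors(budget, n=2):
    """Names of the n steps with the highest worst-case latency."""
    return sorted(budget, key=lambda name: budget[name][1], reverse=True)[:n]


if __name__ == "__main__":
    print(total_range(BUDGET_MS))                            # cold connection
    print(total_range(BUDGET_MS, skip=("tls_handshake",)))   # reused connection
    print(largest_contributors(BUDGET_MS))
```

Admission webhooks and the etcd write account for most of the worst case, which is why the bottleneck analysis focuses on them.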

Bottleneck Analysis

Common Bottleneck 1: Slow Webhooks

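Problem: A validating webhook takes 5-10 seconds to respond or time out (the default timeoutSeconds is 10)
Impact: Every request that matches the webhook's rules, e.g. all Pod creation, stalls for that long
Solution: Optimize the webhook, lower timeoutSeconds, or replace it with a CEL-based ValidatingAdmissionPolicy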
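A rough model shows why one slow webhook dominates: mutating webhooks are called one after another, so their latencies add up, while validating webhooks are called in parallel, so the slowest one sets the pace. The function below is a simplified estimate under those two assumptions (it ignores reinvocation and network jitter); its name and parameters are hypothetical.

```python
def admission_latency_ms(mutating, validating, timeout_ms):
    """Estimate admission latency from per-webhook response times.

    mutating, validating: lists of response times in milliseconds.
    timeout_ms: per-webhook timeout; a slower webhook is cut off at this value.
    """
    capped_mutating = [min(ms, timeout_ms) for ms in mutating]
    capped_validating = [min(ms, timeout_ms) for ms in validating]
    # Sequential calls add up; parallel calls cost as much as the slowest one.
    return sum(capped_mutating) + max(capped_validating, default=0)


if __name__ == "__main__":
    # One validating webhook that needs 8 seconds, with a 10 s timeout...
    print(admission_latency_ms([10, 20], [5, 8000], timeout_ms=10_000))
    # ...and the same webhooks with the timeout lowered to 1 s.
    print(admission_latency_ms([10, 20], [5, 8000], timeout_ms=1_000))
```

Lowering the timeout bounds the damage, but whether a timed-out call rejects the request or lets it through depends on the webhook's failurePolicy.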

Common Bottleneck 2: etcd Latency

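Problem: Raft consensus is slow, usually because of disk I/O (slow fsync of the write-ahead log)
Impact: Every write request slows down or times out
Solution: Move etcd to disks with higher IOPS and lower fsync latency, and reduce etcd load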
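The effect of disk latency on a quorum write can be illustrated with a simplified model: an entry is committed once a majority of members, leader included, have persisted it, so the commit time is the time of the quorum-th fastest member. This is a sketch of the idea, not etcd's implementation; the function names and inputs are assumptions.

```python
def quorum(members):
    """Smallest number of members that forms a majority."""
    return members // 2 + 1


def commit_latency_ms(leader_fsync_ms, follower_ack_ms):
    """Time until a majority of members have persisted the entry.

    leader_fsync_ms: time for the leader to fsync the entry locally.
    follower_ack_ms: per follower, network round trip plus that follower's fsync.
    """
    members = 1 + len(follower_ack_ms)
    persisted_at = sorted([leader_fsync_ms, *follower_ack_ms])
    return persisted_at[quorum(members) - 1]


if __name__ == "__main__":
    print(commit_latency_ms(2, [4, 6]))     # healthy three-member cluster
    print(commit_latency_ms(40, [45, 60]))  # slow disks on every member
```

In this model a single slow member is masked by the other two, but slow disks on a majority of members raise the latency of every write.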

Common Bottleneck 3: Authorization Evaluation

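Problem: Complex RBAC rules, many ClusterRoles and bindings
Impact: Each request is slower, because more rules are evaluated per request
Solution: Simplify RBAC by consolidating roles and bindings. ABAC is an alternative in principle, but its policy lives in a file and changing it requires an API server restart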
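To see why the number of rules matters, consider a minimal model of RBAC evaluation: rules only grant access, so a request is allowed as soon as any rule matches, and a denied request has to be checked against every rule. The `Rule` type and the linear scan are simplifications for illustration, not the real authorizer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    api_groups: tuple
    resources: tuple
    verbs: tuple


def _matches(allowed, value):
    # "*" is the RBAC wildcard for API groups, resources, and verbs.
    return "*" in allowed or value in allowed


def is_allowed(rules, api_group, resource, verb):
    """Return (allowed, rules_examined) for a request against a rule list."""
    examined = 0
    for rule in rules:
        examined += 1
        if (
            _matches(rule.api_groups, api_group)
            and _matches(rule.resources, resource)
            and _matches(rule.verbs, verb)
        ):
            return True, examined
    return False, examined


if __name__ == "__main__":
    rules = [
        Rule(("",), ("pods",), ("get", "list")),
        Rule(("apps",), ("*",), ("*",)),
    ]
    print(is_allowed(rules, "", "pods", "get"))
    print(is_allowed(rules, "", "secrets", "get"))
```

In this model the work per request grows with the number of rules that apply to the user, which is the cost that consolidating roles reduces.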

Monitoring Request Latency

bash

# Prometheus metric
apiserver_request_duration_seconds  # Histogram
  - labels include: verb, resource, scope
  - buckets: version-dependent, e.g. [.05, .1, .2, .4, .8, 1.6, 3.2, 6.4]

# Query p99 latency per verb over the last 5 minutes
histogram_quantile(0.99, sum by (le, verb) (rate(apiserver_request_duration_seconds_bucket[5m])))
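It helps to know how a quantile is estimated from histogram buckets, because the result can only be as precise as the bucket boundaries. The sketch below follows the linear interpolation that Prometheus documents for classic histograms; it is a simplified reimplementation for illustration, and the sample counts are invented.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: list of (upper_bound, cumulative_count), sorted by upper bound
             and ending with (math.inf, total_count).
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    # Find the first bucket whose cumulative count reaches the rank.
    index = next(i for i, (_, count) in enumerate(buckets) if count >= rank)
    upper, cumulative = buckets[index]
    if math.isinf(upper):
        # The quantile falls in the +Inf bucket: report the highest finite bound.
        return buckets[-2][0]
    lower, below = (0.0, 0) if index == 0 else buckets[index - 1]
    # Assume observations are spread evenly inside the bucket.
    return lower + (upper - lower) * (rank - below) / (cumulative - below)


if __name__ == "__main__":
    sample = [(0.05, 50), (0.1, 80), (0.2, 95), (0.4, 99), (0.8, 100), (math.inf, 100)]
    print(histogram_quantile(0.99, sample))
```

With these sample counts the p99 lands at the upper edge of the 0.2-0.4 bucket; the true value could be anywhere inside that bucket.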

Optimization Strategies

Strategy 1: Cache Responses

GET /pods/my-pod:
- Uncached read: goes through the full pipeline, ~50ms
- Cached read (served from the API server's watch cache, or from a client-side cache within its TTL): ~1ms
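The TTL behaviour described above can be sketched on the client side as a small read-through cache. This is only an illustration: the class, the `fetch` callback, and the injectable clock are assumptions, and a cached read may return stale data for up to the TTL.

```python
import time


class TTLCache:
    """Read-through cache that reuses a fetched value until its TTL expires."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch        # function that performs the real GET
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so tests can control time
        self._entries = {}         # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]        # cache hit: no request is sent
        value = self._fetch(key)   # cache miss: full request pipeline
        self._entries[key] = (now + self._ttl, value)
        return value


if __name__ == "__main__":
    calls = []

    def fake_get(path):
        calls.append(path)
        return {"path": path}

    cache = TTLCache(fake_get, ttl_seconds=5)
    cache.get("/pods/my-pod")
    cache.get("/pods/my-pod")
    print(len(calls))  # number of real fetches
```

Controllers that need fresh state usually rely on watches rather than a TTL, so a cache like this suits read-mostly tooling better than reconciliation logic.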

Strategy 2: Optimize Webhooks

yaml

timeoutSeconds: 1  # Aggressive timeout (1 is the minimum, 10 the default)
failurePolicy: Ignore  # Don't block on failure; requests pass unvalidated if the webhook is down or times out
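For context, these two fields sit inside an entry of the `webhooks` list of a ValidatingWebhookConfiguration. The sketch below builds such a manifest as JSON, which kubectl accepts alongside YAML. The webhook name, Service name, namespace, and path are hypothetical placeholders, and the rule only matches Pod creation.

```python
import json


def build_webhook_config(timeout_seconds=1, failure_policy="Ignore"):
    """Build a ValidatingWebhookConfiguration manifest as a dict."""
    if not 1 <= timeout_seconds <= 30:
        raise ValueError("timeoutSeconds must be between 1 and 30")
    if failure_policy not in ("Ignore", "Fail"):
        raise ValueError("failurePolicy must be 'Ignore' or 'Fail'")
    return {
        "apiVersion": "admissionregistration.k8s.io/v1",
        "kind": "ValidatingWebhookConfiguration",
        "metadata": {"name": "pod-policy.example.com"},  # hypothetical name
        "webhooks": [
            {
                "name": "pod-policy.example.com",  # hypothetical name
                "clientConfig": {
                    # Hypothetical in-cluster Service that serves the webhook.
                    "service": {
                        "name": "policy-webhook",
                        "namespace": "policy-system",
                        "path": "/validate",
                    }
                },
                "rules": [
                    {
                        "operations": ["CREATE"],
                        "apiGroups": [""],
                        "apiVersions": ["v1"],
                        "resources": ["pods"],
                    }
                ],
                "admissionReviewVersions": ["v1"],
                "sideEffects": "None",
                "timeoutSeconds": timeout_seconds,
                "failurePolicy": failure_policy,
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_webhook_config(), indent=2))
```

Narrow `rules` are part of the tuning as well: a webhook that matches only the resources it needs adds no latency to unrelated requests.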

Strategy 3: Batch Operations

bash

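# Instead of 1000 individual POSTs, each from a new process or connection,
# group the work where possible: one client, reused connections, bounded concurrency.
# Note: core Kubernetes resources are created one object per request.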
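Where no bulk endpoint is available (core Kubernetes resources are created one object per request), a common substitute is to submit the requests from one client with a bounded number in flight. The sketch below shows that pattern with a thread pool; `create_one` stands in for whatever call actually sends the POST and is an assumption of this example.

```python
from concurrent.futures import ThreadPoolExecutor


def create_all(objects, create_one, max_in_flight=10):
    """Call create_one for every object, at most max_in_flight at a time.

    Results are returned in the same order as the input objects. An exception
    raised by create_one propagates when its result is collected.
    """
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        return list(pool.map(create_one, objects))


if __name__ == "__main__":
    def fake_create(obj):
        # Placeholder for a real POST; returns the name it "created".
        return obj["name"]

    pods = [{"name": f"pod-{i}"} for i in range(5)]
    print(create_all(pods, fake_create, max_in_flight=2))
```

Bounding the concurrency keeps a large import from overwhelming the API server while still avoiding a strictly serial loop.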

Reference Documentation

Kubernetes API Server Metrics

Summary

10-step pipeline: TLS → Auth → Admission → etcd → Response
Typical latency: 50-200ms
Common bottlenecks: Webhooks, etcd, RBAC evaluation
Optimization: Cache, webhook tuning, batching
