Resource Manager API: Automation Workflows and Consistency

Why the Resource Manager API Matters

Once a GCP deployment grows to hundreds of projects, managing it by hand through the Google Cloud Console no longer scales. The Resource Manager API is the programmatic interface used to:

Create/delete projects
Move projects between folders
List resources by hierarchy
Query resource state
Apply IAM policies hierarchically
Manage project labels programmatically

Production reality: Most enterprise GCP organizations automate resource creation with Infrastructure as Code tooling (Terraform, Pulumi, and similar tools that play the role CloudFormation plays on AWS). The Resource Manager API is the backbone this tooling calls to create and organize projects.

Common failure modes:

Automation scripts that do not account for propagation delays → inconsistent state
Querying APIs before resources have fully propagated → 404 errors, race conditions
Assuming operations are synchronous when they are actually asynchronous → partial failures go undetected
Retry logic that is not implemented properly → silent failures in CI/CD pipelines

Resource Manager API Architecture

For the purposes of this guide, the Resource Manager API surface can be viewed as three main components:

1. Resource API (Control Plane)

Handles project/folder CRUD operations:

bash

# Create project
curl -X POST https://cloudresourcemanager.googleapis.com/v3/projects \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "projectId": "my-new-project",
    "displayName": "My New Project",
    "parent": "folders/1234567890"
  }'

# The response is an Operation object (asynchronous), for example:
# {
#   "name": "operations/cp.123456789",
#   "metadata": { "createTime": "2026-06-01T10:00:00Z" },
#   "done": false
# }

Important: Project creation is always asynchronous. The API returns an Operation object with done: false rather than the project resource itself, so callers must poll the operation until done becomes true.
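
The polling pattern for such an Operation can be sketched as follows. This is a minimal illustration, not the client library's API: `fetch_operation` is a hypothetical callable standing in for a GET on the operation name, and the timeout and interval values are assumptions. The official client libraries ship their own waiting helpers.

```python
import time
from typing import Any, Callable, Dict


def wait_for_operation(
    fetch_operation: Callable[[str], Dict[str, Any]],  # hypothetical fetcher
    name: str,
    timeout_s: float = 300.0,
    poll_interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Poll until the operation reports done, then return its response."""
    deadline = clock() + timeout_s
    while True:
        op = fetch_operation(name)
        if op.get("done"):
            # A finished operation carries either "error" or "response".
            if "error" in op:
                raise RuntimeError(f"{name} failed: {op['error']}")
            return op.get("response", {})
        if clock() >= deadline:
            raise TimeoutError(f"{name} not done after {timeout_s}s")
        sleep(poll_interval_s)


# Usage with a fake backend that finishes on the third poll
_polls = iter([
    {"name": "operations/cp.123456789", "done": False},
    {"name": "operations/cp.123456789", "done": False},
    {"name": "operations/cp.123456789", "done": True,
     "response": {"projectId": "my-new-project"}},
])
result = wait_for_operation(
    lambda _name: next(_polls),
    "operations/cp.123456789",
    sleep=lambda _s: None,  # skip real sleeping in the demo
)
print(result)
```

Injecting `sleep` and `clock` keeps the helper testable without waiting in real time.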

2. Metadata API (Propagation Layer)

After a project is created, its metadata must propagate to all GCP services:

API enablement metadata
IAM policy cache
Quota systems
Billing systems
Service-specific metadata

Propagation timeline (illustrative; actual delays vary and are not guaranteed):

T+0s: Project created in the control plane
T+0.5-2s: Project visible in the GCP Console
T+2-5s: IAM policies begin propagating
T+5-10s: Most GCP services see the project
T+10-30s: All services converge (eventual consistency)
T+30s+: Caching/propagation complete
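
To make the timeline concrete, the sketch below computes an exponential backoff schedule that keeps polling past the roughly 30-second window. The initial delay, multiplier, cap, and 60-second budget are illustrative assumptions, not values prescribed by GCP.

```python
from typing import List


def backoff_schedule(
    initial_s: float = 1.0,
    multiplier: float = 2.0,
    max_delay_s: float = 10.0,
    budget_s: float = 60.0,
) -> List[float]:
    """Return the sleeps between attempts; their sum never exceeds budget_s."""
    delays: List[float] = []
    delay, elapsed = initial_s, 0.0
    while elapsed + delay <= budget_s:
        delays.append(delay)
        elapsed += delay
        delay = min(delay * multiplier, max_delay_s)  # cap the growth
    return delays


schedule = backoff_schedule()
print(schedule)       # [1.0, 2.0, 4.0, 8.0, 10.0, 10.0, 10.0, 10.0]
print(sum(schedule))  # 55.0
```

With these values, attempts land at T+0, 1, 3, 7, 15, 25, 35, 45, and 55 seconds, so several attempts still run after the point where most services are expected to have converged.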

3. Query API (Cloud Asset Inventory)

Programmatic queries over the resource hierarchy:

bash

# Search all resources in the organization
gcloud asset search-all-resources \
  --scope=organizations/123456789 \
  --asset-types=compute.googleapis.com/Instance,storage.googleapis.com/Bucket

Behind the scenes, Cloud Asset Inventory maintains an indexed view of all resources, which makes these queries efficient. The index is itself eventually consistent, so very recent changes may not show up in results yet.

Eventual Consistency Model

GCP does not guarantee strong consistency for resource hierarchy operations. Instead:

Control plane: Strong consistency (a created project is immediately visible through the Resource Manager API)
Data plane: Eventual consistency (services see project changes after roughly 5-30s)
Client libraries: May cache results (additional delay)

Why? Strong consistency would require:

Synchronizing state to all regions
Blocking on all dependent services
Much higher latency (potentially 100s of milliseconds)

Instead, GCP chose high availability plus eventual consistency, which is the right trade-off for the majority of workloads.
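
The difference between the two planes can be illustrated with a toy model. This is a simulation only: `ToyHierarchy`, the service names, and the lag values are hypothetical and do not correspond to any GCP API.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ToyHierarchy:
    """Control plane sees writes at once; each service sees them after its lag."""

    service_lag_s: Dict[str, float]  # assumed per-service propagation lag
    created_at: Dict[str, float] = field(default_factory=dict)

    def create_project(self, project_id: str, now: float) -> None:
        self.created_at[project_id] = now

    def control_plane_sees(self, project_id: str, now: float) -> bool:
        # Strongly consistent: visible as soon as the write is recorded.
        return project_id in self.created_at and now >= self.created_at[project_id]

    def service_sees(self, service: str, project_id: str, now: float) -> bool:
        # Eventually consistent: visible only once the service's lag has passed.
        if project_id not in self.created_at:
            return False
        return now - self.created_at[project_id] >= self.service_lag_s[service]


h = ToyHierarchy(service_lag_s={"iam": 5.0, "compute": 10.0})
h.create_project("my-new-project", now=0.0)
print(h.control_plane_sees("my-new-project", now=0.0))        # True
print(h.service_sees("compute", "my-new-project", now=3.0))   # False
print(h.service_sees("compute", "my-new-project", now=12.0))  # True
```

A script that reads back through the control plane and then immediately calls a dependent service hits exactly the window between the first and last line of output.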

Practical Implications

Scenario 1: Project Creation + API Enablement

python

from google.api_core import exceptions, retry
from google.cloud import resourcemanager_v3, service_usage_v1

project_id = "my-new-project"  # example ID
parent = "folders/1234567890"  # example parent folder

# Create the project and wait for the long-running operation to finish
projects_client = resourcemanager_v3.ProjectsClient()
operation = projects_client.create_project(
    project=resourcemanager_v3.Project(project_id=project_id, parent=parent)
)
operation.result(timeout=300)
print(f"Created project {project_id}")

serviceusage_client = service_usage_v1.ServiceUsageClient()

# ❌ PROBLEM: Immediate API enablement
request = service_usage_v1.BatchEnableServicesRequest(
    parent=f"projects/{project_id}",
    service_ids=["compute.googleapis.com"],
)
response = serviceusage_client.batch_enable_services(request=request)
# May fail with "project not found" if propagation is not complete

# ✅ BETTER: Retry with backoff
# The default Retry predicate does not cover NotFound, so name it explicitly.
@retry.Retry(
    predicate=retry.if_exception_type(exceptions.NotFound),
    initial=2.0,
    maximum=10.0,
    multiplier=2.0,
    timeout=60.0,
)
def enable_api_with_retry(project_id, api_name):
    request = service_usage_v1.BatchEnableServicesRequest(
        parent=f"projects/{project_id}",
        service_ids=[api_name],
    )
    # NotFound means the project is not yet visible to the Service Usage API;
    # the decorator sleeps with backoff and calls this function again.
    operation = serviceusage_client.batch_enable_services(request=request)
    return operation.result(timeout=300)

enable_api_with_retry(project_id, "compute.googleapis.com")

Scenario 2: Project Move + Networking

python

import time

from google.cloud import resourcemanager_v3

projects_client = resourcemanager_v3.ProjectsClient()
project_number = "415104041262"  # example values
new_folder_id = "1234567890"

# Move project to a different folder
move_request = resourcemanager_v3.MoveProjectRequest(
    name=f"projects/{project_number}",
    destination_parent=f"folders/{new_folder_id}",
)
operation = projects_client.move_project(request=move_request)
operation.result(timeout=300)

# ❌ PROBLEM: Immediate assumption project moved
print(f"Project moved to folder {new_folder_id}")

# At this point:
# - Project metadata shows new folder (strong consistency)
# - BUT: policies inherited from the new folder (IAM, organization policies,
#   hierarchical firewall policies) may not be enforced everywhere yet
# - Other APIs may still see the project in its old location (stale cache)

# ✅ BETTER: Verify through multiple checks
def verify_project_moved(project_number, expected_folder_id, max_retries=30):
    name = f"projects/{project_number}"
    expected_parent = f"folders/{expected_folder_id}"

    for attempt in range(max_retries):
        project = projects_client.get_project(name=name)
        if project.parent == expected_parent:
            print(f"✓ Metadata updated (attempt {attempt})")
            break
        time.sleep(1)
    else:
        raise TimeoutError(f"{name} is still not under {expected_parent}")

    # Additional check: the IAM policy is readable. This is only a weak
    # signal; it does not prove that inherited policies have fully propagated.
    for attempt in range(max_retries):
        try:
            projects_client.get_iam_policy(resource=name)
            return
        except Exception:
            time.sleep(1)
    raise TimeoutError(f"IAM policy for {name} is not readable")

verify_project_moved(project_number, new_folder_id)

API Quotas & Rate Limiting

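The Resource Manager API enforces rate limits. The figures below are indicative; check the current quota documentation for the values that apply to your organization: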

Operation	Limit
Projects created per minute	5
Projects created per day	500 (per organization)
Folder operations per minute	60
IAM policy updates per minute	10 per resource

Production implication: If automation needs to create 100 projects, it cannot spawn them all in parallel; creation has to be batched, with delays between batches.

python

# create_project(project_id) is assumed to be defined elsewhere

# ❌ Too fast - will hit rate limit
for i in range(100):
    create_project(f"project-{i}")

# ✅ Better - batch with delays
import concurrent.futures
from time import sleep

def create_projects_with_rate_limit(project_ids, batch_size=5, delay=60):
    """Create projects in batches, pausing `delay` seconds between batches.

    The defaults match the 5-projects-per-minute limit in the table above.
    """
    batches = [project_ids[i:i+batch_size] for i in range(0, len(project_ids), batch_size)]
    for index, batch in enumerate(batches):
        with concurrent.futures.ThreadPoolExecutor(max_workers=batch_size) as executor:
            futures = {executor.submit(create_project, pid): pid for pid in batch}

            # Wait for batch to complete
            for future in concurrent.futures.as_completed(futures):
                try:
                    future.result()
                except Exception as e:
                    print(f"Error creating project {futures[future]}: {e}")

        # Delay between batches (not needed after the last one)
        if index < len(batches) - 1:
            sleep(delay)

create_projects_with_rate_limit([f"proj-{i}" for i in range(100)])
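
An alternative to fixed batches is a sliding-window limiter that blocks each call until it fits under the limit. The sketch below is one way this might look, assuming the 5-per-minute figure from the table; `SlidingWindowLimiter` and `FakeClock` are hypothetical names, and the fake clock exists only so the demo runs instantly.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_calls within any window_s-second window."""

    def __init__(self, max_calls, window_s, clock=time.monotonic, sleep=time.sleep):
        self._max_calls = max_calls
        self._window_s = window_s
        self._clock = clock
        self._sleep = sleep
        self._stamps = deque()  # start times of calls still inside the window

    def acquire(self):
        """Block until one more call fits inside the window."""
        while True:
            now = self._clock()
            # Drop calls that have aged out of the window.
            while self._stamps and now - self._stamps[0] >= self._window_s:
                self._stamps.popleft()
            if len(self._stamps) < self._max_calls:
                self._stamps.append(now)
                return
            # Sleep until the oldest call leaves the window.
            self._sleep(self._window_s - (now - self._stamps[0]))


class FakeClock:
    """Demo clock: sleeping just advances the current time."""

    def __init__(self):
        self.now = 0.0

    def time(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds


fake = FakeClock()
limiter = SlidingWindowLimiter(5, 60.0, clock=fake.time, sleep=fake.sleep)
start_times = []
for _ in range(12):
    limiter.acquire()  # in real use, call create_project(...) right after this
    start_times.append(fake.time())
print(start_times)  # five calls at 0.0, five at 60.0, two at 120.0
```

At 5 projects per minute, 100 projects take at least 19 minutes of waiting no matter how the calls are scheduled, so pipelines should budget for that.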

Querying the Hierarchy

Approach 1: Resource Manager API (Legacy)

bash

# List projects in a folder
gcloud projects list \
  --filter="parent.id:FOLDER_ID"

Problems:

Lists only projects, not folders
No depth/nesting information
Does not scale to deep hierarchies

Approach 2: Cloud Asset Inventory (Recommended)

bash

# Query resources by hierarchy
gcloud asset search-all-resources \
  --scope=organizations/ORG_ID \
  --asset-types=cloudresourcemanager.googleapis.com/Project \
  --format="table(name,displayName,parentFullResourceName)"

# Filter by ancestor folder (FOLDER_NUMBER is the numeric folder ID)
gcloud asset search-all-resources \
  --scope=organizations/ORG_ID \
  --asset-types=cloudresourcemanager.googleapis.com/Project \
  --query="folders:FOLDER_NUMBER" \
  --format="csv(name,displayName,parentFullResourceName)"

Advantages:

Queryable at any depth
Supports field-based filtering through its query syntax
Scalable (uses an indexed backend)
Can query across the whole organization at scale
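
Once a flat list of resources and their parents is available, nesting depth can be derived client-side. The sketch below assumes each result has already been reduced to a `(name, parent)` pair; the resource names are made up for illustration.

```python
from typing import Dict, List, Tuple


def depths(edges: List[Tuple[str, str]], root: str) -> Dict[str, int]:
    """Map each resource reachable from root to its depth (root is 0)."""
    children: Dict[str, List[str]] = {}
    for name, parent in edges:
        children.setdefault(parent, []).append(name)

    result = {root: 0}
    stack = [root]
    while stack:
        node = stack.pop()
        for child in children.get(node, []):
            if child not in result:  # guard against malformed cyclic input
                result[child] = result[node] + 1
                stack.append(child)
    return result


# Hypothetical (name, parent) pairs
edges = [
    ("folders/10", "organizations/1"),
    ("folders/20", "folders/10"),
    ("projects/a", "folders/20"),
    ("projects/b", "organizations/1"),
]
tree_depths = depths(edges, "organizations/1")
print(tree_depths["projects/a"])  # 3
print(tree_depths["projects/b"])  # 1
```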

API vs Console Discrepancy

Sometimes the GCP Console shows a resource but API queries do not return it, because the Console and the API can be served from differently cached views.

bash

# Console shows project, but:
gcloud projects describe my-project
# Error: Project 'my-project' not found

# Solution: The project exists, but hasn't propagated to Resource Manager API yet
# Wait and retry

Handling Failures & Race Conditions

Idempotency & Project Deletion

Project deletion has special behavior:

T+0: gcloud projects delete my-project
     → Project marked DELETE_REQUESTED
     → Can still be restored

T+0 to T+30d: Project in "soft delete" state
     → Still counts against quota
     → IAM policies still exist
     → Billing stops

T+30d: Project permanently deleted

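Production implication: If automation creates a project, deletes it, and then immediately tries to create it again with the same ID, the second create fails because the ID is still held by the soft-deleted project. Project IDs cannot be reused even after permanent deletion, so waiting out the 30-day window does not free the ID.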

python

import google.api_core.exceptions

# create_project, delete_project and get_project are assumed helper functions
# that wrap the Resource Manager client.

# ❌ This will fail
project = create_project("my-project")
delete_project("my-project")
project = create_project("my-project")  # Fails! Project ID reserved

# ✅ Solution 1: Restore the soft-deleted project (undelete) within 30 days
# ✅ Solution 2: Use different project IDs
# ✅ Solution 3: Check deletion status first
def safe_create_project(project_id):
    try:
        return create_project(project_id)
    except google.api_core.exceptions.AlreadyExists:
        # Project might be in soft-delete state
        project = get_project(project_id)
        # In the v3 API, Project.state is an enum; .name gives e.g. "DELETE_REQUESTED"
        if project.state.name == "DELETE_REQUESTED":
            print(f"Project {project_id} is soft-deleted. Restore it or use a new ID.")
        raise
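
The soft-delete window above can be expressed as a small date calculation. This sketch assumes the 30-day retention described in the timeline; the `"PURGED"` label is a hypothetical marker for "permanently deleted", not an API state.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=30)  # soft-delete window from the timeline above


def deletion_status(delete_requested_at: datetime, now: datetime):
    """Return (state, restore_deadline) for a project pending deletion."""
    deadline = delete_requested_at + RETENTION
    if now < deadline:
        return "DELETE_REQUESTED", deadline  # still restorable
    return "PURGED", deadline  # hypothetical label: permanently deleted


requested = datetime(2026, 6, 1, 10, 0, tzinfo=timezone.utc)
state, deadline = deletion_status(
    requested, datetime(2026, 6, 15, tzinfo=timezone.utc)
)
print(state)                 # DELETE_REQUESTED
print(deadline.isoformat())  # 2026-07-01T10:00:00+00:00
```

Automation can use the deadline to decide whether restoring the project is still an option or whether a new project ID is required.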

Cross-API Consistency

When resources span projects (e.g., Shared VPC), Resource Manager operations can cause temporary inconsistencies:

T+0: Move project X (host project) to folder Y
     → Project metadata updated
     → But: firewall rules and routes are still cached in the data plane

T+5: User tries to create a VM in project X → firewall enforcement may be inconsistent
     → VM creation may succeed, but filtering is inconsistent

T+30: All services see consistent state

Mitigation:

After moving projects, wait 30-60 seconds before creating dependent resources
Test cross-project functionality after moves
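
Instead of a fixed sleep, a script can wait until a check passes several times in a row, which helps when stale caches make results flap. The sketch below is one possible shape; `wait_until_stable` and its default values are assumptions, and `check` stands for whatever cross-project test applies.

```python
import time
from typing import Callable


def wait_until_stable(
    check: Callable[[], bool],
    required_successes: int = 3,
    interval_s: float = 5.0,
    timeout_s: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Return True once check() passes required_successes times in a row."""
    deadline = clock() + timeout_s
    streak = 0
    while True:
        streak = streak + 1 if check() else 0  # any failure resets the streak
        if streak >= required_successes:
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Demo: results flap before settling
results = iter([False, True, False, True, True, True])
calls = []


def fake_check() -> bool:
    calls.append(1)
    return next(results)


ok = wait_until_stable(fake_check, sleep=lambda _s: None)
print(ok, len(calls))  # True 6
```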

Monitoring API Health

bash

# Show Compute Engine quota usage for a project
# (Resource Manager quotas such as project creation are not listed here;
# check the Quotas page in the Cloud Console for those)
gcloud compute project-info describe --project=PROJECT_ID \
  --format='yaml(quotas)'

# Monitor rate limit errors in logs (status code 8 = RESOURCE_EXHAUSTED)
gcloud logging read \
  'severity>=ERROR AND protoPayload.serviceName="cloudresourcemanager.googleapis.com" AND protoPayload.status.code=8' \
  --limit=50
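
Log entries exported as JSON (for example with `--format=json`) can be summarized with a few lines of Python. The sketch assumes audit-log-shaped entries with `protoPayload.status.code` and `protoPayload.methodName`; the sample entries and their short method names are made up for illustration.

```python
import json
from collections import Counter

RESOURCE_EXHAUSTED = 8  # gRPC status code used for quota / rate-limit errors


def count_rate_limit_errors(entries):
    """Count RESOURCE_EXHAUSTED entries per method name."""
    counts = Counter()
    for entry in entries:
        payload = entry.get("protoPayload", {})
        if payload.get("status", {}).get("code") == RESOURCE_EXHAUSTED:
            counts[payload.get("methodName", "unknown")] += 1
    return counts


# Made-up sample entries, shaped like exported audit logs
sample = json.loads("""[
  {"protoPayload": {"methodName": "CreateProject", "status": {"code": 8}}},
  {"protoPayload": {"methodName": "CreateProject", "status": {"code": 8}}},
  {"protoPayload": {"methodName": "SetIamPolicy", "status": {"code": 7}}},
  {"protoPayload": {"methodName": "MoveProject"}}
]""")
rate_limited = count_rate_limit_errors(sample)
print(dict(rate_limited))  # {'CreateProject': 2}
```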

Terraform with Resource Manager

hcl

# The google provider retries many eventual-consistency errors internally
terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0"
    }
  }
}

provider "google" {
  project = "my-terraform-project"
  region  = "us-central1"
}

resource "google_project" "prod" {
  name            = "Production Project"
  project_id      = "my-prod-project"
  folder_id       = google_folder.prod.id
  billing_account = var.billing_account_id
}

resource "google_project_service" "compute" {
  project = google_project.prod.project_id
  service = "compute.googleapis.com"
  
  # The reference above already orders this after project creation; depends_on makes it explicit
  depends_on = [google_project.prod]
}

# The provider retries many transient errors on its own

Important: The Terraform provider retries many eventual-consistency errors internally, but you can still hit edge cases:

hcl

# These may race if not careful
resource "google_project" "my_project" {
  # ...
}

resource "google_compute_network" "default" {
  project = google_project.my_project.project_id
  name    = "default"
  
  # depends_on only orders creation; it does not wait for propagation
  depends_on = [google_project.my_project]
}

Best Practices

Always retry with exponential backoff when creating resources:

python

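from google.api_core import retry

# create_project is an assumed helper; the default predicate retries transient errors
@retry.Retry(initial=1, maximum=60, multiplier=2)
def create_with_retry(project_id):
    return create_project(project_id)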

Never assume synchronous operation:
- Project creation → async
- IAM policy propagation → async
- Service enablement → async

Query through consistent channels:
- Prefer Cloud Asset Inventory over the Resource Manager API for hierarchy-wide queries
- Asset Inventory has better indexing, but its results are also eventually consistent

Implement comprehensive monitoring:
- Log all Resource Manager operations
- Alert on rate limiting
- Track propagation delays

Test automation thoroughly:
- Test project creation + immediate API usage
- Test project moves + dependent resource operations
- Test across regions
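
For tracking propagation delays, a simple starting point is to record how long each resource took to become usable and report percentiles. The sketch uses the nearest-rank method; the delay samples are hypothetical values, not measurements.

```python
import math
from typing import List


def percentile(samples: List[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list of delays in seconds."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical propagation delays (seconds) collected by automation
delays = [2.1, 3.4, 4.0, 6.5, 8.2, 9.9, 12.0, 14.3, 21.7, 31.0]
print(percentile(delays, 50))  # 8.2
print(percentile(delays, 95))  # 31.0
```

Alerting on a high percentile rather than the average makes slow outliers visible, and those outliers are what break pipelines.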
