Project Naming, Identifiers, and Automation

Why Project Identifiers Are Critical

A common mistake in production GCP deployments is confusing or misusing the three different kinds of project identifier:

  • Project ID: Globally unique identifier (user-specified)
  • Project Number: Automatically assigned identifier
  • Project Name: Display name (human-readable)

Typical production failures:

  • Scripts pass a project number where a project ID is expected → automation breaks
  • Automation makes assumptions about the project ID format → ID validation fails
  • A project is renamed while scripts still reference the old display name → silent failures
  • Automation tries to reuse a project ID after deletion → project creation fails (and the deleted project still counts against quota until it is purged)

Project ID: The Primary Identifier

Definition & Constraints

The project ID is a globally unique identifier for the project across the entire GCP platform.

Format: 6-30 characters
        lowercase letters [a-z]
        numbers [0-9]
        hyphens [-]
        Must start with letter
        Cannot end with hyphen
        Cannot contain "google" or "ssl" (reserved)

Examples (valid):

  • my-webapp-prod
  • backend-api-v2
  • data-pipeline-2024

Examples (invalid):

my_webapp_prod           # underscore not allowed
myWebappProd             # uppercase not allowed
my-webapp-prod--         # ends with hyphen
-my-webapp-prod          # starts with hyphen
3my-webapp-prod          # starts with number
my-google-webapp         # contains "google"
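
The format rules above can be checked before any API call is made. The following is a minimal sketch, assuming only the constraints listed in this section; `is_valid_project_id` and `RESTRICTED` are illustrative names, and the real list of restricted strings may be longer than the two named here.

```python
import re

# 6-30 characters: a leading letter, then lowercase letters, digits, or
# hyphens, and a final character that is a letter or digit (no trailing hyphen).
PROJECT_ID_RE = re.compile(r"[a-z][a-z0-9-]{4,28}[a-z0-9]")

# Restricted substrings named in the rules above (assumed to be incomplete).
RESTRICTED = ("google", "ssl")


def is_valid_project_id(candidate: str) -> bool:
    """Return True if candidate satisfies the format rules listed above."""
    if not PROJECT_ID_RE.fullmatch(candidate):
        return False
    return not any(word in candidate for word in RESTRICTED)
```

A local check like this only covers the format. Global uniqueness can only be confirmed by attempting to create the project.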

Immutability & Permanence

A project ID is immutable once created. It cannot be changed or renamed, and it cannot be reused by another project:

bash
# ❌ Cannot rename project ID
gcloud projects update my-project --name="new-name"
# Changes display name, NOT project ID

# Project ID remains: my-project
# Display name changes to: new-name

This has major implications in production:

  1. Deployment artifact naming must follow the project ID format requirements:

    python
    # ❌ Problem: automation generates invalid project IDs
    def create_project_for_customer(customer_name):
        project_id = f"customer_{customer_name}"  # underscores not allowed!
        return project_id  # project creation will reject this ID
    
    # ✅ Solution: Sanitize input
    import re
    def create_project_for_customer(customer_name):
        # Sanitize: lowercase, replace anything outside [a-z0-9-] with hyphens
        sanitized = re.sub(r'[^a-z0-9-]', '-', customer_name.lower())
        # Collapse consecutive hyphens
        sanitized = re.sub(r'-+', '-', sanitized)
        # Remove leading/trailing hyphens
        sanitized = sanitized.strip('-')
        
        # The prefix guarantees a leading letter; truncate to the 30-character limit
        project_id = f"cust-{sanitized}"[:30].rstrip('-')
        return project_id
  2. Soft deletion blocking:

    T+0: delete_project("my-project")
    T+0 to T+30d: "my-project" is in a soft-delete state (still counts against quota)
    T+30d: Permanently deleted; the ID "my-project" stays reserved and cannot be reused
    
    Problem: If automation tries to recreate the project under the same ID, it fails
  3. Naming conventions must be strict: establish a project ID naming policy from the start:

    Pattern: <team>-<service>-<environment>
    Examples:
    - backend-api-prod
    - backend-api-staging
    - data-pipeline-prod
    - frontend-web-dev
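
A naming policy is easier to enforce when IDs are built by code and not typed by hand. The sketch below assumes the `<team>-<service>-<environment>` pattern shown above; the set of allowed environments and the rule that name parts contain no hyphens are assumptions made here so that an ID can be parsed back unambiguously.

```python
import re

# Assumption: environment names taken from the examples above.
ALLOWED_ENVIRONMENTS = ("prod", "staging", "dev")

# Assumption: team and service names contain no hyphens.
_PART_RE = re.compile(r"[a-z][a-z0-9]*")


def build_project_id(team: str, service: str, environment: str) -> str:
    """Compose <team>-<service>-<environment> and check the 6-30 length rule."""
    if environment not in ALLOWED_ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment!r}")
    for part in (team, service):
        if not _PART_RE.fullmatch(part):
            raise ValueError(f"invalid name part: {part!r}")
    project_id = f"{team}-{service}-{environment}"
    if not 6 <= len(project_id) <= 30:
        raise ValueError(f"project ID has invalid length: {project_id!r}")
    return project_id


def parse_project_id(project_id: str) -> tuple[str, str, str]:
    """Split a convention-following ID back into (team, service, environment)."""
    parts = project_id.split("-")
    if len(parts) != 3 or parts[2] not in ALLOWED_ENVIRONMENTS:
        raise ValueError(f"does not follow the convention: {project_id!r}")
    return parts[0], parts[1], parts[2]
```

For example, `build_project_id("backend", "api", "prod")` yields `backend-api-prod`, and `parse_project_id("frontend-web-dev")` yields `("frontend", "web", "dev")`.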

URL Exposure

Project IDs appear publicly in many places:

# Cloud Storage bucket naming
gs://project-id-bucket/

# Compute Engine instance addresses
instance.zone.c.project-id.internal

# Cloud Functions URLs
https://region-project-id.cloudfunctions.net/function-name

# BigQuery dataset IDs
project-id:dataset_name

# Docker image registry paths
gcr.io/project-id/image-name:tag

Security implication: Project IDs are not secrets, but they do expose organization structure and environment naming. Do not put sensitive information (customer IDs, API keys) in project IDs.
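
To illustrate how easily a project ID can be read back out of such strings, here is a small sketch that recovers it from two of the formats listed above. The function names are hypothetical, and the parsing assumes exactly the `gcr.io/project-id/image-name:tag` and `instance.zone.c.project-id.internal` layouts.

```python
def project_id_from_image(image: str) -> str:
    """Extract the project ID from a gcr.io/PROJECT_ID/IMAGE:TAG path."""
    # Unpacking raises ValueError when the path has fewer than three segments.
    host, project_id, _image_and_tag = image.split("/", 2)
    if not host.endswith("gcr.io"):
        raise ValueError(f"not a gcr.io image path: {image!r}")
    return project_id


def project_id_from_internal_dns(hostname: str) -> str:
    """Extract the project ID from INSTANCE.ZONE.c.PROJECT_ID.internal."""
    parts = hostname.split(".")
    if len(parts) != 5 or parts[2] != "c" or parts[4] != "internal":
        raise ValueError(f"not a zonal internal DNS name: {hostname!r}")
    return parts[3]
```

Anyone who sees an image path or an internal hostname in a log line or error message can recover the project ID the same way.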

Project Number: The Internal Identifier

The project number is an automatically assigned unique identifier, used mainly by GCP internal systems.

json
{
  "projectId": "my-webapp-prod",
  "projectNumber": "123456789012"
}

Characteristics:

  • Numeric identifier (12 digits in the example above) that never changes
  • Automatically assigned when the project is created
  • Unique across all of GCP (like the project ID)
  • Appears mainly in default service account names
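
Because a project ID must start with a letter, a string made only of digits can never be a project ID. Automation can use that to tell the two identifiers apart before passing a value on. The sketch below is illustrative: `classify_project_identifier` is a hypothetical helper, and it assumes project numbers consist of digits only.

```python
import re

_PROJECT_ID_RE = re.compile(r"[a-z][a-z0-9-]{4,28}[a-z0-9]")
_PROJECT_NUMBER_RE = re.compile(r"[0-9]+")


def classify_project_identifier(value: str) -> str:
    """Return 'number', 'id', or 'unknown' for a project identifier string."""
    if _PROJECT_NUMBER_RE.fullmatch(value):
        return "number"
    if _PROJECT_ID_RE.fullmatch(value):
        return "id"
    # Display names and malformed values end up here.
    return "unknown"
```

A script that expects a project ID can then fail fast with a clear message when it receives `"number"` or `"unknown"`.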

Service Account Naming

User-managed service accounts in GCP are named after the project ID, not the project number:

SA_NAME@PROJECT_ID.iam.gserviceaccount.com

Example:
app-sa@my-webapp-prod.iam.gserviceaccount.com

Implication: If automation puts the project number in the domain part of a service account reference instead of the project ID, the reference breaks:

python
# ❌ Wrong: Using project number in the domain
service_account = f"{sa_name}@{project_number}.iam.gserviceaccount.com"

# ✅ Correct: Using project ID in the domain
service_account = f"{sa_name}@{project_id}.iam.gserviceaccount.com"

Default Service Accounts

GCP automatically creates default service accounts:

PROJECT_NUMBER-compute@developer.gserviceaccount.com  # Compute Engine default
PROJECT_NUMBER@cloudservices.gserviceaccount.com       # GCP services default
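
When auditing IAM bindings it can help to recognize these default accounts and recover the project number embedded in them. The following is a rough sketch that covers only the two address patterns shown above; the function name and the `kind` labels are made up for illustration.

```python
import re

# Patterns for the two default account formats listed above.
_DEFAULT_SA_PATTERNS = {
    "compute_default": re.compile(r"(\d+)-compute@developer\.gserviceaccount\.com"),
    "cloudservices": re.compile(r"(\d+)@cloudservices\.gserviceaccount\.com"),
}


def classify_default_service_account(email: str):
    """Return (kind, project_number) for a default account, or None otherwise."""
    for kind, pattern in _DEFAULT_SA_PATTERNS.items():
        match = pattern.fullmatch(email)
        if match:
            return kind, match.group(1)
    return None
```

Any address that returns `None` is not one of these two defaults and needs to be reviewed separately.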

Warning: The Compute Engine default service account has typically been granted the Editor role on the project, which is overly permissive. Production best practice:

  • Disable default service accounts
  • Create custom service accounts with least-privilege roles
bash
# Get default service account
DEFAULT_SA=$(gcloud iam service-accounts list \
  --filter="email~-compute@" \
  --format="value(email)")

# Disable it (the account is kept for reference but can no longer be used)
gcloud iam service-accounts disable "$DEFAULT_SA"

# Create custom service account instead
gcloud iam service-accounts create app-sa \
  --display-name="Application Service Account"

Project Name: Display Name Only

The project name is a human-readable display name: fully mutable and not unique.

bash
# Change display name (not project ID)
gcloud projects update my-webapp-prod --name="My Web Application - Production"

# Query project
gcloud projects describe my-webapp-prod --format='value(name)'
# Output: My Web Application - Production

Problem: Scripts and automation often assume that project names are unique:

python
# ❌ Problematic: Find project by display name
def find_project_by_name(org_id, display_name):
    projects = list_projects(org_id)
    for project in projects:
        if project['name'] == display_name:
            return project
    # Problem: Multiple projects may have the same name!

# ✅ Correct: Use project ID
def find_project_by_id(project_id):
    return get_project(project_id)
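
Where display names are already in use as lookup keys, a quick audit can show how many of them are ambiguous. This is a minimal sketch over plain dictionaries; the `projectId` and `name` keys are an assumption about how the project records are represented, so adjust them to whatever the listing call in use returns.

```python
from collections import defaultdict


def find_duplicate_display_names(projects):
    """Map each display name used more than once to the project IDs using it."""
    ids_by_name = defaultdict(list)
    for project in projects:
        ids_by_name[project["name"]].append(project["projectId"])
    # Keep only the names shared by two or more projects.
    return {name: ids for name, ids in ids_by_name.items() if len(ids) > 1}
```

An empty result means every display name currently maps to one project, but nothing prevents a duplicate from being created later, so lookups should still go through the project ID.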

Naming Convention for Operations

Although the project ID is immutable, the display name can be updated to reflect the current state:

bash
# T+0: Development
gcloud projects update backend-api-staging \
  --name="Backend API - Staging (Team Lead: Alice)"

# T+6 months: Team changes
gcloud projects update backend-api-staging \
  --name="Backend API - Staging (Team Lead: Bob)"

Automation Best Practices

Pattern 1: Centralized Project Registry

Maintain a single source of truth for project metadata:

yaml
# projects.yaml
projects:
  - id: backend-api-prod
    name: "Backend API - Production"
    folder_id: folders/1234567890
    environment: production
    team: backend
  
  - id: backend-api-staging
    name: "Backend API - Staging"
    folder_id: folders/1234567890
    environment: staging
    team: backend
  
  - id: backend-api-dev
    name: "Backend API - Development"
    folder_id: folders/9876543210
    environment: development
    team: backend
python
import yaml

def load_project_registry(filepath):
    with open(filepath) as f:
        config = yaml.safe_load(f)
    return {p['id']: p for p in config['projects']}

projects = load_project_registry('projects.yaml')

# Safely reference projects
for project_id, config in projects.items():
    print(f"Project: {project_id} ({config['name']})")
    create_resources(project_id=project_id, config=config)
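
A registry is only a reliable source of truth if it is checked before use. The sketch below validates the parsed `projects` list for missing keys, duplicate IDs, and malformed IDs. `validate_registry` and `REQUIRED_KEYS` are illustrative names, and the required keys are inferred from the sample file above.

```python
import re

PROJECT_ID_RE = re.compile(r"[a-z][a-z0-9-]{4,28}[a-z0-9]")

# Assumption: every entry carries the same keys as the sample registry.
REQUIRED_KEYS = {"id", "name", "folder_id", "environment", "team"}


def validate_registry(entries):
    """Return a list of problems found; an empty list means the registry passed."""
    problems = []
    seen_ids = set()
    for entry in entries:
        project_id = entry.get("id", "<missing id>")
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            problems.append(f"{project_id}: missing keys {sorted(missing)}")
        if project_id in seen_ids:
            problems.append(f"{project_id}: duplicate id")
        seen_ids.add(project_id)
        if not PROJECT_ID_RE.fullmatch(project_id):
            problems.append(f"{project_id}: invalid project ID format")
    return problems
```

Running a check like this in CI, on the list under the `projects` key, keeps a typo in the registry from reaching the provisioning step.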

Pattern 2: Environment Variables

bash
# .env.prod
GCP_PROJECT_ID=backend-api-prod
GCP_PROJECT_NUMBER=123456789012
GCP_REGION=us-central1

# application.sh
#!/bin/bash
source .env.prod

gcloud config set project $GCP_PROJECT_ID

# Now all gcloud commands use correct project ID
gcloud compute instances list
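
Application code can read the same variables and reject a swapped ID and number early. This is a sketch only: `GcpSettings` and `load_gcp_settings` are hypothetical names, and the variable names are the ones from `.env.prod` above.

```python
import os
from dataclasses import dataclass

REQUIRED_VARS = ("GCP_PROJECT_ID", "GCP_PROJECT_NUMBER", "GCP_REGION")


@dataclass(frozen=True)
class GcpSettings:
    project_id: str
    project_number: str
    region: str


def load_gcp_settings(env=None) -> GcpSettings:
    """Read the three variables from env (os.environ by default) and validate them."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError(f"missing environment variables: {missing}")
    if not env["GCP_PROJECT_NUMBER"].isdigit():
        raise ValueError("GCP_PROJECT_NUMBER must be numeric")
    if env["GCP_PROJECT_ID"].isdigit():
        raise ValueError("GCP_PROJECT_ID looks like a project number")
    return GcpSettings(
        project_id=env["GCP_PROJECT_ID"],
        project_number=env["GCP_PROJECT_NUMBER"],
        region=env["GCP_REGION"],
    )
```

Passing a plain dictionary as `env` makes the loader easy to unit test without touching the real environment.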

Better: Use Terraform

hcl
variable "environment" {
  default = "prod"
}

locals {
  project_config = {
    prod = {
      project_id = "backend-api-prod"
      region     = "us-central1"
    }
    staging = {
      project_id = "backend-api-staging"
      region     = "us-central1"
    }
  }
}

provider "google" {
  project = local.project_config[var.environment].project_id
  region  = local.project_config[var.environment].region
}

Pattern 3: Programmatic Project Creation

python
import re

def sanitize_project_id(user_input, prefix="proj"):
    """Convert user input to valid GCP project ID"""
    # Lowercase
    clean = user_input.lower()
    
    # Replace anything outside [a-z0-9-] (spaces, underscores, ...) with hyphens
    clean = re.sub(r'[^a-z0-9-]', '-', clean)
    
    # Remove consecutive hyphens
    clean = re.sub(r'-+', '-', clean)
    
    # Strip leading/trailing hyphens
    clean = clean.strip('-')
    
    # Add prefix (guarantees the ID starts with a letter)
    project_id = f"{prefix}-{clean}"
    
    # Ensure within length limits; truncation must not leave a trailing hyphen
    project_id = project_id[:30].rstrip('-')
    
    return project_id

# Test
assert sanitize_project_id("My_Web App!") == "proj-my-web-app"
assert sanitize_project_id("Customer Name 2024") == "proj-customer-name-2024"
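
Truncating to 30 characters means two long inputs that share a prefix can map to the same ID. One possible mitigation, sketched here as an assumption and not as part of the pattern above, is to replace the truncated tail with a short hash of the original input. `sanitize_with_hash` is a hypothetical variant.

```python
import hashlib
import re


def sanitize_with_hash(user_input: str, prefix: str = "proj", max_len: int = 30) -> str:
    """Sanitize like above, but keep long inputs distinct with a hash suffix."""
    clean = re.sub(r"[^a-z0-9]+", "-", user_input.lower()).strip("-")
    if not clean:
        raise ValueError("input contains no usable characters")
    candidate = f"{prefix}-{clean}"
    if len(candidate) <= max_len:
        return candidate
    # Six hex characters derived from the untruncated input.
    digest = hashlib.sha256(user_input.encode("utf-8")).hexdigest()[:6]
    # Leave room for "-" plus the digest, and avoid a double hyphen at the join.
    head = candidate[: max_len - len(digest) - 1].rstrip("-")
    return f"{head}-{digest}"
```

Short inputs are unchanged, so `sanitize_with_hash("My_Web App!")` still gives `proj-my-web-app`. The sketch does not enforce the 6-character minimum, which a very short prefix could violate.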

Pattern 4: Project ID Collision Detection

python
import time

import google.api_core.exceptions

def is_project_id_available(project_id, organization_id):
    """Check if project ID is available (not in use and not in soft-delete)"""
    try:
        resource_manager_client.get_project(project_id)
        # Project exists and accessible
        return False
    except google.api_core.exceptions.NotFound:
        # Project doesn't exist - verify it's not in soft-delete
        pass
    
    # check_asset_inventory_for_project is a placeholder helper. It is expected
    # to search Cloud Asset Inventory for a resource of type
    # "cloudresourcemanager.googleapis.com/Project" named "projects/{project_id}"
    # and to return True when one is found.
    found = check_asset_inventory_for_project(organization_id, project_id)
    
    # If found in asset inventory → in soft-delete → not available
    # If not found → available
    return not found

def safe_create_project(project_id, organization_id, retry_count=3):
    """Create project with collision detection"""
    for attempt in range(retry_count):
        if not is_project_id_available(project_id, organization_id):
            raise ValueError(f"Project ID {project_id} not available")
        
        try:
            return create_project(project_id, organization_id)
        except google.api_core.exceptions.AlreadyExists:
            if attempt < retry_count - 1:
                time.sleep(5 * (2 ** attempt))  # exponential backoff
            else:
                raise

Querying Projects Correctly

bash
# ❌ Problem: Case sensitivity
gcloud projects describe MY-WEBAPP-PROD
# Error: Project 'MY-WEBAPP-PROD' not found
# (Project ID is my-webapp-prod)

# ✅ Correct: Lowercase project ID
gcloud projects describe my-webapp-prod

# ✅ Correct: Use --project flag
gcloud compute instances list --project=my-webapp-prod

# ✅ Query by display name (if needed)
gcloud projects list --filter="name~Production"

# ✅ Query by folder
gcloud projects list --filter="parent.type=folder AND parent.id=FOLDER_ID"

Project Deletion & Recycling

Soft Deletion Timeline

T+0: gcloud projects delete my-project
     → Status: DELETE_REQUESTED
     → Can restore with: gcloud projects undelete

T+0 to T+30 days: Project is pending deletion
     → Can still be restored
     → Still counts against quota

T+30 days: Permanent deletion
     → Cannot restore anymore
     → Project ID stays reserved and cannot be reused
     → Quota becomes available
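
Automation that tracks deleted projects can compute when the 30-day window in the timeline above ends. The helper names below are hypothetical, and the sketch assumes the deletion request time is recorded by the caller as a timezone-aware timestamp.

```python
from datetime import datetime, timedelta, timezone

# The 30-day window from the timeline above.
RETENTION = timedelta(days=30)


def permanent_deletion_not_before(delete_requested_at: datetime) -> datetime:
    """Earliest time at which the project is permanently deleted."""
    return delete_requested_at + RETENTION


def in_retention_window(delete_requested_at: datetime, now: datetime) -> bool:
    """True while the deleted project still exists and counts against quota."""
    return now < permanent_deletion_not_before(delete_requested_at)
```

For a deletion requested on 2024-01-01 UTC, the window ends on 2024-01-31 UTC, so quota planning should treat the project as still present until then.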

Production Scenario: Blue-Green Deployments

python
def create_temporary_project(base_name, lifetime_days=1):
    """Create temp project for testing, auto-delete after lifetime"""
    import datetime
    
    timestamp = datetime.datetime.now().strftime("%Y%m%d%H%M")
    project_id = f"{base_name}-tmp-{timestamp}"
    
    # Create project
    project = create_project(project_id)
    
    # Schedule deletion (using Cloud Scheduler + Cloud Functions)
    schedule_project_deletion(project_id, lifetime_days)
    
    return project_id

# Usage:
temp_project = create_temporary_project("backend-api")
# Automatically deleted after 1 day
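
The `-tmp-` suffix plus a 12-digit timestamp takes 17 of the 30 available characters, so the base name can be at most 13 characters long. A sketch of a guard for that follows; `temporary_project_id` is a hypothetical helper that takes the current time as a parameter so it can be tested deterministically.

```python
from datetime import datetime

MAX_PROJECT_ID_LENGTH = 30


def temporary_project_id(base_name: str, now: datetime) -> str:
    """Build <base_name>-tmp-<YYYYMMDDHHMM>, rejecting names that would be too long."""
    suffix = now.strftime("-tmp-%Y%m%d%H%M")  # always 17 characters
    if len(base_name) + len(suffix) > MAX_PROJECT_ID_LENGTH:
        limit = MAX_PROJECT_ID_LENGTH - len(suffix)
        raise ValueError(f"base name {base_name!r} exceeds {limit} characters")
    return base_name + suffix
```

Two calls with the same base name within the same minute produce the same ID, so a caller that creates projects in quick succession needs an extra discriminator.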

Terraform State Management

hcl
# When using Terraform, project IDs are tracked in the state file
resource "google_project" "prod" {
  name       = "My Application - Production"
  project_id = "my-app-prod"
}

# Output project identifiers
output "project_id" {
  value = google_project.prod.project_id
}

output "project_number" {
  value = google_project.prod.number
}

bash
# Refresh the state so it matches real infrastructure, then read the output
terraform apply -refresh-only
terraform output -raw project_id
# Output: my-app-prod

Common Anti-Patterns to Avoid

Anti-pattern | Problem | Solution
------------ | ------- | --------
Using project number in project reference | APIs expect project ID | Always use project ID
Assuming project name uniqueness | Names can duplicate | Use project ID for queries
Hard-coding project IDs | Not portable across environments | Use variables/config files
Not sanitizing user input for project IDs | Invalid characters break automation | Implement validation
Not handling soft-delete window | Project ID collision after deletion | Use a different ID; a deleted project's ID is not released
Renaming projects in place | Breaks scripts that depend on the display name | Plan naming carefully upfront

References