Regional vs Global Services: Data Sovereignty & Architectural Implications
Why It Matters in Production
GCP services fall into two broad categories:
Global and multi-region services: reached through a global API; data can be spread across several regions
- Examples: Cloud Storage, BigQuery, Cloud Spanner
- Problem: unless you choose a location at creation time, the default may breach data residency requirements (GDPR transfer rules, contractual or sector-specific obligations)
Regional and zonal services: run in a specific region or zone that you choose
- Examples: Compute Engine, Cloud SQL, GKE
- Benefit: data location is explicit and auditable
Choosing the wrong location type → compliance violations → legal penalties.
Internal Model: Data Sovereignty Constraints
Global Services (Default Behavior)
Cloud Storage bucket created without an explicit location:
├─ Default storage class: STANDARD (storage class is independent of location)
├─ Default location: US multi-region
│ └─ Example: "us" multi-region bucket
│ └─ Data stored in: data centers across the United States
│ └─ GCP chooses the specific data centers automatically
│
├─ Compliance problem:
│ ├─ GDPR: transfers of EU personal data outside the EEA need a legal basis
│ ├─ What happened: EU personal data landed in US data centers
│ └─ Risk: an unlawful transfer can be fined by EU supervisory authorities
│
├─ Regulatory requirements (simplified; confirm details with legal counsel):
│ ├─ GDPR (EU): restricts transfers of EU personal data outside the EEA; many organizations keep it in EU regions
│ ├─ CCPA (California): privacy rights for California residents; no data location mandate
│ ├─ HIPAA (Healthcare): requires a BAA and safeguards for patient data; location limits usually come from contracts
│ ├─ SOC 2: audit framework; auditors expect data locations to be documented
│ └─ FedRAMP (Gov): US government data only in authorized services and facilities
Regional Services (Controlled Placement)
Compute Engine instance created:
├─ Specified zone: europe-west1-b (Belgium)
├─ Data stored: zonal persistent disks stay in europe-west1-b (your control)
├─ Replication: your choice (replicate to other EU zones or regions if needed)
│
├─ Compliance benefit:
│ ├─ GDPR: data stays in the EU, which avoids cross-border transfer questions
│ ├─ Control: you decide where data is replicated
│ └─ Audit: data location is provable
│
├─ Regional database (Cloud SQL):
│ ├─ Primary: europe-west1
│ ├─ Replica: europe-west1 (same region, explicit)
│ ├─ Or: another EU region (europe-west4) if needed
│ └─ Backup: set a custom backup location in the region; the default is the nearest multi-region
Production Architecture Patterns
Pattern 1: GDPR-Compliant Architecture
Requirement: Process EU user data only in EU regions
Architecture:
├─ Compute tier:
│ └─ GKE cluster in europe-west1 (Belgium)
│
├─ Database tier:
│ └─ Cloud SQL PostgreSQL (europe-west1 primary + europe-west4 replica)
│
├─ Storage tier:
│ └─ Cloud Storage regional bucket (europe-west1)
│ └─ NOT the default US multi-region (the EU multi-region stays in the EU but gives less control over placement)
│
├─ Analytics tier:
│ └─ BigQuery EU multi-region dataset
│ └─ Or: BigQuery dataset in a single EU region (tighter placement control)
│
└─ Backup tier:
└─ Snapshots stored in europe-west1 only (set the location explicitly; the default is a multi-region)
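One way to keep the architecture above honest is to compare every resource location against an explicit allowlist. The sketch below assumes a hypothetical helper, `is_allowed_location`, and a hypothetical `ALLOWED_LOCATIONS` list; neither is part of GCP tooling, and the list should reflect whatever locations your legal team has approved.

```shell
#!/usr/bin/env bash
# Hypothetical allowlist (an assumption for illustration, not a GCP setting).
ALLOWED_LOCATIONS="eu europe-west1 europe-west4"

# Succeeds (exit 0) if the given location is on the allowlist.
# The comparison is case-insensitive because tools report locations
# in different cases (for example EUROPE-WEST1 vs europe-west1).
is_allowed_location() {
  local loc allowed
  loc=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  for allowed in $ALLOWED_LOCATIONS; do
    if [ "$loc" = "$allowed" ]; then
      return 0
    fi
  done
  return 1
}

# Classify a few sample locations as they might appear in resource metadata.
for loc in EUROPE-WEST1 EU US us-central1; do
  if is_allowed_location "$loc"; then
    echo "$loc: allowed"
  else
    echo "$loc: NOT allowed"
  fi
done
```

An explicit list is used instead of a `europe-*` prefix match because some `europe-` regions sit outside the EU (for example europe-west2 in London and europe-west6 in Zurich).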
Compliance verification:
├─ Data location audit:
│ └─ Question: "Where does each dataset, disk, and backup live?"
│ └─ Expected result: all in EU locations
│
├─ Sensitive data discovery:
│ └─ Sensitive Data Protection (formerly Cloud DLP) scans for personal data
│ └─ Alert if: personal data is found in a resource outside the EU
│
└─ Documentation:
└─ Cloud Data Processing Addendum (CDPA) with Google
└─ Provides: the contractual basis; evidence of data location comes from your own audits
Pattern 2: Multi-Region Compliance (US + EU Separation)
Requirement: US user data stays in the US, EU user data stays in the EU
Challenge: A single application must serve both
Architecture:
├─ Application layer (global, multi-region):
│ ├─ Load balancer: Global (routes by geography)
│ └─ Traffic steering: Users → nearest region (for latency only; data placement follows the user's home region)
│
├─ US data tier:
│ ├─ Region: us-central1
│ ├─ Database: Cloud SQL (us-central1 primary)
│ ├─ Storage: Cloud Storage regional (us-central1)
│ └─ Users: US-based only
│
├─ EU data tier:
│ ├─ Region: europe-west1
│ ├─ Database: Cloud SQL (europe-west1 primary)
│ ├─ Storage: Cloud Storage regional (europe-west1)
│ └─ Users: EU-based only
│
└─ Application routing logic:
└─ if (user.location == 'US') → Connect to us-central1 services
└─ if (user.location == 'EU') → Connect to europe-west1 services
Pattern 3: Shared Global Service with Regional Data
Scenario: BigQuery analytics (global API) with regional data constraints
Challenge:
├─ BigQuery is reached through a global API, but each dataset has a fixed location
├─ Personal data must stay in its region
├─ How to analyze without violating constraints?
Solution:
├─ Separate datasets per region:
│ ├─ Dataset 1: EU personal data (BigQuery EU multi-region)
│ ├─ Dataset 2: US personal data (BigQuery US multi-region)
│ └─ Dataset 3: Anonymous/aggregated (any location, no residency constraints)
│
├─ Query strategy:
│ ├─ For EU analysis: Query EU dataset (stays in EU)
│ ├─ For US analysis: Query US dataset (stays in US)
│ └─ For global analysis: Query anonymous dataset
│
├─ Cross-region analysis:
│ └─ If needed: Aggregate within each region first, then combine the aggregates
│ └─ Example: "US count" + "EU count" = "Global count" (allowed)
│ └─ But: A single query cannot join personal data across locations (not allowed)
Real-world Compliance Scenarios
Scenario 1: Accidental Multi-Region Storage (GDPR Violation)
Mistake: Created a Cloud Storage bucket without specifying a location
Symptom:
├─ Bucket created: "eu-gdpr-data"
├─ Expected: Data in Europe
├─ Actual: US multi-region bucket (default)
│ └─ Data stored in: US data centers chosen by GCP
│ └─ US data centers hold EU personal data!
Discovery:
├─ Audit: "Where is eu-gdpr-data stored?"
├─ GCP response: "Location: US, location type: multi-region"
├─ Compliance check: Fails (data outside EU)
Impact:
├─ Regulatory audit: GDPR violation detected
├─ Notification: EU supervisory authority
├─ Fine: up to €20 million or 4% of global annual turnover, whichever is higher
├─ Mandatory: Create new regional bucket, migrate data
Resolution:
├─ Create: europe-west1 regional bucket
├─ Migrate: All data from the multi-region bucket to the regional bucket
├─ Delete: Multi-region bucket
└─ Evidence: Provide proof of data location to authorities
Scenario 2: Replication Violation (Data Residency)
Architecture:
├─ Primary: Cloud SQL in europe-west1 (EU requirement)
├─ Replica: Cloud SQL in us-central1 (for failover)
Problem:
├─ Primary in EU ✓ (compliant)
├─ Replica in US ✗ (non-compliant)
├─ Why: A read replica copies the primary's data, so EU personal data is replicated to the US
Consequences:
├─ GDPR exposure: Data transferred outside the EU
├─ Transfer basis: Needs a mechanism such as SCCs (Standard Contractual Clauses)
├─ Audit finding: Improper data transfer mechanism
└─ Fix: Use only EU replicas (europe-west4) or delete the US replica
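A replica outside the approved area is easy to miss in a long instance list. The following is a minimal sketch of a filter for that case, assuming a hypothetical helper `find_replicas_outside` and CSV rows of the form `name,region,masterInstanceName`, where the third column is empty for primaries. The sample rows and instance names are made up for illustration.

```shell
#!/usr/bin/env bash
# Reads "name,region,masterInstanceName" CSV rows on stdin and prints every
# read replica (non-empty third column) whose region does not start with
# the given prefix.
find_replicas_outside() {
  local prefix="$1"
  awk -F, -v p="$prefix" \
    '$3 != "" && index($2, p) != 1 { print $1 " (" $2 ")" }'
}

# Sample rows standing in for real instance data (hypothetical names).
printf '%s\n' \
  'eu-db,europe-west1,' \
  'eu-db-replica-a,europe-west4,my-project:eu-db' \
  'eu-db-replica-b,us-central1,my-project:eu-db' |
  find_replicas_outside 'europe-'
```

In practice the rows could likely come from `gcloud sql instances list --format='csv[no-heading](name,region,masterInstanceName)'`; treat that field list as an assumption and check it against your own output before relying on it.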
Correct approach:
└─ Primary: europe-west1
└─ Replica: europe-west4 (both in the EU)
Scenario 3: Implicit Data Transfer (BigQuery)
Scenario: EU user data in a BigQuery EU dataset; a US team analyzes it
Question: Does analysis transfer data to US?
├─ BigQuery dataset: EU multi-region
├─ Query: Analyst in US submits query
├─ Execution: Where does query run?
Answer:
├─ Query execution: Runs in the dataset's location (EU)
├─ Results: Can be viewed or downloaded by the US analyst
├─ Data transfer: Stored data and processing stay in the EU; only the results reach the analyst
├─ Compliance: Storage and compute are fine; if results contain personal data, remote access may itself count as a transfer, so review it
But if:
├─ Export results to Cloud Storage: Check bucket location
├─ If bucket in us-central1: Now data in US ✗
├─ Solution: Export to EU regional bucket instead ✓
Common Mistakes & Anti-Patterns
Mistake 1: Assuming Global Service = Data Spread Everywhere
❌ Wrong thinking:
"Cloud Storage stores data globally, so all data is duplicated worldwide"
✅ Correct understanding:
- Multi-region: Replicated across data centers within one large area (US, EU, or ASIA) only
- Regional: Single region only
- You choose:
gs://my-bucket in us (multi-region, US only) vs
gs://my-bucket in europe-west1 (regional, EU only)
Prevention: Always specify storage location when creating Cloud Storage buckets.
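The prevention rule can be enforced in tooling rather than left to memory. Below is a minimal sketch, assuming a hypothetical wrapper named `make_bucket_cmd` that refuses to produce a bucket-creation command unless a location is supplied. It prints the command instead of running it, so the result can be reviewed first.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: builds a bucket-creation command only when a
# location is given, so the US multi-region default never applies silently.
make_bucket_cmd() {
  local bucket="$1" location="${2:-}"
  if [ -z "$location" ]; then
    echo "error: a location is required for $bucket" >&2
    return 1
  fi
  # Print the command for review; it is not executed here.
  echo "gcloud storage buckets create $bucket --location=$location"
}

# Example: a regional EU bucket.
make_bucket_cmd gs://eu-data-regional-bucket europe-west1
```

Calling the wrapper without a second argument returns a non-zero status, which makes a missing location fail loudly in scripts and CI pipelines.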
Mistake 2: Relying on "Hidden" Data Processing
❌ Wrong thinking:
"GCP processes data in EU region, so it's always EU-processed"
✅ Correct understanding:
- Some services: May process or cache data outside the region you selected
- Example: Do not assume every BigQuery feature keeps all processing in the dataset location; check the documentation for the features you use
- Must verify: Where processing actually occurs, for each service in scope
- Better: Use regional datasets / services when required
Prevention: Review GCP compliance documentation for each service used.
Mistake 3: Forgetting Backups/Snapshots in Compliance
❌ Wrong thinking:
"Only the primary database matters, backups can go anywhere"
✅ Correct understanding:
- Backups: Also personal data if the primary has it
- Must follow the same residency requirements
- Default backup location: Often the nearest multi-region (for example, Cloud SQL), not the instance's own region
- Risk: A backup location outside the approved area → residency violation
Prevention: Verify backup storage location in service configuration.
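The verification step can be reduced to a small comparison between an instance's region and its configured backup location. The sketch below assumes a hypothetical helper `backup_location_status`; the status labels are invented for illustration, and an empty backup location is taken to mean that no custom location was set.

```shell
#!/usr/bin/env bash
# Compares an instance region with its configured backup location.
# Prints DEFAULT when no custom location is set, SAME_REGION when the
# backup stays in the instance's region, and DIFFERENT:<location> otherwise.
backup_location_status() {
  local region="$1" backup="${2:-}"
  if [ -z "$backup" ]; then
    echo "DEFAULT"
  elif [ "$backup" = "$region" ]; then
    echo "SAME_REGION"
  else
    echo "DIFFERENT:$backup"
  fi
}

# Sample values (hypothetical); real values come from the instance config.
backup_location_status europe-west1 europe-west1
backup_location_status europe-west1 ""
backup_location_status europe-west1 us
```

For Cloud SQL, the backup location is believed to be readable with `gcloud sql instances describe eu-db --format='value(settings.backupConfiguration.location)'`; that field path is an assumption to confirm against your gcloud version. Both `DEFAULT` and `DIFFERENT` results deserve a manual look, since neither proves the backup sits where policy requires.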
GCP-native Implementation Guidance
Enforcing Regional Constraints
```bash
# Create regional Cloud Storage bucket (not multi-region)
gsutil mb -l europe-west1 gs://eu-data-regional-bucket

# Create regional Cloud SQL instance
gcloud sql instances create eu-db \
  --database-version=POSTGRES_15 \
  --region=europe-west1 \
  --tier=db-f1-micro

# Create regional GKE cluster (--num-nodes is per zone in a regional cluster)
gcloud container clusters create eu-cluster \
  --region=europe-west1 \
  --num-nodes=3

# Create BigQuery dataset in the EU multi-region
bq mk --location=EU --dataset eu_data_dataset

# Verify placement (bucket location, instance region, dataset location)
gsutil ls -L -b gs://eu-data-regional-bucket | grep -i 'location'
gcloud sql instances describe eu-db --format='value(region)'
bq show --format=json eu_data_dataset | jq '.location'
```
Compliance Auditing
```bash
# Find all Cloud Storage buckets and their location
gsutil ls -L -b

# Check for multi-region buckets (potential violation)
for bucket in $(gsutil ls); do
  LOCATION_TYPE=$(gsutil ls -L -b "$bucket" 2>/dev/null | grep 'Location type' || echo "Unknown")
  if [[ "$LOCATION_TYPE" == *"multi-region"* ]]; then
    echo "WARNING: Multi-region bucket found: $bucket"
  fi
done

# List Cloud SQL instances and regions
gcloud sql instances list --format='table(name, region, databaseVersion)'

# Review bucket creations recorded in the Admin Activity audit log
gcloud logging read \
  'logName:"cloudaudit.googleapis.com%2Factivity" AND protoPayload.methodName="storage.buckets.create"' \
  --format='table(severity, timestamp, protoPayload.resourceName)'
```
Data Residency Assertion
```bash
# Collect evidence of regional placement for a compliance report
# 1. List the regions in use per service
gcloud compute instances list --format='value(zone.basename())' | sed 's/-[a-z]$//' | sort -u
gcloud sql instances list --format='value(name, region)'
gcloud container clusters list --format='value(name, location)'

# 2. Create compliance report
cat > compliance_report.txt << EOF
Data Residency Compliance Report - Generated $(date)
Compute Engine VMs (EU regions):
$(gcloud compute instances list --format='table(name, zone.basename())' --filter='zone ~ europe')
Cloud SQL Databases (EU regions):
$(gcloud sql instances list --format='table(name, region)' --filter='region ~ ^europe')
Cloud Storage Buckets (name, location, location type):
$(gsutil ls -L -b | grep -E '^gs://|Location (constraint|type)')
Compliance Status: [To be confirmed by auditor]
Date Verified: $(date)
Auditor: [Your Name]
EOF
echo "Compliance report created: compliance_report.txt"
```
References
- GCP Compliance and Data Residency — Official guidance
- GDPR Compliance on GCP — EU-specific requirements
- Data Residency and Sovereignty — Architecture considerations
- Cloud Storage Location Best Practices — Regional vs multi-region