VPC Service Controls — Preventing Data Exfiltration at Scale

Executive Summary

VPC Service Controls = a security perimeter that controls data exfiltration at the API level (not the network level).

Model:

  • ✅ Define protected resources (databases, buckets, projects)
  • ✅ Allow access from specific identities only
  • ✅ Prevent exfiltration to unauthorized destinations
  • ✅ Works across network boundaries (not just VPC)
  • ❌ Enforced per API request rather than per packet (adds latency compared with a firewall)
  • ❌ Complexity: Requires careful access policy design

Network Security ≠ Data Security

Traditional Firewall Approach

Firewall rules: ALLOW tcp:443 from 10.0.1.0/24
  ├── Allows: app-tier → Cloud Storage
  ├── But also allows: malicious VM in app-tier → Cloud Storage
  ├── Problem: Cannot distinguish good vs bad traffic
  └── Result: Data accessible to any VM in app-tier

Implication:
  If one VM is compromised, the attacker can exfiltrate data
  The firewall cannot detect or prevent this
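
To make the limitation concrete, here is a minimal sketch of how such a rule is evaluated, assuming the rule is modeled as a source CIDR plus protocol and port. The function name and the sample IP addresses are hypothetical and are not part of any Google Cloud API.

```python
import ipaddress

# Hypothetical model of the rule "ALLOW tcp:443 from 10.0.1.0/24".
ALLOWED_SOURCE = ipaddress.ip_network("10.0.1.0/24")
ALLOWED_PORT = 443


def firewall_allows(source_ip: str, protocol: str, port: int) -> bool:
    """Match on network attributes only; the caller's identity is never consulted."""
    return (
        protocol == "tcp"
        and port == ALLOWED_PORT
        and ipaddress.ip_address(source_ip) in ALLOWED_SOURCE
    )


# A legitimate app VM and a compromised VM in the same subnet are indistinguishable.
legit_vm = firewall_allows("10.0.1.15", "tcp", 443)
compromised_vm = firewall_allows("10.0.1.99", "tcp", 443)
outside_vm = firewall_allows("10.0.2.7", "tcp", 443)
print(legit_vm, compromised_vm, outside_vm)
```

Both VMs in the app-tier subnet pass the check, because nothing in the rule describes who is making the request.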

VPC Service Controls Approach

Service perimeter: "Prod Data"
  Protected resources:
    - Cloud Storage bucket "prod-data"
    - BigQuery dataset "prod-analytics"
  
  Allowed members:
    - service-account@prod-analytics.iam.gserviceaccount.com
    - group:data-engineers@company.com

Implication:
  Even if an app-tier VM is compromised, the attacker cannot:
    - Use prod SA to access data (SA not available in VM)
    - Use user account (user not in allowed group)
    - Exfiltrate data to unauthorized service
  
  Result: Data protected at API level, not just network

Architecture: Service Perimeter

Defining Perimeter

Service perimeter "prod_data_perimeter":
  ├── Protected resources:
  │   ├── Cloud Storage projects/PROJECT_ID/buckets/prod-data
  │   ├── BigQuery projects/PROJECT_ID/datasets/prod_analytics
  │   ├── Cloud SQL instances (via project)
  │   └── Secret Manager (via project)

  ├── Allowed members:
  │   ├── Service account: sa-etl@project.iam.gserviceaccount.com
  │   ├── User group: data-admins@company.com
  │   └── Organization: company.com

  ├── Access levels:
  │   ├── "corporate_network": from corporate IPs
  │   └── "device_policy": from managed devices only

  └── Egress policies:
      ├── "ALLOW_EGRESS_TO_AWS": Allow to AWS Redshift
      └── "DENY_ALL_OTHER": Block everything else

Access Levels

Access Level: Condition for granting access

Example 1: Device Policy
  Access level "managed_devices"
    ├── Device OS: Linux, macOS, Windows (approved)
    ├── Device encryption: Required
    ├── Device security score: > 80
    └── Device updated: < 30 days old
  
  Effect: Only users on compliant devices can access data

Example 2: Network Policy
  Access level "corporate_network"
    ├── IP addresses: 203.0.113.0/24 (corporate)
    └── VPCs: prod-vpc, staging-vpc
  
  Effect: Only users from corporate network can access

Example 3: Time-based
  Access level "business_hours"
    ├── Time: 9 AM - 5 PM
    ├── Day: Monday - Friday
    └── Timezone: America/New_York
  
  Effect: Users can access data during business hours only
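
The three examples can be read as predicates over a request. The sketch below is a simplified model under stated assumptions: the `Request` fields are hypothetical, the local time is assumed to be already converted to America/New_York, and the device score and patch-age conditions are left out for brevity. It does not reflect how Access Context Manager represents or evaluates levels internally.

```python
import ipaddress
from dataclasses import dataclass
from datetime import datetime

CORPORATE_NET = ipaddress.ip_network("203.0.113.0/24")
APPROVED_OS = {"Linux", "macOS", "Windows"}


@dataclass
class Request:
    source_ip: str
    device_os: str
    device_encrypted: bool
    local_time: datetime  # assumed to be already converted to America/New_York


def corporate_network(req: Request) -> bool:
    return ipaddress.ip_address(req.source_ip) in CORPORATE_NET


def managed_devices(req: Request) -> bool:
    return req.device_encrypted and req.device_os in APPROVED_OS


def business_hours(req: Request) -> bool:
    # Monday is weekday 0; 9 AM inclusive to 5 PM exclusive.
    return req.local_time.weekday() < 5 and 9 <= req.local_time.hour < 17


def evaluate(req: Request) -> dict:
    """Report which access levels the request satisfies."""
    return {
        "corporate_network": corporate_network(req),
        "managed_devices": managed_devices(req),
        "business_hours": business_hours(req),
    }


office = Request("203.0.113.10", "macOS", True, datetime(2024, 3, 6, 10, 30))
weekend_home = Request("198.51.100.7", "Windows", False, datetime(2024, 3, 9, 10, 30))
print(evaluate(office))
print(evaluate(weekend_home))
```

Keeping each level as its own predicate mirrors the way a perimeter references levels by name, so a perimeter can require one level without inheriting the others.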

Preventing Data Exfiltration

Egress Policies: Control Who Can Access Data

Setup:

Service perimeter "prod_data":
  Protected: Cloud Storage "gs://prod-data"
  
  Egress policy (to allow specific principals):
    ALLOW_EGRESS_TO_IDENTITIES:
      - sa-etl@project.iam.gserviceaccount.com
      - group:data-admins@company.com
      - domain:company.com
    
  All other: DENY (implicit)

Scenario 1: ETL Service Account reads data
  Request: gsutil cp gs://prod-data/file.csv /tmp/
  Principal: sa-etl@project.iam.gserviceaccount.com
  
  VPC-SC check: Is sa-etl allowed?
    Answer: YES (in allowed identities)
  
  Result: ✓ Access granted

Scenario 2: Compromised app VM tries to read data
  Request: gsutil cp gs://prod-data/file.csv /tmp/
  Principal: app-vm@project.iam.gserviceaccount.com
  
  VPC-SC check: Is app-vm allowed?
    Answer: NO (not in allowed identities)
  
  Result: ✗ Access denied (even though network path exists)
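
The two scenarios reduce to an allow-list lookup with an implicit deny. A minimal sketch follows, assuming identities are matched by exact principal, by group membership, or by email domain. The `Principal` class and the way groups are supplied are hypothetical simplifications.

```python
from dataclasses import dataclass, field

ALLOWED_IDENTITIES = {
    "serviceAccount:sa-etl@project.iam.gserviceaccount.com",
    "group:data-admins@company.com",
    "domain:company.com",
}


@dataclass
class Principal:
    email: str
    kind: str  # "serviceAccount" or "user"
    groups: set = field(default_factory=set)


def perimeter_allows(principal: Principal) -> bool:
    """Allow only when the principal, one of its groups, or its domain is listed."""
    candidates = {f"{principal.kind}:{principal.email}"}
    candidates |= {f"group:{group}" for group in principal.groups}
    candidates.add(f"domain:{principal.email.split('@', 1)[1]}")
    # Anything that matches no entry falls through to the implicit deny.
    return bool(candidates & ALLOWED_IDENTITIES)


scenario_1 = perimeter_allows(
    Principal("sa-etl@project.iam.gserviceaccount.com", "serviceAccount")
)
scenario_2 = perimeter_allows(
    Principal("app-vm@project.iam.gserviceaccount.com", "serviceAccount")
)
print(scenario_1, scenario_2)
```

Note that the `domain:company.com` entry admits every user in the domain, so in practice it makes the narrower group entry redundant.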

Ingress Policies: Control How Data Comes In

Scenario: Prevent API access from internet

Service perimeter:
  Protected: BigQuery dataset "customer_pii"
  
  Ingress policy:
    ALLOW_FROM:
      - resources: projects/PROJECT_ID
      - identities: sa-analytics@project.iam.gserviceaccount.com
    
    ONLY_FROM_NETWORK:
      - prod-vpc (VPC Service Controls extension)
    
    DENY ALL ELSE

Scenario: User tries to query dataset from home
  Request: bq query --use_legacy_sql=false "SELECT * FROM customer_pii"
  Source network: Home ISP (not in prod-vpc)
  Principal: alice@company.com
  
  VPC-SC check: Is alice allowed from this network?
    Answer: NO (not from prod-vpc)
  
  Result: ✗ Query rejected
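
The ingress decision above combines two conditions: the request must come from an allowed network and from an allowed identity. A minimal sketch, assuming the source network is already known to the evaluator as a name; the `ApiRequest` type and the reason strings are hypothetical.

```python
from dataclasses import dataclass

ALLOWED_IDENTITIES = {"sa-analytics@project.iam.gserviceaccount.com"}
ALLOWED_NETWORKS = {"prod-vpc"}


@dataclass(frozen=True)
class ApiRequest:
    principal: str
    source_network: str


def ingress_decision(req: ApiRequest) -> tuple:
    """Return (allowed, reason); both the network and the identity must match."""
    if req.source_network not in ALLOWED_NETWORKS:
        return (False, "source network not allowed")
    if req.principal not in ALLOWED_IDENTITIES:
        return (False, "identity not allowed")
    return (True, "allowed")


alice_at_home = ingress_decision(ApiRequest("alice@company.com", "home-isp"))
analytics_job = ingress_decision(
    ApiRequest("sa-analytics@project.iam.gserviceaccount.com", "prod-vpc")
)
print(alice_at_home)
print(analytics_job)
```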

Implementation: Setting Up VPC-SC

Create Service Perimeter

bash
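# Define access level first (--basic-level-spec takes a YAML file of conditions;
# POLICY_ID is a placeholder for your access policy number)
cat > corporate_network.yaml <<'EOF'
- ipSubnetworks:
  - 203.0.113.0/24
EOF
gcloud access-context-manager levels create corporate_network \
  --title="Corporate Network" \
  --basic-level-spec=corporate_network.yaml \
  --policy=POLICY_ID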

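# Create service perimeter (projects are referenced by project number;
# POLICY_ID is a placeholder for your access policy number)
gcloud access-context-manager perimeters create prod_data \
  --title="Prod data" \
  --resources="projects/PROJECT_NUMBER" \
  --access-levels="corporate_network" \
  --description="Prod data protection" \
  --policy=POLICY_ID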
  
# Add restricted services (comma-separated in a single flag; a repeated flag
# would keep only the last value)
gcloud access-context-manager perimeters update prod_data \
  --add-restricted-services="storage.googleapis.com,bigquery.googleapis.com" \
  --policy=POLICY_ID

Configure Egress Policy

bash
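# Allow specific identities to use Cloud Storage and BigQuery across the
# perimeter boundary. Egress rules are supplied as a YAML file.
cat > egress.yaml <<'EOF'
- egressFrom:
    identities:
    - serviceAccount:sa-etl@project.iam.gserviceaccount.com
    - group:data-admins@company.com
  egressTo:
    resources:
    - "*"
    operations:
    - serviceName: storage.googleapis.com
      methodSelectors:
      - method: "*"
    - serviceName: bigquery.googleapis.com
      methodSelectors:
      - method: "*"
EOF

# --set-egress-policies replaces any egress rules already on the perimeter
# (POLICY_ID is a placeholder for your access policy number)
gcloud access-context-manager perimeters update prod_data \
  --set-egress-policies=egress.yaml \
  --policy=POLICY_ID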

Use Cases

Use Case 1: Preventing Data Theft by Disgruntled Employee

Scenario:
  Employee Alice (data-admin) leaves company (disgruntled)
  Has SA credentials for prod-data bucket
  Tries to download all data before leaving
  
Without VPC-SC:
  gsutil cp -r gs://prod-data/* /tmp/
  → Succeeds (firewall doesn't check identity)
  → Data exfiltrated to attacker
  
With VPC-SC:
  gsutil cp -r gs://prod-data/* /tmp/
  → Perimeter intercepts the request
  → Checks: Is the user in an allowed group?
  → Alice was removed from the group during offboarding
  → Request DENIED ✗
  → No data exfiltrated

Use Case 2: Preventing Insider Cross-Org Data Access

Scenario:
  Company A and Company B are partnering
  A's employees need to access B's Cloud Storage (for analytics)
  B wants to prevent A's employees from downloading all data
  
Setup (Company B):
  Service perimeter "customer_data":
    Protected: gs://customer-analytics
    
    Egress policy:
      ALLOW:
        - service-account: sa-company-a-access@company-b.iam.gserviceaccount.com
      
      Additional restriction:
        - Only allow BigQuery queries (NOT gsutil download)
        - Only specific columns (via IAM custom roles)
  
Result:
  Company A employees can query data (business need)
  Cannot bulk download (prevent exfiltration)
  Cannot access sensitive columns (fine-grained access)
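
The "queries yes, bulk download no" restriction can be sketched as a per-identity list of permitted services. This is a simplified model: the mapping below is hypothetical, and column-level restrictions (handled by IAM in the setup above) are out of scope.

```python
# Hypothetical mapping from identity to the API services it may call.
ALLOWED_OPERATIONS = {
    "sa-company-a-access@company-b.iam.gserviceaccount.com": {
        "bigquery.googleapis.com",
    },
}


def operation_allowed(principal: str, service: str) -> bool:
    """Permit a call only if the service is listed for this principal."""
    return service in ALLOWED_OPERATIONS.get(principal, set())


partner_sa = "sa-company-a-access@company-b.iam.gserviceaccount.com"
can_query = operation_allowed(partner_sa, "bigquery.googleapis.com")
can_download = operation_allowed(partner_sa, "storage.googleapis.com")
print(can_query, can_download)
```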

Use Case 3: Compliance: Preventing Exfiltration to Non-Approved Cloud

Scenario:
  Company policy: Data never leaves Google Cloud
  Employees have personal AWS accounts
  Must prevent exfiltration to AWS
  
Setup:
  Service perimeter "compliance":
    Protected: All prod projects
    
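    Egress policy:
      ALLOW_EGRESS_TO:
        - None (with no egress rules, Google API calls cannot move data to
          projects outside the perimeter, e.g. personal projects)

    Firewall / egress proxy (VPC-SC evaluates Google API calls only):
      DENY_EGRESS_TO:
        - AWS endpoints (*.amazonaws.com)
        - Azure endpoints (*.blob.core.windows.net)
        - Any non-Google service
  
Result:
  Even if an employee account is compromised, data cannot leave Google Cloud
  Compliance requirement met (data residency)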
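
A minimal sketch of the destination matching in this setup, assuming hostname patterns are evaluated as shell-style wildcards with deny taking precedence. The `*.googleapis.com` allow pattern is an assumption standing in for "Google Cloud APIs", and the function name is hypothetical. Where the check is enforced (perimeter, firewall, or egress proxy) is outside the scope of the sketch.

```python
from fnmatch import fnmatchcase

ALLOW_PATTERNS = ["*.googleapis.com"]  # assumption: stands in for "Google Cloud APIs"
DENY_PATTERNS = ["*.amazonaws.com", "*.blob.core.windows.net"]


def egress_allowed(hostname: str) -> bool:
    """Deny wins over allow; hosts matching neither list are denied by default."""
    host = hostname.lower()
    if any(fnmatchcase(host, pattern) for pattern in DENY_PATTERNS):
        return False
    return any(fnmatchcase(host, pattern) for pattern in ALLOW_PATTERNS)


for host in [
    "storage.googleapis.com",
    "s3.amazonaws.com",
    "backup.blob.core.windows.net",
    "files.example.com",
]:
    print(host, egress_allowed(host))
```

The default-deny branch is what covers "any non-Google service": a destination does not need to be on the deny list to be blocked.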

Troubleshooting VPC-SC

Symptom: Access Denied (Unexpected)

Error: (gcloud.compute.instances.create)
  User does not have permission...

Diagnosis:

1. Check whether the service is restricted by the perimeter:
   gcloud access-context-manager perimeters describe prod_data \
     --policy=POLICY_ID \
     --format="value(status.restrictedServices)"
   
   If the service is listed: it is protected by VPC-SC

2. Check whether the identity is in an ingress or egress rule:
   gcloud access-context-manager perimeters describe prod_data \
     --policy=POLICY_ID \
     --format="yaml(status.ingressPolicies,status.egressPolicies)"
   
   Check: Is your identity/group listed?
   
3. Check access level conditions:
   gcloud access-context-manager levels describe corporate_network \
     --policy=POLICY_ID \
     --format="yaml(basic)"
   
   Verify: Are you meeting the conditions? (IP, device, etc.)

4. Read VPC-SC denial logs (policy-denied audit logs):
   gcloud logging read \
     'protoPayload.metadata.@type="type.googleapis.com/google.cloud.audit.VpcServiceControlAuditMetadata"' \
     --limit=20

5. Remediation:
   - Add identity to allowed list
   - Update access level conditions
   - Request exception (if legitimate)
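
When many denials come back from the logs, it helps to reduce each entry to who was denied, for which service, and why. The sketch below works on entries already exported as JSON (for example with `--format=json`). The metadata type string and the `violationReason` field are assumptions about the audit log layout, and the sample entries are fabricated placeholders for illustration only, not real log output.

```python
VPC_SC_METADATA_TYPE = (
    "type.googleapis.com/google.cloud.audit.VpcServiceControlAuditMetadata"
)


def vpc_sc_denials(entries: list) -> list:
    """Keep only VPC-SC denials and summarize principal, service, and reason."""
    summaries = []
    for entry in entries:
        payload = entry.get("protoPayload", {})
        metadata = payload.get("metadata", {})
        if metadata.get("@type") != VPC_SC_METADATA_TYPE:
            continue  # unrelated audit log entry
        summaries.append({
            "principal": payload.get("authenticationInfo", {}).get("principalEmail"),
            "service": payload.get("serviceName"),
            "reason": metadata.get("violationReason"),
        })
    return summaries


# Hypothetical sample entries; field values are illustrative only.
sample_entries = [
    {"protoPayload": {
        "serviceName": "storage.googleapis.com",
        "authenticationInfo": {"principalEmail": "app-vm@project.iam.gserviceaccount.com"},
        "metadata": {"@type": VPC_SC_METADATA_TYPE,
                     "violationReason": "NO_MATCHING_ACCESS_LEVEL"},
    }},
    {"protoPayload": {"serviceName": "compute.googleapis.com", "metadata": {}}},
]
denials = vpc_sc_denials(sample_entries)
print(denials)
```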

Best Practices

Do:

  • Use VPC-SC for sensitive data (PII, secrets, customer data)
  • Combine with firewall rules (defense in depth)
  • Regularly audit perimeter membership
  • Test access policies in staging first
  • Document business case for each perimeter

Don't:

  • Over-engineer (not all data needs VPC-SC)
  • Treat VPC-SC as a replacement for firewall rules (both are needed)
  • Forget to offboard employees (remove from groups)
  • Assume VPC-SC prevents all exfiltration (it is one layer of defense in depth)
  • Use a single access level for all perimeters (too permissive)

Comparison: VPC-SC vs Firewall

yaml
Firewall Rules:
  - Scope: Network level (IP, port, protocol)
  - Prevents: Unauthorized network access
  - Cannot distinguish: Which user, which data
  - Latency: <1ms
  
VPC Service Controls:
  - Scope: Identity level (user, SA, group)
  - Prevents: Unauthorized data access
  - Can distinguish: Which user, which data, which identity
  - Latency: Added per API request (varies; measure in your environment)

Both together:
  1. Firewall allows network path (IP, port)
  2. VPC-SC checks identity (user, group)
  3. Data accessed only if both allow
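
The three steps above amount to a logical AND of two independent checks. A minimal sketch, assuming the firewall check sees only network attributes and the perimeter check sees only the caller's identity; all names and sample values are hypothetical.

```python
import ipaddress

ALLOWED_SOURCE = ipaddress.ip_network("10.0.1.0/24")
ALLOWED_IDENTITIES = {"sa-etl@project.iam.gserviceaccount.com"}


def firewall_allows(source_ip: str, port: int) -> bool:
    """Step 1: network path (IP, port)."""
    return port == 443 and ipaddress.ip_address(source_ip) in ALLOWED_SOURCE


def vpc_sc_allows(principal: str) -> bool:
    """Step 2: identity check."""
    return principal in ALLOWED_IDENTITIES


def can_access_data(source_ip: str, port: int, principal: str) -> bool:
    """Step 3: data is reachable only if both layers allow the request."""
    return firewall_allows(source_ip, port) and vpc_sc_allows(principal)


etl_job = can_access_data("10.0.1.15", 443, "sa-etl@project.iam.gserviceaccount.com")
compromised_vm = can_access_data("10.0.1.99", 443, "app-vm@project.iam.gserviceaccount.com")
stolen_key_offsite = can_access_data("198.51.100.4", 443, "sa-etl@project.iam.gserviceaccount.com")
print(etl_job, compromised_vm, stolen_key_offsite)
```

Each layer blocks a case the other misses: the compromised VM passes the firewall but fails the identity check, while the stolen key used off-network passes the identity check but fails the firewall.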

Conclusion

VPC Service Controls provides identity-based data protection:

  • Perimeter model: Define what data is protected
  • Access control: Who can access (identities, groups)
  • Exfiltration prevention: Block unauthorized egress
  • Compliance: Data residency, access audit

Essential for: Sensitive data (PII, customer records, compliance-critical).

Complements: Firewall (network level), IAM (identity level).

Not needed: Non-sensitive data, single-project simple scenarios.