VPC Service Controls — Preventing Data Exfiltration at Scale
Executive Summary
VPC Service Controls = a security perimeter that controls data exfiltration (it does not operate at the network level).
Model:
- ✅ Define protected resources (databases, buckets, projects)
- ✅ Allow access from specific identities only
- ✅ Prevent exfiltration to unauthorized destinations
- ✅ Works across network boundaries (not just VPC)
- ❌ Enforcement happens at the API layer (L7), not in the packet path (higher latency than a firewall)
- ❌ Complexity: Requires careful access policy design
Network Security ≠ Data Security
Traditional Firewall Approach
Firewall rules: ALLOW tcp:443 from 10.0.1.0/24
├── Allows: app-tier → Cloud Storage
├── But also allows: malicious VM in app-tier → Cloud Storage
├── Problem: Cannot distinguish good vs bad traffic
└── Result: Data accessible to any VM in app-tier
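The limitation above can be sketched in a few lines. This is a simplified model of the rule `ALLOW tcp:443 from 10.0.1.0/24`, not real firewall code; the function name and the sample IP addresses are assumptions made for illustration.

```python
import ipaddress

# Hypothetical model of the rule: ALLOW tcp:443 from 10.0.1.0/24
ALLOWED_SOURCE = ipaddress.ip_network("10.0.1.0/24")
ALLOWED_PORT = 443


def firewall_allows(source_ip: str, protocol: str, port: int) -> bool:
    """Match on source range, protocol and port only; identity is never consulted."""
    return (
        protocol == "tcp"
        and port == ALLOWED_PORT
        and ipaddress.ip_address(source_ip) in ALLOWED_SOURCE
    )


# Both VMs sit in the app tier, so the rule cannot tell them apart.
healthy_vm = firewall_allows("10.0.1.15", "tcp", 443)
compromised_vm = firewall_allows("10.0.1.99", "tcp", 443)
outside_vm = firewall_allows("10.0.2.7", "tcp", 443)
```

The rule has no input that describes who is calling or which data is requested, so the healthy and the compromised VM get the same answer.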
Implication:
If one VM is compromised, the attacker can exfiltrate data
The firewall cannot detect or prevent this
VPC Service Controls Approach
Service perimeter: "Prod Data"
Protected resources:
- Cloud Storage bucket "prod-data"
- BigQuery dataset "prod-analytics"
Allowed members:
- service-account@prod-analytics.iam.gserviceaccount.com
- group:data-engineers@company.com
Implication:
Even if an app-tier VM is compromised, the attacker cannot:
- Use prod SA to access data (SA not available in VM)
- Use user account (user not in allowed group)
- Exfiltrate data to unauthorized service
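The identity-based check described here can be pictured with a small model. It is a sketch under stated assumptions: the `Perimeter` class, the resource labels and the `app-vm` account are hypothetical, and the real service evaluates far more context than a set lookup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Perimeter:
    name: str
    protected: frozenset        # resource labels (hypothetical naming)
    allowed_members: frozenset  # identities allowed to reach protected resources


PROD_DATA = Perimeter(
    name="Prod Data",
    protected=frozenset({"bucket:prod-data", "dataset:prod-analytics"}),
    allowed_members=frozenset({
        "service-account@prod-analytics.iam.gserviceaccount.com",
        "group:data-engineers@company.com",
    }),
)


def api_request_allowed(perimeter, principal, groups, resource):
    """Allow access to a protected resource only for listed identities."""
    if resource not in perimeter.protected:
        return True  # not protected by this perimeter, so this check does not apply
    identities = {principal} | {f"group:{g}" for g in groups}
    return bool(identities & perimeter.allowed_members)


etl_ok = api_request_allowed(
    PROD_DATA, "service-account@prod-analytics.iam.gserviceaccount.com", [], "bucket:prod-data")
engineer_ok = api_request_allowed(
    PROD_DATA, "bob@company.com", ["data-engineers@company.com"], "dataset:prod-analytics")
app_vm_ok = api_request_allowed(
    PROD_DATA, "app-vm@project.iam.gserviceaccount.com", [], "bucket:prod-data")
```

Unlike the firewall rule, the decision depends on who is calling and which resource is requested, not on where the packet came from.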
Result: Data protected at the API level, not just the network level
Architecture: Service Perimeter
Defining Perimeter
Service perimeter "prod_data_perimeter":
├── Protected resources (a perimeter lists whole projects; restricted services apply to everything inside them):
│ ├── Cloud Storage bucket prod-data (in projects/PROJECT_ID)
│ ├── BigQuery dataset prod_analytics (in projects/PROJECT_ID)
│ ├── Cloud SQL instances (via project)
│ └── Secret Manager (via project)
│
├── Allowed members:
│ ├── Service account: sa-etl@project.iam.gserviceaccount.com
│ ├── User group: data-admins@company.com
│ └── Organization: company.com
│
├── Access levels:
│ ├── "corporate_network": from corporate IPs
│ └── "device_policy": from managed devices only
│
└── Egress policies:
    ├── "ALLOW_EGRESS_TO_AWS": Allow to AWS Redshift
    └── "DENY_ALL_OTHER": Block everything else
Access Levels
Access Level: Condition for granting access
Example 1: Device Policy
Access level "managed_devices"
├── Device OS: Linux, macOS, Windows (approved)
├── Device encryption: Required
├── Device security score: > 80
└── Device updated: < 30 days old
Effect: Only users on compliant devices can access data
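A device condition like this one reduces to a conjunction of attribute checks. The sketch below mirrors the four conditions listed above; the `Device` class and its field names are assumptions for illustration and do not correspond to actual Access Context Manager attribute names.

```python
from dataclasses import dataclass

APPROVED_OS = {"Linux", "macOS", "Windows"}


@dataclass(frozen=True)
class Device:
    os: str
    encrypted: bool
    security_score: int     # hypothetical 0-100 score
    days_since_update: int


def meets_managed_devices(device: Device) -> bool:
    """Every condition must hold; failing any one denies the access level."""
    return (
        device.os in APPROVED_OS
        and device.encrypted
        and device.security_score > 80
        and device.days_since_update < 30
    )


compliant = meets_managed_devices(Device("macOS", True, 92, 3))
unencrypted = meets_managed_devices(Device("macOS", False, 92, 3))
stale = meets_managed_devices(Device("Linux", True, 95, 45))
```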
Example 2: Network Policy
Access level "corporate_network"
├── IP addresses: 203.0.113.0/24 (corporate)
└── VPCs: prod-vpc, staging-vpc
Effect: Only users from corporate network can access
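The network condition can be sketched the same way. This assumes that matching either the corporate IP range or one of the listed VPCs is enough to satisfy the level; the function name and the sample addresses are hypothetical.

```python
import ipaddress

CORPORATE_RANGES = [ipaddress.ip_network("203.0.113.0/24")]
ALLOWED_VPCS = {"prod-vpc", "staging-vpc"}


def meets_corporate_network(source_ip=None, source_vpc=None) -> bool:
    """Satisfied by a corporate public IP or by a request from an allowed VPC."""
    if source_vpc is not None and source_vpc in ALLOWED_VPCS:
        return True
    if source_ip is not None:
        addr = ipaddress.ip_address(source_ip)
        return any(addr in net for net in CORPORATE_RANGES)
    return False


from_office = meets_corporate_network(source_ip="203.0.113.40")
from_home = meets_corporate_network(source_ip="198.51.100.7")
from_prod_vpc = meets_corporate_network(source_vpc="prod-vpc")
```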
Example 3: Time-based
Access level "business_hours"
├── Time: 9 AM - 5 PM
├── Day: Monday - Friday
└── Timezone: America/New_York
Effect: Users can access data during business hours only
Preventing Data Exfiltration
Egress Policies: Control Who Can Access Data
Setup:
Service perimeter "prod_data":
Protected: Cloud Storage "gs://prod-data"
Egress policy (to allow specific principals):
ALLOW_EGRESS_TO_IDENTITIES:
- sa-etl@project.iam.gserviceaccount.com
- group:data-admins@company.com
- domain:company.com
All other: DENY (implicit)
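The allow-list with an implicit deny can be modelled as a set intersection. This is a minimal sketch of the setup above, assuming a principal matches if its email, one of its groups, or its email domain appears in the list; the function name is hypothetical.

```python
ALLOWED_IDENTITIES = {
    "sa-etl@project.iam.gserviceaccount.com",
    "group:data-admins@company.com",
    "domain:company.com",
}


def vpc_sc_check(principal: str, groups: tuple = ()) -> bool:
    """Return True only when some form of the caller's identity is listed."""
    candidates = {principal}
    candidates.update(f"group:{g}" for g in groups)
    candidates.add("domain:" + principal.rsplit("@", 1)[-1])
    # Anything that matches no entry falls through to the implicit deny.
    return not candidates.isdisjoint(ALLOWED_IDENTITIES)


etl_allowed = vpc_sc_check("sa-etl@project.iam.gserviceaccount.com")
app_vm_allowed = vpc_sc_check("app-vm@project.iam.gserviceaccount.com")
employee_allowed = vpc_sc_check("alice@company.com")
```

Note that there is no explicit deny entry: an identity is rejected simply because nothing in the list matches it.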
Scenario 1: ETL Service Account reads data
Request: gsutil cp gs://prod-data/file.csv /tmp/
Principal: sa-etl@project.iam.gserviceaccount.com
VPC-SC check: Is sa-etl allowed?
Answer: YES (in allowed identities)
Result: ✓ Access granted
Scenario 2: Compromised app VM tries to read data
Request: gsutil cp gs://prod-data/file.csv /tmp/
Principal: app-vm@project.iam.gserviceaccount.com
VPC-SC check: Is app-vm allowed?
Answer: NO (not in allowed identities)
Result: ✗ Access denied (even though a network path exists)
Ingress Policies: Control How Data Comes In
Scenario: Prevent API access from internet
Service perimeter:
Protected: BigQuery dataset "customer_pii"
Ingress policy:
ALLOW_FROM:
- resources: projects/PROJECT_ID
- identities: sa-analytics@project.iam.gserviceaccount.com
ONLY_FROM_NETWORK:
- prod-vpc (VPC Service Controls extension)
DENY ALL ELSE
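The ingress policy above combines two conditions, and both must pass. A sketch of that evaluation, assuming the network is checked before the identity; the class, function and reason strings are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

ALLOWED_IDENTITIES = {"sa-analytics@project.iam.gserviceaccount.com"}
ALLOWED_NETWORKS = {"prod-vpc"}


@dataclass(frozen=True)
class IngressRequest:
    principal: str
    source_network: str


def ingress_decision(req: IngressRequest) -> tuple[bool, str]:
    """Return (allowed, reason); any failed condition ends in a deny."""
    if req.source_network not in ALLOWED_NETWORKS:
        return False, "source network not allowed"
    if req.principal not in ALLOWED_IDENTITIES:
        return False, "identity not allowed"
    return True, "allowed"


analytics_job = ingress_decision(
    IngressRequest("sa-analytics@project.iam.gserviceaccount.com", "prod-vpc"))
from_home = ingress_decision(IngressRequest("alice@company.com", "home-isp"))
```

Returning a reason alongside the decision makes the deny easier to trace, which is the same information the troubleshooting steps later try to recover from logs.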
Scenario: User tries to query dataset from home
Request: bq query --use_legacy_sql=false "SELECT * FROM customer_pii.TABLE_NAME"
Source network: Home ISP (not in prod-vpc)
Principal: alice@company.com
VPC-SC check: Is alice allowed from this network?
Answer: NO (not from prod-vpc)
Result: ✗ Query rejected
Implementation: Setting Up VPC-SC
Create Service Perimeter
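```bash
# POLICY_ID is a placeholder for your access policy ID.
# Define the access level first; --basic-level-spec takes a YAML file of conditions.
cat > corporate_network.yaml <<'EOF'
- ipSubnetworks:
  - 203.0.113.0/24
EOF
gcloud access-context-manager levels create corporate_network \
  --title="Corporate Network" \
  --basic-level-spec=corporate_network.yaml \
  --policy=POLICY_ID

# Create the service perimeter (projects are referenced by project number)
gcloud access-context-manager perimeters create prod_data \
  --title="Prod Data" \
  --resources="projects/PROJECT_NUMBER" \
  --access-levels="corporate_network" \
  --description="Prod data protection" \
  --policy=POLICY_ID

# Add restricted services (one comma-separated list)
gcloud access-context-manager perimeters update prod_data \
  --add-restricted-services="storage.googleapis.com,bigquery.googleapis.com" \
  --policy=POLICY_ID
```
Configure Egress Policy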
```bash
# Allow specific identities to use these services across the perimeter boundary.
# Egress rules are supplied as a YAML file.
cat > egress.yaml <<'EOF'
- egressFrom:
    identities:
    - serviceAccount:sa-etl@project.iam.gserviceaccount.com
    - group:data-admins@company.com
  egressTo:
    resources:
    - "*"
    operations:
    - serviceName: storage.googleapis.com
      methodSelectors:
      - method: "*"
    - serviceName: bigquery.googleapis.com
      methodSelectors:
      - method: "*"
EOF
gcloud access-context-manager perimeters update prod_data \
  --set-egress-policies=egress.yaml
```
Use Cases
Use Case 1: Preventing Data Theft by Disgruntled Employee
Scenario:
Employee Alice (data-admin) is leaving the company on bad terms
She has SA credentials for the prod-data bucket
She tries to download all data before leaving
Without VPC-SC:
gsutil cp -r gs://prod-data/* /tmp/
→ Succeeds (firewall doesn't check identity)
→ Data exfiltrated by Alice
With VPC-SC:
gsutil cp -r gs://prod-data/* /tmp/
→ Perimeter intercepts the request
→ Checks: Is the user in an allowed group?
→ Alice was removed from the group during offboarding
→ Request DENIED ✗
→ No data exfiltrated
Use Case 2: Preventing Insider Cross-Org Data Access
Scenario:
Company A and Company B are partnering
A's employees need to access B's Cloud Storage (for analytics)
B wants to prevent A's employees from downloading all data
Setup (Company B):
Service perimeter "customer_data":
Protected: gs://customer-analytics
Egress policy:
ALLOW:
- service-account: sa-company-a-access@company-b.iam.gserviceaccount.com
Additional restriction:
- Only allow BigQuery queries (NOT gsutil download)
- Only specific columns (via IAM custom roles)
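The service-level part of this restriction can be sketched as an allow-list keyed by identity and API. This only models the "BigQuery yes, Cloud Storage no" rule from the setup above; the function name is hypothetical, and the column-level restriction is left out because it is handled outside the perimeter.

```python
PARTNER_SA = "sa-company-a-access@company-b.iam.gserviceaccount.com"

# (identity, service) pairs the partner may use; everything else is denied.
ALLOWED_OPERATIONS = {
    (PARTNER_SA, "bigquery.googleapis.com"),
}


def partner_call_allowed(principal: str, service: str) -> bool:
    """Allow a call only when the exact identity and service pair is listed."""
    return (principal, service) in ALLOWED_OPERATIONS


query_allowed = partner_call_allowed(PARTNER_SA, "bigquery.googleapis.com")
download_allowed = partner_call_allowed(PARTNER_SA, "storage.googleapis.com")
```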
Result:
Company A employees can query data (business need)
Cannot bulk download (prevent exfiltration)
Cannot access sensitive columns (fine-grained access)
Use Case 3: Compliance: Preventing Exfiltration to Non-Approved Cloud
Scenario:
Company policy: Data never leaves Google Cloud
Employees have personal AWS accounts
Must prevent exfiltration to AWS
Setup:
Service perimeter "compliance":
Protected: All prod projects
Egress policy:
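ALLOW_EGRESS_TO:
- Only Google Cloud APIs on projects inside the perimeter
DENY_EGRESS_TO:
- AWS endpoints (*.amazonaws.com), enforced with firewall or proxy egress rules because VPC-SC only governs Google API calls
- Azure endpoints (*.blob.core.windows.net), enforced the same way
- Any non-Google service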
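The allow/deny lists above amount to a destination allow-list. The sketch below models that logic as a hostname suffix check; it illustrates the policy intent only, assumes `googleapis.com` is the single approved suffix, and is closer to what an egress proxy or firewall would evaluate than to how VPC-SC itself works.

```python
APPROVED_SUFFIX = "googleapis.com"  # assumption: the only approved destination domain


def egress_allowed(host: str) -> bool:
    """Allow a destination only if it is the approved domain or a subdomain of it."""
    host = host.lower().rstrip(".")
    return host == APPROVED_SUFFIX or host.endswith("." + APPROVED_SUFFIX)


to_gcs = egress_allowed("storage.googleapis.com")
to_aws = egress_allowed("s3.amazonaws.com")
to_azure = egress_allowed("account.blob.core.windows.net")
lookalike = egress_allowed("evilgoogleapis.com")
```

Matching on `"." + suffix` rather than the bare suffix is what keeps a look-alike domain from slipping through.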
Result:
Even if an employee account is compromised, data cannot leave Google Cloud
Compliance requirement met (data residency)
Troubleshooting VPC-SC
Symptom: Access Denied (Unexpected)
Error: (gcloud.compute.instances.create)
User does not have permission...
Diagnosis:
1. Check if the service is restricted by the perimeter:
gcloud access-context-manager perimeters describe prod_data \
--format="value(status.restrictedServices)"
If the service is listed, it is protected by VPC-SC
2. Check if the identity is in the allowed list:
gcloud access-context-manager perimeters describe prod_data \
--format="value(status.egressPolicies)"
Check: Is your identity/group listed?
3. Check access level conditions:
gcloud access-context-manager levels describe corporate_network \
--format="value(basic)"
Verify: Are you meeting the conditions? (IP, device, etc.)
4. Read the VPC-SC audit logs (denials are recorded as policy-denied audit entries):
gcloud logging read \
'protoPayload.metadata.@type="type.googleapis.com/google.cloud.audit.VpcServiceControlAuditMetadata"' \
--limit=20
5. Remediation:
- Add identity to allowed list
- Update access level conditions
- Request exception (if legitimate)
Best Practices
✅ Do:
- Use VPC-SC for sensitive data (PII, secrets, customer data)
- Combine with firewall rules (defense in depth)
- Regularly audit perimeter membership
- Test access policies in staging first
- Document business case for each perimeter
❌ Don't:
- Over-engineer (not all data needs VPC-SC)
- Confuse VPC-SC with firewall rules (they solve different problems; both are needed)
- Forget to offboard employees (remove them from groups)
- Assume VPC-SC prevents all exfiltration (it is one layer of defense in depth)
- Use single access level for all perimeters (too permissive)
Comparison: VPC-SC vs Firewall
```yaml
Firewall Rules:
  - Scope: Network level (IP, port, protocol)
  - Prevents: Unauthorized network access
  - Cannot distinguish: Which user, which data
  - Latency: <1ms
VPC Service Controls:
  - Scope: Identity level (user, SA, group)
  - Prevents: Unauthorized data access
  - Can distinguish: Which user, which data, which identity
  - Latency: 50-200ms
Both together:
  1. Firewall allows network path (IP, port)
  2. VPC-SC checks identity (user, group)
  3. Data accessed only if both allow
```
Conclusion
VPC Service Controls provides identity-based data protection:
- Perimeter model: Define what data is protected
- Access control: Who can access (identities, groups)
- Exfiltration prevention: Block unauthorized egress
- Compliance: Data residency, access audit
Essential for: Sensitive data (PII, customer records, compliance-critical).
Complements: Firewall (network level), IAM (identity level).
Not needed: Non-sensitive data, single-project simple scenarios.