
Network Intelligence Center — Unified Network Observability

Executive Summary

Network Intelligence Center (NIC) = Google Cloud's managed dashboard for VPC troubleshooting.

Features:

  • ✅ Topology visualization (VPC, subnets, VMs, peering)
  • ✅ Connectivity tests (source/destination IP, protocol, and port: layers 3-4)
  • ✅ Performance monitoring (latency, packet loss)
  • ✅ Firewall insights (rule analysis, optimization)
  • ✅ Works alongside Packet Mirroring / tcpdump for packet capture (NIC itself does not capture packets)

Accessing Network Intelligence Center

Opening NIC

GCP Console → VPC Network → Network Intelligence Center

Or via gcloud:
gcloud network-management connectivity-tests list

Or via API:
GET https://networkmanagement.googleapis.com/v1/projects/PROJECT_ID/locations/global/connectivityTests

Core Features

1. Topology Visualization

Display: VPC layout
  ├── VPC "prod-vpc"
  │   ├── Subnet "prod-app" (us-central1)
  │   │   ├── VM "app-1"
  │   │   └── VM "app-2"
  │   ├── Subnet "prod-db" (us-central1)
  │   │   └── VM "db-1"
  │   ├── Cloud Router (BGP)
  │   └── Cloud NAT

  ├── Peering "prod-staging-peer"
  │   └── VPC "staging-vpc"

  └── Cloud VPN (to on-prem)
      └── On-premises network 192.168.0.0/16

Interactive:
  - Click VM → see network details
  - Click route → see routing path
  - Click firewall rule → see applicability

2. Connectivity Tests

Test endpoint-to-endpoint reachability:

Source: app-1 (10.0.1.5)
Target: db-1 (10.0.2.10), port 3306
Protocol: TCP

Test runs:
  Step 1: Source VM reachable? YES
  Step 2: Route exists? YES (10.0.2.0/24 local)
  Step 3: Firewall allows? YES
  Step 4: Destination VM reachable? YES

Result: CONNECTIVITY OK ✓
Latency: 1.2ms
Hops: 1 (same VPC; different subnets, direct subnet route)

Or if failure:
  Step 3: Firewall blocks tcp:3306 from app-tier?
  Result: FIREWALL BLOCKING ✗
  Recommendation: Add firewall rule or check source IP
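
The step order above (route lookup, then firewall evaluation) can be sketched offline. This is a simplified model, not how Connectivity Tests is implemented: `FirewallRule`, `route_exists`, `firewall_verdict`, and `run_test` are hypothetical names, rules match on source range, protocol, and a single port only, and the fallback models GCP's implied deny-ingress rule.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class FirewallRule:
    name: str
    priority: int       # lower number = evaluated first
    action: str         # "allow" or "deny"
    source_range: str   # CIDR, e.g. "10.0.1.0/24"
    protocol: str       # e.g. "tcp"
    port: int


def route_exists(dest_ip, routes):
    """Step 2: is the destination covered by any route prefix?"""
    dest = ipaddress.ip_address(dest_ip)
    return any(dest in ipaddress.ip_network(prefix) for prefix in routes)


def firewall_verdict(src_ip, protocol, port, rules):
    """Step 3: first matching rule in priority order decides."""
    src = ipaddress.ip_address(src_ip)
    for rule in sorted(rules, key=lambda r: r.priority):
        if (src in ipaddress.ip_network(rule.source_range)
                and rule.protocol == protocol and rule.port == port):
            return rule.action, rule.name
    # No rule matched: fall through to the implied deny for ingress.
    return "deny", "implied-deny-ingress"


def run_test(src_ip, dest_ip, protocol, port, routes, rules):
    if not route_exists(dest_ip, routes):
        return "NO ROUTE"
    action, rule_name = firewall_verdict(src_ip, protocol, port, rules)
    if action != "allow":
        return f"FIREWALL BLOCKING ({rule_name})"
    return "CONNECTIVITY OK"


routes = ["10.0.1.0/24", "10.0.2.0/24"]
allow_db = FirewallRule("allow-db", 1000, "allow", "10.0.1.0/24", "tcp", 3306)
print(run_test("10.0.1.5", "10.0.2.10", "tcp", 3306, routes, [allow_db]))
print(run_test("10.0.1.5", "10.0.2.10", "tcp", 3306, routes, []))
```

The second call has no matching rule, so it reports the firewall as the blocking stage, matching the failure case above.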

3. Performance Insights

Metrics displayed:
  - Latency (RTT) per path
  - Packet loss percentage
  - Jitter (latency variance)
  - Bandwidth utilization
  - Error rates (ICMP unreachable, timeout)

Time series:
  - Last hour
  - Last day
  - Last week
  - Custom range

Alerts:
  - High latency (> 100ms)
  - Packet loss (> 0.1%)
  - Connection errors (> 10 per minute)
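
The alert thresholds above amount to a simple comparison per metric. A minimal sketch, assuming samples arrive as plain dicts; the key names `latency_ms`, `packet_loss_pct`, and `errors_per_min` and the function `breached` are illustrative, not NIC or Cloud Monitoring identifiers:

```python
# Thresholds taken from the alert list above.
THRESHOLDS = {
    "latency_ms": 100.0,       # high latency
    "packet_loss_pct": 0.1,    # packet loss
    "errors_per_min": 10,      # connection errors
}


def breached(sample: dict) -> list:
    """Return the names of metrics whose value exceeds its threshold."""
    return [
        name for name, limit in THRESHOLDS.items()
        if sample.get(name, 0) > limit
    ]


sample = {"latency_ms": 150, "packet_loss_pct": 0.05, "errors_per_min": 12}
print(breached(sample))
```

A missing metric is treated as 0 here, so it never fires; a production check would more likely treat missing data as its own alert condition.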

4. Firewall Insights

Firewall rule analysis:

Show:
  - Rules by type (ingress/egress)
  - Rules by action (allow/deny)
  - Redundant rules (can be consolidated)
  - Unused rules (safe to delete)
  - Shadowed rules (a higher-priority rule already matches the same traffic, so the lower-priority rule never applies)

Recommendations:
  1. Rule "allow-all" unused → DELETE
  2. Rules 10 and 20 overlap → CONSOLIDATE
  3. Rule at priority 50 shadows rule at priority 100 (lower number = higher priority) → REORDER priorities
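
To make the shadowing idea concrete, here is a rough sketch of how shadowed rules could be detected. It is not the algorithm Firewall Insights uses: `Rule` and `shadowed_by` are hypothetical, and the comparison covers only source range and ports (direction, targets, and protocol are ignored for brevity).

```python
import ipaddress
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    name: str
    priority: int          # lower number = higher priority
    action: str            # "allow" or "deny"
    source_range: str      # CIDR
    ports: frozenset       # set of destination ports


def shadowed_by(rule: Rule, others: list) -> Optional[str]:
    """Return the name of a higher-priority rule that fully covers `rule`."""
    net = ipaddress.ip_network(rule.source_range)
    for other in others:
        # Only strictly higher-priority rules can shadow.
        if other.name == rule.name or other.priority >= rule.priority:
            continue
        covers_range = net.subnet_of(ipaddress.ip_network(other.source_range))
        covers_ports = rule.ports <= other.ports
        if covers_range and covers_ports:
            return other.name
    return None


rules = [
    Rule("deny-10-net", 50, "deny", "10.0.0.0/8", frozenset({3306, 5432})),
    Rule("allow-app-db", 100, "allow", "10.0.1.0/24", frozenset({3306})),
]
for r in rules:
    print(r.name, "shadowed by", shadowed_by(r, rules))
```

In this example the allow rule at priority 100 never takes effect, because the deny rule at priority 50 already matches all of its traffic.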

Cost impact:
  Each rule = processing overhead
  Fewer rules = faster evaluation

Connectivity Tests: Deep Dive

Creating Test

gcloud network-management connectivity-tests create test-app-to-db \
  --source-instance=projects/PROJECT_ID/zones/us-central1-a/instances/app-1 \
  --destination-instance=projects/PROJECT_ID/zones/us-central1-a/instances/db-1 \
  --protocol=TCP \
  --destination-port=3306

# Or via GCP Console:
# VPC Network → Network Intelligence Center → Connectivity Tests → CREATE
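
The same test can also be created through the REST endpoint listed under "Or via API". The sketch below only builds the URL and JSON body and sends nothing. The field names (`source.instance`, `destination.port`, `protocol`) and the `testId` query parameter reflect my reading of the v1 ConnectivityTest resource and should be checked against the API reference; `instance_uri` and `build_create_request` are hypothetical helpers.

```python
import json

API_ROOT = "https://networkmanagement.googleapis.com/v1"


def instance_uri(project: str, zone: str, name: str) -> str:
    """Resource path for a Compute Engine instance."""
    return f"projects/{project}/zones/{zone}/instances/{name}"


def build_create_request(project, test_id, src, dst, port, protocol="TCP"):
    """Return (url, json_body) for a POST that would create the test."""
    url = (f"{API_ROOT}/projects/{project}/locations/global/"
           f"connectivityTests?testId={test_id}")
    body = {
        "source": {"instance": src},
        "destination": {"instance": dst, "port": port},
        "protocol": protocol,
    }
    return url, json.dumps(body, indent=2)


url, body = build_create_request(
    "PROJECT_ID",
    "test-app-to-db",
    instance_uri("PROJECT_ID", "us-central1-a", "app-1"),
    instance_uri("PROJECT_ID", "us-central1-a", "db-1"),
    3306,
)
print("POST", url)
print(body)
```

Sending the request additionally requires an OAuth access token in the `Authorization` header, which is left out here.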

Test Results

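Example result (simplified for illustration; the actual API response uses different field names and nests the verdict and traces under reachabilityDetails):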

{
  "name": "test-app-to-db",
  "source": {
    "instance": "app-1",
    "network": "prod-vpc"
  },
  "destination": {
    "instance": "db-1",
    "network": "prod-vpc"
  },
  "protocol": "TCP",
  "reachability": "REACHABLE",
  "traces": [
    {
      "endpoint": "app-1",
      "steps": [
        {
          "step": 1,
          "description": "VM has network interface",
          "state": "SUCCESS"
        },
        {
          "step": 2,
          "description": "Route exists",
          "route": "10.0.2.0/24 local",
          "state": "SUCCESS"
        },
        {
          "step": 3,
          "description": "Firewall rule allows",
          "firewall_rule": "allow-db",
          "state": "SUCCESS"
        },
        {
          "step": 4,
          "description": "Destination VM reachable",
          "state": "SUCCESS"
        }
      ]
    }
  ]
}
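
A result shaped like the simplified example above can be scanned for the first step that did not succeed. This sketch assumes that simplified shape (`traces[].steps[].state`), not the real API response; `first_failure` and `summarize` are illustrative names.

```python
def first_failure(result: dict):
    """Return the first step whose state is not SUCCESS, or None."""
    for trace in result.get("traces", []):
        for step in trace.get("steps", []):
            if step.get("state") != "SUCCESS":
                return step
    return None


def summarize(result: dict) -> str:
    failed = first_failure(result)
    if failed is None:
        return f"{result['name']}: {result['reachability']} (all steps passed)"
    return (f"{result['name']}: {result['reachability']} "
            f"at step {failed['step']} ({failed['description']})")


blocked = {
    "name": "test-app-to-db",
    "reachability": "UNREACHABLE",
    "traces": [{
        "endpoint": "app-1",
        "steps": [
            {"step": 1, "description": "VM has network interface", "state": "SUCCESS"},
            {"step": 2, "description": "Route exists", "state": "SUCCESS"},
            {"step": 3, "description": "Firewall rule allows", "state": "FAILURE"},
        ],
    }],
}
print(summarize(blocked))
```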

Troubleshooting with NIC

Case 1: VM → DB Connectivity Fails

Problem: app-1 cannot connect to db-1:3306

Using Connectivity Tests:

1. Create test (source: app-1, dest: db-1, TCP port 3306)
2. View result: BLOCKED at firewall
3. View trace: "No firewall rule allows tcp:3306 from app-1"
4. Check firewall rules:
   - Rule must have:
     * direction: INGRESS (to db-1)
     * target-tags: db-tier (or target instance)
     * source-ranges: 10.0.1.0/24 (app subnet)
     * protocol: tcp
     * port: 3306

Solution:
gcloud compute firewall-rules create allow-app-to-db \
  --network=prod-vpc \
  --direction=INGRESS \
  --priority=1000 \
  --action=ALLOW \
  --rules=tcp:3306 \
  --source-ranges=10.0.1.0/24 \
  --target-tags=db-tier

Case 2: Cross-Region Latency High

Problem: app-1 (us-central1) → app-2 (europe-west1) has 150ms latency

Using Performance Insights:

1. Navigate to "Performance Metrics"
2. Select path: us-central1 → europe-west1
3. View latency graph: Shows 150ms+
4. Check causes:
   - Route via correct region? (check routing table)
   - Network service tier? (Premium vs Standard)
   - Congestion? (check top talkers)

Solution options:
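  a) Use Premium tier for lower-latency routing (applies to traffic over external IPs; internal VM-to-VM traffic already stays on Google's backbone)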
  b) Use Cloud CDN for data locality
  c) Accept latency (geographic constraint)
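
Option (c) can be sanity-checked with a back-of-the-envelope lower bound: light in fiber travels at roughly 200,000 km/s, so distance alone sets a floor on RTT. The coordinates below are approximate public locations for the two regions (Council Bluffs, Iowa and St. Ghislain, Belgium) and are assumptions for illustration; real fiber paths are longer than the great-circle distance.

```python
import math

EARTH_RADIUS_KM = 6371.0
FIBER_SPEED_KM_PER_S = 200_000.0   # about two thirds of c, a common rule of thumb


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def min_rtt_ms(distance_km):
    """Round-trip propagation delay over an ideal straight fiber."""
    return 2 * distance_km / FIBER_SPEED_KM_PER_S * 1000


# Approximate, assumed coordinates for the two regions.
US_CENTRAL1 = (41.26, -95.86)
EUROPE_WEST1 = (50.45, 3.82)

distance = great_circle_km(*US_CENTRAL1, *EUROPE_WEST1)
print(f"distance ~{distance:.0f} km, RTT floor ~{min_rtt_ms(distance):.0f} ms")
```

Under these assumptions the floor comes out on the order of 70 ms, so a large share of the observed 150ms is plausibly geographic and cannot be tuned away.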

Case 3: Firewall Rule Optimization

Problem: 300 firewall rules, slow evaluation

Using Firewall Insights:

1. View all rules (sorted by priority, usage)
2. Identify:
   - Unused rules (0 matches) → DELETE
   - Redundant rules (same effect) → CONSOLIDATE
   - Shadow rules (blocked by higher priority) → REORDER

Before:
  100 allow rules (multiple overlapping)
  200 deny rules (some redundant)
  Total: 300 rules evaluated per packet

After optimization:
  30 consolidated rules
  Total: 30 rules evaluated at most (10× fewer)

Result: Faster firewall evaluation, lower CPU

Integration with Other Tools

Integration: Cloud Logging

VPC Flow Logs + NIC:

1. Enable VPC Flow Logs on subnet
2. In NIC, click "Performance Insights"
3. See real-time traffic from Flow Logs
4. Drill down to specific flows
5. Export to BigQuery for analysis

Workflow:
  NIC shows high latency
  → Click "View Flow Logs"
  → BigQuery query shows top talker
  → Investigate source VM
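
The "top talker" step in the workflow above is a group-by on source IP. A minimal local sketch of that aggregation, assuming flow log records have been exported as dicts carrying the `connection.src_ip` and `bytes_sent` fields from the VPC Flow Logs record format; `top_talkers` is a hypothetical helper, and in practice the same aggregation would run as a BigQuery query.

```python
from collections import Counter


def top_talkers(records, n=3):
    """Sum bytes_sent per source IP and return the n largest senders."""
    totals = Counter()
    for rec in records:
        # bytes_sent may arrive as a string in exported JSON, so coerce it.
        totals[rec["connection"]["src_ip"]] += int(rec["bytes_sent"])
    return totals.most_common(n)


records = [
    {"connection": {"src_ip": "10.0.1.5", "dest_ip": "10.0.2.10"}, "bytes_sent": "1000"},
    {"connection": {"src_ip": "10.0.1.6", "dest_ip": "10.0.2.10"}, "bytes_sent": "200"},
    {"connection": {"src_ip": "10.0.1.5", "dest_ip": "10.0.2.10"}, "bytes_sent": "500"},
]
for ip, total in top_talkers(records):
    print(ip, total)
```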

Integration: Cloud Monitoring

NIC metrics exposed to Cloud Monitoring:

Metric (illustrative name; confirm the exact metric type in the Cloud Monitoring metrics list): compute.googleapis.com/vpc/connectivity_test_results
  - reachability (REACHABLE/UNREACHABLE/UNKNOWN)
  - latency_ms
  - packet_loss_percent

Alert:
  Alert on reachability = UNREACHABLE (connectivity broken)
  Alert on latency_ms > 100 (performance issue)

Example alert:
  WHEN compute.googleapis.com/vpc/connectivity_test_results
  reachability = UNREACHABLE
  THEN send PagerDuty alert

Best Practices

Do:

  • Run connectivity tests before/after network changes
  • Review firewall insights monthly
  • Monitor performance trends (establish baseline)
  • Use NIC for incident response (fast diagnosis)
  • Integrate with monitoring (alerts on failures)

Don't:

  • Run tests too frequently (unnecessary costs)
  • Ignore firewall optimization recommendations
  • Assume topology visualization is always current (refresh)
  • Skip testing after firewall rule changes
  • Treat NIC results as complete truth (always verify)
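
The "run connectivity tests before/after network changes" practice boils down to comparing two snapshots of results. A small sketch, assuming each snapshot is a dict mapping test name to its reachability string; `reachability_regressions` is a hypothetical helper.

```python
def reachability_regressions(before: dict, after: dict) -> list:
    """Tests that were REACHABLE before the change but are not afterwards."""
    return sorted(
        name for name, state in before.items()
        if state == "REACHABLE" and after.get(name) != "REACHABLE"
    )


before = {"test-app-to-db": "REACHABLE", "test-web-to-app": "REACHABLE"}
after = {"test-app-to-db": "UNREACHABLE", "test-web-to-app": "REACHABLE"}
print(reachability_regressions(before, after))
```

A test missing from the "after" snapshot is counted as a regression, on the assumption that a silently dropped test is worth a look.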

Limitations

Known limitations:

1. Connectivity Tests: Limited to ~100 tests/day (quota)
   Workaround: Schedule tests during off-peak

2. Topology: May lag 1-2 minutes behind actual state
   Workaround: Refresh manually

3. Performance metrics: 1-5 minute granularity
   Workaround: Use VPC Flow Logs for finer-grained analysis (aggregation intervals as short as 5 seconds)

4. No real-time packet capture
   Workaround: Use tcpdump on VM + VPC Flow Logs

5. Tests to external internet destinations only cover the path up to Google's network edge
   Workaround: SSH into VM + use curl/telnet to verify end to end

Conclusion

Network Intelligence Center provides unified VPC observability:

  • Topology: Visual understanding of network
  • Connectivity: Diagnose reachability issues
  • Performance: Monitor latency, packet loss
  • Firewall: Optimize rules, identify issues

Essential for: Production troubleshooting, network design validation, incident response.

Complements: VPC Flow Logs (detailed analysis), Cloud Monitoring (alerting).