VPC & networking

The first thing engineers raised on AWS get wrong about GCP is the scope of the network itself. A GCP VPC is global: one network that exists in every region at once, with subnets as regional slices of it. Two VMs on different continents reach each other over internal IPs with nothing built in between. Once that lands, the rest of the catalogue, from the single-IP anycast load balancer to Shared VPC, reads as consequences of one design decision: Google owns the fibre between regions and lets your packets ride it by default.

The network is global. Subnets are regional.

Run gcloud compute networks create prod-net and notice what is missing: there is no region flag, because the command does not take one. The network you just made is a global resource inside a project. It exists in Iowa, Belgium, Mumbai, and São Paulo simultaneously, and a VM you launch in any of those regions can sit on it. Subnets are where regions enter the picture. Each subnet is a regional resource with its own CIDR range carved out of the network's address space, and a VM's internal IP comes from the subnet of the region it runs in. That is the whole model: one global routing domain, regional address blocks inside it.

Compare that with AWS, where the VPC itself is a regional object. An AWS team that wants workloads in three regions builds three VPCs, then connects them with peering or a Transit Gateway, keeps three sets of route tables in sync, and plans CIDR allocations so the three networks never collide. (The AWS VPC deep dive covers what that plumbing looks like in earnest.) On GCP the equivalent setup is one network with three subnets. A VM in us-central1 pings a VM in asia-south1 over its internal 10.x address on day one, with no peering connection, no transit hub, no inter-region VPN, and no second routing table. The traffic never touches the public internet; it rides Google's private backbone between the regions.

Why can Google offer this when AWS does not? Because the backbone was already there. The same private fibre network that moves Search and YouTube traffic between continents carries your inter-region packets, and the SDN layer that virtualises it, called Andromeda, was built from the start to treat the planet as one address space. AWS built regions as deliberately isolated islands and exposed that isolation in its networking API. Google built one network and exposed that instead. Neither choice is free: a global network is more convenient, and an isolated one has a smaller blast radius when something fails. GCP's answer to the blast-radius concern is that you can still create many networks per project and many projects per organisation, so prod and non-prod typically get separate networks even though one network could technically span both.

If projects, the resource hierarchy, and gcloud defaults are still unfamiliar, read foundations first; everything below assumes you know what a project is and can run gcloud against one.

Left: a single GCP network spans both regions, so the two VMs already share a routing domain. Right: AWS gives each region its own VPC, and connectivity between them is something you build and operate.

The figure understates one practical difference. With per-region VPCs, every cross-region concern (routing, firewalling, DNS, IP planning) has two or more authorities that can drift apart. With one global network there is exactly one place where a route exists or does not, and one set of firewall rules to audit. The trade is that mistakes are global too: a too-broad allow rule applies in every region the moment you create it. Treat the network as a production-wide object with the review process that implies.

Subnets, and what living in one means

A GCP subnet spans every zone in its region. This is easy to read past and worth stopping on, because AWS subnets are zonal. An AWS architecture diagram shows the familiar checkerboard: public and private subnets repeated per availability zone, six or more subnets for a basic three-tier app. On GCP a managed instance group spread across three zones in us-central1 sits in one subnet, drawing addresses from one range. Zones still matter for failure isolation of the VMs, but they stop mattering for addressing.

There is also no such thing as a public subnet. Whether a VM is reachable from the internet is a property of the VM (does it have an external IP?) and of the firewall, not of the subnet it lives in. The AWS pattern of routing a subnet through an internet gateway to make it public has no equivalent because it has no purpose: every network has a default route to the internet (deletable, and many locked-down environments delete it), and any VM without an external address is unreachable from outside regardless of which subnet holds it.

Networks come in two subnet modes. Auto mode creates one subnet per region for you, all carved from 10.128.0.0/9, including in regions you will never use. The default network every new project ships with is auto mode plus a set of permissive firewall rules, fine for a demo, wrong for production. Custom mode creates no subnets at all; you add exactly the ones you want, where you want them, with ranges you chose. Use custom mode for anything real, partly for hygiene and partly because auto-mode ranges collide with each other the moment you try to peer two auto-mode networks. You can convert auto to custom one way, never back.
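The peering collision is easy to check before anyone files the change request. The sketch below is a planning aid, not a GCP API call: it uses Python's standard ipaddress module to list overlapping ranges between two networks. The subnet ranges and the names prod, shared, auto_a, and auto_b are hypothetical; only the 10.128.0.0/9 block comes from the text above.

```python
import ipaddress

# Every auto-mode network carves its subnets out of this same block.
AUTO_MODE_BLOCK = ipaddress.ip_network("10.128.0.0/9")


def overlapping_pairs(ranges_a, ranges_b):
    """Return (a, b) pairs of CIDR strings that overlap and would block peering."""
    pairs = []
    for a in ranges_a:
        for b in ranges_b:
            if ipaddress.ip_network(a).overlaps(ipaddress.ip_network(b)):
                pairs.append((a, b))
    return pairs


# Hypothetical custom-mode plan: ranges chosen so that nothing collides.
prod = ["10.10.0.0/24", "10.20.0.0/24"]
shared = ["10.30.0.0/24"]

# Hypothetical auto-mode networks: both draw the same range from the same block.
auto_a = ["10.128.0.0/20"]
auto_b = ["10.128.0.0/20"]

print(overlapping_pairs(prod, shared))    # []
print(overlapping_pairs(auto_a, auto_b))  # [('10.128.0.0/20', '10.128.0.0/20')]
```

Running a check like this over every pair of networks you might one day peer is cheap insurance when choosing custom-mode ranges.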

Three smaller subnet facts that pay rent later. First, subnet ranges can be expanded in place with gcloud compute networks subnets expand-ip-range, no downtime, which removes a whole genre of AWS re-IP misery. Second, subnets carry secondary ranges, called alias IP ranges, from which a VM can claim extra addresses beyond its primary one; GKE is the heaviest user, giving every pod a real VPC address from a secondary range so pods are routable without an overlay. Third, the network MTU defaults to 1460 and can be raised to 8896; mismatched assumptions about it are a classic source of mysterious throughput numbers when connecting to on-prem.
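In-place expansion only works if the wider range stays clear of every other subnet in the network, so it is worth computing the result before running the command. A minimal sketch, assuming expansion means moving to a shorter prefix that still contains the original range; the function name and the sample ranges are illustrative, not part of gcloud.

```python
import ipaddress


def expanded_range(current, new_prefix, other_ranges):
    """Return the widened range, or raise ValueError if it would hit another subnet."""
    net = ipaddress.ip_network(current)
    if new_prefix >= net.prefixlen:
        raise ValueError("expansion must use a shorter prefix than the current one")
    # supernet() keeps the original addresses inside the wider block.
    wider = net.supernet(new_prefix=new_prefix)
    for other in other_ranges:
        if wider.overlaps(ipaddress.ip_network(other)):
            raise ValueError(f"{wider} would overlap {other}")
    return wider


# Hypothetical network with two subnets; grow the first from a /24 to a /22.
others = ["10.20.0.0/24"]
wider = expanded_range("10.10.0.0/24", 22, others)
print(wider, wider.num_addresses)  # 10.10.0.0/22 1024
```

Asking for a /11 in the same example raises ValueError, because 10.0.0.0/11 would swallow the neighbouring 10.20.0.0/24.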

Routes belong to the network

Because the network is global, so is its routing table. Routes are network-level resources: create one and it applies everywhere the network exists, in every region, unless you scope it to specific VMs with network tags. The system maintains a subnet route for each subnet range automatically, which is how the two VMs in the figure above find each other, and a default route pointing 0.0.0.0/0 at the internet gateway. Custom static routes can point at an instance, a VPN tunnel, or an internal load balancer as next hop, with a priority field deciding among overlapping destinations.
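How the priority field interacts with overlapping destinations is easier to see in a toy model. The sketch below assumes the selection order is "most specific destination first, then lowest priority number", which is a simplification: it ignores network tags and any special handling of subnet routes. The Route class, the route names, and the next-hop labels are all hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    name: str
    dest: str          # destination CIDR
    next_hop: str      # free-form label in this model
    priority: int = 1000


def pick_route(routes, dst_ip):
    """Longest prefix wins; among equal prefixes, the lowest priority number wins."""
    ip = ipaddress.ip_address(dst_ip)
    matches = [r for r in routes if ip in ipaddress.ip_network(r.dest)]
    if not matches:
        return None
    return min(
        matches,
        key=lambda r: (-ipaddress.ip_network(r.dest).prefixlen, r.priority),
    )


routes = [
    Route("subnet-us", "10.10.0.0/24", "network"),
    Route("subnet-eu", "10.20.0.0/24", "network"),
    Route("default-internet", "0.0.0.0/0", "internet-gateway"),
    Route("on-prem-primary", "192.168.0.0/16", "vpn-tunnel-1", priority=100),
    Route("on-prem-backup", "192.168.0.0/16", "vpn-tunnel-2", priority=200),
]

print(pick_route(routes, "10.20.0.5").name)    # subnet-eu
print(pick_route(routes, "192.168.1.1").name)  # on-prem-primary
print(pick_route(routes, "8.8.8.8").name)      # default-internet
```

The two on-prem routes show the usual primary/backup pattern: same destination, different priorities, and the lower number carries the traffic while it exists.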

Dynamic routes come from Cloud Router, which speaks BGP with your VPN or Interconnect peer and injects what it learns into the network. The setting worth knowing exists is the network's dynamic routing mode. In regional mode, routes learned by a Cloud Router are only visible to subnets in that router's region. In global mode, one Interconnect in one region can carry traffic for every region of the network, which is frequently the entire reason a company chose GCP for its hybrid estate: one physical link, planet-wide reach, instead of one per region.
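The effect of the routing mode fits in a few lines. This is a deliberately small model of the behaviour described above, not of Cloud Router itself; the function name and the region set are illustrative.

```python
def regions_that_learn(router_region, network_regions, routing_mode):
    """Regions whose VMs see a route learned by the Cloud Router in router_region."""
    if routing_mode == "global":
        return sorted(network_regions)
    if routing_mode == "regional":
        return [r for r in sorted(network_regions) if r == router_region]
    raise ValueError(f"unknown routing mode: {routing_mode}")


# Hypothetical estate: one Interconnect terminating in us-central1.
regions = {"us-central1", "europe-west1", "asia-south1"}

print(regions_that_learn("us-central1", regions, "regional"))
# ['us-central1']
print(regions_that_learn("us-central1", regions, "global"))
# ['asia-south1', 'europe-west1', 'us-central1']
```

If a region is absent from the regional result but hosts workloads that need on-prem, that gap is the symptom to look for.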

Firewall rules: identity over IP arithmetic

Firewall rules are also network-level objects, but they are enforced at each VM, by the virtualisation layer underneath it, not at some appliance the traffic funnels through. There is no choke point to size or to saturate; the rule set is distributed to every host and applied as packets enter or leave each instance. Rules are stateful, so return traffic for an allowed connection is allowed automatically. Two implied rules sit at the lowest priority on every network: allow all egress, deny all ingress. A brand-new custom network therefore accepts nothing, and the first symptom of a forgotten firewall rule is an unreachable VM that is otherwise perfectly healthy.

Every rule has a direction, an action (allow or deny), a protocol and port list, a priority from 0 to 65535 where lower wins, and, this is the interesting part, a definition of which VMs it applies to and where traffic may come from. Both ends can be expressed three ways: as CIDR ranges, as network tags, or as service accounts. Tags are plain strings attached to VMs, so a rule can say "allow tcp:8080 to VMs tagged api from VMs tagged frontend" and membership follows the tag rather than any IP plan. Service accounts go one better: a rule like "allow tcp:5432 to VMs running as [email protected] from VMs running as [email protected]" is keyed to a real IAM identity.
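A toy evaluator makes the priority and tag mechanics concrete. It is a sketch under stated assumptions, not the real enforcement engine: rules match on one protocol:port string, one target tag, and either a source tag or a source CIDR; the lowest priority number wins; at equal priority a deny beats an allow; and anything unmatched falls through to the implied deny-ingress rule. The Rule class, the rule names, and the 10.99.0.0/24 quarantine range are hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    priority: int
    action: str                      # "allow" or "deny"
    port: str                        # e.g. "tcp:8080"
    target_tag: str                  # which VMs the rule applies to
    source_tag: str = ""             # if set, match on the sender's tag
    source_range: str = "0.0.0.0/0"  # otherwise match on the sender's IP


def evaluate_ingress(rules, port, target_tags, source_tags, source_ip):
    """Return the verdict and the rule that produced it."""
    def matches(rule):
        if rule.port != port or rule.target_tag not in target_tags:
            return False
        if rule.source_tag:
            return rule.source_tag in source_tags
        return ipaddress.ip_address(source_ip) in ipaddress.ip_network(rule.source_range)

    candidates = [r for r in rules if matches(r)]
    if not candidates:
        return "deny (implied)"
    # Lowest priority number first; on a tie, deny sorts ahead of allow.
    winner = min(candidates, key=lambda r: (r.priority, r.action != "deny"))
    return f"{winner.action} ({winner.name})"


rules = [
    Rule("allow-frontend-to-api", 1000, "allow", "tcp:8080", "api", source_tag="frontend"),
    Rule("deny-quarantine", 900, "deny", "tcp:8080", "api", source_range="10.99.0.0/24"),
]

print(evaluate_ingress(rules, "tcp:8080", {"api"}, {"frontend"}, "10.10.0.5"))
# allow (allow-frontend-to-api)
print(evaluate_ingress(rules, "tcp:8080", {"api"}, set(), "10.10.0.6"))
# deny (implied)
print(evaluate_ingress(rules, "tcp:8080", {"api"}, {"frontend"}, "10.99.0.7"))
# deny (deny-quarantine)
```

The second call is the interesting one: the sender sits in a perfectly routable subnet, but without the frontend tag no rule matches and the implied deny applies.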

The distinction matters for security, not just style. Anyone with permission to edit a VM can change its tags, so a tag-based allow rule can be joined by any instance whose editor adds the right string. Changing which service account a VM runs as requires the actAs permission on that service account, a far better-guarded operation. Use tags for convenience grouping; use service accounts where the rule is a trust boundary. Above per-network rules, hierarchical firewall policies attach at the organisation or folder level and evaluate first, which is how a security team enforces "nothing in this folder ever exposes port 22 publicly" without touching individual networks. Firewall rule logging, off by default, records matches to specific rules and is the first thing to enable when debugging why packets vanish.

Health checks need their own rules. Google's load balancer health probes arrive from 35.191.0.0/16 and 130.211.0.0/22, not from your subnets. A backend that works when you curl it from a neighbour VM but sits unhealthy behind a load balancer is almost always missing an ingress allow for those two ranges.
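A quick audit for this failure mode: given the source ranges your ingress allow rules admit on the backend port, list which probe ranges are still uncovered. The two CIDRs are the ones quoted above; the helper names and the sample rule sets are illustrative.

```python
import ipaddress

# Health-probe source ranges quoted in the text above.
HEALTH_CHECK_RANGES = [
    ipaddress.ip_network("35.191.0.0/16"),
    ipaddress.ip_network("130.211.0.0/22"),
]


def is_health_probe(source_ip):
    """True if the address falls inside one of the probe ranges."""
    ip = ipaddress.ip_address(source_ip)
    return any(ip in net for net in HEALTH_CHECK_RANGES)


def missing_probe_ranges(allowed_source_ranges):
    """Probe ranges not fully covered by any allowed ingress source range."""
    allowed = [ipaddress.ip_network(r) for r in allowed_source_ranges]
    return [
        str(hc)
        for hc in HEALTH_CHECK_RANGES
        if not any(hc.subnet_of(a) for a in allowed)
    ]


# Hypothetical rule set that only admits the two internal subnets.
print(missing_probe_ranges(["10.10.0.0/24", "10.20.0.0/24"]))
# ['35.191.0.0/16', '130.211.0.0/22']
print(missing_probe_ranges(["10.10.0.0/24", "35.191.0.0/16", "130.211.0.0/22"]))
# []
```

An empty list means the probes can get in; anything else names the range to add.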

One IP, every front door: the global load balancer

Here is where the global network stops being a tidy abstraction and starts changing architecture. GCP's global external Application Load Balancer hands you a single IP address, IPv4 or IPv6, that is advertised by BGP anycast from every Google edge point of presence in the world at the same time. A client in Mumbai connecting to that IP lands on a Google Front End (GFE) in or near Mumbai. A client in São Paulo connecting to the same IP lands near São Paulo. TCP and TLS terminate right there at the edge, and the request then rides Google's backbone, not the public internet, to the closest backend region that is healthy and has capacity. If that region fills up or fails its health checks, traffic spills to the next closest automatically.

The global external Application Load Balancer. The same address is announced from every edge, so "which region serves this user" is decided by internet routing and Google's backbone, not by DNS.
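The "closest healthy region with capacity, then spill to the next" behaviour can be sketched as a waterfall over a proximity-ordered list. This is a toy model: in reality Google computes proximity and capacity for you, and the Region class, the capacity numbers, and the ordering below are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    healthy: bool = True
    capacity: int = 0   # requests the backends can absorb (arbitrary unit)
    load: int = 0       # requests currently assigned


def route_request(regions_by_proximity):
    """Send one request to the nearest region that is healthy and not full."""
    for region in regions_by_proximity:
        if region.healthy and region.load < region.capacity:
            region.load += 1
            return region.name
    return None  # nothing healthy has room left


# Backend regions as seen from an edge near Mumbai, nearest first.
mumbai_view = [
    Region("asia-south1", capacity=2),
    Region("europe-west1", capacity=2),
    Region("us-central1", capacity=2),
]

served = [route_request(mumbai_view) for _ in range(3)]
print(served)  # ['asia-south1', 'asia-south1', 'europe-west1']

# Europe fails its health checks: the next request skips it.
mumbai_view[1].healthy = False
after_failure = route_request(mumbai_view)
print(after_failure)  # us-central1
```

The client-facing IP never changes during any of this, which is the point: overflow and failover happen behind the address, not in DNS.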

Hold that against the AWS shape, where an ALB is regional. Serving three regions there means three load balancers, DNS-based geo or latency routing in Route 53 to spread clients among them, and either acceptance that failover rides on DNS TTLs or paying for Global Accelerator to get anycast on top. On GCP the multi-region story is one forwarding rule. There is no warm-up either: the GFE fleet that fronts your anycast IP is the same one that fronts Google's own properties, so a traffic spike does not wait for load balancer capacity to scale, and inside each region the Maglev consistent-hashing layer spreads flows across the fleet. Cross-region failover, capacity-based overflow between regions, and edge TLS all come from the design rather than from extra products. (For the vocabulary of L4 versus L7 and health checking in general, see how load balancing works.)

The global Application LB is the flagship, but it is one member of a family, and choosing among them is a recurring interview and design question. The grid that matters is scope (global or regional), layer (HTTP-aware proxy or pass-through), and audience (external or internal):

Load balancer	Scope	Layer	Reach for it when
Global external Application LB	Global	L7 proxy	Public HTTP(S) to backends in one or many regions; the default answer
Regional external Application LB	Regional	L7 proxy	Public HTTP(S) that must stay in one region, often for data-residency or Standard-tier reasons
Internal Application LB	Regional	L7 proxy	Service-to-service HTTP inside the VPC with URL routing and header tricks
External passthrough Network LB	Regional	L4, no proxy	Non-HTTP TCP/UDP from the internet; packets arrive unproxied with the real client IP
Internal passthrough Network LB	Regional	L4, no proxy	Internal TCP/UDP; also usable as a route's next hop for appliance setups
Proxy Network LB	Global or regional	L4 proxy	TCP (optionally TLS-terminated) that wants the anycast edge without HTTP awareness

All of them share one configuration vocabulary: a forwarding rule owns the IP and port, a backend service owns health checks and session affinity, and backends are instance groups or network endpoint groups (NEGs). NEGs are how containers and serverless join in; a GKE service exposed through container-native load balancing puts pod IPs directly in a NEG, so the load balancer addresses pods without bouncing through node ports.

Premium and Standard tiers: whose network carries your packets

Everything above about riding the backbone assumed a default you can actually turn off. Network service tiers decide where your traffic transitions between Google's network and the public internet. On Premium tier, the default, ingress enters Google's network at the edge nearest the user and egress stays on Google fibre until the exit nearest the user: cold-potato routing, in carrier slang, because Google holds the packet as long as possible. On Standard tier the handoff happens near your region instead: egress is dumped onto the public internet close to where the VM lives and makes its own way across the world, hot-potato style, the way most ISPs treat transit traffic.

The differences are concrete. Standard is billed at lower egress rates, but external IPs become regional objects, and the global load balancers refuse to play: anycast from every edge only makes sense if the backbone carries the traffic, so Standard tier limits you to regional external load balancing. Latency and jitter to faraway users get visibly worse, since the long haul happens over whatever public paths exist that day. Tier is set per resource (per VM address, per forwarding rule), so the sane pattern is Premium for anything user-facing and Standard for bulk, latency-tolerant egress where the discount is the point. If a design review says "we will save money switching tiers," the follow-up question is always which specific addresses, and whether a global LB sits on any of them.

Private Google Access and Private Service Connect

A locked-down VM with no external IP still needs to reach Cloud Storage, BigQuery, Artifact Registry, and the rest of the API surface. Private Google Access is the subnet-level switch for exactly that: enable it, and VMs in the subnet that have only internal addresses can call Google APIs through the network's internal routing, no NAT and no public exposure required. It costs nothing and there is rarely a reason to leave it off. Stricter environments take it further with the private.googleapis.com and restricted.googleapis.com address ranges, pointing DNS for Google API names at a small fixed block so that even API traffic is auditable and the restricted range can be fenced with VPC Service Controls.

Private Service Connect (PSC) generalises the idea from "reach Google" to "reach any producer privately." A producer, Google itself, a SaaS vendor, or another team, publishes a service behind a service attachment; a consumer creates a PSC endpoint inside their own VPC that gets an IP from their own subnet and forwards to the producer. The two networks are never connected; no routes are exchanged, CIDR overlap between them is irrelevant, and the consumer sees one IP they fully control. If you know AWS PrivateLink, this is the same shape (the AWS page covers that side). PSC is also the modern answer for managed services like Cloud SQL, which historically used a dedicated peering into a Google-owned network, an arrangement that inherited all of peering's transitivity headaches. New designs should reach for PSC first.

Shared VPC: one network, many projects

GCP's unit of ownership and billing is the project, and a company quickly accumulates dozens: one per team, per service, per environment. Giving each project its own network would recreate the AWS many-VPCs problem with worse tooling. Shared VPC is the designed escape. One host project owns the network, its subnets, its routes, and its firewall rules. Other projects attach to it as service projects, and their VMs, GKE nodes, and internal load balancers get their network interfaces inside the host project's subnets while the resources themselves stay owned, billed, and quota-tracked in the service project.

Shared VPC. The network team administers one network in the host project; each application team attaches as a service project and is granted use of specific subnets.

The permission model is the part to remember. Attaching a service project is an organisation-level operation, and after that, access is granted per subnet via the compute.networkUser role. The payments team can be allowed to place VMs only in the payments subnets, while the network team alone holds compute.networkAdmin on the host project and is the only party who can change firewall rules or routes. Centralised network control, decentralised compute ownership, and because the underlying network is still one global VPC, services in different service projects talk over internal IPs with no extra wiring. Most GCP organisations of any size converge on a small number of Shared VPCs, commonly one per environment, as their backbone topology.

Peering: useful, and stubbornly non-transitive

When two networks must stay separately administered, in different organisations, say, VPC Network Peering connects them: subnet routes are exchanged automatically, traffic flows over internal IPs on Google's fabric, and there is no gateway, no per-hour charge, and no bandwidth choke point. The constraints are the same ones AWS peering veterans already carry. Ranges must not overlap, each side configures and can tear down its half independently, and peering is strictly non-transitive: if A peers with B and B peers with C, A cannot reach C, full stop. There is no Transit Gateway equivalent to buy your way out; a hub-and-spoke of peerings does not become a mesh, and designs that assume otherwise fail quietly at the routing layer.
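Non-transitivity is worth checking mechanically when reviewing a topology diagram. In the sketch below, reachability means "a direct peering exists" and nothing more, which is the rule stated above; the network names and helper functions are hypothetical.

```python
from itertools import combinations


def can_reach(peerings, src, dst):
    """Only a direct peering gives reachability; paths through a third network do not."""
    return src == dst or frozenset((src, dst)) in peerings


def pairs_without_peering(peerings):
    """Every pair of networks that a transitive reading of the diagram would get wrong."""
    networks = sorted(set().union(*peerings))
    return [
        (a, b)
        for a, b in combinations(networks, 2)
        if not can_reach(peerings, a, b)
    ]


# Hypothetical hub-and-spoke: three spokes, each peered only with the hub.
hub_and_spoke = {
    frozenset(("hub", "spoke-1")),
    frozenset(("hub", "spoke-2")),
    frozenset(("hub", "spoke-3")),
}

print(can_reach(hub_and_spoke, "spoke-1", "hub"))      # True
print(can_reach(hub_and_spoke, "spoke-1", "spoke-2"))  # False
print(pairs_without_peering(hub_and_spoke))
# [('spoke-1', 'spoke-2'), ('spoke-1', 'spoke-3'), ('spoke-2', 'spoke-3')]
```

Every pair in that last list is a connection someone will eventually assume exists.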

Firewall rules do not cross a peering either; each network enforces its own, so an allow rule for the peer's ranges has to exist on both sides before traffic flows. Custom and dynamic route exchange is optional and off by default, and there are per-network limits on peering count that large SaaS producers hit in practice, one of the reasons producer-style connectivity has been migrating from peering to Private Service Connect. The decision rule that holds up: same organisation and a desire for central control, use Shared VPC; separate administration but genuine full-mesh L3 need, peer; the consumer only needs one service from the producer, PSC.

Cloud NAT: egress without addresses, and without a box

Private VMs that must reach the internet, pulling packages, calling third-party APIs, need NAT, and GCP's version is worth a beat of attention precisely because of what it is not. A Cloud NAT gateway is not an instance, not a managed appliance, and not a hop that packets converge on. It is configuration pushed into Andromeda: each VM's host does the address translation locally using ports reserved for it from the NAT's pool of external IPs. There is no single device to saturate and no box in the middle to size, which is a pointed architectural contrast with the AWS NAT Gateway and its zonal placement. The contrast does not extend to the bill: Cloud NAT still carries an hourly gateway charge and a per-gigabyte data-processing charge.

The operational surface that remains is port allocation. Each VM gets a block of source ports (a minimum of 64 by default) drawn from the NAT's external IPs, and a VM making thousands of outbound connections to the same destination can exhaust its block and start dropping new connections, the same disease as AWS's port-allocation errors with different defaults. The fixes are mechanical: raise minimum ports per VM, enable dynamic port allocation, or add NAT IPs. Cloud NAT is regional and attaches to specific subnets through a Cloud Router, pairs naturally with Private Google Access (Google traffic stays internal, the rest goes through NAT), and its logging of dropped and translated flows is the first place to look when egress misbehaves.
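The sizing arithmetic behind those fixes is short enough to keep in a notebook. One number here is not from the text and should be treated as an assumption: that each NAT IP offers 64,512 usable source ports (65,536 minus the 1,024 well-known ports). The function names and fleet sizes are illustrative.

```python
# Assumption: usable source ports per NAT IP (65,536 minus 1,024 well-known ports).
PORTS_PER_NAT_IP = 64512


def vms_per_nat_ip(min_ports_per_vm):
    """How many VMs one NAT IP can serve at a given per-VM port reservation."""
    return PORTS_PER_NAT_IP // min_ports_per_vm


def nat_ips_needed(vm_count, min_ports_per_vm):
    """Smallest number of NAT IPs that gives every VM its reserved block."""
    per_ip = vms_per_nat_ip(min_ports_per_vm)
    return -(-vm_count // per_ip)  # ceiling division


def max_connections_to_one_destination(min_ports_per_vm):
    """Each connection to the same destination needs its own source port."""
    return min_ports_per_vm


print(vms_per_nat_ip(64))                      # 1008
print(nat_ips_needed(5000, 64))                # 5
print(nat_ips_needed(5000, 1024))              # 80
print(max_connections_to_one_destination(64))  # 64
```

The trade is visible in the two middle lines: raising the per-VM reservation from 64 to 1,024 buys each worker sixteen times the headroom toward a single destination and costs sixteen times the address pool.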

What breaks in practice

A fresh custom network "doesn't work." The implied deny-ingress rule is doing its job. Nothing is reachable, including over SSH, until you write allow rules; that is the intended starting state, not a fault.
The default network ships to production. Auto-mode ranges from 10.128.0.0/9 plus permissive default rules. It also collides with every other auto-mode network the day someone proposes peering. Delete it, create custom-mode networks deliberately.
Tags treated as a security boundary. Anyone who can edit an instance can add the tag that admits it through your firewall rule. For trust boundaries, key rules on service accounts.
Backends unhealthy behind a working service. Missing ingress allows for 35.191.0.0/16 and 130.211.0.0/22, the health-check ranges. The single most common GCP load-balancing ticket.
Hub-and-spoke peering assumed transitive. Spokes peered to a hub cannot see each other, and managed services attached to the hub via their own peering are invisible to spokes too. Redesign around Shared VPC or PSC.
Cloud NAT port exhaustion. High-fanout workers hammering one destination run out of allocated ports and new connections fail intermittently. Raise minimum ports per VM or add NAT addresses; check the NAT logs for drops.
Standard tier on the wrong resource. A forwarding rule quietly created on Standard cannot be global, and faraway users feel the public-internet detour. Audit tier on every external address.
Regional dynamic routing with a global estate. The VPN comes up, one region works, the others cannot reach on-prem. The network's dynamic routing mode is still regional; flip it to global.
Inter-region traffic assumed free. The global VPC removes the plumbing, not the bill. Egress between regions is charged, and a chatty service split across continents will surface in cost review before it surfaces in latency graphs.

Lab: one network, two continents, one ping

Ten minutes at a shell makes the global-network claim concrete in a way no diagram can. You will build a custom-mode network, put one subnet in Iowa and one in Belgium, allow internal traffic, launch a VM in each region with no external IPs at all, and watch one ping the other across the Atlantic over internal addresses, with zero connectivity resources created. Costs are a few cents if you tear down promptly; everything here fits comfortably inside a sandbox project. Set a project and default auth first (see foundations if needed).

Create the network. Note the absence of any region flag.
gcloud compute networks create lab-net --subnet-mode=custom
Two subnets, two regions, one network.
gcloud compute networks subnets create lab-us \
  --network=lab-net --region=us-central1 --range=10.10.0.0/24
gcloud compute networks subnets create lab-eu \
  --network=lab-net --region=europe-west1 --range=10.20.0.0/24
Firewall: allow ICMP and SSH between the subnets, and SSH from IAP so you can log in without external IPs.
gcloud compute firewall-rules create lab-allow-internal \
  --network=lab-net --allow=icmp,tcp:22 \
  --source-ranges=10.10.0.0/24,10.20.0.0/24 --target-tags=lab
gcloud compute firewall-rules create lab-allow-iap-ssh \
  --network=lab-net --allow=tcp:22 \
  --source-ranges=35.235.240.0/20 --target-tags=lab
Without these, the implied deny-ingress rule blocks everything. The second rule admits Identity-Aware Proxy's tunnelling range, the standard way into VMs that have no public address.
One VM per region, no external IPs.
gcloud compute instances create vm-us \
  --zone=us-central1-a --machine-type=e2-micro \
  --subnet=lab-us --no-address --tags=lab \
  --image-family=debian-12 --image-project=debian-cloud
gcloud compute instances create vm-eu \
  --zone=europe-west1-b --machine-type=e2-micro \
  --subnet=lab-eu --no-address --tags=lab \
  --image-family=debian-12 --image-project=debian-cloud
Look at the routes you did not have to create.
gcloud compute routes list --filter="network=lab-net"
One subnet route per range, both visible to both VMs, plus the default internet route. This is the entire cross-region "setup."
Ping Belgium from Iowa, over internal IPs.
EU_IP=$(gcloud compute instances describe vm-eu \
  --zone=europe-west1-b \
  --format='get(networkInterfaces[0].networkIP)')
gcloud compute ssh vm-us --zone=us-central1-a \
  --tunnel-through-iap --command="ping -c 5 $EU_IP"
Expect round trips in the neighbourhood of 100 ms: a transatlantic crossing on Google's backbone, between two private addresses, on a network you created four commands ago. On AWS this demo requires two VPCs, a peering connection or transit gateway, route table entries on both sides, and security group updates before the first packet flows.
Tear it down, inside-out.
gcloud compute instances delete vm-us --zone=us-central1-a --quiet
gcloud compute instances delete vm-eu --zone=europe-west1-b --quiet
gcloud compute firewall-rules delete lab-allow-internal lab-allow-iap-ssh --quiet
gcloud compute networks subnets delete lab-us --region=us-central1 --quiet
gcloud compute networks subnets delete lab-eu --region=europe-west1 --quiet
gcloud compute networks delete lab-net --quiet

Worth trying while it is up. Run the ping a few more times and note how stable the latency is; that consistency is the backbone talking. Then, before teardown, try curl example.com from either VM and watch it hang: no external IP and no Cloud NAT means no internet egress, which is exactly the locked-down posture most production fleets want.
