Virtual networks
Everything in Azure that has a private IP address lives in a VNet, and most production incidents that look like application failures turn out to be a route, a security rule, or a DNS record inside one. This page covers the model end to end: regional VNets and their subnets, the NSG rule engine and how it evaluates twice, peering and the hub-and-spoke pattern, Private Endpoints versus service endpoints, user-defined routes, the four load balancers and how to pick between them, and the DNS machinery that quietly holds it all together. It closes with a CLI lab you can run and delete in twenty minutes.
A VNet is a regional network you fully control
A virtual network is a private, isolated slice of the Azure network in one region. You give it an address space in CIDR notation, usually from the RFC 1918 private ranges, and Azure gives you a software-defined network where every VM, container instance, and private endpoint you place inside it gets an IP from that space. Nothing outside the VNet can reach those addresses unless you explicitly connect it: peering, a VPN, an ExpressRoute circuit, or a public IP you attach on purpose. The default posture is isolation.
The word regional carries weight. A VNet exists in exactly one region, the same model AWS uses for a VPC, and the opposite of GCP, where a VPC is a global object with regional subnets. If you have workloads in West Europe and East US, that is two VNets, and making them talk is a deliberate act of peering or gateway plumbing. People arriving from GCP get surprised by this; people arriving from AWS feel at home, and the VPC deep dive maps almost one-to-one onto this page. If Azure itself is new to you, the foundations page covers the resource group and subscription scaffolding this all hangs from.
A VNet can hold more than one address range, and you can add ranges later without rebuilding anything, which is the standard escape hatch when a network planned at /20 turns out to need more room. The constraint to respect from day one is overlap: two networks with overlapping address space can never be peered or connected through a gateway. Address planning is dull, and it is also the one decision on this page that is genuinely painful to reverse, because re-IPing a production VNet means rebuilding the things inside it. Keep a spreadsheet, or better, an IPAM tool, and hand each VNet a unique block from the start.
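Overlap is easy to check mechanically before a block is handed out. The sketch below is plain POSIX shell with no Azure calls; the function names are invented for illustration, and it handles IPv4 prefixes only.

```shell
# ip_to_int A.B.C.D -> prints the address as a 32-bit integer
ip_to_int() {
  IFS=. read -r o1 o2 o3 o4 <<EOF
$1
EOF
  echo $(( (o1 << 24) | (o2 << 16) | (o3 << 8) | o4 ))
}

# cidr_bounds A.B.C.D/N -> prints "first last" for the block, as integers
cidr_bounds() {
  size=$(( 1 << (32 - ${1#*/}) ))
  first=$(( $(ip_to_int "${1%/*}") / size * size ))   # round down to the block start
  echo "$first $(( first + size - 1 ))"
}

# cidrs_overlap CIDR1 CIDR2 -> exit status 0 if the two blocks share any address
cidrs_overlap() {
  set -- $(cidr_bounds "$1") $(cidr_bounds "$2")
  [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

cidrs_overlap 10.10.0.0/16 10.20.0.0/16   || echo "10.10.0.0/16 and 10.20.0.0/16 can be peered"
cidrs_overlap 10.10.0.0/16 10.10.128.0/20 && echo "10.10.128.0/20 collides with 10.10.0.0/16"
```

Running every candidate block against the ranges already in the spreadsheet catches a collision before anything is deployed into it.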
Subnets, and the five addresses you never get
Subnets carve the address space into segments, and they are the unit at which most policy attaches: NSGs, route tables, service endpoints, and delegations all bind to subnets. A sensible layout separates tiers — a subnet for web VMs, one for application services, one for databases or private endpoints — not because subnets provide isolation by themselves (they do not; by default everything in a VNet can reach everything else) but because they give you clean boundaries to hang security rules and routes on.
Azure reserves five addresses in every subnet: the network address, the broadcast address, and three more — x.x.x.1 for the default gateway and x.x.x.2 and x.x.x.3 for Azure DNS. A /24 therefore yields 251 usable addresses, not 254, and a /29, the smallest subnet Azure allows, yields exactly three. This matters more than it sounds when you size subnets for services that consume many IPs, such as AKS with the Azure CNI, where every pod takes an address from the subnet.
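The arithmetic is worth having as a one-liner when sizing subnets. A minimal sketch in POSIX shell; `usable_ips` is a made-up helper name, not an Azure command.

```shell
# usable_ips PREFIX_LENGTH -> addresses left after Azure's five reserved ones
usable_ips() {
  if [ "$1" -gt 29 ]; then
    echo "a /29 is the smallest subnet Azure allows" >&2
    return 1
  fi
  echo $(( (1 << (32 - $1)) - 5 ))
}

for prefix in 24 26 29; do
  echo "/$prefix -> $(usable_ips "$prefix") usable"
done
```

For a pod-per-IP cluster, compare the result with nodes multiplied by the maximum pods per node, plus headroom for upgrades.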
Some subnets are special by name. A VPN or ExpressRoute gateway will only deploy into a
subnet called GatewaySubnet; Azure Firewall demands
AzureFirewallSubnet; Bastion wants AzureBastionSubnet. These
named subnets cannot carry NSGs in the usual way and should hold nothing else. Separately,
subnet delegation hands a subnet to a PaaS service, such as App Service VNet
integration or Azure Container Instances, letting that service inject and manage its own
resources there. A delegated subnet is effectively owned by the service, so plan for it as
consumed space.
Network security groups: the rule engine
An NSG is an ordered list of allow and deny rules evaluated against the classic five-tuple:
source, source port, destination, destination port, and protocol. Each rule has a priority
from 100 to 4096, lower numbers run first, and evaluation stops at the first match. The
engine is stateful, so if an inbound rule admits a TCP connection, the replies flow out
without needing a matching outbound rule. Sources and destinations can be IP ranges,
service tags — Azure-maintained labels such as Internet,
VirtualNetwork, or Storage.WestEurope that expand to the right
prefixes automatically — or application security groups, which we get to next.
Every NSG ships with default rules at priorities 65000 and up, which you cannot delete but
can override with anything numbered lower. Inbound, the defaults allow traffic from within
the VNet and from the Azure load balancer's health-probe address, then deny everything
else. Outbound, they allow VNet traffic and internet traffic, then deny the rest. Two
consequences are worth tattooing somewhere visible: by default, anything in a VNet can
reach anything else in it, including across peerings, because the
VirtualNetwork tag covers peered space; and by default, every VM can reach the
internet outbound. Locking down either is your job, not Azure's.
An NSG can attach to a subnet, to a NIC, or both, and this is where people get bitten. When both exist, inbound traffic must pass the subnet NSG first and then the NIC NSG; outbound traffic is checked at the NIC first and then the subnet. Both must allow the traffic. There is no merging of rules, no most-specific-wins across the two — they are two independent gates in series. The classic failure is an engineer adding a NIC-level allow rule and waiting for a connection that the subnet NSG silently drops one layer earlier.
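The two-gates behaviour is easier to trust once you have watched it fail. The toy model below is an illustration in POSIX shell, not Azure's implementation: a rule is reduced to "priority access port", and the rule sets are invented to reproduce the failure just described.

```shell
# Toy model only: a rule is "priority access port", and "*" matches any port.
# Real NSG rules match the whole five-tuple.

# evaluate_nsg PORT RULES -> prints the access of the first rule that matches
evaluate_nsg() {
  port=$1
  sorted=$(printf '%s\n' "$2" | sort -n)   # lowest priority number first
  while read -r priority access rule_port; do
    case $rule_port in
      "*"|"$port") echo "$access"; return 0 ;;
    esac
  done <<EOF
$sorted
EOF
  echo Deny   # not reached while a catch-all deny rule is present
}

subnet_nsg='100 Allow 22
65500 Deny *'
nic_nsg='100 Allow 22
110 Allow 5432
65500 Deny *'

# Inbound traffic must be allowed by the subnet NSG and then by the NIC NSG.
inbound_allowed() {
  [ "$(evaluate_nsg "$1" "$subnet_nsg")" = Allow ] &&
  [ "$(evaluate_nsg "$1" "$nic_nsg")" = Allow ]
}

inbound_allowed 22   && echo "22: allowed by both gates"
inbound_allowed 5432 || echo "5432: NIC rule allows it, subnet NSG drops it first"
```

On a real NIC, `az network nic list-effective-nsg` lists the rules from both levels that apply to it, which is usually the quickest way to find the gate that is closed.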
Application security groups: rules that say what you mean
IP-based rules rot. The rule that allows 10.0.1.0/24 to reach 10.0.2.4 on port 5432 was
written when the web tier lived in that subnet and the database had that address; six
months and one re-deployment later, nobody remembers what it protects. Application
security groups fix this by letting you write rules in terms of workload identity instead
of addresses. An ASG is a label. You attach it to the NICs of VMs that play a role —
asg-web, asg-db — and then write NSG rules whose source and
destination are ASGs: allow asg-web to reach asg-db on 5432, deny
everything else to asg-db.
The payoff is operational. When a new web VM comes up, its NIC joins
asg-web and inherits every rule that mentions the group; nothing about the NSG
changes. The rules read as intent, which makes security review humane, and membership
changes do not require touching the rule set at all. The limits: ASGs only apply to NICs,
so they describe IaaS workloads rather than PaaS endpoints, and all NICs referenced in one
rule must live in the same VNet. Within those bounds, using ASGs for anything beyond
trivial single-purpose networks is just the correct default.
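As a rough sketch, the asg-web to asg-db rule could be wired up with the az CLI like this. It assumes the resource group, VNet, and VMs from the lab at the end of this page already exist; `nsg-db` and the rule names are invented for the example. The commands are wrapped in functions so that pasting the block runs nothing until you call `create_asg_rules`.

```shell
RG=rg-vnet-lab   # the lab's resource group

# tag_nic_with_asg NIC_NAME ASG_NAME
tag_nic_with_asg() {
  # Look up the NIC's first IP configuration rather than guessing its name.
  ipconfig=$(az network nic show --resource-group "$RG" --name "$1" \
    --query "ipConfigurations[0].name" -o tsv)
  az network nic ip-config update --resource-group "$RG" \
    --nic-name "$1" --name "$ipconfig" \
    --application-security-groups "$2"
}

create_asg_rules() {
  az network asg create --resource-group "$RG" --name asg-web
  az network asg create --resource-group "$RG" --name asg-db
  tag_nic_with_asg vm-webVMNic asg-web
  tag_nic_with_asg vm-dbVMNic asg-db

  az network nsg create --resource-group "$RG" --name nsg-db
  # Intent rule: the web tier may reach the database tier on port 5432.
  az network nsg rule create --resource-group "$RG" --nsg-name nsg-db \
    --name allow-web-to-db --priority 100 \
    --direction Inbound --access Allow --protocol Tcp \
    --source-asgs asg-web --destination-asgs asg-db \
    --destination-port-ranges 5432
  # Everything else aimed at the database tier is refused.
  az network nsg rule create --resource-group "$RG" --nsg-name nsg-db \
    --name deny-rest-to-db --priority 200 \
    --direction Inbound --access Deny --protocol '*' \
    --source-address-prefixes '*' --destination-asgs asg-db \
    --destination-port-ranges '*'
  az network vnet subnet update --resource-group "$RG" \
    --vnet-name vnet-app --name snet-db --network-security-group nsg-db
}
```

Note that the deny rule also stops the lab's ping and SSH hops to vm-db, so apply it after you have finished the verification step.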
Peering, and why hub-and-spoke exists
VNet peering joins two VNets so resources in each can reach the other by private IP, over the Microsoft backbone, with no gateway, no public internet, and near-zero added latency. Peering within a region and global peering across regions work the same way; both are configured as a pair of one-directional links and only carry traffic when both sides agree. Address spaces must not overlap, and the data path costs per gigabyte, more for global than regional.
The property that shapes every real Azure network is that peering is non-transitive. If A peers with B and B peers with C, A still cannot reach C. Traffic will not flow through an intermediate VNet by default, full stop. With a handful of VNets you could mesh them all, but the pair count grows quadratically and so does the bookkeeping. The standard answer is hub-and-spoke: one hub VNet holds the shared, expensive plumbing — the VPN or ExpressRoute gateway, Azure Firewall, Bastion, shared private DNS — and every workload VNet peers only with the hub.
Two flags make the pattern work. Gateway transit lets spokes use the hub's VPN or ExpressRoute gateway as their path to on-premises — the hub allows it, each spoke sets "use remote gateways" — so you buy one gateway instead of one per VNet. And because peering alone will never carry spoke-to-spoke traffic, each spoke subnet gets a user-defined route sending 0.0.0.0/0, or at least the other spokes' prefixes, to the firewall's private IP, with "allow forwarded traffic" enabled on the peerings. The firewall then becomes the single inspection and logging point for east-west traffic, which auditors tend to appreciate. At larger scale, Azure Virtual WAN packages this whole arrangement as a managed service, but the topology you are paying for is the same one in the diagram.
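A hedged sketch of the two peering links with those flags set, using the az CLI. The resource group and VNet names are hypothetical, both VNets are assumed to live in the same resource group, and the spoke-side command only succeeds once the hub actually contains a gateway. Nothing runs until you call the function.

```shell
RG=rg-network   # hypothetical resource group holding hub and spokes

# peer_spoke_to_hub HUB_VNET SPOKE_VNET
peer_spoke_to_hub() {
  # Hub side: offer the hub's gateway to the spoke and accept forwarded traffic.
  az network vnet peering create --resource-group "$RG" \
    --name "$1-to-$2" --vnet-name "$1" --remote-vnet "$2" \
    --allow-vnet-access --allow-forwarded-traffic --allow-gateway-transit
  # Spoke side: use the hub's gateway instead of deploying one here.
  az network vnet peering create --resource-group "$RG" \
    --name "$2-to-$1" --vnet-name "$2" --remote-vnet "$1" \
    --allow-vnet-access --allow-forwarded-traffic --use-remote-gateways
}
```

Calling `peer_spoke_to_hub vnet-hub vnet-spoke1` once per spoke keeps the link settings identical across the estate, which matters more than it sounds when there are thirty of them.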
Private Endpoints versus service endpoints
Both features exist to answer the same question — how do VMs in my VNet reach Azure PaaS services like Storage or SQL without traversing the public internet — and they answer it so differently that picking the wrong one shows up in security reviews. The difference matters; learn it once.
A service endpoint is a subnet-level switch. Enable it for, say,
Microsoft.Storage, and traffic from that subnet to Storage flows over the
Azure backbone with the VM's private identity attached, letting you write a firewall rule
on the storage account that says "only accept traffic from this subnet". It costs nothing
and takes a minute. But the storage account keeps its public IP — you are still connecting
to a public endpoint, just over a better path with a source-based ACL. On-premises machines
coming in over VPN cannot use it, and a VM in the subnet can still reach
anyone's storage account, which makes data exfiltration a one-liner.
A Private Endpoint, built on Private Link, goes further: it injects a NIC
into your subnet that carries a private IP from your address space and maps to one specific
resource — this storage account, this SQL server, not the service in general. The resource
becomes reachable at something like 10.0.2.5, you can disable its public endpoint entirely,
and the private address works from peered VNets and from on-premises over VPN or
ExpressRoute, since it is just an IP in your network. The cost is real money per endpoint
plus per-gigabyte processing, and a DNS obligation we will meet below: the service's
hostname must resolve to the private IP inside your network, which is what the
privatelink.* DNS zones are for.
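Side by side, the two approaches look roughly like this in the az CLI. Treat it as a sketch under assumptions: the storage account name and endpoint names are invented, the VNet and subnets are the ones from the lab at the end of this page, and older CLI releases spell the `--group-id` option `--group-ids`. The functions only define the steps; nothing runs until you call one.

```shell
RG=rg-vnet-lab
STORAGE=stlabexample   # hypothetical storage account name

# Service endpoint: flip the subnet switch, then tell the account to trust that subnet.
use_service_endpoint() {
  az network vnet subnet update --resource-group "$RG" \
    --vnet-name vnet-app --name snet-web \
    --service-endpoints Microsoft.Storage
  az storage account network-rule add --resource-group "$RG" \
    --account-name "$STORAGE" --vnet-name vnet-app --subnet snet-web
  # Refuse everything that no network rule explicitly admits.
  az storage account update --resource-group "$RG" --name "$STORAGE" \
    --default-action Deny
}

# Private Endpoint: a NIC in snet-db that maps to this one account's blob service.
use_private_endpoint() {
  storage_id=$(az storage account show --resource-group "$RG" \
    --name "$STORAGE" --query id -o tsv)
  az network private-endpoint create --resource-group "$RG" \
    --name pe-blob --vnet-name vnet-app --subnet snet-db \
    --private-connection-resource-id "$storage_id" \
    --group-id blob --connection-name pe-blob-conn
}
```

The shape of the commands mirrors the difference in the model: the service endpoint is a property of the subnet plus an ACL on the account, while the Private Endpoint is a new resource with its own address.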
User-defined routes and forced tunnelling
Every subnet starts with system routes Azure maintains for you: the VNet's own space is
reachable directly, peered VNets via peering, 0.0.0.0/0 heads to the internet. You override
them with a route table — a set of user-defined routes attached to a subnet, where
the most specific prefix wins and a UDR beats a system route of equal specificity. Each
route names a next hop type: VirtualAppliance with an IP (almost always a
firewall), VirtualNetworkGateway, Internet, or
None, which blackholes the traffic.
Nearly all UDR usage is one idea: steering traffic through an inspection point. The
spoke-to-firewall route in the hub-and-spoke diagram is a UDR. So is
forced tunnelling, where 0.0.0.0/0 points at the VPN or ExpressRoute
gateway so that even internet-bound traffic from Azure VMs hairpins through on-premises
security appliances before reaching the world. Some compliance regimes demand it; everyone
else mostly suffers the latency. Two operational notes: routes advertised over BGP from
on-premises mix into the same routing decision, and the effective routes view on a NIC
(az network nic show-effective-route-table) is the single most useful
debugging command in Azure networking — it shows what the packet will actually do, after
all sources are merged, rather than what you hoped.
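Two route-table sketches in the az CLI, one per next hop type that comes up most. The table and route names are invented, the VNet and subnet are the lab's, and the firewall address is whatever private IP your appliance holds. Both are wrapped in functions so nothing runs until called.

```shell
RG=rg-vnet-lab

# Blackhole internet-bound traffic from snet-db with a None next hop.
blackhole_internet_from_db() {
  az network route-table create --resource-group "$RG" --name rt-db
  az network route-table route create --resource-group "$RG" \
    --route-table-name rt-db --name drop-internet \
    --address-prefix 0.0.0.0/0 --next-hop-type None
  az network vnet subnet update --resource-group "$RG" \
    --vnet-name vnet-app --name snet-db --route-table rt-db
}

# route_via_firewall ROUTE_TABLE FIREWALL_PRIVATE_IP
# Sends the default route to an inspection point instead of the internet.
route_via_firewall() {
  az network route-table route create --resource-group "$RG" \
    --route-table-name "$1" --name default-via-firewall \
    --address-prefix 0.0.0.0/0 \
    --next-hop-type VirtualAppliance --next-hop-ip-address "$2"
}
```

After attaching either table, rerun the effective-route view on a NIC in the subnet and confirm that the 0.0.0.0/0 entry now shows your next hop rather than Internet.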
The four load balancers, and how to choose
Azure ships four distinct load-balancing services, and interviewers love the question because the names do not help: one is called Load Balancer as if the others were not, and two of the four are global. The grid that untangles them has two axes — does the service operate at layer 4 (TCP/UDP) or layer 7 (HTTP), and is it regional or global? If the mechanics of L4 versus L7 balancing are fuzzy, the load balancing guide builds them up from scratch.
Azure Load Balancer is the plain L4 workhorse: it spreads TCP and UDP flows across VMs in one region using a five-tuple hash, runs health probes, and adds essentially zero latency because it rewrites packets rather than proxying connections. It has no idea what HTTP is. Use it inside a VNet in front of a database cluster or a pool of stateless services, or as a public entry point when you terminate TLS yourself. Application Gateway is the regional L7 proxy: it terminates TLS, routes by URL path and host header, handles cookie-based session affinity, and offers a web application firewall that blocks the OWASP-style attacks before they reach your code.
Front Door takes the L7 job global. Clients connect to the nearest Microsoft edge location via anycast, TLS terminates there, static content can be cached there, and requests ride the backbone to your nearest healthy origin, with failover between regions in seconds because Front Door sees every request. Traffic Manager is the odd one out: it is a DNS server with opinions. A client resolves your hostname, Traffic Manager answers with the IP of the best endpoint by priority, weight, geography, or measured latency, and then steps out of the way entirely — the actual traffic never touches it. That makes it protocol-agnostic and very cheap, and also means failover is hostage to DNS TTLs and clients that ignore them.
| Service | Layer | Scope | Reach for it when |
|---|---|---|---|
| Azure Load Balancer | L4 (TCP/UDP) | Regional | Spreading raw connections across VMs in one region; internal tiers; non-HTTP protocols; lowest latency |
| Application Gateway | L7 (HTTP/S) | Regional | TLS termination, path and host routing, WAF in front of regional web apps |
| Front Door | L7 (HTTP/S) | Global | Multi-region web apps and APIs: edge TLS, caching, WAF, fast regional failover |
| Traffic Manager | DNS | Global | Steering clients across endpoints for any protocol, including non-Azure ones; failover bounded by DNS TTL |
These compose. The canonical multi-region web architecture is Front Door at the edge, an Application Gateway (or the load balancer) in each region behind it, and the regional service spreading load across instances. The canonical mistake is putting Traffic Manager in front of an HTTP app that needs fast failover and discovering during an outage that thousands of clients cached the dead region's IP.
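If it helps to memorise the grid, it collapses into a four-way lookup. This is only the table above restated as a POSIX shell function; `pick_lb` and its argument spellings are invented for the sketch.

```shell
# pick_lb LAYER SCOPE -> prints the service from the grid
#   LAYER: l4 | l7 | dns     SCOPE: regional | global
pick_lb() {
  case "$1/$2" in
    l4/regional)  echo "Azure Load Balancer" ;;
    l7/regional)  echo "Application Gateway" ;;
    l7/global)    echo "Front Door" ;;
    dns/global)   echo "Traffic Manager" ;;
    *)            echo "not on this page's grid: $1/$2" >&2; return 1 ;;
  esac
}

pick_lb l7 global
```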
VPN Gateway and ExpressRoute at a glance
Both connect your VNets to networks outside Azure, both live in the
GatewaySubnet, and they solve different problems. VPN Gateway
builds IPsec tunnels over the public internet: site-to-site tunnels to your office
firewall, point-to-site connections for individual laptops. It is quick to stand up and
cheap to run, and its throughput and latency are at the mercy of the internet path between
you and the Azure edge — fine for management traffic and modest workloads, painful for
chatty database replication.
ExpressRoute is a private circuit into Microsoft's network, provisioned through a connectivity provider at 50 Mbps to 100 Gbps, with an SLA, predictable latency, and no public internet in the path. Two details surprise people: ExpressRoute traffic is not encrypted by default, since it is a private path rather than a tunnel, so regulated workloads often layer IPsec or MACsec on top; and a circuit takes weeks of provider lead time, not minutes. The common enterprise pattern runs both — ExpressRoute as the primary path, a VPN tunnel as warm standby — terminating in the hub VNet, with gateway transit handing connectivity to every spoke. Routes from on-premises arrive over BGP either way and merge into the effective-route calculation each NIC sees.
DNS inside a VNet
By default, VMs in a VNet resolve names through the Azure-provided resolver at the
virtual IP address 168.63.129.16, which answers for public names and for Azure's internal
ones. The moment you want your own names — db.internal.contoso.com resolving
to a private IP — you create a private DNS zone and link it to your VNets.
A link can enable auto-registration, in which case every VM in the linked VNet
registers its hostname in the zone automatically and the record follows the VM through
reallocation, which beats maintaining a spreadsheet of IPs by a comfortable margin.
Private DNS zones are also the other half of Private Link. When you create a Private
Endpoint for a storage account, clients still address it as
myaccount.blob.core.windows.net — nothing in your code changes. Resolution
inside the VNet has to return the private IP, though, and the mechanism is a private zone
named privatelink.blob.core.windows.net holding the record, linked to every
VNet that needs the private path. Public resolvers keep returning the public IP, or a
pointer to it, so the same hostname lands in different places depending on where you ask
from. Forgetting the zone link is the classic Private Link failure: the endpoint exists,
the NIC has its IP, and every client sails straight past it to the public endpoint, or to a
firewall that now rejects them. If on-premises machines must resolve these names too,
Azure DNS Private Resolver gives your corporate DNS servers something inside the VNet to
forward queries to.
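Both uses of private zones come down to the same two commands: create the zone, then link it to a VNet. A minimal sketch with the az CLI, assuming the lab's resource group and VNet; the helper names and the `link-` prefix are invented. Nothing runs until you call a function.

```shell
RG=rg-vnet-lab

# link_private_zone ZONE VNET REGISTRATION
#   REGISTRATION is true to auto-register VM hostnames, false otherwise.
link_private_zone() {
  az network private-dns zone create --resource-group "$RG" --name "$1"
  az network private-dns link vnet create --resource-group "$RG" \
    --zone-name "$1" --name "link-$2" \
    --virtual-network "$2" --registration-enabled "$3"
}

# list_a_records ZONE -> hostnames and addresses currently in the zone
list_a_records() {
  az network private-dns record-set a list --resource-group "$RG" \
    --zone-name "$1" \
    --query "[].{name:name, ip:aRecords[0].ipv4Address}" -o table
}
```

For your own names, `link_private_zone internal.contoso.com vnet-app true` followed by `list_a_records internal.contoso.com` should show the VMs appearing. For Private Link, the call would be `link_private_zone privatelink.blob.core.windows.net vnet-app false`, repeated for every VNet that needs the private path.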
CLI lab: build it, peer it, prove it, delete it
Theory done. This lab builds a VNet with two subnets, locks SSH down to your own IP with an
NSG, stands up two VMs, peers to a second VNet, proves the private path works, and deletes
everything. It costs a few cents if you finish within the hour. You need the
az CLI logged in and a subscription where you can create resource groups.
1. A resource group and the first VNet with two subnets.
az group create --name rg-vnet-lab --location westeurope
az network vnet create \
--resource-group rg-vnet-lab --name vnet-app \
--address-prefix 10.10.0.0/16 \
--subnet-name snet-web --subnet-prefixes 10.10.1.0/24
az network vnet subnet create \
--resource-group rg-vnet-lab --vnet-name vnet-app \
--name snet-db --address-prefixes 10.10.2.0/24
2. An NSG that allows SSH from your IP only. The rule at priority 100
admits you; the built-in DenyAllInBound at 65500 handles everyone else. Attaching the NSG at
the subnet level covers every NIC that will ever land in snet-web.
MYIP=$(curl -s ifconfig.me)
az network nsg create --resource-group rg-vnet-lab --name nsg-web
az network nsg rule create \
--resource-group rg-vnet-lab --nsg-name nsg-web \
--name allow-ssh-from-me --priority 100 \
--direction Inbound --access Allow --protocol Tcp \
--source-address-prefixes $MYIP/32 \
--destination-port-ranges 22
az network vnet subnet update \
--resource-group rg-vnet-lab --vnet-name vnet-app \
--name snet-web --network-security-group nsg-web
3. Two small VMs. One in each subnet. The web VM gets a public IP so you can reach it; the db VM gets none, which is the point: its only callers should be inside the network.
az vm create \
--resource-group rg-vnet-lab --name vm-web \
--image Ubuntu2204 --size Standard_B1s \
--vnet-name vnet-app --subnet snet-web \
--admin-username azureuser --generate-ssh-keys
az vm create \
--resource-group rg-vnet-lab --name vm-db \
--image Ubuntu2204 --size Standard_B1s \
--vnet-name vnet-app --subnet snet-db \
--public-ip-address "" \
--admin-username azureuser --generate-ssh-keys
4. A second VNet, peered both ways. Note the non-overlapping address space, and that peering takes two commands because each direction is its own link.
az network vnet create \
--resource-group rg-vnet-lab --name vnet-data \
--address-prefix 10.20.0.0/16 \
--subnet-name snet-tools --subnet-prefixes 10.20.1.0/24
az vm create \
--resource-group rg-vnet-lab --name vm-tools \
--image Ubuntu2204 --size Standard_B1s \
--vnet-name vnet-data --subnet snet-tools \
--public-ip-address "" \
--admin-username azureuser --generate-ssh-keys
az network vnet peering create \
--resource-group rg-vnet-lab --name app-to-data \
--vnet-name vnet-app --remote-vnet vnet-data \
--allow-vnet-access
az network vnet peering create \
--resource-group rg-vnet-lab --name data-to-app \
--vnet-name vnet-data --remote-vnet vnet-app \
--allow-vnet-access
5. Verify. SSH to the web VM from your machine (the NSG admits only your IP; try from a phone hotspot if you want to watch it time out), then hop to the private VMs from inside. The peered VM at 10.20.1.4 is reachable only because of step 4; delete either peering link and the ping dies.
WEB_IP=$(az vm show -d --resource-group rg-vnet-lab \
--name vm-web --query publicIps -o tsv)
# -A forwards your SSH agent; if the hop to 10.20.1.4 is refused, load your key first with ssh-add
ssh -A azureuser@$WEB_IP
# from inside vm-web:
ping -c 3 10.10.2.4 # vm-db, same VNet, different subnet
ping -c 3 10.20.1.4 # vm-tools, across the peering
ssh azureuser@10.20.1.4 # full session over the peered link
While you are connected, run the effective-route check from your own machine in a second
terminal and find the VNetPeering route that makes the second ping possible.
az network nic show-effective-route-table \
--resource-group rg-vnet-lab --name vm-webVMNic -o table
6. Tear it down. One command, because everything lives in one resource group. This is the habit worth keeping from the whole lab.
az group delete --name rg-vnet-lab --yes --no-wait
Worthwhile extensions once the basics feel boring: replace the IP-based NSG rule with two
ASGs and an intent rule between them; add a route table that blackholes internet traffic
from snet-db with a None next hop; or stand up a private DNS zone
with auto-registration and watch the VM records appear.
Further reading
- Microsoft Learn — Virtual network overview — the canonical reference for VNet behaviour, limits, and pricing pointers.
- Azure Architecture Center — Hub-spoke network topology — the reference design this page's diagram is a sketch of, with Terraform and Bicep samples.
- Microsoft Learn — Private Link FAQ — the fine print on Private Endpoints, DNS integration, and what Private Link does not cover.
- Semicolony — Cloud networking — the provider-neutral grounding: CIDR, routing, and NAT from first principles.
- Semicolony — AWS VPC deep dive — the same concepts wearing AWS names; reading both cements the model.