Entra ID

Every Azure API call starts with a question: who is asking? Entra ID is the system that answers it, for humans, for applications, and for the resources themselves. This page covers the tenant as the identity boundary, the app-registration and service-principal split that confuses everyone, managed identities and the token flow that makes credentials disappear, the two RBAC systems Azure quietly runs side by side, and the federation pattern that lets a GitHub Actions job deploy without a single stored secret.


Why identity is Azure's centre of gravity

In AWS, identity feels like one service among many. In Azure it is the ground everything else stands on, and there is a commercial reason for that. Most large companies were Microsoft customers before they were cloud customers. Their employees already sign in to Outlook, Teams, and SharePoint every morning, and every one of those sign-ins goes through Entra ID. When that company starts building on Azure, the directory of users, groups, MFA policies, and device registrations is already there. Azure does not bootstrap a new identity system per account the way AWS does; it plugs the cloud into the identity system the company already lives in. That single fact explains a lot of Microsoft's enterprise gravity, and it explains why this page comes second in the sequence, right after foundations.

It also means the stakes are different. The same directory that controls who can read a storage account controls who can read the CEO's mailbox. A compromised Entra tenant is not an Azure incident; it is a company-wide incident. Microsoft renamed Azure Active Directory to Microsoft Entra ID in 2023, partly to stop people confusing it with Windows Server Active Directory. The two are related by history, not by protocol: Entra ID speaks OAuth 2.0, OpenID Connect, and SAML over HTTPS. It does not speak Kerberos or LDAP and it has no Group Policy. If you need those, there is a separate paid service (Entra Domain Services) that emulates a domain controller. For everything in this codex, Entra ID is a token issuer, and that mental model is the right one.

Concretely: every call to the Azure Resource Manager API carries a bearer token, and every one of those tokens was minted by Entra ID. When you run az login, a browser window authenticates you against Entra and the CLI caches the resulting tokens. When a VM reads a secret from Key Vault, the VM presented an Entra token. There is no path into the Azure control plane that does not pass through this system, which is why understanding it pays off on every other page in the sequence.

The tenant is the identity boundary

A tenant is one dedicated instance of Entra ID: one directory, with its own GUID (the tenant ID), its own domain names (starting with something.onmicrosoft.com, plus any custom domains you verify), and its own set of users, groups, applications, and policies. Nothing inside one tenant can see inside another. When the docs say "the tenant is the security boundary," they mean it the strong way: a Global Administrator of tenant A has exactly zero authority in tenant B, full stop.

The piece that takes a while to click is how tenants relate to subscriptions. A subscription, as the foundations page covers, is a billing and resource container. Every subscription trusts exactly one tenant for authentication, and one tenant typically has many subscriptions hanging off it: production, development, the data team's sandbox. Identity lives a level above the resources. This is the first big structural difference from AWS, where each account carries its own private IAM universe and cross-account access has to be wired up role by role. In Azure, a single principal in the tenant can be granted access across fifty subscriptions with fifty role assignments and no trust-policy ceremony, because all fifty already trust the same directory.

[Diagram: the Entra tenant as the identity boundary (one directory, one tenant ID). Inside the tenant sit four kinds of principal: users (members and guests; humans who sign in), groups (bundle principals; assign roles once), service principals (apps and automation; hold secrets or certs), and managed identities (Azure resources; no secrets at all). Every sign-in and every token request goes to the tenant. Below it, the prod and dev subscriptions each trust the tenant above.]
Four kinds of principal live in the directory. Subscriptions sit below it and delegate every authentication decision upward.

Two practical consequences. First, multi-tenant situations are real and annoying: consultancies, companies mid-acquisition, and anyone who clicked "create new tenant" by accident in 2019 will find az account tenant list returns more than one entry, and half their permission mysteries trace back to being signed in to the wrong one. Second, external collaboration has a first-class shape: B2B guest users. A guest is a real user object in your tenant that points at a home identity somewhere else, so you can grant a contractor access to one resource group without ever managing their password or their MFA. Their home tenant authenticates them; yours authorizes them.

Users and groups, quickly

Users are the boring part, with two details worth keeping. Members are accounts whose home is this tenant; guests are the B2B references described above, and they show up everywhere permissions are audited, so know the distinction. Groups carry more weight than you might expect, because the sane way to run Azure access is to assign roles to groups and manage membership, never to assign roles to individual users. Security groups are the workhorse. Microsoft 365 groups also exist (they carry a mailbox and a SharePoint site along for the ride) and can usually be ignored for infrastructure work.

The one feature to remember for interviews and for real life: dynamic membership. A dynamic group has a rule instead of a roster, something like "all users where department equals Data Platform," and Entra keeps the membership in sync as people join, move, and leave. Pair a dynamic group with a handful of role assignments and onboarding becomes an HR attribute change rather than a ticket queue. Groups can also be nested, though Azure RBAC only honours nesting for security groups and the evaluation rules around nesting have enough edge cases that flat group structures age better.
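To make "a rule instead of a roster" concrete, here is a toy sketch of the idea, not of Entra's implementation. Real dynamic groups are defined in Entra's own rule syntax (for example user.department -eq "Data Platform") and evaluated by the service; the User class, the data_platform predicate, and the sample addresses below are hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    upn: str
    department: str


def dynamic_members(users, rule):
    """Recompute membership from attributes: the rule is the roster."""
    return {u.upn for u in users if rule(u)}


def data_platform(user):
    # Hypothetical stand-in for: user.department -eq "Data Platform"
    return user.department == "Data Platform"


directory = [
    User("ana@contoso.com", "Data Platform"),
    User("bo@contoso.com", "Finance"),
]
before = dynamic_members(directory, data_platform)

# HR changes Bo's department; the next evaluation picks it up, no ticket.
directory[1] = User("bo@contoso.com", "Data Platform")
after = dynamic_members(directory, data_platform)
```

Nobody edits the member list; only the attribute changes, and every role assigned to the group follows.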

App registrations, enterprise applications, and service principals

Here is the part of Entra that genuinely confuses everyone, including people who have run it for years, because the portal uses three names for two objects. When an application needs its own identity (a CI pipeline, a backend service, a third-party SaaS tool reading your calendar), two distinct objects get involved.

The application object, created under "App registrations," is the blueprint. It lives in exactly one tenant, the one where it was registered, and it defines what the application is: its redirect URIs, the API permissions it wants, the credentials (secrets or certificates) it can authenticate with, the roles it exposes. Think of it as a class definition.

The service principal is the instance of that blueprint inside a particular tenant, and it is the thing that actually does stuff: it signs in, it holds role assignments, it appears in audit logs. The portal lists service principals under "Enterprise applications," which is the naming crime at the heart of the confusion. An enterprise application is a service principal. Same object, different blade.

Why split them at all? Multi-tenancy. When Contoso registers a multi-tenant app and Fabrikam consents to use it, the application object stays in Contoso's tenant, and a service principal gets created in Fabrikam's tenant. Fabrikam's admins control that local service principal: they can restrict who uses it, watch its sign-ins, or delete it, all without touching Contoso's blueprint. One class, an instance per tenant that uses it. For the common single-tenant case the split feels like pointless ceremony, because registering an app creates both objects in the same tenant a moment apart, but the model only makes sense once you picture the multi-tenant case it was designed for.

The one-line version. App registration = the definition, lives in the home tenant. Service principal (a.k.a. enterprise application) = the local instance that signs in and gets permissions, one per tenant that uses the app. Roles are always assigned to the service principal, never to the application object.
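The class-and-instance analogy can be written down almost literally. The sketch below is a mental model only: the Application and ServicePrincipal classes, the consent helper, and the scope string are hypothetical, and real objects carry many more properties.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Application:
    """The blueprint: exists only in its home tenant."""
    app_id: str          # client ID, shared by every instance
    home_tenant: str
    display_name: str


@dataclass
class ServicePrincipal:
    """The local instance: one per tenant that uses the app."""
    object_id: str       # unique within its tenant
    app_id: str          # points back at the blueprint
    tenant: str
    role_assignments: list = field(default_factory=list)


def consent(app, tenant):
    """Model of consent: a service principal appears in the consenting tenant."""
    return ServicePrincipal(str(uuid.uuid4()), app.app_id, tenant)


app = Application(str(uuid.uuid4()), "contoso", "invoice-sync")
sp_contoso = consent(app, "contoso")
sp_fabrikam = consent(app, "fabrikam")

# Roles attach to the instance, never to the blueprint.
sp_fabrikam.role_assignments.append(
    ("Reader", "/subscriptions/<sub-id>/resourceGroups/billing")
)
```

Both instances share one app_id but have different object IDs, and Fabrikam's role assignment is invisible to Contoso's instance.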

Traditional service principals authenticate with a client secret or a certificate, and that credential is the weak point: it gets pasted into a pipeline variable, copied into a teammate's notes, and forgotten until it expires on a Saturday or leaks in a repo. The whole next section exists because Microsoft decided the best secret management is no secret at all.

Managed identities, the no-credentials pattern

A managed identity is a service principal that Azure itself creates, owns, and rotates credentials for, bound to an Azure resource. Your VM, Function app, AKS workload, or Logic App gets an identity in the tenant, and no human ever sees a secret. Nothing to store in a pipeline, nothing to rotate, nothing to leak. If AWS instance roles are the nearest relative, managed identities are the same idea made tenant-wide and pushed across nearly every service in the platform. It is the flagship Entra feature for engineers, and the one interviewers reliably reach for.

There are two flavours, and the difference is lifecycle, not mechanism. A system-assigned identity is created with a specific resource, can only ever be attached to that resource, and is deleted automatically when the resource is deleted. Its role assignments die with it. A user-assigned identity is a standalone Azure resource with its own lifecycle: you create it explicitly, attach it to as many resources as you like, and it survives any of them being torn down. System-assigned is the low-friction default for a single resource that needs its own identity. User-assigned wins when a fleet of resources should share one identity (twenty VMs behind a scale set that all need the same Key Vault access: one identity, one set of role assignments) or when you want the identity and its permissions to exist before the compute does, which matters for infrastructure-as-code, where granting roles to an identity that gets recreated on every deploy is a flaky mess.

The mechanism underneath is worth knowing in detail, because it shows up in debugging and in interviews. Code running on an Azure VM gets tokens from the Instance Metadata Service, IMDS, the same link-local endpoint AWS engineers know: 169.254.169.254. The address is not routable, so only processes on the machine can reach it. The flow:

[Diagram: the managed-identity token flow between four parties: your code on a VM (no secret anywhere), IMDS at 169.254.169.254 (link-local only), Entra ID (mints the token), and Resource Manager (checks RBAC, then acts). Steps: 1 · GET /metadata/identity/... 2 · which identity is attached? 3 · access token (a JWT) comes back 4 · Authorization: Bearer ...]
The token flow. The code never authenticates; it asks the platform, and the platform vouches for it. Step 1 carries no credential at all.

Walk it once. Your code sends a plain HTTP GET to http://169.254.169.254/metadata/identity/oauth2/token with a Metadata: true header, an api-version query parameter, and a resource query parameter naming what it wants a token for, such as https://management.azure.com/ for ARM or https://vault.azure.net for Key Vault. The request carries no credential, because reaching that address from inside the VM is the proof: the hypervisor knows which VM is asking and which identities are attached to it. IMDS does the actual OAuth dance with Entra using certificates the platform manages, hands back a signed JWT, and your code attaches it as a bearer token on real API calls. Tokens are cached and short-lived. If the VM has several user-assigned identities attached, the request must say which one it wants via its client ID, and forgetting that parameter is a classic source of confusing 400 errors.
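The request described above is small enough to build by hand. This sketch constructs it without sending it, so it runs anywhere; the api-version value 2018-02-01 is an assumption taken from Microsoft's published examples, and build_imds_request is a hypothetical helper name.

```python
import urllib.parse
import urllib.request

IMDS_TOKEN_URL = "http://169.254.169.254/metadata/identity/oauth2/token"


def build_imds_request(resource, client_id=None, api_version="2018-02-01"):
    """Build (but do not send) a managed-identity token request for IMDS."""
    params = {"api-version": api_version, "resource": resource}
    if client_id:
        # Needed when several user-assigned identities are attached.
        params["client_id"] = client_id
    url = IMDS_TOKEN_URL + "?" + urllib.parse.urlencode(params)
    # No credential is attached; IMDS only insists on the Metadata header.
    return urllib.request.Request(url, headers={"Metadata": "true"})


req = build_imds_request("https://vault.azure.net")

# On an Azure VM with an identity attached, sending it would look like:
#   import json
#   with urllib.request.urlopen(req, timeout=5) as resp:
#       token = json.load(resp)["access_token"]
```

Off-Azure the send step simply times out, since nothing answers on the link-local address.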

Two footnotes that save real debugging time. App Service and Functions do not expose IMDS; they inject IDENTITY_ENDPOINT and IDENTITY_HEADER environment variables instead, which is why the SDK classes (DefaultAzureCredential and friends) exist: they probe the environment and pick the right transport so your code stays identical from laptop to production. And managed identities are issued to Azure resources, not to arbitrary machines: a box in your own datacentre gets one only if you onboard it as an Azure resource through Azure Arc. For code running outside Azure you want workload identity federation, covered below.
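A rough picture of the probing idea, under stated assumptions: this is not the SDK's logic, only a simplified imitation of it. The api-version values and the X-IDENTITY-HEADER header name come from Microsoft's REST examples as best I recall them, and managed_identity_request is a hypothetical helper.

```python
import os
import urllib.parse


def managed_identity_request(resource, env=None):
    """Return (url, headers) for whichever token endpoint this host exposes."""
    env = os.environ if env is None else env
    endpoint = env.get("IDENTITY_ENDPOINT")
    header = env.get("IDENTITY_HEADER")
    if endpoint and header:
        # App Service / Functions style: injected endpoint plus header value
        query = urllib.parse.urlencode(
            {"api-version": "2019-08-01", "resource": resource}
        )
        return endpoint + "?" + query, {"X-IDENTITY-HEADER": header}
    # Otherwise assume a VM-style host that exposes IMDS
    query = urllib.parse.urlencode(
        {"api-version": "2018-02-01", "resource": resource}
    )
    return (
        "http://169.254.169.254/metadata/identity/oauth2/token?" + query,
        {"Metadata": "true"},
    )
```

Calling code asks for a token for a resource and never learns which transport was used, which is the property the SDK credential classes give you for real.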

Azure RBAC: who can do what, where

Authentication says who you are; Azure RBAC decides what you can do to resources. Every grant is a role assignment, and a role assignment is exactly three things bolted together: a principal (user, group, or service principal, including managed identities), a role definition, and a scope.

A role definition is a named list of allowed operations: control-plane actions like Microsoft.Compute/virtualMachines/read, plus data-plane actions for services where reading the contents is separate from managing the resource. Azure ships hundreds of built-in roles, but four general ones do most of the work: Reader (look, don't touch), Contributor (do anything except manage access), Owner (everything, including granting roles), and User Access Administrator (manage access and nothing else). The data-plane split bites people constantly: Contributor on a storage account lets you delete the account but not read a blob in it with your Entra identity. (Contributor can still list the account's access keys, which is a separate path in and its own risk.) Reading blobs through Entra authorization needs a data role such as Storage Blob Data Reader. If you have ever stared at a 403 while being Owner of the resource, this was probably why.
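The control-plane and data-plane split is easiest to see in the shape of a role definition. The sketch below uses abridged versions of two built-in roles (the real definitions list more entries) and a hypothetical allows helper that does wildcard matching with fnmatch; it illustrates the structure, not the platform's evaluator.

```python
from fnmatch import fnmatchcase

# Abridged role definitions; the real ones contain more entries.
ROLES = {
    "Contributor": {
        "actions": ["*"],
        "not_actions": [
            "Microsoft.Authorization/*/Write",
            "Microsoft.Authorization/*/Delete",
        ],
        "data_actions": [],
    },
    "Storage Blob Data Reader": {
        "actions": ["Microsoft.Storage/storageAccounts/blobServices/containers/read"],
        "not_actions": [],
        "data_actions": [
            "Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read"
        ],
    },
}


def _matches(patterns, operation):
    # Operation names are compared case-insensitively.
    return any(fnmatchcase(operation.lower(), p.lower()) for p in patterns)


def allows(role, operation, data_plane=False):
    """True if the role permits the operation on the given plane."""
    definition = ROLES[role]
    if data_plane:
        return _matches(definition["data_actions"], operation)
    return _matches(definition["actions"], operation) and not _matches(
        definition["not_actions"], operation
    )


DELETE_ACCOUNT = "Microsoft.Storage/storageAccounts/delete"
READ_BLOB = "Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read"
```

With these definitions, allows("Contributor", DELETE_ACCOUNT) is true while allows("Contributor", READ_BLOB, data_plane=True) is false, because Contributor's wildcard lives in the control-plane list only.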

A scope is where the assignment applies, and scopes nest. From the top: management group, then subscription, then resource group, then individual resource. Assignments inherit downward with no exceptions: Reader on a management group means Reader on every subscription under it, every resource group in those, and every resource in those. There is no way to carve a hole in an inherited grant with another role assignment, because the model is additive: your effective permissions are the union of everything assigned to you and to every group you are in, at the scope in question and above. (Deny assignments exist as a separate object type, but you cannot create them directly; they are produced by platform features like managed applications, so day-to-day the model is purely additive.)
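Additive inheritance reduces to a set union over the scope chain. A minimal sketch, assuming a single hypothetical chain of scopes and made-up principal names; real evaluation also handles data actions, conditions, and deny assignments.

```python
# Scope chain from broadest to narrowest; all IDs are hypothetical.
CHAIN = [
    "/providers/Microsoft.Management/managementGroups/corp",
    "/subscriptions/sub-prod",
    "/subscriptions/sub-prod/resourceGroups/app-rg",
    "/subscriptions/sub-prod/resourceGroups/app-rg"
    "/providers/Microsoft.KeyVault/vaults/app-kv",
]

# Each assignment is a (principal, role, scope) triple.
ASSIGNMENTS = [
    ("grp-auditors", "Reader", CHAIN[0]),
    ("grp-app-team", "Contributor", CHAIN[2]),
    ("ana", "Key Vault Secrets User", CHAIN[3]),
]


def effective_roles(principal_ids, scope):
    """Union of roles held by any of the principals at this scope or above."""
    at_or_above = set(CHAIN[: CHAIN.index(scope) + 1])
    return {
        role
        for who, role, where in ASSIGNMENTS
        if who in principal_ids and where in at_or_above
    }


# A user is evaluated together with every group she belongs to.
ana = {"ana", "grp-auditors", "grp-app-team"}
```

At the vault, Ana holds all three roles; at the subscription she holds only the Reader inherited from the management group, and nothing assigned lower down can subtract from it.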

[Diagram: the scope pyramid, top to bottom. Management group: org-wide guardrails live here. Subscription: billing plus a common scope for platform teams. Resource group: the everyday unit (one app, one team, one lab). Resource: a single VM, vault, or storage account. Assignments inherit down: Reader granted at the top is Reader on everything in the pyramid. Assign at the narrowest scope that does the job.]
The four scope levels. A role assignment attaches at one level and flows down through everything beneath it.

The craft is in choosing scopes. Assign at the narrowest scope that does the job: the lab at the end of this page grants Reader on one resource group, not on the subscription, because the identity has no business elsewhere. Use management groups for the broad, boring grants (the security team's audit reader role, say) so they are written once instead of per subscription. And know the operational ceiling exists: each subscription tops out at 4,000 role assignments, which sounds like plenty until per-user, per-resource automation starts minting them, and is one more argument for assigning to groups.

The classic trap: two RBAC systems

Azure runs two role systems side by side, and conflating them is probably the single most common Azure identity misunderstanding. Azure RBAC, above, governs resources: VMs, vaults, clusters. Entra directory roles govern the directory itself: who can create users, register applications, reset passwords, change conditional access policies. Global Administrator, User Administrator, Application Administrator: these are directory roles, and they say nothing about resources.

The trap, stated plainly: a Global Administrator, the most privileged identity in the tenant, cannot read a single storage account by default. Directory roles do not inherit into Azure RBAC. The two systems share the same principals and the same token issuer, and nothing else. There is one deliberate bridge: a Global Admin can flip the "Access management for Azure resources" switch, which grants them User Access Administrator at the root scope, above every management group. The switch exists for break-glass recovery when someone has locked everyone out of a subscription. It is logged, it is loud, and using it casually is a security-review finding, but knowing it exists is the difference between a recoverable incident and a support ticket.

| | Azure RBAC | Entra directory roles |
|---|---|---|
| Governs | Resources: VMs, storage, vaults, clusters | The directory: users, groups, apps, policies |
| Example roles | Owner, Contributor, Reader, Storage Blob Data Reader | Global Administrator, User Administrator, Application Administrator |
| Scoped to | Management group / subscription / resource group / resource | The tenant (some support administrative units) |
| Assigned via | az role assignment create, IAM blade on a resource | Entra portal, az rest against Microsoft Graph |

Crossover: None by default. Only the audited root-scope access toggle bridges them.

Interviewers like this one because it sorts people who have run Azure from people who have read about it. If asked to design least privilege for a platform team, the answer touches both systems: directory roles for the people who manage identities and app registrations, Azure RBAC at deliberate scopes for the people and workloads that touch resources, and no one holding Global Admin for convenience.

Conditional access, briefly

Everything so far decides what a principal may do once it has a token. Conditional access decides whether the token gets issued at all, and under what conditions. A policy is an if-then statement evaluated at sign-in time: if this user or group, signing in to this application, from this location or device state or sign-in risk level, then block, or allow, or allow only with extra requirements such as MFA, a compliant device, or a short session lifetime.

The standard enterprise baseline reads like a checklist: require MFA for all administrators, block legacy authentication protocols that cannot do MFA at all, require compliant or hybrid-joined devices for sensitive apps, and require MFA when Entra's risk engine flags a sign-in as unusual (impossible travel, anonymizing proxies, leaked credentials). Policies have a report-only mode that logs what would have happened without enforcing it, and rolling out any new policy without running report-only first is how companies lock their own admins out on a Friday. Conditional access needs P1 licensing, service principals need the separate workload-identities add-on to be covered by policy, and the feature is a deep specialism of its own; for this codex the sentence to retain is that it is policy-driven token issuance, sitting in front of everything else on this page.
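The if-then shape can be sketched as a function over a sign-in attempt. This is a toy with two hard-coded rules from the baseline above; the SignIn fields, the "admins" group name, and the return strings are hypothetical, and the real engine evaluates every matching policy and combines their controls.

```python
from dataclasses import dataclass


@dataclass
class SignIn:
    user_groups: set
    app: str
    mfa_done: bool
    legacy_auth: bool


def evaluate(sign_in):
    """Return 'block', 'require_mfa', or 'allow' for one sign-in attempt."""
    # Rule 1: legacy protocols cannot do MFA, so they never get a token.
    if sign_in.legacy_auth:
        return "block"
    # Rule 2: administrators must have completed MFA.
    if "admins" in sign_in.user_groups and not sign_in.mfa_done:
        return "require_mfa"
    return "allow"
```

The decision happens before any token exists, which is why nothing in the RBAC sections ever sees a blocked sign-in.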

OAuth and OIDC under it all

None of the machinery above is proprietary protocol. Entra ID is a large, opinionated OAuth 2.0 authorization server and OpenID Connect provider, and every flow on this page maps onto a standard grant. A user signing in to the portal is an OIDC authorization-code flow; the ID token says who they are, and you can read the full mechanics on the OIDC page. A service principal authenticating with a secret is a client-credentials grant, the machine-to-machine flow from the OAuth walkthrough. A managed identity is a client-credentials grant where the platform holds the client credential for you. Tokens are JWTs signed by the tenant, carrying the tenant ID, the principal's object ID, the audience they were minted for, and group or role claims.
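Because the tokens are ordinary JWTs, the claims mentioned above can be read with nothing but the standard library. The sketch below decodes the payload without verifying the signature, which is fine for debugging and never for authorization decisions. The sample token is fabricated and unsigned; tid, oid, and aud are the claim names Entra uses for tenant ID, object ID, and audience.

```python
import base64
import json


def b64url_decode(segment):
    """Decode base64url, restoring the padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def jwt_claims(token):
    """Return the payload claims of a JWT WITHOUT verifying its signature."""
    _header_b64, payload_b64, _signature = token.split(".")
    return json.loads(b64url_decode(payload_b64))


def _b64url(obj):
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# A fabricated, unsigned token purely for demonstration.
sample = ".".join(
    [
        _b64url({"alg": "none", "typ": "JWT"}),
        _b64url(
            {
                "tid": "tenant-guid",
                "oid": "principal-object-id",
                "aud": "https://management.azure.com/",
            }
        ),
        "",
    ]
)
claims = jwt_claims(sample)
```

When a call fails with a 401 or 403, checking aud and tid this way is usually the fastest test of whether the token was minted for the right resource by the right tenant.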

Each tenant exposes its own token endpoint at login.microsoftonline.com/<tenant-id>/oauth2/v2.0/token, and the tenant-in-the-URL detail is the protocol-level expression of the tenant boundary from the top of this page: a token minted by one tenant's endpoint is meaningless to resources that trust another. One Azure-specific convention worth recognizing on sight: scopes for Azure services use the resource's identifier URI plus /.default, so a token for ARM is requested with the scope https://management.azure.com/.default. When a token request fails, the error nearly always names the scope, and knowing the convention turns an opaque error string into a fix.
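Put together, a client-credentials request against that endpoint is one form-encoded POST. The sketch builds the request without sending it; build_token_request and the placeholder argument values are hypothetical, and in real code the secret would come from a vault or be replaced by a certificate or federation.

```python
import urllib.parse
import urllib.request


def build_token_request(
    tenant_id,
    client_id,
    client_secret,
    scope="https://management.azure.com/.default",
):
    """Build (but do not send) a client-credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,  # resource identifier URI plus /.default
    }
    body = urllib.parse.urlencode(form).encode()
    return urllib.request.Request(url, data=body, method="POST")


req = build_token_request("my-tenant-id", "my-client-id", "not-a-real-secret")
```

The tenant appears only in the URL path and the resource only in the scope, so changing either one changes which directory answers and which API the resulting token is good for.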

Workload identity federation: no secrets off-cloud either

Managed identities solve credentials for code running inside Azure. Workload identity federation solves them for code running anywhere else that already has an identity provider: GitHub Actions, GitLab, Kubernetes clusters anywhere, other clouds. The idea is a trust exchange. The external platform already issues its workloads signed OIDC tokens asserting who they are; you tell Entra to accept those tokens, from that issuer, with that exact subject, as proof of identity for one of your principals. No shared secret ever exists.

The GitHub Actions case is the one to know cold, because it killed the old pattern of pasting a service-principal secret into repository settings and praying about rotation. You add a federated credential to an app registration or, neatly, to a user-assigned managed identity. The credential records three things: the issuer (https://token.actions.githubusercontent.com), the subject claim to match (something like repo:contoso/deploy-repo:ref:refs/heads/main), and the audience. At run time the workflow asks GitHub for an OIDC token, sends it to Entra, Entra verifies the signature against GitHub's published keys and checks the claims match, and returns a normal Entra access token carrying whatever role assignments the principal has. The subject matching is exact, which is both the security property and the top support issue: a workflow triggered from a pull request has a different subject than one triggered from a branch push, and the mismatch fails with an error that does not obviously say so.
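The exact-match behaviour is simple enough to model, and modelling it makes the pull-request failure obvious. This sketch checks only the three recorded fields and skips signature validation entirely; FederatedCredential and accepts are hypothetical names, and the audience value api://AzureADTokenExchange is an assumption based on the commonly used default.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FederatedCredential:
    issuer: str
    subject: str
    audience: str


def accepts(cred, claims):
    """Exact-match check on iss, sub, and aud (no signature validation here)."""
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    return (
        claims.get("iss") == cred.issuer
        and claims.get("sub") == cred.subject
        and cred.audience in audiences
    )


cred = FederatedCredential(
    issuer="https://token.actions.githubusercontent.com",
    subject="repo:contoso/deploy-repo:ref:refs/heads/main",
    audience="api://AzureADTokenExchange",  # assumption: common default
)

push = {
    "iss": "https://token.actions.githubusercontent.com",
    "sub": "repo:contoso/deploy-repo:ref:refs/heads/main",
    "aud": "api://AzureADTokenExchange",
}
# Same repo, same workflow file, different trigger, different subject.
pull_request = dict(push, sub="repo:contoso/deploy-repo:pull_request")
```

Here accepts(cred, push) is true and accepts(cred, pull_request) is false, so a workflow that must run on both triggers needs a second federated credential with the second subject.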

Notice what this means combined with the previous sections: a GitHub workflow can deploy to production with a scoped Contributor assignment on one resource group, conditional access policies watching the sign-in, and not one secret stored anywhere in the chain. That sentence is the destination this whole page has been driving toward.

Against AWS IAM and GCP IAM

If you carry an AWS or GCP mental model, the table below is the translation layer. The deep difference is where identity lives: AWS puts an IAM universe inside every account, GCP attaches identity to a resource hierarchy with users coming from Cloud Identity or Workspace, and Azure puts the directory above everything and makes the resource containers trust it. The AWS IAM deep dive covers the other column of this table at full depth, and the cloud identity concepts page covers the ideas all three share.

| | Azure / Entra | AWS IAM | GCP IAM |
|---|---|---|---|
| Identity container | Tenant, above all subscriptions | Per-account IAM, federation via Identity Center | Cloud Identity / Workspace, above the org |
| Machine identity | Managed identity (system / user-assigned) | IAM role + instance profile, IRSA / Pod Identity | Service account attached to the resource |
| Grant model | Role assignment = principal + role + scope; additive | JSON policies; explicit deny wins; boundaries and SCPs | Role bindings on the hierarchy; additive, deny policies newer |
| Scope hierarchy | Management group → subscription → resource group → resource | Organization → OU → account (policies, not grants, flow down) | Organization → folder → project → resource |
| Cross-boundary access | Same tenant: just assign; cross-tenant: B2B / multi-tenant apps | AssumeRole with trust policies, per pair | Bindings accept principals from any org; simplest of the three |
| Keyless CI | Workload identity federation | OIDC provider + AssumeRoleWithWebIdentity | Workload identity federation |

One judgment call worth forming for interviews: Azure's model is the easiest of the three to reason about inside a single company (one directory, one additive grant model, scopes that nest predictably) and the AWS model is the most expressive for hard multi-party isolation, at the price of policy documents that need a flowchart to evaluate. GCP sits between them. None of this is praise or blame; it falls out of who each platform was built for.

Lab — give a managed identity a real role

Ten minutes, one resource group, nothing left behind. You will create a user-assigned managed identity, watch the service principal appear in the tenant, grant it Reader at resource-group scope, inspect the assignment, and tear it all down. You need the az CLI and any subscription where you can create resource groups.

  1. Create the resource group that scopes everything.
    az group create --name entra-lab-rg --location eastus
  2. Create a user-assigned managed identity and capture its IDs.
    az identity create --resource-group entra-lab-rg --name entra-lab-mi
    PRINCIPAL_ID=$(az identity show --resource-group entra-lab-rg \
      --name entra-lab-mi --query principalId --output tsv)
    CLIENT_ID=$(az identity show --resource-group entra-lab-rg \
      --name entra-lab-mi --query clientId --output tsv)
    echo "principal: $PRINCIPAL_ID client: $CLIENT_ID"
    Two IDs, two systems: principalId is the service principal's object ID in the tenant (what role assignments point at); clientId is what code uses to request tokens. Mixing them up is the lab's first lesson.
  3. Confirm a real service principal now exists in the directory.
    az ad sp show --id $PRINCIPAL_ID --query servicePrincipalType --output tsv # ManagedIdentity
    The identity you created as an Azure resource is also a directory object. Both halves of this page in one command.
  4. Grant Reader at resource-group scope.
    RG_ID=$(az group show --name entra-lab-rg --query id --output tsv)
    az role assignment create \
      --assignee-object-id $PRINCIPAL_ID \
      --assignee-principal-type ServicePrincipal \
      --role "Reader" \
      --scope $RG_ID
    If this fails with "principal does not exist," wait thirty seconds and retry: the new service principal replicates through the directory asynchronously, and the lag is a known habit of fresh identities. Passing --assignee-principal-type explicitly sidesteps most of it.
  5. List role assignments from both directions.
    # everything granted at (or visible from) this scope
    az role assignment list --scope $RG_ID --output table
    # everything this principal holds, anywhere
    az role assignment list --assignee $PRINCIPAL_ID --all --output table
    The first view is the scope's answer to "who can touch me"; the second is the principal's answer to "what can I touch." Audits need both.
  6. Tear down.
    az role assignment delete --assignee $PRINCIPAL_ID --scope $RG_ID
    az identity delete --resource-group entra-lab-rg --name entra-lab-mi
    az group delete --name entra-lab-rg --yes --no-wait
    Deleting the group would have removed the identity anyway, but deleting the role assignment explicitly is the habit to build: assignments pointing at deleted principals linger as "Identity not found" entries, and tidy teardown keeps the subscription's assignment list honest.

Extend it if you are curious: attach the identity to a VM with az vm identity assign, SSH in, and curl the IMDS endpoint from the diagram above to hold a real token in your hands. The next page moves from who can act to where traffic flows.
