Azure Policy in Real Landing Zones
Onboard multiple Azure subscriptions into a real landing zone using Azure Policy, management groups, and policy-as-code for enterprise governance.
Introduction
On paper, Azure Landing Zones and Azure Policy look clean and deterministic. In reality, you're onboarding dozens or hundreds of subscriptions with existing workloads, legacy configurations, and multiple teams that all think they're special.
This article focuses on using Azure Policy to onboard multiple subscriptions into a real, enterprise-grade landing zone:
- How to structure management groups and policy initiatives for scalable governance
- How to onboard brownfield and greenfield subscriptions safely
- How to use policy as code and CI/CD to operate this at scale
- How to balance security, reliability, and delivery velocity when policies start denying real traffic
The scope is Azure-first, but the patterns map naturally to AWS Organizations + SCPs and GCP Organization policies.
Core Concepts
Azure Landing Zone in Governance Terms
Forget marketing. In governance terms, an Azure Landing Zone is:
- A management group hierarchy
- A set of policy initiatives (built-in and custom)
- A baseline of guardrails for networking, identity, security, and operations
- A repeatable way to onboard subscriptions and apps into that baseline
Key Azure building blocks:
- Management groups (MGs) – hierarchical containers for subscriptions; policies applied here flow down
- Azure Policy definitions – rules for what is allowed, denied, audited, or automatically deployed
- Policy initiatives – logical bundles of policy definitions (e.g., CAF "Azure Landing Zone" initiatives)
- Assignments – binding a definition/initiative to a scope (management group, subscription, resource group)
- Effects – Deny, Audit, DeployIfNotExists, Modify, Append, etc.
- Remediation tasks – jobs created to apply DeployIfNotExists/Modify to existing resources
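To make these building blocks concrete, here is a minimal sketch of a custom policy definition built as a Python dictionary and serialized to JSON. The helper name `build_allowed_locations_definition` and the display name are hypothetical; the overall shape (mode, parameters, and a policyRule with if/then) follows the Azure Policy definition structure. Exposing the effect as a parameter is one way to let the same definition run as Audit first and Deny later.

```python
import json


def build_allowed_locations_definition(default_effect: str = "Audit") -> dict:
    """Return a custom definition that flags resources outside the allowed regions."""
    return {
        "properties": {
            "displayName": "Allowed locations (illustrative)",  # hypothetical name
            "policyType": "Custom",
            "mode": "Indexed",
            "parameters": {
                "listOfAllowedLocations": {"type": "Array"},
                "effect": {
                    "type": "String",
                    "allowedValues": ["Audit", "Deny", "Disabled"],
                    "defaultValue": default_effect,
                },
            },
            "policyRule": {
                # Match any resource whose location is not in the allowed list.
                "if": {
                    "not": {
                        "field": "location",
                        "in": "[parameters('listOfAllowedLocations')]",
                    }
                },
                # The effect is resolved from the assignment's parameter value.
                "then": {"effect": "[parameters('effect')]"},
            },
        }
    }


definition = build_allowed_locations_definition()
print(json.dumps(definition, indent=2))
```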
Management Group Hierarchy for Real Landing Zones
A pragmatic, enterprise-friendly hierarchy often looks like:
Tenant Root ( / )
├─ Platform
│  ├─ Identity
│  ├─ Management
│  └─ Connectivity
├─ LandingZones
│  ├─ Corp
│  │  ├─ Corp-Prod
│  │  └─ Corp-NonProd
│  └─ Online
│     ├─ Online-Prod
│     └─ Online-NonProd
└─ Sandbox
Patterns:
- Platform MGs host shared services: identity, networking, management
- LandingZones MGs host application subscriptions grouped by business domain / criticality
- Sandbox MG for low-governance experimental spaces
Policy is usually bound at Platform, LandingZones, and specific child MGs such as Corp-Prod.
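Inheritance is easier to reason about with a small model. The sketch below mirrors the hierarchy above and walks from a management group up to the root to collect every initiative a subscription under it would inherit. The `ASSIGNMENTS` mapping and the `tenant-minimal` initiative name are illustrative assumptions, not a recommended layout.

```python
# Parent of each management group; names mirror the hierarchy above.
PARENTS = {
    "Platform": "Tenant Root",
    "LandingZones": "Tenant Root",
    "Sandbox": "Tenant Root",
    "Identity": "Platform",
    "Management": "Platform",
    "Connectivity": "Platform",
    "Corp": "LandingZones",
    "Online": "LandingZones",
    "Corp-Prod": "Corp",
    "Corp-NonProd": "Corp",
    "Online-Prod": "Online",
    "Online-NonProd": "Online",
}

# Hypothetical bindings: which initiatives are assigned at which MG.
ASSIGNMENTS = {
    "Tenant Root": ["tenant-minimal"],
    "LandingZones": ["alz-core", "alz-security"],
    "Corp-Prod": ["alz-networking"],
}


def effective_initiatives(mg: str) -> list[str]:
    """Collect initiatives inherited by subscriptions placed in the given MG."""
    chain = []
    current = mg
    while current is not None:
        chain.append(current)
        current = PARENTS.get(current)  # None once we pass the root
    result = []
    # Root-first order, so the broadest guardrails are listed first.
    for scope in reversed(chain):
        result.extend(ASSIGNMENTS.get(scope, []))
    return result


print(effective_initiatives("Corp-Prod"))
# → ['tenant-minimal', 'alz-core', 'alz-security', 'alz-networking']
```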
Policy as Code and DevSecOps
Azure Policy without automation becomes unmaintainable once you manage more than a handful of subscriptions.
Treat Azure Policy as code:
- Definitions and initiatives stored as JSON in Git (or Bicep/ARM modules, or Terraform)
- Pipelines (Azure DevOps or GitHub Actions) that:
- Validate policies (linting, schema checks)
- Publish definitions to the management group scope
- Assign initiatives with parameter values per environment (dev/test/prod)
- PR-based change control for all policy changes
- Security and platform teams collaborate on policy sets and exemptions through Git, not the portal
High-level mapping of key services:
- DevOps tooling: Azure DevOps Pipelines / GitHub Actions
- Governance: Azure Policy, Management Groups, Azure Blueprints (phasing out), CAF landing zone initiatives
- Security: Defender for Cloud, Key Vault, Managed Identities, Entra ID (Azure AD)
- Observability: Azure Monitor, Log Analytics, Application Insights
Step-by-Step Guide
1. Prerequisites and Roles
Prerequisites:
Tenant-level setup
- Management group hierarchy defined and approved
- Tenant Root Group locked down to a small platform/governance team
Access
- Platform team with Owner or Contributor + Resource Policy Contributor at key MGs
- Security team with Security Admin + rights to define/approve policies
Tools
- Git repository for policy-as-code
- CI/CD pipelines (Azure DevOps or GitHub Actions)
- Landing zone reference (CAF-aligned) selected and tailored
Team ownership:
- Platform team: management groups, policy initiatives, CI/CD
- Security team: security controls, approvals, exceptions
- App teams: subscription ownership, remediation work, exception requests
2. Design the Management Group Strategy
Define environments
- Separate Prod and NonProd at MG level where possible (Corp-Prod, Corp-NonProd)
- Critical workloads can have their own branch (Payments-Prod MG)
Align policies with MG scopes
- Tenant root: only minimal guardrails (e.g., logging, naming, maybe region restrictions)
- Platform MG: policies for shared services (diagnostics, DDoS plan requirements, etc.)
- LandingZones MG: baseline security, networking, and operational standards for all app subscriptions
Key rule: You should not assign policies directly to individual subscriptions unless there is a strong, documented reason. Use management groups as default.
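One way to keep that rule honest is to scan exported assignments for scopes below management group level. The following is a sketch under assumptions: the record shape (`name`, `scope`) and the `documented` allow-list are hypothetical, while the scope formats themselves are the standard Azure resource ID prefixes.

```python
import re

_MG = re.compile(r"^/providers/Microsoft\.Management/managementGroups/[^/]+$", re.I)
_SUB = re.compile(r"^/subscriptions/[^/]+$", re.I)


def scope_level(scope: str) -> str:
    """Classify an assignment scope as management group, subscription, or lower."""
    scope = scope.rstrip("/")
    if _MG.match(scope):
        return "managementGroup"
    if _SUB.match(scope):
        return "subscription"
    if scope.lower().startswith("/subscriptions/"):
        return "resourceGroupOrResource"
    raise ValueError(f"unrecognized scope: {scope}")


def undocumented_direct_assignments(assignments: list[dict], documented: set[str]) -> list[str]:
    """Names of assignments below MG level that have no documented justification."""
    return [
        a["name"]
        for a in assignments
        if scope_level(a["scope"]) != "managementGroup" and a["name"] not in documented
    ]


sample = [
    {"name": "alz-core-corp-prod",
     "scope": "/providers/Microsoft.Management/managementGroups/Corp-Prod"},
    {"name": "legacy-sku", "scope": "/subscriptions/0000"},   # hypothetical
    {"name": "pci-extra", "scope": "/subscriptions/0001"},    # hypothetical, documented
]
print(undocumented_direct_assignments(sample, documented={"pci-extra"}))
# → ['legacy-sku']
```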
3. Define and Version Your Policy Library
Use a structure like:
policies/
  definitions/
    security/
      allowed-locations.json
      vm-require-managed-disks.json
    operations/
      deploy-diagnostics-to-log-analytics.json
    networking/
      require-private-endpoints.json
  initiatives/
    alz-core.json
    alz-security.json
    alz-networking.json
  assignments/
    Corp-NonProd/
      alz-core.json
      alz-security.json
    Corp-Prod/
      alz-core.json
      alz-security.json
      alz-networking.json
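A layout like this can drive the deployment plan directly. The sketch below derives (management group, initiative) pairs from the `assignments/` folder and rejects any assignment that points at an initiative missing from `initiatives/`. The function names are hypothetical, and the demo builds a throwaway copy of the tree with empty files instead of reading a real repository.

```python
from pathlib import Path
import tempfile


def plan_assignments(root: Path) -> list[tuple[str, str]]:
    """Return sorted (management group, initiative) pairs from assignments/<MG>/<name>.json."""
    known = {p.stem for p in (root / "initiatives").glob("*.json")}
    plan = []
    for path in sorted((root / "assignments").glob("*/*.json")):
        mg, initiative = path.parent.name, path.stem
        if initiative not in known:
            raise ValueError(f"{mg} assigns unknown initiative '{initiative}'")
        plan.append((mg, initiative))
    return plan


def make_demo_tree(root: Path) -> None:
    """Create a tiny copy of the layout with empty JSON files (demo only)."""
    layout = {
        "initiatives": ["alz-core", "alz-security", "alz-networking"],
        "assignments/Corp-NonProd": ["alz-core", "alz-security"],
        "assignments/Corp-Prod": ["alz-core", "alz-security", "alz-networking"],
    }
    for folder, names in layout.items():
        target = root / folder
        target.mkdir(parents=True, exist_ok=True)
        for name in names:
            (target / f"{name}.json").write_text("{}")


with tempfile.TemporaryDirectory() as tmp:
    make_demo_tree(Path(tmp))
    demo_plan = plan_assignments(Path(tmp))
    print(demo_plan)
```

Deriving the plan from the folder names keeps the repository as the single source of truth: adding a file under `assignments/Corp-Prod/` is the change request.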
Example Terraform snippet for an initiative definition:
resource "azurerm_policy_set_definition" "alz_core" {
  name                = "alz-core"
  display_name        = "ALZ Core Baseline"
  policy_type         = "Custom"
  management_group_id = data.azurerm_management_group.landingzones.id

  # Each member policy is a policy_definition_reference block;
  # parameter_values takes a JSON-encoded string.
  policy_definition_reference {
    policy_definition_id = data.azurerm_policy_definition.allowed_locations.id
    parameter_values     = jsonencode({ listOfAllowedLocations = { value = ["westeurope", "northeurope"] } })
  }

  policy_definition_reference {
    policy_definition_id = data.azurerm_policy_definition.deploy_diagnostics.id
    parameter_values     = jsonencode({ logAnalytics = { value = "/subscriptions/.../resourceGroups/rg-logs/providers/Microsoft.OperationalInsights/workspaces/lz-logs-weu" } })
  }
}
Then assign that initiative to a specific management group:
# Assumes the azurerm_policy_set_definition.alz_core resource defined above.
# azurerm 3.x and later use scope-specific assignment resources.
resource "azurerm_management_group_policy_assignment" "alz_core_corp_prod" {
  name                 = "alz-core-corp-prod"
  display_name         = "ALZ Core for Corp Prod"
  policy_definition_id = azurerm_policy_set_definition.alz_core.id
  management_group_id  = data.azurerm_management_group.corp_prod.id

  # A managed identity (and therefore a location) is needed for
  # DeployIfNotExists / Modify remediation.
  identity {
    type = "SystemAssigned"
  }
  location = "westeurope"
}
4. Build CI/CD for Policies
Example: GitHub Actions workflow for policy deployment (simplified):
name: Deploy Azure Policies

on:
  push:
    branches: [ main ]
    paths:
      - 'policies/**'

jobs:
  deploy-policies:
    runs-on: ubuntu-latest
    env:
      AZURE_MG_ID: /providers/Microsoft.Management/managementGroups/landingzones
      AZURE_LOCATION: westeurope  # region that stores the deployment metadata
    steps:
      - uses: actions/checkout@v4

      - uses: azure/login@v2
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}

      - name: Validate policy JSON
        run: |
          python scripts/validate_policies.py policies/definitions
          python scripts/validate_initiatives.py policies/initiatives

      - name: Deploy policy definitions
        run: |
          # ${AZURE_MG_ID##*/} strips the path prefix, leaving only the MG name
          az deployment mg create \
            --management-group-id "${AZURE_MG_ID##*/}" \
            --location "$AZURE_LOCATION" \
            --template-file infra/policies/definitions.bicep \
            --parameters definitionsPath=policies/definitions

      - name: Deploy policy initiatives and assignments
        run: |
          az deployment mg create \
            --management-group-id "${AZURE_MG_ID##*/}" \
            --location "$AZURE_LOCATION" \
            --template-file infra/policies/initiatives.bicep \
            --parameters initiativesPath=policies/initiatives assignmentsPath=policies/assignments
Key points:
- Use service principals with least privilege for policy deployment
- Validate JSON / Bicep / Terraform as part of the pipeline
- Use separate pipelines or stages for dev vs prod MGs
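The validation step can start small. The scripts called in the workflow are not shown in this article, so the following is only one possible shape for such a check: it parses a definition, confirms the rule has an `if` and a `then.effect`, and rejects effect names it does not recognize. The `KNOWN_EFFECTS` set is deliberately non-exhaustive and should be extended to match the effects you use.

```python
import json

# Non-exhaustive; compared case-insensitively because effect names are not case-sensitive.
KNOWN_EFFECTS = {
    "deny", "audit", "auditifnotexists", "deployifnotexists",
    "modify", "append", "disabled", "denyaction", "manual",
}


def validate_definition(text: str) -> list[str]:
    """Return a list of problems found in one policy definition JSON document."""
    try:
        doc = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(doc, dict):
        return ["top-level value must be an object"]

    props = doc.get("properties", doc)  # accept wrapped and bare forms
    rule = props.get("policyRule")
    if not isinstance(rule, dict):
        return ["missing policyRule"]

    errors = []
    if "if" not in rule:
        errors.append("policyRule.if is missing")
    then = rule.get("then")
    effect = then.get("effect") if isinstance(then, dict) else None
    if not isinstance(effect, str):
        errors.append("policyRule.then.effect is missing")
    elif not effect.startswith("[") and effect.lower() not in KNOWN_EFFECTS:
        # Values starting with "[" are template expressions such as [parameters('effect')].
        errors.append(f"unknown effect '{effect}'")
    if "displayName" not in props:
        errors.append("displayName is missing")
    return errors
```

In a pipeline, a wrapper would loop over `policies/definitions/**/*.json`, print each problem with its file name, and exit non-zero if any list is non-empty.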
5. Onboard Subscriptions: Brownfield vs Greenfield
5.1 Greenfield subscriptions
For new subscriptions:
- Create the subscription via automation (e.g., a subscription vending pipeline in Azure DevOps, Terraform, or a custom portal)
- Ensure the new subscription is placed in the correct management group (Corp-NonProd, Corp-Prod, etc.)
- Policies auto-apply from the MG
- Validate:
- Subscription appears in management group
- Policy assignments list inherited initiatives
- First deployments from app team resources comply with policy
5.2 Brownfield subscriptions (existing workloads)
Onboarding existing subscriptions is where Azure Policy meets reality.
Process:
Discovery
- Move subscription into a temporary MG (e.g., Brownfield-Quarantine) where only Audit policies are applied
- Run compliance scans and export non-compliance via Azure Resource Graph or the az policy state CLI
- Classify findings: high-risk (public IPs, missing encryption, no backups), medium, low
Define migration strategy
- Agree with app owners on timelines and responsibility for remediation
- For some controls, use DeployIfNotExists or Modify to auto-remediate
- For hard-breaking changes, schedule maintenance windows
Gradual enforcement
- Phase 1: Audit-only policies at target landing zone MG
- Phase 2: Switch selected controls to Deny for new deployments, keep existing resources under Audit/DeployIfNotExists
- Phase 3: Fully enforce policies including existing resources (if realistic)
Move subscription into its final MG once:
- Critical non-compliances are addressed or excepted
- Auto-remediation jobs have run and are monitored
- App team acknowledges and accepts the landing zone guardrails
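The discovery and exit criteria above can be expressed as a small triage script over exported compliance records. This is a sketch under assumptions: the records are simplified to three fields (`policyDefinitionName`, `complianceState`, `resourceId`), and the policy names and risk tiers in `RISK_BY_POLICY` are hypothetical and would need to reflect your own controls.

```python
from collections import Counter

# Hypothetical mapping from policy name to risk tier.
RISK_BY_POLICY = {
    "deny-public-ip": "high",
    "require-encryption-at-rest": "high",
    "require-backup": "high",
    "deploy-diagnostics-to-log-analytics": "medium",
    "require-tags": "low",
}


def classify(findings: list[dict]) -> Counter:
    """Count non-compliant findings per risk tier; unknown policies default to medium."""
    counts = Counter()
    for f in findings:
        if f["complianceState"] != "NonCompliant":
            continue
        counts[RISK_BY_POLICY.get(f["policyDefinitionName"], "medium")] += 1
    return counts


def ready_for_final_mg(findings: list[dict], exempted_resource_ids: set[str]) -> bool:
    """True once no high-risk finding remains that is neither fixed nor exempted."""
    for f in findings:
        is_high = RISK_BY_POLICY.get(f["policyDefinitionName"]) == "high"
        is_open = (
            f["complianceState"] == "NonCompliant"
            and f["resourceId"] not in exempted_resource_ids
        )
        if is_high and is_open:
            return False
    return True


findings = [
    {"policyDefinitionName": "deny-public-ip", "complianceState": "NonCompliant", "resourceId": "/r/pip1"},
    {"policyDefinitionName": "require-tags", "complianceState": "NonCompliant", "resourceId": "/r/vm1"},
    {"policyDefinitionName": "some-unknown", "complianceState": "NonCompliant", "resourceId": "/r/vm2"},
    {"policyDefinitionName": "require-backup", "complianceState": "Compliant", "resourceId": "/r/vm1"},
]
print(dict(classify(findings)))
print(ready_for_final_mg(findings, exempted_resource_ids=set()))
```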
6. Remediation and Exceptions
6.1 Remediation Tasks
For DeployIfNotExists / Modify policies:
- Enable system-assigned managed identity on policy assignments to perform changes
- Trigger remediation tasks via portal or CLI:
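  # The assignment is an initiative at management group scope, so pass its
  # MG-scoped ID and the reference ID of the member policy to remediate.
  az policy remediation create \
    --name fix-diag-settings \
    --policy-assignment "/providers/Microsoft.Management/managementGroups/<mgId>/providers/Microsoft.Authorization/policyAssignments/alz-core-corp-prod" \
    --definition-reference-id "<referenceId>" \
    --resource-group rg-app-prod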
- Monitor remediation status via:
- Azure Policy compliance dashboard
- Azure Monitor alerts on failed remediation tasks
6.2 Exceptions and Exemptions
Not everything can be compliant on day one.
Use Azure Policy exemptions instead of deleting or bypassing policies:
- Scope exemptions to the smallest possible asset (resource group or resource)
- Make them time-bound where possible (expiration date)
- Track exemptions in Git (YAML/JSON manifest) to avoid "exception sprawl"
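A manifest in Git only helps if something checks it. The sketch below validates a list of exemption entries against the three rules above: narrow scope, an expiry date, and complete metadata. The field names (`owner`, `reason`, `scope`, `expires_on`) are a hypothetical manifest schema, not an Azure format.

```python
from datetime import date

REQUIRED_FIELDS = ("owner", "reason", "scope", "expires_on")


def check_exemptions(manifest: list[dict], today: date) -> list[str]:
    """Return human-readable problems for a list of exemption entries."""
    problems = []
    for i, entry in enumerate(manifest):
        missing = [f for f in REQUIRED_FIELDS if not entry.get(f)]
        if missing:
            problems.append(f"entry {i}: missing {', '.join(missing)}")
            continue
        if date.fromisoformat(entry["expires_on"]) < today:
            problems.append(f"entry {i}: expired on {entry['expires_on']}")
        # Anything without a resource group segment is subscription- or MG-wide.
        if "/resourcegroups/" not in entry["scope"].lower():
            problems.append(f"entry {i}: scope is broader than a resource group")
    return problems


manifest = [
    {"owner": "team-a", "reason": "legacy appliance",
     "scope": "/subscriptions/0000/resourceGroups/rg-app-prod", "expires_on": "2030-01-01"},
    {"owner": "team-b", "reason": "",
     "scope": "/subscriptions/0000/resourceGroups/rg-x", "expires_on": "2030-01-01"},
    {"owner": "team-c", "reason": "migration",
     "scope": "/subscriptions/0000", "expires_on": "2020-01-01"},
]
for problem in check_exemptions(manifest, today=date(2025, 1, 1)):
    print(problem)
```

Passing `today` explicitly keeps the check deterministic in tests; a pipeline would pass `date.today()`.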
Architecture Overview

The architecture diagram shows:
- Policy-as-code flowing through CI/CD into MG scopes
- Subscriptions inheriting initiatives from their MGs
- Clear ownership separation between platform, security, and app teams
Best Practices
- Start with management groups, not subscriptions – design and agree on your MG hierarchy before onboarding more subscriptions
- Bundle policies into coherent initiatives – organize by domain (alz-core, alz-security, alz-networking, alz-ops) rather than hundreds of standalone assignments
- Use Audit-first, then Deny – introduce policies as Audit, analyze impact, then promote selected controls to Deny for new deployments
- Separate baselines for Prod vs NonProd – NonProd can have more permissive policies (e.g., broader locations, less strict SKUs); Prod should be tightly controlled
- Make policy-as-code mandatory – disallow manual creation or modification of policy assignments outside pipelines (and detect them via drift monitoring)
- Minimize tenant root scope policies – place most policies at Landing Zone MG scope; keep tenant root focused on tenant-wide essentials (e.g., disallowed regions, high-level logging)
- Automate subscription creation and placement – never create subscriptions manually via portal for production scenarios; use pipelines + APIs to enforce correct MG placement and tags
- Integrate policy results into observability – export compliance and remediation metrics to Log Analytics, build dashboards and alerts (e.g., % compliance per landing zone, per team)
- Align policy with security standards – map initiatives to CIS, NIST, ISO controls; use Defender for Cloud and CAF recommendations to inform the baseline
Common Pitfalls
1. "Deny Everything" From Day One
Issue: Deploying a large set of Deny policies to all subscriptions at once.
Impact: Broken pipelines, blocked deployments, emergency bypasses, political fallout.
Detection: Sudden spike in policy evaluation failures and deployment errors (HTTP 403 with error code RequestDisallowedByPolicy) across subscriptions.
Fix:
- Roll back to Audit for high-impact controls
- Reintroduce Deny gradually with prior impact analysis and communication
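A gradual rollout is easier to review when the effect of each control is computed from an explicit phase rather than edited by hand. The sketch below assumes each audit/deny-style control exposes its effect as an assignment parameter named `<control>-effect`; that naming convention and the contents of `PHASE2_DENY` are hypothetical. Controls that use DeployIfNotExists or Modify are out of scope for this helper.

```python
# Controls promoted to Deny in phase 2; everything else stays on Audit until phase 3.
PHASE2_DENY = {"allowed-locations", "require-private-endpoints"}


def effect_for(control: str, phase: int) -> str:
    """Return the effect a control should have in rollout phase 1, 2, or 3."""
    if phase not in (1, 2, 3):
        raise ValueError("phase must be 1, 2, or 3")
    if phase == 1:
        return "Audit"
    if phase == 2:
        return "Deny" if control in PHASE2_DENY else "Audit"
    return "Deny"


def assignment_parameters(controls: list[str], phase: int) -> dict:
    """Build assignment parameter values, one '<control>-effect' entry per control."""
    return {f"{c}-effect": {"value": effect_for(c, phase)} for c in controls}


controls = ["allowed-locations", "vm-require-managed-disks"]
for phase in (1, 2, 3):
    print(phase, assignment_parameters(controls, phase))
```

Rolling back after an incident then means lowering one number in a reviewed pull request instead of editing individual assignments.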
2. No Separation Between Baseline and App-Specific Policies
Issue: App-specific requirements mixed into global initiatives.
Impact: Baseline becomes cluttered, unmanageable, and difficult to update.
Detection: Initiatives that reference specific resource names, SKUs, or app tags.
Fix:
- Keep platform baseline initiatives generic
- Manage app-specific policies at subscription or dedicated MG branches
3. Manual Policy Drift
Issue: Changes made via the portal that are not reflected in Git.
Impact: Unknown behaviors, inconsistent environments, broken expectations between teams.
Detection: Regular export and diff between current assignments and Git repo.
Fix:
- Enforce policy-as-code
- Introduce a "drift detection" pipeline that compares actual state to desired state and raises alerts
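The core of such a drift check is a three-way comparison between what Git says should exist and what an export of the live environment contains. This sketch assumes both sides have already been loaded into dictionaries keyed by `<scope>/<assignment name>`; how you export the live state (CLI, Resource Graph, SDK) is left open.

```python
def diff_assignments(desired: dict[str, dict], actual: dict[str, dict]) -> dict[str, list[str]]:
    """Compare desired (Git) and actual (exported) assignments by key."""
    desired_keys, actual_keys = set(desired), set(actual)
    return {
        # In Git but not deployed: the pipeline failed or someone deleted it.
        "missing": sorted(desired_keys - actual_keys),
        # Deployed but not in Git: created manually outside the pipeline.
        "unmanaged": sorted(actual_keys - desired_keys),
        # Present on both sides with different settings: edited in place.
        "changed": sorted(k for k in desired_keys & actual_keys if desired[k] != actual[k]),
    }


desired = {
    "Corp-Prod/alz-core": {"effect": "Deny"},
    "Corp-Prod/alz-security": {"effect": "Audit"},
}
actual = {
    "Corp-Prod/alz-core": {"effect": "Audit"},   # changed in the portal
    "Corp-Prod/manual-test": {},                 # created in the portal
}
drift = diff_assignments(desired, actual)
print(drift)
```

A scheduled pipeline can fail, or raise an alert, whenever any of the three lists is non-empty.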
4. Overuse of Tenant Root Policies
Issue: Assigning almost everything at tenant root MG.
Impact: No flexibility for different landing zones, high blast radius for mistakes.
Detection: Tenant root has dozens of initiatives; lower MGs have almost none.
Fix:
- Move most initiatives down to LandingZone / domain MGs
- Keep root very thin
5. Remediation Identities Without Permissions
Issue: Policy assignments use DeployIfNotExists but the managed identity doesn't have required permissions.
Impact: Remediation tasks fail silently or partially.
Detection: Errors in remediation jobs, non-compliant resources not fixed.
Fix:
- Assign appropriate RBAC roles (e.g., Contributor on target scopes) to the policy assignment identities
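DeployIfNotExists and Modify definitions list the roles they need under `then.details.roleDefinitionIds`, so the gap can be computed before a remediation task fails. The sketch below compares those IDs with the roles actually granted to the assignment identity. The helper names are hypothetical, and the second role GUID in the example is a made-up placeholder; only the Contributor GUID is a real built-in role ID.

```python
def _role_guid(role_id: str) -> str:
    """Reduce a full role definition ID to its trailing GUID, lower-cased."""
    return role_id.rstrip("/").rsplit("/", 1)[-1].lower()


def required_role_ids(definition: dict) -> set[str]:
    """Extract the role GUIDs a DeployIfNotExists/Modify definition asks for."""
    props = definition.get("properties", definition)
    details = props.get("policyRule", {}).get("then", {}).get("details", {})
    return {_role_guid(r) for r in details.get("roleDefinitionIds", [])}


def missing_roles(definition: dict, granted_role_ids: list[str]) -> set[str]:
    """Role GUIDs required by the definition but not granted to the identity."""
    granted = {_role_guid(r) for r in granted_role_ids}
    return required_role_ids(definition) - granted


CONTRIBUTOR = "b24988ac-6180-42a0-ab88-20f7382dd24c"    # built-in Contributor role
PLACEHOLDER = "00000000-0000-0000-0000-000000000001"    # hypothetical second role

dine_definition = {
    "properties": {
        "policyRule": {
            "if": {"field": "type", "equals": "Microsoft.Compute/virtualMachines"},
            "then": {
                "effect": "DeployIfNotExists",
                "details": {
                    "roleDefinitionIds": [
                        f"/providers/Microsoft.Authorization/roleDefinitions/{CONTRIBUTOR}",
                        f"/providers/Microsoft.Authorization/roleDefinitions/{PLACEHOLDER}",
                    ]
                },
            },
        }
    }
}
granted = [f"/subscriptions/0000/providers/Microsoft.Authorization/roleDefinitions/{CONTRIBUTOR}"]
print(missing_roles(dine_definition, granted))
```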
6. Poor Exception Governance
Issue: Exceptions created ad-hoc, without expiry or documentation.
Impact: Hidden risk, compliance gaps, and audit failures.
Detection: Large number of exemptions with no clear owner or reason.
Fix:
- Implement a standardized exception workflow (ticket + PR + approval)
- Require owner, reason, scope, and expiry for every exemption
FAQ
1. How does this compare to AWS and GCP?
- AWS equivalent: AWS Organizations + SCPs + Config; Azure Policy is closer to Config rules + some SCP-like deny behavior
- GCP equivalent: Organization policies + Constraints at org/folder/project scopes
The pattern—policy at hierarchy nodes, inheritance to accounts/subscriptions/projects—is the same.
2. How do I integrate security and compliance without blocking delivery?
Use a progressive rollout: start with Audit, expose non-compliance to teams, set SLAs for fixes, then enforce Deny only on well-understood controls. Add policy checks into CI (pre-merge) as well as at runtime.
3. How do I scale this across dozens of teams and subscriptions?
- Standardize landing zones per domain
- Centralize policy-as-code in a platform-owned repo
- Provide self-service subscription provisioning that automatically places subscriptions into the right MG with the correct initiatives
4. What about multi-region and DR landing zones?
Ensure allowed locations and networking policies support your DR regions. Create separate MGs if DR regions have different constraints, or parameterize initiatives with region-specific settings (e.g., different Log Analytics workspaces).
5. How do I integrate with legacy stacks and on-prem?
Focus your policies on connectivity and security first: VPN/ExpressRoute, private endpoints, NSGs, and logging. Ensure that hybrid connectivity does not bypass landing zone controls (e.g., enforcing private-only paths for PaaS services).
6. How do I measure success of my landing zone governance?
Track KPIs such as:
- % of subscriptions onboarded to landing zones
- Policy compliance rate per MG / per team
- Mean time to remediate non-compliant resources
- Number of active exemptions and their trend
- Deployment failure rates caused by policy (should decrease over time as baselines stabilize)
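Two of these KPIs are simple aggregations once compliance data has been exported. The sketch below computes the compliance rate per management group and the mean time to remediate. The record fields (`mg`, `compliant`, `detected_at`, `fixed_at`) are a hypothetical export schema, chosen only to keep the example short.

```python
from datetime import datetime
from statistics import mean


def compliance_rate(states: list[dict]) -> dict[str, float]:
    """Percentage of compliant resources per management group."""
    totals: dict[str, int] = {}
    compliant: dict[str, int] = {}
    for s in states:
        mg = s["mg"]
        totals[mg] = totals.get(mg, 0) + 1
        if s["compliant"]:
            compliant[mg] = compliant.get(mg, 0) + 1
    return {mg: round(100 * compliant.get(mg, 0) / n, 1) for mg, n in totals.items()}


def mean_time_to_remediate_hours(events: list[dict]):
    """Mean hours between detection and fix; open findings are ignored."""
    durations = [
        (datetime.fromisoformat(e["fixed_at"]) - datetime.fromisoformat(e["detected_at"])).total_seconds() / 3600
        for e in events
        if e.get("fixed_at")
    ]
    return mean(durations) if durations else None


states = [
    {"mg": "Corp-Prod", "compliant": True},
    {"mg": "Corp-Prod", "compliant": True},
    {"mg": "Corp-Prod", "compliant": True},
    {"mg": "Corp-Prod", "compliant": False},
    {"mg": "Corp-NonProd", "compliant": False},
]
events = [
    {"detected_at": "2025-01-01T08:00:00", "fixed_at": "2025-01-01T10:00:00"},
    {"detected_at": "2025-01-02T08:00:00", "fixed_at": "2025-01-02T12:00:00"},
    {"detected_at": "2025-01-03T08:00:00", "fixed_at": None},  # still open
]
print(compliance_rate(states))
print(mean_time_to_remediate_hours(events))
```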
7. How do I deal with app teams who need special policies?
Use branching MGs under landing zones or subscription-scoped initiatives for special cases, but keep them managed via the same policy-as-code pipeline and documented exceptions.
8. Do I need Azure Blueprints?
Blueprints are being superseded by ARM/Bicep/Terraform + policy-as-code. Prefer those, unless you have a strong reason to keep existing Blueprint-based implementations.
Conclusion
Onboarding multiple subscriptions into a real Azure Landing Zone is fundamentally a governance and operations problem, not a button in the portal. Azure Policy gives you the enforcement engine; management groups give you the structure; policy-as-code and CI/CD give you repeatability.
Key takeaways:
- Design a management group hierarchy that reflects your organization
- Use policy initiatives aligned to CAF domains (core, security, networking, ops)
- Treat Azure Policy as code, integrated with pipelines and reviews
- Phase in enforcement: audit, analyze, remediate, then deny
- Manage exceptions and remediation with discipline, not one-off hacks
Next steps:
- Draft your MG hierarchy and get cross-team buy-in
- Stand up a policy-as-code repo and CI/CD pipeline
- Start with a non-prod landing zone, onboard a few subscriptions, and refine
- Gradually extend to prod and high-criticality workloads