MICROSOFT PURVIEW

Data Governance at Scale

Enterprise Architecture, Implementation Patterns & AI-Era Governance

Data Platform Practice  |  Microsoft Ecosystem

Veratas  |  2026 Edition  |  Version 2.0

Executive Summary

Data is the most valuable asset enterprises possess, yet for most organizations it remains ungoverned, undiscovered, and untrustworthy. Regulatory scrutiny has never been higher. GDPR fines exceeded €4.2 billion in 2023. The average cost of a data breach reached $4.88 million in 2024 (IBM). And yet, surveys consistently find that fewer than 30% of enterprise data assets are formally catalogued or classified.

Microsoft Purview has evolved significantly since Azure Purview was relaunched under the Microsoft Purview name in 2022. As of early 2026, it is no longer merely a data catalog: it is Microsoft’s unified data security, governance, and compliance platform for the era of AI. New capabilities, including Data Security Posture Management (DSPM), AI Observability for agents, and the generally available Unified Catalog, have fundamentally expanded what Purview can do for enterprise data programs.

This white paper is written for data architects, Chief Data Officers, compliance leaders, and senior engineers who need a current, production-grade reference. It reflects the state of the platform as of Q1 2026 — incorporating AI governance capabilities, updated terminology (Microsoft Entra ID, not Azure Active Directory), and the latest Fabric integration innovations announced March 2026.

Key Measured Outcomes

Dimension | Without Purview | With Purview (Measured)
Data Discovery | Manual inventory; 40–60% assets undocumented | 95%+ automated classification across 200+ source types
Time to Compliance Audit | 6–12 weeks manual preparation | 2–3 days with automated evidence collection
Data Breach Detection | Mean time 197 days (IBM 2024) | Near-real-time DLP alerts and sensitivity label enforcement
AI Data Risk Visibility | 0%; no tooling for agent/AI data flows | Full AI Observability via DSPM; agent risk scoring and remediation
Data Consumer Productivity | Avg 4.2 hours/week searching for trusted data | 68% reduction in data search time (customer benchmark)
Governance TCO (3-year) | Distributed tools: $2.1M–$4.8M | Purview consolidation: $800K–$1.6M (45–65% reduction)

Key Insight: Microsoft Purview is not merely a data catalog; it is an integrated governance, risk, and compliance (GRC) platform that creates a closed-loop governance operating model. As of 2026, it is also your primary control plane for AI data risk. Organizations that treat it as only a catalog leave 60% of its capability untapped.

Chapter 1: The Data Governance Imperative

1.1 Why Governance Fails Without Architecture

Most governance programs fail not because of lack of intent, but because of architectural sprawl. Governance teams operate spreadsheets, data stewards work in isolation, and lineage is tracked manually if at all. The root causes are structural:

  • Federated data ownership without centralized metadata: Business units own data but metadata lives nowhere.
  • Tool fragmentation: Organizations accumulate Collibra, Informatica, Alation, Apache Atlas, and custom wikis — each partial, none authoritative.
  • Classification as a one-time project: Point-in-time inventories that decay immediately as new data arrives.
  • Compliance as reactive audit response: Evidence collection happens at audit time, not continuously.
  • AI data flows with no governance: Copilot prompts, agent responses, and AI-generated data traverse the estate invisibly — a new and rapidly growing gap.

1.2 The 2026 Regulatory Landscape

Regulation | Key Requirement | Purview Capability | Implementation Evidence
GDPR (EU) | Data subject rights, consent tracking, cross-border controls | Data Map classification, Subject Rights Requests, sensitivity labels | Automated PII detection; SRR workflow audit trail
HIPAA (US) | PHI identification, access controls, audit logging | Custom PHI classification rules, policy enforcement, access reviews | Classification report; access policy audit log export
CCPA (California) | Consumer data inventory, opt-out rights | Data estate inventory export, lineage documentation | Asset inventory report; consent metadata tagging
PCI-DSS v4.0 | Cardholder data scoping, encryption, access logging | Sensitive info type: Credit Card, DLP policies, encryption insights | Data map export; DLP incident reports
EU AI Act | AI system risk classification, data quality for AI training | DSPM AI Observability, Unified Catalog data quality, agent governance | Agent risk inventory; data quality scan reports
ISO 27001:2022 | Information classification, asset inventory, supplier management | Full data catalog, sensitivity labels, third-party scanner integration | Control mapping export from Compliance Manager

War Story: A global bank with 47 data sources spent 11 weeks preparing for a GDPR audit. After deploying Purview with automated scanning, the same audit was prepared in 4 days, with full lineage documentation. Annual compliance preparation cost fell from $1.8M to $340K.

Chapter 2: Microsoft Purview — Platform Architecture (2026)

2.1 Architectural Overview

Microsoft Purview is delivered as a cloud-native SaaS service. As of 2026, its architecture has expanded to five primary planes:

Plane | Components | Primary Function
Data Map Plane | Automated scanning, classification engine, lineage collector, Atlas-compatible metadata store | Discovery, classification, and relationship mapping across all data sources
Unified Catalog Plane (GA 2025) | Search index, glossary engine, data products, automated access workflows, data quality tools | Self-service discovery, business context, data access, and quality management for consumers
Governance Insights Plane | Estate health dashboards, sensitivity coverage reports, stewardship metrics, Data Estate Insights | Measurement, reporting, and continuous improvement of governance program
Compliance & Protection Plane | Information Protection (sensitivity labels), DLP engine, Compliance Manager, eDiscovery, Audit | Regulatory compliance, data protection, legal hold, and investigation
DSPM & AI Governance Plane (New 2025/26) | Data Security Posture Management, AI Observability, Agent Risk Management, Security Copilot integration | Visibility and remediation of data risks across human and AI agent activity

2.2 Identity & RBAC Architecture

Purview uses Microsoft Entra ID (formerly Azure Active Directory, renamed October 2023) for authentication and implements its own RBAC model on top.

Purview Role | Scope | Permissions | Recommended Assignment
Collection Admin | Per-collection | Manage sub-collections, assign roles within scope | Business unit data governance leads
Data Source Admin | Per-collection | Register and manage data sources, create scan rule sets | Data engineering team leads
Data Curator | Per-collection | Edit metadata, apply glossary terms, manage classifications | Data stewards, domain data owners
Data Reader | Per-collection | Read-only access to catalog, lineage, and classifications | Data consumers, analysts, report developers
Insights Reader | Account-level | Access Data Estate Insights dashboards | CDO, governance program manager
Policy Author | Account-level | Create and publish data access policies | Security architects, data governance lead

Chapter 3: Data Map — Discovery & Classification at Scale

3.1 Scanning Architecture

The Data Map’s scanning engine is the foundation of Purview governance. Understanding scan architecture is critical to building a reliable governance program. Three scan execution models are available:

  • Managed Virtual Network (MVNet) — Recommended: Purview manages the integration runtime within a Microsoft-managed VNet. No infrastructure to deploy. Best for most Azure-native deployments.
  • Self-Hosted Integration Runtime (SHIR): Customer-deployed VM running the Purview runtime agent. Required for on-premises sources, private network sources, and non-Azure cloud sources.
  • Azure Integration Runtime (AIR): Used for public-endpoint Azure sources. Not recommended for sensitive environments.

SHIR Sizing Guide

Data Volume (Assets) | CPU | RAM | Network Bandwidth | Node Count
< 100K assets | 4 vCPU | 8 GB | 100 Mbps | 1 (no HA)
100K – 1M assets | 8 vCPU | 16 GB | 1 Gbps | 2 (active-active HA)
1M – 10M assets | 16 vCPU | 32 GB | 10 Gbps | 4 (2 + 2 failover)
> 10M assets | 32 vCPU | 64 GB | 10 Gbps dedicated | 8+ (scale-out cluster)

3.2 Supported Data Sources

Source Category | Supported Sources | Lineage Support | Classification
Azure Data | ADLS Gen1/Gen2, Azure Blob, Azure SQL DB, SQL MI, Synapse, Cosmos DB, PostgreSQL, MySQL | Yes (native connectors) | Full (all classification types)
Microsoft Fabric | Fabric Lakehouse, Fabric Warehouse, Dataflows, Power BI datasets/reports | Yes (deep integration, column-level) | Full, bidirectional label sync
On-Premises | SQL Server 2012+, Oracle 12c+, SAP HANA, Teradata, HDFS | SQL Server: Yes. Others: Limited | Full classification
Multi-Cloud | AWS S3, AWS RDS, Google BigQuery, GCS, Snowflake | Limited (no native lineage) | Full classification
Third-Party (DSPM) | Salesforce (via Varonis), Databricks (via BigID), Snowflake (via Cyera), GCP (via OneTrust) | Via partner connectors into DSPM | Classification via partner signals
SaaS & Office 365 | SharePoint Online, Exchange, Teams, OneDrive | N/A (unstructured) | Full M365 sensitivity label integration

Chapter 4: Unified Catalog — Enterprise Search, Lineage & Data Quality

4.1 Unified Catalog (Generally Available, 2025)

The Microsoft Purview Unified Catalog reached General Availability in late 2025, consolidating data discovery into a single experience and replacing the previous bifurcated catalog model. Key advances over the prior catalog:

  • Automated access workflows replace manual approval chains for data product access requests and glossary term publishing.
  • Built-in data quality tools: measure, monitor, and remediate issues such as incomplete records, inconsistencies, and redundancies.
  • Critical Data Column table: new self-service analytics capability allowing users to report glossary terms and concepts associated with data asset columns.
  • Data quality error record publishing to cloud storage: generally available in all supported Azure regions, enabling dashboards and continuous improvement tracking.
  • Integration with external catalogs: Fabric OneLake, Databricks Unity Catalog, and Snowflake Polaris metadata can be unified into a single view.

4.2 Business Glossary Design

A well-designed glossary is the semantic backbone of the catalog. Flat glossaries fail at scale — a 2,000-term flat list is unsearchable. Structure terms in a parent-child hierarchy:

  • L1 — Domain: Customer, Product, Finance, Risk, Operations, HR
  • L2 — Subdomain: Customer → Prospect, Active, Churned
  • L3 — Concept: Customer → Active → Customer Lifetime Value, Net Promoter Score
  • L4 — Attribute: CLV → Predicted CLV (12-month), Actual CLV (trailing 12-month)

Term Status | Meaning | Who Sets It | Catalog Behavior
Draft | Term being developed; not yet authoritative | Term authors (data stewards) | Discoverable but not recommended for use
Approved | Reviewed and endorsed by domain owner | Domain data owner | Shown as authoritative in search results
Deprecated | Term being replaced; avoid new usage | Governance team | Shown with deprecation warning; redirects to replacement term
Expired | Term no longer valid; historical reference | Governance program manager | Hidden from default search; accessible via filter
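
The hierarchy and status rules above can be sketched in a few lines of Python. This is an illustrative in-memory model only: `GlossaryTerm`, `default_search`, and the "Legacy Segment" term are hypothetical names, not Purview SDK types or real glossary entries, and the visibility flags simply paraphrase the Catalog Behavior column.

```python
from dataclasses import dataclass
from typing import Optional

# Default-search visibility per status, paraphrased from the status table above.
SHOWN_IN_DEFAULT_SEARCH = {
    "Draft": True,       # discoverable, but not recommended for use
    "Approved": True,    # shown as authoritative
    "Deprecated": True,  # shown with a deprecation warning
    "Expired": False,    # hidden unless a filter is applied
}


@dataclass(frozen=True)
class GlossaryTerm:
    """Hypothetical in-memory term; not a Purview SDK type."""
    name: str
    parent: Optional["GlossaryTerm"] = None
    status: str = "Draft"

    def ancestry(self) -> list:
        """Return term names from the L1 domain down to this term."""
        names, node = [], self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))

    def path(self) -> str:
        return " > ".join(self.ancestry())

    def level(self) -> int:
        return len(self.ancestry())


def default_search(terms):
    """Keep only terms whose status is shown in default search."""
    return [t for t in terms if SHOWN_IN_DEFAULT_SEARCH[t.status]]


customer = GlossaryTerm("Customer", status="Approved")
active = GlossaryTerm("Active", parent=customer, status="Approved")
clv = GlossaryTerm("Customer Lifetime Value", parent=active, status="Approved")
predicted = GlossaryTerm("Predicted CLV (12-month)", parent=clv)
legacy = GlossaryTerm("Legacy Segment", parent=customer, status="Expired")
```

Here `predicted.path()` yields the full L1-to-L4 chain, which is what makes a 2,000-term glossary navigable, and `default_search([clv, predicted, legacy])` drops the expired term.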

4.3 Lineage Architecture

Data lineage answers the questions that matter most: ‘Where does this metric come from?’, ‘What would break if we changed this table?’, ‘How was this data transformed?’

  • Automated lineage (preferred): Purview automatically extracts lineage from ADF, Synapse Spark, Synapse Pipelines, Fabric Dataflows, and Power BI. Zero code required.
  • SQL-based lineage parsing: Purview parses stored procedures, views, and CTAS statements for column-level lineage. Supports Azure SQL Database, Synapse Dedicated Pool, SQL Server.
  • Custom lineage via Atlas API: For dbt, custom Spark jobs, Informatica, Talend — lineage submitted programmatically via the Apache Atlas REST API.
  • Fabric lineage (recommended 2025+): Column-level lineage through Lakehouse, Warehouse, Dataflows, and Power BI reports in a single unbroken chain.
Lineage Troubleshooting: If lineage gaps appear between Lakehouse and Warehouse, ensure Fabric Warehouse is using shortcuts to Lakehouse (not COPY INTO). COPY INTO breaks automated lineage — use Lakehouse shortcuts or Dataflows instead.
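
For the custom lineage option listed above, a request body for the Atlas bulk entity endpoint can be sketched as follows. This is a minimal sketch that only builds the JSON. It assumes the standard Atlas v2 shape, in which a `Process` entity references its inputs and outputs by `typeName` and `qualifiedName`. The `azure_sql_table` type name, the qualified names, and the helper names are illustrative assumptions, and authentication and the HTTP call are deliberately omitted.

```python
import json


def asset_ref(type_name, qualified_name):
    """Reference an already-catalogued asset by its unique qualified name."""
    return {
        "typeName": type_name,
        "uniqueAttributes": {"qualifiedName": qualified_name},
    }


def build_lineage_payload(process_name, process_qualified_name, inputs, outputs):
    """Build a bulk-entity body describing one transformation step.

    inputs and outputs are lists of (type_name, qualified_name) tuples.
    """
    return {
        "entities": [
            {
                "typeName": "Process",
                "attributes": {
                    "name": process_name,
                    "qualifiedName": process_qualified_name,
                    "inputs": [asset_ref(t, q) for t, q in inputs],
                    "outputs": [asset_ref(t, q) for t, q in outputs],
                },
            }
        ]
    }


def to_request_body(payload):
    """Serialize the payload for an HTTP POST."""
    return json.dumps(payload, indent=2)


# Hypothetical dbt model that reads one staging table and writes one fact table.
example = build_lineage_payload(
    "load_orders",
    "dbt://analytics/load_orders",
    inputs=[("azure_sql_table", "mssql://srv.example/db/dbo/stg_orders")],
    outputs=[("azure_sql_table", "mssql://srv.example/db/dbo/fct_orders")],
)
```

Referencing assets by qualified name rather than GUID keeps the payload reproducible across environments, at the cost of requiring that the referenced assets have already been scanned into the Data Map.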

Chapter 5: Data Security Posture Management & AI Governance

This chapter covers what is arguably the most significant expansion of Purview in 2025–2026: Data Security Posture Management (DSPM) and AI governance capabilities. These address risks that traditional data governance tools were never designed to handle.

5.1 Why AI Governance Requires New Capabilities

As organizations adopt Microsoft 365 Copilot, Copilot Studio agents, and Azure AI Foundry models, entirely new data risk vectors emerge:

  • 86% of organizations lack visibility into what data flows through AI systems (2025 Microsoft survey).
  • 40% of data security incidents now occur within AI applications (Microsoft research, 2025).
  • 78% of AI users bring their own AI tools to work — creating ‘Shadow AI’ exposure.
  • AI agents operate autonomously and access large volumes of sensitive data, creating risk profiles tied to behavior, not just identity.

5.2 Data Security Posture Management (DSPM)

The new DSPM experience (public preview December 2025, GA target April 2026) unifies the previous DSPM classic and DSPM for AI classic experiences into a single, outcome-based platform:

  • Outcome-based guided workflows: Choose a data security objective and receive step-by-step remediation guidance.
  • AI Observability: A dedicated inventory of all AI apps and agents — including first-party (Copilot Studio, Azure AI Foundry), third-party, and custom-built agents — with activity in the last 30 days, risk levels, and sensitive interaction counts.
  • Item-level remediation: Bulk disable overshared SharePoint links, apply sensitivity labels, and activate protection policies directly from DSPM.
  • External platform visibility: Third-party signals from Salesforce (Varonis), Databricks (BigID), Snowflake (Cyera), and Google Cloud Platform (OneTrust) surface in a unified view via Microsoft Sentinel Data Lake.
  • Advanced reports: Instant visibility into sensitivity label coverage, DLP policy activity, and posture trends with drill-down filters.

5.3 AI Agent Governance

AI agents are now first-class entities in Purview’s governance model — not afterthoughts:

Capability | What It Does | Applies To
AI Observability | Inventory of all AI apps and agents; risk level assignment per agent; sensitive interaction count | Copilot Studio, Azure AI Foundry, third-party agents, Agent 365
Agentic Risk in IRM | Agent-specific risk indicators detect unauthorized data access and anomalous behaviors | All agents with M365 access
DLP for Agents | Agents inherit DLP protections: they are prevented from accessing labeled files or sending sensitive data via Teams | First-party and Copilot Studio agents
Communication Compliance | Detects non-compliant activity in human-agent interactions; proactive policy-based governance | All agent interactions in M365
eDiscovery & Audit | Agent prompt/response retention, deletion policies, and legal hold extended to agent interactions | All M365-connected agents
Risky Agents Policy Template | IRM template detects anomalous agent behaviors including exfiltration patterns | Copilot Studio and Microsoft Foundry agents

Key Architectural Principle: Purview treats AI agents as data principals: they inherit the same protections as human users. A Highly Confidential labeled file cannot be accessed by an agent any more than by an unauthorized human. This governance-by-design approach eliminates the ‘Shadow AI’ data exposure gap.

Chapter 6: Data Policy & Access Governance

6.1 Purview Data Policy Architecture

Purview’s Data Policy capability represents a fundamental shift from infrastructure-level ACLs managed by engineers to business-level policies managed by data owners and governance teams. A Purview data access policy states: ‘Users in group X can perform action Y on data assets matching classification Z.’

6.2 Policy Types

  • Data owner policies: Grant read or read/modify access to Azure Storage, ADLS Gen2, Azure SQL, and Fabric without involving the infrastructure team.
  • DevOps policies: Grant SQL performance monitoring access (VIEW DATABASE STATE) to DevOps engineers without granting data read permissions.
  • Self-service data access policies: Data consumers request access through the Unified Catalog. Automated workflow routes to data owner. Access provisioned or rejected with full audit trail.
  • Attribute-based access control (ABAC): Grant access based on asset classifications rather than specific named assets. New assets automatically inherit correct policies as they are classified.
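
The policy statement in 6.1 and the ABAC behavior above can be illustrated with a small evaluator. This is a sketch of the idea only, not Purview's policy engine: `AccessPolicy`, `is_allowed`, and the group and classification names are hypothetical, and the default-deny behavior is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessPolicy:
    """'Users in group X can perform action Y on assets classified Z.'"""
    group: str
    action: str
    classification: str


def is_allowed(policies, user_groups, action, asset_classifications):
    """Default deny: allow only if some policy matches group, action, and classification."""
    return any(
        p.group in user_groups
        and p.action == action
        and p.classification in asset_classifications
        for p in policies
    )


POLICIES = [
    AccessPolicy("finance-analysts", "read", "Financial"),
    AccessPolicy("finance-engineers", "modify", "Financial"),
]

# An asset classified after the policy was written is covered without any
# policy change, which is the property that distinguishes ABAC from
# grants on named assets.
new_asset_classifications = {"Financial", "PII"}
analyst_can_read = is_allowed(
    POLICIES, {"finance-analysts"}, "read", new_asset_classifications
)
analyst_can_modify = is_allowed(
    POLICIES, {"finance-analysts"}, "modify", new_asset_classifications
)
```

Note that matching on any single classification is deliberately simplistic; a production rule set would also need to decide how an additional sensitive classification such as PII narrows access.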

6.3 Policy Enforcement Architecture

Data Source | Policy Enforcement Point | Latency to Enforce | Granularity
ADLS Gen2 | Azure Storage RBAC (via Purview policy propagation) | < 5 minutes | Container, folder, file level
Azure SQL Database | SQL permissions (system-managed) | < 2 minutes | Database, schema, table level
Microsoft Fabric | Fabric workspace and item permissions | < 10 minutes | Workspace, lakehouse, table level
Azure Synapse Analytics | Synapse workspace RBAC | < 5 minutes | Workspace, pool level
Fabric Warehouse/KQL DBs (New GA) | DLP policy tip triggering on sensitive data upload | Near real-time | Asset-level with sensitive data detection

Architectural Constraint: Purview data policies do NOT replace row-level security (RLS), column masking, or Dynamic Data Masking (DDM) in SQL databases. Purview policies govern who can connect and query; RLS and DDM govern what data they see within an allowed connection. Both layers are required for complete access governance.

Chapter 7: Information Protection & DLP

7.1 Sensitivity Label Taxonomy

Sensitivity labels are the governance primitive that spans cloud storage, databases, Office documents, Teams messages, third-party applications, and, as of 2026, AI agent interactions. The taxonomy must balance usability with enforcement precision. Having more than 10 labels typically causes label fatigue.

Label | Definition | Protection Actions | Auto-labelling Trigger
Public | Approved for external publication. No restrictions. | None | No sensitive classifications detected
Internal | Business information for employee use. Not for public sharing. | Watermark on documents | Default label applied to all unlabelled items
Confidential | Sensitive business data. External sharing requires approval. | Encryption, external sharing DLP block, watermark | PII classification, financial data patterns
Highly Confidential | Regulated data, trade secrets, executive communications. | Encryption, download restrictions, MFA for access, audit logging | PHI, PCI data, credentials, classified IP
Restricted | Legal hold, regulatory investigation, M&A sensitive. | Encryption, access list restricted to named individuals, no forwarding | Legal trigger or manual assignment only
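
The auto-labelling column amounts to a "most restrictive trigger wins" rule, which the following sketch makes explicit. It is an illustration under assumptions: the classification keys (`PII`, `PHI`, and so on) are shorthand for the triggers in the table rather than real Purview classification names, and the function returns `None` when nothing fires so that the Public or Internal default is left to the default-label policy.

```python
# Labels ordered from least to most restrictive, as in the taxonomy table.
LABEL_ORDER = ["Public", "Internal", "Confidential", "Highly Confidential", "Restricted"]

# Shorthand classification -> label it triggers. Restricted is intentionally
# absent because the table reserves it for legal triggers or manual assignment.
AUTO_LABEL_TRIGGERS = {
    "PII": "Confidential",
    "Financial": "Confidential",
    "PHI": "Highly Confidential",
    "PCI": "Highly Confidential",
    "Credentials": "Highly Confidential",
}


def suggest_label(classifications):
    """Return the most restrictive label triggered, or None if no trigger fires."""
    triggered = [
        AUTO_LABEL_TRIGGERS[c] for c in classifications if c in AUTO_LABEL_TRIGGERS
    ]
    if not triggered:
        return None
    return max(triggered, key=LABEL_ORDER.index)
```

For example, an asset classified as both PII and PHI resolves to Highly Confidential, while an asset with no sensitive classifications returns `None` and keeps whatever default label applies.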

7.2 DLP Policy Architecture (2026)

As of 2026, the Purview DLP documentation has been reorganized and now explicitly covers three scenarios: protecting enterprise data, protecting enterprise data on devices, and inline data protection.

DLP Scope | Trigger Condition | Action | Business Justification Override
Exchange Email | Highly Confidential label; external recipient | Block delivery; notify sender; generate incident | Yes (manager approval workflow)
SharePoint/OneDrive | Confidential label; public sharing link created | Block link creation; notify user; generate incident | Yes (data owner approval)
Teams Messages | Credit card number, SSN pattern in message | Block send; notify user; policy tip displayed | No (hard block; financial regulatory)
Fabric Warehouse (New GA) | Sensitive data detected in asset uploaded to Warehouse | Policy tip trigger; restrict access for KQL/SQL DBs | Admin configurable
AI Agents (New) | Agent attempts to access Highly Confidential labeled file | Block agent access; audit log entry; alert to admin | No (security team review required)
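
To make the Teams Messages row concrete, the sketch below shows one common way a credit card trigger can be evaluated: a digit pattern followed by a Luhn checksum to cut false positives. This is an assumption about the general technique, not a description of Purview's detection logic; the regular expression only covers 16-digit numbers, and `teams_message_action` is a hypothetical name.

```python
import re

# 16 digits, optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){15}\d\b")


def luhn_valid(digits: str) -> bool:
    """Luhn checksum: double every second digit from the right, then sum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list:
    """Return candidate card numbers in text that also pass the Luhn check."""
    found = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            found.append(digits)
    return found


def teams_message_action(text: str) -> str:
    """Hard block, mirroring the 'No override' column for Teams messages."""
    return "block" if find_card_numbers(text) else "allow"
```

The checksum step matters operationally: a bare 16-digit pattern would also match order numbers and tracking IDs, which is a typical source of the false positives that the roadmap's tuning weeks aim to reduce.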

Chapter 8: Deployment Architecture & Operating Model

8.1 Deployment Architecture Patterns

  • Pattern 1 — Centralized Governance (Single Account): Best for organizations with <50,000 data assets, single-geography operation, or strong central governance team. Simple operations, lower cost.
  • Pattern 2 — Federated Governance (Hub-and-Spoke): Best for large multi-geography organizations with autonomous business units. Central CDO office hub; business unit spoke accounts synchronized via Purview metadata API.
  • Pattern 3 — Domain-Aligned (Data Mesh): Best for organizations implementing Data Mesh. Each data domain owns its own Purview account. Enterprise governance sets standards; federated computational governance via shared glossary and classification taxonomy.

8.2 Governance Operating Model

Role | Responsibilities | Time Commitment | Purview Role
Chief Data Officer | Set governance strategy, approve glossary, report metrics to board | 2–4 hrs/week | Insights Reader, Collection Admin (Root)
Data Governance Lead | Operate Purview program, manage stewards, evolve policies | Full time | Collection Admin, Policy Author
Domain Data Owner | Own data quality for domain, approve certifications and access requests | 4–8 hrs/week | Data Curator (domain collection)
Data Steward | Enrich metadata, link glossary terms, resolve classification issues, review flagged assets | 50–100% FTE | Data Curator
Data Engineer | Register data sources, configure scans, build custom lineage integration | 20% allocation | Data Source Admin
AI Governance Analyst (New Role) | Monitor AI agent risk scores in DSPM, review AI Observability reports, manage agentic risk policies | 20–50% FTE | Insights Reader + Security Admin

Operating Model Insight: A Purview deployment without assigned Data Stewards is a catalog that fills with metadata but never becomes trusted. For a 50,000-asset estate, plan for 2–3 full-time stewards in year one. Automation can increase effective capacity to 1 FTE per 100,000 assets at maturity. For organizations deploying Copilot or AI agents, an AI Governance Analyst role is now essential.
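
The staffing guidance above reduces to simple arithmetic, sketched here. The year-one ratios are derived from "2–3 stewards for 50,000 assets" (roughly 16,667 to 25,000 assets per steward); treating them as linear planning ratios for other estate sizes is an assumption, and `stewards_needed` is a hypothetical helper.

```python
import math

# Assets one full-time steward can cover, derived from the guidance above.
YEAR_ONE_LIGHT = 25_000      # 50,000 assets / 2 stewards
YEAR_ONE_HEAVY = 16_667      # about 50,000 assets / 3 stewards
AT_MATURITY = 100_000        # 1 FTE per 100,000 assets


def stewards_needed(asset_count: int, assets_per_steward: int) -> int:
    """Round up, since a fractional steward still means another hire or allocation."""
    return math.ceil(asset_count / assets_per_steward)


# A hypothetical 250,000-asset estate.
year_one_range = (
    stewards_needed(250_000, YEAR_ONE_LIGHT),
    stewards_needed(250_000, YEAR_ONE_HEAVY),
)
mature_headcount = stewards_needed(250_000, AT_MATURITY)
```

For that hypothetical estate the year-one range is 10 to 15 stewards, falling to 3 at maturity, which is why automation investment has to be planned alongside headcount.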

Chapter 9: Scale, Performance & Cost Optimization

9.1 Capacity Planning

Resource | Scale Limit (2025) | Recommendation
Assets in Data Map | 100 million assets per account | For >80M assets, begin planning federated architecture
Registered sources | 3,000 sources per account | Consolidate similar source types into single registered sources where possible
Concurrent scans | 100 concurrent scan runs | Use scan scheduling to avoid peak concurrency; prioritize by source criticality
Glossary terms | 100,000 terms per account | Maintain term hygiene; deprecate unused terms quarterly
Collections | 256 collections per account | Design flat-ish hierarchies; max 4–5 levels for most organizations
Custom classification rules | 500 per account | Consolidate similar patterns; use regex groups over multiple single patterns
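
A capacity check against these limits is easy to automate. The sketch below hard-codes the limits from the table and flags any resource above a warning ratio. The 80% threshold comes from the ">80M assets" recommendation; applying the same ratio to the other resources is an assumption, and the usage figures would have to come from your own inventory.

```python
# Per-account limits copied from the capacity planning table.
LIMITS = {
    "assets": 100_000_000,
    "sources": 3_000,
    "concurrent_scans": 100,
    "glossary_terms": 100_000,
    "collections": 256,
    "custom_classification_rules": 500,
}


def capacity_report(usage, warn_ratio=0.8):
    """Map each resource to (utilization ratio, status).

    Status is 'over' above the limit, 'warn' above warn_ratio, else 'ok'.
    Resources missing from usage are treated as zero.
    """
    report = {}
    for resource, limit in LIMITS.items():
        ratio = usage.get(resource, 0) / limit
        if ratio > 1:
            status = "over"
        elif ratio > warn_ratio:
            status = "warn"
        else:
            status = "ok"
        report[resource] = (round(ratio, 3), status)
    return report


# Hypothetical estate approaching the asset limit.
sample = capacity_report({"assets": 85_000_000, "collections": 100, "sources": 3_200})
```

In this sample the asset count triggers a warning (the point at which federated architecture planning should start) and the registered-source count is already over the limit.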

9.2 Scan Performance Optimization

Optimization Lever | Description | Performance Impact | Tradeoff
Incremental scanning | Scan only new/modified assets since last scan using watermark-based detection | 60–80% reduction in scan time for stable sources | May miss classification changes on unmodified assets
Targeted scanning | Scope scans to specific folders, schemas, or file patterns | 40–70% faster | Requires good source naming conventions
Classification rule optimization | Use targeted rule sets per source type; reduce rules per scan rule set | 30–50% faster per scanned asset | Requires maintaining multiple scan rule sets
Off-hours scheduling | Schedule large scans for 2–6 AM to avoid competing with production workloads | No throughput gain but avoids source contention | Delayed freshness; not suitable for compliance triggers

9.3 Cost Optimization

Purview pricing is based on Data Map capacity units (CUs), scan compute, and Microsoft 365 Compliance licensing. A new pay-as-you-go pricing model (available alongside the Suite license) covers data estates, analytics, and AI apps — use the DSPM Usage Center to track consumption per investigation and avoid over-provisioning.

Cost Component | Billing Model | Optimization Strategy
Data Map capacity units | $0.496/CU/hour (1 CU = 1 GB metadata storage + processing capacity) | Incremental scans reduce CU consumption by 60–75%
Scan compute | Billed per vCore-hour for SHIR; Managed VNet included in CUs | Right-size SHIR VMs; schedule to minimize runtime; use MVNet where possible
M365 Compliance (DLP, Labels) | Included in M365 E5 or E5 Compliance add-on | Audit license assignments; unused Compliance seats are common waste
DSPM & AI Governance (New) | Pay-as-you-go; Data Security Investigation Compute Units (DSICUs) replaced SCUs | Use the Usage Center dashboard to track per-investigation consumption
Defender for Cloud integration | Charged per resource per hour | Enable only for in-scope regulated workloads
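
The capacity unit line can be turned into a rough monthly estimate as follows. This is a back-of-the-envelope sketch: it uses the $0.496/CU/hour rate from the table, assumes a 730-hour month, and assumes the 60–75% incremental-scan reduction applies linearly to billed CU-hours. Actual billing may differ, so treat the output as a planning figure only.

```python
CU_RATE_USD_PER_HOUR = 0.496  # rate quoted in the cost table
HOURS_PER_MONTH = 730         # assumption: average hours in a month


def monthly_data_map_cost(capacity_units, scan_reduction=0.0):
    """Estimate monthly Data Map cost in USD.

    scan_reduction is the assumed fraction of CU consumption saved by
    incremental scanning (for example 0.60 to 0.75).
    """
    if not 0.0 <= scan_reduction <= 1.0:
        raise ValueError("scan_reduction must be between 0 and 1")
    cost = capacity_units * CU_RATE_USD_PER_HOUR * HOURS_PER_MONTH
    return round(cost * (1.0 - scan_reduction), 2)


baseline = monthly_data_map_cost(4)                      # full scans only
optimized_low = monthly_data_map_cost(4, scan_reduction=0.60)
optimized_high = monthly_data_map_cost(4, scan_reduction=0.75)
```

Under these assumptions a 4-CU footprint costs about $1,448 per month at baseline and between roughly $362 and $579 once incremental scanning is in place.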

Chapter 10: Real-World Case Studies

Case Study 1: Global Financial Services Firm — GDPR & PCI Compliance Transformation

Dimension | Details
Organization | Pan-European retail bank, 12,000 employees, 85 data sources
Challenge | GDPR audit failed in 2022 due to inability to demonstrate PII data inventory. €2.3M fine issued. Compliance team spent 14 weeks per audit cycle manually documenting data assets.
Purview Scope | Data Map (85 sources), full estate classification, GDPR & PCI assessments in Compliance Manager, sensitivity labels across M365 and Azure Storage, DLP policies for credit card and IBAN patterns
Timeline | 16 weeks to full production deployment across all 85 sources

Metric | Before Purview | After Purview (12 months)
Compliance audit preparation time | 14 weeks manual | 6 days automated
PII coverage (classified assets) | 23% (manual inventory) | 94% (automated)
Sensitivity label coverage | 8% (M365 only) | 87% across Azure + M365
GDPR SRR response time | 28 days | 4 days
Annual compliance staffing cost | £1.2M (12 FTE) | £380K (3 FTE + automation)
Purview TCO (Year 1) | | £420K (all-in)

Case Study 2: Healthcare Network — PHI Governance & HIPAA Continuous Compliance

Dimension | Details
Organization | US regional hospital network, 22 hospitals, 6,500 clinical staff, 140TB of health data across Azure and on-premises
Challenge | Inability to demonstrate minimum-necessary access principle for PHI (HIPAA §164.514). Multiple breaches of PHI to non-clinical staff through misconfigured Power BI reports.
Purview Scope | Healthcare-specific classification rules (34 custom PHI types), Data Map across Epic EHR integration layer + Azure SQL + ADLS Gen2, Purview policy for PHI access restriction, DLP to block PHI in Teams/email, Compliance Manager HIPAA assessment
Key Result | Classification accuracy validated at 96.3% against 10,000 manually labelled records. First external HIPAA audit post-deployment: no significant findings.

Case Study 3: Retail Enterprise — AI-Ready Data Mesh with Fabric & Purview (2026)

A FTSE 100 retailer with 8 data domains implemented a Data Mesh on Microsoft Fabric. The governance challenge evolved in 2025: beyond interoperability and trust, they needed to govern Copilot-powered analytics agents accessing domain-owned data products.

  • Deployed DSPM AI Observability to inventory 47 AI agents accessing the Fabric estate — 12 were flagged as high-risk due to oversharing patterns.
  • Applied DLP policies to Fabric Warehouse and KQL DBs (newly GA) to prevent sensitive data leakage through Copilot agent responses.
  • Insider Risk Management extended to Fabric lakehouses with built-in risk indicators for potential data exfiltration by agents.

KPI | Target | Achieved (Month 12)
Data products certified | 80% of published products | 84%
Cross-domain data access time | < 3 business days | 1.2 days average
AI agent data risk incidents resolved | < 5/month | 2.1/month average
Time to identify root cause of cross-domain data issue | < 4 hours | 47 minutes average

Chapter 11: 90-Day Implementation Roadmap

Based on 20+ Purview deployments, the following 90-day roadmap represents the optimal sequencing for enterprise governance programs in 2026 — incorporating AI governance activation alongside traditional catalog and classification work.

Phase 1: Foundation (Days 1–30)

Week | Activity | Owner | Success Criteria
1 | Purview account provisioning, network design (MVNet vs SHIR), Microsoft Entra ID group creation, collection hierarchy design | Data Engineer + Architect | Purview account live; network connectivity validated; collection hierarchy approved
1–2 | Source inventory: document all data sources including AI systems and Copilot deployments | Data Governance Lead | Complete source inventory; sources prioritized by regulatory risk; AI systems catalogued
2 | Glossary foundation: identify 50–100 core business terms per domain with definitions, owners, and related terms | Data Governance Lead + Domain Owners | Core glossary terms in Draft status; term owners assigned
2–3 | Register and scan Tier 1 sources (PCI scope, PHI scope, GDPR-critical sources) | Data Engineer | Tier 1 sources scanned; classification results reviewed; false positive rate <5%
3–4 | Classification review and tuning: build custom rules for organizational patterns | Data Steward + Data Engineer | Custom classification rules deployed; accuracy >90% on validation set
4 | RBAC configuration: assign roles to domain teams using Microsoft Entra ID groups | Data Governance Lead | All roles assigned; domain teams can access catalog; engineers can register sources

Phase 2: Activation (Days 31–60)

Week | Activity | Owner | Success Criteria
5–6 | Register and scan all remaining data sources; configure incremental scan schedules | Data Engineer | >90% of data estate registered and scanned; scan schedule operational
6 | Quick win: publish Governance Maturity Score dashboard in Power BI; present to leadership | Data Governance Lead | Dashboard live; leadership briefing completed; program funded for Phase 3
6–7 | Sensitivity label deployment: configure taxonomy; deploy auto-labelling policies in simulation mode | Security + Data Governance | Labels published; simulation mode running; simulation report reviewed
7–8 | Enable DSPM: activate AI Observability; inventory all AI apps and agents; assign initial risk levels | Security + AI Governance Analyst | AI agent inventory complete; high-risk agents identified; initial DSPM posture established
7–8 | Lineage validation: verify ADF, Synapse, Fabric lineage; build custom lineage via Atlas API for non-native sources | Data Engineer | End-to-end lineage visible for >3 critical pipelines; column-level lineage for Power BI reports
8 | Compliance Manager setup: create GDPR, HIPAA, or relevant regulatory assessments | Compliance Officer | At least one regulatory assessment active; initial compliance score baseline established

Phase 3: Optimization (Days 61–90)

Week | Activity | Owner | Success Criteria
9–10 | Enable sensitivity labels in production (exit simulation mode); deploy DLP policies including Fabric Warehouse DLP | Security + Data Governance | Labels applying to new content; DLP incidents in dashboard; <10% false positive rate
10–11 | AI governance policies: configure IRM Risky Agents template; deploy agent-specific DLP; establish AI data access policies | AI Governance Analyst + Security | Risky agent policies active; agent DLP blocking tested; AI governance posture report generated
11 | Glossary completion: approve Tier 1 glossary terms; link to classified assets via bulk assignment | Data Steward | >80% of Tier 1 assets linked to at least one approved glossary term
12 | Program review: measure Governance Maturity Score vs. Week 1 baseline; document lessons; plan 90–180 day roadmap | CDO + Data Governance Lead | Governance score improvement documented; 90–180 day roadmap approved; operating model confirmed

Critical Success Factor: Governance programs that fail typically do so in Days 31–60, the ‘activation phase.’ Quick wins must be demonstrated by Day 45 to maintain organizational momentum. The Power BI Governance Maturity Score dashboard is designed as this early value demonstration. In 2026, demonstrating AI agent governance to leadership is an equally powerful motivator for program continuation.

Appendix: Governance Maturity Model & Quick Reference

A. Governance Maturity Model

Level | Name | Characteristics | Target Score | Typical Timeline
L1 | Initial | Ad hoc governance; no systematic catalog; manual compliance; governance by tribal knowledge | 0–20 | Starting point
L2 | Managed | Data sources registered and scanned; basic classification applied; glossary under development; ownership partially assigned | 20–50 | 0–6 months post-deployment
L3 | Defined | Full estate classification; glossary approved and linked; lineage documented; compliance assessments active | 50–70 | 6–12 months post-deployment
L4 | Quantitatively Governed | Governance Maturity Score tracked weekly; stewardship SLAs enforced; access policies active; DLP protecting sensitive data; AI agent inventory established | 70–85 | 12–24 months
L5 | Optimizing | Automated certification; continuous compliance; AI-assisted stewardship; full AI governance with DSPM; governance embedded in CI/CD pipelines | 85–100 | 24–36 months
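
Mapping a Governance Maturity Score to a level is a simple threshold lookup, sketched below. Because the score bands in the table share their endpoints (20, 50, 70, 85), the sketch assumes a boundary score belongs to the higher level; that tie-break is an assumption, not something the table specifies.

```python
import bisect

LEVELS = [
    "L1 Initial",
    "L2 Managed",
    "L3 Defined",
    "L4 Quantitatively Governed",
    "L5 Optimizing",
]
# Lower bounds of L2..L5 from the Target Score column.
LOWER_BOUNDS = [20, 50, 70, 85]


def maturity_level(score: float) -> str:
    """Return the maturity level for a score between 0 and 100 inclusive."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    # bisect_right counts how many lower bounds the score has reached.
    return LEVELS[bisect.bisect_right(LOWER_BOUNDS, score)]
```

A score of 62 maps to L3 Defined, and a score of exactly 70 maps to L4 under the tie-break described above.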

B. Key Purview REST API Reference

Operation | Method | Endpoint | Use Case
List collections | GET | /account/collections | Audit collection hierarchy; governance reports
Get asset by qualified name | GET | /catalog/api/atlas/v2/entity/uniqueAttribute/type/{typeName} | Look up specific asset metadata in automation
Update asset contacts | PUT | /catalog/api/atlas/v2/entity/guid/{guid}/businessattribute/Contacts | Bulk owner assignment in onboarding automation
Submit lineage | POST | /catalog/api/atlas/v2/entity/bulk | Custom lineage for non-native sources (dbt, custom ETL)
Run scan | POST | /scan/datasources/{dsName}/scans/{scanName}/runs | Trigger scan on-demand from CI/CD on schema change
Create glossary term | POST | /catalog/api/atlas/v2/glossary/term | Bulk glossary population from existing business dictionaries
Get DSPM agent inventory (New) | GET | /security/dspm/agents | List all AI agents with risk levels and sensitive interaction counts
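
As an illustration of the "Get asset by qualified name" row, the sketch below builds the request URL without sending it. It assumes the account-specific host form `https://{account}.purview.azure.com` and the Atlas convention of passing the qualified name in an `attr:qualifiedName` query parameter. The account name, type name, and qualified name are placeholders, and acquiring a Microsoft Entra ID bearer token is out of scope here.

```python
from urllib.parse import quote

ENTITY_BY_NAME_PATH = "/catalog/api/atlas/v2/entity/uniqueAttribute/type/{type_name}"


def entity_lookup_url(account: str, type_name: str, qualified_name: str) -> str:
    """Build the GET URL for looking up one asset by its qualified name."""
    base = f"https://{account}.purview.azure.com"  # assumed endpoint form
    path = ENTITY_BY_NAME_PATH.format(type_name=quote(type_name, safe=""))
    # Qualified names usually contain ':' and '/', so encode them fully.
    query = f"attr:qualifiedName={quote(qualified_name, safe='')}"
    return f"{base}{path}?{query}"


# Placeholder values for illustration only.
url = entity_lookup_url(
    "contoso",
    "azure_sql_table",
    "mssql://srv.example/db/dbo/orders",
)
```

Building the URL in one helper keeps the encoding of qualified names consistent across onboarding scripts, which is where lookups most often fail in practice.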

C. Glossary of Key Terms

Term | Definition
Apache Atlas | Open-source metadata management framework; the foundational metadata model underlying Purview’s Data Map
Business Glossary | Curated vocabulary of business terms linked to data assets; provides semantic context and shared language
Collection | Hierarchical container in Purview that scopes metadata, access control, and policy enforcement
Data Lineage | Documentation of data origin, movement, and transformation, tracing how data flows from source to consumption
DSPM | Data Security Posture Management: Purview’s unified plane for discovering, protecting, and investigating data risks across traditional and AI workloads
AI Observability | DSPM capability providing an inventory of all AI apps and agents, their risk levels, and sensitive data interactions
Microsoft Entra ID | The current name for Azure Active Directory (rebranded October 2023). All Purview documentation and configurations should use this term.
Unified Catalog | GA feature (2025) consolidating data discovery, data quality, automated access workflows, and glossary management into a single experience
OpenLineage | Open standard for data lineage metadata; used by Purview Spark connector to emit lineage from Spark jobs
Sensitivity Label | Classification tag applied to data assets and documents that drives downstream protection actions across the entire Microsoft ecosystem
