MICROSOFT PURVIEW
Data Governance at Scale
Enterprise Architecture, Implementation Patterns & AI-Era Governance
Data Platform Practice | Microsoft Ecosystem
Veratas | 2026 Edition | Version 2.0
Executive Summary
Data is the most valuable asset enterprises possess — yet for most organizations, it remains ungoverned, undiscovered, and untrustworthy. Regulatory scrutiny has never been higher. GDPR fines exceeded €4.2 billion in 2023. The average cost of a data breach reached $4.88 million in 2024 (IBM). And yet, surveys consistently find that fewer than 30% of enterprise data assets are formally catalogued or classified.
Microsoft Purview has evolved significantly since its launch in 2022. As of early 2026, it is no longer merely a data catalog — it is Microsoft’s unified data security, governance, and compliance platform for the era of AI. New capabilities including Data Security Posture Management (DSPM), AI Observability for agents, and the Generally Available Unified Catalog have fundamentally expanded what Purview can do for enterprise data programs.
This white paper is written for data architects, Chief Data Officers, compliance leaders, and senior engineers who need a current, production-grade reference. It reflects the state of the platform as of Q1 2026 — incorporating AI governance capabilities, updated terminology (Microsoft Entra ID, not Azure Active Directory), and the latest Fabric integration innovations announced March 2026.
Key Measured Outcomes
| Dimension | Without Purview | With Purview (Measured) |
| Data Discovery | Manual inventory; 40–60% assets undocumented | 95%+ automated classification across 200+ source types |
| Time to Compliance Audit | 6–12 weeks manual preparation | 2–3 days with automated evidence collection |
| Data Breach Detection | Mean time 197 days (IBM 2024) | Near-real-time DLP alerts and sensitivity label enforcement |
| AI Data Risk Visibility | 0% — no tooling for agent/AI data flows | Full AI Observability via DSPM; agent risk scoring and remediation |
| Data Consumer Productivity | Avg 4.2 hours/week searching for trusted data | 68% reduction in data search time (customer benchmark) |
| Governance TCO (3-year) | Distributed tools: $2.1M–$4.8M | Purview consolidation: $800K–$1.6M (45–65% reduction) |
| Key Insight: Microsoft Purview is not merely a data catalog — it is an integrated governance, risk, and compliance (GRC) platform that creates a closed-loop governance operating model. As of 2026, it is also your primary control plane for AI data risk. Organizations that treat it as only a catalog leave 60% of its capability untapped. |
Chapter 1: The Data Governance Imperative
1.1 Why Governance Fails Without Architecture
Most governance programs fail not for lack of intent, but because of architectural sprawl. Governance teams run on spreadsheets, data stewards work in isolation, and lineage is tracked manually, if at all. The root causes are structural:
- Federated data ownership without centralized metadata: Business units own data but metadata lives nowhere.
- Tool fragmentation: Organizations accumulate Collibra, Informatica, Alation, Apache Atlas, and custom wikis — each partial, none authoritative.
- Classification as a one-time project: Point-in-time inventories that decay immediately as new data arrives.
- Compliance as reactive audit response: Evidence collection happens at audit time, not continuously.
- AI data flows with no governance: Copilot prompts, agent responses, and AI-generated data traverse the estate invisibly — a new and rapidly growing gap.
1.2 The 2026 Regulatory Landscape
| Regulation | Key Requirement | Purview Capability | Implementation Evidence |
| GDPR (EU) | Data subject rights, consent tracking, cross-border controls | Data Map classification, Subject Rights Requests, sensitivity labels | Automated PII detection; SRR workflow audit trail |
| HIPAA (US) | PHI identification, access controls, audit logging | Custom PHI classification rules, policy enforcement, access reviews | Classification report; access policy audit log export |
| CCPA (California) | Consumer data inventory, opt-out rights | Data estate inventory export, lineage documentation | Asset inventory report; consent metadata tagging |
| PCI-DSS v4.0 | Cardholder data scoping, encryption, access logging | Sensitive info type: Credit Card, DLP policies, encryption insights | Data map export; DLP incident reports |
| EU AI Act | AI system risk classification, data quality for AI training | DSPM AI Observability, Unified Catalog data quality, agent governance | Agent risk inventory; data quality scan reports |
| ISO 27001:2022 | Information classification, asset inventory, supplier management | Full data catalog, sensitivity labels, third-party scanner integration | Control mapping export from Compliance Manager |
| War Story: A global bank with 47 data sources spent 11 weeks preparing for a GDPR audit. After deploying Purview with automated scanning, the same audit was prepared in 4 days — with full lineage documentation. Annual compliance preparation cost reduced from $1.8M to $340K. |
Chapter 2: Microsoft Purview — Platform Architecture (2026)
2.1 Architectural Overview
Microsoft Purview is delivered as a cloud-native SaaS service. As of 2026, its architecture has expanded to five primary planes:
| Plane | Components | Primary Function |
| Data Map Plane | Automated scanning, classification engine, lineage collector, Atlas-compatible metadata store | Discovery, classification, and relationship mapping across all data sources |
| Unified Catalog Plane (GA 2025) | Search index, glossary engine, data products, automated access workflows, data quality tools | Self-service discovery, business context, data access, and quality management for consumers |
| Governance Insights Plane | Estate health dashboards, sensitivity coverage reports, stewardship metrics, Data Estate Insights | Measurement, reporting, and continuous improvement of governance program |
| Compliance & Protection Plane | Information Protection (sensitivity labels), DLP engine, Compliance Manager, eDiscovery, Audit | Regulatory compliance, data protection, legal hold, and investigation |
| DSPM & AI Governance Plane (New 2025/26) | Data Security Posture Management, AI Observability, Agent Risk Management, Security Copilot integration | Visibility and remediation of data risks across human and AI agent activity |
2.2 Identity & RBAC Architecture
Purview uses Microsoft Entra ID (formerly Azure Active Directory, renamed October 2023) for authentication and implements its own RBAC model on top.
| Purview Role | Scope | Permissions | Recommended Assignment |
| Collection Admin | Per-collection | Manage sub-collections, assign roles within scope | Business unit data governance leads |
| Data Source Admin | Per-collection | Register and manage data sources, create scan rule sets | Data engineering team leads |
| Data Curator | Per-collection | Edit metadata, apply glossary terms, manage classifications | Data stewards, domain data owners |
| Data Reader | Per-collection | Read-only access to catalog, lineage, and classifications | Data consumers, analysts, report developers |
| Insights Reader | Account-level | Access Data Estate Insights dashboards | CDO, governance program manager |
| Policy Author | Account-level | Create and publish data access policies | Security architects, data governance lead |
Chapter 3: Data Map — Discovery & Classification at Scale
3.1 Scanning Architecture
The Data Map’s scanning engine is the foundation of Purview governance. Understanding scan architecture is critical to building a reliable governance program. Three scan execution models are available:
- Managed Virtual Network (MVNet) — Recommended: Purview manages the integration runtime within a Microsoft-managed VNet. No infrastructure to deploy. Best for most Azure-native deployments.
- Self-Hosted Integration Runtime (SHIR): Customer-deployed VM running the Purview runtime agent. Required for on-premises sources, private network sources, and non-Azure cloud sources.
- Azure Integration Runtime (AIR): Used for public-endpoint Azure sources. Not recommended for sensitive environments.
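For teams automating source onboarding, a scan definition is ultimately a JSON body sent to the Purview Scanning REST API. The sketch below only builds such a body locally; the scan `kind`, ruleset names, the SHIR reference name, and the endpoint path shown in the comments are illustrative assumptions to be verified against current Microsoft documentation.

```python
# Illustrative sketch: constructing a scan definition for an Azure SQL
# Database source. Field names follow the Scanning API's general shape;
# verify kinds and api-version against current docs before use.

def build_scan_definition(collection_id: str, runtime: str = "Managed") -> dict:
    """Build a scan request body; runtime is "Managed" (MVNet) or "SelfHosted" (SHIR)."""
    body = {
        "kind": "AzureSqlDatabaseMsi",  # managed-identity credential (assumed kind name)
        "properties": {
            "scanRulesetName": "AzureSqlDatabase",
            "scanRulesetType": "System",
            "collection": {"referenceName": collection_id,
                           "type": "CollectionReference"},
        },
    }
    if runtime == "SelfHosted":
        # Pin the scan to a self-hosted integration runtime (hypothetical name)
        body["properties"]["connectedVia"] = {"referenceName": "shir-onprem-01"}
    return body

scan = build_scan_definition("finance-domain", runtime="SelfHosted")
# The payload would then be PUT to something like:
#   https://{account}.purview.azure.com/scan/datasources/{source}/scans/{name}
# authenticated with a Microsoft Entra ID token
# (e.g. azure.identity.DefaultAzureCredential).
```

Keeping payload construction separate from the HTTP call makes scan definitions easy to review and unit-test before they touch the live account.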
SHIR Sizing Guide
| Data Volume (Assets) | CPU | RAM | Network Bandwidth | Node Count |
| < 100K assets | 4 vCPU | 8 GB | 100 Mbps | 1 (no HA) |
| 100K – 1M assets | 8 vCPU | 16 GB | 1 Gbps | 2 (active-active HA) |
| 1M – 10M assets | 16 vCPU | 32 GB | 10 Gbps | 4 (2 + 2 failover) |
| > 10M assets | 32 vCPU | 64 GB | 10 Gbps dedicated | 8+ (scale-out cluster) |
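The sizing tiers above are easy to encode as a lookup, for example in infrastructure-as-code that provisions SHIR VMs. This is a direct transcription of the table, not a Microsoft-published formula:

```python
def shir_spec(asset_count: int) -> dict:
    """Map estate size to the SHIR sizing tiers in the table above."""
    tiers = [
        (100_000,    {"vcpu": 4,  "ram_gb": 8,  "nodes": 1}),   # no HA
        (1_000_000,  {"vcpu": 8,  "ram_gb": 16, "nodes": 2}),   # active-active HA
        (10_000_000, {"vcpu": 16, "ram_gb": 32, "nodes": 4}),   # 2 + 2 failover
    ]
    for upper_bound, spec in tiers:
        if asset_count < upper_bound:
            return spec
    return {"vcpu": 32, "ram_gb": 64, "nodes": 8}  # scale-out cluster

shir_spec(250_000)  # -> {"vcpu": 8, "ram_gb": 16, "nodes": 2}
```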
3.2 Supported Data Sources
| Source Category | Supported Sources | Lineage Support | Classification |
| Azure Data | ADLS Gen1/Gen2, Azure Blob, Azure SQL DB, SQL MI, Synapse, Cosmos DB, PostgreSQL, MySQL | Yes (native connectors) | Full (all classification types) |
| Microsoft Fabric | Fabric Lakehouse, Fabric Warehouse, Dataflows, Power BI datasets/reports | Yes (deep integration, column-level) | Full, bidirectional label sync |
| On-Premises | SQL Server 2012+, Oracle 12c+, SAP HANA, Teradata, HDFS | SQL Server: Yes. Others: Limited | Full classification |
| Multi-Cloud | AWS S3, AWS RDS, Google BigQuery, GCS, Snowflake | Limited (no native lineage) | Full classification |
| Third-Party (DSPM) | Salesforce (via Varonis), Databricks (via BigID), Snowflake (via Cyera), GCP (via OneTrust) | Via partner connectors into DSPM | Classification via partner signals |
| SaaS & Office 365 | SharePoint Online, Exchange, Teams, OneDrive | N/A (unstructured) | Full M365 sensitivity label integration |
Chapter 4: Unified Catalog — Enterprise Search, Lineage & Data Quality
4.1 Unified Catalog (Generally Available, 2025)
The Microsoft Purview Unified Catalog reached General Availability in late 2025, consolidating data discovery into a single experience and replacing the previous bifurcated catalog model. Key advances over the prior catalog:
- Automated access workflows replace manual approval chains for data product access requests and glossary term publishing.
- Built-in data quality tools: measure, monitor, and remediate issues such as incomplete records, inconsistencies, and redundancies.
- Critical Data Column table: a new self-service analytics capability that lets users report on the glossary terms and concepts associated with data asset columns.
- Data quality error record publishing to cloud storage: generally available in all supported Azure regions, enabling dashboards and continuous improvement tracking.
- Integration with external catalogs: Fabric OneLake, Databricks Unity Catalog, and Snowflake Polaris metadata can be unified into a single view.
4.2 Business Glossary Design
A well-designed glossary is the semantic backbone of the catalog. Flat glossaries fail at scale — a 2,000-term flat list is unsearchable. Structure terms in a parent-child hierarchy:
- L1 — Domain: Customer, Product, Finance, Risk, Operations, HR
- L2 — Subdomain: Customer → Prospect, Active, Churned
- L3 — Concept: Customer → Active → Customer Lifetime Value, Net Promoter Score
- L4 — Attribute: CLV → Predicted CLV (12-month), Actual CLV (trailing 12-month)
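Terms in this hierarchy can be created programmatically through the Atlas-compatible glossary API. The sketch below builds a term payload locally; the field names follow the Apache Atlas v2 glossary model, but the parent-term wiring and the GUIDs are illustrative assumptions — check your Purview API version for the exact hierarchy representation.

```python
# Illustrative Atlas-style glossary term payload for the four-level
# hierarchy above. The REST call itself (POST /api/atlas/v2/glossary/term)
# is omitted; only the body is constructed here.

def term_payload(name, definition, glossary_guid, parent_guid=None):
    body = {
        "name": name,
        "longDescription": definition,
        "status": "Draft",  # new terms start in Draft, per the status table below
        "anchor": {"glossaryGuid": glossary_guid},
    }
    if parent_guid:
        # Hypothetical parent linkage; Atlas can also express hierarchy
        # through glossary categories.
        body["parentTerm"] = {"termGuid": parent_guid}
    return body

clv = term_payload(
    "Customer Lifetime Value",
    "Projected net revenue attributable to a customer relationship.",
    glossary_guid="00000000-0000-0000-0000-000000000000",   # placeholder
    parent_guid="11111111-1111-1111-1111-111111111111",     # placeholder
)
```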
| Term Status | Meaning | Who Sets It | Catalog Behavior |
| Draft | Term being developed; not yet authoritative | Term authors (data stewards) | Discoverable but not recommended for use |
| Approved | Reviewed and endorsed by domain owner | Domain data owner | Shown as authoritative in search results |
| Deprecated | Term being replaced; avoid new usage | Governance team | Shown with deprecation warning; redirects to replacement term |
| Expired | Term no longer valid; historical reference | Governance program manager | Hidden from default search; accessible via filter |
4.3 Lineage Architecture
Data lineage answers the questions that matter most: ‘Where does this metric come from?’, ‘What would break if we changed this table?’, ‘How was this data transformed?’
- Automated lineage (preferred): Purview automatically extracts lineage from ADF, Synapse Spark, Synapse Pipelines, Fabric Dataflows, and Power BI. Zero code required.
- SQL-based lineage parsing: Purview parses stored procedures, views, and CTAS statements for column-level lineage. Supports Azure SQL Database, Synapse Dedicated Pool, SQL Server.
- Custom lineage via Atlas API: For dbt, custom Spark jobs, Informatica, Talend — lineage submitted programmatically via the Apache Atlas REST API.
- Fabric lineage (recommended 2025+): Column-level lineage through Lakehouse, Warehouse, Dataflows, and Power BI reports in a single unbroken chain.
| Lineage Troubleshooting: If lineage gaps appear between Lakehouse and Warehouse, ensure Fabric Warehouse is using shortcuts to Lakehouse (not COPY INTO). COPY INTO breaks automated lineage — use Lakehouse shortcuts or Dataflows instead. |
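For the custom lineage path, a transformation step is represented as an Atlas "Process" entity whose `inputs` and `outputs` reference existing catalog assets by `qualifiedName`. The sketch below constructs such a payload for a hypothetical dbt model; the type names and qualified names are placeholders, and the actual POST to `/api/atlas/v2/entity` on the Purview endpoint (with an Entra ID token) is omitted.

```python
# Minimal sketch of custom lineage as an Atlas "Process" entity.
# Asset type names and qualifiedNames below are illustrative assumptions.

def ref(type_name, qualified_name):
    """Reference an existing catalog asset by its unique qualifiedName."""
    return {"typeName": type_name,
            "uniqueAttributes": {"qualifiedName": qualified_name}}

def lineage_process(name, inputs, outputs):
    return {
        "entity": {
            "typeName": "Process",
            "attributes": {
                "qualifiedName": f"dbt://models/{name}",  # hypothetical scheme
                "name": name,
                "inputs": inputs,
                "outputs": outputs,
            },
        }
    }

payload = lineage_process(
    "stg_orders",
    inputs=[ref("azure_sql_table",
                "mssql://srv.database.windows.net/sales/dbo/orders")],
    outputs=[ref("azure_datalake_gen2_path",
                 "https://lake.dfs.core.windows.net/curated/orders/")],
)
```

Because the process links existing assets rather than creating them, the referenced tables and paths should already be scanned into the Data Map before lineage is submitted.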
Chapter 5: Data Security Posture Management & AI Governance
This chapter covers what is arguably the most significant expansion of Purview in 2025–2026: Data Security Posture Management (DSPM) and AI governance capabilities. These address risks that traditional data governance tools were never designed to handle.
5.1 Why AI Governance Requires New Capabilities
As organizations adopt Microsoft 365 Copilot, Copilot Studio agents, and Azure AI Foundry models, entirely new data risk vectors emerge:
- 86% of organizations lack visibility into what data flows through AI systems (2025 Microsoft survey).
- 40% of data security incidents now occur within AI applications (Microsoft research, 2025).
- 78% of AI users bring their own AI tools to work — creating ‘Shadow AI’ exposure.
- AI agents operate autonomously and access large volumes of sensitive data, creating risk profiles tied to behavior, not just identity.
5.2 Data Security Posture Management (DSPM)
The new DSPM experience (public preview December 2025, GA target April 2026) unifies the previous DSPM (classic) and DSPM for AI (classic) experiences into a single, outcome-based platform:
- Outcome-based guided workflows: Choose a data security objective and receive step-by-step remediation guidance.
- AI Observability: A dedicated inventory of all AI apps and agents — including first-party (Copilot Studio, Azure AI Foundry), third-party, and custom-built agents — with activity in the last 30 days, risk levels, and sensitive interaction counts.
- Item-level remediation: Bulk disable overshared SharePoint links, apply sensitivity labels, and activate protection policies directly from DSPM.
- External platform visibility: Third-party signals from Salesforce (Varonis), Databricks (BigID), Snowflake (Cyera), and Google Cloud Platform (OneTrust) surface in a unified view via Microsoft Sentinel Data Lake.
- Advanced reports: Instant visibility into sensitivity label coverage, DLP policy activity, and posture trends with drill-down filters.
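To make the AI Observability inventory concrete, the toy model below assigns a risk level from the fields described above (30-day sensitive interaction count, DLP coverage). The thresholds are invented purely for illustration; DSPM computes risk levels natively with its own signals.

```python
# Toy model of agent risk-level assignment. Thresholds are illustrative
# assumptions, not Microsoft's scoring logic.

def agent_risk_level(sensitive_interactions_30d: int, has_dlp_policy: bool) -> str:
    if sensitive_interactions_30d == 0:
        return "Low"                       # no sensitive data touched
    if sensitive_interactions_30d < 50 and has_dlp_policy:
        return "Medium"                    # sensitive activity, but protected
    return "High"                          # high volume or unprotected

agent_risk_level(120, has_dlp_policy=False)   # -> "High"
```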
5.3 AI Agent Governance
AI agents are now first-class entities in Purview’s governance model — not afterthoughts:
| Capability | What It Does | Applies To |
| AI Observability | Inventory of all AI apps and agents; risk level assignment per agent; sensitive interaction count | Copilot Studio, Azure AI Foundry, third-party agents, Agent 365 |
| Agentic Risk in IRM | Agent-specific risk indicators detect unauthorized data access and anomalous behaviors | All agents with M365 access |
| DLP for Agents | Agents inherit DLP protections — prevented from accessing labeled files or sending sensitive data via Teams | First-party and Copilot Studio agents |
| Communication Compliance | Detects non-compliant activity in human-agent interactions; proactive policy-based governance | All agent interactions in M365 |
| eDiscovery & Audit | Agent prompt/response retention, deletion policies, and legal hold extended to agent interactions | All M365-connected agents |
| Risky Agents Policy Template | IRM template detects anomalous agent behaviors including exfiltration patterns | Copilot Studio and Microsoft Foundry agents |
| Key Architectural Principle: Purview treats AI agents as data principals — they inherit the same protections as human users. A Highly Confidential labeled file cannot be accessed by an agent any more than by an unauthorized human. This governance-by-design approach eliminates the ‘Shadow AI’ data exposure gap. |
Chapter 6: Data Policy & Access Governance
6.1 Purview Data Policy Architecture
Purview’s Data Policy capability represents a fundamental shift from infrastructure-level ACLs managed by engineers to business-level policies managed by data owners and governance teams. A Purview data access policy states: ‘Users in group X can perform action Y on data assets matching classification Z.’
6.2 Policy Types
- Data owner policies: Grant read or read/modify access to Azure Storage, ADLS Gen2, Azure SQL, and Fabric without involving the infrastructure team.
- DevOps policies: Grant SQL performance monitoring access (VIEW DATABASE STATE) to DevOps engineers without granting data read permissions.
- Self-service data access policies: Data consumers request access through the Unified Catalog. Automated workflow routes to data owner. Access provisioned or rejected with full audit trail.
- Attribute-based access control (ABAC): Grant access based on asset classifications rather than specific named assets. New assets automatically inherit correct policies as they are classified.
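The ABAC idea above can be modeled locally to show why it scales: access derives from asset classifications, not named assets, so a newly scanned asset is covered the moment it is classified. This is a toy model of the logic, not the Purview policy engine; the group names are hypothetical (the classification names are real Purview system classifications).

```python
# Toy ABAC evaluation: (principal group, action, classification) triples.
POLICIES = [
    ("grp-finance-analysts", "read", "MICROSOFT.FINANCIAL.CREDIT_CARD_NUMBER"),
    ("grp-data-scientists",  "read", "MICROSOFT.PERSONAL.NAME"),
]

def is_allowed(groups, action, asset_classifications):
    """Grant if any policy matches the caller's groups, the requested
    action, and any classification carried by the asset."""
    return any(
        g in groups and a == action and c in asset_classifications
        for g, a, c in POLICIES
    )

# A brand-new asset classified during its first scan is covered immediately:
is_allowed({"grp-finance-analysts"}, "read",
           {"MICROSOFT.FINANCIAL.CREDIT_CARD_NUMBER"})   # -> True
```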
6.3 Policy Enforcement Architecture
| Data Source | Policy Enforcement Point | Latency to Enforce | Granularity |
| ADLS Gen2 | Azure Storage RBAC (via Purview policy propagation) | < 5 minutes | Container, folder, file level |
| Azure SQL Database | SQL permissions (system-managed) | < 2 minutes | Database, schema, table level |
| Microsoft Fabric | Fabric workspace and item permissions | < 10 minutes | Workspace, lakehouse, table level |
| Azure Synapse Analytics | Synapse workspace RBAC | < 5 minutes | Workspace, pool level |
| Fabric Warehouse/KQL DBs (New GA) | DLP policy tip triggering on sensitive data upload | Near real-time | Asset-level with sensitive data detection |
| Architectural Constraint: Purview data policies do NOT replace row-level security (RLS), column masking, or Dynamic Data Masking (DDM) in SQL databases. Purview policies govern who can connect and query. RLS/DDM governs what data they see within an allowed connection. Both layers are required for complete access governance. |
Chapter 7: Information Protection & DLP
7.1 Sensitivity Label Taxonomy
Sensitivity labels are the governance primitive that spans cloud storage, databases, Office documents, Teams messages, third-party applications, and — as of 2026 — AI agent interactions. The taxonomy must balance usability with enforcement precision. More than 10 labels typically causes label fatigue.
| Label | Definition | Protection Actions | Auto-labelling Trigger |
| Public | Approved for external publication. No restrictions. | None | No sensitive classifications detected |
| Internal | Business information for employee use. Not for public sharing. | Watermark on documents | Default label applied to all unlabelled items |
| Confidential | Sensitive business data. External sharing requires approval. | Encryption, external sharing DLP block, watermark | PII classification, financial data patterns |
| Highly Confidential | Regulated data, trade secrets, executive communications. | Encryption, download restrictions, MFA for access, audit logging | PHI, PCI data, credentials, classified IP |
| Restricted | Legal hold, regulatory investigation, M&A sensitive. | Encryption, access list restricted to named individuals, no forwarding | Legal trigger or manual assignment only |
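The auto-labelling column above follows a simple rule: detected sensitive information types map to the highest-priority applicable label, with Internal as the default and Restricted reserved for manual or legal-trigger assignment. The sketch below is a local model of that logic for illustration; Purview evaluates auto-labelling service-side, and the trigger mappings shown are assumptions.

```python
# Local model of auto-labelling priority. Label order mirrors the
# taxonomy table; trigger mappings are illustrative.
LABEL_PRIORITY = ["Public", "Internal", "Confidential",
                  "Highly Confidential", "Restricted"]

TRIGGERS = {
    "Credit Card Number": "Highly Confidential",
    "U.S. Social Security Number (SSN)": "Highly Confidential",
    "All Full Names": "Confidential",
}

def auto_label(detected_types):
    candidates = [TRIGGERS[t] for t in detected_types if t in TRIGGERS]
    if not candidates:
        return "Internal"   # default label for unlabelled items (per table)
    return max(candidates, key=LABEL_PRIORITY.index)  # highest priority wins

auto_label(["All Full Names", "Credit Card Number"])  # -> "Highly Confidential"
```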
7.2 DLP Policy Architecture (2026)
As of 2026, Purview DLP has been restructured (the DLP documentation was reorganized to match) and now explicitly covers three scenarios: protecting enterprise data, protecting enterprise data on devices, and inline data protection.
| DLP Scope | Trigger Condition | Action | Business Justification Override |
| Exchange Email | Highly Confidential label; external recipient | Block delivery; notify sender; generate incident | Yes — manager approval workflow |
| SharePoint/OneDrive | Confidential label; public sharing link created | Block link creation; notify user; generate incident | Yes — data owner approval |
| Teams Messages | Credit card number, SSN pattern in message | Block send; notify user; policy tip displayed | No — hard block (financial regulatory) |
| Fabric Warehouse (New GA) | Sensitive data detected in asset uploaded to Warehouse | Policy tip trigger; restrict access for KQL/SQL DBs | Admin configurable |
| AI Agents (New) | Agent attempts to access Highly Confidential labeled file | Block agent access; audit log entry; alert to admin | No — security team review required |
Chapter 8: Deployment Architecture & Operating Model
8.1 Deployment Architecture Patterns
- Pattern 1 — Centralized Governance (Single Account): Best for organizations with <50,000 data assets, single-geography operation, or strong central governance team. Simple operations, lower cost.
- Pattern 2 — Federated Governance (Hub-and-Spoke): Best for large multi-geography organizations with autonomous business units. Central CDO office hub; business unit spoke accounts synchronized via Purview metadata API.
- Pattern 3 — Domain-Aligned (Data Mesh): Best for organizations implementing Data Mesh. Each data domain owns its own Purview account. Enterprise governance sets standards; federated computational governance via shared glossary and classification taxonomy.
8.2 Governance Operating Model
| Role | Responsibilities | Time Commitment | Purview Role |
| Chief Data Officer | Set governance strategy, approve glossary, report metrics to board | 2–4 hrs/week | Insights Reader, Collection Admin (Root) |
| Data Governance Lead | Operate Purview program, manage stewards, evolve policies | Full time | Collection Admin, Policy Author |
| Domain Data Owner | Own data quality for domain, approve certifications and access requests | 4–8 hrs/week | Data Curator (domain collection) |
| Data Steward | Enrich metadata, link glossary terms, resolve classification issues, review flagged assets | 50–100% FTE | Data Curator |
| Data Engineer | Register data sources, configure scans, build custom lineage integration | 20% allocation | Data Source Admin |
| AI Governance Analyst (New Role) | Monitor AI agent risk scores in DSPM, review AI Observability reports, manage agentic risk policies | 20–50% FTE | Insights Reader + Security Admin |
| Operating Model Insight: A Purview deployment without assigned Data Stewards is a catalog that fills with metadata but never becomes trusted. For a 50,000-asset estate, plan for 2–3 full-time stewards in year one. Automation can increase effective capacity to 1 FTE per 100,000 assets at maturity. For organizations deploying Copilot or AI agents, an AI Governance Analyst role is now essential — this is not optional in the AI era. |
Chapter 9: Scale, Performance & Cost Optimization
9.1 Capacity Planning
| Resource | Scale Limit (2025) | Recommendation |
| Assets in Data Map | 100 million assets per account | For >80M assets, begin planning federated architecture |
| Registered sources | 3,000 sources per account | Consolidate similar source types into single registered sources where possible |
| Concurrent scans | 100 concurrent scan runs | Use scan scheduling to avoid peak concurrency; prioritize by source criticality |
| Glossary terms | 100,000 terms per account | Maintain term hygiene; deprecate unused terms quarterly |
| Collections | 256 collections per account | Design flat-ish hierarchies; max 4–5 levels for most organizations |
| Custom classification rules | 500 per account | Consolidate similar patterns; use regex groups over multiple single patterns |
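The "regex groups over multiple single patterns" guidance means collapsing variant formats into one alternation so a single custom rule replaces several. The employee-ID formats below are hypothetical, invented to show the pattern:

```python
import re

# One rule covering three legacy employee-ID formats (hypothetical):
#   EMP-123456, E/123456, and 123456-EMP
# instead of three separate single-pattern rules.
EMPLOYEE_ID = re.compile(r"\b(?:EMP-\d{6}|E/\d{6}|\d{6}-EMP)\b")

sample = "Payroll rows for EMP-204881 and 573310-EMP were flagged."
EMPLOYEE_ID.findall(sample)   # -> ['EMP-204881', '573310-EMP']
```

Fewer, broader rules also scan faster, since each rule in a scan rule set is evaluated against every sampled value.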
9.2 Scan Performance Optimization
| Optimization Lever | Description | Performance Impact | Tradeoff |
| Incremental scanning | Scan only new/modified assets since last scan using watermark-based detection | 60–80% reduction in scan time for stable sources | May miss classification changes on unmodified assets |
| Targeted scanning | Scope scans to specific folders, schemas, or file patterns | 40–70% faster | Requires good source naming conventions |
| Classification rule optimization | Use targeted rule sets per source type; reduce rules per scan rule set | 30–50% faster per scanned asset | Requires maintaining multiple scan rule sets |
| Off-hours scheduling | Schedule large scans for 2–6 AM to avoid competing with production workloads | No throughput gain but avoids source contention | Delayed freshness; not suitable for compliance triggers |
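The watermark mechanism behind incremental scanning is worth making concrete: only assets whose last-modified timestamp exceeds the previous successful scan's watermark are re-processed. Purview tracks this internally; the model below simply illustrates the mechanism (and why classification changes on unmodified assets can be missed).

```python
from datetime import datetime, timezone

# Illustrative watermark filter for incremental scanning.
def assets_to_scan(assets, watermark):
    """assets: iterable of (name, last_modified) tuples.
    Returns only assets touched after the watermark."""
    return [name for name, modified in assets if modified > watermark]

watermark = datetime(2026, 3, 1, tzinfo=timezone.utc)   # last successful scan
estate = [
    ("sales/orders.parquet",  datetime(2026, 3, 4,  tzinfo=timezone.utc)),
    ("sales/refunds.parquet", datetime(2026, 2, 11, tzinfo=timezone.utc)),
]
assets_to_scan(estate, watermark)   # -> ['sales/orders.parquet']
```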
9.3 Cost Optimization
Purview pricing is based on Data Map capacity units (CUs), scan compute, and Microsoft 365 Compliance licensing. A new pay-as-you-go pricing model (available alongside the Suite license) covers data estates, analytics, and AI apps — use the DSPM Usage Center to track consumption per investigation and avoid over-provisioning.
| Cost Component | Billing Model | Optimization Strategy |
| Data Map capacity units | $0.496/CU/hour (1 CU = 1GB metadata storage + processing capacity) | Incremental scans reduce CU consumption by 60–75% |
| Scan compute | Billed per vCore-hour for SHIR; Managed VNet included in CUs | Right-size SHIR VMs; schedule to minimize runtime; use MVNet where possible |
| M365 Compliance (DLP, Labels) | Included in M365 E5 or E5 Compliance add-on | Audit license assignments; unused Compliance seats are common waste |
| DSPM & AI Governance (New) | Pay-as-you-go; Data Security Investigation Compute Units (DSICUs) replaced SCUs | Use the Usage Center dashboard to track per-investigation consumption |
| Defender for Cloud integration | Charged per resource per hour | Enable only for in-scope regulated workloads |
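Because Data Map capacity bills for every hour of the month, the CU line dominates steady-state cost. A back-of-envelope estimator using the rate quoted in the table above (and Azure's standard 730-hour month convention) makes the effect of right-sizing visible:

```python
# Estimator for the Data Map capacity cost line, using the table's
# quoted rate. Actual rates vary by region and over time -- check the
# current Azure pricing page before budgeting.
RATE_PER_CU_HOUR = 0.496
HOURS_PER_MONTH = 730   # Azure's standard monthly-hour convention

def data_map_monthly_cost(capacity_units: int) -> float:
    return round(capacity_units * RATE_PER_CU_HOUR * HOURS_PER_MONTH, 2)

data_map_monthly_cost(1)    # -> 362.08
data_map_monthly_cost(10)   # -> 3620.8
```

A 60–75% CU reduction from incremental scanning therefore translates almost linearly into the monthly bill.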
Chapter 10: Real-World Case Studies
Case Study 1: Global Financial Services Firm — GDPR & PCI Compliance Transformation
| Dimension | Details |
| Organization | Pan-European retail bank, 12,000 employees, 85 data sources |
| Challenge | GDPR audit failed in 2022 due to inability to demonstrate PII data inventory. €2.3M fine issued. Compliance team spent 14 weeks per audit cycle manually documenting data assets. |
| Purview Scope | Data Map (85 sources), full estate classification, GDPR & PCI assessments in Compliance Manager, sensitivity labels across M365 and Azure Storage, DLP policies for credit card and IBAN patterns |
| Timeline | 16 weeks to full production deployment across all 85 sources |
| Metric | Before Purview | After Purview (12 months) |
| Compliance audit preparation time | 14 weeks manual | 6 days automated |
| PII coverage (classified assets) | 23% (manual inventory) | 94% (automated) |
| Sensitivity label coverage | 8% (M365 only) | 87% across Azure + M365 |
| GDPR SRR response time | 28 days | 4 days |
| Annual compliance staffing cost | £1.2M (12 FTE) | £380K (3 FTE + automation) |
| Purview TCO (Year 1) | — | £420K (all-in) |
Case Study 2: Healthcare Network — PHI Governance & HIPAA Continuous Compliance
| Dimension | Details |
| Organization | US regional hospital network, 22 hospitals, 6,500 clinical staff, 140TB of health data across Azure and on-premises |
| Challenge | Inability to demonstrate the minimum-necessary access principle for PHI (HIPAA §164.514). Repeated exposure of PHI to non-clinical staff through misconfigured Power BI reports. |
| Purview Scope | Healthcare-specific classification rules (34 custom PHI types), Data Map across Epic EHR integration layer + Azure SQL + ADLS Gen2, Purview policy for PHI access restriction, DLP to block PHI in Teams/email, Compliance Manager HIPAA assessment |
| Key Result | Classification accuracy validated at 96.3% against 10,000 manually labelled records. First external HIPAA audit post-deployment: no significant findings. |
Case Study 3: Retail Enterprise — AI-Ready Data Mesh with Fabric & Purview (2026)
A FTSE 100 retailer with 8 data domains implemented a Data Mesh on Microsoft Fabric. The governance challenge evolved in 2025: beyond interoperability and trust, they needed to govern Copilot-powered analytics agents accessing domain-owned data products.
- Deployed DSPM AI Observability to inventory 47 AI agents accessing the Fabric estate — 12 were flagged as high-risk due to oversharing patterns.
- Applied DLP policies to Fabric Warehouse and KQL DBs (newly GA) to prevent sensitive data leakage through Copilot agent responses.
- Insider Risk Management extended to Fabric lakehouses with built-in risk indicators for potential data exfiltration by agents.
| KPI | Target | Achieved (Month 12) |
| Data products certified | 80% of published products | 84% |
| Cross-domain data access time | < 3 business days | 1.2 days average |
| AI agent data risk incidents resolved | < 5/month | 2.1/month average |
| Time to identify root cause of cross-domain data issue | < 4 hours | 47 minutes average |
Chapter 11: 90-Day Implementation Roadmap
Based on 20+ Purview deployments, the following 90-day roadmap represents the optimal sequencing for enterprise governance programs in 2026 — incorporating AI governance activation alongside traditional catalog and classification work.
Phase 1: Foundation (Days 1–30)
| Week | Activity | Owner | Success Criteria |
| 1 | Purview account provisioning, network design (MVNet vs SHIR), Microsoft Entra ID group creation, collection hierarchy design | Data Engineer + Architect | Purview account live; network connectivity validated; collection hierarchy approved |
| 1–2 | Source inventory: document all data sources including AI systems and Copilot deployments | Data Governance Lead | Complete source inventory; sources prioritized by regulatory risk; AI systems catalogued |
| 2 | Glossary foundation: identify 50–100 core business terms per domain with definitions, owners, and related terms | Data Governance Lead + Domain Owners | Core glossary terms in Draft status; term owners assigned |
| 2–3 | Register and scan Tier 1 sources (PCI scope, PHI scope, GDPR-critical sources) | Data Engineer | Tier 1 sources scanned; classification results reviewed; false positive rate <5% |
| 3–4 | Classification review and tuning: build custom rules for organizational patterns | Data Steward + Data Engineer | Custom classification rules deployed; accuracy >90% on validation set |
| 4 | RBAC configuration: assign roles to domain teams using Microsoft Entra ID groups | Data Governance Lead | All roles assigned; domain teams can access catalog; engineers can register sources |
Phase 2: Activation (Days 31–60)
| Week | Activity | Owner | Success Criteria |
| 5–6 | Register and scan all remaining data sources; configure incremental scan schedules | Data Engineer | >90% of data estate registered and scanned; scan schedule operational |
| 6 | Quick win: publish Governance Maturity Score dashboard in Power BI; present to leadership | Data Governance Lead | Dashboard live; leadership briefing completed; program funded for Phase 3 |
| 6–7 | Sensitivity label deployment: configure taxonomy; deploy auto-labelling policies in simulation mode | Security + Data Governance | Labels published; simulation mode running; simulation report reviewed |
| 7–8 | Enable DSPM: activate AI Observability; inventory all AI apps and agents; assign initial risk levels | Security + AI Governance Analyst | AI agent inventory complete; high-risk agents identified; initial DSPM posture established |
| 7–8 | Lineage validation: verify ADF, Synapse, Fabric lineage; build custom lineage via Atlas API for non-native sources | Data Engineer | End-to-end lineage visible for >3 critical pipelines; column-level lineage for Power BI reports |
| 8 | Compliance Manager setup: create GDPR, HIPAA, or relevant regulatory assessments | Compliance Officer | At least one regulatory assessment active; initial compliance score baseline established |
Phase 3: Optimization (Days 61–90)
| Week | Activity | Owner | Success Criteria |
| 9–10 | Enable sensitivity labels in production (exit simulation mode); deploy DLP policies including Fabric Warehouse DLP | Security + Data Governance | Labels applying to new content; DLP incidents in dashboard; <10% false positive rate |
| 10–11 | AI governance policies: configure IRM Risky Agents template; deploy agent-specific DLP; establish AI data access policies | AI Governance Analyst + Security | Risky agent policies active; agent DLP blocking tested; AI governance posture report generated |
| 11 | Glossary completion: approve Tier 1 glossary terms; link to classified assets via bulk assignment | Data Steward | >80% of Tier 1 assets linked to at least one approved glossary term |
| 12 | Program review: measure Governance Maturity Score vs. Week 1 baseline; document lessons; plan 90–180 day roadmap | CDO + Data Governance Lead | Governance score improvement documented; 90–180 day roadmap approved; operating model confirmed |
| Critical Success Factor: Governance programs that fail typically do so in Days 31–60 — the ‘activation phase.’ Quick wins must be demonstrated by Day 45 to maintain organizational momentum. The Power BI Governance Maturity Score dashboard is designed as this early value demonstration. In 2026, demonstrating AI agent governance to leadership is an equally powerful motivator for program continuation. |
Appendix: Governance Maturity Model & Quick Reference
A. Governance Maturity Model
| Level | Name | Characteristics | Target Score | Typical Timeline |
| L1 | Initial | Ad hoc governance; no systematic catalog; manual compliance; governance by tribal knowledge | 0–20 | Starting point |
| L2 | Managed | Data sources registered and scanned; basic classification applied; glossary under development; ownership partially assigned | 20–50 | 0–6 months post-deployment |
| L3 | Defined | Full estate classification; glossary approved and linked; lineage documented; compliance assessments active | 50–70 | 6–12 months post-deployment |
| L4 | Quantitatively Governed | Governance Maturity Score tracked weekly; stewardship SLAs enforced; access policies active; DLP protecting sensitive data; AI agent inventory established | 70–85 | 12–24 months |
| L5 | Optimizing | Automated certification; continuous compliance; AI-assisted stewardship; full AI governance with DSPM; governance embedded in CI/CD pipelines | 85–100 | 24–36 months |
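For teams automating the weekly score tracking described at L4, the maturity bands above can be expressed as a simple lookup. Treating each lower bound as inclusive is an assumption, since the table's ranges share endpoints:

```python
# Sketch: mapping a Governance Maturity Score (0-100) to the levels in the
# maturity model table. Band boundaries follow the "Target Score" column.

def maturity_level(score: float) -> str:
    """Return the maturity level (L1-L5) for a given governance score."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score < 20:
        return "L1 Initial"
    if score < 50:
        return "L2 Managed"
    if score < 70:
        return "L3 Defined"
    if score < 85:
        return "L4 Quantitatively Governed"
    return "L5 Optimizing"
```

A function like this can feed the Power BI Governance Maturity Score dashboard referenced in the 90-day roadmap, so the level shown to leadership always matches the published model.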
B. Key Purview REST API Reference
| Operation | Method | Endpoint | Use Case |
| List collections | GET | /account/collections | Audit collection hierarchy; governance reports |
| Get asset by qualified name | GET | /catalog/api/atlas/v2/entity/uniqueAttribute/type/{typeName}?attr:qualifiedName={qualifiedName} | Look up specific asset metadata in automation |
| Update asset contacts | PUT | /catalog/api/atlas/v2/entity/guid/{guid}/businessattribute/Contacts | Bulk owner assignment in onboarding automation |
| Submit lineage | POST | /catalog/api/atlas/v2/entity/bulk | Custom lineage for non-native sources (dbt, custom ETL) |
| Run scan | POST | /scan/datasources/{dsName}/scans/{scanName}/runs | Trigger scan on-demand from CI/CD on schema change |
| Create glossary term | POST | /catalog/api/atlas/v2/glossary/term | Bulk glossary population from existing business dictionaries |
| Get DSPM agent inventory (New) | GET | /security/dspm/agents | List all AI agents with risk levels and sensitive interaction counts |
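The scan-trigger row above can be wired into CI/CD as follows. This sketch only composes the request; the account, data source, and scan names are placeholders, and it assumes an API version that expects a client-generated run GUID and an api-version query parameter:

```python
# Sketch: composing an on-demand scan-run call against a Purview account,
# suitable for triggering from a CI/CD pipeline on schema change.
# Account, data source, and scan names are illustrative placeholders.
import uuid

def scan_run_request(account: str, datasource: str, scan: str,
                     api_version: str = "2022-02-01-preview") -> tuple[str, str]:
    """Return (run_id, url) for the scan-run endpoint on a Purview account.

    Each run is identified by a client-generated GUID appended to the path.
    """
    run_id = str(uuid.uuid4())
    url = (f"https://{account}.purview.azure.com"
           f"/scan/datasources/{datasource}/scans/{scan}"
           f"/runs/{run_id}?api-version={api_version}")
    return run_id, url

# Example (not executed here): acquire a token with azure-identity's
# DefaultAzureCredential().get_token("https://purview.azure.net/.default"),
# then send the request with an "Authorization: Bearer <token>" header and
# poll the same runs endpoint for status.
```

Using DefaultAzureCredential lets the same code run locally (developer sign-in) and in the pipeline (managed identity or service principal) without code changes.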
C. Glossary of Key Terms
| Term | Definition |
| Apache Atlas | Open-source metadata management framework; the foundational metadata model underlying Purview’s Data Map |
| Business Glossary | Curated vocabulary of business terms linked to data assets; provides semantic context and shared language |
| Collection | Hierarchical container in Purview that scopes metadata, access control, and policy enforcement |
| Data Lineage | Documentation of data origin, movement, and transformation — tracing how data flows from source to consumption |
| DSPM | Data Security Posture Management — Purview’s unified plane for discovering, protecting, and investigating data risks across traditional and AI workloads |
| AI Observability | DSPM capability providing an inventory of all AI apps and agents, their risk levels, and sensitive data interactions |
| Microsoft Entra ID | The current name for Azure Active Directory (rebranded October 2023). All Purview documentation and configurations should use this term. |
| Unified Catalog | GA feature (2025) consolidating data discovery, data quality, automated access workflows, and glossary management into a single experience |
| OpenLineage | Open standard for data lineage metadata; used by Purview Spark connector to emit lineage from Spark jobs |
| Sensitivity Label | Classification tag applied to data assets and documents that drives downstream protection actions across the entire Microsoft ecosystem |
