Azure Purview - Metadata Management & Data Governance Service | Nitor Infotech
Big Data | 09 Feb 2022 |   17 min

Azure Purview – Metadata Management and Data Governance Service


Businesses collect, store, and analyze an ever-growing quantity of data, so tools that help them discover, catalog, track, and manage how that data is shared are growing in importance too. With Azure Purview, Microsoft has launched a new data governance service into public preview that combines these capabilities in a data catalog with discovery and data governance features.

Azure Purview is a unified data governance solution that lets you manage and govern your on-premises, multi-cloud, and software-as-a-service (SaaS) data. It creates a detailed map of your data estate, classifies the data, and tracks its end-to-end lineage. It enables data consumers to find trustworthy data, and it is built on Apache Atlas, an open-source project for metadata management and governance of data assets.

In this blog, I will shed some light on the key features and functionality of this solution and take you through the process of setting it up.

For starters, take a look at some characteristics of Azure Purview I have outlined below:

  • Data discovery, lineage tracking, and classification
  • A glossary of business and technical terms to support data discovery
  • Tracking of the movement of sensitive data across your entire data estate
  • Connection to on-premises, multi-cloud, and software-as-a-service (SaaS) platforms, and governance of their data

Now that you are acquainted with the nature of this solution, let’s dive into its major components!

Functionality of Azure Purview

Here are the three components that describe the functionality of Azure Purview:

1. Data Map:

The Azure Purview Data Map lets you register the data sources you want to scan under the root collection you select. It makes your data meaningful by mapping your data assets and their relationships across your data estate. The Data Map automatically scans your data, classifies it, and captures its lineage.

2. Data Catalog:

The Data Catalog is the search experience of your Purview account: you can search the assets of your registered sources and get detailed information about each asset you find. You can also view the business glossary, lineage visualization, catalog insights, and sensitive data identification insights.

3. Data Insights:

With Data Insights, you get an at-a-glance view of your data estate: what data is actively scanned, where sensitive data resides, and how it moves.

Before I reveal the problem statement and solution approach that I have focused my blog on, I’d like to give you an idea about certain concepts related to Azure Purview.

Concepts related to Azure Purview:

A. Collection:

A collection is a group of related data sources. It is the root of the structural hierarchy of Azure Purview. A collection can have zero or more sources. For example, a company can be a collection, and its departments, such as IT and HR, can be its sources.

Sources:

A source is the next object in the hierarchy of the Azure Purview data map. Simply put, a source is where your data comes from (for example, Azure Data Lake Storage, Azure SQL Database, or a storage account).

A source can have:

  • Zero or one collection
  • Zero to many assets
  • Zero to many scans

B. Asset:

An asset is an instance of an asset type. Once its metadata has been indexed, you can find it through the Azure Purview search bar. For example, if an Azure SQL Database source has three tables, there are generally three assets.

An asset has one source and can have:

  • Zero to many terms
  • Zero to many classifications

C. Scan:

Scanning is a process in which the catalog connects directly to a data source on a schedule defined by the user. Azure Purview scans your data at three levels: Basic, Schema, and Classification. At the Basic level, it captures the file name, file size, fully qualified name, and so on. At the Schema level, it captures database tables and the structure of structured file types. At the Classification level, it applies classification rules to the scanned data.

A scan has:

  • one source
  • one credential
  • one scan rule set
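
To make the relationships between these concepts concrete, here is a minimal Python sketch of the hierarchy. It is only an illustration of the cardinalities listed above, not Purview's internal model; the class names and the sample values (`Contoso`, `hr-sql`, `weekly-scan`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Asset:
    name: str
    terms: List[str] = field(default_factory=list)            # zero to many glossary terms
    classifications: List[str] = field(default_factory=list)  # zero to many classifications


@dataclass
class Scan:
    name: str
    credential: str  # exactly one credential
    rule_set: str    # exactly one scan rule set


@dataclass
class Source:
    name: str
    kind: str
    collection: Optional[str] = None                   # zero or one collection
    assets: List[Asset] = field(default_factory=list)  # zero to many assets
    scans: List[Scan] = field(default_factory=list)    # zero to many scans (each scan has one source)


@dataclass
class Collection:
    name: str
    sources: List[Source] = field(default_factory=list)  # zero or more sources

    def register(self, source: Source) -> None:
        """Attach a source to this collection; a source belongs to at most one."""
        if source.collection is not None:
            raise ValueError(f"{source.name} already belongs to {source.collection}")
        source.collection = self.name
        self.sources.append(source)


# Hypothetical example: one SQL source with three tables, so three assets.
root = Collection("Contoso")
sql = Source("hr-sql", kind="AzureSqlDatabase")
root.register(sql)
sql.assets.extend(Asset(table) for table in ("Employee", "Department", "Payroll"))
sql.scans.append(Scan("weekly-scan", credential="sql-cred", rule_set="AzureSqlDatabase"))
```

Registering `sql` a second time raises a `ValueError`, which mirrors the rule that a source can sit under only one collection.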

Problem Statement:

The goal of this blog is to maintain the data catalog, lineage, and data governance of a data platform that has various components, such as sources, an ETL process, a data warehouse, and reporting layers.

Solution approach:

  • First, we scan and classify the metadata of our line-of-business sources.
  • Then we feed the source data to the ETL transformation unit through Data Factory, pass the resulting metadata to Purview, and load the scanned and classified data into the data warehouse.
  • Finally, we analyze the data using analysis tools.

Now, read on to discover the process of setting up Azure Purview.

Steps to set up Azure Purview:

Step 1: Open a window and register the source

This is the first window we see when we open the account and go to the root collection.

In this window, we must register the source that we want to scan. To do so:

  • Click the ‘Register’ option. A list of source types appears on the right side of your screen (shown in the second step).

Step 2: Click the Register icon to register the source

Here, select the source that you want to scan and enter its details. The source is then added under the root collection (shown in the next step).

Step 3: The source is added under the root collection

  • Now that you have added your source, you can view its details, edit it, or delete it using the options under the source icon.
  • Verify that the source has been added correctly by clicking the ‘View details’ option under the source icon.
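
Registration can also be scripted instead of clicked through. The sketch below only builds the HTTP request; it does not send it. The scanning endpoint shape, the `datasources` path, the `api-version` value, and the `properties` fields are my assumptions about the Purview scanning REST API, so check them against the current REST reference before use. The account name, server name, and token are hypothetical placeholders.

```python
import json
from urllib.request import Request


def build_register_request(scan_endpoint: str, source_name: str, kind: str,
                           properties: dict, token: str,
                           api_version: str = "2022-02-01-preview") -> Request:
    """Build (but do not send) a PUT request that registers a data source."""
    # Assumed URL shape: <scan endpoint>/datasources/<name>?api-version=<version>
    url = f"{scan_endpoint.rstrip('/')}/datasources/{source_name}?api-version={api_version}"
    body = {"kind": kind, "properties": properties}
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
    )


# Hypothetical values throughout; the property names are assumptions.
register_request = build_register_request(
    scan_endpoint="https://contoso-purview.scan.purview.azure.com",
    source_name="hr-sql",
    kind="AzureSqlDatabase",
    properties={
        "serverEndpoint": "contoso-hr.database.windows.net",
        "collection": {"referenceName": "contoso-root", "type": "CollectionReference"},
    },
    token="<access-token>",
)
```

Passing `register_request` to `urllib.request.urlopen` would send it, provided the token is a valid Azure AD access token for the Purview account.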

Step 4: Create a scan for your source

Now create a scan for your source to collect details about its assets, such as their schema, lineage, and column classifications.

  • After clicking the ‘Scan’ option as shown in the above figure, enter a scan name and select a credential (the security option).
  • Then select the scan rule set. You can select a standard scan rule set provided by Microsoft or create your own scan rule set for scanning your source assets.
  • After that, select how often the scan should run (Recurring or Once) as per your requirements and click the ‘Save and Run’ option (shown in the next step).
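
If you create your own scan rule set, it helps to understand what a classification rule does conceptually. The sketch below is not Purview's implementation. It is a simplified illustration that assumes a rule is a regular expression applied to a column's values, with a label assigned when enough values match. The email pattern, the label, and the 60% threshold are illustrative choices of mine.

```python
import re
from typing import Iterable, Optional

# Deliberately simple email pattern, for illustration only.
EMAIL_PATTERN = r"[^@\s]+@[^@\s]+\.[a-zA-Z]{2,}"


def classify_column(values: Iterable[Optional[str]], pattern: str, label: str,
                    min_match_pct: float = 60.0) -> Optional[str]:
    """Return `label` if at least `min_match_pct` percent of non-empty values match."""
    non_empty = [v for v in values if v]
    if not non_empty:
        return None
    hits = sum(1 for v in non_empty if re.fullmatch(pattern, v))
    match_pct = 100.0 * hits / len(non_empty)
    return label if match_pct >= min_match_pct else None


# Three of the four sample values look like emails (75%), so the column is labeled.
contact_column = ["a@x.com", "b@y.org", "n/a", "c@z.net"]
contact_label = classify_column(contact_column, EMAIL_PATTERN, "Email Address")

# None of these values match, so no classification is assigned.
name_label = classify_column(["Asha", "Ravi"], EMAIL_PATTERN, "Email Address")
```

A threshold below 100% lets a rule tolerate a few dirty values such as `n/a` while still labeling the column.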

Step 5: Review your scan

Now review your scan and check all the information in the ‘Review your Scan’ window. Then click the ‘Save and Run’ option to start scanning your source.

Step 6: Connect your Data Factory or other services to get a lineage view

Now select the data engineering service that you used to transform your data (I used Data Factory): click the management icon at the bottom left and select the ‘Data Factory’ option under Lineage connection.

  • Click the ‘New’ icon and select your subscription name from the dropdown list.
  • After that, select your Data Factory name from the dropdown list and click the ‘Create’ option.
  • Once the connection to your Data Factory has been set up successfully, its status shows as connected.

Step 7: See your scanning details

To check the status of your scanning (whether it is completed or in progress), follow these steps:

  • Go to the ‘Data map’ section and select the ‘View details’ option.
  • View the details of your scan.

Step 8: Search for your Assets using a Data Catalog

The ‘Data Catalog’ is the search experience of your Purview account, in which you can search for the assets of your registered sources and get detailed information about each asset you find.

It also contains three important sections:

1. Browse Assets:

Here you can browse your assets on the basis of your source type or on the basis of your collection.

2. Manage Glossary:

You can manage your glossaries in this section. You can create or add glossaries as well.

3. Knowledge Center:

The knowledge center helps you with learning materials and tutorials.
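
The keyword search described at the start of this step can also be issued programmatically. As before, this sketch only builds the request and does not send it. The `/catalog/api/search/query` path, the `api-version` value, and the `keywords` and `limit` body fields are my assumptions about the Purview discovery REST API and should be checked against the current reference; the account name and token are hypothetical.

```python
import json
from urllib.request import Request


def build_search_request(endpoint: str, keywords: str, token: str, limit: int = 10,
                         api_version: str = "2022-03-01-preview") -> Request:
    """Build (but do not send) a POST request for a catalog keyword search."""
    # Assumed URL shape: <account endpoint>/catalog/api/search/query
    url = f"{endpoint.rstrip('/')}/catalog/api/search/query?api-version={api_version}"
    body = {"keywords": keywords, "limit": limit}
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
    )


# Hypothetical account; searches for assets whose metadata mentions "Employee".
search_request = build_search_request(
    endpoint="https://contoso-purview.purview.azure.com",
    keywords="Employee",
    token="<access-token>",
    limit=5,
)
```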

Step 9: Set a connection to Power BI from Azure Purview by creating a security group in the Azure Active Directory

If you want to register Power BI under the root collection and scan it, then follow these steps:

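  • Register your Power BI source under the root collection.
  • Then go to Azure Active Directory, select the ‘Groups’ option, and add a new security group. Add your Purview account's managed identity as a member of this group so that the tenant setting in the next step applies to it.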
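
If you prefer to script the security group instead of using the portal, Microsoft Graph exposes a create-group call. The sketch below only builds that request; sending it requires an access token with permission to create groups. The group name and token are hypothetical, and deriving `mailNickname` by stripping non-alphanumeric characters is simply a convention I chose for this example.

```python
import json
import re
from urllib.request import Request


def build_security_group_request(display_name: str, token: str,
                                 description: str = "") -> Request:
    """Build (but do not send) a Microsoft Graph request that creates a security group."""
    # mailNickname is required even for groups that are not mail-enabled.
    nickname = re.sub(r"[^A-Za-z0-9]", "", display_name) or "group"
    body = {
        "displayName": display_name,
        "description": description,
        "mailEnabled": False,
        "mailNickname": nickname,
        "securityEnabled": True,
    }
    return Request(
        "https://graph.microsoft.com/v1.0/groups",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
    )


group_request = build_security_group_request(
    display_name="Purview Power BI Scanners",  # hypothetical group name
    token="<graph-access-token>",
    description="Members may call the read-only Power BI admin APIs",
)
```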

Step 10: Power BI setting for Azure Purview

  • Now go to the Power BI portal, log in, and navigate to the Admin portal.
  • After that, under Tenant settings, select ‘Allow service principals to use read-only Power BI admin APIs’.
  • Then enable it, select ‘Specific security groups’, and enter the name of the security group that you created in Azure Active Directory.
  • Finally, click the ‘Apply’ button, go to Purview, and scan the Power BI source that you registered under your root collection.

Step 11: See lineage of your data

  • This is the lineage of the data. It shows how data moves from one end to the other.
  • As shown in the above figure, the data from two sources is copied into a single target destination by an Azure Data Factory copy activity.
  • The lineage also shows changes in column names from source to target. For instance, in the above figure, EmployeeName in the source becomes Emp_Name in the target.
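
Conceptually, the lineage view is a graph: datasets are nodes, copy activities are edges, and each edge can carry a column mapping. The sketch below models the scenario described above in plain Python. It is not how Purview or Apache Atlas stores lineage internally. The dataset names and the `DepartmentName` to `Dept_Name` mapping are hypothetical; only `EmployeeName` to `Emp_Name` comes from the example above.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple


class Lineage:
    def __init__(self) -> None:
        # target dataset -> datasets copied directly into it
        self.inputs: Dict[str, Set[str]] = defaultdict(set)
        # (target dataset, target column) -> (source dataset, source column)
        self.columns: Dict[Tuple[str, str], Tuple[str, str]] = {}

    def add_copy(self, source: str, target: str, column_map: Dict[str, str]) -> None:
        """Record a copy activity and its source-to-target column renames."""
        self.inputs[target].add(source)
        for source_col, target_col in column_map.items():
            self.columns[(target, target_col)] = (source, source_col)

    def upstream(self, dataset: str) -> Set[str]:
        """Return every dataset that feeds `dataset`, directly or indirectly."""
        seen: Set[str] = set()
        stack = [dataset]
        while stack:
            for parent in self.inputs.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


lineage = Lineage()
lineage.add_copy("Employee_Source1", "Employee_Target", {"EmployeeName": "Emp_Name"})
lineage.add_copy("Employee_Source2", "Employee_Target", {"DepartmentName": "Dept_Name"})

sources_of_target = lineage.upstream("Employee_Target")
origin_of_emp_name = lineage.columns[("Employee_Target", "Emp_Name")]
```

Here `sources_of_target` contains both source datasets, and `origin_of_emp_name` points back to the `EmployeeName` column of the first source, which is the same question the lineage view answers visually.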

And there you have it! This is how Azure Purview helps you get the fundamental structural metadata and lineage of your registered sources. With its help, you get a bird’s-eye view of your data flow and data transformation from source to destination!

Feel free to write to us at Nitor Infotech with any comments and suggestions you may have. Also, while navigating the Azure universe, you will find this blog useful if you would like to learn how you can deploy an existing Azure Data Factory branch on a new ADF environment via Git configuration.


Abhishek Suranglikar

Trainee Software Engineer

Profile Description: Abhishek is interested in data engineering, data pipelining, machine learning, and business intelligence-related technologies and is currently working on the Azure platform. He has completed Azure Data Fundamental and Azure AI Fundamental certifications. He is tech-savvy and believes that books can change your life. He is thankful for everything and moves forward by believing in himself.

   
