To populate Microsoft Purview with assets for data discovery and understanding, we must register the sources that exist across our data estate so that we can use the out-of-the-box scanning capabilities. Scanning enables Microsoft Purview to extract technical metadata, such as the fully qualified name, schema, and data types, and to apply classifications by parsing a sample of the underlying data.
In this module, you'll walk through how to register and scan data sources. You'll create a new collection for your first data source, upload data, and configure scanning. By the end of this module, you'll have technical metadata, such as schema information, stored in Purview. You can then start linking this metadata to business terms, so your team members can find data more easily.
- An Azure account with an active subscription.
- An Azure SQL Database (see module 00).
- A Microsoft Purview account (see module 01).
- Register and scan an Azure SQL Database using SQL authentication credentials stored in Azure Key Vault.
| # | Section | Role |
|---|---|---|
| 1 | Key Vault Access Policy #1 (Grant Yourself Access) | Azure Administrator |
| 2 | Key Vault Access Policy #2 (Grant Microsoft Purview Access) | Azure Administrator |
| 3 | Generate a Secret | Azure Administrator |
| 4 | Add Credentials to Microsoft Purview | Microsoft Purview Administrator |
| 5 | Register a Source (Azure SQL DB) | Data Source Administrator |
| 6 | Scan a Source with Azure Key Vault Credentials | Data Source Administrator |
| 7 | View Assets | Data Reader |
💡 Did you know?
Azure Key Vault is a cloud service that provides a secure store for secrets. Azure Key Vault can be used to securely store keys, passwords, certificates, and other secrets. For more information, check out About Azure Key Vault.
Before we can add secrets (such as passwords) to Azure Key Vault, we need to set up an access policy. The access policy created in this step ensures that our account has sufficient permissions to create a secret, which Microsoft Purview will later use to perform a scan.
- Navigate to your Azure Key Vault resource and click Access policies.
- Click ➕ Create.
- Under Secret permissions, click Select all. Then, click Next.
- Search for your account name, select your account name from the search results, then click Next.
- Skip the Application (optional) page by clicking Next again.
- Review your selections then click Create.
In this step, we create a second access policy that gives Microsoft Purview the access it needs to retrieve secrets from the Key Vault.
- Navigate to your Azure Key Vault resource and click Access policies.
- Click ➕ Create.
- Under Secret permissions, select Get and List. Then, click Next.
- Search for the name of your Microsoft Purview account (e.g. `pvlab-{randomID}-pv`), select the item, then click Next.
- Skip the Application (optional) page by clicking Next again.
- Review your selections then click Create.
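For readers who prefer scripting over the portal, the two access policies above could also be expressed as Azure CLI commands. The sketch below only builds the `az keyvault set-policy` argument lists and prints them; it does not run them. The vault name, user principal name, and object ID are hypothetical placeholders, and the explicit permission list is an assumption about what the portal's Select all covers for secrets.

```python
from typing import List

# Assumed equivalent of the portal's "Select all" for secret permissions.
ALL_SECRET_PERMISSIONS = [
    "get", "list", "set", "delete", "recover", "backup", "restore", "purge",
]


def set_policy_command(vault_name: str, principal_flag: str, principal: str,
                       secret_permissions: List[str]) -> List[str]:
    """Build an `az keyvault set-policy` argument list (nothing is executed)."""
    if principal_flag not in ("--upn", "--object-id", "--spn"):
        raise ValueError(f"unsupported principal flag: {principal_flag}")
    if not secret_permissions:
        raise ValueError("at least one secret permission is required")
    return [
        "az", "keyvault", "set-policy",
        "--name", vault_name,
        principal_flag, principal,
        "--secret-permissions", *secret_permissions,
    ]


# Hypothetical placeholder values; replace them with your own.
VAULT = "my-key-vault"
USER_UPN = "user@example.com"
PURVIEW_OBJECT_ID = "00000000-0000-0000-0000-000000000000"

# Policy #1: all secret permissions for your own account.
user_policy = set_policy_command(VAULT, "--upn", USER_UPN, ALL_SECRET_PERMISSIONS)
# Policy #2: Get and List only, for the Microsoft Purview managed identity.
purview_policy = set_policy_command(VAULT, "--object-id", PURVIEW_OBJECT_ID, ["get", "list"])

print(" ".join(user_policy))
print(" ".join(purview_policy))
```

The printed commands can be reviewed and then run in a shell where the Azure CLI is signed in. Keeping the Purview policy limited to Get and List mirrors the portal steps.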
In order to securely store our Azure SQL Database password, we need to generate a secret.
- Navigate to Secrets and click Generate/Import.
- Copy and paste the values below into the matching fields and then click Create.
  - Name: `sql-secret`
  - Value: `sqlPassword!`
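The same secret could be created from code rather than through the portal. The following is a minimal sketch, assuming a client object that exposes `set_secret(name, value)`, as the `SecretClient` class in the `azure-keyvault-secrets` package does. The helper `store_secret` is a hypothetical name introduced here, and the vault URL in the comment is a placeholder.

```python
from typing import Any

SECRET_NAME = "sql-secret"  # the secret name used in this module


def store_secret(client: Any, name: str, value: str) -> str:
    """Store a secret via any client exposing set_secret(name, value).

    Returns the name reported back by the client.
    """
    if not value:
        raise ValueError("secret value must not be empty")
    stored = client.set_secret(name, value)
    return stored.name


# With the SDK installed (pip install azure-identity azure-keyvault-secrets),
# a real client could be created like this; the vault URL is a placeholder:
#
#   from azure.identity import DefaultAzureCredential
#   from azure.keyvault.secrets import SecretClient
#   client = SecretClient(
#       vault_url="https://<your-vault>.vault.azure.net",
#       credential=DefaultAzureCredential(),
#   )
#   store_secret(client, SECRET_NAME, "<your SQL password>")
```

Passing the client in as a parameter keeps the helper easy to exercise with a stand-in object before pointing it at a real vault. The call only succeeds if the signed-in identity has the Set permission granted in the first access policy.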
To make the secret accessible to Microsoft Purview, we must first establish a connection to Azure Key Vault.
- Open the Microsoft Purview Governance Portal, navigate to Management Center > Credentials, and click Manage Key Vault connections.
- Click New.
- Copy and paste the value below to set the name of your Key Vault connection, then use the drop-down menu items to select the appropriate Subscription and Key Vault name, and click Create.
  - Name: `myKeyVault`
- Since we have already granted the Microsoft Purview managed identity access to our Azure Key Vault, click Confirm.
- Click Close.
- Under Credentials click New.
- Using the drop-down menu items, set the Authentication method to `SQL authentication` and the Key Vault connection to `myKeyVault`. Once the drop-down menu items are set, copy and paste the values below into the matching fields, and then click Create.
  - Name: `credential-SQL`
  - User name: `sqladmin`
  - Secret name: `sql-secret`
- Open the Microsoft Purview Governance Portal, navigate to Data map > Sources, and click Register.
- Search for `SQL Database`, select Azure SQL Database, and click Continue.
- Select the Azure subscription, Server name, and Collection. Click Register.
- Open the Microsoft Purview Governance Portal, navigate to Data map > Sources, and within the Azure SQL Database tile, click the New Scan button.
- Select your Database (e.g. `pvlab-{randomID}-sqldb`), set the Credential to `credential-SQL`, turn Lineage extraction to `Off`, and click Test connection. Once the connection test is successful, click Continue.
  Note: If the "Test connection" appears to be hanging, click Cancel and retry.
- Click Continue.
- Click Continue.
- Set the trigger to Once, click Continue.
- Click Save and Run.
- To monitor the progress of the scan, click View Details.
- Click Refresh to periodically update the status of the scan. Note: It will take approximately 5 to 10 minutes to complete.
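The manual Refresh step above is a polling loop: check the status, wait, and check again until the scan finishes. A minimal, generic sketch of that loop follows. The function `wait_for_scan`, the status names, and the canned statuses are all assumptions made for illustration; `get_status` stands in for whatever call actually returns the scan status in your environment.

```python
import time
from typing import Callable, Iterable

# Assumed terminal status names; check the values your environment reports.
TERMINAL_STATUSES = frozenset({"Succeeded", "Failed", "Canceled"})


def wait_for_scan(get_status: Callable[[], str],
                  interval_seconds: float = 30.0,
                  timeout_seconds: float = 900.0,
                  terminal: Iterable[str] = TERMINAL_STATUSES,
                  sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll get_status() until it returns a terminal status or the timeout elapses."""
    terminal_set = set(terminal)
    waited = 0.0
    while True:
        status = get_status()
        if status in terminal_set:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"scan still '{status}' after {waited:.0f} seconds")
        sleep(interval_seconds)
        waited += interval_seconds


# Dry run with canned statuses instead of a real Purview call.
canned = iter(["Queued", "InProgress", "InProgress", "Succeeded"])
final_status = wait_for_scan(lambda: next(canned), interval_seconds=0.0)
print(final_status)
```

The default timeout of 900 seconds leaves some headroom over the 5 to 10 minutes the scan is expected to take. The `sleep` parameter is injectable so the loop can be exercised without real waiting.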
- To view the assets that have materialised as an outcome of running the scans, perform a wildcard search by typing the asterisk character (`*`) into the search bar and hitting the Enter key to submit the query and return the search results.
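The wildcard search above can also be thought of as a small request and response pair. The sketch below builds a request body with `keywords` set to `*` and summarizes a response. The field names (`keywords`, `limit`, `value`, `name`, `entityType`) are assumptions modeled on the Microsoft Purview discovery query API and should be checked against the current API reference. The sample response is made up for a dry run and is not real scan output.

```python
from typing import Any, Dict, List


def build_search_request(keywords: str = "*", limit: int = 25) -> Dict[str, Any]:
    """Build a search request body; '*' matches every asset."""
    if limit < 1:
        raise ValueError("limit must be positive")
    return {"keywords": keywords, "limit": limit}


def summarize_results(response: Dict[str, Any]) -> List[str]:
    """Return 'name (entityType)' for each item under the response's 'value' key."""
    return [
        f"{item.get('name', '?')} ({item.get('entityType', '?')})"
        for item in response.get("value", [])
    ]


# Made-up sample response for a dry run; not real scan output.
sample_response = {
    "@search.count": 2,
    "value": [
        {"name": "Customer", "entityType": "azure_sql_table"},
        {"name": "Address", "entityType": "azure_sql_table"},
    ],
}

request_body = build_search_request()
print(request_body)
print(summarize_results(sample_response))
```

An empty result list after a successful scan usually points to a permissions or collection issue rather than a search problem, so it is worth checking that your account has the Data Reader role on the collection.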
Note: This is the same knowledge check referenced in Module 2A. If you have already completed the knowledge check from the previous module, please skip this step.
- What type of object can help organize data sources into logical groups?
  A ) Buckets
  B ) Collections
  C ) Groups
- At which point does Microsoft Purview begin to populate the data map with assets?
  A ) After a Microsoft Purview account is created
  B ) After a Data Source has been registered
  C ) After a Data Source has been scanned
- Which of the following attributes is not automatically assigned to an asset as a result of the system-built scanning functionality?
  A ) Technical Metadata (e.g. Fully Qualified Name, Path, Schema, etc)
  B ) Glossary Terms (e.g. column `Sales Tax` is tagged with the `Sales Tax` glossary term)
  C ) Classifications (e.g. column `ccnum` is tagged with the `Credit Card Number` classification)
This module provided an overview of how to create a collection, register a source, and trigger a scan.