Dissecting Data Masking Feature in Azure SQL DB That collaborated with SSMS 18

This article describes a new feature for SQL databases in SSMS, static data masking. Previously, the static data masking function was only available in Azure SQL DB. Database administrators are responsible for database security and compliance issues.

In a typical workflow, we had multiple environments for applications such as production, staging, UAT, development, and sandbox. It is a regular task of the DBA to update the sub-environment (staging, testing, UAT, etc.) with a backup of the production database backup.

Sometimes we need to share our database backups with outside providers. The database may contain sensitive information such as personal information (PII data) such as email address, contact information, country ID number, etc.

Table of Contents on this article below:

Overview

You may have other sensitive information such as bank account number, credit card number, CVV, etc. I don’t want anyone to access it. We also need to comply with GDPR, PCI and SOX regulations.

Restoring the database to a lower environment puts sensitive data at risk because all data is transferred to the lower environment and no security or encryption is enabled at that level. You may need to mask or shuffle your data before you can restore the database.

Consider the following example where you have sensitive credit card information in a production database and have performed a database recovery with development, UAT, and sandbox environments.

From here, you can see that data is transmitted in all environments, making it accessible to a large audience and at risk.

In SQL Server 2016, we discovered a new solution called dynamic data masking that prevents unauthorized users from accessing defined sensitive information. You need to define truncation rules to mask your data.

When a user requests data, SQL Server verifies access. If you don’t have the necessary permissions, you will receive the masked data. There are no changes to the source data in this process. However, this does not match our case described above.

SSMS 18.0 includes static data masking for SQL databases. Previously it was only available in Azure SQL DB. Static data masking creates a copy of the database and applies data transformation rules to it.

You can back up this masked database in a low-level environment to ensure that sensitive data is not transferred to the other end. Starting with SQL Server 2012, you can use this function for databases.

From the diagram above, you can see that using static data masking, sensitive data information is not transferred to other environments.

You can use static data masking for the following purposes:

  • Preparing a non-production database environment
  • Database development
  • Database troubleshooting
  • Data exchange with third-party providers

Worked example

Let’s use an example to understand static data masking.

First, create a ‘StaticDemo’ database with tables. Paste sample data with the following query. This table contains the date of birth, email address and credit card number that you want to mask with this function. For an example, see the script in Appendix A.

Right click on the database name -> Tasks -> Mask Database (Preview)

This will start the static data masking (preview) wizard as shown here.

This wizard will show you Step 1: Configure Masking. In “Configure Masking” you can mask all columns in the database. Ideally you don’t want to mask all columns in the database.

You can also filter the columns below this. You can also see all suitable tables under the filtered column.

You can do inline search on filtered columns. As soon as I write something in the text field I get a table with columns with these names. For example, in the image below, entering your email will give you table and column details.

If you select all the columns in this table, you will see the error regarding the columns that are not suitable for this data masking. Here you can see that the identity column is not suitable for static data masking.

Now all you have to do is select the column to apply static data masking. You can see that the action is “shuffled” for all selected columns and there are configuration options for each selected column.

This is it 5 masking functions that properly available:

  1. Null: replace data in specific columns with NULL values.
  2. Single Value masking: if you specify a single value in this mask, that value is copied to all rows of data in a particular column.
  3. Shuffle: in this masking function, values are blended onto a new line.
  4. Group Shuffle: in this shuffle, several columns are grouped into shuffling groups.
  5. String Composite: Defines a string format and values are substituted according to the given string

The Shuffle Mask function is selected by default. You can change it in the drop-down list. I will change the dates as follows.

You need to define a string format for string combinations. To do this, click on Configure and a pop-up screen will open.

Define the string and click OK. Leave an extra ‘\’ here to see how it works during static data masking activity.

In step 2 you can choose the backup location for the .bak clone files By default this is the default location set at the instance level. However, you can change the location here as needed.

In the next step, enter the name of the masked database in the text box shown below.

Click to start the static data masking process. The following error message is displayed.

Let me explain the error message.

You are prompted to manually remove the following artifacts.

  • Partially masked database ‘StaticDemo_Masked’
  • Backup file used for cloning operation location at ‘…’

I intentionally generated this error for better understanding. For test purposes, I performed this static data masking for this database. So I ask you to manually remove the masked database along with the backup files used for the replication process.

  • Long escape character ‘\’ at the end:

This is because of the “\” placed in the DOB column during the string masking function. So, go to the configuration options in the DOB column and remove the “\” at the end.

If you remove the ‘\’ character, you can also see the sample value below the pattern.

Go back and click OK to apply static data masking. The message ‘Masking Complete’ is displayed.

You will also be prompted to clean up the backup files used during cloning.

Static configuration

You can save the masking configuration to an XML file. You can use this XML file later to apply the masking function directly without doing the configuration again. Click “Save Configuration” and enter the location along with the file name to save. You can use “load config” to load an existing configuration XML file.