Stefan Grigat

In this post I will introduce a new feature of SQL Server 2016. Dynamic Data Masking. Especially in the Business Intelligence World or when it comes to reporting or data analytics you often need to anonymize data and don’t report on personal information. This new feature helps here.

There are two aspects for this topic. On the one hand you don’t want to see personal information like phone numbers, credit-cards, e-mail-addresses and so on in your BI System or Report and you have to hide this data. On the other you need to know thru the whole BI stack and all systems where your data comes from so you can detect bugs when it comes to wrong data or numbers.

Because of the second aspect you cannot only overwrite the personal information but need an agreed data hiding/ data masking strategy. Now in SQL Server 2016 there is this new feature that helps to handle both aspects.

The SQL Server 2016 has three pre-defined functions for masking and the ability to define your masks

  1. Default: That replaces characters with ‚XXXX‘ and Numbers with 0
  2. Email: That replaces a part before the @ and puts a ‚@XXXX.com‘ at the end
  3. Random: That replaces numbers with random values
  4. Custom String: In this function you can define your own padding string

As this function does not really change the data itself but the display behavior, here are a few aspects about security settings:
Users with data_reader or simple select-permission will see the masked data.
Elevated users with owner-permission of the database or control-permission of the table will get the chance to see unmasked data.

Now let’s go to the code and see result sets:

We create a table:
CREATE TABLE [dbo].[MaskedTable](
[ID] [int] NULL,
[Name] [nvarchar](50)NULL,
[StringMaskDefault] [varchar](20)NULL,
[StringMaskPart] [varchar](20)NULL,
[IntMask] [int] NULL,
[RandomIntMask] [int] NULL,
[Email] [nvarchar](60)NULL
)ON [PRIMARY];

I named the columns in the way I will use the mask function.
StringMaskDefault could be a street name or some string you completely want to hide.
StringMaskPart could be something you only want to hide partial. For example credit card information.
IntMask could be some integer value like salary or age.
RandomIntMask delivers a random number which is different every time you query the data and could be within a defined range.
Email uses the function to hide email addresses

Now let’s insert some data:
INSERT INTO [dbo].[MaskedTable]
([ID]

,[Name]
,[StringMaskDefault]
,[StringMaskPart]
,[IntMask]
,[RandomIntMask]
,[Email])
VALUES
(1,‚Hans Hansen‘,‚JustANormalString‘,‚JustANormalString2‘,123,123,‚hans@hansen.de‘),
(2,‚Max Muster‘,‚JustANormalString‘,‚JustANormalString2‘,123,123,‚max@muster.de‘)

The normal result would look like:

ID Name StringMaskDefault StringMaskPart IntMask RandomIntMask Email

1

Hans Hansen JustANormalString JustANormalString2

123

123

hans@hansen.de

2

Max Muster JustANormalString JustANormalString2

123

123

max@muster.de

Now let’s alter the columns with the masking functions:
ALTER Table [MaskedTable]
ALTER COLUMN [StringMaskDefault] ADD MASKED WITH (FUNCTION=‚default()‘)

ALTER Table [MaskedTable]
ALTER COLUMN [StringMaskPart] ADD MASKED WITH (FUNCTION=‚partial(5,“YY“,5)‘)
–In this 20 char column I replace everything of the string except the first and last 5 chars with YY

ALTER Table [MaskedTable]
ALTER COLUMN [Email] ADD MASKED WITH (FUNCTION=‚email()‘)

ALTER Table [MaskedTable]
ALTER COLUMN [IntMask] ADD MASKED WITH (FUNCTION=‚default()‘)

ALTER Table [MaskedTable]
ALTER COLUMN [RandomIntMask] ADD MASKED WITH (FUNCTION =‚random(1, 20)‘)

For a user with just select permission as a service account for a Business Intelligence environment could be the result looks like this:

EXECUTE AS user=‚BI_Reader‘;
SELECT
TOP 1000 [ID]
,[Name]
,[StringMaskDefault]
,[StringMaskPart]
,[IntMask]
,[RandomIntMask]
,[Email]
FROM [Test].[dbo].[MaskedTable]

ID Name StringMaskDefault StringMaskPart IntMask RandomIntMask Email

1

Hans Hansen xxxx JustAYYring2

0

6

hXXX@XXXX.com

2

Max Muster xxxx JustAYYring2

0

12

mXXX@XXXX.com

I think this is really a great feature and with only some ALTER statements you can easily hide personal or sensitive data from read-only users like Business Intelligence or reporting users.