Stephan Köppen

In most cases a dataware house includes string columns which are either nullable or not nullable. So lets assume you have a table with the following structure:

 

CREATE TABLE BLANK_TEST

(

ID INT NOT NULL

, Testname NVARCHAR(100) NOT NULL

, Blank NVARCHAR(100) NULL

, Comment NVARCHAR(100) NOT NULL

)

WITH (DISTRIBUTION = REPLICATE)

 

 

The DWLoader gives you two parameters to modify the load process: 

Parameter –c:

Removes white space characters from the left and right side of char, nchar, varchar, and nvarchar fields. Converts each field that contains only white space characters to the empty string.   Examples:  ‚ ‚ gets truncated to “  ‚ abc ‚ gets truncated to ‚abc‘

The important part is the following sentence right at the bottom:

When –c is used with -E, the –E operation occurs first. Fields that contain only white space characters are converted to the empty string, and not to NULL.

Parameter –E:

Convert empty strings to NULL. The default is to not perform these conversions.

 

So let’s see what happens when playing with the options:

Let’s create 5 CSV Files with each on line. Store them under the Name oft he ID.

ID;Testname;Blank;Comment

1;Parameter -E;;Converted to NULL

2;Parameter -c; ;Blank removed

3;Parameter –c;;Nothing to remove

4;Parameter –E –c;;Converted to Null due to missing blank

5;Parameter –E –c; ;Converted to Blank

 

Use the prepaid statements to load the flatfiles into the table. Please adjust to your own needs.

„C:\Program Files\Microsoft SQL Server Parallel Data Warehouse\100\dwloader.exe“ -T Test_DB..Blank_Test -U test -P test -t „;“ -R c:\test\1.reject -i c:\test\1.csv -E

„C:\Program Files\Microsoft SQL Server Parallel Data Warehouse\100\dwloader.exe“ -T Test_DB..Blank_Test -U test -P test -t „;“ -R c:\test\2.reject -i c:\test\2.csv -c

„C:\Program Files\Microsoft SQL Server Parallel Data Warehouse\100\dwloader.exe“ -T Test_DB..Blank_Test -U test -P test -t „;“ -R c:\test\3.reject -i c:\test\3.csv -c

„C:\Program Files\Microsoft SQL Server Parallel Data Warehouse\100\dwloader.exe“ -T Test_DB..Blank_Test -U test -P test -t „;“ -R c:\test\4.reject -i c:\test\4.csv -E -c

„C:\Program Files\Microsoft SQL Server Parallel Data Warehouse\100\dwloader.exe“ -T Test_DB..Blank_Test -U test -P test -t „;“ -R c:\test\5.reject -i c:\test\5.csv -E -c

 

After executing the statements we will check the result :

 

Test 1: –E is used which will convert the blank into NULL

Test 2:  -c is used and will remove the space. No conversion into NULL

Test 3: -c  is used and result stays the same.

Test 4: -E –c is used which will convert into NULL

Test 5: -E –C is used and the space will be removed without converting to NULL

 

So in general it is possible to load NULLs as well as blanks. In order to do so you have to add a fake space into the expected column. The easiest way is probably to use ETL to transform the expected column. Please keep in mind that –c gets rid of all whitespaces.