It is quite common in a SMP Environment to update a table. Sometimes it happens that you have to update a big table. And to make it even worse most of the data in the hug table needs to be updated. For example when you have added a new column and want to update it with a certain value.
It is important that the transaction log is not going to be maxed out and the whole transaction will be rolled back. In our SMP world we can use something like the query below to go over a partial result and update only a certain amount. In our case wedeclare @rows int = 1; while (@rows > 0) begin update top(100000) HugeTable set updatecol = updatecol where Month_ID = 201312 and updatecol is null set @rows = @@ROWCOUNT end
This query will only update 1.000.000 rows at a time to make sure the transaction log doesn’t get into FULL state.
If you had the chance to experience the awesome performance of the PDW you will also experience a couple of things that won’t work right away because some things are not implemented yet. One of those features is the feature “update top (XXXXXX)”.
But what can we do if we have sort of the same requirement and need to update the table iteratively. The answer to this question is “modulo”. The best thing to do, is to select e.g. an integer column with a great selectivity as well as a great cardinality.
In our scenario I selected the Contract_ID with modulo and splittet the table into 10 equal result sets. Select the min and max values and iterate as you are used to over the table.declare @min int declare @max int
set @min = (select min(Contract_ID%10) from HugeTable where Month_ID = 201312) set @max = (select max(Contract_ID%10) from HugeTable where Month_ID = 201312)
while @min <= @max begin update HugeTable set updatecol = updatecol where Month_ID = 201312 and updatecole is null and Contract_ID %10 = @min set @min = @min + 1 end
With this simple rewrite you can achieve and easy update for your tables.