Wednesday, October 18, 2017

tempdb becomes full when updating Clustered Columnstore index

I had a table with 3.5 billion records and wanted to update one column in all records. This table was not partitioned though the design says that it should be partitioned. With spec of 2 processors (12 cores) and 64 GB RAM, it could never complete the update as it makes the tempdb full, almost 800GB, consuming all space in the drive.

I am still not sure about the exact reason for this but listing out possible reasons that could help you too.
  1. Table is not partitioned. I should have applied partitioning before loading billions of records.
  2. Clustered columnstore index needs to be rebuilt. May it takes long time because the index is fragmented (had no way of checking as everything was urgent).
  3. May be, it is not something to be surprised, update means, delete and insert, that makes records adding delta and then move to rowgroups, hence takes time and needs extra support from tempdb.
  4. Not enough memory for completing the UPDATE process, hence use tempdb.
However, I manage to execute the UPDATE as a batch process. Although it took a significant time, it did not use tempdb (it might have slightly used but did not notice).

Here is the code I finally used;

WHILE EXISTS (SELECT * FROM dbo.Table01 WHERE Column1 IS NULL)
BEGIN

 UPDATE TOP (1000000) dbo.Table01
  SET Column1 = RIGHT(Column2, 4)
 WHERE Column1 IS NULL
END

No comments: