SQL DBCC CHECKDB repair_allow_data_loss example

By | October 14, 2020

For those who are learning about SQL Server database corruption and questioning about when a single row in a SQL Server data page is corrupted, for example due to page CHECKSUM error, here is what will happen if your run BDCC CHECKDB with repair_allow_data_loss.

Depending on the kind of corruption, data loss using CHECKDB with repair_allow_data_loss can vary to some different degrees. To the problem of the description, in particular, even when everyone knows for sure only a single row is corrupted in a data page that cause a CHECKSUM issue, the entire data page will be deallocated when SQL Server repairs with the allow data loss.

For those who are not really familiar with CHECKSUM, it is calculated and stored at the page level and not at each row level, that is why it provides a high level of data file integrity. Basically, CHECKSUM works by assuming a data page contain 100 rows with a single value of 2 in each row. The CHECKSUM algorithm will sum these values and locates the sum, which is 200, to the page header as a checksum. Then, the external aspect outside of SQL Server somehow changes one of the rows into a value of 0 in the data page. The calculated checksum value would be 198 when the SQL Server reads this data page into the buffer pool and validates the page checksum, but the checksum value located in the page header would remain be 200. That’s explains why the SQL Server would raise this as a CHECKSUM corruption issue, because the original and expected checksum values are different.

The thing below will step you through an example of a page CHECKSUM issue and you will be able to see that the SQL Server deallocates the entire data page even when only one row is corrupted.

The code demonstration is displayed on SQL Server 20116 Developer Edition on Service Pack 2. This one will keep an eye on what data are discarded when rapair_allow_data_loss option is specified with DBCC CHECKDB.

This following script will create a database (CorruptionTest) and populate the table dbo.mssqtips with 250,000 rows. There is a column BigCol in table dbo.mssqtips with data type CHAR(2000). With this one, you can expect the data page to fit three row per page.

USE master

GO

DROP DATABASE IF EXISTS [CorruptionTest]

GO

CREATE DATABASE [CorruptionTest]

GO

ALTER DATABASE [CorruptionTest] MODIFY FILE ( NAME = N’CorruptionTest’, SIZE = 2GB )

GO

ALTER DATABASE [CorruptionTest] MODIFY FILE ( NAME = N’CorruptionTest_log’, SIZE = 2GB )

GO

ALTER DATABASE [CorruptionTest] SET RECOVERY FULL;

GO

ALTER DATABASE [CorruptionTest] SET PAGE_VERIFY CHECKSUM 

GO

CREATE TABLE CorruptionTest.dbo.mssqltips

(increment INT, randomGUID uniqueidentifier, randomValue INT, BigCol CHAR(2000) DEFAULT ‘a’,

INDEX CIX_SQLShack_increment1 UNIQUE CLUSTERED (increment))

GO 

SET NOCOUNT ON;

DECLARE @counter INT = 1;

BEGIN TRAN

   WHILE @counter <= 250000

   BEGIN

      INSERT INTO CorruptionTest.dbo.mssqltips (increment, randomGUID, randomValue)

      VALUES (@counter, NEWID(), ABS(CHECKSUM(NewId())) % 140000000) 

      SET @counter += 1

   END;

COMMIT TRAN;

GO 

The question below outputs the table dbo.msstips top 10 rows across 3 data pages: Page ID 224, 226, and 227. As you may have expected, you are able to see that each data page has three rows. for this corruption demonstration, you can randomly select row wit increment = 5 in blue to work with. The value in the column BigCol in Page ID 226 Slot 1 will be corrupted with 0x0.

SELECT TOP 10

sSys.fn_PhysLocFormatter(%%physloc%%) PageId,

*

FROM [CorruptionTest] . [dbo] . [mssqltips]

How to analyze the corrupt data page content in SQL Server? so, the value in BigCol has been corrupted with 0x0 Page Id 226 Slot 1. By using DBCC PAGE, you can look at the content of Page Id 226. From the output, the Slot 0 and Slot 2 in Page Id 226 is still carrying all the value that has been inserted. Meanwhile, the value of BigCol in Slot 1 is carrying blank, because it has been overwritten with value 0x0.

DB =CC TRACEON (3604);

GO

DBCC PAGE (‘CorruptionTest’, 1, 226, 3);

GO

Corruption can remain hidden in a database with the exception when the corrupted one is read by storage engine into memory and buffer pool validates the page checksum, or the corruption is detected when performing a database consistency check.

In order to check whether the statement above is true, you can execute three queries. The first two ones will not access the corrupt page. As for the third one, it will hit the corrupt data page with SELECT TOP 24995.

The first two ones execute successfully and return rows because it does not have to touch the corrupted data page. Meanwhile, the third query fails because it needs accessing the corrupted page. the query was terminated and returns an error as soon as the storage engine tries to retrieve the corrupted data page.

Here is the method to repair the database with allow data loss. You will have to set the database into single user mode, run the repair, and then set the database back into multi user mode.

USE master;

ALTER DATABASE [CorruptionTest] SET SINGLE_USER WITH ROLLBACK IMMEDIATE;

GO

DBCC CHECKDB (‘CorruptionTest’, REPAIR_ALLOW_DATA_LOSS) WITH NO_INFOMSGS;

GO

ALTER DATABASE [CorruptionTest] SET MULTI_USER;

The original question is run to retrieve the top 10 rows again and it looks like increment 4, 5, and 6 are now gone. The main reason is because the entire Page Id 226 was deallocated. Each row in the table dbo.mssqltips get a new Page Id because the table has been rebuilt and allocated new pages. If you try to read the output messages of the DBCC CHECKDB closely, all the details were described in the message pane.

Page Id Incement randomGUID randomValue BigCol
1 (1:168024:0) 1 818765E0-7B21-4F5F-A0F7-2521A40B60C 2866302 a
2 (1:168024:1) 2 13036C66-ECEF-42A9-9B7A-20FA941DB322 113273600 a
3 (1:168024:2) 3 1B0004964-65DE-4CB2-B7B2-CCB95F5F0B61 92630249 a
4 (1:168025:0) 7 4413379F-9527-48ED-BC4A-1C68B267784F 2478275 a
5 (1:168025:1) 8 162DB064-3B30-4DB1-896E-C9F3AE0FE677 36710420 a
6 (1:168025:2) 9 3ACA4174-331F-44341-8169-5FC7A96CBB26 60098378 a
7 (1:168026:0) 10 50DC192A-CC37-4CD0-B76F-A9CC89DB46FE 16776958 a
8 (1:168026:1) 11 B0CA7B1F-6A9C-4C30-A704-3A196C69E459 116332878 a
9 (1:168026:2) 12 AAC67528-40DA-4084-8BAF-3AB768C2D747 92329147 a
10 (1:168027:0) 13 D19EB664-AE95-42A0-989F-316FE7C2243F 138873660 a

Leave a Reply

Your email address will not be published. Required fields are marked *