SQL – Change Data Capture
Change data capture is a SQL feature to capture any changes to data in tables/columns. This feature is mainly useful to capture changes in a database while doing auditing.
Implementing the CDC using BDD (Behavioural Driven Development) methodology with the following four cases
1. Enabling CDC
2. Capturing changes-Insert/Update/Delete
3. Analysis of data from CDC
4. Disabling CDC
Here considering AdventureWorks Database for CDC implementation
BDD Case 1: Enabling CDC
Step 1: Create two source tables with foreign key relationship as
USE [AdventureWorks] GO CREATE TABLE [dbo].[OrderHeader]( [SalesOrderID] [int] IDENTITY(1,1) NOT FOR REPLICATION NOT NULL, [RevisionNumber] [tinyint] NOT NULL, [OrderDate] [datetime] NOT NULL DEFAULT GETDATE(), [DueDate] [datetime] NOT NULL DEFAULT GETDATE(), [ModifiedDate] [datetime] NOT NULL DEFAULT GETDATE() CONSTRAINT [PK_OrderHeader_SalesOrderID] PRIMARY KEY CLUSTERED ( [SalesOrderID] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] GO CREATE TABLE [dbo].[OrderDetail]( [SalesOrderID] [int] NOT NULL, [SalesOrderDetailID] [int] IDENTITY(1,1) NOT NULL, [ModifiedDate] [datetime] NOT NULL DEFAULT GETDATE(), CONSTRAINT [PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID] PRIMARY KEY CLUSTERED ( [SalesOrderID] ASC, [SalesOrderDetailID] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] GO USE [AdventureWorks] GO ALTER TABLE [dbo].[OrderDetail] WITH CHECK ADD CONSTRAINT [FK_OrderDetail_OrderHeader_SalesOrderID] FOREIGN KEY([SalesOrderID]) REFERENCES [dbo].[OrderHeader] ([SalesOrderID]) ON DELETE CASCADE GO ALTER TABLE [dbo].[OrderDetail] CHECK CONSTRAINT [FK_OrderDetail_OrderHeader_SalesOrderID] GO
Step 2: Check the record count in the source tables
-- Record Count USE AdventureWorks SELECT COUNT(*) AS [OrderHeaderCount] FROM [dbo].[OrderHeader] SELECT COUNT(*) AS [OrderDetailCount] FROM [dbo].[OrderDetail]
Step 3 : Check whether CDC is enabled for Database
--To check whether CDC is enabled for database. USE master GO SELECT [name], database_id, is_cdc_enabled FROM sys.databases WHERE [name] = 'AdventureWorks' --Take of WHERE clause to check for all the DBs' on server GO
is_cdc_enabled value denotes as
0 — not enabled
1 — enabled
Step 4: Enable CDC at Database level as
-- Enable CDC in AdventureWorks database. USE AdventureWorks GO EXEC sys.sp_cdc_enable_db GO
Step 5: Check the CDC enable status at Database level
--To check whether CDC is enabled for database. USE master GO SELECT [name], database_id, is_cdc_enabled FROM sys.databases WHERE [name] = 'AdventureWorks' --Take of WHERE clause to check for all the DBs' on server GO
Step 6: Check any tables in database have already enabled for CDC as
--To check tables of database have already been enabled for CDC. USE AdventureWorks GO SELECT [name], is_tracked_by_cdc FROM sys.tables GO
Step 7: Check for any tables in system tables
Step 8: Enable CDC for 2 source tables as
--The script to enable CDC on dbo.OrderHeader table, i.e. enabling on all the columns of table USE AdventureWorks GO EXEC sys.sp_cdc_enable_table @source_schema = N'dbo', @source_name = N'OrderHeader', @role_name = NULL --Permission to access the CDC tables to particular table --here passing NULL means everyone could get the access to CDC table data --@captured_column_list = '[RevisionNumber],[OrderDate]' To enable onlyparticular columns in the table GO
On enabling CDC on first table, it will create two SQL Agent Jobs, CDC requires SQL Server Agent services on the server.
Step 9: Enabling CDC will create system tables to hold the tracking data on data change at source tables
BDD Case 2: Capturing changes Insert/Update/Delete
Step 1: Check for any existing records in the source and cdc system tables
--Check for any existing records. USE AdventureWorks GO SELECT * FROM dbo.OrderHeader GO USE AdventureWorks GO SELECT * FROM dbo.OrderDetail GO USE AdventureWorks GO SELECT * FROM cdc.dbo_OrderHeader_CT GO USE AdventureWorks GO SELECT * FROM cdc.dbo_OrderDetail_CT GO
Step 2: CDC – insert and check CDC tables
Insert one record in each table in a single transaction, here using single transaction to for both tables insertion to check how CDC is giving the change status of records
--Insert a new record in user tables in a single transaction. USE AdventureWorks GO INSERT INTO [dbo].[OrderHeader] ([RevisionNumber]) VALUES (1) INSERT INTO [AdventureWorks].[dbo].[OrderDetail] ([SalesOrderID]) VALUES (1) GO
Check for CDC tables as using source query
Notes:
1. Insertion has captured and recorded in the CDC tables
2. _$start_lsn (Last Sequence Number) is different for both records even though insertion in a single transaction so CDC is making records capture individually but not as a group.
3. _$operation value is 2 (which is for Insertion capture of CDC)
 
Step 3: CDC – Update and check CDC tables
Updating the existing records in source tables as
--Update Records in user tables in a single transaction. USE AdventureWorks GO UPDATE [dbo].[OrderHeader] SET ModifiedDate = GETDATE() UPDATE [dbo].[OrderDetail] SET ModifiedDate = GETDATE() GO
Check for CDC tables using source query
Notes:
1. Updates has captured and recorded in the CDC tables
2. _$start_lsn (Last Sequence Number) is different for both records even though updates in a single transaction so CDC is making records capture individually but not as a group.
3. _$operation values are 3,4 (these are for updates capture of CDC)
Here 3, is for the record before the update
4, is for the record after the update
Step 4: CDC – Delete and check CDC tables
Delete the existing records in source tables as
--Delete Records in user tables in a single transaction. USE AdventureWorks GO DELETE FROM [dbo].[OrderDetail] DELETE FROM [dbo].[OrderHeader] GO
Check for CDC tables using source query
Notes:
1. Deletion has captured and recorded in the CDC tables
2. _$start_lsn (Last Sequence Number) is different for both records even though update in a single transaction so CDC is making records capture individually but not as a group.
3. _$operation value is 1 (which is for Delete capture of CDC)
BDD Case 3: Analysis of data from CDC
Analysis of data from CDC tables is required to get the changes in the source tables.
Step 1: CDC will create the functions as
The script will give the changes in the database from yesterday and could also use other functions according to the requirement to get the data from CDC tables
USE AdventureWorks GO DECLARE @start_time DATETIME, @end_time DATETIME, @start_lsn BINARY(10), @end_lsn BINARY(10); SELECT @start_time = GETDATE()-1, @end_time = GETDATE(); SELECT @start_lsn = sys.fn_cdc_map_time_to_lsn('smallest greater than', @start_time); SELECT @end_lsn = sys.fn_cdc_map_time_to_lsn('largest less than or equal', @end_time); SELECT * FROM cdc.fn_cdc_get_all_changes_dbo_OrderHeader(@start_lsn,@end_lsn,'all') GO
BDD Case 4: Disabling CDC
Disabling the CDC is a two-step process as equal to enabling process
Step 1: Disabling CDC on tables
--Disabling Change Data Capture on a table USE AdventureWorks; GO EXECUTE sys.sp_cdc_disable_table @source_schema = N'dbo', @source_name = N'OrderHeader', @capture_instance = N'dbo_OrderHeader'; GO EXECUTE sys.sp_cdc_disable_table @source_schema = N'dbo', @source_name = N'OrderDetail', @capture_instance = N'dbo_OrderDetail'; GO
Step 2: Disabling CDC on Database
--Disable Change Data Capture on Database USE AdventureWorks GO EXEC sys.sp_cdc_disable_db GO
On disabling the CDC on tables and Database, all the CDC tables and functions will be removed from the database along with CDC SQL Agent jobs.
Points to consider:
1. CDC capture the changes in database individual record level but as a group of records changes so getting the changes in table with a group by transaction is not possible as here didn’t find the changes using single transaction DML operations.
2. CDC performance will go down if we are using any bulk operations as it needs to capture changes besides regular bulk operation
3. CDC will not accept Truncation on table