CDC Basics here
--STEP 1
CREATE DATABASE TESTCDC
--STEP2
USE TESTCDC
GO
--DROP Table TestTable
CREATE TABLE TESTTable
(ID INT Primary key,
NAME VARCHAR(50))
--STEP3
INSERT INTO TESTTable(ID, NAME)
VALUES (1,'Siva')
INSERT INTO TESTTable(ID, NAME)
VALUES (2,'Raj')
INSERT INTO TESTTable(ID, NAME)
VALUES (3,'Mike')
INSERT INTO TESTTable(ID, NAME)
VALUES (4,'Lloyd')
--STEP 4
--Enable CDC at DB Level
declare @rc int
exec @rc = sys.sp_cdc_enable_db
select @rc
--Query to Verify it
select name, is_cdc_enabled from sys.databases
--STEP 5
--Enable CDC for Table
exec sys.sp_cdc_enable_table
@source_schema = 'dbo',
@source_name = 'TESTTable' ,
@role_name = 'TestRole',
@supports_net_changes = 1
--Query to Verify DB
select name, type, type_desc, is_tracked_by_cdc from sys.tables
--STEP 6
--Insert Few Records
INSERT INTO TESTTable(ID, NAME)
VALUES (5,'Tauty')
INSERT INTO TESTTable(ID, NAME)
VALUES (6,'Andy')
Update TESTTable SET ID = 10 WHERE ID = 1
--STEP7
--Populated CDC Records
Select * FROM cdc.dbo_TESTTable_CT
--STEP 8
Check Data Populated
DECLARE @Yesterday_Time smalldatetime, @to_lsn binary(10), @from_lsn binary(10);
SELECT @Yesterday_Time = DATEADD(DAY, -1, GETDATE())
SELECT @from_lsn = sys.fn_cdc_map_time_to_lsn('smallest greater than or equal', GETDATE()-2);
SELECT @to_lsn = sys.fn_cdc_map_time_to_lsn('largest less than or equal', GETDATE());
SELECT * FROM cdc.fn_cdc_get_all_changes_dbo_TESTTable (@from_lsn, @to_lsn ,'all')
--STEP 9
Putting DataPull Part in a SSIS package. For Full pull it is direct dataload (DataFlowTask of all Data). Delta Pull is Tricky and we will be looking into it. We are going to pull data from TESTTable to TestTable1. Changes in TestTable need to be reflected in TestTable1
Once CDC is enabled. You will see like CDC Capturing system tables.
--STEP 10
I am going to create a simple package as below to pull data
--CREATE a Table to Store Last LSN Value (Log Sequence Number)
USE TESTCDC
GO
CREATE TABLE CDCTracking
(LSN VARCHAR(100))
--GET Current Max LSN
SELECT sys.fn_cdc_map_time_to_lsn('largest less than or equal', GETDATE())
--Insert Data
INSERT INTO CDCTracking
VALUES('0x0000001D0000011A0001')
--STEP 11
Package looks simple as below
--STEP 12
--GET FROM and TO LSN (Excecute SQLTask)SELECT LSN as 'FROMLSN' ,(select CONVERT(varchar(100), sys.fn_cdc_get_max_lsn(), 1))
AS 'TOLSN'
FROM CDCTracking
Assign output variables for the Execute SQL Task
--STEP13
Below Variables are defined for the package
Assign the value for variable QueryCDC Variable.
SELECT *,__$operation as Operation FROM cdc.fn_cdc_get_net_changes_dbo_TESTTable (CONVERT(binary(10),'0x0000001D0000011A0001',1),CONVERT(binary(10),'0x0000001E000000B00006',1),'all')
--STEP 14
The following is the variable details set for package
--STEP 15
Add below code in C#
// TODO: Add your code here
Dts.Variables["QueryCDC"].Value = "SELECT *,__$operation as Operation" +
" FROM cdc.fn_cdc_get_net_changes_dbo_TESTTable " +
"(CONVERT(binary(10),'" + Dts.Variables["FromLsn"].Value +
"',1),CONVERT(binary(10),'" + Dts.Variables["ToLsn"].Value+
"',1),'all')";
MessageBox.Show(Dts.Variables["QueryCDC"].Value.ToString());
MessageBox.Show(Dts.Variables["FromLsn"].Value.ToString());
MessageBox.Show(Dts.Variables["ToLsn"].Value.ToString());
--STEP 16
Data Flow Task I am doing as below
--STEP 17
First assign OLEDB source editor as provided below
--STEP 18
Add a conditional splitter operator
--STEP 19
Create a OLEDB Command to update as shown below
DELETE Operation as below
Insert is straight forward Data Insert, OLEDB Destination
End to End would Look Like below
INSERT INTO TESTTable(ID, NAME)
VALUES (91,'Tamy')
INSERT INTO TESTTable(ID, NAME)
VALUES (92,'Mosses')
Update TESTTable SET ID = 110 WHERE ID = 5
Handling Errors in SSIS
I basically wrote a simple script for for SSIS Error Handling for one of packages developed. This is pretty easy one. Getting Started on this
Table to log errors
Step 1
CREATE TABLE ErrorLog (
ErrorDesc varchar(MAX),
Taskname varchar(100))
Step 2
Stored Proc to Log Errors
CREATE PROCEDURE dbo.LogError (@ErrorDesc varchar(2000), @Taskname varchar(100))
AS
BEGIN
INSERT INTO [dbo].[ErrorLog] ([ErrorDesc], [Taskname])
VALUES (@ErrorDesc, @Taskname)
END
Step 3
Create SQL Task on Error in the package
Step 4
Parameter mapping
This is simple example.
Further improvements
- Error Handling in proc
- Include Errorlogid and other parameters in errorlog table
References - MSSQLTips
SQL Server Integration Services (SSIS) - Design Best Practices
Top 10 SQL Server Integration Services Best Practices
Handling Slowly Changing Dimensions in SSIS
About the BI Monkey SSIS ETL Framework
ETL Framework for SSIS 2008 - Part 1
ETL Framework for SSIS 2008 - Part 2
Stonemeadow Solutions ETL Framework
SSIS Community Tasks and Components
SQL Bits Session - Change Data Capture for real-time BI
Change Data Capture Best Practices
Happy Reading!!
No comments:
Post a Comment