SSIS – Balanced Data Distributor Transformation
Balanced Data Distributor – Transformation
Installation:
Source: http://www.microsoft.com/download/en/details.aspx?id=4123
1. Downloaded 32 bit – BalanedDataDistributor-x86.msi
2. Installation process will start and will register the .dll of the component
After Installation
- Open Visual Studio
- Create SSIS project
- Toolbox –> Right Click –> Choose Items
- SSIS Data Flow Items –> Balanced Data Distributor
Balanced Data Distributor available in Toolbox under Data Flow Transformations
Test cases:
- Normal Load Test1 – Load data from source to target without using BDD.
- BDD Load Test2 – Load
data from source to target with BDD and two destination components to same target table. - BDD Load Test3 – Load data from source to target with BDD and three destination components to same target table.
Databases:
Source DB : AdventureWorks, Table : Sales.Customer
Destination DB : SANDBOX for testing created this DB, target table schema same as Sales.Customer
SSIS – Package details:
- Created BalancedDataDistributor project
- BalancedDataDistributor.dtsx package
In all my test cases, my target is to test how the distribution of data through BDD and time it has taken to transfer data from source to destination table.
Planning:
- Considering only screen shots while Data Flow components executing
- Data reconciliation and time calculations from the SSIS data flow.
Results:
Test Case 1:
- Executing the “DF Normal Load Test1” component.
- In the data flow of the component, to get data count – 19,185
- To get the Elapsed time of the process, time to transfer the data from source to target table – 2.558 sec
Test Case 2:
- Executing the “DF BDD Load Test2” component.
- In the data flow of the component, to get data count, Balanced Data Distributor Output 1- 9,830 and Balanced Data Distributor Output 2- 9,355
- To get the Elapsed time of the process, time to transfer the data from source to target table – 0.718 sec
Test Case 3:
- Executing the “DF BDD Load Test3” component.
- In the data flow of the component, to get data count, Balanced Data Distributor Output 1- 9,830, Balanced Data Distributor Output 2- 9,355 and Balanced Data Distributor Output 3 – 0
- To get the Elapsed time of the
process, time to transfer the data from source to target table – 0.624 sec
Stats:
- Data distribution: Data distribution is not even on both data flow paths, we could expect almost equal distribution of data.
- Performance of approx. more than 60% when using BDD
- When using three BDD output to destination, we couldn’t see any data transfer through 3rd path, the data flow doesn’t use the 3rd resource hence whenever the data flow required a parallel processing only the flow will be used otherwise it may not be used
Points to consider:
- BDD works well with parallelism hardware so we could use these results for understanding only.
- The performance of data flow may also varies with hardware configurations, this test developed on personal
laptop with 4GB RAM, Dual Core i7 processor (normal laptop) - Implementation of BDD is simple as user doesn’t require setup any options except the paths to target components.
For any additional info, please refer to reference list –
Reference: http://blogs.msdn.com/b/sqlperf/archive/2011/05/25/the-balanced-data-distributor-for-ssis.aspx