Archive

Posts Tagged ‘SQL Performance’

SQL Performance – varchar(max) Vs varchar(50)

August 5, 2011 Leave a comment

Almost couple of years back, I got a challenge from my project manager to improve the performance of a SSIS package.
The SSIS package is just loading data into SQL database from text file. At first instance of checking package, I am almost surprised for poor performance with no idea on how to improve the performance.

Scenario:

Step 1:Target table definition as

Target Table Definition - varchar(max)

Target Table Definition - varchar(max)

Step 2: SSIS package as

Data Flow design

Data Flow design

Step 3: Flat File source options as

Flat File Source Options

Flat File Source Options

 

Flat File Source Options 2

Flat File Source Options 2

Step 4: Executing the package and Progress results as

Data Load

Data Load

 

Data Load Progress

Data Load Progress

Solution

Updated process:

Step 1: Changed the target table structure to have columns lengths to 50 from max as

Updated - Target Table Definition - varchar(50)

Updated - Target Table Definition - varchar(50)

Step 2: SSIS package updated with a new Data Flow for testing here DF Address_varchar50 with a new flat file source connection manager as

Updated Package

Updated Package

Step 3: Updated Flat File Source properties as the length of all columns updated to 50

Updated - Flat File Source Options

Updated - Flat File Source Options

 

Updated - Source Options

Updated - Source Options

Step 4: Executing the DF Address_varchar50 and Progress results as

Updated - Data Flow

Updated - Data Flow

 

Updated - Data Load Progress

Updated - Data Load Progress


 
Performance Stats:

1. In the first instance with data length varchar(MAX), executing time 7.644 sec
2. Upon updating the data length to varchar(50), execution time 5.928 sec, almost 20% improvement on my laptop (normal configuration) but I got almost 50% higher performance on production machines.
3. Here I consider to change the length to 50 after conforming the maximum length 50 using data profiling the source data
4. Further test cases as
 
changing the length to varchar(1000) leads poor performance
changing the length to varchar(8000) leads even futher poor performance
 
5. The data length and load is indirectly related to performance.

Advertisements
Categories: SSIS Tags: