Reading as fast as possible from a table with SSIS (Part II)

Recently I blogged on how to be as selective as possible while reading from and OLE DB Data Source (Part I) and how to load the data from a single flat file as fast as possible into a SQL Server table. But you probably noticed that “Out of the box” SSIS reads data faster from a flat file than from a SQL table. In this article I will share my latest trick on how to speed up the reading from a single table, typically at least twice as fast!

Time to beat, the “Native” OLE DB source speed

The first step, like always, is to check the default throughput and duration by counting rows from the Native OLE DB Data Source.
In our case it takes 14 minutes and 6 seconds to read all approx. 180 million rows or 12.89 GB from a SQL Server table with 16 columns of data. For comparison; reading the same data from a single flat file takes only 7 minutes 57 seconds.

So where is this difference coming from? Primarily because of the IO Packet size that’s being used I would say. SSIS reads with 128 KB blocks from a flat file and uses by default 4KB (the default network packet size) which is changeable into max. 32 KB to request data from SQL Server (32KB is also the maximum for the new SQL2008 R2 version). To retrieve the data faster we have to find a way to handle more IO requests in parallel!

SQL Source Table properties 

To set the baseline just build a package and count the rows from the Native OLEDB Source. (Don’t forget to change the Packet Size from 0 into 32767 in the Connection Manager).

image

The Windows Performance Monitor counters to check are the CPU load of both the SSIS process (DtsDebughost when executing the package from within BIDS or DTEXEC when started from the command line) and the SQL Server process. Also check the amount of bytes we are reading: select the IO Read bytes/sec counter from the process. 

Throughput and CPU load reading from the data source 

Adding parallelism by using the Modulo algorithm

The optimization trick I would like to point out to you is based on using the Modulo arithmetic operator that can be used with the numeric data type. It returns the remainder of one number divided by another. For example, if you want to read with 3 parallel streams use can use modulo 3 ( % 3 ). You can pick up the 3 output streams by specifying as the remainder values = 0,1 and 2.  (Using value’s less than 0 or larger than 2 will return 0 rows when modulo 3 is used.)

You can check the output also by running the query in SSMS or with the BIDS Preview option from the OLEDB Source Editor.

select * from dbo.LINEITEM_Hash96Key_SSD
WHERE (L_ORDERKEY% 3) = 2

Query output

When you build this query into an SSIS package, you will notice that initially it does creates parallelism; the rows are quite nice divided and read from the two data sources, But unfortunately the total package execution time doesn’t come down, which is a bit strange.  (After all,  adding more sources is an optimization trick we discovered already long time ago, back in 2004 while testing an early Beta 2 release of SSIS).

Add the query to the OLE DB Source, specifying the correct stream with "=xx" which should always be less than the modulo factor. 

Specify in your package the multiple data sources and merge the result with the Union All component:

Merge the streams with the Union All component

The Activity monitor shows us quickly that indeed multiple SPIDS are fired by the SSIS package but many are Suspended and actually causing unnecessary synchronization:

SQL Activity Monitor shows CXPACKET waits

Luckily this can be resolved quickly by specifying the query hint “OPTION (MAXDOP 1)”. This removes the synchronization overhead without sacrificing on throughput in this case.

Specifiying the MAXDOP 1 hint

 

Reading multiple times from the same SQL table

Time to put it to the test and increase the number of Data source components in the package. Each OLE DB data source should point to the same input table,  just modify the modulo factor and the output separator when adding more. The best result I typically achieve while reading from the same data source 3 or 4 times in parallel:

 Package with 3 data source readers

When we check the perfmon counters one more time, you’ll see the IO throughput actually went up from initially 21. MB/sec to 46.6 MB/sec on average, so it has more than doubled!

The CPU consumption of both the SSIS and SQLServer process have also increased ;  SSIS from 100% to 350% (3.5 CPU) and SQLServer is using also an extra CPU.

Perfmon data while executing the package with 3 data readers 

Results

The overall package execution duration to read the same 180 million rows (12.89 GB)  decreased from initially 14 minutes 6 seconds to only 6 minutes and 33 seconds when reading the data with 3 parallel streams!

The test results 

 

Wrap Up

When you have to speed up reading large amounts of data from a single table in your SSIS package, you can decrease the duration by more than half by adding some form of parallelism into your package by using this  Modulo trick. Typically reading  data from the same SQL Server table with 3 or 4 parallel streams provides the best result! 

SQL Server 2008R2 launch events

the upcoming weeks there are many events scheduled around the SQL2008R2 launch;

I will be presenting at the following events: Miracle OpenWorld 2010, SQLbits VI and SQLPass European conference!

Hope to see you there !

clip_image002

clip_image004

clip_image006