Henk's tech blog

Recently I blogged on how to be as selective as possible while reading from and OLE DB Data Source (Part I) and how to load the data from a single flat file as fast as possible into a SQL Server table. But you probably noticed that “Out of the box” SSIS reads data faster from a flat file than from a SQL table. In this article I will share my latest trick on how to speed up the reading from a single table, typically at least twice as fast!

Time to beat, the “Native” OLE DB source speed

The first step, like always, is to check the default throughput and duration by counting rows from the Native OLE DB Data Source.
In our case it takes 14 minutes and 6 seconds to read all approx. 180 million rows or 12.89 GB from a SQL Server table with 16 columns of data. For comparison; reading the same data from a single flat file takes only 7 minutes 57 seconds.

So where is this difference coming from? Primarily because of the IO Packet size that’s being used I would say. SSIS reads with 128 KB blocks from a flat file and uses by default 4KB (the default network packet size) which is changeable into max. 32 KB to request data from SQL Server (32KB is also the maximum for the new SQL2008 R2 version). To retrieve the data faster we have to find a way to handle more IO requests in parallel!

To set the baseline just build a package and count the rows from the Native OLEDB Source. (Don’t forget to change the Packet Size from 0 into 32767 in the Connection Manager).

The Windows Performance Monitor counters to check are the CPU load of both the SSIS process (DtsDebughost when executing the package from within BIDS or DTEXEC when started from the command line) and the SQL Server process. Also check the amount of bytes we are reading: select the IO Read bytes/sec counter from the process.

Adding parallelism by using the Modulo algorithm

The optimization trick I would like to point out to you is based on using the Modulo arithmetic operator that can be used with the numeric data type. It returns the remainder of one number divided by another. For example, if you want to read with 3 parallel streams use can use modulo 3 ( % 3 ). You can pick up the 3 output streams by specifying as the remainder values = 0,1 and 2. (Using value’s less than 0 or larger than 2 will return 0 rows when modulo 3 is used.)

You can check the output also by running the query in SSMS or with the BIDS Preview option from the OLEDB Source Editor.

select * from dbo.LINEITEM_Hash96Key_SSD
 WHERE (L_ORDERKEY% 3) = 2

When you build this query into an SSIS package, you will notice that initially it does creates parallelism; the rows are quite nice divided and read from the two data sources, But unfortunately the total package execution time doesn’t come down, which is a bit strange. (After all, adding more sources is an optimization trick we discovered already long time ago, back in 2004 while testing an early Beta 2 release of SSIS).

Specify in your package the multiple data sources and merge the result with the Union All component:

The Activity monitor shows us quickly that indeed multiple SPIDS are fired by the SSIS package but many are Suspended and actually causing unnecessary synchronization:

Luckily this can be resolved quickly by specifying the query hint “OPTION (MAXDOP 1)”. This removes the synchronization overhead without sacrificing on throughput in this case.

Reading multiple times from the same SQL table

Time to put it to the test and increase the number of Data source components in the package. Each OLE DB data source should point to the same input table, just modify the modulo factor and the output separator when adding more. The best result I typically achieve while reading from the same data source 3 or 4 times in parallel:

When we check the perfmon counters one more time, you’ll see the IO throughput actually went up from initially 21. MB/sec to 46.6 MB/sec on average, so it has more than doubled!

The CPU consumption of both the SSIS and SQLServer process have also increased ; SSIS from 100% to 350% (3.5 CPU) and SQLServer is using also an extra CPU.

Results

The overall package execution duration to read the same 180 million rows (12.89 GB) decreased from initially 14 minutes 6 seconds to only 6 minutes and 33 seconds when reading the data with 3 parallel streams!

Wrap Up

When you have to speed up reading large amounts of data from a single table in your SSIS package, you can decrease the duration by more than half by adding some form of parallelism into your package by using this Modulo trick. Typically reading data from the same SQL Server table with 3 or 4 parallel streams provides the best result!

M	T	W	T	F	S	S
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30	31

Henk's tech blog
a weblog about IT optimization

Reading as fast as possible from a table with SSIS (Part II)

Time to beat, the “Native” OLE DB source speed

Adding parallelism by using the Modulo algorithm

Reading multiple times from the same SQL table

Results

Wrap Up

SQL Server 2008R2 launch events

Pages

Categories

Archives

Meta

Henk's tech blog a weblog about IT optimization

Reading as fast as possible from a table with SSIS (Part II)

Time to beat, the “Native” OLE DB source speed

Adding parallelism by using the Modulo algorithm

Reading multiple times from the same SQL table

Results

Wrap Up

SQL Server 2008R2 launch events

Pages

Categories

Archives

Meta

Henk's tech blog
a weblog about IT optimization