IT-Efficiency: Things to check during your coffee break – BizTalk MBT

Co-author: Jeroen Kooiman (Unisys)

Most people prefer to treat BizTalk as a black box and have no clue about (or no desire to know) what is going on inside its SQL databases. Yet the overall performance of BizTalk depends heavily on the performance of its databases, managed by SQL Server. Because BizTalk operations are asynchronous by nature, SQL Server and the underlying infrastructure largely determine the overall throughput of BizTalk. During BizTalk Quick Scan and performance optimization projects at customer production sites we noticed that a lot of load is typically thrown at SQL Server simply because full message body tracking (MBT) is enabled, storing lots of unnecessary debug and trace data that nobody will ever look at. Collecting MBT data in a production environment is not a recommended setting (except, of course, when you are troubleshooting…). So in many cases a quick fix to significantly reduce the overall BizTalk workload and speed up the environment is to check whether message body tracking is enabled and, where possible, turn it off.

We have seen BizTalk environments that came to a complete standstill because of MBT, and also sites where 60% of all SQL transactional load was caused by MBT.

How to quickly check whether there is unnecessary Message Body Tracking activity in your environment

The MBT data is stored in the dta_DebugTrace table of the BizTalk tracking (DTA) database. With SSMS, simply check during your coffee break whether the size of this table is growing rapidly, for example with sp_spaceused “dta_DebugTrace”.
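
A minimal check from SSMS could look like the sketch below (assuming the default tracking database name, BizTalkDTADb); run it a couple of times during your coffee break and compare the results:

-- Check how much space dta_DebugTrace is using and whether it keeps growing
-- (default tracking database name assumed)
USE BizTalkDTADb
EXEC sp_spaceused 'dta_DebugTrace'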

Space usage of dta_DebugTrace table from the DTA Database

But some interfaces may run only once a month, so we should look for a method that is more accurate.

Root cause analysis: which artifacts have tracking enabled?

There are two ways to check where tracking is enabled:

1) The hard way: go over each of the artifacts one by one and check manually whether the “Track Message Bodies” options are enabled, but that’s a waste of your precious coffee break ;-)

2) The easy way: let a SQL query do the work for you! If you have many artifacts (hundreds) to look after, you may want to check them all periodically with a query; we also recommend including this check as a formal sign-off item on your checklist for new deployments.

There are multiple locations in the BizTalk Server 2006 Administration Console (or, in BizTalk 2004, through HAT) where message body tracking can be enabled and disabled:

– On the Pipelines

– On the Send/Receive Ports and also the Orchestrations

Receive Port Tracking Options / Orchestration Tracking Options

Execute the following query on the BizTalkMgmtDB to check which artifacts have tracking enabled:

select     
bts_application.nvcName as ApplicationName,
'Pipeline Send' as [Artifact],
bts_pipeline.[name] as [Artifact Name],
bts_sendport.[nvcname] as [Artifact Owner Name],
bts_sendport_transport.nvcAddress as [Artifact Address],
StaticTrackingInfo.ismsgBodyTrackingEnabled as [TrackingBits]
from bts_pipeline
left outer join bts_sendport on bts_sendport.nSendPipelineID = bts_pipeline.[ID]
left outer join bts_sendport_transport on bts_sendport_transport.nSendPortID = bts_sendport.[nID]
left outer join adm_SendHandler on adm_SendHandler.[Id] = bts_sendport_transport.nSendHandlerId
left outer join adm_Adapter on adm_Adapter.[Id] = adm_SendHandler.AdapterId
left outer join bts_application on bts_sendport.nApplicationID = bts_application.nID
left outer join StaticTrackingInfo on StaticTrackingInfo.strServiceName = bts_pipeline.[name]
where StaticTrackingInfo.ismsgBodyTrackingEnabled is not null AND StaticTrackingInfo.ismsgBodyTrackingEnabled <> 0

union

select     
bts_application.nvcName as ApplicationName,
'Pipeline Receive' as [Artifact],
bts_pipeline.[name] as [Artifact Name],
bts_receiveport.[nvcname] as [Artifact Owner Name],
adm_receivelocation.InboundTransportURL  as [Artifact Address],
StaticTrackingInfo.ismsgBodyTrackingEnabled as [TrackingBits]
from bts_pipeline
left outer join adm_receiveLocation on adm_receiveLocation.ReceivePipelineID = bts_pipeline.[ID]
left outer join bts_receiveport on adm_receivelocation.receiveportid = bts_receiveport.nID
left outer join adm_Adapter on adm_Adapter.Id = adm_receivelocation.AdapterID
left outer join bts_application on bts_receiveport.nApplicationID = bts_application.nID
left outer join StaticTrackingInfo on StaticTrackingInfo.strServiceName = bts_pipeline.[name]
where StaticTrackingInfo.ismsgBodyTrackingEnabled is not null AND StaticTrackingInfo.ismsgBodyTrackingEnabled <> 0

union

select     
bts_application.nvcName as ApplicationName,
'Orchestration' as [Artifact],
bts_orchestration.nvcName as [Artifact Name],
null as [Artifact Owner Name],
bts_orchestration.nvcFullName as [Artifact Address],
StaticTrackingInfo.ismsgBodyTrackingEnabled as [TrackingBits]
from bts_orchestration
left outer join bts_assembly on bts_assembly.nID = bts_orchestration.nAssemblyID
left outer join bts_application on bts_assembly.nApplicationID = bts_application.nID
left outer join StaticTrackingInfo on StaticTrackingInfo.strServiceName = bts_orchestration.nvcFullName
where StaticTrackingInfo.ismsgBodyTrackingEnabled is not null and StaticTrackingInfo.ismsgBodyTrackingEnabled <> 0

union

select     
bts_application.nvcName as ApplicationName,
'Send Port' as [Artifact],
bts_sendport.[nvcname] as [Artifact Name],
null as [Artifact Owner Name],
bts_sendport_transport.nvcAddress as [Artifact Address],
nTracking as [TrackingBits]
from  bts_sendport
left outer join bts_sendport_transport on bts_sendport_transport.nSendPortID = bts_sendport.[nID]
left outer join adm_SendHandler on adm_SendHandler.[Id] = bts_sendport_transport.nSendHandlerId
left outer join adm_Adapter on adm_Adapter.[Id] = adm_SendHandler.AdapterId
left outer join bts_application on bts_sendport.nApplicationID = bts_application.nID
where nTracking is not null AND nTracking <> 0

union

select     
bts_application.nvcName as ApplicationName,
'Receive Port' as [Artifact],
bts_receiveport.[nvcname] as [Artifact Name],
null as [Artifact Owner Name],
adm_receiveLocation.InboundTransportURL as [Artifact Address],
nTracking as [TrackingBits]
from bts_receiveport
left outer join adm_receiveLocation on adm_receivelocation.receiveportid = bts_receiveport.nID
left outer join adm_Adapter on adm_Adapter.Id = adm_receivelocation.AdapterID
left outer join bts_application on bts_receiveport.nApplicationID = bts_application.nID
where nTracking is not null AND nTracking <> 0

order by 1, 2, 3

Sample query output

Sample query output with the artifacts that have MBT enabled

Background information

Message body tracking options for orchestrations and pipelines are stored in the BizTalkMgmtDb database, more precisely in the ismsgBodyTrackingEnabled column of the StaticTrackingInfo table. The value of this integer column indicates which kinds of tracking are enabled: the first two bits indicate whether tracking is enabled before (bit 0) or after (bit 1) processing.
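
To see how those bits are set for your own artifacts, the read-only sketch below decodes them with a bitwise AND against the same StaticTrackingInfo table (default management database name assumed):

-- Decode the before/after bits of ismsgBodyTrackingEnabled (read-only)
USE BizTalkMgmtDb
select strServiceName,
ismsgBodyTrackingEnabled as [TrackingBits],
case when ismsgBodyTrackingEnabled & 1 = 1 then 'Yes' else 'No' end as [Tracked Before Processing], -- bit 0
case when ismsgBodyTrackingEnabled & 2 = 2 then 'Yes' else 'No' end as [Tracked After Processing]   -- bit 1
from StaticTrackingInfo
where ismsgBodyTrackingEnabled <> 0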

Message body tracking for receive ports is stored in the bts_receiveport table (the nTracking column used in the query above).

Message body tracking for send ports is stored in the bts_sendport table (again in the nTracking column).

Disabling MBT

Check with the owner of each artifact that has MBT enabled whether it was enabled on purpose because of an ongoing investigation in production… or, more likely, because they forgot to disable it ;-). You have to disable it manually through the BizTalk Administration Console or HAT. (Note: do not modify the query above to disable “things” for you; updating the BizTalk SQL databases yourself may have serious impact on product support.) Also purge the Tracking database to remove the unnecessary tracking data (see: http://technet.microsoft.com/en-us/library/aa578470(BTS.20).aspx).
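
Purging is normally handled by the DTA Purge and Archive SQL Server Agent job, but as a rough, illustrative sketch (the retention values are examples only and the default tracking database name is assumed), a manual purge boils down to calling the dtasp_PurgeTrackingDatabase stored procedure:

-- Example only: keep 1 hour of live data, 0 live days, hard-delete anything older than 7 days
-- (parameters: live hours, live days, hard delete days, time of last backup)
USE BizTalkDTADb
declare @dtLastBackup datetime
set @dtLastBackup = GetUTCDate()
exec dtasp_PurgeTrackingDatabase 1, 0, 7, @dtLastBackup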

Wrap-up

Increase the IT-efficiency of your BizTalk environment(s) by checking the Message Body Tracking settings. The query above will quickly give you a list of all the artifacts that have tracking enabled, before you have even finished your first cup of coffee. Disabling unnecessary tracking reduces the overall load on the BizTalk SQL Server tracking database, and therefore also significantly reduces the network, disk and CPU load.

Speeding Up SSIS Bulk Inserts into SQL Server

Since we set the ETL World Record with the introduction of SQL Server 2008, a lot of insights, tips & tricks have been published on what it takes to load data fast into SQL Server (see also the Data Loading Performance Guide). The trick described below is one to remember if you are challenged to load a single flat file as fast as possible into SQL Server.

Challenge: Loading the data from a single large file as fast as possible into a table

The fastest way to do this is to build an SSIS package that handles the pre-processing of the data type conversions, so this doesn’t have to be done on the SQL Server side, saving precious CPU ticks and, more importantly, latency. The next step is to run this SSIS package on the SQL Server itself and use a fast, low-latency, in-memory connection to eliminate the network stack overhead.

Step 1) Check the speed of your data source

As always, start by checking the throughput of the data source using a Row Count component (see picture). Select the input columns to insert (see also Speeding up reading from a data source). Out of the box it takes 8 minutes and 6 seconds to read the full 22.7 GB of data from the flat file with 16 columns of data.

Reading all 16 columns of data

By enabling the Fast Parse option on the appropriate Output columns it takes only 5 minutes 53 seconds to read the entire file, saving 2 minutes 13 seconds, or roughly 27% of the read time.

Use the Flat File Advanced Editor to enable Fast Parse

The Perfmon counters show we read from the file at about 70 MBytes/sec in 128 KB blocks, and it takes approximately 1 CPU and “0” milliseconds per transfer to read from solid state:

Measuring flat file throughput

Step 2) Determine the native bulk Insert speed

First apply a couple of optimizations to the SSIS package to speed up the Bulk Insert:

– Use the Fast Parse option (as enabled in step 1)

– Use the SQL Server Native Client 10.x OLE DB provider for an in-memory, high-performance connection

– Set the network Packet Size to 32767 (see the connection string sketch below)

– Select the OLE DB Destination Data Access mode “Table or View – fast load” option.
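
For illustration, the provider and packet size settings end up in the OLE DB connection manager; a sketch of the resulting connection string (server and database names are placeholders) looks something like this:

Provider=SQLNCLI10.1;Data Source=(local);Initial Catalog=DestinationDB;Integrated Security=SSPI;Packet Size=32767;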

Bulk Insert Result: It takes 19 minutes 55 seconds to insert 22.7 GByte of data, or almost 180 million rows.

This is an average of 19 MByte/sec, or about 150,000 rows/sec.

Native OLE DB - SQL Server Native Client

According to the wait stats, the execution of this task seems pretty efficient:

Waitstats after a single BulkInsert 
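
To take a similar snapshot yourself, the simplified query below reads the sys.dm_os_wait_stats DMV (the excluded wait types are just a few of the usual idle/background waits):

-- Top waits since the last restart (or since the wait stats were last cleared)
select top 10 wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms
from sys.dm_os_wait_stats
where wait_type not in ('SLEEP_TASK', 'LAZYWRITER_SLEEP', 'SQLTRACE_BUFFER_FLUSH', 'BROKER_TASK_STOP')
order by wait_time_ms desc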

During the validation of the source throughput we measured an effective read speed about 3.5 times faster than the throughput we are currently getting with the Bulk Insert, so it’s good to know that the reader isn’t our primary bottleneck. Looking at some other basic Perfmon counters, such as the disk write queue length and the CPU load of both the SQL Server and the DtsDebugHost (BIDS) processes, doesn’t reveal any significant bottleneck either; both processes use less than one CPU each.

No Significant Disk Queueing

CPU utilization is less than 2 cores.

Optimizing the Bulk Insert

To increase the overall throughput we have to build some parallelism into the data flow, and SSIS allows us to do just that! Let’s find a way to spread the load and bulk insert the data from the same flat file into 4 different destination tables instead of just one, and find out if we can push the throughput up!

 

Step 3) Adding a conditional Split to the SSIS Package

Since we can read 3.5 times faster from our data source than we are writing to SQL Server, we should spread the load across at least 4 streams. In SSIS you can build this into a package with the Modulo operator (%), which returns the integer remainder after dividing the first numeric expression by the second one, the ultimate striping mechanism!

Conditional Split: use of the Modulo function
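
As a sketch, the four Conditional Split conditions could look like the lines below, assuming the data flow contains an evenly distributed integer column (called RowID here purely for illustration; any numeric key, or a counter added by a Script Component, will do):

Stream 0 : RowID % 4 == 0
Stream 1 : RowID % 4 == 1
Stream 2 : RowID % 4 == 2
Stream 3 : RowID % 4 == 3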

Use the 4 outputs to feed 4 different table destinations: 

Result: Magic! The package completed loading the same 22.7 GByte in only 39% of the time: 7 minutes 47 seconds (versus 19 min. 55 sec.)!

That’s at least 2.5 times faster!

The SQL wait stats now show some different figures as well:

Waitstats

Unfortunately, I think there isn’t much we can do about the “ASYNC_NETWORK_IO” waits; we are already using the maximum 32 KB packet size and an in-memory connection. The “PAGEIOLATCH_EX” wait (caused by an exclusive latch on a buffer page while it is transferred from “disk” into the cache) is a tricky one as well; I tried to reduce it by adding a second flat file data source that reads from the same input file, each feeding 2 of the tables, but no, I can’t get it any faster. I think we have to live with the 2.5+ times speedup for now!

 

Both the SQL Server process and BIDS use on average more than 250% processor time, or 2.5 CPUs each, to service this optimized SSIS package.

5+ cores effectively in use by SSIS and SQL Server to load a single flat file

Step 4) Consolidation

Create either a view across the 4 tables, or build a partitioned table that allows you to switch the 4 tables back into a single large table if needed.
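
A minimal sketch of the view option, assuming the four destination tables are named dbo.Stripe0 through dbo.Stripe3 (placeholder names) and share the same column layout:

-- Present the 4 striped tables as one logical table
create view dbo.vAllRows
as
select * from dbo.Stripe0
union all
select * from dbo.Stripe1
union all
select * from dbo.Stripe2
union all
select * from dbo.Stripe3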

Wrap-Up

When you have to load data from a single flat file into SQL Server as quickly as possible, the technique described above brings a significant increase in Bulk Insert throughput and a decrease in the execution time needed. Also run the SSIS package on the SQL Server itself, use a solid state disk as the staging area for the flat file, and stripe the data into multiple tables. The example above showed we can load the same data file at least 2.5 times faster!