Introducing Microsoft SQL Server 2016 R Services

With the release of CTP3 SQL Server 2016 and its native In-database support for the open source R language you can now call both R, RevoScaleR functions and scripts directly from within a SQL query and benefit from multi-threaded and multi-core in-DB computations. Yet another SQLServer milestone! The R integration brings a lot of benefits and new scenarios to your end users; it brings the utility of data science to your applications without the need to ‘export’ the data to your R environment!

Installing & Enabling SQLR integration

To get started you will need to install and configure SQL Server 2016 CTP3, Revolution R Open (RRO) and Revolution E Enterprise (RRE) on the server.

 

SQLServer 2016 CTP3 Setup   RRE

The versions I installed:
– SQL Server 2016 (CTP3.0) – 13.0.700.139 (X64)
– Revolution R Open, The Enhanced Open Source R distribution (R version 3.2.2 (2015-08-14) )
– Revolution R Enterprise for SQL Server 2016 Community Technology Preview (CTP) 3 – (7.5.0)

(To download the bits: https://msdn.microsoft.com/en-us/library/mt604847.aspx)

The R integration uses a new service called ‘SQL Server Launchpad’  to ‘launch’ Advanced Analytics Extensions processes; it enables the integration with Microsoft R Open using standard T-SQL statements. (Disabling this service will make Advanced Analytics features of SQL Server unavailable).

New SQL Server Launchpad service

To enable the R integration, you have to run the sp_configure ‘external scripts enabled’ command and grant permissions to users to execute R scripts via the new db_rrerole role:

sp_configure 'external scripts enabled', 1
reconfigure
GO

alter role db_rrerole add member HenkR;

 

Connecting to SQL Server from your favorite R environment

Traditionally you would connect from your favorite R IDE to SQL Server to retrieve data for analytics; or you can use the R GUI environment that is part of the installers bits. The default installation path to RGUI is: “C:\Program Files\RRO\RRO-3.2.2-for-RRE-7.5.0\R-3.2.2\bin\x64\Rgui.exe”. Also popular is the R Studio Desktop that is separate downloadable from: www.Rstudio.com. (the version I installed to get started is V0.99.486).

 

1) Connecting R to SQLServer

To connect your RStudio environment to SQL Server the traditional way to read data (and import it into memory or a local *.xdf file) would like something like this:  (it requires  to setup a proper connection string;

use the ‘hostname\\instancename’  to connect to a SQLServer named instance and pull some data from a table:)

sqlServerConnString <- “Driver=SQL Server;Server=SQL2016EE\\SQL2016CTP3;
Database=Oldskool_DB;Uid=sa;Pwd=MyPassword”

 

sqlServerDataDS <- RxSqlServerData(sqlQuery =

SELECT * FROM LINEITEM_Small”,connectionString = sqlServerConnString )

 

2)  Offloading  R scripts to SQLServer

 

Pulling a large dataset from a database and processing it locally would have required the writing into a local file which is a single threaded process. This can take a long time… So luckily with the CTP3 we can now bring the R script to the Data and process it there! To execute and R script directly from an SQL query use the new ‘sp_execute_external_script’ to, for example, calculate the mean of L_Quantity via R:

EXECUTE  sp_execute_external_script
                @language = N'R'
              , @script = N'OutputDataSet <- data.frame(mean(InputDataSet[,1]))'
              , @input_data_1 = N'SELECT  L_QUANTITY from LINEITEM_small as L_Quantity '
              WITH RESULT SETS ((col int not null));
go

SQL Server will execute this external script via its new launchpad service in the BxLserver.exe process (The Revolution Analytics Communication Component for SQL Server).

R Memory allocation

What would an article about ‘R’ be without a note about memory usage!
By default the SQLR integration service in the CTP3 build will use up to 20% of the total physical memory available on the host for all its In-DB running R sessions.  You can monitor the Memory usage of the BxLserver.exe process via the Taskmanager Working Set memory column.

Monitor the BxlServer process memory usage via Windows Task manager

If you cross the line your query will fail with a message similar to the one below:

Msg 39004, Level 16, State 20, Line 8

A ‘R’ script error occurred during execution of ‘sp_execute_external_script’ with HRESULT 0x80004004.

Msg 39019, Level 16, State 1, Line 8

An external script error occurred:

Error in sqlSatelliteCall() :

Calls: sqlSatelliteCall -> .Call

Execution halted

Invalid BXL stream

STDOUT message(s) from external script:

Failed to allocate or reallocate memory.

 

 

The resolution for running out of memory like this would be to increase the max. memory allocation which is a setting in the Rlauncher.config file (which is located in C:\Program Files\Microsoft SQL Server\MSSQL13… \MSSQL\Binn directory). Increase the default setting to a value that is still safe so it will not interfere with the sqlserver memory allocation.

The last line in the file ‘MEMORY_LIMIT_PERCENT=20’  controls the maximum percentage of physical memory which gets allocated to all the R sessions. Increase the value to allow the R service to consume more memory:

Rlauncher Config file

 

Wrap-Up

Embracing and operationalizing R functionality via the new SQLServer R Services integration has become very simple! It brings a lot of new possibilities to apply advanced analytics to your SQL data, ranging from data exploration to Predictive Modeling. Getting started with R might be a bit challenging but it will enrich your reports for sure!

Henk posted at 2015-10-28 Category: R, SQL2016 | Tags: , ,

6 Responses Leave a comment

  1. #1K Boogie @ 2015-10-31 01:48

    Do you need to have RevoScaleR to use sp_execute_external_script with language = N’R’?

  2. Henk @ 2015-10-31 09:29

    Hi Lawrence,
    yes you will have have to install all the bits as described in the link. http://henkvandervalk.com/wp-admin/edit-comments.php#comments-form
    Also if you would like to run R from a separate system you will need to install:
    – “RRO-3.2.2-for-RRE-7.5.0-Windows.exe” (Note you are installing a special flavor of RRO that works with a specific version of RRE).
    and:
    – Install Revolution R Enterprise (RRE) on the client (“Revolution-R-Enterprise-SQL-7.5.0-Windows”)
    Enjoy!

  3. #2Lonny Niederstadt @ 2015-11-3 17:44

    Excellent! Thanks, Henk! Extremely valuable for those of us on the admin side setting up SQL Server 2016 for developers to explore R integration.

    Any tips for additional references for SQL Server 2016 R integration? Didn’t see it in the 2016 BOL, though maybe I’m not looking in the right spot :-)

    -Lonny

  4. #3cozmoash @ 2016-4-19 11:14

    Hi

    Thanks for the article.

    In SQL 2016 RC2 it looks like MEMORY_LIMIT_PERCENT is not present in the config. Any idea how to set this?

    Cheers

  5. #4cozmoash @ 2016-4-19 11:25

    managed to find it. Looks there is an article here https://msdn.microsoft.com/en-us/library/bb895148.aspx

  6. Henk @ 2016-4-19 13:17

    Hi, stay tuned! I’m working on some new articles;

    To create an EXTERNAL resource pool:
    CREATE EXTERNAL RESOURCE POOL [EP_1] WITH
    (
    AFFINITY CPU = AUTO,
    MAX_MEMORY_PERCENT = 50
    )

    to monitor Parallelism:
    select * from sys.dm_external_script_requests

    (and also
    select * from sys.dm_external_script_execution_stats
    where counter_value > 0)

Leave a Reply

You must be logged in to post a comment.