Introducing Microsoft SQL Server 2016 R Services
With the release of CTP3 SQL Server 2016 and its native In-database support for the open source R language you can now call both R, RevoScaleR functions and scripts directly from within a SQL query and benefit from multi-threaded and multi-core in-DB computations. Yet another SQLServer milestone! The R integration brings a lot of benefits and new scenarios to your end users; it brings the utility of data science to your applications without the need to ‘export’ the data to your R environment!
Installing & Enabling SQLR integration
To get started you will need to install and configure SQL Server 2016 CTP3, Revolution R Open (RRO) and Revolution E Enterprise (RRE) on the server.
The versions I installed:
– SQL Server 2016 (CTP3.0) – 13.0.700.139 (X64)
– Revolution R Open, The Enhanced Open Source R distribution (R version 3.2.2 (2015-08-14) )
– Revolution R Enterprise for SQL Server 2016 Community Technology Preview (CTP) 3 – (7.5.0)
(To download the bits: https://msdn.microsoft.com/en-us/library/mt604847.aspx)
The R integration uses a new service called ‘SQL Server Launchpad’ to ‘launch’ Advanced Analytics Extensions processes; it enables the integration with Microsoft R Open using standard T-SQL statements. (Disabling this service will make Advanced Analytics features of SQL Server unavailable).
To enable the R integration, you have to run the sp_configure ‘external scripts enabled’ command and grant permissions to users to execute R scripts via the new db_rrerole role:
sp_configure 'external scripts enabled', 1
reconfigure
GO
alter role db_rrerole add member HenkR;
Connecting to SQL Server from your favorite R environment
Traditionally you would connect from your favorite R IDE to SQL Server to retrieve data for analytics; or you can use the R GUI environment that is part of the installers bits. The default installation path to RGUI is: “C:\Program Files\RRO\RRO-3.2.2-for-RRE-7.5.0\R-3.2.2\bin\x64\Rgui.exe”. Also popular is the R Studio Desktop that is separate downloadable from: www.Rstudio.com. (the version I installed to get started is V0.99.486).
1) Connecting R to SQLServer
To connect your RStudio environment to SQL Server the traditional way to read data (and import it into memory or a local *.xdf file) would like something like this: (it requires to setup a proper connection string;
use the ‘hostname\\instancename’ to connect to a SQLServer named instance and pull some data from a table:)
sqlServerConnString <- “Driver=SQL Server;Server=SQL2016EE\\SQL2016CTP3;
Database=Oldskool_DB;Uid=sa;Pwd=MyPassword”
sqlServerDataDS <- RxSqlServerData(sqlQuery =
“SELECT * FROM LINEITEM_Small”,connectionString = sqlServerConnString )
2) Offloading R scripts to SQLServer
Pulling a large dataset from a database and processing it locally would have required the writing into a local file which is a single threaded process. This can take a long time… So luckily with the CTP3 we can now bring the R script to the Data and process it there! To execute and R script directly from an SQL query use the new ‘sp_execute_external_script’ to, for example, calculate the mean of L_Quantity via R:
EXECUTE sp_execute_external_script @language = N'R' , @script = N'OutputDataSet <- data.frame(mean(InputDataSet[,1]))' , @input_data_1 = N'SELECT L_QUANTITY from LINEITEM_small as L_Quantity ' WITH RESULT SETS ((col int not null)); go
SQL Server will execute this external script via its new launchpad service in the BxLserver.exe process (The Revolution Analytics Communication Component for SQL Server).
R Memory allocation
What would an article about ‘R’ be without a note about memory usage!
By default the SQLR integration service in the CTP3 build will use up to 20% of the total physical memory available on the host for all its In-DB running R sessions. You can monitor the Memory usage of the BxLserver.exe process via the Taskmanager Working Set memory column.
If you cross the line your query will fail with a message similar to the one below:
Msg 39004, Level 16, State 20, Line 8
A ‘R’ script error occurred during execution of ‘sp_execute_external_script’ with HRESULT 0x80004004.
Msg 39019, Level 16, State 1, Line 8
An external script error occurred:
Error in sqlSatelliteCall() :
Calls: sqlSatelliteCall -> .Call
Execution halted
Invalid BXL stream
STDOUT message(s) from external script:
Failed to allocate or reallocate memory.
The resolution for running out of memory like this would be to increase the max. memory allocation which is a setting in the Rlauncher.config file (which is located in C:\Program Files\Microsoft SQL Server\MSSQL13… \MSSQL\Binn directory). Increase the default setting to a value that is still safe so it will not interfere with the sqlserver memory allocation.
The last line in the file ‘MEMORY_LIMIT_PERCENT=20’ controls the maximum percentage of physical memory which gets allocated to all the R sessions. Increase the value to allow the R service to consume more memory:
Wrap-Up
Embracing and operationalizing R functionality via the new SQLServer R Services integration has become very simple! It brings a lot of new possibilities to apply advanced analytics to your SQL data, ranging from data exploration to Predictive Modeling. Getting started with R might be a bit challenging but it will enrich your reports for sure!
6 Responses Leave a comment
Do you need to have RevoScaleR to use sp_execute_external_script with language = N’R’?
Hi Lawrence,
yes you will have have to install all the bits as described in the link. http://henkvandervalk.com/wp-admin/edit-comments.php#comments-form
Also if you would like to run R from a separate system you will need to install:
– “RRO-3.2.2-for-RRE-7.5.0-Windows.exe” (Note you are installing a special flavor of RRO that works with a specific version of RRE).
and:
– Install Revolution R Enterprise (RRE) on the client (“Revolution-R-Enterprise-SQL-7.5.0-Windows”)
Enjoy!
Excellent! Thanks, Henk! Extremely valuable for those of us on the admin side setting up SQL Server 2016 for developers to explore R integration.
Any tips for additional references for SQL Server 2016 R integration? Didn’t see it in the 2016 BOL, though maybe I’m not looking in the right spot :-)
-Lonny
Hi
Thanks for the article.
In SQL 2016 RC2 it looks like MEMORY_LIMIT_PERCENT is not present in the config. Any idea how to set this?
Cheers
managed to find it. Looks there is an article here https://msdn.microsoft.com/en-us/library/bb895148.aspx
Hi, stay tuned! I’m working on some new articles;
To create an EXTERNAL resource pool:
CREATE EXTERNAL RESOURCE POOL [EP_1] WITH
(
AFFINITY CPU = AUTO,
MAX_MEMORY_PERCENT = 50
)
to monitor Parallelism:
select * from sys.dm_external_script_requests
(and also
select * from sys.dm_external_script_execution_stats
where counter_value > 0)