Evolution of Microsoft R service:
Revolutionary Analytics is a statistical software company focused on multiprocessor computing and developing Big Data. It also developed a multi-core functionality version of R called Revolution R Enterprise.
In January 2015, Microsoft acquired Revolutionary Analytics and re-branded Revolution R Enterprise as Microsoft R server.
Microsoft R server offers enterprise class server to manage parallel and distributed workloads of R Processes on servers and clusters like Hadoop and Apache Spark. It enhances open source R with rich analytics to manage massive sets of data, machine learning scenarios and statistical analysis. It can be run on both Windows and Linux servers.
The conduit between Microsoft R service and SQL Server is the SQL server R service. It allows R script execution inside SQL server.
Built-in to SQL server:
Challenges with Open Source R:
- Data Movement: Initially, developers pull the data required for analysis from database and move it to the machine where there is R run time. These days, as data grows faster, data movement has become more complicated from both a time and security perspective.
- Operationalization: It is the process of taking a solution developed by data scientist and deploying it to a server or other production environment. It is performed by an SQL developer who work with the data scientist to determine the necessary requirements. This is not an easy task and sometimes requires the developer to recall the entire script in another language.
- Scale and performance: R is single threaded and it can only fit the data available in its memory. R is an open source language and doesn’t scale well to perform big data analytics. It would take additional compute time to perform R analysis.
Execution of R code inside the SQL server would handle those scenarios better as it brings the data closer to the R script.
SQL Server R services to the Rescue:
With R services, we can now push R scripts computations directly to SQL server machines. This eliminates the entire data movement process.
For deployment in production, R scripts are integrated into T-SQL which means that they can be used in any application that connects to the SQL server.
The SQL server supports parallelism and is multi-threaded. Now, we can take advantage of all the performance features within SQL server like column store indexes and in-memory querying.
Key User Scenarios for R services
With the in-database server, we can invoke R scripts by embedding them in system stored procedures and have the computations execute in the database. This is useful when we want to score using the predictive model in production. We can also execute R scripts to return predictions and embed them into the application.
Data scientists interact most often with R language. He/she can use any client work station or R development tool to build the solutions they need. Having an IDE setup, they can look at data, produce a model and create R script for a specific requirement. The data scientist then hands the script to a developer who can take and integrate it into the current application which then can be implemented in production. This can be easily taken care of because the SQL developer will be familiar with t-SQL interface used to work with R Script.
A database administrator can then setup security policies and establish how R scripts behave in SQL server. This way, we can protect the critical resources on the database machine.
A data engineer can then integrate application work flows that use R scripts.
With these scenarios we can basically cover the end to end application for SQL server using R services.
SQL server R services feature must be explicitly installed in SQL Server.
Here is how we install R server together with new SQL server 2016 instance.
Figure 1: SQL Server Installation Type
In Figure 1, we are choosing a new installation of SQL server 2016. We can also add R service as a feature to the existing SQL server 2016 instance.
Figure 2: Feature Selection
As shown in Figure 2, select the R services in Database Engine services. This is the install option to integrate R within the SQL server.
Figure 3: Shared Feature in Feature Selection.
In Figure 3, we see that there is R server (Standalone). This is to install Microsoft R server independent of SQL server. For example, if a data scientist wanted to run R on his or her machine.
Go ahead and install after choosing the required features on the new instance.
After installation is successfully complete, open SQL server Launchpad Service.
SQL Server Launchpad
Figure4: Windows R sever.
Restart the SQL server Launchpad if required.
As said earlier, Microsoft SQL server R service is the conduit between SQL server and Microsoft R server. SQL server Launchpad service acts as a routing mechanism between SQL server and External Languages. The Launchpad service spins up and calls in the correct run time from inside the SQL server. At this point in the process, Launchpad service supports only R.
Enable SQL server R Service:
After installing SQL server R service we need to enable the feature. We need to change the configuration and enable external scripts. This can be done using the following command on SSMS.
Testing R script in SQL server 2016:
Sp_execute_external_Script is the procedure that will be used to call the R code inside SQL server. This procedure calls the Launchpad service which will route to call the correct run time.
Thanks for reading my blog. Feel free to comment below with any questions!