Apache: Big Data North America 2017 will be held at the Intercontinental Miami in Miami, Florida. 

Register now for the event taking place May 16-18, 2017. 
Tuesday, May 16 • 4:40pm - 5:30pm
R4ML: A R Bridge to Apache SystemML and SparkR - Alok Singh, IBM Spark Technology Center

R is the de facto standard for statistics and data analysis. In this talk, we introduce R4ML, a new open-source R package from IBM. R4ML provides a bridge between R and Apache SystemML, allowing R scripts to invoke custom algorithms developed in SystemML's R-like domain-specific language. This capability also provides a bridge to the algorithm scripts that ship with Apache SystemML, effectively adding a new library of prebuilt scalable algorithms for R on Apache Spark. R4ML integrates seamlessly with SparkR, so data scientists can use the best features of SparkR and SystemML together in the same script. In addition, the R4ML package provides a number of useful new R functions that simplify common data cleaning and statistical analysis tasks.

Our talk will begin with an overview of the R4ML package, its API, its supported canned algorithms, and its integration with Spark and SystemML. We will walk through a small example of creating a custom algorithm, followed by a demo. We will share our experiences using R4ML technology with IBM clients. The talk will conclude with pointers on how the audience can try out R4ML and a discussion of potential areas of community collaboration.

Alok Singh

Principal Engineer, IBM
Alok Singh is a Principal Engineer at the IBM Spark Technology Center, where he leads the HydraR project. He has built and architected multiple analytical frameworks and implemented machine learning algorithms. His interest is in creating Big Data and scalable machine learning software...

Tuesday May 16, 2017 4:40pm - 5:30pm EDT