- Requires dplyr 0.7 and Microsoft Machine Learning Server/R Server >= 8.0
- Supports data in HDFS and in the native filesystem
- Supports in-database pipelines for SQL Server and in-cluster pipelines for Hive tables in Spark
- Adds utility functions for working with data in HDFS