1 R + Streaming
With this approach, you use MapReduce to execute R scripts in the map and reduce phases.
R needs to be installed on each Data-Node, but packages are available in publicly available Yum repositories for easy installation.
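The streaming contract is the same whatever language the script is written in: the map script reads input lines and writes tab-separated key/value records, and the reduce script receives those records sorted by key. The sketch below illustrates that contract with a word count. It is written in Python purely for illustration; an R script would follow the same read-lines, emit-records shape. The function names (map_lines, reduce_lines, run_local) are hypothetical and not part of Hadoop.

```python
from itertools import groupby


def map_lines(lines):
    """Map phase: emit one 'word<TAB>1' record per word in each input line."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_lines(sorted_records):
    """Reduce phase: sum the counts for each run of identical keys.

    Assumes the records arrive sorted by key, which is what the
    shuffle/sort step between map and reduce is expected to provide.
    """
    pairs = (record.rstrip("\n").split("\t", 1) for record in sorted_records)
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{key}\t{total}"


def run_local(lines):
    """Simulate map -> sort -> reduce in a single process for local testing."""
    return list(reduce_lines(sorted(map_lines(lines))))
```

Running both phases locally on a small sample, as run_local does, is a cheap way to check a script's logic before submitting it to the cluster.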
2 RHipe
Rhipe is an open source R package that integrates the R environment with Hadoop, the open source implementation of Google’s MapReduce, so that MapReduce can be closely integrated with R on the client side.
Using Rhipe, it is possible to write MapReduce algorithms in R, launch and monitor MapReduce jobs from R and interact with the HDFS.
R must be installed on each Data-Node, along with Protocol Buffers and Rhipe itself.
3 RHadoop
RHadoop, like Rhipe, provides an R wrapper around MapReduce so that the two can be seamlessly integrated on the client side.
R must be installed on each Data-Node, and RHadoop has dependencies on other R packages. These packages can be installed from CRAN, and the RHadoop installation, while not via CRAN, is straightforward.
4 RHive
RHive is an R extension that facilitates distributed computing via Hive queries. It provides easy-to-use, SQL-like HQL and allows R objects and functions to be used in HQL. It requires Hadoop core and Hive.
5 Segue
Segue is an R language segue into parallel processing on Amazon Web Services (in the cloud). It is not a full MapReduce framework for R. It currently runs on Mac or Linux.