Radoop: Hadoop extension for RapidMiner
data analytics, data mining, machine learning, Sep 2010 -
Hadoop is an excellent tool for analyzing large data sets, but it lacks an easy-to-use graphical interface. RapidMiner is an excellent tool for data analytics, but the amount of data it can process is limited by the available memory, and a single machine is often not enough to finish the analyses in time. In this project, we combine the strengths of both tools and provide a RapidMiner extension for editing and running ETL, data analytics and machine learning processes over Hadoop.
Gábor Makrai and Zoltán Prekopcsák started this project and closely integrated the highly optimized data analytics capabilities of Hive and Mahout with the user-friendly interface of RapidMiner, forming a powerful and easy-to-use data analytics solution for Hadoop.
Radoop will launch in June 2011 with a presentation at RCOMM 2011.
You can read more details on the main site of Radoop!
Zoltán Prekopcsák, Gábor Makrai, Tamás Henk and Csaba Gáspár-Papanek (2011). Radoop: Analyzing Big Data with RapidMiner and Hadoop. In RCOMM 2011: RapidMiner Community Meeting And Conference. Dublin, Ireland, June 2011.
Gábor Makrai and Zoltán Prekopcsák (2011). Scaling out data preprocessing with Hive. In POSTER 2011: Proceedings of the 15th International Student Conference on Electrical Engineering. Prague, Czech Republic, May 2011.