Nigel Pond’s guide on building a 7 node Raspberry Pi Hadoop cluster
Inspired by a desire to learn more about Hadoop and the fact I already owned a Raspberry Pi, I wondered whether anyone had yet built a Hadoop cluster based on these hobby computers. I wasn’t surprised to discover that people have already done this, and the following instructions are where I started:
Jonas Widriksson: http://www.widriksson.com/raspberry-pi-hadoop-cluster/
Jonas’s instructions are based on Hadoop version 1.0 and Carsten’s are based on version 2.x.
If, like me, you’re interested in building with the newer version of Hadoop, then follow Carsten’s instructions, but read through Jonas’s too: he provides useful links for downloading the Raspbian distribution (a Linux operating system built specifically for the Raspberry Pi) as well as commands and example files for testing your cluster.
The first stage is to build a single-node cluster in which your one node performs all roles: NameNode, Secondary NameNode and DataNode. Once you have this up and running, you’re ready to add a second node. This second node will be a dedicated DataNode from which you will clone all subsequent DataNodes.
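To make that layout concrete, here is a minimal Python sketch of which daemons each node would run as the cluster grows from one node to seven. The hostnames (node1 to node7) and the helper names are my own placeholders, not taken from either set of instructions, and the sketch assumes the first node keeps all three roles after the dedicated DataNodes are added.

```python
def plan_cluster(node_count, prefix="node"):
    """Map hypothetical hostnames to the Hadoop daemons each would run.

    Assumption: the first node keeps every role (NameNode,
    Secondary NameNode and DataNode); every later node is a
    dedicated DataNode, as if cloned from the second node.
    """
    if node_count < 1:
        raise ValueError("a cluster needs at least one node")
    roles = {}
    for i in range(1, node_count + 1):
        host = f"{prefix}{i}"  # placeholder hostname, e.g. node1
        if i == 1:
            roles[host] = ["NameNode", "SecondaryNameNode", "DataNode"]
        else:
            roles[host] = ["DataNode"]
    return roles


def datanode_hosts(roles):
    """Return the hostnames that run a DataNode, in insertion order."""
    return [host for host, daemons in roles.items() if "DataNode" in daemons]


if __name__ == "__main__":
    # Print the planned layout for a 7-node cluster, one host per line.
    for host, daemons in plan_cluster(7).items():
        print(host, ", ".join(daemons))
```

The list returned by datanode_hosts is the kind of host list that Hadoop 2.x reads, one hostname per line, from its slaves file. Treat the names here as stand-ins for whatever your own network uses.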
Creating the second node is slightly more difficult, which is why I decided to write this post, in the hope that it will save others time and effort.