Wednesday, 13 November 2013

Installing a Hadoop Multi-Node Server on CentOS 6/RHEL

Installing Hadoop on a CentOS 6/RHEL box is a lot simpler now that rpm packages are available, but you still need to install the JDK first.

Set JAVA_HOME to /usr/java/default; you can then install Hadoop via yum from the EPEL repo.
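Before running the install, it can help to confirm that the JDK is really where JAVA_HOME points. The following is a minimal sketch; check_java_home is a hypothetical helper name, not part of Hadoop or the JDK, and it only tests that an executable bin/java exists under the given directory.

```shell
# Hypothetical helper: succeed only if <dir>/bin/java exists and is executable.
# Falls back to $JAVA_HOME when no argument is given.
check_java_home() {
    dir="${1:-${JAVA_HOME:-}}"
    if [ -n "$dir" ] && [ -x "$dir/bin/java" ]; then
        echo "ok: $dir/bin/java"
        return 0
    fi
    echo "missing: no executable java under '$dir'" >&2
    return 1
}

# Check the path used in this post and print a hint if the JDK is not there.
check_java_home /usr/java/default || echo "Install the JDK before installing Hadoop"
```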

$ sudo yum -y install hadoop

If you have any problems with yum, you can instead download the rpm of your choice from an Apache mirror and install it with

$ sudo rpm -Uvh <rpm_package_name>

Once the package is installed, set it all up.

Generate the Hadoop configuration on all nodes.

$ /usr/sbin/hadoop-setup-conf.sh \
  --namenode-url=hdfs://${namenode}:9000/ \
  --jobtracker-url=${jobtracker}:9001 \
  --conf-dir=/etc/hadoop \
  --hdfs-dir=/var/lib/hadoop/hdfs \
  --namenode-dir=/var/lib/hadoop/hdfs/namenode \
  --mapred-dir=/var/lib/hadoop/mapred \
  --datanode-dir=/var/lib/hadoop/hdfs/data \
  --log-dir=/var/log/hadoop \
  --auto

Replace ${namenode} and ${jobtracker} with the hostnames of the namenode and the jobtracker.

Format the namenode and set up the default HDFS layout.
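One way to handle the substitution is to set the two names as shell variables before running the setup script, so the shell expands them for you. This is only a sketch: the hostnames nn.example.com and jt.example.com are made-up placeholders, and port 9000 is taken from the fs.default.name value used later in this post.

```shell
# Placeholder hostnames (assumptions); use the real names of your nodes.
namenode="nn.example.com"
jobtracker="jt.example.com"

# Addresses derived from the hostnames, using the ports shown in this post.
namenode_url="hdfs://${namenode}:9000/"
jobtracker_url="${jobtracker}:9001"

# Print the values so they can be checked before being passed to the script.
echo "namenode URL: ${namenode_url}"
echo "--jobtracker-url=${jobtracker_url}"
```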

$ /usr/sbin/hadoop-setup-hdfs.sh

Start all data nodes, stopping any that are already running first.


$ /etc/init.d/hadoop-datanode start

Start job tracker node.
   
$ /etc/init.d/hadoop-jobtracker start

Start task tracker nodes.
   
$ /etc/init.d/hadoop-tasktracker start
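Since each node only runs some of these services, a small lookup from role to init script can keep things straight. The sketch below uses a hypothetical helper, init_script_for_role, and only prints the start commands rather than running them; the roles listed in the loop are an example for a worker node.

```shell
# Hypothetical helper: map a node role to the init script named in this post.
init_script_for_role() {
    case "$1" in
        datanode|jobtracker|tasktracker)
            echo "/etc/init.d/hadoop-$1"
            ;;
        *)
            echo "unknown role: $1" >&2
            return 1
            ;;
    esac
}

# Print (do not run) the start command for each role assumed on this node.
for role in datanode tasktracker; do
    script=$(init_script_for_role "$role") && echo "$script start"
done
```

Remove the echo in the loop once the printed commands look right for the node.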

Create a user account on HDFS for yourself.
   
$ /usr/sbin/hadoop-create-user.sh -u $USER

Set up Hadoop Environment
   
$ vi ~/.bash_profile

In INSERT mode, set the path for JAVA_HOME and export it.

Save the file by pressing Esc and typing :wq

Reload .bash_profile so the change takes effect in the current shell.
   
$ source ~/.bash_profile
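For reference, the lines added to ~/.bash_profile might look like the sketch below. The JAVA_HOME value is the one used earlier in this post; adding its bin directory to PATH is an extra assumption, convenient but not required by the steps here.

```shell
# JDK location used in this post.
export JAVA_HOME=/usr/java/default

# Assumption: also put the JDK tools first on PATH.
export PATH="$JAVA_HOME/bin:$PATH"
```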

Set JAVA_HOME path in Hadoop Environment file
   
$ sudo vi /etc/hadoop/hadoop-env.sh
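If you would rather not edit the file by hand, the same change can be scripted. This is a rough sketch, assuming the file either already has an "export JAVA_HOME=" line (possibly commented out) or has none at all; set_java_home is a hypothetical helper, and the demo runs against a temporary copy rather than the real /etc/hadoop/hadoop-env.sh.

```shell
# Hypothetical helper: set JAVA_HOME in an env file, replacing an existing
# (possibly commented-out) export line or appending one if none is found.
set_java_home() {
    env_file="$1"
    java_home="$2"
    if grep -Eq '^[#[:space:]]*export JAVA_HOME=' "$env_file"; then
        # A backup of the original is kept as <file>.bak
        sed -i.bak -E "s|^[#[:space:]]*export JAVA_HOME=.*|export JAVA_HOME=${java_home}|" "$env_file"
    else
        printf 'export JAVA_HOME=%s\n' "$java_home" >> "$env_file"
    fi
}

# Demo on a throwaway file that stands in for hadoop-env.sh.
demo_env=$(mktemp)
printf '# export JAVA_HOME=/some/old/path\n' > "$demo_env"
set_java_home "$demo_env" /usr/java/default
cat "$demo_env"
```

To apply it for real, the function would need to run with root privileges against /etc/hadoop/hadoop-env.sh.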

Configure Hadoop

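Edit the following files. The values shown use localhost; on a multi-node cluster, replace localhost with the hostnames of the namenode and jobtracker.

$ sudo vi /etc/hadoop/core-site.xml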

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

/etc/hadoop/hdfs-site.xml:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

/etc/hadoop/mapred-site.xml:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
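On a cluster with several nodes, typing the same XML on each machine is error-prone, so one option is to generate the file from a hostname. The sketch below writes a core-site.xml into a temporary staging directory; write_core_site and the hostname nn.example.com are hypothetical, and copying the result into /etc/hadoop is left to you.

```shell
# Hypothetical helper: write a core-site.xml that points at the given namenode.
write_core_site() {
    conf_dir="$1"
    namenode_host="$2"
    cat > "${conf_dir}/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://${namenode_host}:9000</value>
  </property>
</configuration>
EOF
}

# Generate into a staging directory first so the result can be reviewed.
staging=$(mktemp -d)
write_core_site "$staging" "nn.example.com"
cat "$staging/core-site.xml"
```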

Hadoop Commands

$ hadoop

$ hadoop namenode -format (Format the namenode; answer 'Y' if prompted)
   
$ hadoop namenode (Start the namenode)
   
$ find / -name start-dfs.sh (Find where the script is installed)
   
$ cd /usr/sbin (Go to the directory that contains it)
   
$ start-dfs.sh
   
$ start-mapred.sh

$ hadoop fs -ls / (Shows the HDFS root folder)

$ hadoop fs -put input/file01 /input/file01 (Copy local input/file01 to /input/file01 in HDFS)
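A quick way to confirm an upload worked is to list the target directory straight after the copy. The sketch below wraps the two commands in a hypothetical upload_and_list function with a dry-run switch (DRY_RUN, also an invented name), so by default it only prints what it would run.

```shell
# Dry-run by default: print commands instead of executing them.
DRY_RUN="${DRY_RUN:-1}"

# Run a command, or just print it prefixed with "+ " in dry-run mode.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: copy a local file into HDFS, then list its parent directory.
upload_and_list() {
    local_file="$1"
    hdfs_path="$2"
    run hadoop fs -put "$local_file" "$hdfs_path"
    run hadoop fs -ls "$(dirname "$hdfs_path")"
}

upload_and_list input/file01 /input/file01
```

Set DRY_RUN=0 to execute the commands against a running cluster.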