CentOS 6/RHEL: Install a Hadoop Multi-Node Server

Installing Hadoop on a CentOS 6/RHEL box is now a lot simpler since RPM packages became available, but you still need to install the JDK first.

Set JAVA_HOME to /usr/java/default; you can then install Hadoop with yum from the EPEL repository.

$ sudo yum -y install hadoop
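If the install or a later step complains about Java, it is worth confirming that JAVA_HOME really points at a JDK. A minimal sketch, assuming the JDK lives under /usr/java/default as above; the function name check_java_home is hypothetical and not part of Hadoop:

```shell
# check_java_home DIR: succeed if DIR looks like a Java installation,
# i.e. it contains an executable bin/java. (Hypothetical helper name.)
check_java_home() {
    if [ -x "$1/bin/java" ]; then
        echo "OK: $1/bin/java found"
    else
        echo "ERROR: no executable java under $1" >&2
        return 1
    fi
}

# Usage (assumes the path used in this guide):
#   export JAVA_HOME=/usr/java/default
#   check_java_home "$JAVA_HOME"
```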

If you have any problems with yum, you can instead download the RPM of your choice from an Apache mirror and install it with:

$ sudo rpm -Uvh <rpm_package_name>

Once the package is installed, set it all up.

Generate hadoop configuration on all nodes

$ /usr/sbin/hadoop-setup-conf.sh \
  --namenode-url=hdfs://${namenode}:9000 \
  --jobtracker-url=${jobtracker}:9001 \
  --conf-dir=/etc/hadoop \
  --hdfs-dir=/var/lib/hadoop/hdfs \
  --namenode-dir=/var/lib/hadoop/hdfs/namenode \
  --mapred-dir=/var/lib/hadoop/mapred \
  --datanode-dir=/var/lib/hadoop/hdfs/data \
  --log-dir=/var/log/hadoop \
  --auto

Here, replace ${namenode} and ${jobtracker} with the hostnames of the namenode and the jobtracker.
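One way to do the substitution without editing the command is to define the two names as shell variables in the same session before running it. A small sketch; master.example.com is a made-up hostname, so substitute your own:

```shell
# Hypothetical hostnames; substitute the real ones for your cluster.
namenode=master.example.com
jobtracker=master.example.com

# Show what the values expand to before running hadoop-setup-conf.sh
# in this same shell session.
echo "namenode host: ${namenode}"
echo "--jobtracker-url=${jobtracker}:9001"
```

Because the variables live only in the current shell, run the setup command from that same terminal.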

Format the namenode and set up the default HDFS layout.

$ /usr/sbin/hadoop-setup-hdfs.sh

Start all data nodes (stop any that are already running first).

$ /etc/init.d/hadoop-datanode start

Start the job tracker node.

$ /etc/init.d/hadoop-jobtracker start

Start the task tracker nodes.

$ /etc/init.d/hadoop-tasktracker start
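The stop-then-start pattern for the services above can be wrapped in a small helper. This is only a sketch: restart_hadoop_service and DRY_RUN are hypothetical names, and it assumes the init scripts are named exactly as in the commands above and that you run it with sufficient privileges.

```shell
# restart_hadoop_service NAME: stop the given Hadoop init script, then start it.
# Set DRY_RUN=1 to print the commands instead of running them.
restart_hadoop_service() {
    for action in stop start; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "/etc/init.d/hadoop-$1 $action"
        else
            # A failing "stop" (service not running) does not prevent "start".
            "/etc/init.d/hadoop-$1" "$action"
        fi
    done
}

# Usage, on the node that runs the given service:
#   restart_hadoop_service datanode
#   restart_hadoop_service jobtracker
#   restart_hadoop_service tasktracker
```

Running with DRY_RUN=1 first lets you check which scripts would be called on each node.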

Create a user account on HDFS for yourself.

$ /usr/sbin/hadoop-create-user.sh -u $USER

Set up Hadoop Environment

$ vi ~/.bash_profile

In insert mode, add a line that sets and exports JAVA_HOME:

export JAVA_HOME=/usr/java/default

Save the file by pressing Esc and typing :wq

Reload .bash_profile in the current shell:

$ source ~/.bash_profile
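If you would rather not edit the file by hand, the same change can be scripted so that it is safe to run more than once. A minimal sketch; add_java_home is a hypothetical helper name, and the JDK path is the one assumed earlier in this guide:

```shell
# add_java_home FILE DIR: append "export JAVA_HOME=DIR" to FILE unless
# the file already exports JAVA_HOME. (Hypothetical helper name.)
add_java_home() {
    if grep -q '^export JAVA_HOME=' "$1" 2>/dev/null; then
        echo "JAVA_HOME already set in $1"
    else
        echo "export JAVA_HOME=$2" >> "$1"
        echo "added JAVA_HOME=$2 to $1"
    fi
}

# Usage:
#   add_java_home ~/.bash_profile /usr/java/default
#   source ~/.bash_profile
```

The grep check only looks for an existing export line; it does not verify that the path in it is correct.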

Set the JAVA_HOME path in the Hadoop environment file as well:

$ sudo vi /etc/hadoop/hadoop-env.sh

Configure Hadoop

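Edit each file below and use the following settings. On a multi-node cluster, replace localhost with the hostname of the namenode (in core-site.xml) or the jobtracker (in mapred-site.xml).

$ sudo vi /etc/hadoop/core-site.xml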

<configuration>

<property>

<name>fs.default.name</name>

<value>hdfs://localhost:9000</value>

</property>

</configuration>

$ sudo vi /etc/hadoop/hdfs-site.xml

<configuration>

<property>

<name>dfs.replication</name>

<value>1</value>

</property>

</configuration>

$ sudo vi /etc/hadoop/mapred-site.xml

<configuration>

<property>

<name>mapred.job.tracker</name>

<value>localhost:9001</value>

</property>

</configuration>
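Since every node needs the same files, it can be handy to generate them from a hostname rather than typing them on each machine. A sketch for core-site.xml only, using the property name and port shown above; write_core_site and master.example.com are hypothetical:

```shell
# write_core_site HOST FILE: write a core-site.xml that points
# fs.default.name at hdfs://HOST:9000. (Hypothetical helper name.)
write_core_site() {
    cat > "$2" <<EOF
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://$1:9000</value>
  </property>
</configuration>
EOF
}

# Usage (hypothetical hostname; review the file, then copy it into place):
#   write_core_site master.example.com /tmp/core-site.xml
#   sudo cp /tmp/core-site.xml /etc/hadoop/core-site.xml
```

Writing to a temporary file first means the live configuration is only replaced once you have looked at the result.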

Hadoop Commands

$ hadoop

$ hadoop namenode -format (Format the namenode; answer 'Y' if asked to confirm)

$ hadoop namenode (Start the namenode)

$ find / -name start-dfs.sh (Find the directory that contains start-dfs.sh)

$ cd /usr/sbin (Go to that directory)

$ start-dfs.sh

$ start-mapred.sh

$ hadoop fs -ls / (Show the HDFS root folder)

$ hadoop fs -put input/file01 /input/file01 (Copy local input/file01 to /input/file01 in HDFS)
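As a quick smoke test, the copy and the listing can be combined so that a failed upload is noticed straight away. A sketch only; hdfs_put_and_list and HADOOP_CMD are hypothetical names, and the function simply calls the hadoop fs -put and hadoop fs -ls commands used above:

```shell
# hdfs_put_and_list LOCAL REMOTE: copy LOCAL into HDFS as REMOTE, then list it.
# HADOOP_CMD defaults to "hadoop"; override it to try the function
# without a running cluster (for example HADOOP_CMD=echo).
hdfs_put_and_list() {
    ${HADOOP_CMD:-hadoop} fs -put "$1" "$2" || return 1
    ${HADOOP_CMD:-hadoop} fs -ls "$2"
}

# Usage:
#   hdfs_put_and_list input/file01 /input/file01
```

If the put fails, the function returns a non-zero status and the listing is skipped.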
