Wednesday, 13 November 2013

Install a Hadoop Multi-Node Cluster on CentOS 6/RHEL

Installing Hadoop on a CentOS 6/RHEL box is now a lot simpler since RPM packages became available, but you still need to install the JDK beforehand.

Set the JAVA_HOME path to /usr/java/default, and you can install Hadoop via yum from the EPEL repository.

$ sudo yum -y install hadoop
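
If the EPEL repository is not yet enabled on the box, something along these lines usually takes care of it first (this assumes the CentOS extras repo is available; on RHEL you may need to grab the epel-release rpm from the Fedora mirrors instead):

$ sudo yum -y install epel-release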

If you have any problems with yum, you can also use the Apache mirror service: download the package of your choice and install it with

$ sudo rpm -Uvh <rpm_package_name>
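
For example, using the Hadoop 1.2.1 rpm from the Apache archive (the version and mirror here are only illustrative; pick whatever release you actually want):

$ wget http://archive.apache.org/dist/hadoop/core/hadoop-1.2.1/hadoop-1.2.1-1.x86_64.rpm
$ sudo rpm -Uvh hadoop-1.2.1-1.x86_64.rpm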

Once the package is installed, set it all up.

Generate the Hadoop configuration on all nodes:

$ /usr/sbin/hadoop-setup-conf.sh \
  --namenode-url=hdfs://${namenode}:9000/ \
  --jobtracker-url=${jobtracker}:9001 \
  --conf-dir=/etc/hadoop \
  --hdfs-dir=/var/lib/hadoop/hdfs \
  --namenode-dir=/var/lib/hadoop/hdfs/namenode \
  --mapred-dir=/var/lib/hadoop/mapred \
  --datanode-dir=/var/lib/hadoop/hdfs/data \
  --log-dir=/var/log/hadoop \
  --auto

Here ${namenode} and ${jobtracker} should be replaced with the hostnames of the namenode and the jobtracker.
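
For instance, with hypothetical hosts master01 (namenode) and master02 (jobtracker), the first two options would read (the remaining options stay as above):

  --namenode-url=hdfs://master01:9000/ \
  --jobtracker-url=master02:9001 \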

Format the namenode and set up the default HDFS layout.

$ /usr/sbin/hadoop-setup-hdfs.sh

Start the datanode service on all data nodes (stopping any running instance first).

$ /etc/init.d/hadoop-datanode start

Start the jobtracker node.
   
$ /etc/init.d/hadoop-jobtracker start

Start the tasktracker nodes.
   
$ /etc/init.d/hadoop-tasktracker start
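
To confirm the daemons actually came up, you can run jps (shipped with the JDK) on each node; the exact list depends on the node's role, but you would expect entries such as DataNode, JobTracker and TaskTracker:

$ jps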

Create a user account on HDFS for yourself.
   
$ /usr/sbin/hadoop-create-user.sh -u $USER
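
A quick way to verify that the HDFS home directory was created (assuming the usual /user/&lt;name&gt; layout):

$ hadoop fs -ls /user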

Set up the Hadoop environment.
   
$ vi ~/.bash_profile

In insert mode, set the path for JAVA_HOME and export it.
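
For example, assuming the JDK lives at /usr/java/default as noted above, the lines added to ~/.bash_profile might look like this (the PATH line is optional but convenient):

export JAVA_HOME=/usr/java/default
export PATH=$PATH:$JAVA_HOME/bin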

Save the file by pressing Esc, then typing :wq.

Source the .bash_profile so the change takes effect.
   
$ source ~/.bash_profile

Set the JAVA_HOME path in the Hadoop environment file as well.
   
$ sudo vi /etc/hadoop/hadoop-env.sh
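
The relevant line in hadoop-env.sh would then be (again assuming the JDK at /usr/java/default):

export JAVA_HOME=/usr/java/default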

Configure Hadoop

Use the following:

$ sudo vi /etc/hadoop/core-site.xml

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
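
Note that localhost only suits a single-node test; on a multi-node cluster fs.default.name would normally point at the namenode host, for example (master01 being a hypothetical hostname):

  <property>
    <name>fs.default.name</name>
    <value>hdfs://master01:9000</value>
  </property>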

$ sudo vi /etc/hadoop/hdfs-site.xml

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
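
A replication factor of 1 keeps a single copy of each block, which is fine for a single-node test; on a real multi-node cluster you would typically raise it, e.g. to 3 (not exceeding the number of datanodes):

  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>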

$ sudo vi /etc/hadoop/mapred-site.xml

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
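
As with fs.default.name, a multi-node setup would point this at the jobtracker host rather than localhost, for example (master02 being a hypothetical hostname):

  <property>
    <name>mapred.job.tracker</name>
    <value>master02:9001</value>
  </property>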

Hadoop Commands

$ hadoop

$ hadoop namenode -format (Format the namenode; answer 'Y' if asked)
   
$ hadoop namenode (Start the namenode)
   
$ find / -name start-dfs.sh (Locate the start script)

$ cd /usr/sbin (Change to that directory)
   
$ start-dfs.sh
   
$ start-mapred.sh

$ hadoop fs -ls / (Shows the HDFS root folder)

$ hadoop fs -put input/file01 /input/file01 (Copy local input/file01 to HDFS /input/file01)
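
To sanity-check the whole cluster end to end, you can run one of the bundled example jobs; the jar path below is where the rpm install usually puts it, so adjust it if your layout differs:

$ hadoop jar /usr/share/hadoop/hadoop-examples-*.jar wordcount /input /output
$ hadoop fs -cat /output/part-*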

