Cloudera Hadoop Distribution installation guide on a Cluster
Note: Type every quoted command in the terminal without the surrounding quotes.
For all machines (master + slaves):
1. Download the JDK1.6.0_*.rpm.bin file from Oracle's Sun Java site and install it.
Verify the installation by typing “java -version” in the
terminal. Reboot.
2. Install Cloudera's Hadoop distribution by typing
“yum install hadoop-0.20”,
then verify by typing “hadoop version”, which should show the Hadoop
version currently installed. Reboot.
3. Type “jps” to verify that all the daemons are running. This confirms that Hadoop is
running in pseudo-distributed mode.
4. Now stop all daemons by typing,
“for service in /etc/init.d/hadoop-0.20-*; do sudo $service stop; done”
5. Edit the hosts file by typing “vi /etc/hosts” and append the following lines, with the IP address first and the hostname second,
192.168.1.34 master (or whatever the master IP is)
192.168.1.67 slave1 (or whatever the slave1 IP is)
192.168.1.67 slave2 (or whatever the slave2 IP is)
….
….
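As a rough sketch of step 5, the entries can also be appended from a script instead of editing the file by hand. The helper name add_host is hypothetical and not part of Hadoop or Cloudera's packages; it only appends a line when the hostname is not already mapped, so it can be run more than once.

```shell
#!/bin/bash
# add_host FILE IP NAME (hypothetical helper):
# append "IP<TAB>NAME" to FILE unless NAME is already mapped there.
add_host() {
    local file=$1 ip=$2 name=$3
    # Look for NAME as a whole field (column 2 onward) on a non-comment line.
    if awk -v n="$name" '
            !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }' "$file"; then
        echo "$name already present in $file"
    else
        printf '%s\t%s\n' "$ip" "$name" >> "$file"
    fi
}

# Example (run as root, substituting your own addresses):
#   add_host /etc/hosts 192.168.1.34 master
#   add_host /etc/hosts 192.168.1.67 slave1
```

Running it against a scratch copy of the hosts file first is a cheap way to check the result before touching /etc/hosts.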
6. Change directory to /usr/lib/hadoop-0.20/conf and edit the following files.
For the time being, modify only the following property in core-site.xml,
<property>
<name>fs.default.name</name>
<value>hdfs://master:8020</value>
</property>
Modify only the following property in mapred-site.xml for the time being,
<property>
<name>mapred.job.tracker</name>
<value>master:8021</value>
</property>
Now, modify the masters and slaves files on each machine,
In the masters file write,
master
In the slaves file write,
slave1
slave2
…
…
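Since every machine needs the same values in core-site.xml and mapred-site.xml, a quick check after editing can catch a typo before the daemons are started. The following is only a sketch: get_prop is a hypothetical helper, and it is a plain text scan rather than a real XML parser, so it assumes each <value> line follows its <name> line as in the snippets above.

```shell
#!/bin/bash
# get_prop FILE NAME (hypothetical helper):
# print the <value> that follows <name>NAME</name> in a Hadoop *-site.xml file.
get_prop() {
    awk -v n="$2" '
        index($0, "<name>" n "</name>") { want = 1 }
        want && match($0, /<value>[^<]*<\/value>/) {
            # Strip the 7-character <value> and the 8-character </value>.
            print substr($0, RSTART + 7, RLENGTH - 15)
            exit
        }' "$1"
}

# Example (paths and property names as used in step 6):
#   cd /usr/lib/hadoop-0.20/conf
#   get_prop core-site.xml fs.default.name
#   get_prop mapred-site.xml mapred.job.tracker
```

If the property is missing, the function prints nothing, which is easy to test for in a script.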
Setting up password-less ssh from master to all slaves
On the namenode machine, i.e. the master in our case, run the following,
ssh-keygen -t dsa -P "" -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
chmod 600 $HOME/.ssh/authorized_keys
chown `whoami` $HOME/.ssh/authorized_keys
Copy the generated id_dsa.pub key to all slave machines using,
scp ~/.ssh/id_dsa.pub slave1:~/.ssh/master.pub (provide password of slave1 when asked)
scp ~/.ssh/id_dsa.pub slave2:~/.ssh/master.pub (provide password of slave2 when asked)
………
………
Do the following on all slaves (by any means),
cat $HOME/.ssh/master.pub >> $HOME/.ssh/authorized_keys
chmod go-w $HOME $HOME/.ssh
chmod 600 $HOME/.ssh/authorized_keys
chown `whoami` $HOME/.ssh/authorized_keys
Now, typing “ssh slaveN” on the master, where N = 1, 2, … n (slaves), should result in a
password-less login.
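With more than a couple of slaves, checking each login by hand gets tedious. A minimal sketch of a loop that does it from the master is shown here; check_slaves and the SSH variable are hypothetical names introduced for this example. It relies on the OpenSSH option BatchMode=yes, which makes ssh fail instead of prompting when key-based login does not work.

```shell
#!/bin/bash
# Command used to connect; overridable so the loop can be tried with a stub.
SSH=${SSH:-ssh}

# check_slaves HOST... (hypothetical helper):
# report, per host, whether a login succeeds without any password prompt.
check_slaves() {
    local failed=0 host
    for host in "$@"; do
        # BatchMode=yes disables password prompts; "true" is the remote command.
        if "$SSH" -o BatchMode=yes -o ConnectTimeout=5 "$host" true </dev/null; then
            echo "$host: ok"
        else
            echo "$host: FAILED"
            failed=1
        fi
    done
    return "$failed"
}

# Example (hostnames from the slaves file):
#   check_slaves slave1 slave2
```

Any host reported as FAILED most likely still needs the key appended to authorized_keys or the permissions fixed as described above.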
7. To format the namenode, type the following on the master machine,
sudo -u hdfs hadoop namenode -format
8. Start the following daemons on the master,
sudo service hadoop-0.20-namenode start
sudo service hadoop-0.20-jobtracker start
sudo service hadoop-0.20-secondarynamenode start
9. Start the following daemons on all slaves,
sudo service hadoop-0.20-tasktracker start
sudo service hadoop-0.20-datanode start
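Once the daemons are started, jps can confirm that each machine runs what it should. The sketch below is one possible way to automate that check; missing_daemons is a hypothetical helper that reads jps output (lines of the form "PID ClassName") on standard input and prints every expected daemon that is absent. It assumes jps is run with enough privileges to see the Hadoop processes, which may mean running it through sudo.

```shell
#!/bin/bash
# missing_daemons NAME... (hypothetical helper):
# read jps output on stdin and print each expected daemon that is not listed.
missing_daemons() {
    local jps_out name
    jps_out=$(cat)
    for name in "$@"; do
        # The second column of jps output is the main class name.
        printf '%s\n' "$jps_out" \
            | awk -v n="$name" '$2 == n { ok = 1 } END { exit !ok }' \
            || echo "$name"
    done
}

# Example on the master:
#   sudo jps | missing_daemons NameNode JobTracker SecondaryNameNode
# Example on a slave:
#   sudo jps | missing_daemons DataNode TaskTracker
```

An empty result means every expected daemon was found; otherwise the log of the missing daemon is the first place to look.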