
Hadoop 0.20.2 Fully Distributed Mode Installation and Configuration

Published 2014-12-30 23:19 | Source: linux.it.net.cn | Author: IT

-----------------------------------------------------------
Hadoop cluster plan
IP address      hostname
------------    --------
10.10.10.100 master(namenode,secondary namenode,job tracker)
10.10.10.101 slave1(datanode,tasktracker)
10.10.10.102 slave2(datanode,tasktracker)
-----------------------------------------------------------

Virtualization software: VMware Server 2.0
Operating system: Red Hat Enterprise Linux Server 5.3 (32-bit) x 3
Hadoop version: 0.20.2
JDK version: 1.7 (7u45)
Note: check the command-line prompt in each step to see which user and host the command is being run as.
-----------------------------------------------------------

1. Edit /etc/hosts (on all three machines)
127.0.0.1       localhost
10.10.10.100    master
10.10.10.101    slave1
10.10.10.102    slave2
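
An optional sanity check, not part of the original transcript: after editing /etc/hosts, confirm from each machine that the hostnames resolve to the expected addresses, for example:
[root@master ~]# ping -c 1 slave1    # should answer from 10.10.10.101
[root@master ~]# ping -c 1 slave2    # should answer from 10.10.10.102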

2. Create the hadoop group and user (on all three machines)
[root@master ~]# groupadd hadoop
[root@master ~]# useradd -g hadoop hadoop
[root@master ~]# passwd hadoop
Changing password for user hadoop.
New UNIX password: 
BAD PASSWORD: it is based on a dictionary word
Retype new UNIX password: 
passwd: all authentication tokens updated successfully.
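
Optionally verify the new account (this check is not in the original transcript):
[root@master ~]# id hadoop    # should list the hadoop user and its hadoop group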

3. Upload the required software; here it is copied via smb (Samba) into the software directory under the hadoop user's home directory (on all three machines)
[hadoop@master software]$ pwd
/home/hadoop/software
[hadoop@master software]$ ll
total 162096
-rwxr--r-- 1 hadoop hadoop  44575568 Feb  3 15:24 hadoop-0.20.2.tar.gz
-rwxr--r-- 1 hadoop hadoop 121236291 Jan  3 11:15 jdk-7u45-linux-i586.rpm

4. Install the JDK (on all three machines)
[root@master ~]# cd /home/hadoop/software/
[root@master software]# ll
total 162096
-rwxr--r-- 1 hadoop hadoop  44575568 Feb  3 15:24 hadoop-0.20.2.tar.gz
-rwxr--r-- 1 hadoop hadoop 121236291 Jan  3 11:15 jdk-7u45-linux-i586.rpm
[root@master software]# rpm -ivh jdk-7u45-linux-i586.rpm 
Preparing...                ########################################### [100%]
   1:jdk                    ########################################### [100%]
Unpacking JAR files...
        rt.jar...
        jsse.jar...
        charsets.jar...
        tools.jar...
        localedata.jar...
        jfxrt.jar...
        plugin.jar...
        javaws.jar...
        deploy.jar...
[root@master software]# vi /etc/profile
export JAVA_HOME=/usr/java/jdk1.7.0_45
export PATH=$JAVA_HOME/bin/:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

[root@master software]# source /etc/profile
[root@master software]# java -version
java version "1.7.0_45"
Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
Java HotSpot(TM) Client VM (build 24.45-b08, mixed mode, sharing)

5. Configure passwordless SSH login for the hadoop user
-- master node
[hadoop@master ~]$ chmod -R 755 /home/hadoop
[hadoop@master ~]$ mkdir ~/.ssh
[hadoop@master ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
b4:be:4c:c0:b8:2c:bf:8f:ae:5c:8d:8b:4d:b8:90:87 hadoop@master
[hadoop@master ~]$ ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_dsa): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/hadoop/.ssh/id_dsa.
Your public key has been saved in /home/hadoop/.ssh/id_dsa.pub.
The key fingerprint is:
da:02:17:e3:10:42:38:a5:e9:12:f2:36:e5:22:14:6d hadoop@master

-- slave1 node
[hadoop@slave1 ~]$ chmod -R 755 /home/hadoop
[hadoop@slave1 ~]$ mkdir ~/.ssh
[hadoop@slave1 ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
26:d6:66:43:36:ba:69:9f:95:d2:bb:4e:80:80:2f:0f hadoop@slave1
[hadoop@slave1 ~]$ ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_dsa): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/hadoop/.ssh/id_dsa.
Your public key has been saved in /home/hadoop/.ssh/id_dsa.pub.
The key fingerprint is:
15:67:b5:ec:37:c4:db:ef:34:25:6c:8d:49:40:5d:2c hadoop@slave1

-- slave2 node

[hadoop@slave2 ~]$ chmod -R 755 /home/hadoop
[hadoop@slave2 ~]$ mkdir ~/.ssh
[hadoop@slave2 ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
b8:03:99:20:ec:0e:49:af:07:1c:66:15:7c:66:2e:03 hadoop@slave2
[hadoop@slave2 ~]$ ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_dsa): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/hadoop/.ssh/id_dsa.
Your public key has been saved in /home/hadoop/.ssh/id_dsa.pub.
The key fingerprint is:
bc:52:59:f7:08:9a:37:17:57:af:dd:45:61:43:1e:e0 hadoop@slave2

-- master node
[hadoop@master ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[hadoop@master ~]$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

[hadoop@master ~]$ ssh slave1 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
The authenticity of host 'slave1 (10.10.10.101)' can't be established.
RSA key fingerprint is 99:ef:c2:9e:28:e3:b6:83:e2:00:eb:a3:ee:ad:29:d8.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave1,10.10.10.101' (RSA) to the list of known hosts.
hadoop@slave1's password: 
[hadoop@master ~]$ ssh slave1 cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
hadoop@slave1's password:

[hadoop@master ~]$ ssh slave2 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
The authenticity of host 'slave2 (10.10.10.102)' can't be established.
RSA key fingerprint is 99:ef:c2:9e:28:e3:b6:83:e2:00:eb:a3:ee:ad:29:d8.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave2,10.10.10.102' (RSA) to the list of known hosts.
hadoop@slave2's password: 
[hadoop@master ~]$ ssh slave2 cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
hadoop@slave2's password:

[hadoop@master ~]$ scp ~/.ssh/authorized_keys slave1:~/.ssh/authorized_keys
hadoop@slave1's password: 
authorized_keys                                                                                   100% 2994     2.9KB/s   00:00    
[hadoop@master ~]$ scp ~/.ssh/authorized_keys slave2:~/.ssh/authorized_keys
hadoop@slave2's password: 
authorized_keys                                                                                   100% 2994     2.9KB/s   00:00
-- Fix the permissions of the .ssh directory on each node

[hadoop@master ~]$ chmod -R 700 ~/.ssh
[hadoop@slave1 ~]$ chmod -R 700 ~/.ssh
[hadoop@slave2 ~]$ chmod -R 700 ~/.ssh

-- Test from master
[hadoop@master ~]$ ssh master date
The authenticity of host 'master (10.10.10.100)' can't be established.
RSA key fingerprint is 99:ef:c2:9e:28:e3:b6:83:e2:00:eb:a3:ee:ad:29:d8.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'master,10.10.10.100' (RSA) to the list of known hosts.
Fri Feb  7 00:10:42 CST 2014
[hadoop@master ~]$ ssh master date
Fri Feb  7 00:10:45 CST 2014
[hadoop@master ~]$ ssh slave1 date
Fri Feb  7 00:09:46 CST 2014
[hadoop@master ~]$ ssh slave2 date
Fri Feb  7 00:10:52 CST 2014
-- Test from slave1

[hadoop@slave1 ~]$ ssh master date
The authenticity of host 'master (10.10.10.100)' can't be established.
RSA key fingerprint is 99:ef:c2:9e:28:e3:b6:83:e2:00:eb:a3:ee:ad:29:d8.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'master,10.10.10.100' (RSA) to the list of known hosts.
Fri Feb  7 00:11:30 CST 2014
[hadoop@slave1 ~]$ ssh master date
Fri Feb  7 00:11:32 CST 2014
[hadoop@slave1 ~]$ ssh slave1 date
The authenticity of host 'slave1 (10.10.10.101)' can't be established.
RSA key fingerprint is 99:ef:c2:9e:28:e3:b6:83:e2:00:eb:a3:ee:ad:29:d8.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave1,10.10.10.101' (RSA) to the list of known hosts.
Fri Feb  7 00:10:34 CST 2014
[hadoop@slave1 ~]$ ssh slave1 date
Fri Feb  7 00:10:36 CST 2014
[hadoop@slave1 ~]$ ssh slave2 date
The authenticity of host 'slave2 (10.10.10.102)' can't be established.
RSA key fingerprint is 99:ef:c2:9e:28:e3:b6:83:e2:00:eb:a3:ee:ad:29:d8.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave2,10.10.10.102' (RSA) to the list of known hosts.
Fri Feb  7 00:11:42 CST 2014
[hadoop@slave1 ~]$ ssh slave2 date
Fri Feb  7 00:11:44 CST 2014
-- Test from slave2

[hadoop@slave2 ~]$ ssh master date
The authenticity of host 'master (10.10.10.100)' can't be established.
RSA key fingerprint is 99:ef:c2:9e:28:e3:b6:83:e2:00:eb:a3:ee:ad:29:d8.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'master,10.10.10.100' (RSA) to the list of known hosts.
Fri Feb  7 00:12:13 CST 2014
[hadoop@slave2 ~]$ ssh master date
Fri Feb  7 00:12:15 CST 2014
[hadoop@slave2 ~]$ ssh slave1  date
The authenticity of host 'slave1 (10.10.10.101)' can't be established.
RSA key fingerprint is 99:ef:c2:9e:28:e3:b6:83:e2:00:eb:a3:ee:ad:29:d8.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave1,10.10.10.101' (RSA) to the list of known hosts.
Fri Feb  7 00:11:17 CST 2014
[hadoop@slave2 ~]$ ssh slave1  date
Fri Feb  7 00:11:18 CST 2014
[hadoop@slave2 ~]$ ssh slave2  date
The authenticity of host 'slave2 (10.10.10.102)' can't be established.
RSA key fingerprint is 99:ef:c2:9e:28:e3:b6:83:e2:00:eb:a3:ee:ad:29:d8.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave2,10.10.10.102' (RSA) to the list of known hosts.
Fri Feb  7 00:12:26 CST 2014
[hadoop@slave2 ~]$ ssh slave2  date
Fri Feb  7 00:12:28 CST 2014
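
If any of these tests still prompts for a password, the usual cause is file permissions: with StrictModes (the sshd default), keys are ignored when the home directory, ~/.ssh or authorized_keys is writable by group or others. Two optional commands for debugging, not part of the original run:
[hadoop@master ~]$ ls -ld ~ ~/.ssh ~/.ssh/authorized_keys    # none of these should be group- or world-writable
[hadoop@master ~]$ ssh -v slave1 date    # verbose output shows which keys are offered and why they are rejected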

6. Install Hadoop on master
[hadoop@master software]$ pwd
/home/hadoop/software
[hadoop@master software]$ ll
total 162096
-rwxr-xr-x 1 hadoop hadoop  44575568 Feb  3 15:24 hadoop-0.20.2.tar.gz
-rwxr-xr-x 1 hadoop hadoop 121236291 Jan  3 11:15 jdk-7u45-linux-i586.rpm
[hadoop@master software]$ tar -zxvf hadoop-0.20.2.tar.gz
[hadoop@master software]$ ll
total 162100
drwxr-xr-x 12 hadoop hadoop      4096 Feb 19  2010 hadoop-0.20.2
-rwxr-xr-x  1 hadoop hadoop  44575568 Feb  3 15:24 hadoop-0.20.2.tar.gz
-rwxr-xr-x  1 hadoop hadoop 121236291 Jan  3 11:15 jdk-7u45-linux-i586.rpm
[hadoop@master software]$ mv hadoop-0.20.2 /home/hadoop
[hadoop@master software]$ cd 
[hadoop@master ~]$ ll
total 8
drwxr-xr-x 12 hadoop hadoop 4096 Feb 19  2010 hadoop-0.20.2
drwxr-xr-x  2 hadoop hadoop 4096 Feb  3 23:59 software
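
Optionally (this is not done in the original steps), add Hadoop's bin directory to the hadoop user's PATH so the scripts can be called from any directory; the remaining steps in this article simply run them from ~/hadoop-0.20.2/bin instead:
[hadoop@master ~]$ echo 'export HADOOP_HOME=$HOME/hadoop-0.20.2' >> ~/.bash_profile
[hadoop@master ~]$ echo 'export PATH=$HADOOP_HOME/bin:$PATH' >> ~/.bash_profile
[hadoop@master ~]$ source ~/.bash_profile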


7. Configure Hadoop on master
[hadoop@master conf]$ pwd
/home/hadoop/hadoop-0.20.2/conf
[hadoop@master conf]$ ll
total 56
-rw-rw-r-- 1 hadoop hadoop 3936 Feb 19  2010 capacity-scheduler.xml
-rw-rw-r-- 1 hadoop hadoop  535 Feb 19  2010 configuration.xsl
-rw-rw-r-- 1 hadoop hadoop  178 Feb 19  2010 core-site.xml
-rw-rw-r-- 1 hadoop hadoop 2237 Feb 19  2010 hadoop-env.sh
-rw-rw-r-- 1 hadoop hadoop 1245 Feb 19  2010 hadoop-metrics.properties
-rw-rw-r-- 1 hadoop hadoop 4190 Feb 19  2010 hadoop-policy.xml
-rw-rw-r-- 1 hadoop hadoop  178 Feb 19  2010 hdfs-site.xml
-rw-rw-r-- 1 hadoop hadoop 2815 Feb 19  2010 log4j.properties
-rw-rw-r-- 1 hadoop hadoop  178 Feb 19  2010 mapred-site.xml
-rw-rw-r-- 1 hadoop hadoop   10 Feb 19  2010 masters
-rw-rw-r-- 1 hadoop hadoop   10 Feb 19  2010 slaves
-rw-rw-r-- 1 hadoop hadoop 1243 Feb 19  2010 ssl-client.xml.example
-rw-rw-r-- 1 hadoop hadoop 1195 Feb 19  2010 ssl-server.xml.example

[hadoop@master conf]$ vi hadoop-env.sh
# The java implementation to use.  Required.
# export JAVA_HOME=/usr/lib/j2sdk1.5-sun
export JAVA_HOME=/usr/java/jdk1.7.0_45

[hadoop@master conf]$ vi core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
</property>
</configuration>

[hadoop@master conf]$ vi hdfs-site.xml 
<configuration>
<property>
<name>dfs.data.dir</name>
<value>/home/hadoop/hadoop-data</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>/home/hadoop/hadoop-name</value>
</property>
<property>
<name>fs.checkpoint.dir</name>
<value>/home/hadoop/hadoop-namesecondary</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>

[hadoop@master conf]$ vi mapred-site.xml 
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>master:9001</value>
</property>
</configuration>

[hadoop@master conf]$ vi masters
master

[hadoop@master conf]$ vi slaves 
slave1
slave2
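
Note that in this Hadoop version conf/masters lists the host(s) that will run the SecondaryNameNode (the NameNode itself always starts on the machine where start-dfs.sh is executed), while conf/slaves lists the DataNode/TaskTracker hosts. A quick optional way to review all of the values just configured (not part of the original transcript):
[hadoop@master conf]$ grep -A 1 '<name>' core-site.xml hdfs-site.xml mapred-site.xml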

8. Copy the configured Hadoop installation from master to the slave1 and slave2 nodes
[hadoop@master ~]$ scp -r hadoop-0.20.2/ slave1:~/
[hadoop@master ~]$ scp -r hadoop-0.20.2/ slave2:~/
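
An optional check, not in the original transcript, that the tree arrived intact on the slaves:
[hadoop@master ~]$ ssh slave1 ls ~/hadoop-0.20.2/conf
[hadoop@master ~]$ ssh slave2 md5sum ~/hadoop-0.20.2/conf/core-site.xml    # compare against the checksum of the local copy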

9. Format HDFS (on the master node)
[hadoop@master bin]$ pwd
/home/hadoop/hadoop-0.20.2/bin
[hadoop@master bin]$ ll
total 64
-rwxr-xr-x 1 hadoop hadoop 9998 Feb 19  2010 hadoop
-rwxr-xr-x 1 hadoop hadoop 1966 Feb 19  2010 hadoop-config.sh
-rwxr-xr-x 1 hadoop hadoop 3690 Feb 19  2010 hadoop-daemon.sh
-rwxr-xr-x 1 hadoop hadoop 1227 Feb 19  2010 hadoop-daemons.sh
-rwxr-xr-x 1 hadoop hadoop 2710 Feb 19  2010 rcc
-rwxr-xr-x 1 hadoop hadoop 2043 Feb 19  2010 slaves.sh
-rwxr-xr-x 1 hadoop hadoop 1066 Feb 19  2010 start-all.sh
-rwxr-xr-x 1 hadoop hadoop  965 Feb 19  2010 start-balancer.sh
-rwxr-xr-x 1 hadoop hadoop 1645 Feb 19  2010 start-dfs.sh
-rwxr-xr-x 1 hadoop hadoop 1159 Feb 19  2010 start-mapred.sh
-rwxr-xr-x 1 hadoop hadoop 1019 Feb 19  2010 stop-all.sh
-rwxr-xr-x 1 hadoop hadoop 1016 Feb 19  2010 stop-balancer.sh
-rwxr-xr-x 1 hadoop hadoop 1146 Feb 19  2010 stop-dfs.sh
-rwxr-xr-x 1 hadoop hadoop 1068 Feb 19  2010 stop-mapred.sh
[hadoop@master bin]$ ./hadoop namenode -format

14/02/07 00:46:15 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = master/10.10.10.100
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
14/02/07 00:46:15 INFO namenode.FSNamesystem: fsOwner=hadoop,hadoop
14/02/07 00:46:15 INFO namenode.FSNamesystem: supergroup=supergroup
14/02/07 00:46:15 INFO namenode.FSNamesystem: isPermissionEnabled=true
14/02/07 00:46:15 INFO common.Storage: Image file of size 96 saved in 0 seconds.
14/02/07 00:46:15 INFO common.Storage: Storage directory /home/hadoop/hadoop-name has been successfully formatted.
14/02/07 00:46:15 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/10.10.10.100
************************************************************/
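
If the format succeeded, the metadata directory configured as dfs.name.dir now exists; an optional look at it (not part of the original transcript):
[hadoop@master bin]$ ls ~/hadoop-name/current    # should contain files such as fsimage, edits and VERSION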


10. Start Hadoop
[hadoop@master bin]$ ./start-all.sh    (run on the master node)
starting namenode, logging to /home/hadoop/hadoop-0.20.2/bin/../logs/hadoop-hadoop-namenode-master.out
slave2: starting datanode, logging to /home/hadoop/hadoop-0.20.2/bin/../logs/hadoop-hadoop-datanode-slave2.out
slave1: starting datanode, logging to /home/hadoop/hadoop-0.20.2/bin/../logs/hadoop-hadoop-datanode-slave1.out
master: starting secondarynamenode, logging to /home/hadoop/hadoop-0.20.2/bin/../logs/hadoop-hadoop-secondarynamenode-master.out
starting jobtracker, logging to /home/hadoop/hadoop-0.20.2/bin/../logs/hadoop-hadoop-jobtracker-master.out
slave1: starting tasktracker, logging to /home/hadoop/hadoop-0.20.2/bin/../logs/hadoop-hadoop-tasktracker-slave1.out
slave2: starting tasktracker, logging to /home/hadoop/hadoop-0.20.2/bin/../logs/hadoop-hadoop-tasktracker-slave2.out


11. Check the Hadoop processes
[hadoop@master bin]$ jps
8131 SecondaryNameNode
8201 JobTracker
8269 Jps
7968 NameNode

[hadoop@slave1 logs]$ jps
7570 DataNode
7722 Jps
7675 TaskTracker

[hadoop@slave2 conf]$ jps
7667 TaskTracker
7562 DataNode
7714 Jps
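
Beyond jps, a simple end-to-end smoke test (these commands are not part of the original transcript) is to ask the NameNode for a cluster report, which with the setup above should show two live datanodes, and to run one of the example jobs bundled with the release:
[hadoop@master bin]$ ./hadoop dfsadmin -report
[hadoop@master bin]$ ./hadoop jar ../hadoop-0.20.2-examples.jar pi 2 10    # small Monte Carlo pi-estimation MapReduce job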


12. Access the HTTP web interfaces
http://10.10.10.100:50030/ (JobTracker web UI)
http://10.10.10.101:50060/ (TaskTracker web UI; served by each slave node, here slave1 and slave2, not by master)
http://10.10.10.100:50070/ (NameNode web UI)
http://10.10.10.101:50075/ (DataNode web UI; served by each slave node, not by master)
http://10.10.10.100:50090/ (SecondaryNameNode web UI)
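
These pages can also be fetched from the command line to confirm that the daemons are listening (an optional check, not in the original article, assuming curl is installed):
[hadoop@master ~]$ curl -s -o /dev/null -w '%{http_code}\n' http://master:50070/    # expect 200 from the NameNode UI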


