
Setting up Hadoop 2.2.0 on CentOS 6.5

For testing and learning purposes, three virtual machines are used:

s1 = 192.168.56.101
s2 = 192.168.56.102
s3 = 192.168.56.103

Edit the hosts file:

#vim /etc/hosts	// append at the bottom

192.168.56.101  hadoop1
192.168.56.102  hadoop2
192.168.56.103  hadoop3
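Every later step addresses the machines by these names, so it is worth verifying the entries before continuing. The sketch below is my own addition (the `check_hosts` function name and the file-as-parameter idea are not from the original):

```shell
# Sketch (not from the original article): verify that each expected hostname
# has an entry in a hosts-format file.
check_hosts() {
    file="$1"; shift
    rc=0
    for name in "$@"; do
        if grep -qw "$name" "$file"; then
            echo "OK: $name"
        else
            echo "MISSING: $name"
            rc=1
        fi
    done
    return $rc
}
```

On each machine, `check_hosts /etc/hosts hadoop1 hadoop2 hadoop3` should print three OK lines.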

1. Install the JDK (on 101, 102, and 103):

Download: http://download.chinaunix.net/down.php?id=33931&ResourceID=61&site=1

#cd /usr/local/
#wget -c "http://download.chinaunix.net/down.php?id=33931&ResourceID=61&site=1"
#chmod +x ./jdk-6u26-dlj-linux-i586.bin
#./jdk-6u26-dlj-linux-i586.bin

(The URL must be quoted, otherwise the shell treats the `&` as a background operator.)

When the license agreement appears, scroll to the bottom and type yes (or just press q, then type yes), and wait for the installation to finish.

2. Add environment variables (on 101, 102, and 103):

#vim + /etc/profile	// append at the bottom of the file

export JAVA_HOME=/usr/local/jdk1.6.0_26/
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$PATH

#source /etc/profile
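More export lines get appended to the same file in step 11, so a small idempotent helper avoids duplicate entries if the setup is re-run. `append_once` is my own sketch, not part of the original steps:

```shell
# Sketch (my addition): append a line to a file only if that exact line is not
# already present, so re-running the setup never duplicates profile entries.
append_once() {
    line="$1"; file="$2"
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

Usage: `append_once 'export JAVA_HOME=/usr/local/jdk1.6.0_26/' /etc/profile`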

 

 

3. Configure passwordless SSH login with keys:

(1) #ssh-keygen -t rsa	// press Enter at every prompt; run this on all three machines

(2) Each server's id_rsa.pub must then end up in the authorized_keys of the other servers:

(3) On server 101:
	#scp ~/.ssh/id_rsa.pub root@192.168.56.102:/root/.ssh/authorized_keys
	#scp ~/.ssh/id_rsa.pub root@192.168.56.103:/root/.ssh/authorized_keys

(4) On servers 102 and 103:
	#cd ~/.ssh/
	#cat id_rsa.pub >> authorized_keys

(5) Append server 102's ~/.ssh/id_rsa.pub to the bottom of ~/.ssh/authorized_keys on servers 101 and 103 (append, do not overwrite).

(6) On servers 101 and 103:
	#cd ~/.ssh/
	#cat id_rsa.pub >> authorized_keys

(7) Append server 103's ~/.ssh/id_rsa.pub to the bottom of ~/.ssh/authorized_keys on servers 102 and 101 (append, do not overwrite).

(8) On servers 102 and 101:
	#cd ~/.ssh/
	#cat id_rsa.pub >> authorized_keys

After these steps, all three servers can log in to each other over SSH without a password.
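The exchange above can also be done with ssh-copy-id, which appends the local public key to the remote authorized_keys for you (append, never overwrite), avoiding the risk of clobbering the file with scp. The loop below is a sketch, not the article's method; with DRY_RUN=1 it only prints the commands it would run:

```shell
# Sketch (alternative to the manual scp/cat steps): push this machine's public
# key to each peer with ssh-copy-id. Set DRY_RUN=1 to only print the commands.
distribute_key() {
    for host in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "ssh-copy-id -i ~/.ssh/id_rsa.pub root@$host"
        else
            ssh-copy-id -i ~/.ssh/id_rsa.pub "root@$host"
        fi
    done
}
```

Run `distribute_key 192.168.56.102 192.168.56.103` on 101 (and the matching pair of hosts on 102 and 103); each transfer prompts for the peer's password once.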


 

4. Install Hadoop (on 101):

#cd /usr/local/
#wget -c http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-2.2.0/hadoop-2.2.0.tar.gz

#tar zxvf hadoop-2.2.0.tar.gz
#cd hadoop-2.2.0/etc/hadoop/
#vim hadoop-env.sh

// find the line "export JAVA_HOME=${JAVA_HOME}" and duplicate it (yy, then p),
// comment out one copy, and change the uncommented one to:
export JAVA_HOME=/usr/local/jdk1.6.0_26/

 

5. Edit the core-site.xml file:

#vim core-site.xml	// add the following inside the file's <configuration>...</configuration> block
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop1:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop-2.2.0/tmp</value>
<description>Abase for other temporary directories.</description>
</property>
<property>
<name>hadoop.proxyuser.hduser.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hduser.groups</name>
<value>*</value>
</property>

6. Edit the mapred-site.xml file:

#vim mapred-site.xml	// add the following inside the file's <configuration>...</configuration> block
<property>
<name>mapred.job.tracker</name>
<value>hadoop1:9001</value>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>hadoop1:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hadoop1:19888</value>
</property>


 

7. Edit the hdfs-site.xml file:

#vim hdfs-site.xml	// add the following inside the file's <configuration>...</configuration> block
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoop1:9001</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/usr/local/hadoop-2.2.0/tmp/hdfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/usr/local/hadoop-2.2.0/tmp/hdfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>


 

8. Edit the yarn-site.xml file:

#vim yarn-site.xml	// add the following inside the file's <configuration>...</configuration> block
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>hadoop1:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>hadoop1:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>hadoop1:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>hadoop1:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>hadoop1:8088</value>
</property>

 

 

9. Edit the slaves file:

#vim slaves	// change its contents to:

hadoop1
hadoop2
hadoop3
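A hostname typo in slaves silently leaves that DataNode out of the cluster, so it is worth cross-checking the file against /etc/hosts. The helper below is my own sketch (both file paths are parameters):

```shell
# Sketch (my addition): make sure every hostname listed in the slaves file has
# a matching entry in the hosts file.
check_slaves() {
    slaves_file="$1"; hosts_file="$2"
    bad=0
    while read -r slave; do
        [ -n "$slave" ] || continue            # skip blank lines
        if ! grep -qw "$slave" "$hosts_file"; then
            echo "no hosts entry for: $slave"
            bad=1
        fi
    done < "$slaves_file"
    return $bad
}
```

Usage: `check_slaves /usr/local/hadoop-2.2.0/etc/hadoop/slaves /etc/hosts`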

10. Copy Hadoop to the other two machines:

#scp -r /usr/local/hadoop-2.2.0 root@192.168.56.102:/usr/local/
#scp -r /usr/local/hadoop-2.2.0 root@192.168.56.103:/usr/local/

(scp needs -r to copy a directory recursively.)

 

11. Update the environment variables on all three servers:

#vim + /etc/profile	// append at the end

export HADOOP_HOME=/usr/local/hadoop-2.2.0
export PATH=.:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$JAVA_HOME/bin:$PATH

#source /etc/profile

12. Create the directories on all three servers and grant read/write permission:

#mkdir /usr/local/hadoop-2.2.0/tmp /usr/local/hadoop-2.2.0/tmp/hdfs /usr/local/hadoop-2.2.0/tmp/hdfs/name /usr/local/hadoop-2.2.0/tmp/hdfs/data
#chmod -R 0777 /usr/local/hadoop-2.2.0/tmp
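The four mkdir calls can be collapsed with mkdir -p, which creates missing parents in one go. The function below is a sketch of that, with the base path as a parameter (on these machines it would be /usr/local/hadoop-2.2.0/tmp):

```shell
# Sketch (equivalent to the mkdir/chmod pair above): mkdir -p creates the whole
# tree at once, and chmod -R on the base covers every subdirectory.
make_hdfs_dirs() {
    base="$1"
    mkdir -p "$base/hdfs/name" "$base/hdfs/data" &&
    chmod -R 0777 "$base"
}
```

Usage: `make_hdfs_dirs /usr/local/hadoop-2.2.0/tmp` on each of the three servers.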

A note about the JDK: for some reason the JDK build I downloaded was missing the jars below, and they had to be unpacked manually, otherwise Hadoop would not start:

1. /usr/local/jdk1.6.0_26/jre/lib/jsse.jar	// if missing: /usr/local/jdk1.6.0_26/bin/unpack200 jsse.pack jsse.jar
2. /usr/local/jdk1.6.0_26/lib/tools.jar	// if missing: /usr/local/jdk1.6.0_26/bin/unpack200 tools.pack tools.jar
3. /usr/local/jdk1.6.0_26/jre/lib/rt.jar	// if missing: /usr/local/jdk1.6.0_26/bin/unpack200 rt.pack rt.jar

(Run each unpack200 command from the directory that contains the corresponding .pack file.)
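A quick way to spot this problem before starting Hadoop is to check for all three jars in one pass. `check_jdk_jars` is my own sketch, with the JDK root as a parameter:

```shell
# Sketch (my addition): report which of the three required jars are missing
# under a given JDK root, so you know which .pack files need unpack200.
check_jdk_jars() {
    jdk="$1"
    missing=0
    for jar in jre/lib/jsse.jar lib/tools.jar jre/lib/rt.jar; do
        if [ ! -f "$jdk/$jar" ]; then
            echo "missing: $jdk/$jar"
            missing=1
        fi
    done
    return $missing
}
```

Usage: `check_jdk_jars /usr/local/jdk1.6.0_26`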

13. Installation complete:

Then, on the master server (101):
#hadoop namenode -format	// format the NameNode
#/usr/local/hadoop-2.2.0/sbin/start-all.sh	// start everything

#jps	// check the running processes
32029 SecondaryNameNode
31866 NameNode
32164 ResourceManager
32655 Jps

# hdfs dfsadmin -report	// view the HDFS report
Configured Capacity: 14184103936 (13.21 GB)
Present Capacity: 8253059072 (7.69 GB)
DFS Remaining: 8167120896 (7.61 GB)
DFS Used: 85938176 (81.96 MB)
DFS Used%: 1.04%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 2 (2 total, 0 dead)

Live datanodes:
Name: 192.168.10.123:50010 (hadoop3)
Hostname: hadoop3
Decommission Status : Normal
Configured Capacity: 7092051968 (6.60 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 2965458944 (2.76 GB)
DFS Remaining: 4126568448 (3.84 GB)
DFS Used%: 0.00%
DFS Remaining%: 58.19%
Last contact: Thu Feb 20 16:58:47 CST 2014


Name: 192.168.10.110:50010 (hadoop2)
Hostname: hadoop2
Decommission Status : Normal
Configured Capacity: 7092051968 (6.60 GB)
DFS Used: 85913600 (81.93 MB)
Non DFS Used: 2965585920 (2.76 GB)
DFS Remaining: 4040552448 (3.76 GB)
DFS Used%: 1.21%
DFS Remaining%: 56.97%
Last contact: Thu Feb 20 17:00:32 CST 2014

// once everything is up, try it out
# hadoop fs -mkdir /test/
# hadoop fs -ls /
drwxr-xr-x   - root supergroup          0 2014-02-20 17:13 /test

Success ^_^ Now upload a file to try it:
#hadoop fs -put CentOS-6.5-x86_64-LiveCD.iso /test/
#hadoop fs -ls /test/
Found 1 items
-rw-r--r--   1 root supergroup  680525824 2014-02-20 17:37 /test/CentOS-6.5-x86_64-LiveCD.iso

