Setting up Hadoop 2.2.0 on CentOS 6.5
Date: 2016-02-27 14:05 Source: linux.it.net.cn Author: IT
This setup is for testing and learning, so it uses three virtual machines:
s1=192.168.56.101
s2=192.168.56.102
s3=192.168.56.103
Edit the hosts file:
#vim /etc/hosts //append the following at the bottom
192.168.56.101 hadoop1
192.168.56.102 hadoop2
192.168.56.103 hadoop3
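Instead of editing the file by hand, the same entries could be appended with a small helper. This is only a sketch: `add_host_entry` is a hypothetical function name, and it assumes the entries are wanted on every machine and that a host name already present in the file should be left alone.

```shell
# Append "IP NAME" to a hosts-style file unless NAME is already mapped there.
# Usage: add_host_entry FILE IP NAME   (add_host_entry is a hypothetical helper)
add_host_entry() (
    file=$1
    ip=$2
    name=$3
    # awk exits 0 only if NAME appears as a host field on a non-comment line.
    if ! awk -v n="$name" '
            !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }
        ' "$file"; then
        printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
)

# Example (as root, assumed to be run on each of the three machines):
# add_host_entry /etc/hosts 192.168.56.101 hadoop1
# add_host_entry /etc/hosts 192.168.56.102 hadoop2
# add_host_entry /etc/hosts 192.168.56.103 hadoop3
```

Running it twice leaves the file unchanged the second time, so it is safe to repeat.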
1, (101, 102, 103) Install the JDK:
Download URL: http://download.chinaunix.net/down.php?id=33931&ResourceID=61&site=1
#cd /usr/local/
#wget -c -O jdk-6u26-dlj-linux-i586.bin "http://download.chinaunix.net/down.php?id=33931&ResourceID=61&site=1" //quote the URL so the shell does not interpret the & characters
#chmod +x ./jdk-6u26-dlj-linux-i586.bin
#./jdk-6u26-dlj-linux-i586.bin
When the license agreement appears, scroll to the end and type yes (or press q and then type yes), then wait for the installation to finish.
2, (101, 102, 103) Add environment variables:
#vim + /etc/profile //append the following at the very bottom of the file
export JAVA_HOME=/usr/local/jdk1.6.0_26/
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$PATH
#source /etc/profile
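After reloading the profile it may be worth confirming that JAVA_HOME really points at a JDK before going further. A minimal sketch, where `check_java_home` is a hypothetical helper and "looks like a JDK" is assumed to mean "has an executable bin/java":

```shell
# Return 0 if DIR has an executable bin/java, otherwise return non-zero.
# Usage: check_java_home DIR   (check_java_home is a hypothetical helper)
check_java_home() {
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

# Example, after `source /etc/profile`:
# check_java_home "$JAVA_HOME" && "$JAVA_HOME/bin/java" -version
```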
3, Configure SSH keys for passwordless login:
(1)#ssh-keygen -t rsa //press Enter at every prompt; run this on all three machines
(2)Then copy each server's id_rsa.pub into the authorized_keys file on the other servers
(3)On server 101:
#scp ~/.ssh/id_rsa.pub root@192.168.56.102:/root/.ssh/authorized_keys
#scp ~/.ssh/id_rsa.pub root@192.168.56.103:/root/.ssh/authorized_keys
(4)On servers 102 and 103:
#cd ~/.ssh/
#cat id_rsa.pub >> authorized_keys
(5)Append ~/.ssh/id_rsa.pub from server 102 to the bottom of ~/.ssh/authorized_keys on servers 101 and 103. Note that this must be an append, not an overwrite
(6)On servers 101 and 103:
#cd ~/.ssh/
#cat id_rsa.pub >> authorized_keys
(7)Append ~/.ssh/id_rsa.pub from server 103 to the bottom of ~/.ssh/authorized_keys on servers 102 and 101. Again, append, do not overwrite
(8)On servers 102 and 101:
#cd ~/.ssh/
#cat id_rsa.pub >> authorized_keys
After these steps, all three servers can log in to each other without a password.
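The "append, do not overwrite" steps are easy to get wrong, and repeating the cat commands leaves duplicate lines in authorized_keys. A small helper can do the append only when the key is missing. This is a sketch, not part of the original procedure: `append_key_once` and the /tmp/hadoop2.pub path in the example are hypothetical names.

```shell
# Append the public key stored in KEYFILE to AUTHFILE, unless that exact
# line is already there. Usage: append_key_once KEYFILE AUTHFILE
# (append_key_once is a hypothetical helper)
append_key_once() (
    keyfile=$1
    authfile=$2
    key=$(cat "$keyfile") || exit 1
    touch "$authfile"
    # grep -F: fixed string, -x: match the whole line, -q: no output
    if ! grep -Fxq -- "$key" "$authfile"; then
        printf '%s\n' "$key" >> "$authfile"
    fi
    # sshd may ignore the file if it is writable by other users
    chmod 600 "$authfile"
)

# Example on server 101, after copying 102's key to a hypothetical /tmp/hadoop2.pub:
# append_key_once /tmp/hadoop2.pub ~/.ssh/authorized_keys
# append_key_once ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
```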
4, (101) Install Hadoop:
#cd /usr/local/
#wget -c http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-2.2.0/hadoop-2.2.0.tar.gz
#tar zxvf hadoop-2.2.0.tar.gz
#cd hadoop-2.2.0/etc/hadoop/
#vim hadoop-env.sh
//Find the line export JAVA_HOME=${JAVA_HOME} and duplicate it (yy then p in vim),
comment out one copy,
and change the uncommented copy to: export JAVA_HOME=/usr/local/jdk1.6.0_26/
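The same change to hadoop-env.sh could also be made without opening an editor. The sketch below assumes the file contains an uncommented line starting with `export JAVA_HOME=`; `set_java_home` is a hypothetical helper name, and the JDK path must not contain `|` or `&`, which have special meaning in the sed expression.

```shell
# Point the active `export JAVA_HOME=...` line of FILE at JDK_DIR.
# Commented-out lines are left untouched.
# Usage: set_java_home FILE JDK_DIR   (set_java_home is a hypothetical helper)
set_java_home() (
    file=$1
    jdk=$2
    tmp=$(mktemp) || exit 1
    # "|" is used as the sed delimiter because the path contains "/".
    sed "s|^export JAVA_HOME=.*|export JAVA_HOME=$jdk|" "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    exit $status
)

# Example:
# set_java_home /usr/local/hadoop-2.2.0/etc/hadoop/hadoop-env.sh /usr/local/jdk1.6.0_26
```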
5, Edit core-site.xml:
#vim core-site.xml //insert the following between <configuration> and </configuration>
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://hadoop1:9000</value>
</property>
<property>
  <name>io.file.buffer.size</name>
  <value>131072</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/usr/local/hadoop-2.2.0/tmp</value>
  <description>A base for other temporary directories.</description>
</property>
<property>
  <name>hadoop.proxyuser.hduser.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hduser.groups</name>
  <value>*</value>
</property>
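A typo in a property name is silently ignored by Hadoop, so it can help to read a value back out of the file after editing. The following is a rough sketch, not a real XML parser: it assumes each <name> and <value> tag sits on its own line as in the snippet above, and `get_site_property` is a hypothetical helper name.

```shell
# Print the <value> that follows <name>PROP</name> in a *-site.xml file.
# Assumes one tag per line; prints nothing if the property is absent.
# Usage: get_site_property FILE PROP   (get_site_property is a hypothetical helper)
get_site_property() {
    awk -v prop="$2" '
        index($0, "<name>" prop "</name>") { want = 1; next }
        want && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$1"
}

# Example:
# get_site_property core-site.xml fs.defaultFS
```

The same helper works for mapred-site.xml, hdfs-site.xml and yarn-site.xml, since they share the layout.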
6, Edit mapred-site.xml:
#cp mapred-site.xml.template mapred-site.xml //only needed if mapred-site.xml does not exist yet
#vim mapred-site.xml //insert the following between <configuration> and </configuration>
<property>
  <name>mapred.job.tracker</name>
  <value>hadoop1:9001</value>
</property>
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>hadoop1:10020</value>
</property>
<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>hadoop1:19888</value>
</property>
7, Edit hdfs-site.xml:
#vim hdfs-site.xml //insert the following between <configuration> and </configuration>
<property>
  <name>dfs.namenode.secondary.http-address</name>
  <value>hadoop1:9001</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>/usr/local/hadoop-2.2.0/tmp/hdfs/name</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/usr/local/hadoop-2.2.0/tmp/hdfs/data</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
<property>
  <name>dfs.webhdfs.enabled</name>
  <value>true</value>
</property>
8, Edit yarn-site.xml:
#vim yarn-site.xml //insert the following between <configuration> and </configuration>
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
  <name>yarn.resourcemanager.address</name>
  <value>hadoop1:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>hadoop1:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>hadoop1:8031</value>
</property>
<property>
  <name>yarn.resourcemanager.admin.address</name>
  <value>hadoop1:8033</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address</name>
  <value>hadoop1:8088</value>
</property>
9, Edit the slaves file:
#vim slaves //replace the contents with the following
hadoop1
hadoop2
hadoop3
10, Copy Hadoop to the other two machines:
#scp -r /usr/local/hadoop-2.2.0 root@192.168.56.102:/usr/local/ //-r is required to copy a directory
#scp -r /usr/local/hadoop-2.2.0 root@192.168.56.103:/usr/local/
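With more than a couple of nodes the copy is easier to keep consistent as a loop. A minimal sketch: `copy_hadoop` is a hypothetical helper, and `RUN` is an assumed dry-run variable (set `RUN=echo` to print the commands instead of executing them).

```shell
# Copy the Hadoop directory to each host given as an argument.
# Stops at the first failed copy.
# Usage: copy_hadoop HOST...   (copy_hadoop and RUN are hypothetical names)
copy_hadoop() {
    for host in "$@"; do
        # With RUN unset this runs scp; with RUN=echo it only prints the command.
        ${RUN:-} scp -r /usr/local/hadoop-2.2.0 "root@$host:/usr/local/" || return 1
    done
}

# Dry run first, then the real copy:
# RUN=echo copy_hadoop 192.168.56.102 192.168.56.103
# copy_hadoop 192.168.56.102 192.168.56.103
```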
11, Update the environment variables on all three servers:
#vim + /etc/profile //append the following at the end
export HADOOP_HOME=/usr/local/hadoop-2.2.0
export PATH=.:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$JAVA_HOME/bin:$PATH
#source /etc/profile
12, On all three servers, create the directories and grant read/write permissions:
#mkdir -p /usr/local/hadoop-2.2.0/tmp/hdfs/name /usr/local/hadoop-2.2.0/tmp/hdfs/data
#chmod -R 0777 /usr/local/hadoop-2.2.0/tmp
Pay attention to the JDK contents. For some reason the JDK I downloaded was always missing the following jars, and I had to unpack them by hand, otherwise Hadoop would not start. Run each unpack200 command from the directory that should contain the jar:
1, /usr/local/jdk1.6.0_26/jre/lib/jsse.jar //if missing, run: /usr/local/jdk1.6.0_26/bin/unpack200 jsse.pack jsse.jar
2, /usr/local/jdk1.6.0_26/lib/tools.jar //if missing, run: /usr/local/jdk1.6.0_26/bin/unpack200 tools.pack tools.jar
3, /usr/local/jdk1.6.0_26/jre/lib/rt.jar //if missing, run: /usr/local/jdk1.6.0_26/bin/unpack200 rt.pack rt.jar
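Other jars may be affected as well, so it can be useful to scan the whole JDK directory for .pack files that have no matching .jar. The sketch below only prints the unpack200 commands so they can be reviewed before running; `list_unpack_commands` is a hypothetical helper, and it assumes file names without spaces or newlines.

```shell
# For every NAME.pack under DIR that has no NAME.jar next to it, print the
# command that would create the jar. Nothing is executed.
# Usage: list_unpack_commands DIR UNPACK200_PATH
# (list_unpack_commands is a hypothetical helper)
list_unpack_commands() {
    find "$1" -name '*.pack' | sort | while IFS= read -r pack; do
        jar="${pack%.pack}.jar"
        [ -e "$jar" ] || printf '%s %s %s\n' "$2" "$pack" "$jar"
    done
}

# Example:
# list_unpack_commands /usr/local/jdk1.6.0_26 /usr/local/jdk1.6.0_26/bin/unpack200
```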
13, Installation complete:
Then, on the master server (101):
#hadoop namenode -format //format the NameNode
#/usr/local/hadoop-2.2.0/sbin/start-all.sh //start all daemons

#jps //list the running Java processes
32029 SecondaryNameNode
31866 NameNode
32164 ResourceManager
32655 Jps

#hdfs dfsadmin -report //show the HDFS report
Configured Capacity: 14184103936 (13.21 GB)
Present Capacity: 8253059072 (7.69 GB)
DFS Remaining: 8167120896 (7.61 GB)
DFS Used: 85938176 (81.96 MB)
DFS Used%: 1.04%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 2 (2 total, 0 dead)

Live datanodes:
Name: 192.168.10.123:50010 (hadoop3)
Hostname: hadoop3
Decommission Status : Normal
Configured Capacity: 7092051968 (6.60 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 2965458944 (2.76 GB)
DFS Remaining: 4126568448 (3.84 GB)
DFS Used%: 0.00%
DFS Remaining%: 58.19%
Last contact: Thu Feb 20 16:58:47 CST 2014

Name: 192.168.10.110:50010 (hadoop2)
Hostname: hadoop2
Decommission Status : Normal
Configured Capacity: 7092051968 (6.60 GB)
DFS Used: 85913600 (81.93 MB)
Non DFS Used: 2965585920 (2.76 GB)
DFS Remaining: 4040552448 (3.76 GB)
DFS Used%: 1.21%
DFS Remaining%: 56.97%
Last contact: Thu Feb 20 17:00:32 CST 2014

//Once startup succeeds, try it out
#hadoop fs -mkdir /test/
#hadoop fs -ls /
drwxr-xr-x - root supergroup 0 2014-02-20 17:13 /test

It works ^_^. Now try uploading a file:
#hadoop fs -put CentOS-6.5-x86_64-LiveCD.iso /test/
#hadoop fs -ls /test/
Found 1 items
-rw-r--r-- 1 root supergroup 680525824 2014-02-20 17:37 /test/CentOS-6.5-x86_64-LiveCD.iso
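Reading jps output by eye works for one node. To repeat the check on several machines, the expected daemon names can be compared against the output automatically. A minimal sketch, assuming the two-column "PID Name" format shown in the jps output above; `missing_daemons` is a hypothetical helper, and which daemons to expect on which node is left to the reader.

```shell
# Read `jps` output on stdin and print every expected daemon name that does
# not appear in it, one per line. Prints nothing if all are running.
# Usage: jps | missing_daemons NAME...   (missing_daemons is a hypothetical helper)
missing_daemons() {
    awk -v want="$*" '
        { seen[$2] = 1 }                      # second column is the process name
        END {
            n = split(want, names, " ")
            for (i = 1; i <= n; i++)
                if (!(names[i] in seen)) print names[i]
        }
    '
}

# Example on the master server:
# jps | missing_daemons NameNode SecondaryNameNode ResourceManager
```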
(Editor: IT)