Hadoop 2.2.0 installation path: /opt/hadoop-2.2.0
Java version:
[steve@bmc opt]$ java -version
java version "1.7.0_45"
Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)
User that runs Hadoop: steve:steve
Set up passwordless (key-based) SSH login to the local machine as steve (./start-dfs.sh needs it at run time):
[steve@bmc ~]$ ssh-keygen -t rsa -P ""
[steve@bmc ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[steve@bmc ~]$ chmod 600 ~/.ssh/authorized_keys
[steve@bmc ~]$ chmod 600 ~/.ssh/id_rsa
[steve@bmc ~]$ ssh-add ~/.ssh/id_rsa
Identity added: /home/steve/.ssh/id_rsa (/home/steve/.ssh/id_rsa)
[steve@bmc ~]$ ssh localhost
Last login: Thu Jan 23 22:06:46 2014 from bmc
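If ssh localhost still asks for a password after these steps, overly permissive file modes are a common cause. The sketch below checks that the two files touched above have mode 600. check_ssh_perms is a hypothetical helper name, not part of OpenSSH or Hadoop.

```shell
# Hypothetical helper: verify that authorized_keys and id_rsa inside the
# directory given as $1 exist and have mode 600 exactly.
check_ssh_perms() {
    local dir="$1" status=0 f
    for f in "$dir/authorized_keys" "$dir/id_rsa"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f"
            status=1
        elif [ -z "$(find "$f" -prune -perm 600)" ]; then
            # find prints the file only when its mode is exactly 600
            echo "mode is not 600: $f"
            status=1
        fi
    done
    return "$status"
}
```

Calling check_ssh_perms ~/.ssh prints nothing and returns 0 when both files are in place with the expected mode.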
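Build Hadoop 2.2.0 on a 64-bit system to obtain the 64-bit native libraries, then copy them to /opt/hadoop-2.2.0/lib/native-x64.
Using the default 32-bit native libraries (/opt/hadoop-2.2.0/lib/native) produces the following errors: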
14/01/27 10:52:34 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /opt/hadoop-2.2.0/lib/native/libhadoop.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
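To tell which kind of library a given libhadoop.so is, one option is to read the ELF class byte (offset 4 of the file header: 1 means 32-bit, 2 means 64-bit). The following is a minimal sketch using only od and tr. elf_bits is a hypothetical helper name.

```shell
# Hypothetical helper: print 32 or 64 according to the ELF class byte of $1.
# The first four bytes of an ELF file are 7f 45 4c 46 ("\177ELF");
# the fifth byte is 01 for 32-bit objects and 02 for 64-bit objects.
elf_bits() {
    local header
    header=$(od -An -tx1 -N5 "$1" | tr -d ' \n')
    case "$header" in
        7f454c4601) echo 32 ;;
        7f454c4602) echo 64 ;;
        *) echo "not an ELF file: $1" >&2; return 1 ;;
    esac
}
```

On the setup described here, one would expect elf_bits /opt/hadoop-2.2.0/lib/native/libhadoop.so to report 32 and the rebuilt copy under lib/native-x64 to report 64.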
Environment variables for steve (.bashrc):
export ANT_HOME=/opt/apache-ant-1.9.2
export JAVA_HOME=/usr/java/jdk1.7.0_45
export JRE_HOME=${JAVA_HOME}/jre
export PATH=$JAVA_HOME/bin:$ANT_HOME/bin:/opt/eclipse:/opt/apache-jmeter-2.11/bin:$PATH
export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib:$JRE_HOME/lib:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export HADOOP_PREFIX=/opt/hadoop-2.2.0
export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin
export HADOOP_HOME=${HADOOP_PREFIX}
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
# Required; without it, odd hostname-resolution errors appear.
export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_PREFIX}/lib/native-x64
export HADOOP_CONF_DIR=${HADOOP_PREFIX}/etc/hadoop
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export HADOOP_YARN_HOME=${HADOOP_PREFIX}
export HADOOP_OPTS="-Djava.library.path=$HADOOP_PREFIX/lib/native-x64 -Djava.net.preferIPv4Stack=true"
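The native library directory is named twice in this file, once in HADOOP_COMMON_LIB_NATIVE_DIR and once inside HADOOP_OPTS, so a typo in either is easy to miss. A small sketch that compares the two follows. Both function names are hypothetical, and the check assumes the path contains no spaces.

```shell
# Hypothetical helper: print the value of -Djava.library.path found in the
# JVM options string $1 (prints nothing if the option is absent).
library_path_from_opts() {
    printf '%s\n' "$1" | sed -n 's/.*-Djava\.library\.path=\([^ ]*\).*/\1/p'
}

# Hypothetical helper: succeed only if the library path inside the options
# string $1 equals the directory $2.
native_paths_agree() {
    [ "$(library_path_from_opts "$1")" = "$2" ]
}
```

After sourcing .bashrc, native_paths_agree "$HADOOP_OPTS" "$HADOOP_COMMON_LIB_NATIVE_DIR" || echo "native library paths differ" flags a mismatch.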
[steve@bmc opt]$ sudo chown -R steve:steve hadoop-2.2.0/
[steve@bmc /]$ sudo mkdir /hadoop
[steve@bmc /]$ sudo chown steve:steve /hadoop/
[steve@bmc sbin]$ sudo service iptables stop    # The firewall must be stopped, otherwise the NodeManager fails to start
Edit the configuration files:
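core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9100</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/hadoop/tmp</value>
  </property>
</configuration>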
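hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/hadoop/dfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>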
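mapred-site.xml:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.job.tracker</name>
    <value>hdfs://localhost:9210</value>
    <final>true</final>
  </property>
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>1536</value>
  </property>
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx1024M</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>3072</value>
  </property>
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx2560M</value>
  </property>
  <property>
    <name>mapreduce.task.io.sort.mb</name>
    <value>512</value>
  </property>
  <property>
    <name>mapreduce.task.io.sort.factor</name>
    <value>100</value>
  </property>
  <property>
    <name>mapreduce.reduce.shuffle.parallelcopies</name>
    <value>50</value>
  </property>
  <property>
    <name>mapred.map.tasks</name>
    <value>10</value>
    <description>As a rule of thumb, use 10x the number of slaves (i.e., number of tasktrackers).</description>
  </property>
  <property>
    <name>mapred.reduce.tasks</name>
    <value>2</value>
    <description>As a rule of thumb, use 2x the number of slave processors (i.e., number of tasktrackers).</description>
  </property>
  <property>
    <name>mapred.system.dir</name>
    <value>file:/hadoop/mapred/system</value>
    <final>true</final>
  </property>
  <property>
    <name>mapred.local.dir</name>
    <value>file:/hadoop/mapred/local</value>
    <final>true</final>
  </property>
</configuration>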
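In the mapred-site.xml settings, mapreduce.map.memory.mb and mapreduce.reduce.memory.mb are the container sizes requested from YARN, while the matching java.opts entries set the JVM heap that runs inside each container. The heap has to be smaller than the container so that non-heap memory still fits. The arithmetic can be sketched as follows. Both helper names are hypothetical.

```shell
# Hypothetical helper: succeed if a heap of $2 MB is strictly smaller than
# a container of $1 MB.
heap_fits() {
    [ "$2" -lt "$1" ]
}

# Hypothetical helper: print the heap ($2 MB) as a whole-number percentage
# of the container size ($1 MB), rounded down.
heap_percent() {
    echo $(( $2 * 100 / $1 ))
}
```

With the values used here, heap_percent 1536 1024 gives 66 for map tasks and heap_percent 3072 2560 gives 83 for reduce tasks, so both heaps fit inside their containers.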
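yarn-site.xml:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <description>The address of the applications manager interface in the RM.</description>
    <name>yarn.resourcemanager.address</name>
    <value>localhost:18040</value>
  </property>
  <property>
    <description>The address of the scheduler interface.</description>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>localhost:18030</value>
  </property>
  <property>
    <description>The address of the RM web application.</description>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>localhost:18088</value>
  </property>
  <property>
    <description>The address of the resource tracker interface.</description>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>localhost:8025</value>
  </property>
</configuration>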
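In the actual *-site.xml files each name/value pair is wrapped in a <property> element inside <configuration>. To double-check what a file really contains after editing, a rough lookup can be sketched with awk. get_prop is a hypothetical helper, and it assumes that each <name> and each <value> element sits on a single line.

```shell
# Hypothetical helper: print the <value> of the property named $2 in the
# Hadoop XML file $1. Prints nothing if the property is not present.
get_prop() {
    awk -v want="$2" '
        /<name>/ {
            # remember the most recent property name
            name = $0
            sub(/.*<name>[ \t]*/, "", name)
            sub(/[ \t]*<\/name>.*/, "", name)
        }
        /<value>/ && name == want {
            value = $0
            sub(/.*<value>[ \t]*/, "", value)
            sub(/[ \t]*<\/value>.*/, "", value)
            print value
            exit
        }
    ' "$1"
}
```

For example, get_prop /opt/hadoop-2.2.0/etc/hadoop/core-site.xml fs.defaultFS should print the NameNode URI configured above. Hadoop's own hdfs getconf -confKey fs.defaultFS reports the value the tools actually resolve, which also covers defaults.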
Initialize the Hadoop system:
[steve@bmc bin]$ cd /opt/hadoop-2.2.0/bin/
[steve@bmc bin]$ ./hdfs namenode -format
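Formatting creates a fresh namespace, and re-running it later would discard the existing HDFS metadata. A formatted NameNode directory contains a current/VERSION file, so a guard along the following lines can help avoid an accidental re-format. Both function names are hypothetical, and the directory argument is assumed to be the dfs.namenode.name.dir value (/hadoop/dfs/name in this setup).

```shell
# Hypothetical helper: succeed if the NameNode metadata directory $1
# already holds a formatted namespace.
namenode_formatted() {
    [ -f "$1/current/VERSION" ]
}

# Hypothetical wrapper: format only when no existing namespace is found.
# Assumes the hdfs command is on PATH.
format_if_needed() {
    if namenode_formatted "$1"; then
        echo "already formatted: $1"
    else
        hdfs namenode -format
    fi
}
```

Usage would be format_if_needed /hadoop/dfs/name in place of calling the format command directly.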
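Hadoop supports IPv6 by default, so on a Linux host with IPv6 enabled it binds its ports to IPv6 addresses. To force IPv4, edit ~/.bashrc and make sure HADOOP_OPTS contains -Djava.net.preferIPv4Stack=true, while java.library.path keeps pointing at the 64-bit native library directory:
export HADOOP_OPTS="-Djava.library.path=$HADOOP_PREFIX/lib/native-x64 -Djava.net.preferIPv4Stack=true"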
Start HDFS:
[steve@bmc sbin]$ ./start-dfs.sh
14/01/26 12:07:30 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /opt/hadoop-2.2.0/logs/hadoop-steve-namenode-bmc.out
localhost: starting datanode, logging to /opt/hadoop-2.2.0/logs/hadoop-steve-datanode-bmc.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /opt/hadoop-2.2.0/logs/hadoop-steve-secondarynamenode-bmc.out
14/01/26 12:07:49 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
After a successful start, jps shows three Hadoop processes (plus Jps itself):
[steve@bmc sbin]$ jps
4312 NameNode
4428 DataNode
4682 Jps
4576 SecondaryNameNode
Start YARN:
[steve@bmc sbin]$ ./start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /opt/hadoop-2.2.0/logs/yarn-steve-resourcemanager-bmc.out
localhost: starting nodemanager, logging to /opt/hadoop-2.2.0/logs/yarn-steve-nodemanager-bmc.out
After a successful start, jps shows five Hadoop processes (plus Jps itself):
[steve@bmc sbin]$ jps
4312 NameNode
4756 ResourceManager
4428 DataNode
4855 NodeManager
4576 SecondaryNameNode
5156 Jps
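Comparing the jps listing against the expected daemons by eye is easy to get wrong. A small filter can do it instead, as sketched below. missing_daemons is a hypothetical helper that reads jps output on standard input and assumes the "PID Name" layout shown above.

```shell
# Hypothetical helper: read `jps` output on stdin and print every expected
# Hadoop daemon whose name does not appear in the second column.
missing_daemons() {
    local listing daemon
    listing=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        if ! printf '%s\n' "$listing" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}
```

Running jps | missing_daemons prints nothing when all five daemons are up. After start-dfs.sh alone it would list ResourceManager and NodeManager.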
[steve@bmc sbin]$ hdfs dfs -mkdir /test
[steve@bmc sbin]$ hdfs dfs -copyFromLocal ~/test.txt /test/
[steve@bmc sbin]$ hdfs dfs -cat /test/test.txt
I love so much. I like apple apple is one greatest company in the world what
is your name my name is Steve but I like swimming ha ha bye byte
[steve@bmc sbin]$ hadoop jar /opt/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /test/test.txt /test/out
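To sanity-check the job result, the same counts can be computed locally from ~/test.txt and compared with the job output. The sketch below splits on whitespace, which is how the bundled wordcount example tokenizes its input, and sorts in byte order. local_wordcount is a hypothetical helper name.

```shell
# Hypothetical helper: count whitespace-separated tokens in file $1 and
# print one "word<TAB>count" line per distinct word, sorted in byte order.
local_wordcount() {
    tr -s '[:space:]' '\n' < "$1" | grep -v '^$' | LC_ALL=C sort | uniq -c |
        awk '{ print $2 "\t" $1 }'
}
```

The job writes its result as part-r-* files under the output directory, so hdfs dfs -cat '/test/out/part-r-*' should show the counts to compare against local_wordcount ~/test.txt. With more than one reducer the words are spread across several part files, so the combined listing may need sorting before the two are compared.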
If something goes wrong, check the logs under /opt/hadoop-2.2.0/logs.
Cause of the following warning:
14/01/25 20:34:39 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
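The precompiled Hadoop release downloaded directly from the Apache mirrors ships only 32-bit native libraries; to get 64-bit native libraries you must rebuild Hadoop yourself.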