
Hadoop 2.2.0 on a single Red Hat Enterprise Linux 6.4 64-bit virtual machine

Hadoop 2.2.0 installation path: /opt/hadoop-2.2.0

Java version:
[steve@bmc opt]$ java -version
java version "1.7.0_45"
Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)


User (and group) used to run Hadoop: steve:steve

 

Configure passwordless SSH login to localhost for user steve (required when running ./start-dfs.sh):
[steve@bmc ~]$ ssh-keygen -t rsa -P ""
[steve@bmc ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[steve@bmc ~]$ chmod 600 ~/.ssh/authorized_keys
[steve@bmc ~]$ chmod 600 ~/.ssh/id_rsa
[steve@bmc ~]$ ssh-add   ~/.ssh/id_rsa
Identity added: /home/steve/.ssh/id_rsa (/home/steve/.ssh/id_rsa)
[steve@bmc ~]$ ssh localhost
Last login: Thu Jan 23 22:06:46 2014 from bmc
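If ssh-add complains that it cannot connect to the authentication agent, start one first. A minimal sketch (strictly speaking, since the key has an empty passphrase, the ssh-add step is optional):

eval $(ssh-agent -s)
ssh-add ~/.ssh/id_rsa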

 

Build Hadoop 2.2.0 on a 64-bit system to obtain the 64-bit native libraries, then copy them to /opt/hadoop-2.2.0/lib/native-x64.
With the default 32-bit native libraries (/opt/hadoop-2.2.0/lib/native), the following errors appear:
14/01/27 10:52:34 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /opt/hadoop-2.2.0/lib/native/libhadoop.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
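One way to produce the 64-bit native libraries is to build them from the Hadoop 2.2.0 source tree. A sketch of the usual Maven build, assuming Maven, gcc, cmake, zlib/openssl development headers and protobuf 2.5.0 are already installed (the source path is illustrative):

cd /path/to/hadoop-2.2.0-src
mvn package -Pdist,native -DskipTests -Dtar
mkdir -p /opt/hadoop-2.2.0/lib/native-x64
cp hadoop-dist/target/hadoop-2.2.0/lib/native/* /opt/hadoop-2.2.0/lib/native-x64/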


steve's environment variables (.bashrc):
export ANT_HOME=/opt/apache-ant-1.9.2
export JAVA_HOME=/usr/java/jdk1.7.0_45
export JRE_HOME=${JAVA_HOME}/jre
export PATH=$JAVA_HOME/bin:$ANT_HOME/bin:/opt/eclipse:/opt/apache-jmeter-2.11/bin:$PATH
export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib:$JRE_HOME/lib:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

export HADOOP_PREFIX=/opt/hadoop-2.2.0 
export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin 
export HADOOP_HOME=${HADOOP_PREFIX}
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_PREFIX}/lib/native-x64   # required; otherwise strange hostname-resolution errors appear
export HADOOP_CONF_DIR=${HADOOP_PREFIX}/etc/hadoop
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX} 
export HADOOP_YARN_HOME=${HADOOP_PREFIX} 
export HADOOP_OPTS="-Djava.library.path=$HADOOP_PREFIX/lib/native-x64 -Djava.net.preferIPv4Stack=true"
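After editing .bashrc, reload it and confirm that the hadoop command resolves (a quick sanity check, not part of the original session):

source ~/.bashrc
which hadoop
hadoop version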

 

[steve@bmc opt]$ sudo chown steve:steve hadoop-2.2.0/ -R

[steve@bmc /]$ sudo mkdir /hadoop
[steve@bmc /]$ sudo chown steve:steve /hadoop/

[steve@bmc sbin]$ sudo service iptables stop (the firewall must be stopped; otherwise the NodeManager cannot start)
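On RHEL 6 the iptables service starts again at the next boot; on a throwaway test VM it can also be disabled permanently (optional):

sudo chkconfig iptables off
sudo service iptables status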

 

Edit the configuration files under /opt/hadoop-2.2.0/etc/hadoop:
core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9100</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/hadoop/tmp</value>
  </property>
</configuration>
 

 

hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/hadoop/dfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
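Before editing mapred-site.xml below, note that the Hadoop 2.2.0 distribution ships only a template for it; if the file does not exist yet, create it from the template first (standard step, assuming the default configuration directory):

cd /opt/hadoop-2.2.0/etc/hadoop
cp mapred-site.xml.template mapred-site.xml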

 

mapred-site.xml:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.job.tracker</name>
    <value>hdfs://localhost:9210</value>
    <final>true</final>
  </property>
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>1536</value>
  </property>
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx1024M</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>3072</value>
  </property>
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx2560M</value>
  </property>
  <property>
    <name>mapreduce.task.io.sort.mb</name>
    <value>512</value>
  </property>
  <property>
    <name>mapreduce.task.io.sort.factor</name>
    <value>100</value>
  </property>
  <property>
    <name>mapreduce.reduce.shuffle.parallelcopies</name>
    <value>50</value>
  </property>
  <property>
    <name>mapred.map.tasks</name>
    <value>10</value>
    <description>As a rule of thumb, use 10x the number of slaves (i.e., number of tasktrackers).</description>
  </property>
  <property>
    <name>mapred.reduce.tasks</name>
    <value>2</value>
    <description>As a rule of thumb, use 2x the number of slave processors (i.e., number of tasktrackers).</description>
  </property>
  <property>
    <name>mapred.system.dir</name>
    <value>file:/hadoop/mapred/system</value>
    <final>true</final>
  </property>
  <property>
    <name>mapred.local.dir</name>
    <value>file:/hadoop/mapred/local</value>
    <final>true</final>
  </property>
</configuration>
 

 

yarn-site.xml:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <description>The address of the applications manager interface in the RM.</description>
    <name>yarn.resourcemanager.address</name>
    <value>localhost:18040</value>
  </property>
  <property>
    <description>The address of the scheduler interface.</description>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>localhost:18030</value>
  </property>
  <property>
    <description>The address of the RM web application.</description>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>localhost:18088</value>
  </property>
  <property>
    <description>The address of the resource tracker interface.</description>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>localhost:8025</value>
  </property>
</configuration>
 

 

Initialize (format) the Hadoop filesystem:
[steve@bmc bin]$ cd /opt/hadoop-2.2.0/bin/
[steve@bmc bin]$ ./hdfs namenode -format

 

Hadoop supports IPv6 by default, so if the Linux host itself has IPv6 enabled, the daemons will bind their ports to IPv6 addresses. To avoid this, ~/.bashrc must pass -Djava.net.preferIPv4Stack=true in HADOOP_OPTS:
export HADOOP_OPTS="-Djava.library.path=$HADOOP_PREFIX/lib -Djava.net.preferIPv4Stack=true"
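Once the daemons are up (next section), you can verify that they are listening on IPv4 sockets rather than IPv6 ones (illustrative check):

netstat -plten | grep java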

 

Start DFS:
[steve@bmc sbin]$ ./start-dfs.sh
14/01/26 12:07:30 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /opt/hadoop-2.2.0/logs/hadoop-steve-namenode-bmc.out
localhost: starting datanode, logging to /opt/hadoop-2.2.0/logs/hadoop-steve-datanode-bmc.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /opt/hadoop-2.2.0/logs/hadoop-steve-secondarynamenode-bmc.out
14/01/26 12:07:49 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

 

After startup succeeds, running jps shows the three HDFS daemons:
[steve@bmc sbin]$ jps
4312 NameNode
4428 DataNode
4682 Jps
4576 SecondaryNameNode
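You can also confirm that the DataNode has registered with the NameNode (a quick check, not part of the original session):

hdfs dfsadmin -report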

 

Start YARN:
[steve@bmc sbin]$ ./start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /opt/hadoop-2.2.0/logs/yarn-steve-resourcemanager-bmc.out
localhost: starting nodemanager, logging to /opt/hadoop-2.2.0/logs/yarn-steve-nodemanager-bmc.out

 

After startup succeeds, running jps shows the five Hadoop daemons:
[steve@bmc sbin]$ jps
4312 NameNode
4756 ResourceManager
4428 DataNode
4855 NodeManager
4576 SecondaryNameNode
5156 Jps
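Similarly, you can confirm that the NodeManager has registered with the ResourceManager, either on the command line or via the RM web UI configured above (localhost:18088):

yarn node -list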

[steve@bmc sbin]$ hdfs dfs -mkdir /test
[steve@bmc sbin]$ hdfs dfs -copyFromLocal ~/test.txt /test/

[steve@bmc sbin]$ hdfs dfs -cat /test/test.txt
I love so much. I like apple apple is one greatest company in the world what
is your name my name is Steve but I like swimming ha ha bye byte

[steve@bmc sbin]$ hadoop jar /opt/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /test/test.txt /test/out
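Once the job finishes, the word counts can be read back from the output directory; with a single reducer the result normally lands in part-r-00000:

hdfs dfs -ls /test/out
hdfs dfs -cat /test/out/part-r-00000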

 

If anything goes wrong, check the logs under /opt/hadoop-2.2.0/logs.

 

Cause of the following warning:
14/01/25 20:34:39 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
The native libraries in the pre-built Hadoop release downloaded from the Apache mirrors are 32-bit; to get 64-bit support, you must rebuild them yourself.
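To check whether a given native library is 32-bit or 64-bit, inspect it with file (illustrative; the exact .so file name may differ):

file /opt/hadoop-2.2.0/lib/native/libhadoop.so.1.0.0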
