1. 基本信息

hadoop 版本 hadoop-0.20.205.0.tar.gz

操作系统 ubuntu

2. 问题

在使用Hadoop开发初期的时候遇到一个问题。每次重启系统后发现不能正常运行hadoop。必须执行 bin/hadoop namenode -format 进行格式化才能成功运行hadoop，但是也就意味着以前记录的name等数据丢失。

查询日志发现错误：

21:08:48,103 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStateMBean and NameNodeMXBean
21:08:48,125 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names occuring more than 10 times
21:08:48,129 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /tmp/hadoop-sylar/dfs/name does not exist.
21:08:48,130 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /tmp/hadoop-sylar/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:288)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:97)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:384)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:358)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:276)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:497)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1268)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1277)

3.原因

后查询文档时发现，在linux下hadoop等的各种数据保存在/tmp目录下。当重启系统后/tmp目录中的数据信息被清除，导致hadoop启动失败。当bin/hadoop namenode -format 格式化后，恢复了默认设置，即可正常启动。

4. 解决

需要在配置文件core-site.xml中指定临时目录的存储位置，现贴出修改后的配置

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://127.0.0.1:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoopdata/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>dfs.name.dir</name>
<value>/home/hadoopdata/filesystem/name</value>
<description>Determines where on the local filesystem the DFS name node should store the name table. If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy. </description>
</property>
<property>
<name>dfs.data.dir</name>
<value>/home/hadoopdata/filesystem/data</value>
<description>Determines where on the local filesystem an DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.</description>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication. The actual number of replications can be specified when the file is created. The default isused if replication is not specified in create time.</description>
</property>
</configuration>

dfs.name.dir是NameNode持久存储名字空间及事务日志的本地文件系统路径。当这个值是一个逗号分割的目录列表时，nametable数据将会被复制到所有目录中做冗余备份。

dfs.data.dir是DataNode存放块数据的本地文件系统路径，逗号分割的列表。当这个值是逗号分割的目录列表时，数据将被存储在所有目录下，通常分布在不同设备上。

(责任编辑：IT)