Yesterday I did not shut down my laptop properly, and unfortunately this crashed my NameNode. When I tried to start Hadoop with:
hadoop@fxu-t60:/usr/local/hadoop/bin$ ./start-all.sh
I did not notice anything wrong at first. But when the Eclipse plug-in tried to connect to HDFS, it reported: “Error: Call to localhost/127.0.0.1:54310 failed on connection exception: java.net.ConnectException: Connection refused!” (Note: my Hadoop is installed at /usr/local/hadoop/.)
So I checked the log file at http://localhost:50030/logs/hadoop-hadoop-namenode-fxu-t60.log:
2011-03-27 11:32:38,282 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = fxu-t60/127.0.1.1
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.3-dev
STARTUP_MSG:   build = -r ; compiled by 'hadoop' on Thu Feb 17 12:24:21 PST 2011
************************************************************/
2011-03-27 11:32:38,556 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=NameNode, port=54310
2011-03-27 11:32:38,567 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at: localhost/127.0.0.1:54310
2011-03-27 11:32:38,572 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=NameNode, sessionId=null
2011-03-27 11:32:38,574 INFO org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics: Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
2011-03-27 11:32:38,683 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=hadoop,hadoop,admin
2011-03-27 11:32:38,683 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
2011-03-27 11:32:38,683 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true
2011-03-27 11:32:38,705 INFO org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics: Initializing FSNamesystemMetrics using context object:org.apache.hadoop.metrics.spi.NullContext
2011-03-27 11:32:38,707 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStatusMBean
2011-03-27 11:32:38,786 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
java.io.EOFException
	at java.io.DataInputStream.readFully(DataInputStream.java:180)
	at java.io.DataInputStream.readLong(DataInputStream.java:399)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.readCheckpointTime(FSImage.java:561)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.getFields(FSImage.java:552)
	at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.read(Storage.java:227)
	at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.read(Storage.java:216)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:301)
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:292)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:201)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:279)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)
2011-03-27 11:32:38,788 INFO org.apache.hadoop.ipc.Server: Stopping server on 54310
2011-03-27 11:32:38,789 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.io.EOFException
	at java.io.DataInputStream.readFully(DataInputStream.java:180)
	at java.io.DataInputStream.readLong(DataInputStream.java:399)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.readCheckpointTime(FSImage.java:561)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.getFields(FSImage.java:552)
	at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.read(Storage.java:227)
	at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.read(Storage.java:216)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:301)
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:292)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:201)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:279)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)
2011-03-27 11:32:38,790 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at fxu-t60/127.0.1.1
************************************************************/
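On log dumps this long, it is easier to grep for ERROR entries than to read end to end. A small self-contained sketch, shown against a two-line excerpt of the log above; on the real machine you would replace the printf with `cat /usr/local/hadoop/logs/hadoop-hadoop-namenode-fxu-t60.log`:

```shell
# Pull only the ERROR lines out of a NameNode log. The excerpt below is
# copied from the log above so the command runs anywhere.
excerpt='2011-03-27 11:32:38,707 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStatusMBean
2011-03-27 11:32:38,786 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.'
printf '%s\n' "$excerpt" | grep 'ERROR'
```

The first ERROR line (here, the FSNamesystem initialization failure) is usually the root cause; everything after it is the shutdown fallout.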
To verify this, I ran the command jps and found that the NameNode was not running at all. I run Hadoop on my laptop for development purposes, so there is no important data to recover, but I do need Hadoop to run. My solution was to create a new HDFS to replace the old one; later I will try to figure out how to recover the data into the new HDFS. Just edit the configuration file /usr/local/hadoop/conf/core-site.xml, changing the value of the key “hadoop.tmp.dir” to a new directory, for example “/home/hadoop/datastore”. Then run the start-all script. No luck: Eclipse still could not connect to HDFS, but the error log was different this time:
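For reference, the edit is a single property in core-site.xml. A sketch of the relevant fragment, using the example path from this post (any other properties you already have, such as fs.default.name, stay untouched):

```xml
<!-- /usr/local/hadoop/conf/core-site.xml -->
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <!-- new base directory; HDFS metadata will land under
         /home/hadoop/datastore/dfs/name -->
    <value>/home/hadoop/datastore</value>
  </property>
</configuration>
```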
2011-03-27 13:48:17,701 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = fxu-t60/127.0.1.1
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.3-dev
STARTUP_MSG:   build = -r ; compiled by 'hadoop' on Thu Feb 17 12:24:21 PST 2011
************************************************************/
2011-03-27 13:48:17,990 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=NameNode, port=54310
2011-03-27 13:48:18,003 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at: localhost/127.0.0.1:54310
2011-03-27 13:48:18,008 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=NameNode, sessionId=null
2011-03-27 13:48:18,013 INFO org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics: Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
2011-03-27 13:48:18,121 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=hadoop,hadoop,admin
2011-03-27 13:48:18,121 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
2011-03-27 13:48:18,121 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true
2011-03-27 13:48:18,161 INFO org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics: Initializing FSNamesystemMetrics using context object:org.apache.hadoop.metrics.spi.NullContext
2011-03-27 13:48:18,163 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStatusMBean
2011-03-27 13:48:18,210 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /home/hadoop/datastore/dfs/name does not exist.
2011-03-27 13:48:18,218 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /home/hadoop/datastore/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:290)
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:292)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:201)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:279)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)
2011-03-27 13:48:18,220 INFO org.apache.hadoop.ipc.Server: Stopping server on 54310
2011-03-27 13:48:18,221 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /home/hadoop/datastore/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:290)
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:292)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:201)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:279)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)
2011-03-27 13:48:18,224 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at fxu-t60/127.0.1.1
************************************************************/
OK, so the new storage directory has to be formatted first. Format the new HDFS:
hadoop@fxu-t60:/usr/local/hadoop/bin$ ./hadoop namenode -format
11/03/27 13:50:26 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = fxu-t60/127.0.1.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.3-dev
STARTUP_MSG:   build = -r ; compiled by 'hadoop' on Thu Feb 17 12:24:21 PST 2011
************************************************************/
11/03/27 13:50:26 INFO namenode.FSNamesystem: fsOwner=hadoop,hadoop,admin
11/03/27 13:50:26 INFO namenode.FSNamesystem: supergroup=supergroup
11/03/27 13:50:26 INFO namenode.FSNamesystem: isPermissionEnabled=true
11/03/27 13:50:26 INFO common.Storage: Image file of size 96 saved in 0 seconds.
11/03/27 13:50:26 INFO common.Storage: Storage directory /home/hadoop/datastore/dfs/name has been successfully formatted.
11/03/27 13:50:26 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at fxu-t60/127.0.1.1
************************************************************/
hadoop@fxu-t60:/usr/local/hadoop/bin$ ./stop-all.sh
stopping jobtracker
localhost: stopping tasktracker
no namenode to stop
localhost: stopping datanode
localhost: stopping secondarynamenode
hadoop@fxu-t60:/usr/local/hadoop/bin$ ./start-all.sh
starting namenode, logging to /usr/local/hadoop/bin/../logs/hadoop-hadoop-namenode-fxu-t60.out
localhost: starting datanode, logging to /usr/local/hadoop/bin/../logs/hadoop-hadoop-datanode-fxu-t60.out
localhost: starting secondarynamenode, logging to /usr/local/hadoop/bin/../logs/hadoop-hadoop-secondarynamenode-fxu-t60.out
starting jobtracker, logging to /usr/local/hadoop/bin/../logs/hadoop-hadoop-jobtracker-fxu-t60.out
localhost: starting tasktracker, logging to /usr/local/hadoop/bin/../logs/hadoop-hadoop-tasktracker-fxu-t60.out
hadoop@fxu-t60:/usr/local/hadoop/bin$ jps
3597 SecondaryNameNode
3915 Jps
3677 JobTracker
3418 DataNode
3237 NameNode
3847 TaskTracker
Finally, the NameNode is back!
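When eyeballing jps output it is easy to mistake the SecondaryNameNode line for the NameNode. A small sketch of a check that avoids that; the check_namenode helper is my own, not part of Hadoop, and on the live machine you would run `jps | check_namenode`:

```shell
# Hypothetical helper (not part of Hadoop): reads `jps` output on stdin and
# reports whether a NameNode process is listed. The leading space in the
# pattern keeps "SecondaryNameNode" from matching.
check_namenode() {
  if grep -q ' NameNode$'; then
    echo "NameNode is running"
  else
    echo "NameNode is NOT running"
  fi
}

# Demo against the jps output captured above; prints "NameNode is running"
printf '3597 SecondaryNameNode\n3237 NameNode\n3847 TaskTracker\n' | check_namenode
```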