Hadoop NameNode failed and reset

Yesterday I did not shut down my laptop properly, and unfortunately this crashed my NameNode. When I tried to start Hadoop with:

hadoop@fxu-t60:/usr/local/hadoop/bin$ ./start-all.sh

I did not notice anything wrong at first. But when the Eclipse plug-in tried to connect to HDFS, it reported: “Error: Call to localhost/127.0.0.1:54310 failed on connection exception: java.net.ConnectException: Connection refused!” (Note: my Hadoop installation is at /usr/local/hadoop/.)
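Before digging into the logs, a quick way to confirm whether the NameNode JVM is actually up is to look for it with jps (which ships with the JDK). A minimal sketch, assuming a single-node setup like the one in this post:

```shell
# Look for a NameNode JVM among the running Java processes.
# jps ships with the JDK; if it is missing, check your Java install.
check_namenode() {
  if jps 2>/dev/null | grep -q NameNode; then
    echo "NameNode process found"
  else
    echo "NameNode process not found"
  fi
}
check_namenode
```

If the process is missing even though start-all.sh reported no errors, the NameNode died during startup and its log will say why.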

So I checked the log file at http://localhost:50030/logs/hadoop-hadoop-namenode-fxu-t60.log:

2011-03-27 11:32:38,282 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = fxu-t60/127.0.1.1
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.3-dev
STARTUP_MSG:   build =  -r ; compiled by 'hadoop' on Thu Feb 17 12:24:21 PST 2011
************************************************************/
2011-03-27 11:32:38,556 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=NameNode, port=54310
2011-03-27 11:32:38,567 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at: localhost/127.0.0.1:54310
2011-03-27 11:32:38,572 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=NameNode, sessionId=null
2011-03-27 11:32:38,574 INFO org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics: Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
2011-03-27 11:32:38,683 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=hadoop,hadoop,admin
2011-03-27 11:32:38,683 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
2011-03-27 11:32:38,683 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true
2011-03-27 11:32:38,705 INFO org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics: Initializing FSNamesystemMetrics using context object:org.apache.hadoop.metrics.spi.NullContext
2011-03-27 11:32:38,707 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStatusMBean
2011-03-27 11:32:38,786 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
java.io.EOFException
	at java.io.DataInputStream.readFully(DataInputStream.java:180)
	at java.io.DataInputStream.readLong(DataInputStream.java:399)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.readCheckpointTime(FSImage.java:561)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.getFields(FSImage.java:552)
	at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.read(Storage.java:227)
	at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.read(Storage.java:216)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:301)
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:292)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:201)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:279)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)
2011-03-27 11:32:38,788 INFO org.apache.hadoop.ipc.Server: Stopping server on 54310
2011-03-27 11:32:38,789 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.io.EOFException
	at java.io.DataInputStream.readFully(DataInputStream.java:180)
	at java.io.DataInputStream.readLong(DataInputStream.java:399)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.readCheckpointTime(FSImage.java:561)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.getFields(FSImage.java:552)
	at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.read(Storage.java:227)
	at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.read(Storage.java:216)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:301)
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:292)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:201)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:279)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)

2011-03-27 11:32:38,790 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at fxu-t60/127.0.1.1
************************************************************/

To verify this, I ran the jps command and indeed found no NameNode running at all. I run Hadoop on my laptop for development purposes only, so there is no important data to recover, but I do need Hadoop to run. My solution was to create a new HDFS to replace the old one; later, I will try to figure out how to recover data into it. Just edit the configuration file “/usr/local/hadoop/conf/core-site.xml” and change the value of the “hadoop.tmp.dir” property to a new directory, for example “/home/hadoop/datastore”. Then run the start-all script. No luck: Eclipse still could not connect to HDFS, but the error log was different this time:

2011-03-27 13:48:17,701 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = fxu-t60/127.0.1.1
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.3-dev
STARTUP_MSG:   build =  -r ; compiled by 'hadoop' on Thu Feb 17 12:24:21 PST 2011
************************************************************/
2011-03-27 13:48:17,990 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=NameNode, port=54310
2011-03-27 13:48:18,003 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at: localhost/127.0.0.1:54310
2011-03-27 13:48:18,008 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=NameNode, sessionId=null
2011-03-27 13:48:18,013 INFO org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics: Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
2011-03-27 13:48:18,121 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=hadoop,hadoop,admin
2011-03-27 13:48:18,121 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
2011-03-27 13:48:18,121 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true
2011-03-27 13:48:18,161 INFO org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics: Initializing FSNamesystemMetrics using context object:org.apache.hadoop.metrics.spi.NullContext
2011-03-27 13:48:18,163 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStatusMBean
2011-03-27 13:48:18,210 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /home/hadoop/datastore/dfs/name does not exist.
2011-03-27 13:48:18,218 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /home/hadoop/datastore/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:290)
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:292)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:201)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:279)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)
2011-03-27 13:48:18,220 INFO org.apache.hadoop.ipc.Server: Stopping server on 54310
2011-03-27 13:48:18,221 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /home/hadoop/datastore/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:290)
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:292)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:201)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:279)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)

2011-03-27 13:48:18,224 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at fxu-t60/127.0.1.1
************************************************************/
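For reference, the core-site.xml edit described above looks roughly like this. This is a sketch showing only the relevant property; the rest of the file (e.g. fs.default.name, which must stay pointed at port 54310 per the logs) is unchanged:

```xml
<!-- /usr/local/hadoop/conf/core-site.xml (only the relevant property shown) -->
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <!-- Point at a fresh directory to get a new, empty HDFS -->
    <value>/home/hadoop/datastore</value>
  </property>
</configuration>
```

The log above shows why this alone is not enough: the new storage directory /home/hadoop/datastore/dfs/name does not exist yet, so it must be formatted first.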

OK, time to format the new HDFS:

hadoop@fxu-t60:/usr/local/hadoop/bin$ ./hadoop namenode -format
11/03/27 13:50:26 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = fxu-t60/127.0.1.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.3-dev
STARTUP_MSG:   build =  -r ; compiled by 'hadoop' on Thu Feb 17 12:24:21 PST 2011
************************************************************/
11/03/27 13:50:26 INFO namenode.FSNamesystem: fsOwner=hadoop,hadoop,admin
11/03/27 13:50:26 INFO namenode.FSNamesystem: supergroup=supergroup
11/03/27 13:50:26 INFO namenode.FSNamesystem: isPermissionEnabled=true
11/03/27 13:50:26 INFO common.Storage: Image file of size 96 saved in 0 seconds.
11/03/27 13:50:26 INFO common.Storage: Storage directory /home/hadoop/datastore/dfs/name has been successfully formatted.
11/03/27 13:50:26 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at fxu-t60/127.0.1.1
************************************************************/
hadoop@fxu-t60:/usr/local/hadoop/bin$ ./stop-all.sh
stopping jobtracker
localhost: stopping tasktracker
no namenode to stop
localhost: stopping datanode
localhost: stopping secondarynamenode
hadoop@fxu-t60:/usr/local/hadoop/bin$ ./start-all.sh
starting namenode, logging to /usr/local/hadoop/bin/../logs/hadoop-hadoop-namenode-fxu-t60.out
localhost: starting datanode, logging to /usr/local/hadoop/bin/../logs/hadoop-hadoop-datanode-fxu-t60.out
localhost: starting secondarynamenode, logging to /usr/local/hadoop/bin/../logs/hadoop-hadoop-secondarynamenode-fxu-t60.out
starting jobtracker, logging to /usr/local/hadoop/bin/../logs/hadoop-hadoop-jobtracker-fxu-t60.out
localhost: starting tasktracker, logging to /usr/local/hadoop/bin/../logs/hadoop-hadoop-tasktracker-fxu-t60.out
hadoop@fxu-t60:/usr/local/hadoop/bin$ jps
3597 SecondaryNameNode
3915 Jps
3677 JobTracker
3418 DataNode
3237 NameNode
3847 TaskTracker

Finally, the NameNode is back!
