Standby NameNode is failing and only one NameNode is running

Category : HDFS

The standby NameNode is unable to start. Or, once the standby NameNode is brought up, the active NameNode goes down shortly afterwards, leaving only one live NameNode. The NameNode log shows:

FATAL namenode.FSEditLog ( – Error: flush failed for required journal (JournalAndStream(mgr=QJM to )) Timed out waiting 20000ms for a quorum of nodes to respond. 


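This can be caused by a network issue that makes the JournalNode take a long time to sync, so the NameNode does not get a response from a quorum of JournalNodes within the timeout. The following snippet from the JournalNode log shows that a sync took an unusually long time: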

WARN server.Journal ( – Sync of transaction range 187176137-187176137 took 44461ms

WARN ipc.Server ( – IPC Server handler 3 on 8485, call org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocol.journal from Call#10890 Retry#0: output error

INFO ipc.Server ( – IPC Server handler 3 on 8485 caught an exception




at org.apache.hadoop.ipc.Server.channelWrite(

at org.apache.hadoop.ipc.Server.access$1900(

at org.apache.hadoop.ipc.Server$Responder.processResponse(

at org.apache.hadoop.ipc.Server$Responder.doRespond(

at org.apache.hadoop.ipc.Server$
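To check how often this happens, the JournalNode log can be scanned for slow-sync warnings. The following is a minimal sketch, assuming the warning format shown in the excerpt above. The function name `find_slow_syncs` and the 20000 ms default threshold (taken from the timeout in the FATAL message) are illustrative choices, not part of Hadoop.

```python
import re

# Matches the JournalNode warning format shown in the log excerpt above.
SYNC_RE = re.compile(r"Sync of transaction range (\d+)-(\d+) took (\d+)ms")


def find_slow_syncs(lines, threshold_ms=20000):
    """Return (first_txid, last_txid, millis) for each sync slower than threshold_ms."""
    slow = []
    for line in lines:
        match = SYNC_RE.search(line)
        if match is None:
            continue  # not a sync-duration warning
        first_txid, last_txid, millis = (int(g) for g in match.groups())
        if millis > threshold_ms:
            slow.append((first_txid, last_txid, millis))
    return slow


# Sample input copied from the JournalNode log lines quoted above.
sample = [
    "WARN server.Journal ( – Sync of transaction range 187176137-187176137 took 44461ms",
    "INFO ipc.Server ( – IPC Server handler 3 on 8485 caught an exception",
]
print(find_slow_syncs(sample))  # [(187176137, 187176137, 44461)]
```

For a real log, pass an open file object instead of the sample list, since iterating over a file yields its lines.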



Increase the values of the following JournalNode timeout properties: = 60000 = 60000 = 60000
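Before changing anything, it may help to see which quorum journal timeouts are currently set. The sketch below assumes the properties in question follow the `dfs.qjournal.*.timeout.ms` naming used in `hdfs-default.xml` and that they are set in `hdfs-site.xml`. The helper names are hypothetical. The property in the sample, `dfs.qjournal.write-txns.timeout.ms`, is a real QJM timeout with a 20000 ms default, used here only as test input; the text above does not say which properties were meant.

```python
import xml.etree.ElementTree as ET


def qjournal_timeouts(hdfs_site_xml):
    """Map each dfs.qjournal.*.timeout.ms property in the XML text to its value in ms."""
    root = ET.fromstring(hdfs_site_xml)
    timeouts = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        is_qjm_timeout = name.startswith("dfs.qjournal.") and name.endswith(".timeout.ms")
        if is_qjm_timeout and value.isdigit():
            timeouts[name] = int(value)
    return timeouts


def below_target(timeouts, target_ms=60000):
    """Return only the timeouts that are still lower than target_ms."""
    return {name: ms for name, ms in timeouts.items() if ms < target_ms}


# Illustrative configuration text, not taken from a real cluster.
sample_xml = """
<configuration>
  <property>
    <name>dfs.qjournal.write-txns.timeout.ms</name>
    <value>20000</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
"""

found = qjournal_timeouts(sample_xml)
print(found)                # {'dfs.qjournal.write-txns.timeout.ms': 20000}
print(below_target(found))  # {'dfs.qjournal.write-txns.timeout.ms': 20000}
```

Properties that are not set in `hdfs-site.xml` fall back to their defaults and will not appear in the result, so an empty result does not mean the timeouts are already high enough.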
