HDFS balancer fails after 30 minutes when you run it through Ambari


Ambari 2.2.0 still has a bug: when you run the balancer through Ambari and it has to move many terabytes of data, the task fails after 30 minutes because of a timeout.

You will see the following error in your logs:

resource_management.core.exceptions.Fail: Execution of 'ambari-sudo.sh su hdfs -l -s /bin/bash -c 'export PATH='"'"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/sbin:/usr/sbin:/bin:/usr/bin:/var/lib/ambari-agent:/usr/hdp/current/hadoop-client/bin'"'"' ; hdfs --config /usr/hdp/current/hadoop-client/conf balancer -threshold 10'' returned 252.
16/03/08 08:42:03 INFO balancer.Balancer: Using a threshold of 10.0
16/03/08 08:42:03 INFO balancer.Balancer: namenodes = [hdfs://HDPDEVHA]
16/03/08 08:42:03 INFO balancer.Balancer: parameters = Balancer.BalancerParameters [BalancingPolicy.Node, threshold = 10.0, max idle iteration = 5, #excluded nodes = 0, #included nodes = 0, #source nodes = 0, #blockpools = 0, run during upgrade = false]
16/03/08 08:42:03 INFO balancer.Balancer: included nodes = []
16/03/08 08:42:03 INFO balancer.Balancer: excluded nodes = []
16/03/08 08:42:03 INFO balancer.Balancer: source nodes = []
Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved
16/03/08 08:42:04 INFO balancer.KeyManager: Block token params received from NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
16/03/08 08:42:04 INFO block.BlockTokenSecretManager: Setting block keys
16/03/08 08:42:04 INFO balancer.KeyManager: Update block keys every 2hrs, 30mins, 0sec
java.io.IOException: Another Balancer is running.. Exiting ...
Mar 8, 2016 8:42:04 AM Balancing took 1.27 seconds
Last login: Tue Mar 8 08:12:09 EST 2016

 

To resolve this error, raise the agent task timeout in the ambari.properties file on the Ambari server node (the value is in seconds, so 7200 allows two hours). Then restart the Ambari server and run the balancer from Ambari again.

$ vi /etc/ambari-server/conf/ambari.properties

agent.task.timeout=7200
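
If you would rather script the change than edit the file by hand, something like the following should work. This is only a sketch: set_property is a hypothetical helper name (not part of Ambari), and it assumes you run it as root on the Ambari server node with a sed that supports -i with a backup suffix. It updates the key if it is already present, appends it otherwise, and leaves a .bak copy of the original file.

```shell
# Hypothetical helper (not shipped with Ambari): set or update key=value
# in a Java-style properties file, keeping a .bak copy of the original.
set_property() {
  file="$1"
  key="$2"
  value="$3"
  # Escape dots so the key is matched literally in the regular expressions.
  key_re=$(printf '%s' "$key" | sed 's/\./\\./g')
  if grep -q "^${key_re}=" "$file"; then
    # Key already present: rewrite its value in place.
    sed -i.bak "s|^${key_re}=.*|${key}=${value}|" "$file"
  else
    # Key missing: back up the file, then append the new setting.
    cp "$file" "${file}.bak"
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Usage on the Ambari server node (left commented out on purpose):
# set_property /etc/ambari-server/conf/ambari.properties agent.task.timeout 7200
# ambari-server restart
```

After the restart, start the balancer from Ambari again and confirm that it runs past the 30 minute mark.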

 

