Rack Awareness on Hadoop
If your Hadoop cluster has more than 30-40 nodes, it is worth configuring rack awareness, because communication between two data nodes on the same rack is more efficient than communication between nodes on different racks.
It also helps reduce network traffic when reading and writing HDFS files: the NameNode chooses data nodes that are on the same rack as, or on a rack near, the client node issuing the read/write request.
The NameNode obtains this rack information by maintaining the rack ID of each data node. This concept of choosing closer data nodes based on rack information is called Rack Awareness in Hadoop.
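To make "closer" concrete, here is a minimal sketch of how distance between two nodes can be measured in a tree-shaped topology: count the hops from each node up to their closest common ancestor. Hadoop measures distance along these lines, but the function name topology_distance and the node paths below are only illustrative assumptions, not a Hadoop command or its actual implementation.

```shell
#!/bin/bash
# Illustrative only: topology_distance is a hypothetical helper, not part of Hadoop.
# Each argument is a full node path such as /default/rack_01/192.168.56.51.
topology_distance() {
    local IFS=/
    local -a a b
    # Split both paths on "/" after dropping the leading slash.
    read -r -a a <<< "${1#/}"
    read -r -a b <<< "${2#/}"
    local common=0
    # Count how many leading path components the two nodes share.
    while [ "$common" -lt "${#a[@]}" ] && [ "$common" -lt "${#b[@]}" ] \
          && [ "${a[$common]}" = "${b[$common]}" ]; do
        common=$((common + 1))
    done
    # Hops from each node up to the closest common ancestor.
    echo $(( (${#a[@]} - common) + (${#b[@]} - common) ))
}

# Two nodes on the same rack, then two nodes on different racks.
topology_distance /default/rack_01/192.168.56.51 /default/rack_01/192.168.56.53
topology_distance /default/rack_01/192.168.56.51 /default/rack_02/192.168.56.52
```

With these example paths the same-rack pair is 2 hops apart and the cross-rack pair is 4, which is why the NameNode prefers nodes on the client's own rack when it can.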
Note: A default Hadoop installation assumes that all nodes belong to the same rack.
This article explains how to make your cluster rack aware.
Step 1: Create a topology data file on the master node (i.e. the NameNode) and list each data node's IP address followed by the ID of the rack it belongs to.
[root@m1 ~]# vi topology.data
[root@m1 ~]# cat topology.data
192.168.56.51 01
192.168.56.52 02
192.168.56.53 01
192.168.56.54 02
192.168.56.55 01
192.168.56.56 02
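Since a malformed line in this file would silently send a node to the default rack, it can be worth checking the format before going further. The following is a minimal sketch that assumes every non-blank line should be exactly "IPv4-address rack-id", as in the listing above; the function name validate_topology is hypothetical.

```shell
#!/bin/bash
# Illustrative only: validate_topology is a hypothetical helper name.
# Assumes each non-blank line is "<IPv4 address> <rack id>".
# Prints every offending line and returns non-zero if any are found.
validate_topology() {
    awk '
        NF == 0 { next }    # ignore blank lines
        NF != 2 || $1 !~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ {
            printf "line %d: unexpected format: %s\n", NR, $0
            bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}

# Usage (file name taken from Step 1):
#   validate_topology topology.data && echo "topology.data looks well-formed"
```

If your data file uses hostnames instead of IP addresses, the address pattern would need to be relaxed accordingly.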
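Step 2: Now create rack-topology.sh, which looks up rack IDs in the data file above. Note that the script reads ${HADOOP_CONF}/${CTL_FILE}, which defaults to /etc/hadoop/conf/rack_topology.data, so either copy topology.data to that path or change the CTL_FILE and HADOOP_CONF defaults in the script to point at your file. Otherwise every node resolves to the default rack.
[root@m1 ~]# vi rack-topology.sh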
[root@m1 ~]# cat rack-topology.sh
#!/bin/bash
# Adjust/Add the property "net.topology.script.file.name"
# to core-site.xml with the "absolute" path to this
# file. ENSURE the file is "executable".
# Supply appropriate rack prefix
RACK_PREFIX=default
# To test, supply a hostname as script input:
if [ $# -gt 0 ]; then
  CTL_FILE=${CTL_FILE:-"rack_topology.data"}
  HADOOP_CONF=${HADOOP_CONF:-"/etc/hadoop/conf"}
  # No data file available: fall back to the default rack
  if [ ! -f "${HADOOP_CONF}/${CTL_FILE}" ]; then
    echo -n "/$RACK_PREFIX/rack "
    exit 0
  fi
  # Resolve each node passed on the command line
  while [ $# -gt 0 ] ; do
    nodeArg=$1
    # Reopen the data file on stdin so it is scanned from the top
    exec < "${HADOOP_CONF}/${CTL_FILE}"
    result=""
    while read line ; do
      ar=( $line )
      if [ "${ar[0]}" = "$nodeArg" ] ; then
        result="${ar[1]}"
      fi
    done
    shift
    if [ -z "$result" ] ; then
      echo -n "/$RACK_PREFIX/rack "
    else
      echo -n "/$RACK_PREFIX/rack_$result "
    fi
  done
else
  echo -n "/$RACK_PREFIX/rack "
fi
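Before wiring the script into Hadoop, it helps to run it by hand. Hadoop passes several addresses to the topology script in a single call and expects one rack per address, separated by whitespace, so the sketch below does the same and prints the resulting pairs. The function name check_topology_script is hypothetical, and the data file argument is used only to enumerate the addresses; the script itself still reads from ${HADOOP_CONF}/${CTL_FILE}.

```shell
#!/bin/bash
# Illustrative only: check_topology_script is a hypothetical helper name.
# Usage: check_topology_script /path/to/rack-topology.sh /path/to/topology.data
# Requires bash 4+ (mapfile).
check_topology_script() {
    local script=$1 data=$2
    local -a nodes racks
    local i
    # First column of the data file: the addresses to resolve.
    mapfile -t nodes < <(awk 'NF >= 2 { print $1 }' "$data")
    # One invocation with all addresses; split the output on whitespace.
    read -r -a racks <<< "$("$script" "${nodes[@]}")"
    if [ "${#nodes[@]}" -ne "${#racks[@]}" ]; then
        echo "expected ${#nodes[@]} racks, got ${#racks[@]}" >&2
        return 1
    fi
    for i in "${!nodes[@]}"; do
        echo "${nodes[$i]} -> ${racks[$i]}"
    done
}
```

If every address comes back as /default/rack, the script most likely could not find its data file or the entries do not match the addresses being passed in.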
Step 3: Add the following property to core-site.xml, either by editing the file directly or through Ambari.