Advanced Cluster Configuration Features

Here I tried to explain following advanced configurations :

  •  How to specify your rack topology script
  • How to perform advanced configuration of Hadoop
  • How to set up HDFS federation
  • How to enable HDFS high availability
  • How configuration management tools can help you manage a large cluster

We’ll now discuss some additional properties: These generally fall into one of several categories

  • Optimization and performance tuning
  • Capacity management
  • Access control

hdfs-site.xml

  • dfs.namenode.handler.count:  The number of threads the NameNode uses to handle RPC requests from DataNodes. Default: 10. Recommended: 10% of the number of nodes, with a floor of 10 and a ceiling of 200. Symptoms of this being set too low: ‘connection refused’ messages in DataNode logs as they try to transmit block reports to the NameNode. Specified on the NameNode.
  • dfs.permissions: If true (the default), checks file permissions. If false, permission checking is disabled (everyone can access every file). Specified on the NameNode.
  • dfs.datanode.du.reserved: The amount of space on each volume which cannot be used for HDFS block storage. Recommended: at least 10GB (See later.) Specified on each DataNode.
  • dfs.datanode.failed.volumes.tolerated: The number of volumes allowed to fail before the DataNode takes itself offline, ultimately resulting in all of its blocks being re-replicated. Default: 0, but often increased on machines with several disks. Specified on each DataNode.

core-site.xml 

  • fs.trash.interval: When a file is deleted, it is placed in a .Trash directory in the user’s home directory, rather than being immediately deleted. It is purged from HDFS after the number of minutes specified. Default: 0 (disabled). Recommended: 1440 (one day). Specified on clients and on the NameNode.
  • hadoop.tmp.dir: Base temporary directory, both on the local disk and in HDFS. Default is  /tmp/hadoop-${user.name}. Specified on all nodes.
  • io.file.buffer.size: Determines how much data is buffered during read and write operations. Should be a power of 2 of hardware page size. Default: 4096. Recommendation: 65536 (64KB). Specified on all nodes.
  • io.compression.codec: List of compression codecs that Hadoop can use for file compression. Specified on all nodes. Default is org.apache.hadoop.io.compress.DefaultCodec,or g.apache.hadoop.io.compress.GzipCodec,org.apa che.hadoop.io.compress.BZip2Codec,org.apache. hadoop.io.compress.DeflateCodec,org.apache.ha doop.io.compress.SnappyCodec

mapred-site.xml 

  • mapred.job.tracker.handler.count: Number of threads used by the JobTracker to respond to heartbeats from the TaskTrackers. Default: 10. Recommendation: approx. 4% of the number of nodes with a floor of 10 and a ceiling of 200. Specified on the JobTracker.
  • mapred.reduce.parallel.copies: Number of TaskTrackers a Reducer can connect to in parallel to transfer its data. Default: 5. Recommendation: SQRT(number_of_nodes) with a floor of 10. Specified on all TaskTracker nodes.
  • tasktracker.http.threads: The number of HTTP threads in the TaskTracker which the Reducers use to retrieve data. Default: 40. Recommendation: 80. Specified on all TaskTracker nodes.
  • mapred.reduce.slowstart.comple ted.maps: The percentage of Map tasks which must be completed before the JobTracker will schedule Reducers on the cluster. Default: 0.05. Recommendation: 0.5 to 0.8. Specified on the JobTracker.
  • mapred.jobtracker.taskScheduler: The class used by the JobTracker to determine how to schedule tasks on the cluster. Default: org.apache.hadoop.mapred.JobQu eueTaskScheduler.! Recommendation: org.apache.hadoop.mapred.FairS cheduler. (Job and task scheduling is discussed later in the course.) Specified on the JobTracker.
  • mapred.map.tasks.speculative.e xecution: Whether to allow speculative execution for Map tasks. Default: true. Recommendation: true. Specified on the JobTracker.
  • mapred.reduce.tasks.speculativ e.execution: Whether to allow speculative execution for Reduce tasks. Default: true. Recommendation: false. Specified on the JobTracker.
  • If a task is running significantly more slowly than the average speed of tasks for that job, speculative execution may occur(Another attempt to run the same task is instantiated on a different node. The results from the first completed task are used.The slower task is killed.
  • mapred.compress.map.output: Determines whether intermediate data from Mappers should be compressed before transfer across the network. Default: false. Recommendation: true. Specified on all TaskTracker nodes.
  • mapred.output.compression.type: If the output from the Reducers are SequenceFiles, determines whether to compress the SequenceFiles. Default: RECORD. Options: NONE, RECORD, BLOCK. Recommendation: BLOCK. Specified on all TaskTracker nodes.
  • io.sort.mb: The size of the buffer on the Mapper to which the Mapper writes its Key/Value pairs. Default: 100MB. Recommendation: 256MB. This allocation comes out of the task’s JVM heap space. Specified on each TaskTracker node.
  • io.sort.factor: The number of streams to merge at once when sorting files. Specified on each TaskTracker node.

Host ‘include’ and ‘exclude’ Files :

  • Optionally, specify dfs.hosts in hdfs-site.xml to point to a file listing hosts which are allowed to connect to a NameNode and act as DataNodes.
    • Similarly, mapred.hosts points to a file which lists hosts allowed to connect as TaskTrackers
  • Both files are optional
    •  If omitted, any host may connect and act as a DataNode/ TaskTracker
      • This is a possible security/data integrity issue
  •  NameNode can be forced to reread the dfs.hosts file with hadoop dfsadmin -refreshNodes
    •  No such command for the JobTracker, which has to be restarted to re-read the mapred.hosts file, so many System Administrators only create a dfs.hosts file.
  • It is possible to explicitly prevent one or more hosts from acting as DataNodes
    • Create a dfs.hosts.exclude property, and specify a filename
    • List the names of all the hosts to exclude in that file
    • These hosts will then not be allowed to connect to the NameNode
    • This is often used if you intend to decommission nodes (see later)
    • Run hadoop dfsadmin -refreshNodes to make the NameNode re-read the file
  • Similarly, mapred.hosts.exclude can be used to specify a file listing hosts which may not connect to the JobTracker
    • Not as commonly used, since the JobTracker must be restarted in order to re-read the file

Rack Topology Awareness: Recall that HDFS is ‘rack aware’

  •  Distributes blocks based on hosts’ locations
  • Administrator supplies a script which tells Hadoop which rack a node is in
    •  Should return a hierarchical ‘rack ID’ for each argument it’s passed
    • Rack ID is of the form /datacenter/rack
      •  Example: /datactr1/rack40
    • Script can use a flat file, database, etc
    • Script name is in topology.script.file.name in  core-site.xml
      • If this is blank (default), Hadoop returns a value of  /default-rack for all nodes
  • A sample rack topology script:

#!/usr/bin/env python
import sys
DEFAULT_RACK = “/datacenter1/default-rack”
HOST_RACK_FILE = “/etc/hadoop/conf/host-rack.map”
host_rack = {}
for line in open(HOST_RACK_FILE):
(host, rack) = line.split()
host_rack[host] = rack

for host in sys.argv[1:]:
if host in host_rack:
print host_rack[host]
else:
print DEFAULT_RACK

  • The /etc/hadoop/conf/host-rack.map file:

host1 /datacenter1/rack1
host2 /datacenter1/rack1
host3 /datacenter1/rack1
host4 /datacenter1/rack1
host5 /datacenter1/rack2
host6 /datacenter1/rack2
host7 /datacenter1/rack2
host8 /datacenter1/rack2

  • A common scenario is to name your hosts in such a way that the Rack Topology Script can easily determine their location
    • Example: a host called r1m32
      •  32nd machine in Rack 1
    • The Rack Topology Script can simply deconstruct the machine name and then return the rack awareness information

A Note on DNS vs IP Addresses: 

  • You can use machine names or IP addresses to identify nodes in Hadoop’s configuration files
  •  You should use one or the other, but not a combination!
    •  Hadoop performs both forward and reverse lookups on IP addresses in different situations; if the results don’t match, it could cause major problems
  • Most people use names rather than IP addresses
    •  This means you must ensure DNS is configured correctly on your cluster
    • Just using the /etc/hosts file on each node will cause configuration headaches as the cluster grows

 


4 Comments

Sahil Somkuwar

January 24, 2017 at 11:13 pm

Such a great and informative blog. Haven’t found more genuine and helpful easy to understand articles. That too very few being written for Hadoop Administrators. Thank you so much. Keep up the good work.

    admin

    January 25, 2017 at 7:46 am

    Thank you very much Sahil for your kind words.
    We will defiantly try to make it more informative and useful. Please keep visiting and give your valuable feedback. Also if you need any help on any topics then let us know, we will be happy to help you.

Vittal Jadhav

February 3, 2017 at 1:02 am

Hi Saurabh,
Excellent website and blogs. Very good and precise information for the viewer. The site content is well organized for Dev, Admin, Architect. The details about admin, data processing and so on are fabulous.
Thanks a lot for creating this site and articles. Keep adding details and information.

Superb work once again!

Thanks,
Vittal Jadhav

    admin

    February 3, 2017 at 11:56 am

    Thanks a lot Vittal for your valuable feedback, these feedback means a lot for me and give me motivation to go ahead with more work.
    I will try my best to create other useful blogs/pages, but in case if you need any specific topic details on anything please feel free to post your doubts/questions.

Leave a Reply