How to Enable Node Labels in Your Cluster


Node Label:

This post describes how to use node labels to run YARN and other applications on cluster nodes that carry a specified node label. A node label can be exclusive or shareable:

  • Exclusive: access is restricted to applications running in queues associated with the node label.
  • Shareable: if idle capacity is available on the labeled node, its resources are shared with all applications in the cluster.

Note: (Queues without Node Labels) If no node label is assigned to a queue, the applications submitted to that queue can run on any node without a node label, and on nodes with shareable node labels if idle resources are available.

Preemption: Labeled applications that request labeled resources preempt non-labeled applications on labeled nodes. If a labeled resource is not explicitly requested, the normal rules of preemption apply. Non-labeled applications cannot preempt labeled applications running on labeled nodes.

Configuring Node Labels: To enable node labels, make the following configuration changes on the YARN Resource Manager hosts.

Step 1: Create a Label Directory in HDFS:

Use the following commands to create a “node-labels” directory in which to store the node labels in HDFS.

$ sudo su hdfs

$ hadoop fs -mkdir -p /yarn/node-labels

$ hadoop fs -chown -R yarn:yarn /yarn

$ hadoop fs -chmod -R 700 /yarn

 

Note: -chmod -R 700 restricts access to /yarn, including the “node-labels” directory, to the yarn user.

You can then use the following command to confirm that the directory was created in HDFS.

$ hadoop fs -ls /yarn

The new node label directory should appear in the list returned by this command. The owner should be yarn, and the permissions should be drwx------.

Found 1 items

drwx------   - yarn yarn          0 2014-11-24 13:09 /yarn/node-labels
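The ownership and permission check can also be scripted. The following is a minimal sketch, assuming the usual `hadoop fs -ls` column order (permissions, replication, owner, group, ...); `check_ls_line` is a hypothetical helper, not a Hadoop command, and the sample line is the one shown in this step.

```shell
# check_ls_line LINE PERMS OWNER
# Hypothetical helper: succeeds when one line of `hadoop fs -ls` output
# shows the expected permission string (field 1) and owner (field 3).
check_ls_line() {
  local line="$1" want_perm="$2" want_owner="$3"
  local perm owner
  perm=$(printf '%s\n' "$line" | awk '{print $1}')
  owner=$(printf '%s\n' "$line" | awk '{print $3}')
  [ "$perm" = "$want_perm" ] && [ "$owner" = "$want_owner" ]
}

# Sample line copied from the expected listing in this step.
sample='drwx------   - yarn yarn          0 2014-11-24 13:09 /yarn/node-labels'

if check_ls_line "$sample" 'drwx------' 'yarn'; then
  echo "ok: directory is owned by yarn with mode 700"
else
  echo "unexpected owner or permissions" >&2
fi
```

On a live cluster the sample line would be replaced by the matching line from `hadoop fs -ls /yarn`.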

Use the following commands to create a /user/yarn directory that is required by the distributed shell.

$ hadoop fs -mkdir -p /user/yarn

$ hadoop fs -chown -R yarn:yarn /user/yarn

$ hadoop fs -chmod -R 700 /user/yarn

 

Step 2: Configure YARN for Node Labels

Add the following properties to the /etc/hadoop/conf/yarn-site.xml file on the ResourceManager host.

Set the following property to enable node labels:

<property>

     <name>yarn.node-labels.enabled</name>

     <value>true</value>

</property>

Set the following property to reference the HDFS node label directory (replace the NameNode host and port with those of your cluster):

<property>

     <name>yarn.node-labels.fs-store.root-dir</name>

     <value>hdfs://lxhdpmastinf001.lowes.com:8020/yarn/node-labels/</value>

</property>
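Before restarting the ResourceManager it can help to confirm that both properties are really present in the file. The sketch below is one way to do that with awk. `get_site_prop` is a hypothetical helper, and it assumes the simple layout used in this post (one `<name>` and one `<value>` element per line); it is not a general XML parser.

```shell
# get_site_prop FILE NAME
# Hypothetical helper: print the <value> that follows <name>NAME</name>,
# with surrounding whitespace trimmed. Prints nothing if NAME is absent.
get_site_prop() {
  awk -v want="$2" '
    /<name>/ {
      name = $0
      sub(/.*<name>[ \t]*/, "", name)
      sub(/[ \t]*<\/name>.*/, "", name)
    }
    /<value>/ && name == want {
      val = $0
      sub(/.*<value>[ \t]*/, "", val)
      sub(/[ \t]*<\/value>.*/, "", val)
      print val
      exit
    }
  ' "$1"
}

# Demonstration against a throwaway file rather than the real config.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<configuration>
  <property>
    <name>yarn.node-labels.enabled</name>
    <value>true</value>
  </property>
</configuration>
EOF
get_site_prop "$tmp" yarn.node-labels.enabled   # prints: true
rm -f "$tmp"
```

On the ResourceManager host the same function would be pointed at /etc/hadoop/conf/yarn-site.xml, once for each of the two property names above.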

 

Step 3: Start or Restart the YARN ResourceManager

For the changes in yarn-site.xml to take effect, restart the YARN ResourceManager if it is running, or start it if it is not. To start or stop the ResourceManager, use the applicable commands in the “Controlling HDP Services Manually” section of the HDP Reference Guide.

Step 4: Add Node Labels

Use the following command format to add node labels. Run these commands as the yarn user. Node labels must be added before they can be assigned to nodes and associated with queues.

$ sudo su yarn

$ yarn rmadmin -addToClusterNodeLabels "<label1>(exclusive=<true|false>),<label2>(exclusive=<true|false>)"

Note: If exclusive is not specified, the default value is true. For example, the following commands add the node label “spark_nl1” as exclusive, and “spark_nl2” and “spark_nl3” as shareable (non-exclusive).

$ sudo su yarn

$ yarn rmadmin -addToClusterNodeLabels "spark_nl1(exclusive=true),spark_nl2(exclusive=false),spark_nl3(exclusive=false)"

 

You can use the yarn cluster --list-node-labels command to confirm that node labels have been added:

[yarn@localhost ~]$ yarn cluster --list-node-labels

16/04/27 03:19:01 INFO impl.TimelineClientImpl: Timeline service address: http://localhost:8188/ws/v1/timeline/

16/04/27 03:19:02 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2

Node Labels: <spark_nl1:exclusivity=true>,<spark_nl2:exclusivity=false>,<spark_nl3:exclusivity=false>
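A stray space or a mistyped exclusivity flag in the label list is easy to miss, so the argument for -addToClusterNodeLabels can be assembled and checked before it is passed to yarn. The sketch below only builds and prints the command. `build_label_spec` is a hypothetical helper, and the name check assumes that label names are limited to letters, digits, hyphens and underscores.

```shell
# build_label_spec NAME:EXCLUSIVE ...
# Hypothetical helper: turn pairs such as spark_nl1:true into the
# comma-separated "name(exclusive=bool)" list used in Step 4.
build_label_spec() {
  local spec="" pair name excl
  for pair in "$@"; do
    name=${pair%%:*}
    excl=${pair#*:}
    # Assumption: only letters, digits, '-' and '_' are valid in a label name.
    case "$name" in
      ''|*[!A-Za-z0-9_-]*)
        echo "invalid label name: '$name'" >&2
        return 1 ;;
    esac
    case "$excl" in
      true|false) ;;
      *)
        echo "exclusive must be true or false: '$pair'" >&2
        return 1 ;;
    esac
    spec="${spec:+$spec,}${name}(exclusive=${excl})"
  done
  printf '%s\n' "$spec"
}

# Print the command instead of running it, so it can be reviewed first.
spec=$(build_label_spec spark_nl1:true spark_nl2:false spark_nl3:false)
echo "yarn rmadmin -addToClusterNodeLabels \"$spec\""
```

A pair such as "spark nl2:false" is rejected by the name check, which catches a space typed in place of an underscore.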

 

Note: Use the following command format to remove node labels. A node label cannot be removed while it is associated with a queue:

$ yarn rmadmin -removeFromClusterNodeLabels "<label1>,<label2>"

 

Step 5: Assign Node Labels to Cluster Nodes

Use the following command format to add or replace node label assignments on cluster nodes:

$ yarn rmadmin -replaceLabelsOnNode "<node1>:<port>=<label1> <node2>:<port>=<label2>"

For example, the following commands assign node label “spark_nl1” to “localhost1” and node label “spark_nl2” to “localhost2”. The last command then replaces the label on “localhost1” with “spark_nl3”.

$ sudo su yarn

$ yarn rmadmin -replaceLabelsOnNode "localhost1=spark_nl1 localhost2=spark_nl2"

$ yarn rmadmin -replaceLabelsOnNode "localhost1=spark_nl3"

 

Note: You can only assign one node label to each node. Also, if you do not specify a port, the node label change will be applied to all NodeManagers on the host.

To remove node label assignments from a node, use -replaceLabelsOnNode, but do not specify any labels. For example, you would use the following commands to remove the label from localhost1:

$ sudo su yarn

$ yarn rmadmin -replaceLabelsOnNode "localhost1"
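The node-to-label mapping for Step 5 can be assembled the same way. This sketch builds the argument for -replaceLabelsOnNode and enforces the one-label-per-node rule from the note above. `build_node_mapping` is a hypothetical helper, and it assumes a comma would be the separator if several labels were listed for one node.

```shell
# build_node_mapping HOST[:PORT][=LABEL] ...
# Hypothetical helper: join entries into the space-separated mapping string.
# An entry without "=LABEL" clears the label on that node.
build_node_mapping() {
  local mapping="" entry node label
  for entry in "$@"; do
    node=${entry%%=*}
    label=${entry#*=}
    # With no "=" in the entry, both expansions return the whole entry.
    if [ "$label" = "$entry" ]; then
      label=""
    fi
    case "$label" in
      *,*)
        echo "only one label per node: '$entry'" >&2
        return 1 ;;
    esac
    mapping="${mapping:+$mapping }${node}${label:+=$label}"
  done
  printf '%s\n' "$mapping"
}

# Print the commands instead of running them.
echo "yarn rmadmin -replaceLabelsOnNode \"$(build_node_mapping localhost1=spark_nl1 localhost2=spark_nl2)\""
echo "yarn rmadmin -replaceLabelsOnNode \"$(build_node_mapping localhost1)\""
```

The second printed command is the label-removal form: the node name with no label after it.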

 

Step 6: Associate Node Labels with Queues

Now that the node labels exist, associate them with queues in the /etc/hadoop/conf/capacity-scheduler.xml file.

You must specify a capacity for each node label that a queue can access. At every level of the hierarchy, the capacities that the direct children of a parent queue hold for a given node label must sum to 100%. The node labels that a queue can access (its accessible node labels) must be the same as, or a subset of, the accessible node labels of its parent queue.

 Example:

Assume that a cluster has a total of 8 nodes. The first 3 nodes (n1-n3) have node label=x, the next 3 nodes (n4-n6) have node label=y, and the final 2 nodes (n7, n8) do not have any node labels. Each node can run 10 containers.

The queue hierarchy is as follows:

[Figure: queue hierarchy diagram]

 

The batch queue can access node labels x and y; the user queue can access only node label y.

capacity(batch) = 30, capacity(batch, label=x) = 100, capacity(batch, label=y) = 50; capacity(user) = 40, capacity(user, label=y) = 50

The child queues of user are ado, adop and di: capacity(user.ado) = 40, capacity(user.ado, label=x) = 40, capacity(user.adop) = 60, capacity(user.adop, label=x) = 40, capacity(user.di) = 20, capacity(user.di, label=x) = 10
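As a rough illustration of how such capacities translate into capacity-scheduler.xml, the sketch below prints property elements for the two top-level queues and checks the 100% rule for sibling queues. The property names follow the Capacity Scheduler pattern yarn.scheduler.capacity.<queue-path>.accessible-node-labels.<label>.capacity. `emit_prop`, `queue_props` and `sums_to_100` are hypothetical helpers. The label capacities (x=100 and y=50 for batch, y=50 for user) are taken from the example, while the no-label capacities 60 and 40 are illustrative values chosen so that the siblings sum to 100.

```shell
# emit_prop NAME VALUE: print one <property> element.
emit_prop() {
  printf '<property>\n  <name>%s</name>\n  <value>%s</value>\n</property>\n' "$1" "$2"
}

# queue_props QUEUE_PATH CAPACITY [LABEL=CAPACITY ...]
# Hypothetical helper: print the capacity, the accessible labels, and the
# per-label capacities for one queue.
queue_props() {
  local path="$1" cap="$2" labels="" pair
  shift 2
  emit_prop "yarn.scheduler.capacity.${path}.capacity" "$cap"
  for pair in "$@"; do
    labels="${labels:+$labels,}${pair%%=*}"
  done
  if [ -n "$labels" ]; then
    emit_prop "yarn.scheduler.capacity.${path}.accessible-node-labels" "$labels"
  fi
  for pair in "$@"; do
    emit_prop "yarn.scheduler.capacity.${path}.accessible-node-labels.${pair%%=*}.capacity" "${pair#*=}"
  done
}

# sums_to_100 N1 N2 ...: succeed when the sibling capacities add up to 100.
sums_to_100() {
  printf '%s\n' "$@" |
    awk '{ s += $1 } END { d = s - 100; if (d < 0) d = -d; exit (d < 0.001 ? 0 : 1) }'
}

queue_props root.batch 60 x=100 y=50
queue_props root.user  40 y=50

sums_to_100 60 40 && echo "no-label capacities of batch and user sum to 100"
sums_to_100 50 50 && echo "label y capacities of batch and user sum to 100"
```

The printed elements would go inside the <configuration> element of capacity-scheduler.xml. Running the same sum check for the children of each parent queue, once per label, is a quick way to catch a hierarchy that violates the 100% rule.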

In this way you can configure node labels with Capacity Scheduler queues.

Test Cases: You can use the sample commands below to test your node labels and Capacity Scheduler queues.

Example 1: /usr/hdp/2.3.4.0-3485/spark/bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --queue batch /usr/hdp/2.3.4.0-3485/spark/lib/spark-examples-1.5.2.2.3.4.0-3485-hadoop2.7.1.2.3.4.0-3485.jar 10

Example 2: /usr/hdp/2.3.4.0-3485/spark/bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --queue ado /usr/hdp/2.3.4.0-3485/spark/lib/spark-examples-1.5.2.2.3.4.0-3485-hadoop2.7.1.2.3.4.0-3485.jar 10

Example 3: /usr/hdp/2.3.4.0-3485/spark/bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --queue adospark /usr/hdp/2.3.4.0-3485/spark/lib/spark-examples-1.5.2.2.3.4.0-3485-hadoop2.7.1.2.3.4.0-3485.jar 10

Example 4: /usr/hdp/2.3.4.0-3485/spark/bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --queue di /usr/hdp/2.3.4.0-3485/spark/lib/spark-examples-1.5.2.2.3.4.0-3485-hadoop2.7.1.2.3.4.0-3485.jar 10

Example 5: /usr/hdp/2.3.4.0-3485/spark/bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --queue dispark --num-executors 5 --conf spark.executor.memory=5g --conf spark.driver.memory=2g --conf spark.driver.cores=2 --conf spark.executor.cores=2 --conf spark.executor.instances=2 /usr/hdp/2.3.4.0-3485/spark/lib/spark-examples-1.5.2.2.3.4.0-3485-hadoop2.7.1.2.3.4.0-3485.jar 10

Example 6: /usr/hdp/2.3.4.0-3485/spark/bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --queue ea --num-executors 5 /usr/hdp/2.3.4.0-3485/spark/lib/spark-examples-1.5.2.2.3.4.0-3485-hadoop2.7.1.2.3.4.0-3485.jar 10
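The test commands differ only in the queue name and a few extra options, so they can be generated from one template. The sketch below prints the command lines rather than running them. `spark_pi_cmd` is a hypothetical helper, and the default paths are the HDP 2.3.4 paths used in the examples; both can be overridden through the SPARK_HOME and EXAMPLES_JAR variables.

```shell
# Default locations copied from the examples above; override as needed.
SPARK_HOME=${SPARK_HOME:-/usr/hdp/2.3.4.0-3485/spark}
EXAMPLES_JAR=${EXAMPLES_JAR:-$SPARK_HOME/lib/spark-examples-1.5.2.2.3.4.0-3485-hadoop2.7.1.2.3.4.0-3485.jar}

# spark_pi_cmd QUEUE [extra spark-submit options...]
# Hypothetical helper: print the SparkPi submit command for one queue.
spark_pi_cmd() {
  local queue="$1" extra=""
  shift
  if [ "$#" -gt 0 ]; then
    extra=" $*"
  fi
  printf '%s\n' "$SPARK_HOME/bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --queue ${queue}${extra} $EXAMPLES_JAR 10"
}

# One command per queue under test, plus one with an extra option.
for q in batch ado di; do
  spark_pi_cmd "$q"
done
spark_pi_cmd ea --num-executors 5
```

Each printed line can be reviewed and then pasted into a shell on a node that has the Spark client installed.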

 

I hope this helps you understand node labels and the Capacity Scheduler. If you have any suggestions or feedback, please feel free to write to me.


2 Comments

salteh

March 13, 2017 at 9:55 am

making node-labels exclusive doesn’t work on hadoop 2.7.2.
it throws an error for name of cluster

    admin

    March 13, 2017 at 12:10 pm

    Hello Salteh,

    We have configured it on Hadoop 2.7.1.2 and it is working fine, so it should also work on higher versions.
    To understand the actual root cause, can you share screenshots of the error?
