Attempt to add *.jar multiple times to the distributed cache


When you submit a Spark2 action through Oozie, the job may fail with the following exception in the logs:

exception: Attempt to add (hdfs://m1:8020/user/oozie/share/lib/lib_20171129113304/oozie/aws-java-sdk-core-1.10.6.jar) multiple times to the distributed cache.

java.lang.IllegalArgumentException: Attempt to add (hdfs://m1:8020/user/oozie/share/lib/lib_20171129113304/oozie/aws-java-sdk-core-1.10.6.jar) multiple times to the distributed cache.

This error occurs because the same jar files exist in both sharelib locations: /user/oozie/share/lib/lib_20171129113304/oozie/ and /user/oozie/share/lib/lib_20171129113304/spark2/.
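To confirm which jars are duplicated before changing anything, the two directory listings can be compared. The following is a minimal sketch, assuming a POSIX shell and the usual `hdfs dfs -ls` output in which the path is the last field on each line. The function names `jar_names` and `common_jars` and the /tmp file names are illustrative and are not part of Oozie or Hadoop.

```shell
#!/usr/bin/env bash
# Print the jar file names found in `hdfs dfs -ls` output read from stdin.
# Assumption: the path is the last whitespace-separated field on each line.
jar_names() {
  awk '$NF ~ /\.jar$/ { n = split($NF, parts, "/"); print parts[n] }' | sort -u
}

# Print the names that appear in both files (each file: sorted, one name per line).
common_jars() {
  comm -12 "$1" "$2"
}

# Example usage (replace <timestamp> with the real directory name first):
#   hdfs dfs -ls /user/oozie/share/lib/lib_<timestamp>/oozie  | jar_names > /tmp/oozie_jars
#   hdfs dfs -ls /user/oozie/share/lib/lib_<timestamp>/spark2 | jar_names > /tmp/spark2_jars
#   common_jars /tmp/oozie_jars /tmp/spark2_jars
```

Any name printed by `common_jars` exists in both directories and is a candidate for the error above.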

Solution:

Delete the duplicate jars from the Spark2 directory so that only one copy of each remains, in the Oozie directory.

  1. Identify the current Oozie sharelib directory (lib_<timestamp>) by running:
    hdfs dfs -ls /user/oozie/share/lib/
  2. List all jar files in the oozie directory and save their names to /tmp/list, replacing <timestamp> with the value from step 1:
    hdfs dfs -ls /user/oozie/share/lib/lib_<timestamp>/oozie | awk -F/ '{print $8}' > /tmp/list
  3. Delete every jar file in the spark2 directory whose name matches one in the oozie directory:
    for f in $(cat /tmp/list); do echo "$f"; hdfs dfs -rm -skipTrash "/user/oozie/share/lib/lib_<timestamp>/spark2/$f"; done
  4. Restart the Oozie service.
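Because -skipTrash deletes the files permanently, it can help to preview the deletions first. Below is a minimal sketch of step 3 with a dry-run switch, assuming a POSIX shell. The function name `remove_spark2_duplicates` and the `DRY_RUN` variable are hypothetical helpers introduced here for illustration; only `hdfs dfs -rm -skipTrash` comes from the steps above.

```shell
#!/usr/bin/env bash
# Delete from <sharelib>/spark2 every jar named in a list file.
# DRY_RUN defaults to 1, so the commands are only printed unless DRY_RUN=0 is set.
remove_spark2_duplicates() {
  sharelib=$1   # sharelib directory, e.g. /user/oozie/share/lib/lib_20171129113304
  list=$2       # file with one jar name per line, e.g. /tmp/list from step 2
  while IFS= read -r jar; do
    [ -n "$jar" ] || continue   # skip blank lines (the "Found N items" header leaves one)
    target="$sharelib/spark2/$jar"
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "would run: hdfs dfs -rm -skipTrash $target"
    else
      # Redirect stdin so the command cannot consume the rest of the list.
      hdfs dfs -rm -skipTrash "$target" < /dev/null
    fi
  done < "$list"
}

# Example usage:
#   remove_spark2_duplicates /user/oozie/share/lib/lib_20171129113304 /tmp/list            # preview
#   DRY_RUN=0 remove_spark2_duplicates /user/oozie/share/lib/lib_20171129113304 /tmp/list  # delete
```

Names in the list that do not exist under spark2 simply produce a "No such file or directory" message from hdfs, as they do with the one-line loop in step 3.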

Thanks for visiting this blog. Please feel free to share your feedback.

