Spark job runs successfully in client mode but fails in cluster mode

You may have a PySpark application that runs successfully in both local and yarn-client modes, yet fails when submitted in cluster mode with one of the following errors:

  1. Error 1: Exception: ("You must build Spark with Hive. Export 'SPARK_HIVE=true' and run build/sbt assembly", Py4JJavaError(u'An error occurred while calling None.org.apache.spark.sql.hive.HiveContext.\n', JavaObject id=o52))
  2. Error 2: INFO Client: Deleting staging directory .sparkStaging/application_1476997468030_139760
    Exception in thread "main" org.apache.spark.SparkException: Application application_1476997468030_139760 finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:974)
  3. Error 3: ERROR yarn.ApplicationMaster: User class threw exception: java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
    Caused by: java.lang.ClassNotFoundException: org.datanucleus.api.jdo.JDOPersistenceManagerFactory
  4. Error 4: INFO ApplicationMaster: Final app status: FAILED, exitCode: 1, (reason: User application exited with status 1)
    17/08/22 04:56:19 ERROR ApplicationMaster: Uncaught exception:
    org.apache.spark.SparkException: Exception thrown in awaitResult:
    at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:194)
    at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:401)
    at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:254)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:766)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:67)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:66)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
    at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:764)
    at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
    Caused by: org.apache.spark.SparkUserAppException: User application exited with 1

Root cause: If you are on the HDP stack, you might be hitting a known issue with HDP 2.3.2 and Ambari 2.2.1 (https://hortonworks.jira.com/browse/BUG-56393): starting with Ambari 2.2.1, Ambari does not manage the Spark version when the HDP stack is older than HDP 2.3.4.
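
One quick way to rule this out is to print the Spark version and effective configuration from inside a cluster-mode job and compare it with what you see in client mode. The snippet below is only an illustrative diagnostic (the name check_version.py is made up here); in cluster mode its output appears in the YARN container logs, not on your terminal:

# check_version.py -- illustrative diagnostic, submit with:
#   spark-submit --master yarn --deploy-mode cluster check_version.py
from pyspark import SparkConf, SparkContext

sc = SparkContext(conf=SparkConf().setAppName("version-check"))

# The Spark version the driver (running inside the YARN ApplicationMaster) sees
print("Spark version:", sc.version)

# Effective configuration, useful for spotting a mismatched jar or conf directory
for key, value in sorted(sc.getConf().getAll()):
    print(key, "=", value)

sc.stop()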

If you are not hitting that bug, then you are most likely missing the DataNucleus driver jars and the Hive configuration, which must be passed explicitly on the spark-submit command line in cluster mode.

Resolution: Use the following steps to resolve the issue:

  • Check the contents of hive-site.xml; for Spark it should at least point to the Hive metastore, as in the example below.
  • Pass hive-site.xml with --files so that it is shipped to the driver's container and Spark can read the Hive configuration. Make sure --files comes before your application .jar file (a quick verification sketch follows this list).
  • Add the DataNucleus jars with the --jars option when you submit.
  • Example hive-site.xml:
    <configuration>
      <property>
        <name>hive.metastore.uris</name>
        <value>thrift://sandbox.hortonworks.com:9083</value>
      </property>
    </configuration>
  • The spark-submit command sequence:
    spark-submit \
    --class <Your.class.name> \
    --master yarn-cluster \
    --num-executors 1 \
    --driver-memory 1g \
    --executor-memory 1g \
    --executor-cores 1 \
    --files /usr/hdp/current/spark-client/conf/hive-site.xml \
    --jars /usr/hdp/current/spark-client/lib/datanucleus-api-jdo-3.2.6.jar,/usr/hdp/current/spark-client/lib/datanucleus-rdbms-3.2.9.jar,/usr/hdp/current/spark-client/lib/datanucleus-core-3.2.10.jar \
    target/YOUR_JAR-1.0.0-SNAPSHOT.jar "show tables"
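
Since --files ships hive-site.xml into the YARN container's working directory, you can verify from inside a cluster-mode driver that the file actually arrived and points at the expected metastore. The snippet below is only an illustrative check, reusing the file and property names shown above; its output appears in the YARN container logs:

# Illustrative check: confirm hive-site.xml was distributed via --files
import os
import xml.etree.ElementTree as ET

path = 'hive-site.xml'  # --files places it in the container's working directory
if not os.path.exists(path):
    print('hive-site.xml not found - check that --files is set and comes before the application jar')
else:
    root = ET.parse(path).getroot()
    for prop in root.findall('property'):
        if prop.findtext('name') == 'hive.metastore.uris':
            print('hive.metastore.uris =', prop.findtext('value'))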

Alternatively, the complete command can be:

spark-submit --master yarn --deploy-mode cluster --queue di \
  --jars /usr/hdp/current/spark-client/lib/datanucleus-rdbms-3.2.9.jar,/usr/hdp/current/spark-client/lib/datanucleus-core-3.2.10.jar,/usr/hdp/current/spark-client/lib/datanucleus-api-jdo-3.2.6.jar \
  --conf "spark.yarn.appMasterEnv.PATH=/opt/rh/rh-python34/root/usr/bin${PATH:+:${PATH}}" \
  --conf "spark.yarn.appMasterEnv.LD_LIBRARY_PATH=/opt/rh/rh-python34/root/usr/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}" \
  --conf "spark.yarn.appMasterEnv.MANPATH=/opt/rh/rh-python34/root/usr/share/man:${MANPATH}" \
  --conf "spark.yarn.appMasterEnv.XDG_DATA_DIRS=/opt/rh/rh-python34/root/usr/share${XDG_DATA_DIRS:+:${XDG_DATA_DIRS}}" \
  --conf "spark.yarn.appMasterEnv.PKG_CONFIG_PATH=/opt/rh/rh-python34/root/usr/lib64/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}" \
  --conf "spark.executorEnv.PATH=/opt/rh/rh-python34/root/usr/bin${PATH:+:${PATH}}" \
  --conf "spark.executorEnv.LD_LIBRARY_PATH=/opt/rh/rh-python34/root/usr/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}" \
  --conf "spark.executorEnv.MANPATH=/opt/rh/rh-python34/root/usr/share/man:${MANPATH}" \
  --conf "spark.executorEnv.XDG_DATA_DIRS=/opt/rh/rh-python34/root/usr/share${XDG_DATA_DIRS:+:${XDG_DATA_DIRS}}" \
  --conf "spark.executorEnv.PKG_CONFIG_PATH=/opt/rh/rh-python34/root/usr/lib64/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}" \
  hive.py
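
The spark.yarn.appMasterEnv.* and spark.executorEnv.* settings above point the ApplicationMaster (which hosts the driver in cluster mode) and the executors at a Software Collections Python 3.4 installation. A rough way to confirm they took effect, not part of the original command, is a check like this:

# Illustrative env check: which Python and PATH the driver and executors actually get.
# Output appears in the YARN container logs when run in cluster mode.
import os
import sys
from pyspark import SparkConf, SparkContext

sc = SparkContext(conf=SparkConf().setAppName('env-check'))

print('driver python:', sys.executable)
print('driver PATH:', os.environ.get('PATH'))

def executor_env(_):
    import os, sys
    return (sys.executable, os.environ.get('PATH'))

# Run a single task so one executor reports its environment
print('executor env:', sc.parallelize([0], 1).map(executor_env).collect()[0])

sc.stop()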

where the hive.py referenced above contains the following:

[adebatch@server1 ~]$ cat hive.py 
from pyspark import SparkContext, SparkConf
from pyspark.sql import HiveContext

# Build the context; HiveContext picks up hive-site.xml shipped via --files
conf = SparkConf()
sc = SparkContext(conf=conf)
hiveCtx = HiveContext(sc)

# Any Hive query exercises the metastore connection
result = hiveCtx.sql('show databases')
#result = hiveCtx.sql('select * from default.table1 limit 1')
result.show()

# Persist the result to HDFS as plain text
result.write.save('/tmp/pyspark', format='text', mode='overwrite')
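
For reference, on Spark 2.x the same smoke test is usually written with SparkSession instead of HiveContext; a minimal sketch (not part of the original hive.py) would be:

# Spark 2.x equivalent: SparkSession with Hive support enabled
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName('hive-check')
         .enableHiveSupport()   # picks up hive-site.xml from --files / the conf dir
         .getOrCreate())

spark.sql('show databases').show()
spark.stop()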

Please feel free to share your feedback.