Wednesday, 10 August 2016

Running Spark Jobs on Yarn Cluster

Submitting a Spark job in YARN cluster mode


Solution : 


If you are using Spark with HDP or HDInsight, you need to do the following:
  1. Add these entries to $SPARK_HOME/conf/spark-defaults.conf, replacing 2.2.9.1-19 with your installed HDP version:
    spark.driver.extraJavaOptions -Dhdp.version=2.2.9.1-19
    spark.yarn.am.extraJavaOptions -Dhdp.version=2.2.9.1-19
  2. Create a java-opts file in $SPARK_HOME/conf and add the installed HDP version to it, like this:
    -Dhdp.version=2.2.9.1-19
To find your HDP version, run the command hdp-select status hadoop-client on the cluster.
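
The two steps above can be scripted. What follows is only a rough sketch: extract_hdp_version and write_hdp_conf are hypothetical helper names, and the "<package> - <version>" shape of the hdp-select output is an assumption, so check what your cluster actually prints before relying on it.

```shell
# Sketch only: helper names are made up for illustration.

# Take the last field of a status line such as "hadoop-client - 2.2.9.1-19".
# The "<package> - <version>" format is an assumption about hdp-select output.
extract_hdp_version() {
    printf '%s\n' "$1" | awk '{ print $NF }'
}

# Append the two properties to spark-defaults.conf and (re)create java-opts
# inside the given conf directory.
write_hdp_conf() {
    conf_dir="$1"
    version="$2"
    {
        echo "spark.driver.extraJavaOptions -Dhdp.version=$version"
        echo "spark.yarn.am.extraJavaOptions -Dhdp.version=$version"
    } >> "$conf_dir/spark-defaults.conf"
    echo "-Dhdp.version=$version" > "$conf_dir/java-opts"
}

# On a cluster node you might call it like this:
# write_hdp_conf "$SPARK_HOME/conf" \
#     "$(extract_hdp_version "$(hdp-select status hadoop-client)")"
```

Because the first step appends, running the helper twice would duplicate the entries in spark-defaults.conf, so treat it as a one-time setup aid.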

Example command :
spark-submit --class org.apache.spark.examples.SparkPi \
    --master yarn \
    --deploy-mode cluster \
    --driver-memory 1g \
    --executor-memory 2g \
    --executor-cores 1 \
    --queue default \
    /usr/hdp/current/spark/lib/spark-examples*.jar \
    10

Yarn Memory Terms :


yarn.nodemanager.vmem-pmem-ratio: Defines the ratio of virtual memory to physical memory used when enforcing container memory limits. The default is 2.1, meaning a container may use up to 2.1 times its physical memory allocation as virtual memory.
yarn.app.mapreduce.am.command-opts: In YARN, the ApplicationMaster (AM) is responsible for securing the necessary resources. This property sets the Java options (such as the maximum heap size) for the MapReduce AM process itself. Don't confuse this with the NodeManager, where the job's containers are executed.
yarn.app.mapreduce.am.resource.mb: Specifies how much memory the MapReduce AM requests for its own container. With a value of 1536, the AM container is placed only on a NodeManager that has at least 1536 MB of memory available.
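
To make the ratio concrete, here is a small sketch of the arithmetic, assuming a 1536 MB container and the default ratio of 2.1. The helper name vmem_limit_mb is made up for illustration and is not part of YARN.

```shell
# Virtual memory ceiling = physical allocation (MB) x vmem-pmem ratio.
vmem_limit_mb() {
    awk -v pmem="$1" -v ratio="$2" 'BEGIN { printf "%.1f\n", pmem * ratio }'
}

vmem_limit_mb 1536 2.1   # prints 3225.6
```

So a 1536 MB container would be allowed roughly 3.2 GB of virtual memory under the default setting.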


Difference between yarn-client mode and yarn-cluster mode :

  • In yarn-client mode, the driver runs in the client process, and the application master is only used for requesting resources from YARN.
  • In yarn-cluster mode, the Spark driver runs inside an application master process which is managed by YARN on the cluster, and the client can go away after initiating the application.
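
At submit time, the practical difference between the two modes is the --deploy-mode flag. The sketch below only prints the command for each mode rather than running it. submit_cmd is a hypothetical helper, and the class and jar path are reused from the example command earlier in this post.

```shell
# Print (do not run) the spark-submit command for the given deploy mode.
submit_cmd() {
    case "$1" in
        client|cluster) ;;
        *) echo "unknown deploy mode: $1" >&2; return 1 ;;
    esac
    echo "spark-submit --class org.apache.spark.examples.SparkPi" \
         "--master yarn --deploy-mode $1" \
         "/usr/hdp/current/spark/lib/spark-examples*.jar 10"
}

submit_cmd client    # driver runs in the local client process
submit_cmd cluster   # driver runs inside the YARN application master
```

In cluster mode the driver's output (here, the Pi estimate) goes to the YARN container logs instead of the client console. If log aggregation is enabled on your cluster, yarn logs -applicationId <application ID> should retrieve it once the application has finished.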
