Hive2 action with Oozie in a Kerberos environment
One of my friends was trying to run a simple Hive2 action in an Oozie workflow and kept getting errors. I decided to replicate it on my cluster, and after a few retries I got it working.
If you have the same requirement of running Hive SQL via Oozie, this article will help you get the job done.
So there are three requirements for an Oozie Hive2 action against a Kerberized HiveServer2:
1. The "oozie.credentials.credentialclasses" property must be defined in /etc/oozie/conf/oozie-site.xml, and its value must include "hive2=org.apache.oozie.action.hadoop.Hive2Credentials".
2. workflow.xml must include a <credentials><credential>…</credential></credentials> section defining the two properties "hive2.server.principal" and "hive2.jdbc.url".
3. The Hive2 action must reference the credential name defined above in the "cred=" attribute of its <action> element.
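For reference, requirement 1 typically looks like the fragment below in oozie-site.xml. The exact value varies by distribution; the hcat entry is only an illustrative companion, and the hive2 entry is the one this article requires:

```xml
<property>
  <name>oozie.credentials.credentialclasses</name>
  <value>hcat=org.apache.oozie.action.hadoop.HCatCredentials,hive2=org.apache.oozie.action.hadoop.Hive2Credentials</value>
</property>
```

After changing oozie-site.xml, restart the Oozie server so the new credential class is picked up.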
Step 1: First create a directory in HDFS (under your home directory) to keep all the scripts in one place, so you can run everything from there:
[s0998dnz@m1 hive2_action_oozie]$ hadoop fs -mkdir -p /user/s0998dnz/hive2demo/app
Step 2: Now create your workflow.xml and job.properties:
[root@m1 hive_oozie_demo]# cat workflow.xml
<workflow-app name="hive2demo" xmlns="uri:oozie:workflow:0.4">
<global>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
</global>
<credentials>
<credential name="hs2-creds" type="hive2">
<property>
<name>hive2.server.principal</name>
<value>${jdbcPrincipal}</value>
</property>
<property>
<name>hive2.jdbc.url</name>
<value>${jdbcURL}</value>
</property>
</credential>
</credentials>
<start to="hive2"/>
<action name="hive2" cred="hs2-creds">
<hive2 xmlns="uri:oozie:hive2-action:0.1">
<jdbc-url>${jdbcURL}</jdbc-url>
<script>${hivescript}</script>
</hive2>
<ok to="End"/>
<error to="Kill"/>
</action>
<kill name="Kill">
<message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="End"/>
</workflow-app>
[s0998dnz@m1 hive2_action_oozie]$ cat job.properties
# Job.properties file
nameNode=hdfs://HDPINF
jobTracker=m2.hdp22:8050
exampleDir=${nameNode}/user/${user.name}/hive2demo
oozie.wf.application.path=${exampleDir}/app
oozie.use.system.libpath=true
# Hive2 action
hivescript=${oozie.wf.application.path}/hivequery.hql
outputHiveDatabase=default
jdbcURL=jdbc:hive2://m2.hdp22:10000/default
jdbcPrincipal=hive/_HOST@HADOOPADMIN.COM
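Before wiring these values into the workflow, it can help to verify them by connecting to HiveServer2 directly with Beeline. This is only a sketch: the hostname, port, and realm below are the values from this job.properties, and a valid Kerberos ticket for your own principal is assumed:

```shell
# Acquire a Kerberos ticket first (use your own principal)
kinit s0998dnz@HADOOPADMIN.COM

# Connect using the same JDBC URL, with the server principal appended for Kerberos auth
beeline -u "jdbc:hive2://m2.hdp22:10000/default;principal=hive/_HOST@HADOOPADMIN.COM" -e "show databases;"
```

If this fails, fix the JDBC URL or principal here before debugging the Oozie workflow itself.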
Step 3: Now create your Hive script:
[s0998dnz@m1 hive2_action_oozie]$ cat hivequery.hql
show databases;
Step 4: Now upload hivequery.hql and workflow.xml to HDFS:
For example:
[s0998dnz@m1 hive2_action_oozie]$ hadoop fs -put workflow.xml /user/s0998dnz/hive2demo/app/
[s0998dnz@m1 hive2_action_oozie]$ hadoop fs -put hivequery.hql /user/s0998dnz/hive2demo/app/
Step 5: Run the Oozie job with the properties file (run kinit first to acquire a Kerberos ticket if required):
[s0998dnz@m1 hive2_action_oozie]$ oozie job -oozie http://m2.hdp22:11000/oozie -config job.properties -run
job: 0000008-170221004234250-oozie-oozi-W
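Once submitted, you can track the workflow with the job id Oozie printed (the id below is the one from this run):

```shell
# Show the workflow status and per-action details
oozie job -oozie http://m2.hdp22:11000/oozie -info 0000008-170221004234250-oozie-oozi-W

# If an action fails, pull the workflow log for the error
oozie job -oozie http://m2.hdp22:11000/oozie -log 0000008-170221004234250-oozie-oozi-W
```

When the hive2 action succeeds, the workflow status moves from RUNNING to SUCCEEDED.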
I hope this helps you run your Hive2 action in Oozie. Please feel free to share your valuable feedback or suggestions.