Monthly Archives: May 2018


Insert overwrite query Failed with exception Unable to move source

If you have explicitly set hive.exec.stagingdir to a location such as /tmp/ or some other path outside the table directory, then any INSERT OVERWRITE statement will fail with the error below.
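For illustration, the problematic setting might look like this (/tmp/hive_staging is a hypothetical value; any path outside the target table's directory or its encryption zone can trigger the problem):

set hive.exec.stagingdir=/tmp/hive_staging;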

ERROR exec.Task (SessionState.java:printError(989)) – Failed with exception Unable to move source hdfs://clustername/apps/finance/nest/nest_audit_log_final/
.hive-staging_hive_2017-12-12_19-15-30_008_33149322272174981-1/-ext-10000 to
destination hdfs://clustername/apps/finance/nest/nest_audit_log_final

Example: 

INSERT OVERWRITE TABLE nest.nest_audit_log_final
SELECT
project_name
, application
, module_seq_num
, module_name
, script_seq_num
, script_name
, run_session_id
, load_ts
, max_posted_date
, currency
, processor
, load_date
FROM nest.nest_audit_log_final;

Then you will get the following error:

INFO common.FileUtils (FileUtils.java:mkdir(519)) - 
Creating directory if it doesn't exist: 
hdfs://clustername/apps/finance/nest/nest_audit_log_final
2017-12-12 19:21:42,508 ERROR hdfs.KeyProviderCache (KeyProviderCache.java:createKeyProviderURI(87)) 
- Could not find uri with key [dfs.encryption.key.provider.uri] to create a keyProvider !!
2017-12-12 19:21:42,525 ERROR exec.Task (SessionState.java:printError(989)) 
- Failed with exception Unable to move source 
hdfs://clustername/apps/finance/nest/nest_audit_log_final/
.hive-staging_hive_2017-12-12_19-15-30_008_33149322272174981-1/-ext-10000 to 
destination hdfs://clustername/apps/finance/nest/nest_audit_log_final
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source 
hdfs://clustername/apps/finance/nest/nest_audit_log_final/
.hive-staging_hive_2017-12-12_19-15-30_008_33149322272174981-1/-ext-10000 
to destination hdfs://clustername/apps/finance/nest/nest_audit_log_final
at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2900)
at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:3140)
at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1727)
at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:353)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1745)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1491)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1289)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1156)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1146)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:217)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:169)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:380)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:315)
at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:413)
at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:429)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:718)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:685)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: java.io.IOException: rename for src path: 
hdfs://clustername/apps/finance/nest/nest_audit_log_final/
.hive-staging_hive_2017-12-12_19-15-30_008_33149322272174981-1/-ext-10000/000000_0 
to dest:hdfs://clustername/apps/finance/nest/nest_audit_log_final returned false
at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2849)

Root Cause: This is caused by a known bug, HIVE-17063.

Workaround:

set hive.exec.stagingdir=.hive-staging;
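To make the workaround persist across sessions, the same value can be set in hive-site.xml instead; a minimal sketch of the equivalent entry (.hive-staging is in fact Hive's default, which makes the staging directory relative to the target table's own directory, so the final rename stays on the same filesystem and in the same encryption zone):

<property>
  <name>hive.exec.stagingdir</name>
  <value>.hive-staging</value>
</property>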


last access time of a table is showing zero

If you have many hundreds or thousands of tables and you want to know when a Hive table was last accessed, you can run the following query in MySQL against the hive metastore database.

mysql> use hive;

mysql> select TBL_NAME,LAST_ACCESS_TIME from TBLS where DB_ID=<db_id>;
+----------------------+------------------+
| TBL_NAME             | LAST_ACCESS_TIME |
+----------------------+------------------+
| df_nov_4             |                0 |
| google_feed_20151111 |                0 |
| null_recos           |                0 |
| taxonomy_20160107    |                0 |
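If you are not sure which DB_ID your database has, the DBS table in the metastore maps database names to ids (assuming the standard metastore schema):

mysql> select DB_ID, NAME from DBS;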

Note: unfortunately, due to bug HIVE-2526 (https://issues.apache.org/jira/browse/HIVE-2526), the last access time is not recorded by default, so all values show zero until you make the following configuration changes.

1. From Ambari > Hive > Advanced > Custom hive-site, edit (if it exists) or add a new property:
hive.security.authorization.sqlstd.confwhitelist=hive\.exec\.pre\.hooks

2. From Ambari > Hive > Advanced > General, edit the hive.exec.pre.hooks property and append the following to the end of the field value (comma-separated):
org.apache.hadoop.hive.ql.hooks.UpdateInputAccessTimeHook$PreExec

3. Restart the affected Hive services after saving the config changes in Ambari.

4. After the restart, run the query (select TBL_NAME,LAST_ACCESS_TIME from TBLS where DB_ID=<db_id>;) again; tables accessed after the change should now show a non-zero last access time.
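Since LAST_ACCESS_TIME is stored as a Unix epoch in seconds, you can also let MySQL render it as a readable timestamp; a small variant of the query above (<db_id> remains a placeholder):

mysql> select TBL_NAME, FROM_UNIXTIME(LAST_ACCESS_TIME)
       from TBLS where DB_ID=<db_id> and LAST_ACCESS_TIME > 0;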

 

I hope this helps you get your work done; please feel free to share your valuable feedback.



kill a hive query when the application id was not created


Sometimes when you run a Hive query, it never launches a YARN application, or it hangs because of resource shortages or other reasons.

In this case you have to kill the query before resubmitting it, so use the following steps to kill the Hive query itself.

 

  • hive> select * from table1;
    Query ID = mapr_201804547_2ad87f0f5627
    Total jobs = 1
    Launching Job 1 out of 1
  • Use the “kill query” command, which is available from HDP 2.6.3 onward (see the example below):
    KILL QUERY <queryid1>
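For example, using the Query ID printed above (depending on the Hive build, the query id may need to be quoted):

hive> KILL QUERY "mapr_201804547_2ad87f0f5627";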

 

Please feel free to give your feedback.