How to use Hive Query result in a variable for other query
Category : Hive
Many time we want to store one query result into a variable and then use this variable in some other query. So now it is possible in your favorite hadoop ecosystem i.e hive.
With the help of this article you can achieve it.
[root@m1 etc]# hive
16/10/04 02:40:45 WARN conf.HiveConf: HiveConf of name hive.optimize.mapjoin.mapreduce does not exist
16/10/04 02:40:45 WARN conf.HiveConf: HiveConf of name hive.heapsize does not exist
16/10/04 02:40:45 WARN conf.HiveConf: HiveConf of name hive.server2.enable.impersonation does not exist
16/10/04 02:40:45 WARN conf.HiveConf: HiveConf of name hive.auto.convert.sortmerge.join.noconditionaltask does not exist
Logging initialized using configuration in file:/etc/hive/conf/hive-log4j.properties
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.2.0.0-2041/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.2.0.0-2041/hive/lib/hive-jdbc-0.14.0.2.2.0.0-2041-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Step1: Check table test1 description:
hive> desc Table test1;
OK
columnA date
Time taken: 0.452 seconds, Fetched: 1 row(s)
Step2: Table test2 description:
hive> desc Table test2;
OK
columnB date
Time taken: 0.516 seconds, Fetched: 1 row(s)
Step3: Assigning subquery to variable:
hive> set hivevar:var1=(select columnB from Table test2 where columnB=’2016-09-19′);
Step4: Running main query with subquery:
hive> select * from Table test1 where columnA IN ${hivevar:var1};
3 Comments
Michael Segel
December 7, 2016 at 4:31 pmYour title for your blog article is very misleading and wrong.
What you’re doing is just setting a variable in the shell and then using the variable in the following query.
To be clear you are not taking the result of a query and using it in another query.
To test this… after set hivevar:var1 = (select columnB from test2 where columnB=’2016-09-19′)
say set hivevar:var1.
You will get a print out of your statement. All you are saving is the text.
The issue is that many want to store things like ‘SELECT MAX(foo) FROM bar’; and have this stored in a variable called max_val.
This can’t be done… yet…
admin
December 8, 2016 at 12:11 pmThanks Michael for your feedback.
And yes you are right it is not actually storing output,it is just storing whole query as text. Yes it is text saving in real time.
hive> set hivevar:var1=(select count(*) from summary_table);
hive> set hivevar:var1;
hivevar:var1=(select count(*) from summary_table)
Sai K
May 1, 2019 at 12:11 pmThen what is the use of your script, when it doesn’t store any result of the query.