How to use Hive Query result in a variable for other query

  • 3

How to use Hive Query result in a variable for other query

Category : Hive

Many time we want to store one query result into a variable and then use this variable in some other query. So now it is possible in your favorite hadoop ecosystem i.e hive.

With the help of this article you can achieve it.

[root@m1 etc]# hive

16/10/04 02:40:45 WARN conf.HiveConf: HiveConf of name hive.optimize.mapjoin.mapreduce does not exist

16/10/04 02:40:45 WARN conf.HiveConf: HiveConf of name hive.heapsize does not exist

16/10/04 02:40:45 WARN conf.HiveConf: HiveConf of name hive.server2.enable.impersonation does not exist

16/10/04 02:40:45 WARN conf.HiveConf: HiveConf of name hive.auto.convert.sortmerge.join.noconditionaltask does not exist

Logging initialized using configuration in file:/etc/hive/conf/hive-log4j.properties

SLF4J: Class path contains multiple SLF4J bindings.

SLF4J: Found binding in [jar:file:/usr/hdp/2.2.0.0-2041/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: Found binding in [jar:file:/usr/hdp/2.2.0.0-2041/hive/lib/hive-jdbc-0.14.0.2.2.0.0-2041-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.

SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

Step1: Check table test1 description:
hive> desc Table test1;
OK
columnA date
Time taken: 0.452 seconds, Fetched: 1 row(s)

Step2: Table test2 description:
hive> desc Table test2;
OK
columnB date
Time taken: 0.516 seconds, Fetched: 1 row(s)

Step3: Assigning subquery to variable:
hive> set hivevar:var1=(select columnB from Table test2 where columnB=’2016-09-19′);

Step4: Running main query with subquery:
hive> select * from Table test1 where columnA IN ${hivevar:var1};

 

 


3 Comments

Michael Segel

December 7, 2016 at 4:31 pm

Your title for your blog article is very misleading and wrong.

What you’re doing is just setting a variable in the shell and then using the variable in the following query.
To be clear you are not taking the result of a query and using it in another query.

To test this… after set hivevar:var1 = (select columnB from test2 where columnB=’2016-09-19′)
say set hivevar:var1.
You will get a print out of your statement. All you are saving is the text.

The issue is that many want to store things like ‘SELECT MAX(foo) FROM bar’; and have this stored in a variable called max_val.

This can’t be done… yet…

    admin

    December 8, 2016 at 12:11 pm

    Thanks Michael for your feedback.
    And yes you are right it is not actually storing output,it is just storing whole query as text. Yes it is text saving in real time.
    hive> set hivevar:var1=(select count(*) from summary_table);
    hive> set hivevar:var1;
    hivevar:var1=(select count(*) from summary_table)

      Sai K

      May 1, 2019 at 12:11 pm

      Then what is the use of your script, when it doesn’t store any result of the query.

Leave a Reply