Beeline java.lang.OutOfMemoryError: Requested array size exceeds VM limit

When we run Beeline jobs very heavily, we can sometimes see the following error:

WARNING: Use "yarn jar" to launch YARN applications.
issuing: !connect jdbc:hive2://hdpsap.lowes.com:8443/default;transportMode=http;httpPath=gateway/default/hive?hive.execution.engine=tez;tez.queue.name=di;hive.exec.parallel=true;hive.vectorized.execution.enabled=true;hive.vectorized.execution.reduce.enabled hdpdib [pass$
Connecting to jdbc:hive2://hdpsap.lowes.com:8443/default;transportMode=http;httpPath=gateway/default/hive?hive.execution.engine=tez;tez.queue.name=di;hive.exec.parallel=true;hive.vectorized.execution.enabled=true;hive.vectorized.execution.reduce.enabled
17/07/01 20:00:05 [main]: INFO jdbc.Utils: Supplied authorities: hdpsap.lowes.com:8443
17/07/01 20:00:05 [main]: INFO jdbc.Utils: Resolved authority: hdpsap.lowes.com:8443
Connected to: Apache Hive (version 1.2.1.2.3.4.75-1)
Driver: Hive JDBC (version 1.2.1.2.3.4.0-3485)
Transaction isolation: TRANSACTION_REPEATABLE_READ
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
 at java.util.Arrays.copyOf(Arrays.java:2271)
 at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
 at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
 at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:122)
 at org.apache.hive.beeline.BeeLine.getConsoleReader(BeeLine.java:863)
 at org.apache.hive.beeline.BeeLine.executeFile(BeeLine.java:804)
 at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:773)
 at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:485)
 at org.apache.hive.beeline.BeeLine.main(BeeLine.java:468)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

Root Cause: By default, the history file is located at ~/.beeline/history for the user facing this issue, and Beeline loads the latest 500 rows of it into memory on startup. If those queries are very large (containing lots of characters), the history file can grow to a few GBs. When Beeline tries to load such a big history file into memory, it eventually fails with an OutOfMemory error.

Currently, Beeline does not provide an option to limit the maximum size of the history file, so when individual queries are very large they flood the history file and slow Beeline down on startup and shutdown.

https://issues.apache.org/jira/browse/HIVE-15166
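
Until that improvement lands, you can confirm whether you are hitting this problem by checking how big the history file has grown and how many commands it holds. These are plain du/wc checks; substitute your own user's home directory for /home/hdpdib:

[root@m1 ~]# du -sh /home/hdpdib/.beeline/history   # on-disk size of the history file
[root@m1 ~]# wc -l /home/hdpdib/.beeline/history    # number of saved commands

In our case the file had grown past a gigabyte: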

[root@m1 ~]# ls -ltrh /home/hdpdib/.beeline/
total 1.1G
-rw-r--r-- 1 hdpdib hdpuser 1.1G Jul 1 03:15 history

Solution: For the time being, the workaround is to remove (or clean out) the ~/.beeline/history file and then rerun your jobs. After that, the jobs should run fine.

[root@m1 ~]# rm /home/hdpdib/.beeline/history
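
If you would rather not lose the history entirely, trimming it should work just as well, since Beeline only loads the latest 500 rows anyway. A minimal sketch (the temp-file path is arbitrary; adjust the line count and user for your environment):

[root@m1 ~]# tail -n 500 /home/hdpdib/.beeline/history > /tmp/beeline_history.trimmed
[root@m1 ~]# mv /tmp/beeline_history.trimmed /home/hdpdib/.beeline/history
[root@m1 ~]# chown hdpdib:hdpuser /home/hdpdib/.beeline/history   # restore ownership after editing as root

You could also schedule that trim from cron so the file never grows unbounded again.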

Please feel free to reach out to me or give your valuable feedback.


2 Comments

Arun

January 6, 2018 at 12:49 pm

what does history file contain?

    admin

    January 16, 2018 at 7:48 am

    It contains the history of the commands you executed in the past and are running currently. For example:
    [root@m1 ~]# cat ~/.beeline/history
    show create table sales_margin_smmry;
    use sampledb;
    !connect jdbc:hive2://m1:10001/default;transportMode=http;httpPath=cliservice
    vmathia
    use perftestdb;
    drop table perftestdb.sales_margin_smmry;
    show tables;
