Beeline java.lang.OutOfMemoryError: Requested array size exceeds VM limit
Category : Hive
When Beeline jobs are run very heavily, you may sometimes see the following error:
WARNING: Use "yarn jar" to launch YARN applications. issuing: !connect jdbc:hive2://hdpsap.lowes.com:8443/default;transportMode=http;httpPath=gateway/default/hive?hive.execution.engine=tez;tez.queue.name=di;hive.exec.parallel=true;hive.vectorized.execution.enabled=true;hive.vectorized.execution.reduce.enabled hdpdib [pass$ Connecting to jdbc:hive2://hdpsap.lowes.com:8443/default;transportMode=http;httpPath=gateway/default/hive?hive.execution.engine=tez;tez.queue.name=di;hive.exec.parallel=true;hive.vectorized.execution.enabled=true;hive.vectorized.execution.reduce.enabled 17/07/01 20:00:05 [main]: INFO jdbc.Utils: Supplied authorities: hdpsap.lowes.com:8443 17/07/01 20:00:05 [main]: INFO jdbc.Utils: Resolved authority: hdpsap.lowes.com:8443 Connected to: Apache Hive (version 1.2.1.2.3.4.75-1) Driver: Hive JDBC (version 1.2.1.2.3.4.0-3485) Transaction isolation: TRANSACTION_REPEATABLE_READ java.lang.OutOfMemoryError: Requested array size exceeds VM limit at java.util.Arrays.copyOf(Arrays.java:2271) at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113) at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93) at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:122) at org.apache.hive.beeline.BeeLine.getConsoleReader(BeeLine.java:863) at org.apache.hive.beeline.BeeLine.executeFile(BeeLine.java:804) at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:773) at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:485) at org.apache.hive.beeline.BeeLine.main(BeeLine.java:468) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Root Cause : By default, Beeline keeps a command history file at ~/.beeline/history for the user running it, and on startup it loads the latest 500 entries into memory. If those queries are very large (containing lots of characters), the history file can grow to several GBs. When Beeline tries to load such a big history file into memory, it eventually fails with the OutOfMemoryError shown above.
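To confirm that this is what you are hitting, check how big the history file is and how many entries it holds. The commands below are just a quick sketch and assume the default history location for the affected user:

[root@m1 ~]# du -sh /home/hdpdib/.beeline/history
[root@m1 ~]# wc -l /home/hdpdib/.beeline/history

A file in the hundreds-of-MBs or GB range cannot be buffered into a single byte array, which is exactly where the stack trace above fails (ByteArrayOutputStream.grow during BeeLine.getConsoleReader).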
Currently, Beeline does not provide an option to limit the maximum size of the history file, so when individual queries are very big they flood the history file and slow Beeline down on startup and shutdown. This limitation is tracked in the following Apache JIRA:
https://issues.apache.org/jira/browse/HIVE-15166
For example, in our case the history file had grown to 1.1 GB:
[root@m1 ~]# ls -ltrh /home/hdpdib/.beeline/
total 1.1G
-rw-r--r-- 1 hdpdib hdpuser 1.1G Jul 1 03:15 history
Solution : For the time being, the workaround is to remove or clean out the ~/.beeline/history file and then re-run your jobs. After that, your jobs should run fine again.
[root@m1 ~]# rm /home/hdpdib/.beeline/history
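If you would rather not lose the history entirely, an alternative (just a sketch; adjust the path and row count for your environment) is to trim the file down to its most recent 500 lines, matching the number of entries Beeline loads on startup, and restore the original ownership afterwards:

[root@m1 ~]# tail -n 500 /home/hdpdib/.beeline/history > /tmp/history.trimmed
[root@m1 ~]# mv /tmp/history.trimmed /home/hdpdib/.beeline/history
[root@m1 ~]# chown hdpdib:hdpuser /home/hdpdib/.beeline/history

Make sure no Beeline session is running for that user while you do this, otherwise the file may be rewritten on exit. Until HIVE-15166 is addressed, you could also schedule this trim (for example via cron) so the file never grows unbounded.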
Please feel free to reach out to me or share your valuable feedback.