GC pool ‘PS MarkSweep’ had collection(s): count=6 time=26445ms


When you create a table in Hive while Ranger is enforcing authorization, the table creation fails and the HiveServer2 process then crashes.

0: jdbc:hive2://server1> CREATE EXTERNAL TABLE test (cust_id STRING, ACCOUNT_ID STRING,
 ROLE_ID STRING, ROLE_NAME STRING, START_DATE STRING, END_DATE STRING, PRIORITY STRING, 
ACTIVE_ACCOUNT_ROLE STRING) 
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' 
LINES TERMINATED BY '\n' 
STORED AS TEXTFILE LOCATION '/tmp/testTable' 
TBLPROPERTIES ('serialization.null.format'=''); 
Error: org.apache.thrift.transport.TTransportException (state=08S01,code=0)


If you check the HiveServer2 logs, you will see a permission denied error:
Caused by: org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAccessControlException: 
Permission denied: user [saurkuma] does not have [READ] privilege on [hdfs://HDPHA/tmp/testTable]
at org.apache.ranger.authorization.hive.authorizer.RangerHiveAuthorizer.checkPrivileges
(RangerHiveAuthorizer.java:253)

Along with the above error, hiveserver2.log also shows repeated GC pauses, after which the
HiveServer2 service crashes:
2016-11-15 12:39:54,428 WARN [org.apache.hadoop.util.JvmPauseMonitor$Monitor@24197b13]:
util.JvmPauseMonitor (JvmPauseMonitor.java:run(192)) - Detected pause in JVM or host machine 
(eg GC): pause of approximately 24000ms GC pool 'PS MarkSweep' had collection(s): 
count=6 time=26445ms

Root Cause: To check a permission (read or write) on a path referenced by a query, Ranger checks
permissions on the given directory and all of its children. However, if the directory does not
exist, it checks the parent directory instead, then that directory's parent, and so on. The table
creation eventually fails, and because this operation uses too much memory, it also causes the
GC pauses.

In this case, Ranger checks for permission on /tmp/<databasename>. Since that path does not exist, it
starts checking /tmp/ and its child directories, causing the GC pauses and the HiveServer2 service crash.
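
The fallback described above can be illustrated with a small toy model. This is only a sketch of the behaviour as explained in this post, not Ranger's actual code: the function names and the in-memory set of paths are hypothetical stand-ins for a real file system listing.

```python
import posixpath


def nearest_existing_ancestor(path, existing):
    """Walk up from `path` until a directory present in `existing` is found."""
    current = posixpath.normpath(path)
    while current not in existing and current != "/":
        current = posixpath.dirname(current)
    return current


def entries_to_check(root, existing):
    """Return `root` plus every known path beneath it (the subtree to scan)."""
    prefix = root.rstrip("/") + "/"
    return sorted(p for p in existing if p == root or p.startswith(prefix))


# Hypothetical file system contents; /tmp/testTable is deliberately missing.
existing = {"/", "/tmp", "/tmp/a", "/tmp/a/part-0", "/tmp/b"}

start = nearest_existing_ancestor("/tmp/testTable", existing)
scanned = entries_to_check(start, existing)
```

In this model the missing location falls back to /tmp, so everything under /tmp ends up in the list to check. On a busy cluster /tmp can hold a very large number of entries, which is consistent with the memory pressure and GC pauses seen in the log.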

RESOLUTION:
There is no permanent solution for this issue as of now, but the following workaround is available.

WORKAROUND:
Ensure that the storage location specified in the CREATE TABLE statement already exists on HDFS before you run the statement.
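
One way to apply the workaround is to check for the directory, and create it if needed, before running the CREATE TABLE statement. The helper below is a minimal sketch: it assumes the `hdfs` command is on the PATH and that the calling user is allowed to create the directory. The function name `ensure_hdfs_dir` and its `run` parameter are hypothetical and exist only for this illustration.

```python
import subprocess


def ensure_hdfs_dir(path, run=subprocess.run):
    """Create `path` on HDFS if it is missing.

    Returns True if the directory had to be created, False if it already existed.
    `run` is injectable so the helper can be exercised without a cluster.
    """
    # `hdfs dfs -test -d` exits with status 0 when the path is an existing directory.
    exists = run(["hdfs", "dfs", "-test", "-d", path]).returncode == 0
    if exists:
        return False
    # `-p` also creates any missing parent directories.
    run(["hdfs", "dfs", "-mkdir", "-p", path], check=True)
    return True


# Example call before creating the table from this post (assumes a working hdfs client):
# ensure_hdfs_dir("/tmp/testTable")
```

This only makes sure the location exists. The user running the query still needs the privileges Ranger checks for on that path.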