I have often seen that an error message does not give a clear picture of the underlying issue and can even be misleading, so we end up wasting a lot of time investigating it. I have found that enabling debug mode is an easy way to troubleshoot almost any Hadoop problem, because it gives a detailed, step-by-step view of what your command or job is actually doing.
In this article I explain how to enable debug mode.
There are two ways to enable debug mode:
- You can enable it at runtime for a specific command or job only, as shown below:
[root@sandbox ~]# export HADOOP_ROOT_LOGGER=DEBUG,console
[root@sandbox ~]# echo $HADOOP_ROOT_LOGGER;
DEBUG,console
[root@sandbox ~]# hadoop fs -ls /
16/06/29 10:35:34 DEBUG util.Shell: setsid exited with exit code 0
16/06/29 10:35:34 DEBUG conf.Configuration: parsing URL jar:file:/usr/hdp/2.4.0.0-169/hadoop/hadoop-common-2.7.1.2.4.0.0-169.jar!/core-default.xml
16/06/29 10:35:34 DEBUG conf.Configuration: parsing input stream sun.net.www.protocol.jar.JarURLConnection$JarURLInputStream@d1e67eb
16/06/29 10:35:34 DEBUG conf.Configuration: parsing URL file:/etc/hadoop/2.4.0.0-169/0/core-site.xml
16/06/29 10:35:34 DEBUG conf.Configuration: parsing input stream java.io.BufferedInputStream@38509e85
16/06/29 10:35:34 DEBUG security.Groups: Creating new Groups object
16/06/29 10:35:34 DEBUG util.NativeCodeLoader: Trying to load the custom-built native-hadoop library...
16/06/29 10:35:34 DEBUG util.NativeCodeLoader: Loaded the native-hadoop library
16/06/29 10:35:34 DEBUG security.JniBasedUnixGroupsMapping: Using JniBasedUnixGroupsMapping for Group resolution
16/06/29 10:35:34 DEBUG security.JniBasedUnixGroupsMappingWithFallback: Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMapping
16/06/29 10:35:34 DEBUG security.Groups: Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=300000; warningDeltaMs=5000
16/06/29 10:35:34 DEBUG security.UserGroupInformation: hadoop login
16/06/29 10:35:34 DEBUG security.UserGroupInformation: hadoop login commit
16/06/29 10:35:34 DEBUG security.UserGroupInformation: using local user:UnixPrincipal: root
16/06/29 10:35:34 DEBUG security.UserGroupInformation: Using user: "UnixPrincipal: root" with name root
16/06/29 10:35:34 DEBUG security.UserGroupInformation: User entry: "root"
16/06/29 10:35:34 DEBUG security.UserGroupInformation: UGI loginUser:root (auth:SIMPLE)
16/06/29 10:35:34 DEBUG hdfs.BlockReaderLocal: dfs.client.use.legacy.blockreader.local = false
16/06/29 10:35:34 DEBUG hdfs.BlockReaderLocal: dfs.client.read.shortcircuit = true
16/06/29 10:35:34 DEBUG hdfs.BlockReaderLocal: dfs.client.domain.socket.data.traffic = false
16/06/29 10:35:34 DEBUG hdfs.BlockReaderLocal: dfs.domain.socket.path = /var/lib/hadoop-hdfs/dn_socket
16/06/29 10:35:34 DEBUG retry.RetryUtils: multipleLinearRandomRetry = null
16/06/29 10:35:34 DEBUG ipc.Server: rpcKind=RPC_PROTOCOL_BUFFER, rpcRequestWrapperClass=class org.apache.hadoop.ipc.ProtobufRpcEngine$RpcRequestWrapper, rpcInvoker=org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker@4215232f
16/06/29 10:35:34 DEBUG ipc.Client: getting client out of cache: org.apache.hadoop.ipc.Client@3253bcf3
16/06/29 10:35:34 DEBUG azure.NativeAzureFileSystem: finalize() called.
16/06/29 10:35:34 DEBUG azure.NativeAzureFileSystem: finalize() called.
16/06/29 10:35:35 DEBUG unix.DomainSocketWatcher: org.apache.hadoop.net.unix.DomainSocketWatcher$2@20282aa5: starting with interruptCheckPeriodMs = 60000
16/06/29 10:35:35 DEBUG shortcircuit.DomainSocketFactory: The short-circuit local reads feature is enabled.
16/06/29 10:35:35 DEBUG sasl.DataTransferSaslUtil: DataTransferProtocol not using SaslPropertiesResolver, no QOP found in configuration for dfs.data.transfer.protection
16/06/29 10:35:35 DEBUG ipc.Client: The ping interval is 60000 ms.
16/06/29 10:35:35 DEBUG ipc.Client: Connecting to sandbox.hortonworks.com/172.16.162.136:8020
16/06/29 10:35:35 DEBUG ipc.Client: IPC Client (1548560986) connection to sandbox.hortonworks.com/172.16.162.136:8020 from root: starting, having connections 1
16/06/29 10:35:35 DEBUG ipc.Client: IPC Client (1548560986) connection to sandbox.hortonworks.com/172.16.162.136:8020 from root sending #0
16/06/29 10:35:35 DEBUG ipc.Client: IPC Client (1548560986) connection to sandbox.hortonworks.com/172.16.162.136:8020 from root got value #0
16/06/29 10:35:35 DEBUG ipc.ProtobufRpcEngine: Call: getFileInfo took 52ms
16/06/29 10:35:35 DEBUG ipc.Client: IPC Client (1548560986) connection to sandbox.hortonworks.com/172.16.162.136:8020 from root sending #1
16/06/29 10:35:35 DEBUG ipc.Client: IPC Client (1548560986) connection to sandbox.hortonworks.com/172.16.162.136:8020 from root got value #1
16/06/29 10:35:35 DEBUG ipc.ProtobufRpcEngine: Call: getListing took 3ms
Found 11 items
drwxrwxrwx - yarn hadoop 0 2016-03-11 10:12 /app-logs
drwxr-xr-x - hdfs hdfs 0 2016-03-11 10:18 /apps
drwxr-xr-x - yarn hadoop 0 2016-03-11 10:12 /ats
drwxr-xr-x - hdfs hdfs 0 2016-03-11 10:41 /demo
drwxr-xr-x - hdfs hdfs 0 2016-03-11 10:12 /hdp
drwxr-xr-x - mapred hdfs 0 2016-03-11 10:12 /mapred
drwxrwxrwx - mapred hadoop 0 2016-03-11 10:12 /mr-history
drwxr-xr-x - hdfs hdfs 0 2016-03-11 10:33 /ranger
drwxrwxrwx - spark hadoop 0 2016-06-29 10:35 /spark-history
drwxrwxrwx - hdfs hdfs 0 2016-03-11 10:23 /tmp
drwxr-xr-x - hdfs hdfs 0 2016-03-11 10:24 /user
16/06/29 10:35:35 DEBUG ipc.Client: stopping client from cache: org.apache.hadoop.ipc.Client@3253bcf3
16/06/29 10:35:35 DEBUG ipc.Client: removing client from cache: org.apache.hadoop.ipc.Client@3253bcf3
16/06/29 10:35:35 DEBUG ipc.Client: stopping actual client because no more references remain: org.apache.hadoop.ipc.Client@3253bcf3
16/06/29 10:35:35 DEBUG ipc.Client: Stopping client
16/06/29 10:35:35 DEBUG ipc.Client: IPC Client (1548560986) connection to sandbox.hortonworks.com/172.16.162.136:8020 from root: closed
16/06/29 10:35:35 DEBUG ipc.Client: IPC Client (1548560986) connection to sandbox.hortonworks.com/172.16.162.136:8020 from root: stopped, remaining connections 0
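Note that setting HADOOP_ROOT_LOGGER this way affects only the current shell session and the client commands you run from it. As a minimal sketch (using the same sandbox prompt as above; your hostname will differ), you can switch debug output off again or enable it for just a single command without exporting the variable:

[root@sandbox ~]# unset HADOOP_ROOT_LOGGER
[root@sandbox ~]# hadoop fs -ls /
[root@sandbox ~]# HADOOP_ROOT_LOGGER=DEBUG,console hadoop fs -ls /

The first form reverts to the default INFO level for the rest of the session; the last form applies DEBUG,console to that one command only.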
- The other option is to edit the logger property in Ambari itself; with this approach, once the service is restarted, it will continuously write debug messages to its respective log file.
1. In Ambari, edit the hadoop-env template section.
2. Define the following environment variable to enable debug logging for the NameNode:
export HADOOP_NAMENODE_OPTS="${HADOOP_NAMENODE_OPTS} -Dhadoop.root.logger=DEBUG,DRFA"
3. Define the following environment variable to enable debug logging for the DataNode:
export HADOOP_DATANODE_OPTS="${HADOOP_DATANODE_OPTS} -Dhadoop.root.logger=DEBUG,DRFA"
4. Save the configuration and restart the required HDFS services when prompted by Ambari.
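Once the services are back up, you can confirm that debug logging is active by checking the daemon log file. This is a minimal sketch assuming the default HDP log directory (/var/log/hadoop/hdfs) and the sandbox hostname; on your cluster the file name will follow the pattern hadoop-hdfs-namenode-<hostname>.log:

[root@sandbox ~]# grep DEBUG /var/log/hadoop/hdfs/hadoop-hdfs-namenode-sandbox.hortonworks.com.log | tail -5

Keep in mind that DEBUG output is very verbose, so it is a good idea to remove these options and restart the services again once you have finished troubleshooting.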