Monthly Archives: June 2016


Ambari shows all services down though Hadoop services are running

Category : Bigdata

We have often seen that our Hadoop services are up and running, but when we open Ambari it shows them all as down. In this case the services themselves do not have any issue; the problem is with the ambari-agent.

The Ambari server typically learns about service availability from the Ambari agent, which uses the '*.pid' files created under /var/run.

Suspected problem 1:

[root@sandbox ambari-agent]# ambari-agent status

Found ambari-agent PID: 12112

ambari-agent running.

Agent PID at: /var/run/ambari-agent/ambari-agent.pid

Agent out at: /var/log/ambari-agent/ambari-agent.out

Agent log at: /var/log/ambari-agent/ambari-agent.log

Now check the PID in the process list as well and compare, as shown below:

[root@sandbox ambari-agent]# ps -ef | grep 'ambari_agent'

root     12104     1  0 12:32 pts/0    00:00:00 /usr/bin/python2 /usr/lib/python2.6/site-packages/ambari_agent/AmbariAgent.py start

root     12112 12104  6 12:32 pts/0    00:01:28 /usr/bin/python2 /usr/lib/python2.6/site-packages/ambari_agent/main.py start

If the agent process ID matches the PID recorded in /var/run/ambari-agent/ambari-agent.pid, then there is likely no issue with the agent process itself.
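A quick way to compare the two is to print both the recorded PID and the live process ID (a minimal check, assuming the standard paths shown above; pgrep is part of procps and available on most Linux hosts):

[root@sandbox ambari-agent]# cat /var/run/ambari-agent/ambari-agent.pid
[root@sandbox ambari-agent]# pgrep -f 'ambari_agent/main.py'

Both commands should print the same PID.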

In that case the issue is usually with /var/lib/ambari-agent/data/structured-out-status.json. Cat this file to review its content. Typical content looks like the following:

cat structured-out-status.json
{"processes": [], "securityState": "UNKNOWN"}

or

[root@sandbox ambari-agent]# cat /var/lib/ambari-agent/data/structured-out-status.json

{"processes": [], "securityState": "UNSECURED"}

Compare the content with the same file on another node that is working fine.

Resolution :

Now delete this .json file, restart the ambari-agent, and verify that the content of the regenerated file matches the expected output shown above:

[root@sandbox ambari-agent]# rm /var/lib/ambari-agent/data/structured-out-status.json

rm: remove regular file `/var/lib/ambari-agent/data/structured-out-status.json'? y

[root@sandbox ambari-agent]# ll /var/lib/ambari-agent/data/structured-out-status.json

ls: cannot access /var/lib/ambari-agent/data/structured-out-status.json: No such file or directory

[root@sandbox ambari-agent]# ambari-agent restart

Restarting ambari-agent

Verifying Python version compatibility…

Using python  /usr/bin/python2

Found ambari-agent PID: 13866

Stopping ambari-agent

Removing PID file at /var/run/ambari-agent/ambari-agent.pid

ambari-agent successfully stopped

Verifying Python version compatibility…

Using python  /usr/bin/python2

Checking for previously running Ambari Agent…

Starting ambari-agent

Verifying ambari-agent process status…

Ambari Agent successfully started

Agent PID at: /var/run/ambari-agent/ambari-agent.pid

Agent out at: /var/log/ambari-agent/ambari-agent.out

Agent log at: /var/log/ambari-agent/ambari-agent.log

[root@sandbox ambari-agent]# ll /var/lib/ambari-agent/data/structured-out-status.json

-rw-r--r-- 1 root root 73 2016-06-29 12:59 /var/lib/ambari-agent/data/structured-out-status.json

[root@sandbox ambari-agent]# cat /var/lib/ambari-agent/data/structured-out-status.json

{"processes": [], "securityState": "UNSECURED"}

Suspected Problem 2: Ambari Agent is good, but the HDP services are still shown to be down

If only a few services are shown as down, it could be because the /var/run/PRODUCT/product.pid file does not match the process actually running on the node.

For example, if the HiveServer2 service is shown as down in Ambari while Hive is actually working fine, check the following files:

# cd /var/run/hive
# ls -lrt
-rw-r--r-- 1 hive hadoop 6 Feb 17 07:15 hive.pid
-rw-r--r-- 1 hive hadoop 6 Feb 17 07:16 hive-server.pid

Check the content of these files. For example:

# cat hive-server.pid
31342
# ps -ef | grep 31342
hive 31342 1 0 Feb17 ? 00:14:36 /usr/jdk64/jdk1.7.0_67/bin/java -Xmx1024m -Dhdp.version=2.2.9.0-3393 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.2.9.0-3393 -Dhadoop.log.dir=/var/log/hadoop/hive -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/hdp/2.2.9.0-3393/hadoop -Dhadoop.id.str=hive -Dhadoop.root.logger=INFO,console -Djava.library.path=:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64:/usr/hdp/2.2.9.0-3393/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx1024m -XX:MaxPermSize=512m -Xmx1437m -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /usr/hdp/2.2.9.0-3393/hive/lib/hive-service-0.14.0.2.2.9.0-3393.jar org.apache.hive.service.server.HiveServer2 -hiveconf hive.aux.jars.path=file:///usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-core.jar -hiveconf hive.metastore.uris= -hiveconf hive.log.file=hiveserver2.log -hiveconf hive.log.dir=/var/log/hive

If the content of hive-server.pid does not match the PID of the running HiveServer2 process, Ambari will not report the status correctly.

Ensure that these files have the correct ownership and permissions. For example, the PID files for Hive should be owned by hive:hadoop with mode 644. If they are not, correct the ownership and permissions and update the file with the PID of the running Hive process. This ensures that Ambari reports the status correctly.

Take care while doing the above: make sure this is the only HiveServer2 process running on the system and that HiveServer2 is indeed working fine. If there are multiple HiveServer2 processes, some of them could be stray processes that need to be killed.
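Once you have confirmed there is exactly one healthy HiveServer2 process, a minimal sketch of the fix looks like this (paths and the hive:hadoop ownership follow the example above; replace 31342 with the PID you actually observed):

# ps -ef | grep -i hiveserver2 | grep -v grep        # note the PID of the real process
# echo 31342 > /var/run/hive/hive-server.pid         # write the observed PID into the file
# chown hive:hadoop /var/run/hive/hive-server.pid    # restore expected ownership
# chmod 644 /var/run/hive/hive-server.pid            # restore expected permissions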

After this, restart the affected services if possible and confirm that their status is shown correctly.



How to enable debug logging for HDFS

Category : Bigdata

I have often seen that an error message does not give a clear picture of the underlying issue and can even be misleading, so we waste a lot of time investigating it. Enabling debug mode is an easy way to troubleshoot any Hadoop problem, as it gives a detailed picture and a clear step-by-step view of what your command or job is doing.

In this article I explain the process of enabling debug mode.

There are two methods to enable debug mode:

1. You can enable it at run time, only for a specific command or job, as follows:

[root@sandbox ~]# export HADOOP_ROOT_LOGGER=DEBUG,console

[root@sandbox ~]# echo $HADOOP_ROOT_LOGGER;

DEBUG,console

[root@sandbox ~]# hadoop fs -ls /

16/06/29 10:35:34 DEBUG util.Shell: setsid exited with exit code 0

16/06/29 10:35:34 DEBUG conf.Configuration: parsing URL jar:file:/usr/hdp/2.4.0.0-169/hadoop/hadoop-common-2.7.1.2.4.0.0-169.jar!/core-default.xml

16/06/29 10:35:34 DEBUG conf.Configuration: parsing input stream sun.net.www.protocol.jar.JarURLConnection$JarURLInputStream@d1e67eb

16/06/29 10:35:34 DEBUG conf.Configuration: parsing URL file:/etc/hadoop/2.4.0.0-169/0/core-site.xml

16/06/29 10:35:34 DEBUG conf.Configuration: parsing input stream java.io.BufferedInputStream@38509e85

16/06/29 10:35:34 DEBUG security.Groups:  Creating new Groups object

16/06/29 10:35:34 DEBUG util.NativeCodeLoader: Trying to load the custom-built native-hadoop library…

16/06/29 10:35:34 DEBUG util.NativeCodeLoader: Loaded the native-hadoop library

16/06/29 10:35:34 DEBUG security.JniBasedUnixGroupsMapping: Using JniBasedUnixGroupsMapping for Group resolution

16/06/29 10:35:34 DEBUG security.JniBasedUnixGroupsMappingWithFallback: Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMapping

16/06/29 10:35:34 DEBUG security.Groups: Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=300000; warningDeltaMs=5000

16/06/29 10:35:34 DEBUG security.UserGroupInformation: hadoop login

16/06/29 10:35:34 DEBUG security.UserGroupInformation: hadoop login commit

16/06/29 10:35:34 DEBUG security.UserGroupInformation: using local user:UnixPrincipal: root

16/06/29 10:35:34 DEBUG security.UserGroupInformation: Using user: “UnixPrincipal: root” with name root

16/06/29 10:35:34 DEBUG security.UserGroupInformation: User entry: “root”

16/06/29 10:35:34 DEBUG security.UserGroupInformation: UGI loginUser:root (auth:SIMPLE)

16/06/29 10:35:34 DEBUG hdfs.BlockReaderLocal: dfs.client.use.legacy.blockreader.local = false

16/06/29 10:35:34 DEBUG hdfs.BlockReaderLocal: dfs.client.read.shortcircuit = true

16/06/29 10:35:34 DEBUG hdfs.BlockReaderLocal: dfs.client.domain.socket.data.traffic = false

16/06/29 10:35:34 DEBUG hdfs.BlockReaderLocal: dfs.domain.socket.path = /var/lib/hadoop-hdfs/dn_socket

16/06/29 10:35:34 DEBUG retry.RetryUtils: multipleLinearRandomRetry = null

16/06/29 10:35:34 DEBUG ipc.Server: rpcKind=RPC_PROTOCOL_BUFFER, rpcRequestWrapperClass=class org.apache.hadoop.ipc.ProtobufRpcEngine$RpcRequestWrapper, rpcInvoker=org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker@4215232f

16/06/29 10:35:34 DEBUG ipc.Client: getting client out of cache: org.apache.hadoop.ipc.Client@3253bcf3

16/06/29 10:35:34 DEBUG azure.NativeAzureFileSystem: finalize() called.

16/06/29 10:35:34 DEBUG azure.NativeAzureFileSystem: finalize() called.

16/06/29 10:35:35 DEBUG unix.DomainSocketWatcher: org.apache.hadoop.net.unix.DomainSocketWatcher$2@20282aa5: starting with interruptCheckPeriodMs = 60000

16/06/29 10:35:35 DEBUG shortcircuit.DomainSocketFactory: The short-circuit local reads feature is enabled.

16/06/29 10:35:35 DEBUG sasl.DataTransferSaslUtil: DataTransferProtocol not using SaslPropertiesResolver, no QOP found in configuration for dfs.data.transfer.protection

16/06/29 10:35:35 DEBUG ipc.Client: The ping interval is 60000 ms.

16/06/29 10:35:35 DEBUG ipc.Client: Connecting to sandbox.hortonworks.com/172.16.162.136:8020

16/06/29 10:35:35 DEBUG ipc.Client: IPC Client (1548560986) connection to sandbox.hortonworks.com/172.16.162.136:8020 from root: starting, having connections 1

16/06/29 10:35:35 DEBUG ipc.Client: IPC Client (1548560986) connection to sandbox.hortonworks.com/172.16.162.136:8020 from root sending #0

16/06/29 10:35:35 DEBUG ipc.Client: IPC Client (1548560986) connection to sandbox.hortonworks.com/172.16.162.136:8020 from root got value #0

16/06/29 10:35:35 DEBUG ipc.ProtobufRpcEngine: Call: getFileInfo took 52ms

16/06/29 10:35:35 DEBUG ipc.Client: IPC Client (1548560986) connection to sandbox.hortonworks.com/172.16.162.136:8020 from root sending #1

16/06/29 10:35:35 DEBUG ipc.Client: IPC Client (1548560986) connection to sandbox.hortonworks.com/172.16.162.136:8020 from root got value #1

16/06/29 10:35:35 DEBUG ipc.ProtobufRpcEngine: Call: getListing took 3ms

Found 11 items

drwxrwxrwx   - yarn   hadoop          0 2016-03-11 10:12 /app-logs

drwxr-xr-x   - hdfs   hdfs            0 2016-03-11 10:18 /apps

drwxr-xr-x   - yarn   hadoop          0 2016-03-11 10:12 /ats

drwxr-xr-x   - hdfs   hdfs            0 2016-03-11 10:41 /demo

drwxr-xr-x   - hdfs   hdfs            0 2016-03-11 10:12 /hdp

drwxr-xr-x   - mapred hdfs            0 2016-03-11 10:12 /mapred

drwxrwxrwx   - mapred hadoop          0 2016-03-11 10:12 /mr-history

drwxr-xr-x   - hdfs   hdfs            0 2016-03-11 10:33 /ranger

drwxrwxrwx   - spark  hadoop          0 2016-06-29 10:35 /spark-history

drwxrwxrwx   - hdfs   hdfs            0 2016-03-11 10:23 /tmp

drwxr-xr-x   - hdfs   hdfs            0 2016-03-11 10:24 /user

16/06/29 10:35:35 DEBUG ipc.Client: stopping client from cache: org.apache.hadoop.ipc.Client@3253bcf3

16/06/29 10:35:35 DEBUG ipc.Client: removing client from cache: org.apache.hadoop.ipc.Client@3253bcf3

16/06/29 10:35:35 DEBUG ipc.Client: stopping actual client because no more references remain: org.apache.hadoop.ipc.Client@3253bcf3

16/06/29 10:35:35 DEBUG ipc.Client: Stopping client

16/06/29 10:35:35 DEBUG ipc.Client: IPC Client (1548560986) connection to sandbox.hortonworks.com/172.16.162.136:8020 from root: closed

16/06/29 10:35:35 DEBUG ipc.Client: IPC Client (1548560986) connection to sandbox.hortonworks.com/172.16.162.136:8020 from root: stopped, remaining connections 0
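When you are done, unset the variable (or set it back to the default) so that subsequent commands in the same shell return to normal logging; a minimal example:

[root@sandbox ~]# unset HADOOP_ROOT_LOGGER

or

[root@sandbox ~]# export HADOOP_ROOT_LOGGER=INFO,console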

The second method is to edit the logger property in Ambari itself; then, whenever the service starts, it will keep writing debug logs to its respective log file.

1. Edit hadoop-env template section

2. Define this environment variable to enable debug logging for NameNode:

export HADOOP_NAMENODE_OPTS="${HADOOP_NAMENODE_OPTS} -Dhadoop.root.logger=DEBUG,DRFA"

3. Define this environment variable to enable debug logging for DataNode:

export HADOOP_DATANODE_OPTS="${HADOOP_DATANODE_OPTS} -Dhadoop.root.logger=DEBUG,DRFA"

4. Save the configuration and restart the required HDFS services as suggested by Ambari; you can then verify debug logging as shown below.
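To confirm that debug logging is active after the restart, you can grep the NameNode log for DEBUG entries (the log path below is the usual HDP default and may differ in your environment):

$ grep -c ' DEBUG ' /var/log/hadoop/hdfs/hadoop-hdfs-namenode-*.log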



Backup and Restore of Postgres Database

How To Backup Postgres Database

1. Backup a single postgres database

This example backs up the erp database, which belongs to the user geekstuff, to the file mydb.sql:

$ pg_dump -U geekstuff erp -f mydb.sql


It prompts for a password; after authentication, mydb.sql is created containing CREATE TABLE, ALTER TABLE, and COPY statements for all the tables in the erp database. The following is a partial output of mydb.sql showing the dump of the employee_details table.

--
-- Name: employee_details; Type: TABLE; Schema: public; Owner: geekstuff; Tablespace:
--

CREATE TABLE employee_details (
employee_name character varying(100),
emp_id integer NOT NULL,
designation character varying(50),
comments text
);

ALTER TABLE public.employee_details OWNER TO geekstuff;

--
-- Data for Name: employee_details; Type: TABLE DATA; Schema: public; Owner: geekstuff
--
COPY employee_details (employee_name, emp_id, designation, comments) FROM stdin;
geekstuff 1001 trainer
ramesh 1002 author
sathiya 1003 reader
\.
--
-- Name: employee_details_pkey; Type: CONSTRAINT; Schema: public; Owner: geekstuff; Tablespace:
--
ALTER TABLE ONLY employee_details

ADD CONSTRAINT employee_details_pkey PRIMARY KEY (emp_id);
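For larger databases you may prefer a compressed or custom-format dump; a couple of common variations (file names here are just examples):

$ pg_dump -U geekstuff -Fc erp -f mydb.dump        (custom format, restore with pg_restore)
$ pg_dump -U geekstuff erp | gzip > mydb.sql.gz    (plain SQL, compressed with gzip)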

2. Backup all postgres databases

To backup all databases, list out all the available databases as shown below.

Log in as the postgres user:

$ su postgres

List the databases:

$ psql -l

List of databases
Name | Owner | Encoding
-----------+-----------+----------
article | sathiya | UTF8
backup | postgres | UTF8
erp | geekstuff | UTF8
geeker | sathiya | UTF8

Backup all postgres databases using pg_dumpall:

You can backup all the databases using pg_dumpall command.

$ pg_dumpall > all.sql

Verify the backup:

Verify whether all the databases are backed up,

$ grep "^[\]connect" all.sql
\connect article
\connect backup
\connect erp
\connect geeker

3. Backup a specific postgres table

$ pg_dump --table products -U geekstuff article -f onlytable.sql

To back up a specific table, use the --table TABLENAME option with the pg_dump command. If tables with the same name exist in different schemas, use the --schema SCHEMANAME option as well.
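Alternatively, you can qualify the table name with its schema directly; a sketch (the public schema here is only an example):

$ pg_dump -U geekstuff --table=public.products article -f onlytable.sql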

How To Restore Postgres Database

1. Restore a postgres database

$ psql -U erp -d erp_devel -f mydb.sql

This restores the dumped database to the erp_devel database.

Restore error messages

While restoring, you may see the following errors and warnings, which can be ignored:

psql:mydb.sql:13: ERROR:  must be owner of schema public
psql:mydb.sql:34: ERROR:  must be member of role "geekstuff"
psql:mydb.sql:59: WARNING:  no privileges could be revoked
psql:mydb.sql:60: WARNING:  no privileges could be revoked
psql:mydb.sql:61: WARNING:  no privileges were granted
psql:mydb.sql:62: WARNING:  no privileges were granted

2. Back up a local postgres database and restore it to a remote server using a single command:

$ pg_dump dbname | psql -h hostname dbname

The above dumps the local database and restores it directly on the given remote host.
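If the remote server needs an explicit user or the database names differ, the same idea works with additional options (the hostname and database names below are only placeholders, and the target database must already exist on the remote server):

$ pg_dump -U geekstuff erp | psql -h remote.example.com -U geekstuff erp_copy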

3. Restore all the postgres databases

$ su postgres
$ psql -f alldb.sql

4. Restore a single postgres table

The following psql command restores the product table into the geekstuff database:

$ psql -f producttable.sql geekstuff




Hive Cross Cluster replication

Hive Cross-Cluster Replication

Here I explain cross-cluster replication with a Falcon Feed entity. This is a simple way to enforce disaster recovery policies or to aggregate data from multiple clusters into a single cluster for enterprise reporting. To further illustrate Apache Falcon's capabilities, we will use an HCatalog/Hive table as the Feed entity.

Step 1: First create databases/tables on source and target clusters:

-- Run on primary cluster
create database landing_db;
use landing_db;
CREATE TABLE summary_table(id int, value string) PARTITIONED BY (ds string);
ALTER TABLE summary_table ADD PARTITION (ds = '2014-01');
ALTER TABLE summary_table ADD PARTITION (ds = '2014-02');
ALTER TABLE summary_table ADD PARTITION (ds = '2014-03');

insert into summary_table PARTITION(ds) values (1,'abc1','2014-01');
insert into summary_table PARTITION(ds) values (2,'abc2','2014-02');
insert into summary_table PARTITION(ds) values (3,'abc3','2014-03');

 

-- Run on secondary cluster

create database archive_db;
use archive_db;
CREATE TABLE summary_archive_table(id int, value string) PARTITIONED BY (ds string);
Step 2: Now create the Falcon staging and working directories on both clusters:

hadoop fs -mkdir /apps/falcon/staging

hadoop fs -mkdir /apps/falcon/working

hadoop fs -chown falcon /apps/falcon/staging

hadoop fs -chown falcon /apps/falcon/working

hadoop fs -chmod 777 /apps/falcon/staging

hadoop fs -chmod 755 /apps/falcon/working

 

Step 3: Configure your source and target clusters for distcp with NameNode High Availability.
http://www.hadoopadmin.co.in/bigdata/distcp-between-high-availability-enabled-cluster/
In order to run distcp between two HDFS HA clusters (for example A and B), modify the following in hdfs-site.xml on both clusters.

For example, suppose the nameservices for cluster A and cluster B are CLUSTERAHA and CLUSTERBHA respectively.
– Add the nameservices of both clusters to dfs.nameservices

dfs.nameservices = CLUSTERAHA,CLUSTERBHA

– Add property dfs.internal.nameservices
In cluster A:
dfs.internal.nameservices =CLUSTERAHA
In cluster B:
dfs.internal.nameservices =CLUSTERBHA

– Add dfs.ha.namenodes.<nameservice>
In cluster A
dfs.ha.namenodes.CLUSTERBHA = nn1,nn2
In cluster B
dfs.ha.namenodes.CLUSTERAHA = nn1,nn2

– Add property dfs.namenode.rpc-address.<cluster>.<nn>
In cluster A
dfs.namenode.rpc-address.CLUSTERBHA.nn1 =server1:8020
dfs.namenode.rpc-address.CLUSTERBHA.nn2 =server2:8020
In cluster B
dfs.namenode.rpc-address.CLUSTERAHA.nn1 =server1:8020
dfs.namenode.rpc-address.CLUSTERAHA.nn2 =server2:8020

– Add property dfs.client.failover.proxy.provider.<nameservice – i.e. CLUSTERAHA or CLUSTERBHA>
In cluster A
dfs.client.failover.proxy.provider.CLUSTERBHA = org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
In cluster B
dfs.client.failover.proxy.provider.CLUSTERAHA = org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider

– Restart HDFS service.

Once complete, you will be able to run the distcp command using the nameservices, for example:

hadoop distcp hdfs://CLUSTERAHA/user/s0998dnz/input.txt hdfs://CLUSTERBHA/tmp/
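You can then verify the copy from either cluster by addressing the file through the remote nameservice (assuming the nameservices configured above):

hadoop fs -ls hdfs://CLUSTERBHA/tmp/input.txt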

Step 4: Now create cluster entities for the source and target clusters and submit them, using sample cluster definitions like the ones below.

[s0998dnz@server1 hiveReplication]$ ll

total 24

-rw-r--r-- 1 s0998dnz hdpadm 1031 Jun 15 06:43 cluster1.xml

-rw-r--r-- 1 s0998dnz hdpadm 1030 Jun 15 05:11 cluster2.xml

-rw-r--r-- 1 s0998dnz hdpadm 1141 Jun 1 05:44 destinationCluster.xml

-rw-r--r-- 1 s0998dnz hdpadm 794 Jun 15 05:05 feed.xml

-rw-r--r-- 1 s0998dnz hdpadm 1114 Jun 1 06:36 replication-feed.xml

-rw-r--r-- 1 s0998dnz hdpadm 1080 Jun 15 05:07 sourceCluster.xml

[s0998dnz@server1 hiveReplication]$ cat cluster1.xml

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<cluster name="source" description="primary" colo="primary" xmlns="uri:falcon:cluster:0.1">
    <tags>EntityType=Cluster</tags>
    <interfaces>
        <interface type="readonly" endpoint="hdfs://CLUSTERAHA" version="2.2.0"/>
        <interface type="write" endpoint="hdfs://CLUSTERAHA" version="2.2.0"/>
        <interface type="execute" endpoint="server2:8050" version="2.2.0"/>
        <interface type="workflow" endpoint="http://server1:11000/oozie/" version="4.0.0"/>
        <interface type="messaging" endpoint="tcp://server2:61616?daemon=true" version="5.1.6"/>
        <interface type="registry" endpoint="thrift://server2:9083" version="1.2.1"/>
    </interfaces>
    <locations>
        <location name="staging" path="/apps/falcon/staging"/>
        <location name="temp" path="/tmp"/>
        <location name="working" path="/apps/falcon/working"/>
    </locations>
</cluster>

 

[s0998dnz@server2 hiveReplication]$ cat cluster2.xml

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<cluster name="target" description="target" colo="backup" xmlns="uri:falcon:cluster:0.1">
    <tags>EntityType=Cluster</tags>
    <interfaces>
        <interface type="readonly" endpoint="hdfs://CLUSTERBHA" version="2.2.0"/>
        <interface type="write" endpoint="hdfs://CLUSTERBHA" version="2.2.0"/>
        <interface type="execute" endpoint="server2:8050" version="2.2.0"/>
        <interface type="workflow" endpoint="http://server2:11000/oozie/" version="4.0.0"/>
        <interface type="messaging" endpoint="tcp://server2:61616?daemon=true" version="5.1.6"/>
        <interface type="registry" endpoint="thrift://server2:9083" version="1.2.1"/>
    </interfaces>
    <locations>
        <location name="staging" path="/apps/falcon/staging"/>
        <location name="temp" path="/tmp"/>
        <location name="working" path="/apps/falcon/working"/>
    </locations>
</cluster>

 

falcon entity -type cluster -submit -file cluster1.xml

falcon entity -type cluster -submit -file cluster2.xml
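To confirm that both cluster entities were registered, you can list them with the Falcon CLI (a quick check; output format varies by Falcon version):

falcon entity -type cluster -list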
Step 5: Copy the updated configuration files (/etc/hadoop/conf/*) from the source cluster to the target's Oozie server.

zip -r sourceClusterConf1.zip /etc/hadoop/conf/

scp sourceClusterConf1.zip s0998dnz@server1:/home/s0998dnz/
Step 6: On the target Oozie server, run the following commands:

mkdir -p /hdptmp/hadoop_primary/conf

chmod 777 /hdptmp/hadoop_primary/conf

unzip sourceClusterConf1.zip

cp etc/hadoop/conf/* /hdptmp/hadoop_primary/conf/

cp -r etc/hadoop/conf/* /hdptmp/hadoop_primary/conf/
Step 7: Once you have copied the configuration, modify the property below in the target cluster's Oozie configuration.

<name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
<value>*={{hadoop_conf_dir}},server2:8050=/hdptmp/hadoop_primary/conf,server1:8050=/hdptmp/hadoop_primary/conf,server1:8020=/hdptmp/hadoop_primary/conf,server2:8020=/hdptmp/hadoop_primary/conf</value>

Note: You can change /hdptmp/hadoop_primary/conf to a directory of your choice; however, Oozie must have access to that path.
Step 8: Finally, submit and schedule the feed definition using the feed.xml file below.

[s0998dnz@server1 hiveReplication]$ cat feed.xml

<?xml version="1.0" encoding="UTF-8"?>
<feed description="Monthly Analytics Summary" name="replication-feed" xmlns="uri:falcon:feed:0.1">
    <tags>EntityType=Feed</tags>
    <frequency>months(1)</frequency>
    <clusters>
        <cluster name="source" type="source">
            <validity start="2014-01-01T00:00Z" end="2015-03-31T00:00Z"/>
            <retention limit="months(36)" action="delete"/>
        </cluster>
        <cluster name="target" type="target">
            <validity start="2014-01-01T00:00Z" end="2016-03-31T00:00Z"/>
            <retention limit="months(180)" action="delete"/>
            <table uri="catalog:archive_db:summary_archive_table#ds=${YEAR}-${MONTH}"/>
        </cluster>
    </clusters>
    <table uri="catalog:landing_db:summary_table#ds=${YEAR}-${MONTH}"/>
    <ACL owner="falcon"/>
    <schema location="hcat" provider="hcat"/>
</feed>

falcon entity -type feed -submit -file feed.xml

falcon entity -type feed -schedule -name replication-feed
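You can check that the feed was scheduled successfully with a status call (a minimal check using the Falcon CLI):

falcon entity -type feed -name replication-feed -status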
This feed is scheduled starting from 2014-01, so insert the sample values shown in Step 1 into your source table.


How to read compressed data from hdfs through hadoop command

Category : Bigdata

Sometimes we need to read compressed data from HDFS using an HDFS command, and the data may be compressed with any of several codecs (.gz, .snappy, .lzo, .bz2, etc.).

In this article I explain how we can achieve this with the following steps:

Step 1: Copy a compressed file to an HDFS directory:

[s0998dnz@hdpm1 ~]$ hadoop fs -put logs.tar.gz /tmp/

Step 2: Now you can use the built-in 'hadoop fs -text' command to read this .gz file. It automatically finds the right decompressor for any simple text file and prints the uncompressed data to standard output:

[user1@hdpm1 ~]$ hadoop fs -text /tmp/logs.tar.gz
var/log/hadoop/hdfs/gc.log-2016052306240000644002174336645170000001172412720563430016153 0ustar   hdfshadoop
2016-05-23T06:24:03.539-0400: 2.104: [GC2016-05-23T06:24:03.540-0400: 2.104: [ParNew: 163840K->14901K(184320K), 0.0758510 secs] 163840K->14901K(33533952K), 0.0762040 secs] [Times: user=0.51 sys=0.01, real=0.08 secs]
2016-05-23T06:24:04.613-0400: 3.178: [GC2016-05-23T06:24:04.613-0400: 3.178: [ParNew: 178741K->16370K(184320K), 0.1591140 secs] 965173K->882043K(33533952K), 0.1592230 secs] [Times: user=1.21 sys=0.03, real=0.16 secs]
2016-05-23T06:24:06.121-0400: 4.686: [GC2016-05-23T06:24:06.121-0400: 4.686: [ParNew: 180210K->11741K(184320K), 0.0811950 secs] 1045883K->887215K(33533952K), 0.0813160 secs] [Times: user=0.63 sys=0.00, real=0.09 secs]
2016-05-23T06:24:12.313-0400: 10.878: [GC2016-05-23T06:24:12.313-0400: 10.878: [ParNew: 175581K->9827K(184320K), 0.0751580 secs] 1051055K->892704K(33533952K), 0.0752800 secs] [Times: user=0.56 sys=0.01, real=0.07 secs]
2016-05-23T06:24:13.881-0400: 12.445: [GC2016-05-23T06:24:13.881-0400: 12.445: [ParNew: 173667K->20480K(184320K), 0.0810330 secs] 1056544K->920485K(33533952K), 0.0812040 secs] [Times: user=0.58 sys=0.01, real=0.08 secs]
2016-05-23T06:24:16.515-0400: 15.080: [GC2016-05-23T06:24:16.515-0400: 15.080: [ParNew: 184320K->13324K(184320K), 0.0867770 secs] 1084325K->931076K(33533952K), 0.0870140 secs] [Times: user=0.63 sys=0.01, real=0.08 secs]
2016-05-23T06:24:17.268-0400: 15.833: [GC2016-05-23T06:24:17.268-0400: 15.833: [ParNew: 177164K->11503K(184320K), 0.0713880 secs] 1094916K->929256K(33533952K), 0.0715820 secs] [Times: user=0.55 sys=0.00, real=0.07 secs]
2016-05-23T06:25:14.412-0400: 72.977: [GC2016-05-23T06:25:14.412-0400: 72.977: [ParNew: 175343K->18080K(184320K), 0.0779040 secs] 1093096K->935833K(33533952K), 0.0781710 secs] [Times: user=0.59 sys=0.01, real=0.07 secs]
2016-05-23T06:26:49.597-0400: 168.161: [GC2016-05-23T06:26:49.597-0400: 168.162: [ParNew: 181920K->13756K(184320K), 0.0839120 secs] 1099673K->941811K(33533952K), 0.0841350 secs] [Times: user=0.62 sys=0.01, real=0.08 secs]
2016-05-23T06:26:50.126-0400: 168.691: [GC2016-05-23T06:26:50.127-0400: 168.691: [ParNew: 177596K->9208K(184320K), 0.0641380 secs] 1105651K->937264K(33533952K), 0.0644310 secs] [Times: user=0.50 sys=0.00, real=0.07 secs]
2016-05-23T06:27:19.282-0400: 197.846: [GC2016-05-23T06:27:19.282-0400: 197.847: [ParNew: 173048K->10010K(184320K), 0.0687210 secs] 1101104K->938065K(33533952K), 0.0689210 secs] [Times: user=0.54 sys=0.00, real=0.07 secs]
2016-05-23T06:30:45.428-0400: 403.992: [GC2016-05-23T06:30:45.428-0400: 403.992: [ParNew: 173850K->9606K(184320K), 0.0723210 secs] 1101905K->937661K(33533952K), 0.0726160 secs] [Times: user=0.56 sys=0.00, real=0.07 secs]
2016-05-23T06:37:15.629-0400: 794.193: [GC2016-05-23T06:37:15.629-0400: 794.193: [ParNew: 173446K->9503K(184320K), 0.0723460 secs] 1101501K->937558K(33533952K), 0.0726260 secs] [Times: user=0.57 sys=0.0

In the above example I read a .gz file; the same approach should also work for .snappy, .lzo and .bz2 files.

This is an important feature because Hadoop uses a custom file format for Snappy files. This is the only direct way to uncompress a Hadoop-created Snappy file.

Note: hadoop fs -text is single-threaded and runs the decompression on the machine where you run the command.
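Since the example file above is actually a gzipped tar archive, another option for plain gzip data is to stream the bytes with 'hadoop fs -cat' and let local tools handle the decompression (a sketch; the target directory is just an example):

[user1@hdpm1 ~]$ hadoop fs -cat /tmp/logs.tar.gz | tar -tzf -                      # list the files inside the archive
[user1@hdpm1 ~]$ mkdir -p /tmp/extracted && hadoop fs -cat /tmp/logs.tar.gz | tar -xzf - -C /tmp/extracted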



How do I change an existing Ambari DB Postgres to MySQL?

Category : Ambari , Bigdata

By default, when you configure your Ambari server it uses a PostgreSQL database. If you later need to change it to the database your organization prefers (such as MySQL), use the following steps.

Step 1: Stop your Ambari server and take a backup of the Postgres ambari database (the default password is 'bigdata'):

$ ambari-server stop

$ pg_dump -U ambari ambari > /temp/ambari.sql

Step 2: Now set up MySQL on one of the nodes and install the MySQL JDBC connector:

$ yum install mysql-connector-java

Confirm that the .jar file is in the Java share directory and make sure it has the appropriate permissions (644):

$ ls /usr/share/java/mysql-connector-java.jar
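Depending on your Ambari version, you may also need to register the JDBC driver with Ambari; a hedged example (check 'ambari-server setup --help' for the exact flags in your release):

$ ambari-server setup --jdbc-db=mysql --jdbc-driver=/usr/share/java/mysql-connector-java.jar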

Step 3: Create a user for Ambari and grant it permissions.

For example, using the MySQL database admin utility:

# mysql -u root -p

CREATE USER 'ambari'@'%' IDENTIFIED BY 'bigdata';

GRANT ALL PRIVILEGES ON *.* TO 'ambari'@'%';

CREATE USER 'ambari'@'localhost' IDENTIFIED BY 'bigdata';

GRANT ALL PRIVILEGES ON *.* TO 'ambari'@'localhost';

CREATE USER 'ambari'@'hdpm1.com' IDENTIFIED BY 'bigdata';

GRANT ALL PRIVILEGES ON *.* TO 'ambari'@'hdpm1.com';

FLUSH PRIVILEGES;

Step 4: Now load/restore the Ambari Server database schema from the Postgres backup:

mysql -u ambari -p

CREATE DATABASE ambaridb;

USE ambaridb;

SOURCE /temp/ambari.sql; (the backup taken from Postgres)
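As a quick sanity check that the schema loaded, you can list the tables in the new database:

SHOW TABLES;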

Step 5: Now finally update the ambari-server configuration to reference the MySQL instance:

  • On the ambari-server node, run Ambari setup:

$ ambari-server setup

When prompted "Enter advanced database configuration [y/n] (n)?", choose "y" and follow the steps to configure MySQL (option 3), supplying the ambaridb database, the ambari user, and the password created above. Once that is done, do not change any other settings.

Once setup is complete for the MySQL instance, you can start Ambari:

$ ambari-server start

You have now successfully migrated the Ambari database from Postgres to MySQL.



Error: java.io.IOException: java.lang.RuntimeException: serious problem (state=,code=0)

Category : Bigdata

If you run a Hive query on ORC tables in HDP 2.3.4, you may encounter this issue. It is caused by ORC split generation running on a global thread pool without doAs being propagated to that thread pool; threads in the pool are created on demand at execution time and thus execute as whichever users happened to be active at that time.

This is a known issue, tracked and fixed in https://issues.apache.org/jira/browse/HIVE-13120.

Intermittently, ODBC users get an error saying that another user does not have permission on the table; HiveServer2 appears to check the wrong user. For example, if you run a job as user 'user1', the error message looks something like this:

WARN [HiveServer2-Handler-Pool: Thread-587]: thrift.ThriftCLIService (ThriftCLIService.java:FetchResults(681)) - Error fetching results:
org.apache.hive.service.cli.HiveSQLException: java.io.IOException: java.lang.RuntimeException: serious problem
Caused by: java.io.IOException: java.lang.RuntimeException: serious problem

Caused by: java.lang.RuntimeException: serious problem
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1059)

Caused by: java.util.concurrent.ExecutionException: org.apache.hadoop.security.AccessControlException: Permission denied: user=haha, access=READ_EXECUTE, inode="/apps/hive/warehouse/xixitb":xixi:hdfs:drwx------
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)

Caused by: org.apache.hadoop.security.AccessControlException: Permission denied: user=haha, access=READ_EXECUTE, inode="/apps/hive/warehouse/xixitb":xixi:hdfs:drwx------
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:219)

Note that user 'haha' is not querying on 'xixitb' at all.

Resolution:
As a workaround, set the following property at run time, which turns off the local fetch task for HiveServer2:

set hive.fetch.task.conversion=none

0: jdbc:hive2://localhost:8443/default> set hive.fetch.task.conversion=none;

No rows affected (0.033 seconds)

0: jdbc:hive2://localhost:8443/default> select * from database1.table1 where lct_nbr=2413 and ivo_nbr in (17469,18630);

INFO  : Tez session hasn't been created yet. Opening session

INFO  : Dag name: select * from ldatabase1.table1…(17469,18630)(Stage-1)

INFO  :

INFO  : Status: Running (Executing on YARN cluster with App id application_1462173172032_65644)

INFO  : Map 1: -/-

INFO  : Map 1: 0/44

INFO  : Map 1: 0(+1)/44
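If you want the workaround to apply to every session instead of setting it per connection, the same property can be set in hive-site through Ambari and HiveServer2 restarted (the property name is identical):

hive.fetch.task.conversion=none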

Please feel free to give any feedback or suggestions for improvement.



Ranger User sync does not work due to ERROR UserGroupSync [UnixUserSyncThread]

Category : Bigdata

If AD/LDAP user sync is enabled in Ranger and we get the error below, we need to follow the steps given to resolve it.

LdapUserGroupBuilder [UnixUserSyncThread] - Updating user count: 148, userName:, groupList: [test, groups]
09 Jun 2016 09:04:34 ERROR UserGroupSync [UnixUserSyncThread] - Failed to initialize UserGroup source/sink. Will retry after 3600000 milliseconds. Error details:
javax.naming.PartialResultException: Unprocessed Continuation Reference(s); remaining name 'dc=companyName,dc=com'
at com.sun.jndi.ldap.LdapCtx.processReturnCode(LdapCtx.java:2866)
at com.sun.jndi.ldap.LdapCtx.processReturnCode(LdapCtx.java:2840)
at com.sun.jndi.ldap.LdapNamingEnumeration.getNextBatch(LdapNamingEnumeration.java:147)
at com.sun.jndi.ldap.LdapNamingEnumeration.hasMoreImpl(LdapNamingEnumeration.java:216)
at com.sun.jndi.ldap.LdapNamingEnumeration.hasMore(LdapNamingEnumeration.java:189)
at org.apache.ranger.ldapusersync.process.LdapUserGroupBuilder.updateSink(LdapUserGroupBuilder.java:318)
at org.apache.ranger.usergroupsync.UserGroupSync.run(UserGroupSync.java:58)
at java.lang.Thread.run(Thread.java:745)

Root Cause: When Ranger usersync is configured with ranger.usersync.ldap.referral = ignore, the LDAP search fails prematurely when it encounters additional referrals.

Resolution:

  1. Change the user search base DN from cn=Users,dc=companyName,dc=com to dc=companyName,dc=com.
  2. Also change ranger.usersync.ldap.referral from ignore to follow (see the property sketch after this list).
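In Ambari these map to the Ranger usersync properties; an illustrative sketch (exact property names can vary slightly between Ranger versions, so treat these as assumptions and verify against your configuration):

ranger.usersync.ldap.user.searchbase = dc=companyName,dc=com
ranger.usersync.ldap.referral = follow

Restart the Ranger Usersync service after making the change.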

These changes should resolve the above issue. I hope this helped you solve your problem easily.

Please feel free to give feedback or suggestions for any improvements.