Hive Actions with Oozie


Category : Hive

One of my friends was trying to run a Hive .hql script from an Oozie workflow and kept getting errors, so I decided to replicate it on my cluster and, after a few retries, got it working.

If you have the same requirement, i.e. running Hive SQL via Oozie, this article will help you get it done.

Step 1: First create a directory in HDFS (under your home directory) to keep all the scripts in one place and run the workflow from there:

[hdfs@m1 ~]$ hadoop fs -mkdir -p /user/ambari-qa/tutorial/hive-oozie

[root@m1 ]# hadoop fs -mkdir -p /user/ambari-qa/tutorial/hive-input
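The inputPath property in job.properties (Step 2) points at this hive-input directory, so you may also want to stage a small sample file there. The file name and contents below are just placeholders:

[root@m1 ]# echo "1,saurabh" > sample.txt

[root@m1 ]# hadoop fs -put sample.txt /user/ambari-qa/tutorial/hive-input/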

Step 2: Now create your workflow.xml and job.properties: 

[root@m1 hive_oozie_demo]# cat workflow.xml

<workflow-app xmlns="uri:oozie:workflow:0.4" name="hive-wf">

    <start to="hive-node"/>

    <action name="hive-node">

        <hive xmlns="uri:oozie:hive-action:0.2">

            <job-tracker>${jobTracker}</job-tracker>

            <name-node>${nameNode}</name-node>

            <job-xml>hive-site.xml</job-xml>

            <configuration>

                <property>

                    <name>mapred.job.queue.name</name>

                    <value>${queueName}</value>

                </property>

            </configuration>

            <script>script.hql</script>

            <param>INPUT_PATH=${inputPath}</param>

        </hive>

        <ok to="end"/>

        <error to="fail"/>

    </action>

    <kill name="fail">

        <message>Hive failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>

    </kill>

    <end name="end"/>

</workflow-app>

[root@m1 hive_oozie_demo]# cat job.properties

nameNode=hdfs://HDPINF

jobTracker=192.168.56.42:50300

queueName=default

exampleRoot=tutorial

oozie.use.system.libpath=true

oozie.libpath=/user/oozie/share/lib

oozie.wf.application.path=${nameNode}/user/ambari-qa/${exampleRoot}/hive-oozie

inputPath=${nameNode}/user/ambari-qa/${exampleRoot}/hive-input/*

Step 3: Now create the target table in your Hive database. The sample script below refers to it as test.demo, so I created it inside a database named test:

hive> create table demo(id int, name string);
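The script in the next step inserts into test.demo from a second table, test.demo1, so the demo table above should live in the test database and you also need a populated source table. A minimal setup (the sample row is just an example) looks like this:

hive> create database if not exists test;

hive> use test;

hive> create table demo1(id int, name string);

hive> insert into demo1 values(1,'saurabh');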

Step 4: Now create your Hive script:

[root@m1 hive_oozie_demo]# cat script.hql

insert into test.demo select * from test.demo1;
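The workflow also passes an INPUT_PATH parameter to the script. This simple script does not use it, but any ${PARAM} referenced in the .hql file gets the value passed from the workflow, so a parameterized script could, for example, load the staged input files directly. This is only a sketch, and note that a wildcard in the path can run into the problem described in Issue 3 at the end of this article:

load data inpath '${INPUT_PATH}' into table test.demo;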

Step 5: Now you need to set up your Oozie workflow application folder. One file is essential for executing a Hive action through Oozie: hive-site.xml. When Oozie executes a Hive action, it needs Hive's configuration file, and you can provide multiple configuration files in a single action. You can find the Hive configuration file at "/etc/hive/conf.dist/hive-site.xml" (the default location). Copy that file and put it inside your workflow application path in HDFS.

[root@m1 hive_oozie_demo]# hadoop fs -put /etc/hive/conf/hive-site.xml /user/ambari-qa/tutorial/hive-oozie/

[root@m1 hive_oozie_demo]# hadoop fs -put script.hql /user/ambari-qa/tutorial/hive-oozie/

[root@m1 hive_oozie_demo]# hadoop fs -put workflow.xml /user/ambari-qa/tutorial/hive-oozie/

[root@m1 hive_oozie_demo]# hadoop fs -lsr /user/ambari-qa/tutorial/hive-oozie

lsr: DEPRECATED: Please use 'ls -R' instead.

-rw-r--r--   3 root hdfs      19542 2016-10-08 04:36 /user/ambari-qa/tutorial/hive-oozie/hive-site.xml

-rw-r--r--   3 root hdfs         65 2016-10-08 04:36 /user/ambari-qa/tutorial/hive-oozie/script.hql

-rw-r--r--   3 root hdfs        878 2016-10-08 04:38 /user/ambari-qa/tutorial/hive-oozie/workflow.xml

Look at the <job-xml> tag: since I put hive-site.xml in my application path, I pass just the file name, not the full location. If you want to keep the file somewhere else in HDFS, you can pass the whole HDFS path there instead. In older versions of Hive, the user had to provide a hive-default.xml file via the oozie.hive.defaults property when running an Oozie Hive action, but from Hive 0.8 onwards this is no longer required.
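For example, if hive-site.xml lived somewhere else in HDFS, the tag could reference it with a full path instead (the path below is just an illustration):

<job-xml>${nameNode}/user/ambari-qa/conf/hive-site.xml</job-xml>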

Step 6: Now submit the Oozie job to run it:

[ambari-qa@m1 ~]$ oozie job -oozie http://m2.hdp22:11000/oozie -config job.properties -run

job: 0000004-161008041417432-oozie-oozi-W
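While it runs, you can also tail the job log for the same job id if you want to watch progress:

[ambari-qa@m1 ~]$ oozie job -oozie http://m2.hdp22:11000/oozie -log 0000004-161008041417432-oozie-oozi-W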

Now you can check the workflow status via the Oozie web UI or from the command line:

[ambari-qa@m1 ~]$ oozie job -oozie http://m2.hdp22:11000/oozie -info 0000004-161008041417432-oozie-oozi-W

Job ID : 0000004-161008041417432-oozie-oozi-W

------------------------------------------------------------------------------------------------------------------------

Workflow Name : hive-wf

App Path      : hdfs://HDPINF/user/ambari-qa/tutorial/hive-oozie

Status        : SUCCEEDED

Run           : 0

User          : ambari-qa

Group         : -

Created       : 2016-10-08 11:02 GMT

Started       : 2016-10-08 11:02 GMT

Last Modified : 2016-10-08 11:02 GMT

Ended         : 2016-10-08 11:02 GMT

CoordAction ID: -

Actions

------------------------------------------------------------------------------------------------------------------------

ID                                                                            Status    Ext ID                 Ext Status Err Code

------------------------------------------------------------------------------------------------------------------------

0000004-161008041417432-oozie-oozi-W@:start:                                  OK                              OK                  

------------------------------------------------------------------------------------------------------------------------

0000004-161008041417432-oozie-oozi-W@hive-node                                OK        job_1475917713796_0007 SUCCEEDED           

------------------------------------------------------------------------------------------------------------------------

0000004-161008041417432-oozie-oozi-W@end                                      OK                              OK                  

------------------------------------------------------------------------------------------------------------------------

If the workflow succeeds, you can check your table; it should now have the data loaded:

hive> select * from demo;

OK

1 saurabh

Time taken: 0.328 seconds, Fetched: 1 row(s)

I hope this article helps you run your Hive SQL in an Oozie workflow. Please feel free to reach out to me with any suggestions or doubts.

Common issues:

Issue 1: You may see a NameNode error if you have hard-coded a NameNode URI in your job.properties and that NameNode is currently in standby:

[ambari-qa@m1 ~]$ oozie job -oozie http://m2.hdp22:11000 -config job.properties -run

Error: IO_ERROR : java.io.IOException: Error while connecting Oozie server. No of retries = 1. Exception = Could not authenticate, Authentication failed, status: 404, message: Not Found

[ambari-qa@m1 ~]$ oozie job -oozie http://m2.hdp22:11000/oozie -config job.properties -run

Error: E0501 : E0501: Could not perform authorization operation, Operation category READ is not supported in state standby  at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)  at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1786)  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1305)  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3851)  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1011)  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:843)  at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2081)  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2077)  at java.security.AccessController.doPrivileged(Native Method)  at javax.security.auth.Subject.doAs(Subject.java:415)  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2075)

Resolution: To resolve this issue, use your HA nameservice ID instead of a hard-coded NameNode URI in job.properties:

[root@m1 hive_oozie_demo]# cat job.properties

nameNode=hdfs://HDPINF
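If you are not sure which NameNode is currently active, you can check before submitting. Here nn1 and nn2 are the NameNode IDs from dfs.ha.namenodes.<nameservice> in hdfs-site.xml, so yours may be named differently:

[hdfs@m1 ~]$ hdfs haadmin -getServiceState nn1

[hdfs@m1 ~]$ hdfs haadmin -getServiceState nn2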

Issue 2: If you see an exit code [40000] error with the following message, then you need to look at your job.properties.
FAILED: SemanticException org.apache.hadoop.hive.ql.metadata.HiveException: Unable to determine if hdfs://HDPINF/apps/hive/warehouse/test.db/demo is encrypted: java.lang.IllegalArgumentException: Wrong FS: hdfs://HDPINF/apps/hive/warehouse/test.db/demo, expected: hdfs://HDPINF:8020
Intercepting System.exit(40000)
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.HiveMain], exit code [40000]

Resolution: Change your NameNode URI, i.e. remove the port from the end of the URI.
Use nameNode=hdfs://HDPINF instead of nameNode=hdfs://HDPINF:8020.

Issue 3: If you see the following error, then you may need to change your Hive SQL or your approach, because this issue appears to be unresolved:

FAILED: SemanticException [Error 10028]: Line 1:17 Path is not legal ''hdfs://HDPINF:8020/user/ambari-qa/tutorial/hive-input/*'': 
Move from: hdfs://HDPINF:8020/user/ambari-qa/tutorial/hive-input/* to: hdfs://HDPINF/apps/hive/warehouse/test.db/demo is not valid.
 Please check that values for params "default.fs.name" and "hive.metastore.warehouse.dir" do not conflict.
Intercepting System.exit(10028)
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.HiveMain], exit code [10028]

Resolution: see https://issues.apache.org/jira/browse/HIVE-8147
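Until that is fixed, a common workaround is to avoid moving a wildcard path into the managed table at all: point an external staging table at the input directory (no wildcard) and insert from it. This is only a sketch, and the delimiter assumes comma-separated input files:

hive> create external table test.demo_staging(id int, name string) row format delimited fields terminated by ',' location '/user/ambari-qa/tutorial/hive-input';

hive> insert into test.demo select * from test.demo_staging;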


Pig script with HCatLoader on Hive ORC table

Category : Pig

Sometimes we have to run Pig commands against Hive ORC tables; this article will help you do that.

Step 1: First create a Hive ORC table:

hive> CREATE TABLE orc_table(col1 BIGINT, col2 STRING) CLUSTERED BY (col1) INTO 10 BUCKETS ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS ORC TBLPROPERTIES ('transactional'='true');
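Because the table is transactional, the usual Hive ACID prerequisites must already be enabled on the cluster (they were on mine). If they are not, settings along these lines are typically required, either in hive-site.xml or per session; verify the exact values against your Hive version:

hive> SET hive.support.concurrency=true;

hive> SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;

hive> SET hive.enforce.bucketing=true;

hive> SET hive.exec.dynamic.partition.mode=nonstrict;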

Step 2: Now insert data to this table:

hive> insert into orc_table values(122342,'test');

hive> insert into orc_table values(231232,'rest');

hive> select * from orc_table;

OK

122342 test

231232 rest

Time taken: 1.663 seconds, Fetched: 2 row(s)

Step 3: Now create the Pig script:

[user1@server1 ~]$ cat  myscript.pig

A = LOAD 'test1.orc_table' USING org.apache.hive.hcatalog.pig.HCatLoader();

Dump A;
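Dump only prints the tuples to the console. If you would rather write the result into another Hive table, HCatStorer is the counterpart to HCatLoader; the target table name used below (test1.orc_copy) is just an example and must already exist with a matching schema:

A = LOAD 'test1.orc_table' USING org.apache.hive.hcatalog.pig.HCatLoader();

STORE A INTO 'test1.orc_copy' USING org.apache.hive.hcatalog.pig.HCatStorer();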

Step 4: Now run your Pig script:

[user1@server1 ~]$ pig -useHCatalog -f myscript.pig

WARNING: Use "yarn jar" to launch YARN applications.

16/09/16 03:31:02 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL

16/09/16 03:31:02 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE

16/09/16 03:31:02 INFO pig.ExecTypeProvider: Picked MAPREDUCE as the ExecType

2016-09-16 03:31:02,440 [main] INFO  org.apache.pig.Main – Apache Pig version 0.15.0.2.3.4.0-3485 (rexported) compiled Dec 16 2015, 04:30:33

2016-09-16 03:31:02,440 [main] INFO  org.apache.pig.Main – Logging error messages to: /home/user1/pig_1474011062438.log

2016-09-16 03:31:03,233 [main] INFO  org.apache.pig.impl.util.Utils – Default bootup file /home/user1/.pigbootup not found

2016-09-16 03:31:03,386 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine – Connecting to hadoop file system at: hdfs://HDPINFHA

2016-09-16 03:31:04,269 [main] INFO  org.apache.pig.PigServer – Pig Script ID for the session: PIG-myscript.pig-eb253b46-2d2e-495c-9149-ef305ee4e408

2016-09-16 03:31:04,726 [main] INFO  org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl – Timeline service address: http://server2:8188/ws/v1/timeline/

2016-09-16 03:31:04,726 [main] INFO  org.apache.pig.backend.hadoop.ATSService – Created ATS Hook

2016-09-16 03:31:05,618 [main] INFO  hive.metastore – Trying to connect to metastore with URI thrift://server2:9083

2016-09-16 03:31:05,659 [main] INFO  hive.metastore – Connected to metastore.

2016-09-16 03:31:06,209 [main] INFO  org.apache.pig.tools.pigstats.ScriptState – Pig features used in the script: UNKNOWN

2016-09-16 03:31:06,247 [main] INFO  org.apache.pig.data.SchemaTupleBackend – Key [pig.schematuple] was not set… will not generate code.

2016-09-16 03:31:06,284 [main] INFO  org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer – {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}

2016-09-16 03:31:06,384 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler – File concatenation threshold: 100 optimistic? false

2016-09-16 03:31:06,409 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer – MR plan size before optimization: 1

2016-09-16 03:31:06,409 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer – MR plan size after optimization: 1

2016-09-16 03:31:06,576 [main] INFO  org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl – Timeline service address: http://server2:8188/ws/v1/timeline/

2016-09-16 03:31:06,758 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.MRScriptState – Pig script settings are added to the job

2016-09-16 03:31:06,762 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler – mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3

2016-09-16 03:31:06,999 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler – This job cannot be converted run in-process

2016-09-16 03:31:07,292 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler – Added jar file:/usr/hdp/2.3.4.0-3485/hive/lib/hive-metastore-1.2.1.2.3.4.0-3485.jar to DistributedCache through /tmp/temp-1473630461/tmp428549735/hive-metastore-1.2.1.2.3.4.0-3485.jar

2016-09-16 03:31:07,329 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler – Added jar file:/usr/hdp/2.3.4.0-3485/hive/lib/libthrift-0.9.2.jar to DistributedCache through /tmp/temp-1473630461/tmp568922300/libthrift-0.9.2.jar

2016-09-16 03:31:07,542 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler – Added jar file:/usr/hdp/2.3.4.0-3485/hive/lib/hive-exec-1.2.1.2.3.4.0-3485.jar to DistributedCache through /tmp/temp-1473630461/tmp-1007595209/hive-exec-1.2.1.2.3.4.0-3485.jar

2016-09-16 03:31:07,577 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler – Added jar file:/usr/hdp/2.3.4.0-3485/hive/lib/libfb303-0.9.2.jar to DistributedCache through /tmp/temp-1473630461/tmp-1039107423/libfb303-0.9.2.jar

2016-09-16 03:31:07,609 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler – Added jar file:/usr/hdp/2.3.4.0-3485/hive/lib/jdo-api-3.0.1.jar to DistributedCache through /tmp/temp-1473630461/tmp-1375931436/jdo-api-3.0.1.jar

2016-09-16 03:31:07,642 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler – Added jar file:/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler-1.2.1.2.3.4.0-3485.jar to DistributedCache through /tmp/temp-1473630461/tmp-893657730/hive-hbase-handler-1.2.1.2.3.4.0-3485.jar

2016-09-16 03:31:07,674 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler – Added jar file:/usr/hdp/2.3.4.0-3485/hive-hcatalog/share/hcatalog/hive-hcatalog-core-1.2.1.2.3.4.0-3485.jar to DistributedCache through /tmp/temp-1473630461/tmp-1850340790/hive-hcatalog-core-1.2.1.2.3.4.0-3485.jar

2016-09-16 03:31:07,705 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler – Added jar file:/usr/hdp/2.3.4.0-3485/hive-hcatalog/share/hcatalog/hive-hcatalog-pig-adapter-1.2.1.2.3.4.0-3485.jar to DistributedCache through /tmp/temp-1473630461/tmp58999520/hive-hcatalog-pig-adapter-1.2.1.2.3.4.0-3485.jar

2016-09-16 03:31:07,775 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler – Added jar file:/usr/hdp/2.3.4.0-3485/pig/pig-0.15.0.2.3.4.0-3485-core-h2.jar to DistributedCache through /tmp/temp-1473630461/tmp-422634726/pig-0.15.0.2.3.4.0-3485-core-h2.jar

2016-09-16 03:31:07,808 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler – Added jar file:/usr/hdp/2.3.4.0-3485/pig/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp-1473630461/tmp1167068812/automaton-1.11-8.jar

2016-09-16 03:31:07,840 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler – Added jar file:/usr/hdp/2.3.4.0-3485/pig/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp-1473630461/tmp708151030/antlr-runtime-3.4.jar

2016-09-16 03:31:07,882 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler – Setting up single store job

2016-09-16 03:31:07,932 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher – 1 map-reduce job(s) waiting for submission.

2016-09-16 03:31:08,248 [JobControl] WARN  org.apache.hadoop.mapreduce.JobResourceUploader – No job jar file set.  User classes may not be found. See Job or Job#setJar(String).

2016-09-16 03:31:08,351 [JobControl] INFO  org.apache.hadoop.hive.ql.log.PerfLogger – <PERFLOG method=OrcGetSplits from=org.apache.hadoop.hive.ql.io.orc.ReaderImpl>

2016-09-16 03:31:08,355 [JobControl] INFO  org.apache.hadoop.hive.ql.io.orc.OrcInputFormat – ORC pushdown predicate: null

2016-09-16 03:31:08,416 [JobControl] INFO  org.apache.hadoop.hive.ql.io.orc.OrcInputFormat – FooterCacheHitRatio: 0/0

2016-09-16 03:31:08,416 [JobControl] INFO  org.apache.hadoop.hive.ql.log.PerfLogger – </PERFLOG method=OrcGetSplits start=1474011068351 end=1474011068416 duration=65 from=org.apache.hadoop.hive.ql.io.orc.ReaderImpl>

2016-09-16 03:31:08,421 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil – Total input paths (combined) to process : 1

2016-09-16 03:31:08,514 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter – number of splits:1

2016-09-16 03:31:08,612 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter – Submitting tokens for job: job_1472564332053_0029

2016-09-16 03:31:08,755 [JobControl] INFO  org.apache.hadoop.mapred.YARNRunner – Job jar is not present. Not adding any jar to the list of resources.

2016-09-16 03:31:08,947 [JobControl] INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl – Submitted application application_1472564332053_0029

2016-09-16 03:31:08,989 [JobControl] INFO  org.apache.hadoop.mapreduce.Job – The url to track the job: http://server2:8088/proxy/application_1472564332053_0029/

2016-09-16 03:31:08,990 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher – HadoopJobId: job_1472564332053_0029

2016-09-16 03:31:08,990 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher – Processing aliases A

2016-09-16 03:31:08,990 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher – detailed locations: M: A[1,4] C:  R:

2016-09-16 03:31:09,007 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher – 0% complete

2016-09-16 03:31:09,007 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher – Running jobs are [job_1472564332053_0029]

2016-09-16 03:31:28,133 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher – 50% complete

2016-09-16 03:31:28,133 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher – Running jobs are [job_1472564332053_0029]

2016-09-16 03:31:29,251 [main] INFO  org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl – Timeline service address: http://server2:8188/ws/v1/timeline/

2016-09-16 03:31:29,258 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate – Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server

2016-09-16 03:31:30,186 [main] INFO  org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl – Timeline service address: http://server2:8188/ws/v1/timeline/

HadoopVersion PigVersion UserId StartedAt FinishedAt Features

2.7.1.2.3.4.0-3485 0.15.0.2.3.4.0-3485 s0998dnz 2016-09-16 03:31:06 2016-09-16 03:31:30 UNKNOWN

Success!

Job Stats (time in seconds):

JobId Maps Reduces MaxMapTime MinMapTime AvgMapTime MedianMapTime MaxReduceTime MinReduceTime AvgReduceTime MedianReducetime Alias Feature Outputs

job_1472564332053_0029 1 0 5 5 5 5 0 0 0 0 A MAP_ONLY hdfs://HDPINFHA/tmp/temp-1473630461/tmp1899757076,

Input(s):

Successfully read 2 records (28587 bytes) from: "test1.orc_table"

Output(s):

Successfully stored 2 records (32 bytes) in: "hdfs://HDPINFHA/tmp/temp-1473630461/tmp1899757076"

Counters:

Total records written : 2

Total bytes written : 32

Spillable Memory Manager spill count : 0

Total bags proactively spilled: 0

Total records proactively spilled: 0

Job DAG:

job_1472564332053_0029

2016-09-16 03:31:30,822 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher – Success!

2016-09-16 03:31:30,825 [main] WARN  org.apache.pig.data.SchemaTupleBackend – SchemaTupleBackend has already been initialized

2016-09-16 03:31:30,834 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat – Total input paths to process : 1

2016-09-16 03:31:30,834 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil – Total input paths to process : 1

(122342,test)

(231232,rest)

2016-09-16 03:31:30,984 [main] INFO  org.apache.pig.Main – Pig script completed in 28 seconds and 694 milliseconds (28694 ms)