How to remove header from csv during loading to hive


Sometimes a data file has a header row that we do not want loaded into our Hive table. This article shows how to make Hive ignore that header.

[saurkuma@m1 ~]$ cat sampledata.csv

id,Name

1,Saurabh

2,Vishal

3,Jeba

4,Sonu

Step 1: Create the table with the skip.header.line.count table property, so that Hive ignores the first line of the file.

hive> create table test(id int, name string) row format delimited fields terminated by ',' tblproperties("skip.header.line.count"="1");

OK

Time taken: 0.233 seconds

hive> show tables;

OK

salesdata01

table1

table2

test

tmp

Time taken: 0.335 seconds, Fetched: 5 row(s)

Step 2: Load the file into the table and check the result.

hive> load data local inpath '/home/saurkuma/sampledata.csv' overwrite into table test;

Loading data to table demo.test

Table demo.test stats: [numFiles=1, totalSize=41]

OK

Time taken: 0.979 seconds

hive> select * from test;

OK

1 Saurabh

2 Vishal

3 Jeba

4 Sonu

Time taken: 0.111 seconds, Fetched: 4 row(s)
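
If the table already exists, the same property can most likely be added afterwards instead of recreating the table. Below is a minimal shell sketch of that idea. It assumes the hive CLI is on the PATH and that the table lives in the demo database seen in the load output above. The file name skip_header.hql is made up for illustration.

```shell
#!/bin/sh
# Sketch: add the header-skipping property to a table that already exists.
# Assumptions: database "demo", table "test", and an illustrative script name.
set -eu

# Write the statements to a small HQL script.
cat > skip_header.hql <<'EOF'
USE demo;
ALTER TABLE test SET TBLPROPERTIES ("skip.header.line.count"="1");
EOF

# Run the script only where the hive CLI is actually installed.
if command -v hive >/dev/null 2>&1; then
    hive -f skip_header.hql
else
    echo "hive CLI not found; skip_header.hql was written but not run"
fi
```

After this runs, a select * from test should skip the first line of each file in the table, just as with the create table statement in Step 1.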

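To remove the header in Pig, filter out the row whose first field is the header label:

A = LOAD 'sampledata.csv' USING PigStorage(',');
-- Keep every row except the header, whose first field is the literal 'id'.
B = FILTER A BY (chararray)$0 != 'id';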
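
With the Hive table property, the header stays in the stored file and is only skipped when the table is read. The totalSize=41 in the load output matches the full five-line file. If you would rather not store the header at all, you can strip it in the shell before loading. This is a different technique from the two above, shown here as a small sketch. The output file name sampledata_noheader.csv is an assumption for illustration.

```shell
#!/bin/sh
# Sketch: drop the header line outside Hive and Pig, before any load step.
# The output file name is illustrative, not from the original walkthrough.
set -eu

# Recreate the sample file shown at the top of the article.
printf 'id,Name\n1,Saurabh\n2,Vishal\n3,Jeba\n4,Sonu\n' > sampledata.csv

# tail -n +2 prints from the second line onward, which removes the header.
tail -n +2 sampledata.csv > sampledata_noheader.csv

cat sampledata_noheader.csv
```

The resulting file can then be loaded with the same load data statement as before, into a table created without the skip.header.line.count property.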

I hope this helps make your job easier. Please feel free to share your suggestions or feedback.