How to fix corrupted or under replicated blocks issue


Category: Bigdata

To find out whether the Hadoop HDFS filesystem has corrupt or under-replicated blocks, and to fix them, we can use the steps below:

[hdfs@m1 ~]$ hadoop fsck /

or, pointing at the NameNode explicitly (8020 is the default NameNode RPC port):

[hdfs@m1 ~]$ hadoop fsck hdfs://192.168.56.41:8020/
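
For a quick cluster-wide check without running a full fsck, the dfsadmin report also summarizes block health. A minimal sketch; the grep pattern simply pulls out the summary lines of interest:

# Show the under-replicated, corrupt-replica and missing-block counters from the report
[hdfs@m1 ~]$ hdfs dfsadmin -report | grep -iE 'under replicated|corrupt|missing'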

The summary at the end of the output reports corrupt blocks, under-replicated blocks, and missing replicas, and looks something like this:

Total size:     4396621856 B (Total open files size: 249 B)
Total dirs:     11535
Total files:    841
Total symlinks: 0 (Files currently being written: 4)
Total blocks (validated):       844 (avg. block size 5209267 B) (Total open file blocks (not validated): 3)
Minimally replicated blocks:    844 (100.0 %)
Over-replicated blocks:         0 (0.0 %)
Under-replicated blocks:        2 (0.23696682 %)
Mis-replicated blocks:          0 (0.0 %)
Default replication factor:     3
Average block replication:      3.0
Corrupt blocks:                 0
Missing replicas:               14 (0.5498822 %)
Number of data-nodes:           3
Number of racks:                1
FSCK ended at Mon Feb 22 11:04:21 EST 2016 in 5505 milliseconds

The filesystem under path '/' is HEALTHY
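
In a report like the one above the filesystem is healthy, but it still shows under-replicated blocks and missing replicas. HDFS normally re-replicates these on its own; if they linger, you can nudge it by re-applying the replication factor per file. A minimal sketch, assuming a target replication factor of 3 and using /tmp/under_replicated_files as a scratch file:

# Collect the paths of files that have under-replicated blocks
hdfs fsck / | grep 'Under replicated' | awk -F':' '{print $1}' > /tmp/under_replicated_files

# Re-apply the replication factor; -w waits for each file to finish,
# so drop it if the list is long
while read -r f; do
  hdfs dfs -setrep -w 3 "$f"
done < /tmp/under_replicated_files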

Now we have to find which files have missing or corrupted blocks, using the command below:

hdfs fsck / | egrep -v '^\.+$' | grep -v replica | grep -v Replica

This will list the affected files, along with files that currently have under-replicated blocks (which isn't necessarily a problem). The output should include lines like the following for each affected file:
/path/to/filename.fileextension: CORRUPT blockpool BP-1016133662-10.29.100.41-1415825958975 block blk_1073904305
/path/to/filename.fileextension: MISSING 1 blocks of total size 15620361 B
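
Alternatively, fsck can print just the corrupt or missing blocks and the files they belong to:

[hdfs@m1 ~]$ hdfs fsck / -list-corruptfileblocks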

Now we have to determine the importance of the file: can it simply be removed, or does it hold data that needs to be regenerated?
If it can be removed, delete the corrupted file from the cluster. The following commands move the file to the trash:

[hdfs@m1 ~]$ hdfs dfs -rm /path/to/filename.ext
[hdfs@m1 ~]$ hdfs dfs -rm hdfs://ip.or.hostname.of.namenode:8020/path/to/filename.ext
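
If you would rather skip the trash, or let fsck clean up every corrupt file in one pass, both options are sketched below. Use -delete only once you are sure nothing in the report needs to be recovered:

[hdfs@m1 ~]$ hdfs dfs -rm -skipTrash /path/to/filename.ext
[hdfs@m1 ~]$ hdfs fsck / -move      # move corrupt files to /lost+found
[hdfs@m1 ~]$ hdfs fsck / -delete    # delete corrupt files permanently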

Repairing the file might or might not be possible. To attempt a repair, the first step is to gather information on the file's location and blocks:
[hdfs@m1 ~]$ hdfs fsck /path/to/filename.fileextension -locations -blocks -files

Now you can track down the nodes that hold the affected blocks. On those nodes, look through the DataNode logs to determine what the issue is: a replaced disk, I/O errors on the server, and so on.
If you can recover that machine and bring the partition holding the blocks back online, the DataNode will report the blocks back to the NameNode and the file will be healthy again. If that isn't possible, you will unfortunately have to find another way to regenerate the file.
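
To check whether a replica still exists on a DataNode's disks, you can search its configured data directories for the block file. A sketch, assuming dfs.datanode.data.dir points at /hadoop/hdfs/data and reusing the block ID from the example output above:

# Run on each DataNode reported by -locations
find /hadoop/hdfs/data/ -name 'blk_1073904305*' 2>/dev/null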

2 Comments

Nitin

April 1, 2018 at 9:01 am

Hi, I wanted to know what will happen if the metastore is corrupt in Hive?

    admin

    April 28, 2018 at 3:25 am

    You may need to repair it, or, if you have a backup, restore it from the backup.
