Overview of HDFS

HDFS is Apache Hadoop’s flagship file system. It is designed to serve as a general purpose file system abstraction for structured and semi-structured big data sets. Best Use Case: High throughput, streaming reads and writes of extremely large files on “commodity hardware”. Worst Use Case: Low latency arbitrary reads and writes of small files. (HBase users, hold your arguments for now) High Throughput: If you have a pickup truck and a sports car to move your luggage, HDFS is one of the biggest pickup trucks out there, it is slow but it will move your entire luggage in one trip. Your DBMS will be the sports car. Multiple fast trips with a fraction of the luggage each time. The bigger question is not what you like to drive, but how much luggage you have?

HDFS is optimized to read the whole data set quickly. Streaming Reads and Writes: The pattern we go for in HDFS is write-once, read-many-times. This is your data-analytic access pattern. You bring in data from the frontier databases into HDFS and run repeated analytic on the same data set. The analytic that will benefit from HDFS in the one that touches the entire data set. The key optimization is not surrounding how fast you access the first record but how fast you can burn through the whole data set. Writes on the other hand can only be made to the end of the file by a single writer.

Extremely Large Files: HDFS comes into play when your data set is much larger than what a big fat data center node can hold. It is a distributed file system with a block side of 64 MB as compared to the Linux File System which has a block size of 4 KB.

HDFS’s ability to scale brought it fame, however its inter-connectivity is what helps it maintain the spot light. Even though HDFS earned its glory alongside Map Reduce as part of the Hadoop Core, HDFS is fast making a name for itself as an independent big data set store with a familiar interface. With all your data set inside HDFS, you can mix and match you access modes. If you want low latency, deploy Apache HBase. If you want in-memory real time analytic, deploy Apache Spark and Shark or Cloudera Impala. If your talent pool is geared towards one of the commercial BI stacks e.g., SAS, SAP, HP, IBM etc., most of them are providing out-of-box HDFS connectors. Apache Cassandra Project also supports HDFS as a data store. The list continues to grow.

Big Data needs big deployments. More machines will bring more cost. HDFS being open source technology (with enterprise support), its total cost of ownership (TCO) is considerably lower than competing closed source proprietary big data appliances. With terabytes of big data sets already present in HDFS deployments around the globe, the trend is towards providing more processing means for HDFS instead of creating a brand new data storage solution for big data sets.

HDFS comprised of interconnected clusters of nodes where files and directories reside. It has Name Node, DataNode and Secondary Name Node.

  • NameNode:It is also known as single point of failure node, it manages the file system namespace operations like opening, closing, and renaming files and directories regulates client access to files. A name node also maps data blocks to data nodes, which handle read and write requests from HDFS clients. The current design has a single NameNode for each cluster. NameNode keeps the entire namespace image in RAM. The HDFS namespace is a hierarchy of files and directories. Files and directories are represented on the NameNode by inodes. Inodes record attributes like permissions, modification and access times, namespace and disk space quotas.

           Image: The inodes and the list of blocks that define the metadata of the name system are called the image.

          Journal: Each client-initiated transaction is recorded in the journal.


  • Data Nodes:  The cluster can have thousands of DataNodes and tens of thousands of HDFS clients per cluster, as each DataNode may execute multiple application tasks concurrently. DataNodes store data as blocks within files. The file content is split into large blocks (typical y 128 megabytes), and each block of the file is independently replicated at multiple DataNodes. The blocks are stored on the local file system on the Data Nodes. Each block replica on a DataNode is represented by two files in the local native file system. The first file contains the data itself and the second file records the block’s metadata including checksums for the data and the generation stamp. The size of the data file equals the actual length of the block and does not require extra space to round it up to the nominal block size as in traditional file systems. Thus, if a block is half full it needs only half of the space of the full block on the local drive.


  • Secondary name node: The HDFS file system includes a so-called secondary NameNode, which misleads some people into thinking [citation needed] that when the primary NameNode goes offline, the secondary NameNode takes over. In fact, the secondary NameNode regularly connects with the primary NameNode and builds snapshots of the primary Name Node’s directory information, which the system then saves to local or remote directories. These check pointed images can be used to restart a failed primary NameNode without having to replay the entire journal of file-system actions, then to edit the log to create an up-to-date directory structure. Because the NameNode is the single point for storage and management of metadata. It can become a bottleneck for supporting a huge number of files, especial y a large number of small files. HDFS Federation, a new addition, aims to tackle this problem to a certain extent by al owing multiple


Leave a Reply