Category: Apache HDFS

File Management In HDFS

Deleting a file: When a file is deleted by a user or an application, it is not instantly removed. Instead, HDFS first renames it to a file in the /trash directory. The file can be restored as long...

HDFS Data Organization

Staging: A client request to create a file does not reach the NameNode immediately. Initially, the HDFS client caches the file data in a temporary local file. Application writes are transparently redirected to...

HDFS Failover Mechanism

Each DataNode periodically sends a Heartbeat message to the NameNode. If the NameNode does not receive a Heartbeat from a particular DataNode, it marks that DataNode as dead and does not forward any new I/O requests to it....
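
The Heartbeat check described above can be sketched as follows. This is a minimal illustration, not HDFS code: the class name `HeartbeatMonitor`, its methods, and the 30-second timeout are all assumptions made for the example, and a real NameNode derives its timeout from configuration.

```python
import time

# Hypothetical timeout for this sketch; not an HDFS default.
HEARTBEAT_TIMEOUT_SECONDS = 30.0


class HeartbeatMonitor:
    """Tracks the last Heartbeat time of each DataNode (illustrative only)."""

    def __init__(self, timeout=HEARTBEAT_TIMEOUT_SECONDS, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock          # injectable so the logic can be tested
        self.last_seen = {}         # DataNode id -> time of last Heartbeat

    def record_heartbeat(self, datanode_id):
        # Called whenever a Heartbeat arrives from a DataNode.
        self.last_seen[datanode_id] = self.clock()

    def is_dead(self, datanode_id):
        # A node is dead if it never reported or its last Heartbeat is too old.
        last = self.last_seen.get(datanode_id)
        return last is None or self.clock() - last > self.timeout

    def live_datanodes(self):
        # Only these nodes would be offered new I/O requests.
        return sorted(d for d in self.last_seen if not self.is_dead(d))
```

In this sketch a node that stops reporting simply drops out of `live_datanodes()` once its last Heartbeat is older than the timeout, which mirrors the idea of no longer forwarding new requests to it.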

Data Management in HDFS

Data Replication: HDFS can store a single large file across different machines. A large file is divided into a sequence of blocks. All blocks are the same size except the last one. These...
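
To make the block layout concrete, here is a small sketch of how a file size maps to block sizes. The function name `split_into_blocks` is hypothetical, and the sizes in the usage lines are arbitrary example numbers rather than HDFS defaults.

```python
def split_into_blocks(file_size, block_size):
    """Return the size of each block for a file of file_size bytes.

    Every block has block_size bytes except possibly the last one,
    which holds whatever remains.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        sizes.append(remainder)  # the last, shorter block
    return sizes


# Example: a 300-unit file with 128-unit blocks.
print(split_into_blocks(300, 128))  # [128, 128, 44]
```

A file whose size is an exact multiple of the block size has no shorter final block, so `split_into_blocks(256, 128)` returns `[128, 128]`.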

NameNode And DataNode

What is a NameNode? The NameNode is where HDFS keeps the metadata about all the data stored in the file system. It stores a directory tree of all the...