Insights on Oracle & Tech: Apache Hadoop HDFS - Knowing the Basics

Tuesday, January 31, 2017

Hadoop HDFS (Hadoop Distributed File System) is a distributed Java-based file system for storing large volumes of data. It is designed:

To be a scalable, fault-tolerant, distributed storage system
To be the data management layer of Apache Hadoop

Hadoop (data management layer) = HDFS + YARN

YARN provides the resource management
HDFS provides the distributed storage for big data

HDFS works closely with a wide variety of concurrent data access applications, coordinated by YARN.

To span large clusters of commodity servers

HDFS will “just work” under a variety of physical and systemic circumstances.
HDFS cluster = NameNode + DataNodes

In this article, we use Apache Hadoop HDFS as shipped in the Hortonworks Data Platform (HDP, version 2.4.2). For the HDFS High Availability (HA) feature, our discussion is based on [2].

HDFS Cluster

An HDFS cluster consists of a NameNode, which manages the cluster metadata, and DataNodes, which store the data. Prior to Hadoop 2.0.0, the NameNode was a single point of failure (SPOF) in an HDFS cluster. Each cluster had a single NameNode, and if that machine or process became unavailable, the cluster as a whole was unavailable.

You can follow the instructions here to format and start HDFS on Hortonworks Data Platform. Applications can access HDFS in several ways. Natively, HDFS provides a FileSystem Java API, and a C language wrapper for this Java API is also available. In addition, an HTTP browser can be used to browse the files of an HDFS instance. Work is in progress to expose HDFS through the WebDAV protocol. For more information, read here.^[3,10,11]

NameNode

High-level summary of the NameNode. It:

Provides high availability (HA) using redundant NameNodes^[2]

NameNode (active)
NameNode (standby)

Maintains the following two metadata files:

fsimage file

Holds the entire file system namespace,^[12] including the mapping of blocks to files and file system properties

editlog file

Holds every change that occurs to the filesystem metadata

NameNode Web UI

To smoke test your NameNode server, you can use the following URL^[7,11]

http://$namenode.full.hostname:50070

to determine if you can reach the NameNode server with the browser. If successful, you can also select the Utilities menu to "browse the file system".
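The same smoke test can be scripted. The following is a minimal sketch, assuming the NameNode web UI listens on port 50070 as in the URL above; the helper names `namenode_ui_url` and `is_reachable` are made up for this illustration.

```python
import urllib.error
import urllib.request


def namenode_ui_url(hostname: str, port: int = 50070) -> str:
    """Build the NameNode web UI URL; 50070 is the port quoted in the text."""
    return f"http://{hostname}:{port}"


def is_reachable(url: str, timeout: float = 5.0) -> bool:
    """Return True if an HTTP GET to the URL succeeds with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Covers DNS failures, refused connections, timeouts and HTTP errors.
        return False
```

For example, `is_reachable(namenode_ui_url("namenode.example.com"))` would report whether the UI answers, where `namenode.example.com` is a placeholder for your NameNode's full hostname.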

High Availability

The HDFS High Availability feature (not to be confused with HDFS Federation, another newer feature) addresses the SPOF problem by providing the option of running two redundant NameNodes in the same cluster in an Active/Passive configuration with a hot standby. This allows a fast failover to a new NameNode if a machine crashes, or a graceful administrator-initiated failover for planned maintenance.

If your NameNode IDs are nn1 and nn2, you can check the service state of each with the following commands:^[3]

$ sudo -u hdfs hdfs haadmin -getServiceState nn1
active
$ sudo -u hdfs hdfs haadmin -getServiceState nn2
standby
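When scripting around these commands, it can help to check that exactly one NameNode reports itself as active. The sketch below is illustrative only: `find_active` is a hypothetical helper, and it assumes you have already captured the text printed by each `-getServiceState` call.

```python
from typing import Dict


def find_active(states: Dict[str, str]) -> str:
    """Return the ID of the single active NameNode.

    `states` maps a NameNode ID (e.g. "nn1") to the text printed by
    `hdfs haadmin -getServiceState <id>`, such as "active" or "standby".
    """
    active = [nn_id for nn_id, state in states.items() if state.strip() == "active"]
    if len(active) != 1:
        # Zero or several active NameNodes both indicate a problem.
        raise RuntimeError(f"expected exactly one active NameNode, found {active}")
    return active[0]
```

With the output shown above, `find_active({"nn1": "active", "nn2": "standby"})` returns `"nn1"`.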

Metadata Files

When the NameNode starts up, it reads the FsImage and EditLog files from disk, applies all the transactions in the EditLog to the FsImage, and flushes the result out as a new FsImage on disk. It can then truncate the old EditLog because its transactions have been applied to the persistent FsImage.
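The merge described above can be modeled in a few lines. This is a toy sketch, not the real on-disk format: the namespace is a plain dictionary from path to a block label, and each edit is an invented `(operation, path, value)` tuple.

```python
from typing import Dict, List, Tuple

# An edit is (operation, path, value). This three-field shape is a
# simplification made up for this sketch, not the real EditLog format.
Edit = Tuple[str, str, str]


def apply_edits(fsimage: Dict[str, str], editlog: List[Edit]) -> Dict[str, str]:
    """Replay every logged edit, in order, on top of a copy of the FsImage."""
    merged = dict(fsimage)
    for op, path, value in editlog:
        if op == "create":
            merged[path] = value
        elif op == "delete":
            merged.pop(path, None)
        else:
            raise ValueError(f"unknown operation: {op}")
    return merged


def checkpoint(fsimage: Dict[str, str], editlog: List[Edit]):
    """Merge the log into a new image and return it with an empty log."""
    new_fsimage = apply_edits(fsimage, editlog)
    return new_fsimage, []  # the old edits are now captured in the image
```

The point of the design is that the image alone is a stale snapshot and the log alone is an unbounded history; replaying the log onto the image yields the current namespace and lets the log start over.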

Metadata files are stored under:

${dfs.namenode.name.dir}/current/edits_*
${dfs.namenode.name.dir}/current/fsimage_*

where the dfs.namenode.name.dir property is configured in hdfs-site.xml.^[8]
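To find out where a given cluster keeps these files, the property can be read straight from the configuration file. The sketch below assumes the standard Hadoop `<configuration>/<property>/<name>/<value>` layout; the sample path `/hadoop/hdfs/namenode` and the helper name `get_property` are made up for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# A cut-down hdfs-site.xml; the directory value is only an example.
SAMPLE_HDFS_SITE = """\
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/hadoop/hdfs/namenode</value>
  </property>
</configuration>
"""


def get_property(xml_text: str, name: str) -> Optional[str]:
    """Return the value of the named property, or None if it is not set."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None
```

Note that the value may be a comma-separated list of directories, in which case the NameNode keeps a copy of its metadata in each one.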

DataNode

High-level summary of the DataNode:^[4]

Scalable Storage

HDFS cluster storage scales horizontally with the addition of DataNodes

Minimal data motion

Hadoop moves compute processes to the data on HDFS and not the other way around.

Processing tasks can occur on the physical node where the data resides, which significantly reduces network I/O and provides very high aggregate bandwidth.
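The idea of moving compute to the data can be illustrated with a toy placement rule. This is a sketch of the principle only, not the scheduler YARN actually uses; `pick_node` and the node names are hypothetical.

```python
from typing import List, Set


def pick_node(replica_nodes: Set[str], free_nodes: List[str]) -> str:
    """Choose where to run a task that reads one block.

    Prefer a free node that already stores a replica of the block, so the
    read is local; otherwise fall back to the first free node.
    """
    if not free_nodes:
        raise ValueError("no free nodes available")
    for node in free_nodes:
        if node in replica_nodes:
            return node  # data-local: no network transfer of the block
    return free_nodes[0]  # remote read: the block must cross the network
```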

Data Disk Failure: Heartbeats and Re-Replication

Each DataNode sends a Heartbeat message to the NameNode periodically.

If the NameNode detects that a DataNode has stopped sending Heartbeat messages, it marks that DataNode as dead and stops forwarding new I/O requests to it.

The NameNode constantly tracks which blocks need to be replicated and initiates replication whenever necessary. The necessity for re-replication may arise due to many reasons:

a DataNode may become unavailable
a replica may become corrupted
a hard disk on a DataNode may fail
the replication factor of a file may be increased
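The heartbeat and re-replication logic above can be sketched as two small functions. This is a simplified model under stated assumptions: timestamps are plain seconds, the `timeout` is an arbitrary parameter rather than the HDFS default, and the function names are made up for this example.

```python
from typing import Dict, Set


def dead_datanodes(last_heartbeat: Dict[str, float], now: float,
                   timeout: float) -> Set[str]:
    """DataNodes whose most recent Heartbeat is older than `timeout` seconds."""
    return {node for node, seen in last_heartbeat.items() if now - seen > timeout}


def under_replicated(block_locations: Dict[str, Set[str]], dead: Set[str],
                     replication_factor: int) -> Dict[str, int]:
    """Map each under-replicated block to the number of extra replicas needed.

    Replicas held by dead DataNodes do not count as live copies.
    """
    missing = {}
    for block, nodes in block_locations.items():
        live = len(nodes - dead)
        if live < replication_factor:
            missing[block] = replication_factor - live
    return missing
```

Raising `replication_factor` for the same inputs also makes blocks appear in the result, which mirrors the last reason in the list above.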

Data Rebalancing

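HDFS can move data from one DataNode to another when the free space on a DataNode falls below a certain threshold. This is not fully automatic: an administrator typically runs the balancer tool (hdfs balancer) to redistribute blocks across DataNodes.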

Data Integrity: Checksums

When a client creates an HDFS file, it computes a checksum of each block of the file and stores these checksums in a separate hidden file in the same HDFS namespace.
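The checksum idea can be shown with a short sketch. It uses `zlib.crc32` as a stand-in checksum and a tiny block size so the effect is easy to see; both are assumptions for illustration and do not reflect the checksum algorithm or sizes HDFS uses.

```python
import zlib
from typing import List


def block_checksums(data: bytes, block_size: int) -> List[int]:
    """CRC32 of each consecutive `block_size`-byte block of `data`."""
    return [zlib.crc32(data[i:i + block_size])
            for i in range(0, len(data), block_size)]


def corrupted_blocks(data: bytes, expected: List[int], block_size: int) -> List[int]:
    """Indexes of blocks whose current checksum differs from the stored one."""
    actual = block_checksums(data, block_size)
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]
```

A reader that finds a mismatch for some block knows that replica is damaged and can fetch the block from another DataNode that holds a good copy.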

References

[1] Hadoop Distributed File System (HDFS)
[2] HDFS High Availability Using the Quorum Journal Manager

    You can enable High Availability using multiple NameNodes either with

    shared storage on NFS or
    a distributed edit log (called Journal).

[3] HDFS Commands Guide (Apache Hadoop)

    All HDFS commands are invoked by the bin/hdfs script and can be grouped into:

    User commands
    Administrator commands
    Debug commands

[4] HDFS Architecture (Apache Hadoop)
[5] Apache Hadoop
[6] HDFS Federation (Hortonworks)

    In order to scale the name service horizontally, federation uses multiple independent NameNodes/namespaces.

[7] HDFS Ports (Hortonworks)
[8] Apache Ambari - Knowing the Basics (Xml and More)
[9] hdfs-default.xml (2.7.1)
[10] FileSystem Shell - Apache™ Hadoop
[11] Hadoop NameNode Web Interface
[12] Namespace (HDFS)

    Consists of directories, files and blocks.
    It supports all the namespace-related file system operations such as create, delete, modify and list files and directories.