Lustre HPC distributed filesystem

Radiance

I/O and Performance Benchmarking

MDTest is an MPI-based application for testing the metadata performance of parallel filesystems, and IOR is a benchmarking utility for measuring their I/O performance. To put it more simply: With MDTest, you would typically exercise the metadata operations involved in creating, removing, and reading objects such as directories, files, and so on. IOR is more straightforward and focuses on benchmarking buffered or direct, sequential or random, write and read throughput to the filesystem. Both are maintained and distributed together under the IOR GitHub project [6]. To build the latest IOR package from source, you need to install a Message Passing Interface (MPI) framework, then clone, build, and install the test utilities:

$ sudo dnf install mpich mpich-devel
$ git clone https://github.com/hpc/ior.git
$ cd ior
$ MPICC=/usr/lib64/mpich/bin/mpicc ./configure
$ cd src/
$ make && sudo make install

You are now ready to run a simple benchmark of your filesystem.
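Although the rest of this article focuses on IOR, the same build and install steps also give you mdtest. A minimal metadata run against this cluster might look like the following sketch; the item count and target directory here are arbitrary choices, so adjust the host list and paths to match your environment:

```shell
# Each of the four clients runs one mdtest task; every task creates,
# stats, and removes 1,000 files/directories (-n) under /lustre/mdtest
# (-d), repeating the whole sequence three times (-i).
/usr/lib64/mpich/bin/mpirun --host 10.0.0.3,10.0.0.4,10.0.0.5,10.0.0.6 \
    /usr/local/bin/mdtest -n 1000 -i 3 -d /lustre/mdtest
```

At the end of the run, mdtest prints per-operation rates (creates, stats, removals per second) aggregated across all tasks.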

IOR

The benchmark will give you a general idea of how the filesystem performs in its current environment. I rely on mpirun to dispatch the I/O generated by IOR in parallel across the clients; at the end, I get an aggregated result for the entire job execution.

The filesystem is currently empty, with the exception of the file created earlier to test the filesystem. The MDT and OSTs hold no real file data (Listing 9, executed from the client).

Listing 9

Current Environment

$ sudo lfs df
UUID                    1K-blocks        Used   Available Use% Mounted on
testfs-MDT0000_UUID      22419556       10784    19944620   1% /lustre[MDT:0]
testfs-OST0000_UUID      23335208        1764    20852908   1% /lustre[OST:0]
testfs-OST0001_UUID      23335208        1768    20852904   1% /lustre[OST:1]
testfs-OST0002_UUID      23335208        1768    20852904   1% /lustre[OST:2]
filesystem_summary:      70005624        5300    62558716   1% /lustre

To benchmark the performance of the HPC setup, run a write-only instance of IOR from the four clients simultaneously. Each client will initiate a single process that writes 64MiB transfers to a 5GiB file (Listing 10).
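Before looking at the output, note that the expected aggregate file size follows directly from these parameters; with file-per-process mode (-F), each client writes its own file:

```shell
# 4 clients x one 5GiB file each (file-per-process) = 20GiB aggregate,
# which should match the "aggregate filesize" line in the IOR output.
clients=4
block_gib=5
echo "$(( clients * block_gib ))GiB"    # prints 20GiB
```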

Listing 10

IOR Write Only

$ sudo /usr/lib64/mpich/bin/mpirun --host 10.0.0.3,10.0.0.4,10.0.0.5,10.0.0.6 /usr/local/bin/ior -F -w -t 64m -k --posix.odirect -D 60 -u -b 5g -o /lustre/test.01
IOR-3.4.0+dev: MPI Coordinated Test of Parallel I/O
Began               : Tue Jan 25 20:02:21 2022
Command line        : /usr/local/bin/ior -F -w -t 64m -k --posix.odirect -D 60 -u -b 5g -o /lustre/test.01
Machine             : Linux lustre-client1
TestID              : 0
StartTime           : Tue Jan 25 20:02:21 2022
Path                : /lustre/0/test.01.00000000
FS                  : 66.8 GiB   Used FS: 35.9%   Inodes: 47.0 Mi   Used Inodes: 0.0%
Options:
api                 : POSIX
apiVersion          :
test filename       : /lustre/test.01
access              : file-per-process
type                : independent
segments            : 1
ordering in a file  : sequential
ordering inter file : no tasks offsets
nodes               : 4
tasks               : 4
clients per node    : 1
repetitions         : 1
xfersize            : 64 MiB
blocksize           : 5 GiB
aggregate filesize  : 20 GiB
stonewallingTime    : 60
stoneWallingWearOut : 0
Results:
access    bw(MiB/s)  IOPS       Latency(s)  block(KiB) xfer(KiB)  open(s)    wr/rd(s)   close(s)   total(s)   iter
------    ---------  ----       ----------  ---------- ---------  --------   --------   --------   --------   ----
write     1835.22    28.68      0.120209    5242880    65536      0.000934   11.16      2.50       11.16      0
Summary of all tests:
Operation   Max(MiB)   Min(MiB)  Mean(MiB)     StdDev   Max(OPs)   Min(OPs)  Mean(OPs)     StdDev    Mean(s) Stonewall(s) Stonewall(MiB) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt      blksiz     xsize  aggs(MiB)    API RefNum
write        1835.22    1835.22    1835.22       0.00      28.68      28.68      28.68       0.00   11.15941           NA             NA     0      4   1    1   1     0        1         0    0      1  5368709120  67108864   20480.0   POSIX      0
Finished            : Tue Jan 25 20:02:32 2022

Notice a little more than 1.8GiBps of write throughput to the filesystem. Considering that each client is writing to the target filesystem in a single process, and that you probably did not hit the limit of the GigE back end, this isn't a bad result. You will start to see the OST targets fill up with data (Listing 11).
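The bandwidth figure can be reproduced from the listing's own numbers: 20,480MiB of aggregate data moved over the 11.15941 seconds of mean write time:

```shell
# bw(MiB/s) = aggregate MiB moved / mean write time in seconds
awk 'BEGIN { printf "%.2f MiB/s\n", 20480 / 11.15941 }'   # prints 1835.22 MiB/s
```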

Listing 11

Writing to OST Targets

$ lfs df
UUID                    1K-blocks        Used   Available Use% Mounted on
testfs-MDT0000_UUID      22419556       10800    19944604   1% /lustre[MDT:0]
testfs-OST0000_UUID      23335208     5244648    15577064  26% /lustre[OST:0]
testfs-OST0001_UUID      23335208     5244652    15577060  26% /lustre[OST:1]
testfs-OST0002_UUID      23335208    10487544    10301208  51% /lustre[OST:2]
filesystem_summary:      70005624    20976844    41455332  34% /lustre

This time, rerun IOR, but in read-only mode. The command will use the same number of clients, processes, and transfer size, but will read a 1GiB block per process (Listing 12; the information missing under Options is identical to Listing 10).

Listing 12

IOR Read Only

$ sudo /usr/lib64/mpich/bin/mpirun --host 10.0.0.3,10.0.0.4,10.0.0.5,10.0.0.6 /usr/local/bin/ior -F -r -t 64m -k --posix.odirect -D 15 -u -b 1g -o /lustre/test.01
IOR-3.4.0+dev: MPI Coordinated Test of Parallel I/O
Began               : Tue Jan 25 20:04:11 2022
Command line        : /usr/local/bin/ior -F -r -t 64m -k --posix.odirect -D 15 -u -b 1g -o /lustre/test.01
Machine             : Linux lustre-client1
TestID              : 0
StartTime           : Tue Jan 25 20:04:11 2022
Path                : /lustre/0/test.01.00000000
FS                  : 66.8 GiB   Used FS: 30.0%   Inodes: 47.0 Mi   Used Inodes: 0.0%
Options:
[...]
blocksize           : 1 GiB
aggregate filesize  : 4 GiB
stonewallingTime    : 15
stoneWallingWearOut : 0
Results:
access    bw(MiB/s)  IOPS       Latency(s)  block(KiB) xfer(KiB)  open(s)    wr/rd(s)   close(s)   total(s)   iter
------    ---------  ----       ----------  ---------- ---------  --------   --------   --------   --------   ----
WARNING: Expected aggregate file size       = 4294967296
WARNING: Stat() of aggregate file size      = 21474836480
WARNING: Using actual aggregate bytes moved = 4294967296
WARNING: Maybe caused by deadlineForStonewalling
read      2199.66    34.40      0.108532    1048576    65536      0.002245   1.86       0.278201   1.86       0
Summary of all tests:
Operation   Max(MiB)   Min(MiB)  Mean(MiB)     StdDev   Max(OPs)   Min(OPs)  Mean(OPs)     StdDev    Mean(s) Stonewall(s) Stonewall(MiB) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt     blksiz    xsize aggs(MiB)   API RefNum
read         2199.66    2199.66    2199.66       0.00      34.37      34.37      34.37       0.00    1.86211           NA             NA     0      4   1    1   1     0        1         0    0      1 1073741824 67108864    4096.0 POSIX      0
Finished            : Tue Jan 25 20:04:13 2022
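The warnings in Listing 12 are harmless and follow from the arithmetic of the two runs: the read pass expects to move 4 tasks x 1GiB, but stat() sees the four 5GiB files kept from the write run (the -k flag), so IOR computes bandwidth from the bytes it actually moved. A quick sanity check of the byte counts:

```shell
gib=1073741824   # bytes in 1GiB

# Expected aggregate: 4 tasks x 1GiB block each (the -b 1g read run)
echo $(( 4 * 1 * gib ))   # prints 4294967296, matching "Expected aggregate file size"

# On-disk size stat() sees: 4 files x 5GiB, kept from the write run by -k
echo $(( 4 * 5 * gib ))   # prints 21474836480, matching "Stat() of aggregate file size"
```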

For a virtual machine deployment on a 1GigE network, I get roughly 2.2GiBps reads, which again, if you think about it, is not bad at all. Imagine this on a much larger configuration with better compute, storage, and network capabilities; more processes per client; and more clients. This cluster would scream with speed.
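As with the write run, the read bandwidth is simply the 4,096MiB actually moved divided by the 1.86211-second mean read time:

```shell
# bw(MiB/s) = MiB actually moved / mean read time in seconds
awk 'BEGIN { printf "%.2f MiB/s\n", 4096 / 1.86211 }'   # prints 2199.66 MiB/s
```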

Conclusion

That is the Lustre high-performance filesystem in a nutshell. To unmount the filesystem from the client, use the umount command, just as you would unmount any other device from a system:

$ sudo pdsh -w 10.0.0.[3-6] umount /lustre

Lustre is not the only distributed filesystem of its kind; alternatives include IBM's GPFS, BeeGFS, and plenty more. Despite the competition, Lustre is both stable and reliable and has cemented itself in the HPC space for nearly two decades; it is not going anywhere.

The Author

Petros Koutoupis is currently a senior performance software engineer at Cray (now HPE) for its Lustre High Performance File System division and is the creator and maintainer of the RapidDisk Project http://www.rapiddisk.org. He has worked in the data storage industry for well over a decade and has helped pioneer the many technologies unleashed in the wild today.
