Scaling HDFS with a Strongly Consistent Relational Model for Metadata

Kamal Hakimzadeh; Hooman Peiro Sajjad; Jim Dowling

doi:10.1007/978-3-662-43352-2_4

Conference paper, 2014


Kamal Hakimzadeh, Swedish Institute of Computer Science [Stockholm]
Hooman Peiro Sajjad, Swedish Institute of Computer Science [Stockholm]
Jim Dowling, Swedish Institute of Computer Science [Stockholm]

Abstract

The Hadoop Distributed File System (HDFS) scales to store tens of petabytes of data, even though the entire file system’s metadata must fit on the heap of a single Java virtual machine. In production, the size of HDFS’ metadata is limited to under 100 GB, because garbage collection events in bigger clusters cause heartbeats to the metadata server (NameNode) to time out. In this paper, we address the problem of how to migrate HDFS’ metadata to a relational model, so that we can support larger amounts of storage on a shared-nothing, in-memory, distributed database. Our main contribution is to show how to provide consistency semantics at least as strong as those of HDFS while adding support for a multiple-writer, multiple-reader concurrency model. We guarantee freedom from deadlocks by logically organizing inodes (and their constituent blocks and replicas) into a hierarchy and having all metadata operations agree on a global order for acquiring both explicit locks and implicit locks on subtrees in the hierarchy. We use transactions with pessimistic concurrency control to ensure the safety and progress of metadata operations. Finally, we show how to improve the performance of our solution by introducing a snapshotting mechanism at NameNodes that minimizes the number of round trips to the database.
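The deadlock-freedom argument in the abstract rests on every operation acquiring locks in one agreed global order. The following is a minimal Python sketch of that general idea only, not the paper's implementation: it uses in-memory `threading.Lock` objects as a stand-in for database row locks, orders inodes by their path components (an assumed ordering that places a parent before its children), and omits implicit subtree locks and transactions. The names `lock_order` and `with_inode_locks` are hypothetical.

```python
import threading
from collections import defaultdict

# Hypothetical in-memory stand-in for per-inode row locks in the database.
_locks = defaultdict(threading.Lock)
_locks_guard = threading.Lock()  # protects creation of entries in _locks


def lock_order(paths):
    """Return the inodes an operation touches in one global order.

    Sorting by path-component tuples gives every operation the same total
    order, independent of argument order, and puts a parent before its
    children (an assumed ordering chosen for this sketch).
    """
    return sorted(set(paths), key=lambda p: tuple(c for c in p.split("/") if c))


def with_inode_locks(paths, action):
    """Acquire locks in the global order, run `action`, release in reverse."""
    held = []
    try:
        for inode in lock_order(paths):
            with _locks_guard:
                lock = _locks[inode]
            lock.acquire()
            held.append(lock)
        return action()
    finally:
        for lock in reversed(held):
            lock.release()
```

Two operations that name the same pair of inodes in opposite argument order, for example `with_inode_locks(["/a/x", "/b/y"], op)` and `with_inode_locks(["/b/y", "/a/x"], op)`, both lock `/a/x` before `/b/y`, so neither can hold one lock while waiting for the other in a cycle.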


Main file

326177_1_En_4_Chapter.pdf (591)

Origin : Files produced by the author(s)


https://inria.hal.science/hal-01287731

Submitted on : Monday, March 14, 2016, 10:48:49 AM

Last modification on : Thursday, May 12, 2016, 10:49:50 AM

Long-term archiving on : Sunday, November 13, 2016, 6:04:47 PM

Dates and versions

hal-01287731, version 1 (14-03-2016)

Licence

Attribution

Identifiers

HAL Id : hal-01287731, version 1
DOI : 10.1007/978-3-662-43352-2_4

Cite

Kamal Hakimzadeh, Hooman Peiro Sajjad, Jim Dowling. Scaling HDFS with a Strongly Consistent Relational Model for Metadata. 4th International Conference on Distributed Applications and Interoperable Systems (DAIS), Jun 2014, Berlin, Germany. pp.38-51, ⟨10.1007/978-3-662-43352-2_4⟩. ⟨hal-01287731⟩


Collections

IFIP-LNCS IFIP IFIP-TC IFIP-WG IFIP-LNCS-8460 IFIP-TC6 IFIP-WG6-1
