HydraFS

Cristian Ungureanu, Benjamin Atkin, Akshat Aranya, Salil Gokhale, Stephen Rago,
Grzegorz Całkowski, Cezary Dubnicki, and Aniruddha Bohra
NEC Laboratories America
Presented by: G.A. Dilruk (148209B)

 What is HYDRAstor?
 The Research Problem
 Challenges
 The Design of HydraFS
 Evaluation
 Future Enhancements

 HYDRAstor is a content-addressable storage (CAS) system for
building storage solutions.
 Data deduplication
 High throughput
 Multiple storage nodes
 Data replication
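To make the idea of content addressing and deduplication concrete, here is a minimal sketch of a toy in-memory block store. The class `BlockStore` and its methods are hypothetical names chosen for illustration and are not the HYDRAstor API; SHA-256 is assumed as the hash only for the example.

```python
import hashlib


class BlockStore:
    """Toy in-memory content-addressable store (illustrative only)."""

    def __init__(self):
        self._blocks = {}  # content address -> block bytes

    def put(self, data: bytes) -> str:
        # The address is derived from the content, so identical blocks
        # map to the same address and are stored only once.
        address = hashlib.sha256(data).hexdigest()
        self._blocks.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        return self._blocks[address]

    def stored_blocks(self) -> int:
        return len(self._blocks)


store = BlockStore()
a = store.put(b"backup data")
b = store.put(b"backup data")  # duplicate write, deduplicated
c = store.put(b"other data")
print(a == b, store.stored_blocks())
```

Writing the same content twice returns the same address and consumes no extra space, which is the property a backup workload benefits from.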

 The main barrier is the absence of a standard API for accessing
data in HYDRAstor.
 Users are reluctant to modify existing applications.
 Applications would need to deal with the unique
characteristics of a CAS, such as block immutability
and high latency.
 Solution: build a standard file system on top of
the HYDRAstor CAS system.

 Blocks are immutable, so data updates are more
expensive in a CAS.
 The latency of block operations is very high.
 Cache misses for metadata blocks have a
significant impact on performance.
 Variable block sizes are needed for better deduplication.
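Why immutability makes updates expensive can be illustrated with a small sketch. It assumes a file is stored as leaf blocks plus a root block that lists the leaf addresses; this two-level layout and the helper names `put`, `write_file`, and `update_chunk` are assumptions for illustration, not the HydraFS on-disk format.

```python
import hashlib
import json

blocks = {}  # hypothetical in-memory CAS: address -> bytes


def put(data: bytes) -> str:
    address = hashlib.sha256(data).hexdigest()
    blocks[address] = data
    return address


def write_file(chunks):
    """Store the leaf blocks, then a root block listing their addresses."""
    leaves = [put(c) for c in chunks]
    root = put(json.dumps(leaves).encode())
    return root, leaves


def update_chunk(root, index, new_data):
    """'Updating' one chunk writes a new leaf and a new root block."""
    leaves = json.loads(blocks[root])
    leaves[index] = put(new_data)
    return put(json.dumps(leaves).encode()), leaves


old_root, old_leaves = write_file([b"aaa", b"bbb", b"ccc"])
new_root, new_leaves = update_chunk(old_root, 1, b"BBB")
print(old_root != new_root, len(blocks))
```

Because an address depends on content, changing one leaf changes every block on the path up to the root, while the old blocks remain untouched. In a deeper tree, each level adds another block write.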

 High throughput of sequential reads and writes.
 Minimize the number of dependent I/O
operations.
 Availability guarantees of HydraFS must be no
worse than those of a standard Unix file system.
 File system must efficiently support both local and
remote file access.

Challenges and design strategies:
 Blocks are immutable: decouple data and metadata processing
through a log buffer, and batch metadata operations.
 High latency of block operations: read cache and write buffer.
 Cache misses for metadata blocks: fixed-size caches, with admission
control to limit the number of concurrent operations.
 Variable block size: chunking algorithm similar to Rabin
fingerprinting.
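The variable block size strategy can be sketched with content-defined chunking. The code below uses a simple polynomial rolling hash as a stand-in for Rabin fingerprinting; it is not the HydraFS chunker, and the window size, mask, and size limits are assumed values picked for the example.

```python
import random

WINDOW = 16              # bytes in the rolling window (assumed)
BASE = 257
MOD = (1 << 31) - 1
MASK = (1 << 6) - 1      # cut when the low 6 bits of the hash are zero
MIN_SIZE, MAX_SIZE = 32, 256


def chunk(data: bytes):
    """Split data at content-defined boundaries using a rolling hash."""
    chunks, start, h = [], 0, 0
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window
    for i, byte in enumerate(data):
        if i - start >= WINDOW:
            h = (h - data[i - WINDOW] * top) % MOD  # drop the oldest byte
        h = (h * BASE + byte) % MOD                  # add the newest byte
        size = i - start + 1
        if (size >= MIN_SIZE and (h & MASK) == 0) or size >= MAX_SIZE:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks


rng = random.Random(0)
data = bytes(rng.getrandbits(8) for _ in range(4096))
edited = b"X" + data  # insert one byte at the front
shared = set(chunk(data)) & set(chunk(edited))
print(len(chunk(data)), "chunks,", len(shared), "shared after the edit")
```

Since boundaries depend on the bytes in the window rather than on fixed offsets, an insertion tends to disturb only nearby chunks, so later chunks can still deduplicate against the earlier version.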

[Figure: HydraFS architecture. Labels: File Operation, File Server, Commit Server, Transaction Log, Data Blocks, Super Blocks]
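The decoupling of data and metadata processing through a log can be sketched as follows. This is a simplified illustration under stated assumptions: the classes `FileServer` and `CommitServer` borrow the component names from the deck, but their methods, the in-memory list used as the transaction log, and the `version` counter standing in for a super block are all hypothetical.

```python
class FileServer:
    """Accepts file operations and records metadata changes in a log."""

    def __init__(self, log):
        self.log = log

    def write(self, path, data_address):
        # The data block is assumed to be already stored in the CAS;
        # only the metadata change is appended to the log.
        self.log.append((path, data_address))


class CommitServer:
    """Applies logged metadata changes in batches."""

    def __init__(self, log):
        self.log = log
        self.metadata = {}  # path -> address of the latest data block
        self.version = 0    # stands in for a super block version

    def commit(self):
        batch = list(self.log)
        del self.log[:]     # the log is consumed once applied
        for path, address in batch:
            self.metadata[path] = address
        if batch:
            self.version += 1  # one new version per batch, not per write
        return len(batch)


log = []
fs = FileServer(log)
cs = CommitServer(log)
fs.write("/a", "addr1")
fs.write("/a", "addr2")
fs.write("/b", "addr3")
applied = cs.commit()
print(applied, cs.metadata, cs.version)
```

Batching means three writes produce a single metadata version here, which reduces how often immutable metadata blocks have to be rewritten.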

 Comparison of raw device and HydraFS file system
throughput for iSCSI and Hydra.

 Hydra and HydraFS write throughput with varying
duplication ratio.

 Use multiple file server nodes to make failover
transparent and automatic.
 Integrate other algorithms, such as bimodal
chunking, with the current chunking algorithm, which is
similar to Rabin fingerprinting.
 HydraFS is acceptable as a secondary storage
platform for backup appliances. Using it for primary
storage requires a strategy to reduce I/O latency.

