Scalable, Global Namespaces with Programmable Storage

Speaker Name: 
Michael Sevilla
Speaker Title: 
Member of Technical Staff
Speaker Organization: 
TidalScale
Start Time: 
Tuesday, May 21, 2019 - 9:50am
End Time: 
Tuesday, May 21, 2019 - 11:25am
Location: 
BE-156
Organizer: 
CROSS

Abstract: 

Global file system namespaces are difficult to scale because of the overheads of POSIX IO metadata management. Prior work on scalable file system metadata IO integrates optimizations into 'clean-slate' file systems, which are hard to manage, and/or 'dirty-slate' file systems, which are challenging to understand and evolve. The fundamental insight of this work is that the default policies of metadata management techniques in today's file systems cause scalability problems for specialized use cases. Our approach dynamically assigns customized solutions to different parts of the file system namespace, which enables domain-specific policies that shape metadata management techniques. To systematically explore this design space, we build a programmable file system with APIs that let developers of higher layers express their domain-specific knowledge in a storage-agnostic way. Policy engines embedded in the file system use this knowledge to guide internal mechanisms and make metadata management more scalable. Using these frameworks, we design scalable, workload-inspired policies for (1) subtree load balancing, (2) relaxing subtree consistency and durability semantics, and (3) subtree schemas and generators.
 

Bio:

Michael is a software engineer at TidalScale working on virtualized storage. He received his PhD from the University of California, Santa Cruz, where he studied file system metadata load balancing, consistency/durability semantics, and namespace structures. Previously, he worked at Los Alamos National Laboratory, where he designed storage systems for HPC applications, and on Hewlett Packard Enterprise's Advanced Development Team, where he designed storage solutions and reproducibility tools for big data processing stacks.