Advancement: Solving distributed systems problems by analyzing explanations in aggregate

Speaker Name: Kamala Ramasubramanian
Speaker Title: PhD Student (Advisor: Peter Alvaro)
Speaker Organization: Computer Science
Start Time: Wednesday, May 29, 2019 - 9:30am
End Time: Wednesday, May 29, 2019 - 11:30am
Location: Engineering 2, Room 280
Organizer: Peter Alvaro

Abstract: Logs, traces, and provenance data are explanations of executions that can aid in understanding, implementing, operating, and troubleshooting distributed systems. Often they already contain the answers to questions that correspond to common distributed systems problems. Verification-related questions include "What do all successful executions have in common?" and "What is the optimal order in which to explore fault scenarios?" An automated data collection infrastructure might ask "Was this execution successful?" or "Is this the first explanation with this structure to be observed?" Developers may ask debugging-related questions such as "How does an anomalous execution differ from those seen in steady state?"
Current tooling is geared toward analyzing explanations to address performance problems, rather than toward answering questions that require reasoning not only about the events in a system execution but also about their interactions. Moreover, reasoning about a single explanation is insufficient to extract the desired answers; we need to reason in aggregate across many explanations. In my thesis work, I propose and evaluate methodologies for answering questions that require analysis and comparison of large numbers of explanations. A key differentiating feature of my approach is abstracting away the details of individual executions and using the abstracted explanations to automatically perform structure-aware aggregate reasoning and analysis.
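
As a rough illustration of the idea only (not the methodology proposed or evaluated in the thesis work), the sketch below abstracts away per-execution details and then reasons across many explanations at once. The trace format, the field names, and the helpers `abstract`, `common_structure`, and `is_novel` are all hypothetical, chosen just to make the example self-contained.

```python
# Hypothetical trace format (an assumption for illustration): each explanation
# is a list of (caller, callee, request_id, timestamp_ms) call edges.

def abstract(trace):
    """Drop per-execution details (request IDs, timestamps); keep call structure."""
    return frozenset((caller, callee) for caller, callee, _req, _ts in trace)

def common_structure(traces):
    """Call edges that appear in every abstracted trace."""
    shapes = [abstract(t) for t in traces]
    return frozenset.intersection(*shapes) if shapes else frozenset()

def is_novel(trace, seen_shapes):
    """True if no previously observed explanation has this structure."""
    return abstract(trace) not in seen_shapes

# Two made-up successful executions that differ in IDs, timing, and one extra call.
successful = [
    [("client", "frontend", "r1", 10), ("frontend", "db", "r1", 12)],
    [("client", "frontend", "r2", 50), ("frontend", "cache", "r2", 53),
     ("frontend", "db", "r2", 55)],
]

# "What do all successful executions have in common?"
shared = common_structure(successful)

# "Is this the first explanation with this structure to be observed?"
seen = {abstract(t) for t in successful}
repeat = [("client", "frontend", "r3", 90), ("frontend", "db", "r3", 95)]
anomaly = [("client", "frontend", "r4", 99)]
```

Here `shared` contains the two edges `client -> frontend` and `frontend -> db`, `is_novel(repeat, seen)` is false because only the IDs and timestamps differ from the first execution, and `is_novel(anomaly, seen)` is true because the call to `db` is missing. Treating an explanation as a set of edges is a deliberately crude abstraction; it ignores ordering and repetition, which a real analysis would likely need to retain.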