Defense: Probabilistic Methods for Data-Driven Social Good

Speaker Name: 
Sabina Tomkins
Speaker Title: 
PhD Candidate
Speaker Organization: 
Technology & Information Management
Start Time: 
Tuesday, July 31, 2018 - 10:00am
End Time: 
Tuesday, July 31, 2018 - 12:00pm
Location: 
Engineering 2, Room 475
Organizer: 
Lise Getoor

Abstract:  Computational methods offer many opportunities for addressing societally relevant questions. For example, such questions can be defined as prediction problems and approached with appropriate data-driven frameworks. A distinguishing goal in this setting is to improve understanding of social phenomena: in addition to predicting outcomes, data-driven techniques can uncover hidden structure in complex problems. In this work, I describe social good problems which can be solved with a collective probabilistic approach.

In each problem setting, I demonstrate how introducing, representing and modeling relational structure can improve predictive performance for this class of problems. When relevant, these models can also provide insight into underlying and unseen phenomena. This work spans three large areas of societally relevant problems: sustainability, education and malicious behavior. While different in many ways, each area offers a wealth of problems which benefit from a collective probabilistic approach. By analyzing these diverse problems, I present cohesive, domain-independent conclusions. For example, I demonstrate the utility of modeling participant interactions both in online classrooms and in cyberbullying incidents. I show how to include latent structure in a diverse range of prediction problems, from the future purchases of online shoppers to the future movements of human traffickers. Throughout each problem setting, I show that an approach which can model dependencies between random variables, as well as domain knowledge and contextual priors, can yield fresh insights into the nature of complex societal challenges, even when data is limited.