Advancement: Diversifying Language Generated by Deep Learning Models in Dialogue Systems

Speaker Name: 
Juraj Juraska
Speaker Title: 
Ph.D. Student
Start Time: 
Wednesday, December 4, 2019 - 10:00am
Location: 
Engineering 2, Room 399

Conversational AI has seen tremendous progress in recent years, reaching near-human, and in certain well-defined tasks such as speech recognition and question answering even superhuman, performance. Yet it still tends to struggle with less constrained tasks, in particular those that involve producing human language. Current approaches to natural language generation (NLG) in dialogue systems still rely heavily on techniques that lack scalability and transferability across domains, despite the NLG community's general embrace of more robust methods, in particular deep learning models. These models require large amounts of annotated data, yet they tend to produce generic sentences that lack most of the nuances that make human language creative and varied. The naturalness of the generated language is an important factor in the perceived quality of a dialogue system. We therefore explore different ways of ensuring output diversity in neural data-to-text generation without a negative impact on semantic accuracy.
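Output diversity of the kind described above is commonly quantified with a distinct-n score, the ratio of unique n-grams to total n-grams across a set of generated outputs (this metric is an illustrative choice on our part, not one named in the abstract). A minimal sketch:

```python
from collections import Counter

def distinct_n(sentences, n):
    """Ratio of unique n-grams to total n-grams across generated outputs.

    A common proxy for lexical diversity in NLG: values near 0 indicate
    generic, repetitive generations; values near 1 indicate varied ones.
    """
    ngrams = Counter()
    for sent in sentences:
        tokens = sent.split()
        for i in range(len(tokens) - n + 1):
            ngrams[tuple(tokens[i:i + n])] += 1
    total = sum(ngrams.values())
    return len(ngrams) / total if total else 0.0

# Generic, repetitive outputs score low; varied outputs score high.
generic = ["the food is good", "the food is good", "the food is nice"]
varied = ["the pasta was superb", "a cozy spot with service to match", "loved the tiramisu"]
print(distinct_n(generic, 1))  # low (repeated unigrams)
print(distinct_n(varied, 1))   # high (mostly unique unigrams)
```

A semantic-accuracy check (e.g., slot-error rate) would be computed alongside this, since diversity gains that sacrifice accuracy are exactly what the work aims to avoid.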

After experimenting with training data manipulation and model input augmentation, both of which enable stylistic control to a certain degree, we performed a successful pilot study of a new inference method for a neural sequence-to-sequence model. This method, based on Monte Carlo Tree Search (MCTS), promotes diversity more effectively than beam search while simultaneously optimizing for an arbitrary metric. While our preliminary experiments used the BLEU metric, we aim to develop a comprehensive referenceless metric to guide the tree search instead. Our automatic slot aligner, which was responsible for large performance gains of our neural NLG model, will lend itself well to this task alongside a state-of-the-art language model. Finally, by combining MCTS with reinforcement learning, in a similar way to the successful AlphaGo Zero system, we expect to achieve better results in real time than with standard beam search inference.
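The idea of MCTS-guided decoding can be sketched as follows. This is a toy illustration, not the talk's actual system: the uniform "language model," the vocabulary, and the unigram-overlap scorer standing in for BLEU (or a referenceless metric) are all hypothetical, and a real implementation would query a trained seq2seq decoder for next-token probabilities.

```python
import math
import random

# Toy vocabulary and "language model"; a real system would use the
# neural decoder's next-token distribution over a full vocabulary.
VOCAB = ["the", "food", "is", "good", "great", "tasty", "<eos>"]

def next_token_probs(prefix):
    return {tok: 1.0 / len(VOCAB) for tok in VOCAB}  # uniform stand-in

def metric(seq, reference):
    # Arbitrary sequence-level score the search optimizes: here, unigram
    # overlap with a reference (a stand-in for BLEU or a learned metric).
    ref = set(reference)
    hyp = [t for t in seq if t != "<eos>"]
    return sum(t in ref for t in hyp) / max(len(hyp), 1)

class Node:
    def __init__(self, prefix):
        self.prefix = prefix      # token sequence decoded so far
        self.children = {}        # token -> child Node
        self.visits = 0
        self.value = 0.0          # sum of rollout rewards

def rollout(prefix, max_len, reference):
    # Simulation: randomly complete the sequence, then score it.
    seq = list(prefix)
    while len(seq) < max_len and (not seq or seq[-1] != "<eos>"):
        probs = next_token_probs(seq)
        seq.append(random.choices(list(probs), weights=list(probs.values()))[0])
    return metric(seq, reference)

def mcts_decode(reference, iters=2000, max_len=5, c=1.0):
    root = Node([])
    for _ in range(iters):
        node, path = root, [root]
        # Selection: descend by UCB1 while nodes are fully expanded.
        while node.children and len(node.children) == len(VOCAB):
            node = max(node.children.values(),
                       key=lambda ch: ch.value / (ch.visits + 1e-9)
                       + c * math.sqrt(math.log(node.visits + 1) / (ch.visits + 1e-9)))
            path.append(node)
        # Expansion: add one untried child unless the node is terminal.
        if len(node.prefix) < max_len and (not node.prefix or node.prefix[-1] != "<eos>"):
            untried = [t for t in VOCAB if t not in node.children]
            if untried:
                child = Node(node.prefix + [random.choice(untried)])
                node.children[child.prefix[-1]] = child
                path.append(child)
                node = child
        # Simulation and backpropagation.
        reward = rollout(node.prefix, max_len, reference)
        for n in path:
            n.visits += 1
            n.value += reward
    # Read off the most-visited path as the decoded sequence.
    seq, node = [], root
    while node.children:
        node = max(node.children.values(), key=lambda ch: ch.visits)
        seq.append(node.prefix[-1])
    return seq
```

Because rollouts sample rather than greedily extending the single highest-probability continuation, the search naturally explores varied phrasings, while the backed-up metric steers it toward high-scoring ones; swapping `metric` for a referenceless scorer (e.g., one built on the slot aligner plus a language model, as the abstract proposes) requires no other change to the search.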

Advisor: 
Professor Marilyn Walker
Graduate Program: 
Computer Science Ph.D.