Advancement: Learning Characteristics of Creative Language for Generating Stylistic Variation

Speaker Name: 
Shereen Oraby
Speaker Title: 
PhD Student
Speaker Organization: 
Computer Science
Start Time: 
Tuesday, July 25, 2017 - 11:00am
End Time: 
Tuesday, July 25, 2017 - 1:00pm
Location: 
Engineering 2, Room 475
Advisor: 
Marilyn Walker

Abstract: Many of the creative and figurative elements that make language exciting are lost in translation in current language generation engines. To transfer elements of stylistic language into a generation pipeline, we closely study highly expressive language forms: emotional argumentation, sarcasm, rhetorical questions, and hyperbole. We use a bootstrapping approach to collect corpora for each form from debate forums and Twitter, train machine learning models to identify them, and learn high-precision features that they exhibit. Using the features we learn, we aim to improve on traditional generation models by integrating elements of creativity and style. We focus on the restaurant review domain, where data is plentiful and language is very creative, and present a pilot study showing that the stylistically varied reviews we create are judged as more convincing, interesting, and natural than traditional templates. We now aim to explore how style modules can be integrated into statistical and neural generation pipelines to make the realized output more stylistically varied and engaging than the state of the art.