Abstract
While large language models have advanced the state-of-the-art in natural language processing, these models are trained on large-scale datasets, which may include harmful information. Studies have shown that as a result, the models exhibit social biases and generate misinformation after training. In this talk, I will discuss my work on analyzing and interpreting the risks of large language models across the areas of fairness, trustworthiness, and safety. I will first describe my research in the detection of dialect bias between African American English (AAE) vs. Standard American English (SAE). The second part investigates the trustworthiness of models through the memorization and subsequent generation of conspiracy theories. I will end my talk with recent work in AI safety regarding text that may lead to physical harm.
Biography
Sharon is a 5th-year Ph.D. candidate at the University of California, Santa Barbara, where she is advised by Professor William Wang. Her research interests lie in natural language processing, with a focus on Responsible AI. Sharon’s research spans the subareas of fairness, trustworthiness, and safety, with publications in ACL, EMNLP, WWW, and LREC. She has spent summers interning at AWS, Meta, and Pinterest. Sharon is a 2022 EECS Rising Star and a current recipient of the Amazon Alexa AI Fellowship for Responsible AI.
Abstract
Recent advances in large pretrained language models have unlocked new exciting applications for Natural Language Generation for creative tasks, such as lyrics or humour generation. In this talk we will discuss recent works by our team at Alexa AI and discuss current challenges: (1) Pun understanding and generation: We release new datasets for pun understanding and the novel task of context-situated pun generation, and demonstrate the value of our annotations for pun classification and generation tasks. (2) Song lyric generation: we design a hierarchical lyric generation framework that enables us to generate pleasantly-singable lyrics without training on melody-lyric aligned data, and show that our approach is competitive with strong baselines supervised on parallel data. (3) Create with Alexa: a multimodal story creation experience recently launched on Alexa devices, which leverages story text generation models in tandem with story visualization and background music generation models to produce multimodal stories for kids.
Biography
Alessandra Cervone is an Applied Scientist in the Natural Understanding team at Amazon Alexa AI. Alessandra holds an MSc in Speech and Language Processing from University of Edinburgh and a PhD in CS from University of Trento (Italy). During her PhD, Alessandra worked on computational models of coherence in open-domain dialogue advised by Giuseppe Riccardi. In the first year of the PhD, she was the team leader of one of the teams selected to compete in the first edition of the Alexa Prize. More recently, her research interests have been focused on natural language generation and its evaluation, in particular in the context of creative AI applications.