Control, Generate, Augment: A Scalable Framework for Multi-Attribute Text Generation

Nov 1, 2020

Giuseppe Russo

Nora Hollenstein

Claudiu Cristian Musat

Ce Zhang

Abstract

We introduce CGA, a conditional VAE architecture, to control, generate, and augment text. CGA generates natural English sentences while controlling multiple semantic and syntactic attributes by combining adversarial learning with a context-aware loss and a cyclical word dropout routine. We demonstrate the value of the individual model components in an ablation study. A single discriminator, independent of the number of attributes, keeps the approach scalable. A series of automatic and human assessments shows high quality, diversity, and attribute control in the generated sentences. As the main application of our work, we test the potential of this new NLG model in a data augmentation scenario. In a downstream NLP task, augmenting with sentences generated by our CGA model yields significant improvements over a strong baseline, with classification performance often comparable to adding the same amount of additional real data.

Type

Conference paper

Publication

In Findings of the Association for Computational Linguistics: EMNLP 2020

Last updated on Nov 1, 2020

Natural Language Generation, Data Augmentation, Conditional VAE, Controlled Text Generation
