QAGAN: Adversarial Approach To Learning Domain Invariant Language Features

Shubham Shrivastava

Ford Greenfield Labs, Palo Alto

[sshriva5@ford.com]

Kaiyue Wang

Waymo LLC

[garywang@stanford.edu]

Abstract

Training models that are robust to data domain shift has gained increasing interest in both academia and industry. Question answering, one of the canonical problems in Natural Language Processing (NLP) research, has seen much success with the advent of large transformer models. However, existing approaches mostly work under the assumption that data is drawn from the same distribution during training and testing, which is unrealistic and non-scalable in the wild.


In this paper, we explore an adversarial training approach to learning domain-invariant features so that language models can generalize well to out-of-domain datasets. We also investigate several other ways to boost model performance, including data augmentation by paraphrasing sentences, conditioning the end-of-answer-span prediction on the start word, and a carefully designed annealing function. Our initial results show that, in combination with these methods, we achieve a 15.2% improvement in EM score and a 5.6% boost in F1 score on out-of-domain validation datasets over the baseline. We also dissect our model outputs and visualize the model hidden states by projecting them onto a lower-dimensional space, and find that our adversarial training approach indeed encourages the model to learn domain-invariant embeddings and brings them closer together in the embedding space.

tl;dr


In this work, we explore an adversarial training approach to learning domain-invariant features so that language models can generalize well to out-of-domain datasets. We also investigate several other ways to boost model performance, including data augmentation by paraphrasing sentences, conditioning the end-of-answer-span prediction on the start word, and a carefully designed annealing function.
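The page does not spell out the exact annealing function used for the adversarial loss weight; a common choice for this kind of adversarial domain adaptation (popularized by DANN, Ganin et al., 2016) is a sigmoid ramp-up from 0 to 1 over the course of training. The sketch below is an illustrative assumption, not the paper's exact schedule; `gamma` and the loss names are hypothetical:

```python
import math

def adversarial_weight(step, total_steps, gamma=10.0):
    """Anneal the adversarial loss weight from 0 toward 1 as training progresses.

    DANN-style schedule: lambda(p) = 2 / (1 + exp(-gamma * p)) - 1,
    where p = step / total_steps is the training progress in [0, 1].
    """
    p = step / total_steps
    return 2.0 / (1.0 + math.exp(-gamma * p)) - 1.0

# Early in training the weight is ~0, so the QA objective dominates and the
# encoder is not destabilized by a randomly initialized discriminator; the
# adversarial term then ramps up smoothly, e.g.:
#   total_loss = qa_loss + adversarial_weight(step, total_steps) * adv_loss
```

The schedule starts at exactly 0 (at `p = 0` the expression is `2/2 - 1`) and saturates near 1, which is why a warm-up-free adversarial term can still train stably.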

We also dissect our model outputs and visualize the model hidden states by projecting them onto a lower-dimensional space, and find that our adversarial training approach indeed encourages the model to learn domain-invariant embeddings and brings them closer together in the embedding space.
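The page does not say which projection the figures use; a minimal way to produce such a 2-D view is PCA on the pooled hidden states. The sketch below is illustrative, using random data in place of real model embeddings; the function name and the 768-dim embedding size are assumptions:

```python
import numpy as np

def project_2d(hidden_states):
    """Project high-dimensional hidden states to 2-D via PCA for visualization.

    hidden_states: (num_examples, hidden_dim) array, e.g. pooled encoder
    embeddings collected for each QA dataset.
    """
    centered = hidden_states - hidden_states.mean(axis=0, keepdims=True)
    # SVD of the centered data; the top-2 right singular vectors span the
    # directions of maximum variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T  # shape: (num_examples, 2)

# Example with stand-in embeddings from a hypothetical 768-dim encoder:
rng = np.random.default_rng(0)
points = project_2d(rng.normal(size=(100, 768)))
```

Plotting `points` colored by source dataset is then enough to see whether the per-dataset clusters in the figures below merge after QAGAN training.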

Figure - Each dataset's embeddings are clustered in their own little island, signifying that a domain gap exists across various QA datasets.

Figure - Dataset embeddings are closer together and well interleaved, signifying that the domain gap across various QA datasets has been reduced by applying QAGAN.

Figure - Variants of QAGAN

qualitative results

citation

@misc{shrivastava2022qagan,
  title={QAGAN: Adversarial Approach To Learning Domain Invariant Language Features},
  author={Shubham Shrivastava and Kaiyue Wang},
  year={2022},
  url={https://github.com/towardsautonomy/QAGAN}
}