Twin-GAN for Neural Machine Translation
Zhao, J.,
Huang, L.,
Sun, R.,
Bing, L.,
and Qu, H.
ICAART,
2021
In recent years, Neural Machine Translation (NMT) has achieved great success, but two important problems remain. One is the exposure bias caused by the mismatch between training and inference strategies; the other is that the NMT model generates the best candidate word for the current step, which may nevertheless be a poor choice for the sentence as a whole. Popular methods for these two problems are Scheduled Sampling and Generative Adversarial Networks (GANs), respectively, and both have achieved some success. In this paper, we propose a more precise approach called “similarity selection”, combined with a new GAN structure called twin-GAN, to solve both problems. The twin-GAN contains two generators and two discriminators. One generator uses similarity selection, and the other decodes in the same way as at inference time, simulating the inference process. One discriminator guides the generators at the sentence level, and the other forces the two generators toward similar output distributions. We conducted extensive experiments on IWSLT 2014 German→English (De→En) and WMT’17 Chinese→English (Zh→En), and the results show that our model outperforms strong baseline models based on recurrent architectures.
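The abstract's two generator strategies and the "twin" constraint can be sketched at a single decoding step. This is an illustrative toy only: the helper names, the top-k similarity rule, and the use of a KL divergence as the agreement signal are assumptions for exposition, not the paper's actual model, which uses recurrent NMT generators and learned discriminators.

```python
# Toy sketch of one decoding step in a twin-GAN-style setup (assumed details).
import math

VOCAB = ["the", "cat", "sat", "mat"]

def greedy_pick(dist):
    """Inference-style generator step: feed back its own best candidate."""
    return max(range(len(dist)), key=lambda i: dist[i])

def similarity_selection(dist, reference_id, top_k=2):
    """Sketch of "similarity selection": among the top-k candidates,
    pick the one most similar to the reference token (here, a toy
    similarity: closeness of vocabulary indices)."""
    top = sorted(range(len(dist)), key=lambda i: dist[i], reverse=True)[:top_k]
    return min(top, key=lambda i: abs(i - reference_id))

def kl(p, q, eps=1e-9):
    """Divergence the second discriminator would drive toward zero,
    pulling the two generators' output distributions together."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

# Toy next-token distributions from the two generators at one step.
g_ss  = [0.10, 0.40, 0.35, 0.15]   # generator trained with similarity selection
g_inf = [0.12, 0.38, 0.36, 0.14]   # generator simulating inference

ref = 2  # reference token id ("sat")
print(VOCAB[similarity_selection(g_ss, ref)])  # candidate nearest the reference
print(VOCAB[greedy_pick(g_inf)])               # generator's own best guess
print(round(kl(g_ss, g_inf), 4))               # small value: the twins agree
```

In a full training loop, the sentence-level discriminator would score complete translations from both generators, while the divergence term keeps the similarity-selection generator and the inference-simulating generator from drifting apart.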