Existing models for open-domain comment generation are difficult to train, and they produce repetitive and uninteresting responses. The problem stems from the multiple, often contradictory responses that a single article can elicit, and from the rigidity of retrieval methods. To address it, we propose an approach that combines retrieval and generation. We introduce an attentive scorer that retrieves informative and relevant comments by leveraging user-generated data. We then use these comments, together with the article, as input to a sequence-to-sequence model with a copy mechanism. We demonstrate the robustness of our model, and how it alleviates the aforementioned issue, on a large-scale comment generation dataset. The results show that the proposed generative model significantly outperforms strong baselines such as Seq2Seq with attention and Information Retrieval models by around 27 and 30 BLEU-1 points, respectively.
Recommended citation: Lin, Z., Winata, G. I., & Fung, P. (2019, May). Learning comment generation by leveraging user-generated data. In ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 7225-7229). IEEE.