Dual-factor Generation Model for Conversation

Authors:
Ruqing Zhang

University of Chinese Academy of Sciences, Beijing, China; CAS Key Lab of Network Data Science and Technology, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China

Jiafeng Guo

University of Chinese Academy of Sciences, Beijing, China; CAS Key Lab of Network Data Science and Technology, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China

ORCID: 0000-0002-1086-0202

Yixing Fan

University of Chinese Academy of Sciences, Beijing, China; CAS Key Lab of Network Data Science and Technology, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China

Yanyan Lan

University of Chinese Academy of Sciences, Beijing, China; CAS Key Lab of Network Data Science and Technology, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China

Xueqi Cheng

University of Chinese Academy of Sciences, Beijing, China; CAS Key Lab of Network Data Science and Technology, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China

ACM Transactions on Information Systems, Volume 38, Issue 3, Article No. 31, pp. 1–31. https://doi.org/10.1145/3394052

Published: 05 June 2020

Abstract

The conversation task is usually formulated as a conditional generation problem, i.e., generating a natural and meaningful response given the input utterance. This formulation rests on an oversimplified assumption: that the response depends solely on the input utterance. It ignores the subjective factor of the responder, e.g., their emotion or knowledge state, which in practice strongly affects the response. Without explicitly differentiating the subjective factor behind the response, existing generation models can only learn the general shape of conversations, which leads to bland responses. Moreover, the existing generation process offers no intervention mechanism, since the response is fully determined by the input utterance. In this work, we propose to view the conversation task as a dual-factor generation problem, with an objective factor denoting the input utterance and a subjective factor denoting the responder state. We extend the existing neural sequence-to-sequence (Seq2Seq) model to accommodate responder state modeling. We introduce two types of responder state, discrete and continuous, to model the emotion state and the topic preference state, respectively. We show that our dual-factor generation model not only fits the conversation data better, but also lets us actively control the generation of the response with respect to sentiment or topic specificity.
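As a rough illustration of the dual-factor idea described above (not the authors' implementation), the sketch below shows one way a generator's conditioning input could combine the objective factor with a responder state. Everything in it is an assumption made for illustration: the emotion labels, the [0, 1] range for the continuous topic-specificity value, the randomly initialised embedding table, and all function names.

```python
import random

# Hypothetical discrete emotion states; the abstract does not list them.
EMOTIONS = ["neutral", "positive", "negative"]


def make_embedding_table(labels, dim, seed=0):
    """Create a small random embedding per discrete state (stand-in for learned weights)."""
    rng = random.Random(seed)
    return {label: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for label in labels}


def responder_state(emotion, specificity, table):
    """Build a subjective-factor vector from a discrete and a continuous state."""
    if emotion not in table:
        raise KeyError(f"unknown emotion state: {emotion}")
    # Assumption: the continuous state is a scalar normalised to [0, 1].
    if not 0.0 <= specificity <= 1.0:
        raise ValueError("specificity is assumed to lie in [0, 1]")
    return table[emotion] + [specificity]


def decoder_input(utterance_encoding, state_vector):
    """Concatenate the objective factor (encoded utterance) with the subjective factor."""
    return list(utterance_encoding) + list(state_vector)


table = make_embedding_table(EMOTIONS, dim=4)
state = responder_state("positive", 0.8, table)
conditioned = decoder_input([0.1, 0.2, 0.3], state)
print(len(conditioned))  # utterance dims + embedding dims + 1 continuous value
```

Changing only the emotion label or the specificity value while holding the utterance encoding fixed changes the conditioning vector, which is the kind of intervention point the abstract says a single-factor formulation lacks.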

Index Terms

Dual-factor Generation Model for Conversation
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Discourse, dialogue and pragmatics

Published in
ACM Transactions on Information Systems Volume 38, Issue 3
July 2020
311 pages
ISSN: 1046-8188
EISSN: 1558-2868
DOI: 10.1145/3394096
Editor:
Maarten de Rijke
University of Amsterdam, The Netherlands
Copyright © 2020 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 5 June 2020
- Online AM: 7 May 2020
- Accepted: 1 April 2020
- Revised: 1 February 2020
- Received: 1 October 2019

Author Tags
Conversation
dual-factor generation
responder state modeling
Qualifiers
- research-article
- Research
- Refereed

Article Metrics
- Total Citations: 15
- Total Downloads: 761
- Downloads (Last 12 months): 160
- Downloads (Last 6 weeks): 29