Building chatbots from forum data

Model selection using question answering metrics

Martin Boyanov, Ivan Koychev, Preslav Nakov, Alessandro Moschitti, Giovanni Martino

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Citations (Scopus)

Abstract

We propose to use question answering (QA) data from Web forums to train chatbots from scratch, i.e., without dialog training data. First, we extract pairs of question and answer sentences from the typically much longer texts of questions and answers in a forum. We then use these shorter texts to train seq2seq models in a more efficient way. We further improve the parameter optimization using a new model selection strategy based on QA measures. Finally, we propose to use extrinsic evaluation with respect to a QA task as an automatic evaluation method for chatbots. The evaluation shows that the model achieves a MAP of 63.5% on the extrinsic task. Moreover, it can answer correctly 49.5% of the questions when they are similar to questions asked in the forum, and 47.3% of the questions when they are more conversational in style.
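
As a rough illustration of the QA measure named in the abstract, the sketch below computes mean average precision (MAP) over ranked answer lists and picks the candidate model with the highest score on a development set. It is only a minimal sketch under stated assumptions: the function names, the checkpoint names, and the relevance judgments are hypothetical, and the selection rule (take the highest MAP) is an assumed reading of "model selection based on QA measures", not the authors' published procedure.

```python
from typing import Dict, List, Sequence


def average_precision(relevance: Sequence[bool]) -> float:
    """Average precision of one ranked list.

    relevance[i] is True when the answer at rank i + 1 is judged relevant.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings: Sequence[Sequence[bool]]) -> float:
    """Mean of the per-question average precision values."""
    if not rankings:
        return 0.0
    return sum(average_precision(r) for r in rankings) / len(rankings)


def select_checkpoint(dev_rankings: Dict[str, List[List[bool]]]) -> str:
    """Return the name of the checkpoint with the highest MAP (assumed rule)."""
    return max(
        dev_rankings,
        key=lambda name: mean_average_precision(dev_rankings[name]),
    )


# Hypothetical relevance judgments: one ranked list per development question.
dev_rankings = {
    "checkpoint_a": [[True, False, True], [False, True]],
    "checkpoint_b": [[False, True, True], [False, False, True]],
}

best = select_checkpoint(dev_rankings)
print(best)
```

In this toy data the first checkpoint ranks relevant answers higher on both questions, so it is the one selected. A real setup would replace the hand-written judgments with rankings produced by scoring candidate answers with each trained model.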

Original language: English
Title of host publication: International Conference on Recent Advances in Natural Language Processing
Subtitle of host publication: Meet Deep Learning, RANLP 2017 - Proceedings
Publisher: Association for Computational Linguistics (ACL)
Pages: 121-129
Number of pages: 9
Volume: 2017-September
ISBN (Electronic): 9789544520489
DOI: https://doi.org/10.26615/978-954-452-049-6-018
Publication status: Published - 1 Jan 2017
Event: 11th International Conference on Recent Advances in Natural Language Processing, RANLP 2017 - Varna, Bulgaria
Duration: 2 Sep 2017 to 8 Sep 2017

Other

Other: 11th International Conference on Recent Advances in Natural Language Processing, RANLP 2017
Country: Bulgaria
City: Varna
Period: 2/9/17 to 8/9/17

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Science Applications
  • Software
  • Electrical and Electronic Engineering

Cite this

Boyanov, M., Koychev, I., Nakov, P., Moschitti, A., & Martino, G. (2017). Building chatbots from forum data: Model selection using question answering metrics. In International Conference on Recent Advances in Natural Language Processing: Meet Deep Learning, RANLP 2017 - Proceedings (Vol. 2017-September, pp. 121-129). Association for Computational Linguistics (ACL). https://doi.org/10.26615/978-954-452-049-6-018

Building chatbots from forum data: Model selection using question answering metrics. / Boyanov, Martin; Koychev, Ivan; Nakov, Preslav; Moschitti, Alessandro; Martino, Giovanni.

International Conference on Recent Advances in Natural Language Processing: Meet Deep Learning, RANLP 2017 - Proceedings. Vol. 2017-September. Association for Computational Linguistics (ACL), 2017. p. 121-129.

Boyanov, M, Koychev, I, Nakov, P, Moschitti, A & Martino, G 2017, Building chatbots from forum data: Model selection using question answering metrics. in International Conference on Recent Advances in Natural Language Processing: Meet Deep Learning, RANLP 2017 - Proceedings. vol. 2017-September, Association for Computational Linguistics (ACL), pp. 121-129, 11th International Conference on Recent Advances in Natural Language Processing, RANLP 2017, Varna, Bulgaria, 2/9/17. https://doi.org/10.26615/978-954-452-049-6-018
Boyanov M, Koychev I, Nakov P, Moschitti A, Martino G. Building chatbots from forum data: Model selection using question answering metrics. In International Conference on Recent Advances in Natural Language Processing: Meet Deep Learning, RANLP 2017 - Proceedings. Vol. 2017-September. Association for Computational Linguistics (ACL). 2017. p. 121-129 https://doi.org/10.26615/978-954-452-049-6-018
Boyanov, Martin; Koychev, Ivan; Nakov, Preslav; Moschitti, Alessandro; Martino, Giovanni. / Building chatbots from forum data: Model selection using question answering metrics. International Conference on Recent Advances in Natural Language Processing: Meet Deep Learning, RANLP 2017 - Proceedings. Vol. 2017-September. Association for Computational Linguistics (ACL), 2017. pp. 121-129.
@inproceedings{fc5bdd19ee9c4677974aa9e0a1de62cb,
title = "Building chatbots from forum data: Model selection using question answering metrics",
abstract = "We propose to use question answering (QA) data from Web forums to train chatbots from scratch, i.e., without dialog training data. First, we extract pairs of question and answer sentences from the typically much longer texts of questions and answers in a forum. We then use these shorter texts to train seq2seq models in a more efficient way. We further improve the parameter optimization using a new model selection strategy based on QA measures. Finally, we propose to use extrinsic evaluation with respect to a QA task as an automatic evaluation method for chatbots. The evaluation shows that the model achieves a MAP of 63.5{\%} on the extrinsic task. Moreover, it can answer correctly 49.5{\%} of the questions when they are similar to questions asked in the forum, and 47.3{\%} of the questions when they are more conversational in style.",
author = "Martin Boyanov and Ivan Koychev and Preslav Nakov and Alessandro Moschitti and Giovanni Martino",
year = "2017",
month = "1",
day = "1",
doi = "10.26615/978-954-452-049-6-018",
language = "English",
volume = "2017-September",
pages = "121--129",
booktitle = "International Conference on Recent Advances in Natural Language Processing",
publisher = "Association for Computational Linguistics (ACL)",

}

TY - GEN

T1 - Building chatbots from forum data

T2 - Model selection using question answering metrics

AU - Boyanov, Martin

AU - Koychev, Ivan

AU - Nakov, Preslav

AU - Moschitti, Alessandro

AU - Martino, Giovanni

PY - 2017/1/1

Y1 - 2017/1/1

N2 - We propose to use question answering (QA) data from Web forums to train chatbots from scratch, i.e., without dialog training data. First, we extract pairs of question and answer sentences from the typically much longer texts of questions and answers in a forum. We then use these shorter texts to train seq2seq models in a more efficient way. We further improve the parameter optimization using a new model selection strategy based on QA measures. Finally, we propose to use extrinsic evaluation with respect to a QA task as an automatic evaluation method for chatbots. The evaluation shows that the model achieves a MAP of 63.5% on the extrinsic task. Moreover, it can answer correctly 49.5% of the questions when they are similar to questions asked in the forum, and 47.3% of the questions when they are more conversational in style.

AB - We propose to use question answering (QA) data from Web forums to train chatbots from scratch, i.e., without dialog training data. First, we extract pairs of question and answer sentences from the typically much longer texts of questions and answers in a forum. We then use these shorter texts to train seq2seq models in a more efficient way. We further improve the parameter optimization using a new model selection strategy based on QA measures. Finally, we propose to use extrinsic evaluation with respect to a QA task as an automatic evaluation method for chatbots. The evaluation shows that the model achieves a MAP of 63.5% on the extrinsic task. Moreover, it can answer correctly 49.5% of the questions when they are similar to questions asked in the forum, and 47.3% of the questions when they are more conversational in style.

UR - http://www.scopus.com/inward/record.url?scp=85045740407&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85045740407&partnerID=8YFLogxK

U2 - 10.26615/978-954-452-049-6-018

DO - 10.26615/978-954-452-049-6-018

M3 - Conference contribution

VL - 2017-September

SP - 121

EP - 129

BT - International Conference on Recent Advances in Natural Language Processing

PB - Association for Computational Linguistics (ACL)

ER -