Regularized and retrofitted models for learning sentence representation with context

Tanay Kumar Saha, Shafiq Rayhan Joty, Naeemul Hassan, Mohammad Al Hasan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

Vector representations of sentences are important for many text processing tasks that involve classifying, clustering, or ranking sentences. Bag-of-words representations have long been used for these tasks. In recent years, distributed representations of sentences learned by neural models from unlabeled data have been shown to outperform traditional bag-of-words representations. However, most existing neural methods consider only the content of a sentence and disregard its relations with other sentences in the context. In this paper, we first characterize two types of contexts depending on their scope and utility. We then propose two approaches to incorporate contextual information into content-based models. We evaluate our sentence representation models in a setup where context is available to infer sentence vectors. Experimental results demonstrate that our proposed models outperform existing models on three fundamental tasks: classifying, clustering, and ranking sentences.

Original language: English
Title of host publication: CIKM 2017 - Proceedings of the 2017 ACM Conference on Information and Knowledge Management
Publisher: Association for Computing Machinery
Pages: 547-556
Number of pages: 10
Volume: Part F131841
ISBN (Electronic): 9781450349185
DOIs: https://doi.org/10.1145/3132847.3133011
Publication status: Published - 6 Nov 2017
Event: 26th ACM International Conference on Information and Knowledge Management, CIKM 2017 - Singapore, Singapore
Duration: 6 Nov 2017 – 10 Nov 2017

Other

Other: 26th ACM International Conference on Information and Knowledge Management, CIKM 2017
Country: Singapore
City: Singapore
Period: 6/11/17 – 10/11/17

Keywords

  • Classification
  • Clustering
  • Discourse
  • Distributed representation of sentences
  • Feature learning
  • Ranking
  • Retrofitting
  • Sen2Vec

ASJC Scopus subject areas

  • Business, Management and Accounting (all)
  • Decision Sciences (all)

Cite this

Saha, T. K., Rayhan Joty, S., Hassan, N., & Al Hasan, M. (2017). Regularized and retrofitted models for learning sentence representation with context. In CIKM 2017 - Proceedings of the 2017 ACM Conference on Information and Knowledge Management (Vol. Part F131841, pp. 547-556). Association for Computing Machinery. https://doi.org/10.1145/3132847.3133011