Noun compound interpretation using paraphrasing verbs: Feasibility study

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

24 Citations (Scopus)

Abstract

The paper addresses an important challenge for the automatic processing of written English text: understanding the semantics of noun compounds. Following Downing (1977) [1], we define noun compounds as sequences of nouns acting as a single noun, e.g., bee honey, apple cake, stem cell, etc. In our view, they are best characterised by the set of all possible paraphrasing verbs that can connect the target nouns, with associated weights, e.g., malaria mosquito can be represented as follows: carry (23), spread (16), cause (12), transmit (9), etc. These verbs are directly usable as paraphrases, and using several of them simultaneously yields an appealing fine-grained semantic representation. In the present paper, we describe the process of constructing such representations for 250 noun-noun compounds previously proposed in the linguistic literature by Levi (1978) [2]. In particular, using human subjects recruited through the Amazon Mechanical Turk Web Service, we create a valuable manually annotated resource for noun compound interpretation, which we make publicly available in the hope of inspiring further research in paraphrase-based noun compound interpretation. We further perform a number of experiments, including a comparison to automatically generated weight vectors, in order to assess the quality of the dataset and the feasibility of using paraphrasing verbs to characterise the semantics of noun compounds; the results are quite promising.
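The representation described in the abstract can be illustrated with a small sketch. The weights for malaria mosquito below are the ones quoted in the abstract; everything else is an assumption made for illustration: the second vector (`auto`) is invented, the paraphrase template is hypothetical, and cosine similarity is only one plausible way to compare two weight vectors, since the abstract does not say which measure the paper uses.

```python
import math
from collections import Counter

# Human-derived verb weights for "malaria mosquito", as quoted in the abstract.
human = Counter({"carry": 23, "spread": 16, "cause": 12, "transmit": 9})

# Hypothetical automatically generated weights (invented for this example).
auto = Counter({"carry": 10, "transmit": 7, "spread": 4, "bite": 2})


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse verb-weight vectors.

    Assumption: cosine is used here only as a familiar similarity measure;
    the abstract does not specify how the vectors are compared.
    """
    dot = sum(weight * v[verb] for verb, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def paraphrases(modifier: str, head: str, weights: Counter, top: int = 3) -> list:
    """Turn the highest-weighted verbs into paraphrases (hypothetical template)."""
    return [f"{head} that can {verb} {modifier}"
            for verb, _ in weights.most_common(top)]


if __name__ == "__main__":
    print(paraphrases("malaria", "mosquito", human))
    print(round(cosine(human, auto), 3))
```

Storing the verbs in a `Counter` keeps the vector sparse: a verb missing from one annotation source simply contributes zero to the dot product, so vectors built from different verb inventories can be compared directly.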

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 103-117
Number of pages: 15
Volume: 5253 LNAI
DOI: https://doi.org/10.1007/978-3-540-85776-1_10
Publication status: Published - 25 Sep 2008
Externally published: Yes
Event: 13th International Conference on Artificial Intelligence: Methodology, Systems, and Applications, AIMSA 2008 - Varna, Bulgaria
Duration: 4 Sep 2008 - 6 Sep 2008

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5253 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 13th International Conference on Artificial Intelligence: Methodology, Systems, and Applications, AIMSA 2008
Country: Bulgaria
City: Varna
Period: 4/9/08 - 6/9/08

Fingerprint

Semantics
Malaria
Apple
Stem Cells
Linguistics
Web services
Resources
Target
Processing
Experiment
Interpretation
Text
Human

Keywords

  • Lexical semantics
  • Noun compounds
  • Paraphrasing

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Nakov, P. (2008). Noun compound interpretation using paraphrasing verbs: Feasibility study. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5253 LNAI, pp. 103-117). https://doi.org/10.1007/978-3-540-85776-1_10

@inproceedings{7c87574dc2624fafbfff0daef3bb0f2d,
title = "Noun compound interpretation using paraphrasing verbs: Feasibility study",
abstract = "The paper addresses an important challenge for the automatic processing of English written text: understanding noun compounds' semantics. Following Downing (1977) [1], we define noun compounds as sequences of nouns acting as a single noun, e.g., bee honey, apple cake, stem cell, etc. In our view, they are best characterised by the set of all possible paraphrasing verbs that can connect the target nouns, with associated weights, e.g., malaria mosquito can be represented as follows: carry (23), spread (16), cause (12), transmit (9), etc. These verbs are directly usable as paraphrases, and using multiple of them simultaneously yields an appealing fine-grained semantic representation. In the present paper, we describe the process of constructing such representations for 250 noun-noun compounds previously proposed in the linguistic literature by Levi (1978) [2]. In particular, using human subjects recruited through Amazon Mechanical Turk Web Service, we create a valuable manually-annotated resource for noun compound interpretation, which we make publicly available with the hope to inspire further research in paraphrase-based noun compound interpretation. We further perform a number of experiments, including a comparison to automatically generated weight vectors, in order to assess the dataset quality and the feasibility of the idea of using paraphrasing verbs to characterise noun compounds' semantics; the results are quite promising.",
keywords = "Lexical semantics, Noun compounds, Paraphrasing",
author = "Preslav Nakov",
year = "2008",
month = "9",
day = "25",
doi = "10.1007/978-3-540-85776-1_10",
language = "English",
isbn = "3540857753",
volume = "5253 LNAI",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "103--117",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY - GEN

T1 - Noun compound interpretation using paraphrasing verbs

T2 - Feasibility study

AU - Nakov, Preslav

PY - 2008/9/25

Y1 - 2008/9/25

N2 - The paper addresses an important challenge for the automatic processing of English written text: understanding noun compounds' semantics. Following Downing (1977) [1], we define noun compounds as sequences of nouns acting as a single noun, e.g., bee honey, apple cake, stem cell, etc. In our view, they are best characterised by the set of all possible paraphrasing verbs that can connect the target nouns, with associated weights, e.g., malaria mosquito can be represented as follows: carry (23), spread (16), cause (12), transmit (9), etc. These verbs are directly usable as paraphrases, and using multiple of them simultaneously yields an appealing fine-grained semantic representation. In the present paper, we describe the process of constructing such representations for 250 noun-noun compounds previously proposed in the linguistic literature by Levi (1978) [2]. In particular, using human subjects recruited through Amazon Mechanical Turk Web Service, we create a valuable manually-annotated resource for noun compound interpretation, which we make publicly available with the hope to inspire further research in paraphrase-based noun compound interpretation. We further perform a number of experiments, including a comparison to automatically generated weight vectors, in order to assess the dataset quality and the feasibility of the idea of using paraphrasing verbs to characterise noun compounds' semantics; the results are quite promising.

KW - Lexical semantics

KW - Noun compounds

KW - Paraphrasing

UR - http://www.scopus.com/inward/record.url?scp=52149099027&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=52149099027&partnerID=8YFLogxK

U2 - 10.1007/978-3-540-85776-1_10

DO - 10.1007/978-3-540-85776-1_10

M3 - Conference contribution

AN - SCOPUS:52149099027

SN - 3540857753

SN - 9783540857754

VL - 5253 LNAI

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 103

EP - 117

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

ER -