The patents retrieval prototype in the MOLTO project

Milen Chechev, Meritxell Gonzàlez, Lluis Marques, Cristina España-Bonet

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Citation (Scopus)

Abstract

This paper describes the patents retrieval prototype developed within the MOLTO project. The prototype aims to provide a multilingual natural language interface for querying the content of patent documents. The system focuses on the biomedical and pharmaceutical domain and includes the translation of patent claims and abstracts into English, French and German. To obtain the best retrieval results over both the patent information and the text content, patent documents are preprocessed and semantically annotated. The annotations are then stored and indexed in an OWLIM semantic repository, which contains a patent-specific ontology and others from different domains. The prototype, accessible online at http://molto-patents.ontotext.com, presents a multilingual natural language interface to query the retrieval system. In MOLTO, the multilingualism of the queries is addressed by means of the GF Tool, which provides an easy way to build and maintain controlled language grammars for interlingual translation in limited domains. The abstract representation obtained from GF is used to retrieve both the matched RDF instances and the list of patents semantically related to the user's search criteria. The online interface allows users to browse the retrieved patents and shows, in the text, the semantic annotations that explain why a particular patent matched the user's criteria. Copyright is held by the International World Wide Web Conference Committee (IW3C2).

Original language: English
Title of host publication: WWW'12 - Proceedings of the 21st Annual Conference on World Wide Web Companion
Pages: 231-234
Number of pages: 4
DOIs: https://doi.org/10.1145/2187980.2188016
Publication status: Published - 21 May 2012
Externally published: Yes
Event: 21st Annual Conference on World Wide Web, WWW'12 - Lyon, France
Duration: 16 Apr 2012 to 20 Apr 2012

Other

Other: 21st Annual Conference on World Wide Web, WWW'12
Country: France
City: Lyon
Period: 16/4/12 to 20/4/12

Fingerprint

Semantics
World Wide Web
Drug products
Ontology

Keywords

  • Automatic semantic annotations
  • Multilingual information retrieval
  • Patent translation

ASJC Scopus subject areas

  • Computer Networks and Communications

Cite this

Chechev, M., Gonzàlez, M., Marques, L., & España-Bonet, C. (2012). The patents retrieval prototype in the MOLTO project. In WWW'12 - Proceedings of the 21st Annual Conference on World Wide Web Companion (pp. 231-234). https://doi.org/10.1145/2187980.2188016

@inproceedings{187de41dfcbe4e50aa92ea3e300241d3,
title = "The patents retrieval prototype in the MOLTO project",
abstract = "This paper describes the patents retrieval prototype developed within the MOLTO project. The prototype aims to provide a multilingual natural language interface for querying the content of patent documents. The developed system is focused on the biomedical and pharmaceutical domain and includes the translation of the patent claims and abstracts into English, French and German. Aiming at the best retrieval results of the patent information and text content, patent documents are preprocessed and semantically annotated. Then, the annotations are stored and indexed in an OWLIM semantic repository, which contains a patent specific ontology and others from different domains. The prototype, accessible online at http://molto-patents.ontotext.com, presents a multilingual natural language interface to query the retrieval system. In MOLTO, the multilingualism of the queries is addressed by means of the GF Tool, which provides an easy way to build and maintain controlled language grammars for interlingual translation in limited domains. The abstract representation obtained from the GF is used to retrieve both the matched RDF instances and the list of patents semantically related to the user's search criteria. The online interface allows to browse the retrieved patents and shows on the text the semantic annotations that explain the reason why any particular patent has matched the user's criteria. Copyright is held by the International World Wide Web Conference Committee (IW3C2).",
keywords = "Automatic semantic annotations, Multilingual information retrieval, Patent translation",
author = "Milen Chechev and Meritxell Gonz{\`a}lez and Lluis Marques and Cristina Espa{\~n}a-Bonet",
year = "2012",
month = "5",
day = "21",
doi = "10.1145/2187980.2188016",
language = "English",
isbn = "9781450312301",
pages = "231--234",
booktitle = "WWW'12 - Proceedings of the 21st Annual Conference on World Wide Web Companion",

}

TY - GEN

T1 - The patents retrieval prototype in the MOLTO project

AU - Chechev, Milen

AU - Gonzàlez, Meritxell

AU - Marques, Lluis

AU - España-Bonet, Cristina

PY - 2012/5/21

Y1 - 2012/5/21

N2 - This paper describes the patents retrieval prototype developed within the MOLTO project. The prototype aims to provide a multilingual natural language interface for querying the content of patent documents. The developed system is focused on the biomedical and pharmaceutical domain and includes the translation of the patent claims and abstracts into English, French and German. Aiming at the best retrieval results of the patent information and text content, patent documents are preprocessed and semantically annotated. Then, the annotations are stored and indexed in an OWLIM semantic repository, which contains a patent specific ontology and others from different domains. The prototype, accessible online at http://molto-patents.ontotext.com, presents a multilingual natural language interface to query the retrieval system. In MOLTO, the multilingualism of the queries is addressed by means of the GF Tool, which provides an easy way to build and maintain controlled language grammars for interlingual translation in limited domains. The abstract representation obtained from the GF is used to retrieve both the matched RDF instances and the list of patents semantically related to the user's search criteria. The online interface allows to browse the retrieved patents and shows on the text the semantic annotations that explain the reason why any particular patent has matched the user's criteria. Copyright is held by the International World Wide Web Conference Committee (IW3C2).

KW - Automatic semantic annotations

KW - Multilingual information retrieval

KW - Patent translation

UR - http://www.scopus.com/inward/record.url?scp=84861016496&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84861016496&partnerID=8YFLogxK

U2 - 10.1145/2187980.2188016

DO - 10.1145/2187980.2188016

M3 - Conference contribution

SN - 9781450312301

SP - 231

EP - 234

BT - WWW'12 - Proceedings of the 21st Annual Conference on World Wide Web Companion

ER -