PAN@FIRE

Overview of the cross-language !ndian text re-use detection competition

Alberto Barron, Paolo Rosso, Sobha Lalitha Devi, Paul Clough, Mark Stevenson

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Citations (Scopus)

Abstract

The development of models for automatic detection of text re-use and plagiarism across languages has received increasing attention in recent years. However, the lack of an evaluation framework composed of annotated datasets has caused these efforts to be isolated. In this paper we present the CL!TR 2011 corpus, the first manually created corpus for the analysis of cross-language text re-use between English and Hindi. The corpus was used during the Cross-Language !ndian Text Re-Use Detection Competition. Here we overview the approaches applied by the contestants and evaluate their quality when detecting a re-used text together with its source.

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 59-70
Number of pages: 12
Volume: 7536 LNCS
DOI: https://doi.org/10.1007/978-3-642-40087-2_6
Publication status: Published - 2013
Externally published: Yes
Event: 3rd International Workshop on Multilingual Information Access in South Asian Languages, FIRE 2011 - Bombay
Duration: 2 Dec 2011 to 4 Dec 2011

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7536 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 3rd International Workshop on Multilingual Information Access in South Asian Languages, FIRE 2011
City: Bombay
Period: 2/12/11 to 4/12/11

Fingerprint

Reuse
Language
Text
Evaluate
Evaluation
Corpus
Model

ASJC Scopus subject areas

  • Computer Science(all)
  • Theoretical Computer Science

Cite this

Barron, A., Rosso, P., Devi, S. L., Clough, P., & Stevenson, M. (2013). PAN@FIRE: Overview of the cross-language !ndian text re-use detection competition. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7536 LNCS, pp. 59-70). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 7536 LNCS). https://doi.org/10.1007/978-3-642-40087-2_6

PAN@FIRE : Overview of the cross-language !ndian text re-use detection competition. / Barron, Alberto; Rosso, Paolo; Devi, Sobha Lalitha; Clough, Paul; Stevenson, Mark.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 7536 LNCS 2013. p. 59-70 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 7536 LNCS).

Barron, A, Rosso, P, Devi, SL, Clough, P & Stevenson, M 2013, PAN@FIRE: Overview of the cross-language !ndian text re-use detection competition. in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). vol. 7536 LNCS, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 7536 LNCS, pp. 59-70, 3rd International Workshop on Multilingual Information Access in South Asian Languages, FIRE 2011, Bombay, 2/12/11. https://doi.org/10.1007/978-3-642-40087-2_6
Barron A, Rosso P, Devi SL, Clough P, Stevenson M. PAN@FIRE: Overview of the cross-language !ndian text re-use detection competition. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 7536 LNCS. 2013. p. 59-70. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). https://doi.org/10.1007/978-3-642-40087-2_6
Barron, Alberto ; Rosso, Paolo ; Devi, Sobha Lalitha ; Clough, Paul ; Stevenson, Mark. / PAN@FIRE : Overview of the cross-language !ndian text re-use detection competition. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 7536 LNCS 2013. pp. 59-70 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{7ca0080b781b44bda443a5b38ab1faa7,
title = "PAN@FIRE: Overview of the cross-language !ndian text re-use detection competition",
abstract = "The development of models for automatic detection of text re-use and plagiarism across languages has received increasing attention in recent years. However, the lack of an evaluation framework composed of annotated datasets has caused these efforts to be isolated. In this paper we present the CL!TR 2011 corpus, the first manually created corpus for the analysis of cross-language text re-use between English and Hindi. The corpus was used during the Cross-Language !ndian Text Re-Use Detection Competition. Here we overview the approaches applied by the contestants and evaluate their quality when detecting a re-used text together with its source.",
author = "Alberto Barron and Paolo Rosso and Devi, {Sobha Lalitha} and Paul Clough and Mark Stevenson",
year = "2013",
doi = "10.1007/978-3-642-40087-2_6",
language = "English",
isbn = "9783642400865",
volume = "7536 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "59--70",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY - GEN

T1 - PAN@FIRE

T2 - Overview of the cross-language !ndian text re-use detection competition

AU - Barron, Alberto

AU - Rosso, Paolo

AU - Devi, Sobha Lalitha

AU - Clough, Paul

AU - Stevenson, Mark

PY - 2013

Y1 - 2013

N2 - The development of models for automatic detection of text re-use and plagiarism across languages has received increasing attention in recent years. However, the lack of an evaluation framework composed of annotated datasets has caused these efforts to be isolated. In this paper we present the CL!TR 2011 corpus, the first manually created corpus for the analysis of cross-language text re-use between English and Hindi. The corpus was used during the Cross-Language !ndian Text Re-Use Detection Competition. Here we overview the approaches applied by the contestants and evaluate their quality when detecting a re-used text together with its source.

AB - The development of models for automatic detection of text re-use and plagiarism across languages has received increasing attention in recent years. However, the lack of an evaluation framework composed of annotated datasets has caused these efforts to be isolated. In this paper we present the CL!TR 2011 corpus, the first manually created corpus for the analysis of cross-language text re-use between English and Hindi. The corpus was used during the Cross-Language !ndian Text Re-Use Detection Competition. Here we overview the approaches applied by the contestants and evaluate their quality when detecting a re-used text together with its source.

UR - http://www.scopus.com/inward/record.url?scp=84893397088&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84893397088&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-40087-2_6

DO - 10.1007/978-3-642-40087-2_6

M3 - Conference contribution

SN - 9783642400865

VL - 7536 LNCS

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 59

EP - 70

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

ER -