An ensemble-rich multi-aspect approach for robust style change detection: Notebook for PAN at CLEF-2018

Dimitrina Zlatkova, Daniel Kopev, Kristiyan Mitov, Atanas Atanasov, Momchil Hardalov, Ivan Koychev, Preslav Nakov

Research output: Contribution to journal > Conference article

1 Citation (Scopus)

Abstract

We describe the winning system for the PAN@CLEF 2018 task on Style Change Detection. Given a document, the goal is to determine whether it contains a style change. We present our supervised approach, which combines a TF.IDF representation of the documents with features specifically engineered for the task, and which makes predictions using an ensemble of diverse models including SVM, Random Forest, AdaBoost, MLP and LightGBM. We further perform a comparative analysis of the models' performance on three different datasets, two of which we have developed for the task. Moreover, we release our code in order to enable further research.
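To make the feature-combination idea in the abstract concrete, here is a minimal sketch in plain Python of concatenating a TF-IDF vector with a few hand-crafted style features. The helper names (`tokenize`, `tfidf_vectors`, `style_features`, `featurize`), the three style features, and the smoothed IDF formula are illustrative assumptions, not the authors' released code or feature set. The ensemble stage (SVM, Random Forest, AdaBoost, MLP, LightGBM) is left out; each resulting row is the kind of input such classifiers could be trained on.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a deliberately simple tokenizer (assumption)."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf_vectors(docs):
    """Return (vocabulary, one TF-IDF vector per document).

    Raw term frequency times a smoothed IDF: log((1 + N) / (1 + df)) + 1.
    """
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append([tf[t] * idf[t] for t in vocab])
    return vocab, vectors


def style_features(text):
    """Hypothetical stylometric features, chosen only for illustration."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = tokenize(text)
    n_words = max(len(words), 1)
    return [
        len(words) / max(len(sentences), 1),          # mean sentence length in words
        sum(len(w) for w in words) / n_words,         # mean word length in characters
        sum(text.count(c) for c in ",;:") / n_words,  # commas, semicolons, colons per word
    ]


def featurize(docs):
    """Concatenate TF-IDF and style features into one row per document."""
    vocab, tfidf = tfidf_vectors(docs)
    return vocab, [row + style_features(doc) for row, doc in zip(tfidf, docs)]


docs = [
    "The cat sat. The cat slept.",
    "Notwithstanding the weather, the committee convened; deliberations followed.",
]
vocab, X = featurize(docs)
```

Each row of `X` has `len(vocab) + 3` entries: the lexical TF-IDF block followed by the three style features. Keeping the two blocks in one vector lets a single downstream classifier weigh lexical and stylistic evidence together.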

Original language: English
Journal: CEUR Workshop Proceedings
Volume: 2125
Publication status: Published - 1 Jan 2018
Event: 19th Working Notes of CLEF Conference and Labs of the Evaluation Forum, CLEF 2018 - Avignon, France
Duration: 10 Sep 2018 to 14 Sep 2018

Keywords

  • Multi-authorship
  • Natural Language Processing
  • Gradient boosting machines
  • Deep Learning
  • Stacking ensemble
  • Style change
  • Stylometry

ASJC Scopus subject areas

  • Computer Science (all)

Cite this

Zlatkova, D., Kopev, D., Mitov, K., Atanasov, A., Hardalov, M., Koychev, I., & Nakov, P. (2018). An ensemble-rich multi-aspect approach for robust style change detection: Notebook for PAN at CLEF-2018. CEUR Workshop Proceedings, 2125.