Hierarchical text classification for supporting educational programs

Qi Ju, Chiara Ravagni, Alessandro Moschitti, Giampiero Vaschetto

Research output: Contribution to journal, Conference article


More than two decades have passed since the first design of the CONSTRUE system [2], a powerful rule-based model for the categorization of Reuters news. Nowadays, statistical approaches are well established and allow for the easy design of text classification (TC) systems. Additionally, the Web has emphasized the need for approaches that digest large amounts of textual information and make it more easily accessible, e.g., through hierarchical taxonomies like Dmoz or Yahoo! categories. Surprisingly, automated approaches have not yet proved indispensable for such categorization processes. This suggests that the role of TC might be different from simply routing documents to different topical categories. In this paper, we provide evidence of the promising use of TC as a support for an interesting, high-level human activity in the educational context: the selection and definition of educational programs tailored to the specific needs of pupils, who sometimes require particular attention and actions to solve their learning problems. In this context, TC is exploited to automatically extract several aspects and properties of learning objects, i.e., didactic material, in the form of semantic labels. These labels can be used to organize the different pieces of material into specific didactic programs that address specific deficiencies of pupils. The TC experiments, carried out with state-of-the-art algorithms and a small set of training data, show that automatic classifiers can easily derive labels such as didactic context, school subject, pupil difficulties, and educative solution type.

Original language: English
Pages (from-to): 18-25
Number of pages: 8
Journal: CEUR Workshop Proceedings
Publication status: Published - 1 Dec 2012
Event: 3rd Italian Information Retrieval Workshop, IIR 2012 - Bari, Italy
Duration: 26 Jan 2012 to 27 Jan 2012

Keywords

  • E-learning
  • Hierarchical text classification
  • Information management applications

ASJC Scopus subject areas

  • Computer Science (all)
