Linear online learning over structured data with distributed tree kernels

Simone Filice, Danilo Croce, Roberto Basili, Fabio Massimo Zanzotto

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

Online algorithms are an important class of learning machines because they are extremely simple and computationally efficient. Their kernelized versions can handle structured data, such as trees, and achieve state-of-the-art performance. However, kernelized online learning algorithms slow down as the number of support vectors grows. The traditional remedy is to introduce a budget that caps the number of support vectors. In this paper, we investigate Distributed Trees (DTs) as an efficient way to use structured data in online learning. DTs embed the huge feature space of tree fragments into small vectors, thus enabling the use of linear versions of kernel machines over tree-structured data. We experiment with the Passive-Aggressive (PA) algorithm, comparing its linear and kernelized versions. We employ a massive dataset of tree-structured data that originates from a natural language processing task: Boundary Detection in the context of Semantic Role Labeling over FrameNet. Results on a sample of the data show that DTs with the linear PA algorithm and Tree Kernels with the Budgeted PA algorithm achieve comparable F1-measure. Finally, using the full dataset allows the former to improve on the latter's performance on the classification task.
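The linear update that the abstract contrasts with the kernelized one can be sketched as follows. This is a minimal illustration of the standard PA-I rule for binary classification, not the authors' implementation: the abstract does not say which PA variant was used, and the function names, the aggressiveness parameter C, and the use of plain Python lists in place of Distributed Tree vectors are all assumptions made for this sketch.

```python
from typing import Iterable, List, Sequence, Tuple


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two dense vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def pa1_update(w: Sequence[float], x: Sequence[float], y: int,
               C: float = 1.0) -> List[float]:
    """One PA-I step for a label y in {-1, +1} (hypothetical helper).

    The weight vector has fixed size, so the cost of a step does not
    depend on how many examples have been seen so far.
    """
    loss = max(0.0, 1.0 - y * dot(w, x))   # hinge loss on this example
    sq_norm = dot(x, x)
    if loss == 0.0 or sq_norm == 0.0:
        return list(w)                      # passive: margin already met
    tau = min(C, loss / sq_norm)            # aggressive step, capped by C
    return [wi + tau * y * xi for wi, xi in zip(w, x)]


def train(stream: Iterable[Tuple[Sequence[float], int]], dim: int,
          C: float = 1.0) -> List[float]:
    """Single pass over a stream of (vector, label) pairs."""
    w = [0.0] * dim
    for x, y in stream:
        w = pa1_update(w, x, y, C)
    return w
```

In the kernelized setting, the model is instead a growing set of support vectors, each of which must be compared with every new example; that growth is what a budget is meant to bound.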

Original language: English
Title of host publication: Proceedings - 2013 12th International Conference on Machine Learning and Applications, ICMLA 2013
Publisher: IEEE Computer Society
Pages: 123-128
Number of pages: 6
Volume: 1
DOIs: https://doi.org/10.1109/ICMLA.2013.28
Publication status: Published - 2013
Externally published: Yes
Event: 2013 12th International Conference on Machine Learning and Applications, ICMLA 2013 - Miami, FL
Duration: 4 Dec 2013 to 7 Dec 2013

Fingerprint

Labeling
Learning algorithms
Learning systems
Semantics
Processing
Experiments

Keywords

  • Distributed Trees
  • Online Learning
  • Tree Kernels

ASJC Scopus subject areas

  • Computer Science Applications
  • Human-Computer Interaction

Cite this

Filice, S., Croce, D., Basili, R., & Zanzotto, F. M. (2013). Linear online learning over structured data with distributed tree kernels. In Proceedings - 2013 12th International Conference on Machine Learning and Applications, ICMLA 2013 (Vol. 1, pp. 123-128). [6784598] IEEE Computer Society. https://doi.org/10.1109/ICMLA.2013.28

@inproceedings{8e4cf276d297473f8b1261d744ca3eb6,
title = "Linear online learning over structured data with distributed tree kernels",
abstract = "Online algorithms are an important class of learning machines because they are extremely simple and computationally efficient. Their kernelized versions can handle structured data, such as trees, and achieve state-of-the-art performance. However, kernelized online learning algorithms slow down as the number of support vectors grows. The traditional remedy is to introduce a budget that caps the number of support vectors. In this paper, we investigate Distributed Trees (DTs) as an efficient way to use structured data in online learning. DTs embed the huge feature space of tree fragments into small vectors, thus enabling the use of linear versions of kernel machines over tree-structured data. We experiment with the Passive-Aggressive (PA) algorithm, comparing its linear and kernelized versions. We employ a massive dataset of tree-structured data that originates from a natural language processing task: Boundary Detection in the context of Semantic Role Labeling over FrameNet. Results on a sample of the data show that DTs with the linear PA algorithm and Tree Kernels with the Budgeted PA algorithm achieve comparable F1-measure. Finally, using the full dataset allows the former to improve on the latter's performance on the classification task.",
keywords = "Distributed Trees, Online Learning, Tree Kernels",
author = "Simone Filice and Danilo Croce and Roberto Basili and Zanzotto, {Fabio Massimo}",
year = "2013",
doi = "10.1109/ICMLA.2013.28",
language = "English",
volume = "1",
pages = "123--128",
booktitle = "Proceedings - 2013 12th International Conference on Machine Learning and Applications, ICMLA 2013",
publisher = "IEEE Computer Society",

}

TY - GEN

T1 - Linear online learning over structured data with distributed tree kernels

AU - Filice, Simone

AU - Croce, Danilo

AU - Basili, Roberto

AU - Zanzotto, Fabio Massimo

PY - 2013

Y1 - 2013

AB - Online algorithms are an important class of learning machines because they are extremely simple and computationally efficient. Their kernelized versions can handle structured data, such as trees, and achieve state-of-the-art performance. However, kernelized online learning algorithms slow down as the number of support vectors grows. The traditional remedy is to introduce a budget that caps the number of support vectors. In this paper, we investigate Distributed Trees (DTs) as an efficient way to use structured data in online learning. DTs embed the huge feature space of tree fragments into small vectors, thus enabling the use of linear versions of kernel machines over tree-structured data. We experiment with the Passive-Aggressive (PA) algorithm, comparing its linear and kernelized versions. We employ a massive dataset of tree-structured data that originates from a natural language processing task: Boundary Detection in the context of Semantic Role Labeling over FrameNet. Results on a sample of the data show that DTs with the linear PA algorithm and Tree Kernels with the Budgeted PA algorithm achieve comparable F1-measure. Finally, using the full dataset allows the former to improve on the latter's performance on the classification task.

KW - Distributed Trees

KW - Online Learning

KW - Tree Kernels

UR - http://www.scopus.com/inward/record.url?scp=84899466044&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84899466044&partnerID=8YFLogxK

U2 - 10.1109/ICMLA.2013.28

DO - 10.1109/ICMLA.2013.28

M3 - Conference contribution

AN - SCOPUS:84899466044

VL - 1

SP - 123

EP - 128

BT - Proceedings - 2013 12th International Conference on Machine Learning and Applications, ICMLA 2013

PB - IEEE Computer Society

ER -