Tikhonov or lasso regularization: Which is better and when

Fei Wang, Sanjay Chawla, Wei Liu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

It is well known that supervised learning problems with ℓ1 (Lasso) and ℓ2 (Tikhonov or Ridge) regularizers will result in very different solutions. For example, the ℓ1 solution vector will be sparser and can potentially be used both for prediction and feature selection. However, given a data set it is often hard to determine which form of regularization is more applicable in a given context. In this paper we use mathematical properties of the two regularization methods followed by detailed experimentation to understand their impact based on four characteristics: non-stationarity of the data generating process, level of noise in the data sensing mechanism, degree of correlation between dependent and independent variables, and the shape of the data set. The practical outcome of our research is that it can serve as a guide for practitioners of large scale data mining and machine learning tools in their day-to-day practice.
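
To make the sparsity contrast in the abstract concrete, here is a minimal sketch (not taken from the paper) that fits both regularizers to the same synthetic data with scikit-learn; the problem size, penalty strengths, and sparse ground truth are illustrative assumptions, not the authors' experimental settings.

    # Illustrative sketch: compare the solution vectors produced by
    # l1 (Lasso) and l2 (Tikhonov/Ridge) regularization on one data set.
    import numpy as np
    from sklearn.linear_model import Lasso, Ridge

    rng = np.random.default_rng(0)

    # Synthetic regression task: 20 features, only 3 of which carry signal.
    n, p = 200, 20
    X = rng.standard_normal((n, p))
    true_w = np.zeros(p)
    true_w[:3] = [3.0, -2.0, 1.5]
    y = X @ true_w + 0.5 * rng.standard_normal(n)

    lasso = Lasso(alpha=0.1).fit(X, y)  # l1-penalized least squares
    ridge = Ridge(alpha=1.0).fit(X, y)  # l2-penalized least squares

    # The l1 fit sets most irrelevant coefficients exactly to zero (hence its
    # use for feature selection); the l2 fit shrinks them but keeps all nonzero.
    print("nonzero Lasso coefficients:", int(np.sum(lasso.coef_ != 0)))
    print("nonzero Ridge coefficients:", int(np.sum(ridge.coef_ != 0)))

On data like this, the ℓ1 fit typically retains only the few informative coefficients while the ℓ2 fit keeps every coefficient nonzero but shrunken, which is the trade-off the paper's four data characteristics are meant to arbitrate.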

Original language: English
Title of host publication: Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI
Pages: 795-802
Number of pages: 8
DOIs: 10.1109/ICTAI.2013.122
Publication status: Published - 2013
Externally published: Yes
Event: 25th IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2013 - Washington, DC
Duration: 4 Nov 2013 - 6 Nov 2013

Other

Other: 25th IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2013
City: Washington, DC
Period: 4/11/13 - 6/11/13

Fingerprint

  • Supervised learning
  • Data mining
  • Learning systems
  • Feature extraction

Keywords

  • Classification
  • Lasso
  • Regularization

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence
  • Computer Science Applications

Cite this

Wang, F., Chawla, S., & Liu, W. (2013). Tikhonov or lasso regularization: Which is better and when. In Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI (pp. 795-802). [6735333] https://doi.org/10.1109/ICTAI.2013.122

@inproceedings{2c4e2b9c067a48c7adc5b568816d3bea,
title = "Tikhonov or lasso regularization: Which is better and when",
abstract = "It is well known that supervised learning problems with ℓ1 (Lasso) and ℓ2 (Tikhonov or Ridge) regularizers will result in very different solutions. Forexample, the ℓ1 solution vector will be sparser and can potentially beused both for prediction and feature selection. However, given a data set it isoften hard to determine which form of regularizationis more applicable in a given context. In this paper we use mathematical propertiesof the two regularization methods followed by detailed experimentation to understand their impact basedon four characteristics: non-stationarity of the data generating process, level of noise in the data sensingmechanism, degree of correlation between dependent and independent variables and the shape of the data set. The practical outcome of our research is that it can serve as a guide forpractitioners of large scale data mining and machine learning tools in their day-to-day practice.",
keywords = "Classification, Lasso, Regularization",
author = "Fei Wang and Sanjay Chawla and Wei Liu",
year = "2013",
doi = "10.1109/ICTAI.2013.122",
language = "English",
isbn = "9781479929719",
pages = "795--802",
booktitle = "Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI",

}
