Online feature selection algorithm with Bayesian ℓ1 regularization

Yunpeng Cai, Yijun Sun, Jian Li, Steven Goodison

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)

Abstract

We propose a novel online-learning-based feature selection algorithm for supervised learning in the presence of a huge number of irrelevant features. The key idea of the algorithm is to decompose a nonlinear problem into a set of locally linear ones through local learning, and then estimate the relevance of features globally in a large-margin framework with ℓ1 regularization. Unlike in batch learning, the regularization parameter in online learning has to be tuned on the fly as the amount of training data increases. We address this issue within the Bayesian learning paradigm and provide an analytic solution for automatic estimation of the regularization parameter via variational methods. Numerical experiments on a variety of benchmark data sets demonstrate the effectiveness of the proposed feature selection algorithm.
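To make the ideas in the abstract concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a Relief-style local step (nearest same-class "hit" and nearest other-class "miss" under a weighted L1 distance), a logistic loss on the resulting margin, and a projected gradient update that keeps the feature weights non-negative. The regularization parameter `lam` is held fixed here for simplicity, whereas the paper estimates it automatically within a Bayesian framework. All function names, parameters, and the synthetic data are hypothetical.

```python
import math
import random


def weighted_l1(w, a, b):
    """Weighted L1 distance between samples a and b."""
    return sum(wi * abs(ai - bi) for wi, ai, bi in zip(w, a, b))


def online_l1_feature_weights(stream, dim, lr=0.05, lam=0.1):
    """Illustrative online feature weighting with a fixed l1 penalty.

    Assumption: lam is a hand-picked constant, not a Bayesian estimate.
    """
    w = [1.0] * dim   # non-negative feature weights
    seen = []         # samples observed so far (kept in memory for simplicity)
    for x, y in stream:
        hits = [s for s, t in seen if t == y]
        misses = [s for s, t in seen if t != y]
        if hits and misses:
            # Local learning step: nearest hit and nearest miss.
            hit = min(hits, key=lambda s: weighted_l1(w, x, s))
            miss = min(misses, key=lambda s: weighted_l1(w, x, s))
            # The margin is linear in w once the neighbours are fixed.
            z = [abs(xi - mi) - abs(xi - hi) for xi, mi, hi in zip(x, miss, hit)]
            margin = sum(wi * zi for wi, zi in zip(w, z))
            # Negative derivative of log(1 + exp(-margin)) w.r.t. the margin.
            slope = 1.0 / (1.0 + math.exp(min(margin, 50.0)))
            # Gradient step on loss + lam * sum(w), projected onto w >= 0.
            w = [max(0.0, wi + lr * (slope * zi - lam)) for wi, zi in zip(w, z)]
        seen.append((x, y))
    return w


def make_stream(n, dim, seed=0):
    """Synthetic stream: feature 0 carries the label, the rest are noise."""
    rng = random.Random(seed)
    for _ in range(n):
        y = rng.randrange(2)
        x = [3.0 * y + rng.gauss(0.0, 0.3)] + [rng.random() for _ in range(dim - 1)]
        yield x, y


weights = online_l1_feature_weights(make_stream(400, 6), dim=6)
print([round(v, 3) for v in weights])
```

On this toy stream the weight of the informative feature should stay well above the weights of the noise features, which the ℓ1 penalty drives toward zero. How quickly that happens depends strongly on `lam`, which is the tuning problem the paper addresses.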

Original language: English (US)
Title of host publication: 13th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2009
Pages: 401-413
Number of pages: 13
DOIs: https://doi.org/10.1007/978-3-642-01307-2_37
State: Published - Jul 23 2009
Externally published: Yes
Event: 13th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2009 - Bangkok, Thailand
Duration: Apr 27 2009 → Apr 30 2009

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5476 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 13th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2009
Country: Thailand
City: Bangkok
Period: 4/27/09 → 4/30/09

Fingerprint

Feature Selection
Feature extraction
Regularization
Online Learning
Regularization Parameter
Bayesian Learning
Supervised Learning
Analytic Solution
Variational Methods
Margin
Batch
Nonlinear Problem
Paradigm
Numerical Experiment
Benchmark
Decompose
Estimate
Demonstrate
Experiments

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Cai, Y., Sun, Y., Li, J., & Goodison, S. (2009). Online feature selection algorithm with Bayesian ℓ1 regularization. In 13th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2009 (pp. 401-413). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 5476 LNAI). https://doi.org/10.1007/978-3-642-01307-2_37

@inproceedings{2cb6bf57b14142c88211ea9d00961466,
title = "Online feature selection algorithm with Bayesian ℓ1 regularization",
abstract = "We propose a novel online-learning-based feature selection algorithm for supervised learning in the presence of a huge number of irrelevant features. The key idea of the algorithm is to decompose a nonlinear problem into a set of locally linear ones through local learning, and then estimate the relevance of features globally in a large-margin framework with ℓ1 regularization. Unlike in batch learning, the regularization parameter in online learning has to be tuned on the fly as the amount of training data increases. We address this issue within the Bayesian learning paradigm and provide an analytic solution for automatic estimation of the regularization parameter via variational methods. Numerical experiments on a variety of benchmark data sets demonstrate the effectiveness of the proposed feature selection algorithm.",
author = "Yunpeng Cai and Yijun Sun and Jian Li and Steven Goodison",
year = "2009",
month = "7",
day = "23",
doi = "10.1007/978-3-642-01307-2_37",
language = "English (US)",
isbn = "3642013066",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "401--413",
booktitle = "13th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2009",
}

TY - GEN
T1 - Online feature selection algorithm with Bayesian ℓ1 regularization
AU - Cai, Yunpeng
AU - Sun, Yijun
AU - Li, Jian
AU - Goodison, Steven
PY - 2009/7/23
Y1 - 2009/7/23
AB - We propose a novel online-learning-based feature selection algorithm for supervised learning in the presence of a huge number of irrelevant features. The key idea of the algorithm is to decompose a nonlinear problem into a set of locally linear ones through local learning, and then estimate the relevance of features globally in a large-margin framework with ℓ1 regularization. Unlike in batch learning, the regularization parameter in online learning has to be tuned on the fly as the amount of training data increases. We address this issue within the Bayesian learning paradigm and provide an analytic solution for automatic estimation of the regularization parameter via variational methods. Numerical experiments on a variety of benchmark data sets demonstrate the effectiveness of the proposed feature selection algorithm.
UR - http://www.scopus.com/inward/record.url?scp=67650661623&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=67650661623&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-01307-2_37
DO - 10.1007/978-3-642-01307-2_37
M3 - Conference contribution
AN - SCOPUS:67650661623
SN - 3642013066
SN - 9783642013065
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 401
EP - 413
BT - 13th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2009
ER -