Learning classifiers from distributed, ontology-extended data sources

Doina Caragea, Jun Zhang, Jyotishman Pathak, Vasant Honavar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

There is an urgent need for sound approaches to integrative and collaborative analysis of large, autonomous (and hence, inevitably semantically heterogeneous) data sources in several increasingly data-rich application domains. In this paper, we precisely formulate and solve the problem of learning classifiers from such data sources, in a setting where each data source has a hierarchical ontology associated with it and semantic correspondences between data source ontologies and a user ontology are supplied. The proposed approach yields algorithms for learning a broad class of classifiers (including Bayesian networks, decision trees, etc.) from semantically heterogeneous distributed data with strong performance guarantees relative to their centralized counterparts. We illustrate the application of the proposed approach in the case of learning Naive Bayes classifiers from distributed, ontology-extended data sources.
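The abstract does not spell out the algorithm, so the following is only a minimal sketch of one way the idea could work for Naive Bayes, under stated assumptions: each source counts (attribute value, class) pairs locally, translates its own terms into user-ontology terms through a supplied mapping, and only the counts are combined. The source names, vocabularies, mappings, and toy rows are all hypothetical, the mappings are flat one-to-one dictionaries, and the hierarchical structure of the ontologies is not modeled. Under these assumptions the merged counts equal the counts obtained by pooling all translated rows first, which is the simplest form a guarantee relative to the centralized case can take.

```python
from collections import Counter
from math import log

# Hypothetical mappings from each source's local terms to user-ontology terms.
MAPPINGS = {
    "source_a": {"rainy": "wet", "sunny": "dry", "yes": "play", "no": "stay"},
    "source_b": {"drizzle": "wet", "clear": "dry", "Y": "play", "N": "stay"},
}

# Hypothetical local rows: (attribute value, class label) in each source's own vocabulary.
DATA = {
    "source_a": [("rainy", "no"), ("sunny", "yes"), ("sunny", "yes")],
    "source_b": [("drizzle", "N"), ("clear", "Y"), ("drizzle", "Y")],
}


def local_counts(source):
    """Joint counts computed at one source, expressed in user-ontology terms."""
    mapping = MAPPINGS[source]
    return Counter((mapping[v], mapping[c]) for v, c in DATA[source])


def merge_counts(sources):
    """Combine per-source counts; raw rows never leave their source."""
    total = Counter()
    for source in sources:
        total += local_counts(source)
    return total


def centralized_counts(sources):
    """Reference computation: translate and pool every row, then count."""
    rows = [
        (MAPPINGS[s][v], MAPPINGS[s][c])
        for s in sources
        for v, c in DATA[s]
    ]
    return Counter(rows)


def predict(joint, value, alpha=1.0):
    """Single-attribute Naive Bayes prediction with Laplace smoothing (alpha)."""
    classes = sorted({c for _, c in joint})
    values = sorted({v for v, _ in joint})
    n = sum(joint.values())
    best_class, best_score = None, float("-inf")
    for c in classes:
        n_c = sum(cnt for (_, cc), cnt in joint.items() if cc == c)
        prior = log(n_c / n)
        likelihood = log((joint[(value, c)] + alpha) / (n_c + alpha * len(values)))
        if prior + likelihood > best_score:
            best_class, best_score = c, prior + likelihood
    return best_class


merged = merge_counts(["source_a", "source_b"])
pooled = centralized_counts(["source_a", "source_b"])
```

In this toy setting `merged == pooled` holds because counting commutes with the term translation; richer settings, such as sources described at different levels of an ontology hierarchy, would need additional assumptions that this sketch does not attempt to capture.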

Original language: English (US)
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 363-373
Number of pages: 11
Volume: 4081 LNCS
State: Published - 2006
Externally published: Yes
Event: 8th International Conference on Data Warehousing and Knowledge Discovery, DaWaK 2006 - Krakow, Poland
Duration: Sep 4 2006 to Sep 8 2006

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4081 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 8th International Conference on Data Warehousing and Knowledge Discovery, DaWaK 2006
Country: Poland
City: Krakow
Period: 9/4/06 to 9/8/06

Fingerprint

Information Storage and Retrieval
Ontology
Classifiers
Learning
Bayesian networks
Decision trees
Semantics
Naive Bayes Classifier
Acoustic waves
Performance Guarantee
Correspondence

ASJC Scopus subject areas

  • Computer Science (all)
  • Biochemistry, Genetics and Molecular Biology (all)
  • Theoretical Computer Science

Cite this

Caragea, D., Zhang, J., Pathak, J., & Honavar, V. (2006). Learning classifiers from distributed, ontology-extended data sources. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4081 LNCS, pp. 363-373). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 4081 LNCS).

Caragea, D, Zhang, J, Pathak, J & Honavar, V 2006, Learning classifiers from distributed, ontology-extended data sources. in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). vol. 4081 LNCS, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 4081 LNCS, pp. 363-373, 8th International Conference on Data Warehousing and Knowledge Discovery, DaWaK 2006, Krakow, Poland, 9/4/06.
Caragea D, Zhang J, Pathak J, Honavar V. Learning classifiers from distributed, ontology-extended data sources. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 4081 LNCS. 2006. p. 363-373. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
Caragea, Doina ; Zhang, Jun ; Pathak, Jyotishman ; Honavar, Vasant. / Learning classifiers from distributed, ontology-extended data sources. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 4081 LNCS 2006. pp. 363-373 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{8a88a76f838e4c858ecdaae189feb74d,
title = "Learning classifiers from distributed, ontology-extended data sources",
abstract = "There is an urgent need for sound approaches to integrative and collaborative analysis of large, autonomous (and hence, inevitably semantically heterogeneous) data sources in several increasingly data-rich application domains. In this paper, we precisely formulate and solve the problem of learning classifiers from such data sources, in a setting where each data source has a hierarchical ontology associated with it and semantic correspondences between data source ontologies and a user ontology are supplied. The proposed approach yields algorithms for learning a broad class of classifiers (including Bayesian networks, decision trees, etc.) from semantically heterogeneous distributed data with strong performance guarantees relative to their centralized counterparts. We illustrate the application of the proposed approach in the case of learning Naive Bayes classifiers from distributed, ontology-extended data sources.",
author = "Doina Caragea and Jun Zhang and Jyotishman Pathak and Vasant Honavar",
year = "2006",
language = "English (US)",
isbn = "3540377360",
volume = "4081 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "363--373",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
}

TY  - GEN
T1  - Learning classifiers from distributed, ontology-extended data sources
AU  - Caragea, Doina
AU  - Zhang, Jun
AU  - Pathak, Jyotishman
AU  - Honavar, Vasant
PY  - 2006
Y1  - 2006
AB  - There is an urgent need for sound approaches to integrative and collaborative analysis of large, autonomous (and hence, inevitably semantically heterogeneous) data sources in several increasingly data-rich application domains. In this paper, we precisely formulate and solve the problem of learning classifiers from such data sources, in a setting where each data source has a hierarchical ontology associated with it and semantic correspondences between data source ontologies and a user ontology are supplied. The proposed approach yields algorithms for learning a broad class of classifiers (including Bayesian networks, decision trees, etc.) from semantically heterogeneous distributed data with strong performance guarantees relative to their centralized counterparts. We illustrate the application of the proposed approach in the case of learning Naive Bayes classifiers from distributed, ontology-extended data sources.
UR  - http://www.scopus.com/inward/record.url?scp=33751377738&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=33751377738&partnerID=8YFLogxK
M3  - Conference contribution
AN  - SCOPUS:33751377738
SN  - 3540377360
SN  - 9783540377368
VL  - 4081 LNCS
T3  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP  - 363
EP  - 373
BT  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ER  - 