Learning classifiers from distributed, ontology-extended data sources

Doina Caragea; Jun Zhang; Jyotishman Pathak; Vasant Honavar

doi:10.1007/11823728_35

Quantitative Health Sciences

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citation

Abstract

There is an urgent need for sound approaches to integrative and collaborative analysis of large, autonomous (and hence, inevitably semantically heterogeneous) data sources in several increasingly data-rich application domains. In this paper, we precisely formulate and solve the problem of learning classifiers from such data sources, in a setting where each data source has a hierarchical ontology associated with it and semantic correspondences between data source ontologies and a user ontology are supplied. The proposed approach yields algorithms for learning a broad class of classifiers (including Bayesian networks, decision trees, etc.) from semantically heterogeneous distributed data with strong performance guarantees relative to their centralized counterparts. We illustrate the application of the proposed approach in the case of learning Naive Bayes classifiers from distributed, ontology-extended data sources.

Original language	English (US)
Title of host publication	Data Warehousing and Knowledge Discovery - 8th International Conference, DaWaK 2006, Proceedings
Publisher	Springer Verlag
Pages	363-373
Number of pages	11
ISBN (Print)	3540377360, 9783540377368
DOIs	https://doi.org/10.1007/11823728_35
State	Published - 2006
Event	8th International Conference on Data Warehousing and Knowledge Discovery, DaWaK 2006 - Krakow, Poland
Duration	Sep 4 2006 → Sep 8 2006

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	4081 LNCS
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Other

Other	8th International Conference on Data Warehousing and Knowledge Discovery, DaWaK 2006
Country/Territory	Poland
City	Krakow
Period	9/4/06 → 9/8/06

ASJC Scopus subject areas

Theoretical Computer Science
General Computer Science

Cite this

Caragea, D., Zhang, J., Pathak, J., & Honavar, V. (2006). Learning classifiers from distributed, ontology-extended data sources. In Data Warehousing and Knowledge Discovery - 8th International Conference, DaWaK 2006, Proceedings (pp. 363-373). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 4081 LNCS). Springer Verlag. https://doi.org/10.1007/11823728_35

Learning classifiers from distributed, ontology-extended data sources. / Caragea, Doina; Zhang, Jun; Pathak, Jyotishman et al.
Data Warehousing and Knowledge Discovery - 8th International Conference, DaWaK 2006, Proceedings. Springer Verlag, 2006. p. 363-373 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 4081 LNCS).

Caragea, D, Zhang, J, Pathak, J & Honavar, V 2006, Learning classifiers from distributed, ontology-extended data sources. in Data Warehousing and Knowledge Discovery - 8th International Conference, DaWaK 2006, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 4081 LNCS, Springer Verlag, pp. 363-373, 8th International Conference on Data Warehousing and Knowledge Discovery, DaWaK 2006, Krakow, Poland, 9/4/06. https://doi.org/10.1007/11823728_35

Caragea D, Zhang J, Pathak J, Honavar V. Learning classifiers from distributed, ontology-extended data sources. In Data Warehousing and Knowledge Discovery - 8th International Conference, DaWaK 2006, Proceedings. Springer Verlag. 2006. p. 363-373. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/11823728_35

Caragea, Doina ; Zhang, Jun ; Pathak, Jyotishman et al. / Learning classifiers from distributed, ontology-extended data sources. Data Warehousing and Knowledge Discovery - 8th International Conference, DaWaK 2006, Proceedings. Springer Verlag, 2006. pp. 363-373 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

@inproceedings{8a88a76f838e4c858ecdaae189feb74d,

title = "Learning classifiers from distributed, ontology-extended data sources",

abstract = "There is an urgent need for sound approaches to integrative and collaborative analysis of large, autonomous (and hence, inevitably semantically heterogeneous) data sources in several increasingly data-rich application domains. In this paper, we precisely formulate and solve the problem of learning classifiers from such data sources, in a setting where each data source has a hierarchical ontology associated with it and semantic correspondences between data source ontologies and a user ontology are supplied. The proposed approach yields algorithms for learning a broad class of classifiers (including Bayesian networks, decision trees, etc.) from semantically heterogeneous distributed data with strong performance guarantees relative to their centralized counterparts. We illustrate the application of the proposed approach in the case of learning Naive Bayes classifiers from distributed, ontology-extended data sources.",

author = "Doina Caragea and Jun Zhang and Jyotishman Pathak and Vasant Honavar",

year = "2006",

doi = "10.1007/11823728_35",

language = "English (US)",

isbn = "3540377360",

series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

publisher = "Springer Verlag",

pages = "363--373",

booktitle = "Data Warehousing and Knowledge Discovery - 8th International Conference, DaWaK 2006, Proceedings",

note = "8th International Conference on Data Warehousing and Knowledge Discovery, DaWaK 2006 ; Conference date: 04-09-2006 Through 08-09-2006",

}

TY - GEN

T1 - Learning classifiers from distributed, ontology-extended data sources

AU - Caragea, Doina

AU - Zhang, Jun

AU - Pathak, Jyotishman

AU - Honavar, Vasant

PY - 2006

Y1 - 2006

N2 - There is an urgent need for sound approaches to integrative and collaborative analysis of large, autonomous (and hence, inevitably semantically heterogeneous) data sources in several increasingly data-rich application domains. In this paper, we precisely formulate and solve the problem of learning classifiers from such data sources, in a setting where each data source has a hierarchical ontology associated with it and semantic correspondences between data source ontologies and a user ontology are supplied. The proposed approach yields algorithms for learning a broad class of classifiers (including Bayesian networks, decision trees, etc.) from semantically heterogeneous distributed data with strong performance guarantees relative to their centralized counterparts. We illustrate the application of the proposed approach in the case of learning Naive Bayes classifiers from distributed, ontology-extended data sources.

UR - http://www.scopus.com/inward/record.url?scp=33751377738&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33751377738&partnerID=8YFLogxK

U2 - 10.1007/11823728_35

DO - 10.1007/11823728_35

M3 - Conference contribution

AN - SCOPUS:33751377738

SN - 3540377360

SN - 9783540377368

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 363

EP - 373

BT - Data Warehousing and Knowledge Discovery - 8th International Conference, DaWaK 2006, Proceedings

PB - Springer Verlag

T2 - 8th International Conference on Data Warehousing and Knowledge Discovery, DaWaK 2006

Y2 - 4 September 2006 through 8 September 2006

ER -
