G-Diff: A grouping algorithm for RDF change detection on MapReduce

Jinhyun Ahn, Dong Hyuk Im, Jae Hong Eom, Nansu Zong, Hong Gee Kim

Abstract

Linked Data is a collection of RDF data that can grow exponentially and change over time. Detecting changes in RDF data is important for supporting version management in applications that consume Linked Data. Traditional change detection approaches do not scale, which has led researchers to devise algorithms on the MapReduce framework. Most of these works simply take a URI as the Map key. We observed that this is inefficient for RDF data with a large number of distinct URIs, since many Reduce tasks have to be created. Even though the Reduce tasks are scheduled to run simultaneously, too many small Reduce tasks increase the overall running time. In this paper, we propose G-Diff, an efficient MapReduce algorithm for RDF change detection. G-Diff groups triples by URI during the Map phase and sends each group of triples to a particular Reduce task rather than to multiple Reduce tasks. Experiments on real datasets showed that the proposed approach takes less running time than previous works.
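
The grouping idea in the abstract can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions, not the paper's implementation: the abstract does not say how URIs are assigned to groups, so the hash-based `group_key`, the constant `NUM_GROUPS`, the use of the subject URI as the grouping URI, and the set-difference comparison in `reduce_phase` are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from zlib import crc32

NUM_GROUPS = 4  # assumption: a small, fixed number of groups for illustration


def group_key(subject_uri, num_groups=NUM_GROUPS):
    """Map a subject URI to a group id using a stable hash (assumed scheme)."""
    return crc32(subject_uri.encode("utf-8")) % num_groups


def map_phase(triples, version):
    """Emit (group id, (version tag, triple)) rather than (URI, ...)."""
    for triple in triples:
        subject = triple[0]
        yield group_key(subject), (version, triple)


def shuffle(pairs):
    """Collect mapped values per key, as the framework would before Reduce."""
    buckets = defaultdict(list)
    for key, value in pairs:
        buckets[key].append(value)
    return buckets


def reduce_phase(values):
    """Compare the old and new triples that landed in one group."""
    old = {triple for version, triple in values if version == "old"}
    new = {triple for version, triple in values if version == "new"}
    return new - old, old - new  # (added, deleted)


def detect_changes(old_triples, new_triples):
    """Run the simulated Map, shuffle, and Reduce steps over two versions."""
    pairs = list(map_phase(old_triples, "old")) + list(map_phase(new_triples, "new"))
    added, deleted = set(), set()
    for values in shuffle(pairs).values():
        group_added, group_deleted = reduce_phase(values)
        added |= group_added
        deleted |= group_deleted
    return added, deleted
```

Calling `detect_changes(old, new)` on two lists of `(subject, predicate, object)` tuples returns the sets of added and deleted triples. Because the key is a group id rather than a URI, the number of simulated Reduce calls is bounded by `NUM_GROUPS` regardless of how many distinct URIs the data contains, which is the effect the abstract attributes to grouping.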

Original language: English (US)
Title of host publication: Semantic Technology - 4th Joint International Conference, JIST 2014, Revised Selected Papers
Editors: Jeff Z. Pan, Thepchai Supnithi, Marut Buranarach, Takahira Yamaguchi, Vilas Wuwongse
Publisher: Springer Verlag
Pages: 230-235
Number of pages: 6
ISBN (Electronic): 9783319156149
State: Published - 2015
Event: 4th Joint International Conference on Semantic Technology, JIST 2014 - Chiang Mai, Thailand
Duration: Nov 9 2014 to Nov 11 2014

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8943
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 4th Joint International Conference on Semantic Technology, JIST 2014
Country/Territory: Thailand
City: Chiang Mai
Period: 11/9/14 to 11/11/14

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
