CedTMart Distributed RDF Store and Query Manager
Keywords
Big Data, Linked Data, Blinked Data, RDF, Triplestore, SPARQL,
Hadoop/MapReduce, Distributed Information Processing, Concurrent
Architecture Design, Implementation, and Optimization
Synopsis
As part of the CEDAR project, we have developed
a prototype triplestore management system called CedTMart which
guarantees high performance in processing complex queries on RDF-encoded
information. This triplestore management system introduces several
algorithms that adapt the Hadoop/MapReduce framework to ensure
scalability. CedTMart processes "Blinked Data" (a new term we propose
to denote both "Big" and "Linked" data). Working with Blinked Data poses
serious challenges regarding processing, storing, and querying huge and
linked data sets. In addition, complex queries (specifically,
SPARQL queries with many joins and variables) can
significantly increase the difficulty. Existing solutions
can handle Blinked Data. However, the experiments performed in the
CEDAR project with several triplestores reveal that the current
technologies are not sufficiently efficient. This remains an issue that
must be addressed. CedTMart aims to remedy this shortcoming of the
state of the art. More specifically, it is meant to overcome the
challenges of storing and retrieving RDF data ranging in size from
gigabytes to petabytes.
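To illustrate why joins dominate the cost of such queries, here is a minimal sketch of how a single SPARQL join could be expressed in MapReduce style. It is not CedTMart's algorithm, which is described in the technical reports; it runs in memory rather than on Hadoop, and the data, predicate names, and function names are all hypothetical. The query it mimics is `SELECT ?person ?org ?city WHERE { ?person :worksFor ?org . ?org :locatedIn ?city }`, joined on the shared variable `?org`.

```python
from collections import defaultdict

# Toy RDF triples (subject, predicate, object). All values are made up
# for illustration and do not come from the CEDAR project.
TRIPLES = [
    ("alice", "worksFor", "acme"),
    ("bob", "worksFor", "acme"),
    ("carol", "worksFor", "initech"),
    ("acme", "locatedIn", "paris"),
    ("initech", "locatedIn", "austin"),
]


def map_phase(triples):
    """Emit (join_key, tagged_value) pairs, keyed on the shared variable ?org."""
    for s, p, o in triples:
        if p == "worksFor":        # matches ?person worksFor ?org
            yield o, ("person", s)
        elif p == "locatedIn":     # matches ?org locatedIn ?city
            yield s, ("city", o)


def shuffle(pairs):
    """Group mapper output by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """For each ?org, combine every matching ?person with every ?city."""
    for org, values in groups.items():
        people = [v for tag, v in values if tag == "person"]
        cities = [v for tag, v in values if tag == "city"]
        for person in people:
            for city in cities:
                yield person, org, city


def run_join(triples):
    """Run map, shuffle, and reduce in sequence and return sorted bindings."""
    return sorted(reduce_phase(shuffle(map_phase(triples))))
```

Calling `run_join(TRIPLES)` returns one `(person, org, city)` binding per solution. In this naive scheme each additional join variable requires another shuffle of intermediate results, which is one reason queries with many joins become expensive at scale.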
Software
- CedTMart Preprocessor
- CedTMart Distributed RDF Store and Query Manager
Presentations/Videos/Demos
Technical References
- CEDAR Technical Report Number 7
Title: CedTMart—A Triplestore for Storing and Querying Blinked Data
Authors: Minwei Chen, Rafiqul Haque, and Mohand-Saïd Hacid
Date: July 2014
PDF:
- CEDAR Technical Report Number 15
Title: Development of the CedTMart Query Processor
Authors: Minwei Chen, Rafiqul Haque, and Mohand-Saïd Hacid
Date: November 2014
PDF: