FU Hai-dong, PENG Shen, HUANG Li, et al. HDVM: Compression & Query Model of Linked-Data Based on Relational Matrix[J]. Acta Electronica Sinica, 2018, 46(3): 721-729. DOI: 10.3969/j.issn.0372-2112.2018.03.030.
HDVM: Compression & Query Model of Linked-Data Based on Relational Matrix
A large amount of RDF (Resource Description Framework) data is flooding the Web of Data. Because the indexes of these datasets cannot be fully loaded into main memory when RDF engines manage such huge datasets, these systems must perform slow disk accesses to answer SPARQL queries. This paper proposes a method named HDVM (Header Dictionary Vector Matrix), which reduces the repetition in linked data by extracting the latent triple relation matrix from the linked dataset and storing it in the form of a subject vector, a predicate vector, and an object matrix, so that SPARQL queries can be executed entirely in memory without decompression. The experimental results show that the HDVM model improves the compression rate by 3%~20% compared with HDT (Header-Dictionary Triples), and that the query time on a billion-scale dataset averages 400 milliseconds.
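
The idea of a subject vector, a predicate vector, and an object matrix can be illustrated with a minimal sketch. It is not the HDVM implementation: the abstract does not specify the actual layout, so the dictionary encoding, the sparse cell representation, and the names `build_model` and `query_objects` are all assumptions made for illustration. Each term is stored once in a vector, and a cell indexed by (subject ID, predicate ID) holds the IDs of the matching objects, so a simple pattern such as `(s, p, ?o)` can be answered by direct lookup on the encoded form.

```python
from collections import defaultdict


def build_model(triples):
    """Encode (s, p, o) string triples as term vectors plus an object matrix.

    Hypothetical layout for illustration only; not the paper's format.
    """
    # Each distinct term is stored once; its position is its integer ID.
    subjects = sorted({s for s, _, _ in triples})
    predicates = sorted({p for _, p, _ in triples})
    objects = sorted({o for _, _, o in triples})

    s_id = {term: i for i, term in enumerate(subjects)}
    p_id = {term: i for i, term in enumerate(predicates)}
    o_id = {term: i for i, term in enumerate(objects)}

    # Sparse object matrix: key (subject ID, predicate ID) -> sorted object IDs.
    cells = defaultdict(set)
    for s, p, o in triples:
        cells[(s_id[s], p_id[p])].add(o_id[o])
    matrix = {key: sorted(ids) for key, ids in cells.items()}

    return {
        "subjects": subjects,
        "predicates": predicates,
        "objects": objects,
        "s_id": s_id,
        "p_id": p_id,
        "matrix": matrix,
    }


def query_objects(model, subject, predicate):
    """Answer the pattern (subject, predicate, ?o) on the encoded form."""
    si = model["s_id"].get(subject)
    pi = model["p_id"].get(predicate)
    if si is None or pi is None:
        return []  # Unknown term: no triple can match.
    return [model["objects"][oi] for oi in model["matrix"].get((si, pi), [])]


triples = [
    ("alice", "knows", "bob"),
    ("alice", "knows", "carol"),
    ("bob", "knows", "carol"),
    ("alice", "age", "30"),
]
model = build_model(triples)
result = query_objects(model, "alice", "knows")
```

In this sketch the repeated subject `alice` and predicate `knows` each appear once in their vectors, which is the kind of repetition the abstract describes removing; a real system would use compact integer arrays or bitmaps rather than Python dictionaries.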