
HDVM: Compression & Query Model of Linked-Data Based on Relational Matrix
FU Hai-dong, PENG Shen, HUANG Li, GU Jin-guang
ACTA ELECTRONICA SINICA, 2018, Vol. 46, Issue (3): 721-729.
With the arrival of the big data era, large volumes of RDF (Resource Description Framework) data are flooding the Web of Data. When RDF engines manage these huge datasets, the indexes cannot be fully loaded into main memory, so the systems must perform slow disk accesses to answer SPARQL queries. This paper proposes a method named HDVM (Header Dictionary Vector Matrix) that reduces repetition in linked data by extracting the latent triple relation matrix from the linked dataset and storing it as a subject vector, a predicate vector, and an object matrix. This representation allows SPARQL queries to be executed entirely in memory without decompression. The experimental results show that, compared with HDT (Header-Dictionary Triples), the HDVM model improves the compression rate by 3% to 20%, and that the query time on a billion-scale dataset averages 400 milliseconds.
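The storage idea described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names `build_model` and `match`, the sorted-dictionary ID assignment, the sparse row layout, and the sample triples are all assumptions made for illustration. The sketch only shows how storing each subject and predicate once, with objects grouped in a subject-by-predicate matrix, lets a triple pattern be answered directly on the compact form.

```python
def build_model(triples):
    """Encode triples as a subject vector, a predicate vector and an object matrix.

    Hypothetical layout: row i of the matrix belongs to subjects[i], and maps a
    predicate ID to the set of object IDs linked to that subject by that predicate.
    """
    subjects = sorted({s for s, _, _ in triples})
    predicates = sorted({p for _, p, _ in triples})
    objects = sorted({o for _, _, o in triples})
    s_id = {term: i for i, term in enumerate(subjects)}
    p_id = {term: i for i, term in enumerate(predicates)}
    o_id = {term: i for i, term in enumerate(objects)}

    # Sparse matrix: one dict per subject, so empty cells cost nothing.
    matrix = [dict() for _ in subjects]
    for s, p, o in triples:
        matrix[s_id[s]].setdefault(p_id[p], set()).add(o_id[o])
    return {"subjects": subjects, "predicates": predicates,
            "objects": objects, "matrix": matrix}


def match(model, s=None, p=None, o=None):
    """Answer one triple pattern; None acts as a variable (wildcard)."""
    subjects = model["subjects"]
    predicates = model["predicates"]
    objects = model["objects"]
    results = []
    for si, row in enumerate(model["matrix"]):
        if s is not None and subjects[si] != s:
            continue
        for pi in sorted(row):
            if p is not None and predicates[pi] != p:
                continue
            for oi in sorted(row[pi]):
                if o is None or objects[oi] == o:
                    results.append((subjects[si], predicates[pi], objects[oi]))
    return results


# Made-up sample data, for illustration only.
triples = [
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:knows", "ex:carol"),
    ("ex:alice", "foaf:name", '"Alice"'),
    ("ex:bob", "foaf:knows", "ex:carol"),
]
model = build_model(triples)
knows_carol = match(model, p="foaf:knows", o="ex:carol")
```

In this sketch `ex:alice` and `foaf:knows` are each stored once even though they occur in several triples, and `match` scans the matrix without rebuilding the original triple list. A real system would add indexes and bit-level encodings that this sketch leaves out.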
relation matrix / linked-data / query / compression