北京大学计算机科学技术系,北京,100871
纸质出版:2004
移动端阅览
肖明忠, 代亚非, 李晓明. 拆分型Bloom Filter[J]. 电子学报, 2004,32(2):241-245.
XIAO Ming-zhong, DAI Ya-fei, LI Xiao-ming. Split Bloom Filter[J]. Acta Electronica Sinica, 2004, 32(2): 241-245.
Bloom Filter对数据集合采用一个位串表示并能有效支持集合元素的哈希查找操作.在对Bloom Filter及其改进型进行综述性分析研究并探讨它们的实用性之后
本文提出了使用位矩阵表示数据集合的拆分型Bloom Filter并对其作了分析比较研究
以允许集合元素不断增加的分布式系统应用模型为例
证明它能缓解增长问题并能有效节省全局的集合表示空间需求量.
A Bloom Filter is a simple space-efficient randomized data structure for representing a set in order to support membership queries
which uses an m-bit array to represent a data set and queries by hashing.The representation is the payoff for allowing a small rate of false positives in membership queries;that is
queries might wrongly regard an element as member of the set.However
for many applications
especially large-scale data set systems
the space savings and the locate time constantly outweigh this drawback when the probability of an error is sufficiently low and can suffer from by the application.The paper firstly surveys Bloom Filter and its variants in detail
and gives mathematical analysis behind them about space/time/error rate tradeoffs in order to explain their practicability.Then
we present a new kind of Bloom Filter-Split Bloom Filter
which uses a s×m-bit matrix to represent a set
and give analysis in detail as the formers.In distributed systems
each network node owns a data set
which is reasonable that some nodes are large number of data while the large number of nodes are a bit of data
if all nodes uses same parameters of algorithm
then it will give rise to a memory space wholly wasted.In addition
if the number of the elements of a node date set increases continually
the error rate will increasingly make the representation nonsensically.We prove that the Split Bloom Filter can efficiently solve or weaken the two problems.
0
浏览量
1605
下载量
12
CSCD
关联资源
相关文章
相关作者
相关机构
京公网安备11010802024621