拆分型Bloom Filter

肖明忠; 代亚非; 李晓明

您当前的位置：

首页 >

文章列表页 >

拆分型Bloom Filter

论文 | 更新时间：2025-07-16

- 拆分型Bloom Filter
- Split Bloom Filter
- 电子学报 2004年32卷第2期页码：241-245
- 作者机构：
  
  北京大学计算机科学技术系,北京,100871
- 作者简介：
- 基金信息：
  
  国家重点基础研究发展规划973资助项目 (No.G1999032706);国家863高科技发展计划资助项目 (No.2001AA111013)
- DOI：
  中图分类号： TP311.12
- 纸质出版：2004
- 稿件说明：
移动端阅览
肖明忠, 代亚非, 李晓明. 拆分型Bloom Filter[J]. 电子学报, 2004,32(2):241-245.

XIAO Ming-zhong, DAI Ya-fei, LI Xiao-ming. Split Bloom Filter[J]. Acta Electronica Sinica, 2004, 32(2): 241-245.
肖明忠, 代亚非, 李晓明. 拆分型Bloom Filter[J]. 电子学报, 2004,32(2):241-245. DOI：

XIAO Ming-zhong, DAI Ya-fei, LI Xiao-ming. Split Bloom Filter[J]. Acta Electronica Sinica, 2004, 32(2): 241-245. DOI：

摘要

Bloom Filter对数据集合采用一个位串表示并能有效支持集合元素的哈希查找操作.在对Bloom Filter及其改进型进行综述性分析研究并探讨它们的实用性之后

本文提出了使用位矩阵表示数据集合的拆分型Bloom Filter并对其作了分析比较研究

以允许集合元素不断增加的分布式系统应用模型为例

证明它能缓解增长问题并能有效节省全局的集合表示空间需求量.

Abstract

A Bloom Filter is a simple space-efficient randomized data structure for representing a set in order to support membership queries

which uses an m-bit array to represent a data set and queries by hashing.The representation is the payoff for allowing a small rate of false positives in membership queries;that is

queries might wrongly regard an element as member of the set.However

for many applications

especially large-scale data set systems

the space savings and the locate time constantly outweigh this drawback when the probability of an error is sufficiently low and can suffer from by the application.The paper firstly surveys Bloom Filter and its variants in detail

and gives mathematical analysis behind them about space/time/error rate tradeoffs in order to explain their practicability.Then

we present a new kind of Bloom Filter-Split Bloom Filter

which uses a s×m-bit matrix to represent a set

and give analysis in detail as the formers.In distributed systems

each network node owns a data set

which is reasonable that some nodes are large number of data while the large number of nodes are a bit of data

if all nodes uses same parameters of algorithm

then it will give rise to a memory space wholly wasted.In addition

if the number of the elements of a node date set increases continually

the error rate will increasingly make the representation nonsensically.We prove that the Split Bloom Filter can efficiently solve or weaken the two problems.

关键词

Keywords

references

浏览量

1605

下载量

CSCD

文章被引用时，请邮件提醒。

提交

工具集

关联资源

Bloom Filters散列函数数目多阶段动态优化算法