文章/答案/技术大牛

发布

社区首页 >问答首页 >Bloom Filter散列返回太多的冲突。

问Bloom Filter散列返回太多的冲突。
EN

Stack Overflow用户

提问于 2017-03-15 21:38:07

回答 1查看 136关注 0票数 0

我一直在尝试实现我自己的(简单的) Bloom过滤器，但是被哈希困住了，我理解了多次散列项和用索引填充位数组的概念。

然而，我在哈希中看到了大量的碰撞，我使用了一个哈希算法(我已经尝试过FNV、murmurhash和现在的农哈希)和各种种子(基于当前的纳秒)。

我一定做错了什么，我是通过跟随k函数和设置相同数量的种子来计算这里的信息函数的。

任何帮助都会很好，谢谢。

const farmhash = require('farmhash');

class BloomFilter {
	constructor(items, input)
	{
		const BITS_PER_ITEM = 15; //~0.1% false positive rate
		this.m = Buffer.alloc(items.length * BITS_PER_ITEM); // setup bit array
		this.k = Math.ceil(BITS_PER_ITEM * 0.7); // amount of hash functions we need to use
		this.seeds = [];
		this.input = input;
		this.items = items;

		this.setSeeds();
		this.insertItems();
	}

	get time()
	{
		let hrTime = process.hrtime()
		return hrTime[1];
	}

	setSeeds()
	{
		for(let i = 0; i <= this.k; i++) this.seeds.push(this.time);
	}
	
	insertItems()
	{
		console.log('Total buffer size: ' + this.m.length);

		let collisions = 0;
		this.items.forEach(value => {			
			this.getBufferIndices(value).map(index => {
				if(this.m[index] === 1) collisions++;
				this.m[index] = 1;
			});
		});

		console.log('Total collisions: ' + collisions);
	}

	getBufferIndices(value)
	{
		let indicies = [];

		this.seeds.forEach(seed => indicies.push(farmhash.hash32WithSeed(value, seed) % this.m.length));

		return indicies;
	}
}

module.exports = BloomFilter;

algorithm

ecmascript-6

bloom-filter

javascript

node.js

回答 1

Stack Overflow用户

回答已采纳

发布于 2017-03-15 21:51:12

据我从Bloom过滤器中所记得的，当特定值的所有k索引与不同值的索引匹配时，就会发生冲突。

看起来，您可以将先前设置为冲突的单个桶(this.m[index])计算在内。

以下(未经测试的)代码应该计算实际的冲突：

let collisions = 0;

this.items.forEach(value => {           
  let overlap = 0;
  this.getBufferIndices(value).map(index => {
    if(this.m[index] === 1) overlap++;
    this.m[index] = 1;
  });
  if (overlap === this.k) collisions++;
});

正如@Thomas在他的评论中正确地指出的那样，与其使用.map() (创建一个新数组)，不如使用.forEach()

this.getBufferIndices(value).forEach(index, ...);

在getBufferIndices()中，您可以使用.map()而不是.forEach()

getBufferIndices(value) {
  return this.seeds.map(seed => (farmhash.hash32WithSeed(value, seed) % this.m.length));
}

票数 1

页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持

原文链接：

https://stackoverflow.com/questions/42821183

复制

相似问题

问Bloom Filter散列返回太多的冲突。
EN

回答 1

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问Bloom Filter散列返回太多的冲突。EN

回答 1

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问Bloom Filter散列返回太多的冲突。
EN