/**
* Computes key.hashCode() and spreads (XORs) higher bits of hash
* to lower. Because the table uses power-of-two masking, sets of
* hashes that vary only in bits above the current mask will
* always collide. (Among known examples are sets of Float keys
* holding consecutive whole numbers in small tables.) So we
* apply a transform that spreads the impact of higher bits
* downward. There is a tradeoff between speed, utility, and
* quality of bit-spreading. Because many common sets of hashes
* are already reasonably distributed (so don't benefit from
* spreading), and because we use trees to handle large sets of
* collisions in bins, we just XOR some shifted bits in the
* cheapest possible way to reduce systematic lossage, as well as
* to incorporate impact of the highest bits that would otherwise
* never be used in index calculations because of table bounds.
*/
static final int hash(Object key) {
int h;
return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}以下是JDK 1.6的早期版本
/**
* Applies a supplemental hash function to a given hashCode, which
* defends against poor quality hash functions. This is critical
* because HashMap uses power-of-two length hash tables, that
* otherwise encounter collisions for hashCodes that do not differ
* in lower bits. Note: Null keys always map to hash 0, thus index 0.
*/
static int hash(int h) {
// This function ensures that hashCodes that differ only by
// constant multiples at each bit position have a bounded
// number of collisions (approximately 8 at default load factor).
h ^= (h >>> 20) ^ (h >>> 12);
return h ^ (h >>> 7) ^ (h >>> 4);
}有人能解释一下应用这种散列比在早期版本的java中有什么好处吗?这将如何影响密钥分发的速度和质量,我指的是jdk 8中实现的新哈希函数,以及它是如何实现的,以减少冲突?
发布于 2016-04-12 00:24:04
在hashCode方法表现相当糟糕的情况下,HashMap的性能可能会急剧下降。例如,假设您的hashCode方法只生成了一个16位数。
这通过将散列代码xor自身向右移位16来解决这个问题。如果这个数字在此之前是均匀分布的,那么它仍然应该是。如果它是坏的,这应该会改善它。
发布于 2019-07-19 18:17:56
Here很好地解释了HashMap在Java8中是如何工作的。下面是同一篇博客中的一段代码。
要理解这一点,我们首先需要了解索引是如何计算的:
将哈希码映射到数组中的索引。在最简单的方法中,这可以通过对散列码和数组长度执行模运算来完成,例如hash(key) % n。使用模运算可确保索引i始终在0和n之间。
i=哈希%n;
对于Java索引中的HashMap,i通过以下表达式计算:
散列i= (n - 1) &
;
在这个表达式中,变量n指的是表的长度,hash指的是键的散列。
由于我们使用位掩码((n - 1) &哈希)来计算模数,所以模数将不会使用高于n-1的最高位的任何位。例如,给定n= 32和4个要计算的哈希码。在不进行哈希码转换的情况下直接取模时,所有索引都是1,冲突是100%。这是因为掩码31 (n - 1),0000 0000 0000 000011111,使得任何高于位置5的位在数字h中不可用。为了使用这些最高的位,HashMap将它们左移16个位置h >>> 16,并用最低的位(h ^ (h >>> 16))展开。因此,所获得的模数具有较少的冲突。
https://stackoverflow.com/questions/36554000
复制相似问题