HashMap 的数据存储结构是 table,而 table 是 Entry 内部类的数组,如下图所示:

HashMap 并不会在创建时为 map 分配空间,而仅仅会将相应的成员初始化为对应的值,只有在首次调用 put 方法插入参数时才会调用 inflateTable 方法分配空间,事实上,我们已经知道,HashMap 实际的存储结构是 Entry 的数组,为 HashMap 分配空间实际上是创建指定大小的 Entry 数组
/**
* Associates the specified value with the specified key in this map.
* If the map previously contained a mapping for the key, the old
* value is replaced.
*
* @param key key with which the specified value is to be associated
* @param value value to be associated with the specified key
* @return the previous value associated with <tt>key</tt>, or
* <tt>null</tt> if there was no mapping for <tt>key</tt>.
* (A <tt>null</tt> return can also indicate that the map
* previously associated <tt>null</tt> with <tt>key</tt>.)
*/
public V put(K key, V value) {
if (table == EMPTY_TABLE) {
inflateTable(threshold);
}
if (key == null)
return putForNullKey(value);
int hash = hash(key);
int i = indexFor(hash, table.length);
for (Entry<K,V> e = table[i]; e != null; e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
V oldValue = e.value;
e.value = value;
e.recordAccess(this);
return oldValue;
}
}
modCount++;
addEntry(hash, key, value, i);
return null;
}
/**
* Inflates the table.
*/
private void inflateTable(int toSize) {
// 返回的 capacity 值一定是大于等于 toSize 的最小 2 的幂
int capacity = roundUpToPowerOf2(toSize);
threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
table = new Entry[capacity];
initHashSeedAsNeeded(capacity);
}
这里使用 roundUpToPowerOf2 方法获取了大于等于 toSize 的最小 2 的幂
private static int roundUpToPowerOf2(int number) {
// assert number >= 0 : "number must be non-negative";
return number >= MAXIMUM_CAPACITY
? MAXIMUM_CAPACITY
: (number > 1) ? Integer.highestOneBit((number - 1) << 1) : 1;
}
上面已经提到了 put 方法,如果新加入的 key 已经存在了,则更新这个 key 对应的值,并返回旧的 value,如果不存在,则会调用 addEntry 添加新的元素:
/**
* Adds a new entry with the specified key, value and hash code to
* the specified bucket. It is the responsibility of this
* method to resize the table if appropriate.
*
* Subclass overrides this to alter the behavior of put method.
*/
void addEntry(int hash, K key, V value, int bucketIndex) {
if ((size >= threshold) && (null != table[bucketIndex])) {
resize(2 * table.length);
hash = (null != key) ? hash(key) : 0;
bucketIndex = indexFor(hash, table.length);
}
createEntry(hash, key, value, bucketIndex);
}
/**
* Like addEntry except that this version is used when creating entries
* as part of Map construction or "pseudo-construction" (cloning,
* deserialization). This version needn't worry about resizing the table.
*
* Subclass overrides this to alter the behavior of HashMap(Map),
* clone, and readObject.
*/
void createEntry(int hash, K key, V value, int bucketIndex) {
Entry<K,V> e = table[bucketIndex];
table[bucketIndex] = new Entry<>(hash, key, value, e);
size++;
}
在 Entry 的构造方法中,实现了将新创建的 Entry 对象插入 table 数组对应索引下链表的首节点,并使用 next 指向原链表
Entry(int h, K k, V v, Entry<K,V> n) {
value = v;
next = n;
key = k;
hash = h;
}
在 addEntry 方法中,在调用 createEntry 方法前,首先判断了是否已经超出 HashMap 容量,需要扩容,如果需要扩容,则会调用 resize 方法扩容到当前容量的两倍
在低版本 JDK 中,是否需要扩容是在调用 createEntry 方法插入元素后判断的,新版本 JDK 中改为了插入前扩容,这样的改动虽然很小,但是对于大量使用 HashMap 的场景中,他的好处是显而易见的,这是十分需要注意的一个细节问题
HashMap 中,默认的 loadFactor 值是 0.75,threshold 值是根据 loadFactory * table.length 计算的,也就是说,当容量达到负载因子时就会扩容,因为随着 HashMap 存储元素的增加,碰撞的几率也在上升,因此如果超过负载因子,那么碰撞几率就已经不可接受了
哈希表最重要的当然是哈希算法
java 的 HashMap 采用的方法是:让length为2的指数倍,然后用 hashCode(key) & (length-1) 的方法得到索引
/**
* Retrieve object hash code and applies a supplemental hash function to the
* result hash, which defends against poor quality hash functions. This is
* critical because HashMap uses power-of-two length hash tables, that
* otherwise encounter collisions for hashCodes that do not differ
* in lower bits. Note: Null keys always map to hash 0, thus index 0.
*/
final int hash(Object k) {
int h = hashSeed;
if (0 != h && k instanceof String) {
return sun.misc.Hashing.stringHash32((String) k);
}
h ^= k.hashCode();
// This function ensures that hashCodes that differ only by
// constant multiples at each bit position have a bounded
// number of collisions (approximately 8 at default load factor).
h ^= (h >>> 20) ^ (h >>> 12);
return h ^ (h >>> 7) ^ (h >>> 4);
}
首先,使用随机的 hashSeed,通过异或操作,降低了冲突的概率,然后通过一系列无符号右移操作,来把高位与低位进行异或操作,进一步降低碰撞的概率
技术帖
技术分享
源码
sourcecode
java
jdk
hash
map
hashmap
table