HashMap源码解读—Java8版本

一、HashMap简介

HashMap顶部有一段很长的注释，大概的介绍了HashMap。

1.1 原文

/**

* Hash table based implementation of the <tt>Map</tt> interface. This

* implementation provides all of the optional map operations, and permits

* <tt>null</tt> values and the <tt>null</tt> key. （The <tt>HashMap</tt>

* class is roughly equivalent to <tt>Hashtable</tt>, except that it is

* unsynchronized and permits nulls.） This class makes no guarantees as to

* the order of the map; in particular, it does not guarantee that the order

* will remain constant over time.

* This implementation provides constant-time performance for the basic

* operations （<tt>get</tt> and <tt>put</tt>）， assuming the hash function

* disperses the elements properly among the buckets. Iteration over

* collection views requires time proportional to the "capacity" of the

* <tt>HashMap</tt> instance （the number of buckets） plus its size （the number

* of key-value mappings）。 Thus, it's very important not to set the initial

* capacity too high （or the load factor too low） if iteration performance is

* important.

* An instance of <tt>HashMap</tt> has two parameters that affect its

* performance: initial capacity and load factor. The

* capacity is the number of buckets in the hash table, and the initial

* capacity is simply the capacity at the time the hash table is created. The

* load factor is a measure of how full the hash table is allowed to

* get before its capacity is automatically increased. When the number of

* entries in the hash table exceeds the product of the load factor and the

* current capacity, the hash table is rehashed （that is, internal data

* structures are rebuilt） so that the hash table has approximately twice the

* number of buckets.

* As a general rule, the default load factor （。75） offers a good

* tradeoff between time and space costs. Higher values decrease the

* space overhead but increase the lookup cost （reflected in most of

* the operations of the <tt>HashMap</tt> class, including

* <tt>get</tt> and <tt>put</tt>）。 The expected number of entries in

* the map and its load factor should be taken into account when

* setting its initial capacity, so as to minimize the number of

* rehash operations. If the initial capacity is greater than the

* maximum number of entries divided by the load factor, no rehash

* operations will ever occur.

* If many mappings are to be stored in a <tt>HashMap</tt>

* instance, creating it with a sufficiently large capacity will allow

* the mappings to be stored more efficiently than letting it perform

* automatic rehashing as needed to grow the table. Note that using

* many keys with the same {@code hashCode（）} is a sure way to slow

* down performance of any hash table. To ameliorate impact, when keys

* are {@link Comparable}, this class may use comparison order among

* keys to help break ties.

* Note that this implementation is not synchronized.

* If multiple threads access a hash map concurrently, and at least one of

* the threads modifies the map structurally, it must be

* synchronized externally. （A structural modification is any operation

* that adds or deletes one or more mappings; merely changing the value

* associated with a key that an instance already contains is not a

* structural modification.） This is typically accomplished by

* synchronizing on some object that naturally encapsulates the map.

* If no such object exists, the map should be "wrapped" using the

* {@link Collections#synchronizedMap Collections.synchronizedMap}

* method. This is best done at creation time, to prevent accidental

* unsynchronized access to the map:<pre>

* Map m = Collections.synchronizedMap（new HashMap（…））;</pre>

* The iterators returned by all of this class's "collection view methods"

* are fail-fast: if the map is structurally modified at any time after

* the iterator is created, in any way except through the iterator's own

* <tt>remove</tt> method, the iterator will throw a

* {@link ConcurrentModificationException}. Thus, in the face of concurrent

* modification, the iterator fails quickly and cleanly, rather than risking

* arbitrary, non-deterministic behavior at an undetermined time in the

* future.

* Note that the fail-fast behavior of an iterator cannot be guaranteed

* as it is, generally speaking, impossible to make any hard guarantees in the

* presence of unsynchronized concurrent modification. Fail-fast iterators

* throw <tt>ConcurrentModificationException</tt> on a best-effort basis.

* Therefore, it would be wrong to write a program that depended on this

* exception for its correctness: the fail-fast behavior of iterators

* should be used only to detect bugs.

* This class is a member of the

* <a href="{@docRoot}//technotes/guides/collections/index.html">

* Java Collections Framework</a>.

* @param <K> the type of keys maintained by this map

* @param <V> the type of mapped values

* @author Doug Lea

* @author Josh Bloch

* @author Arthur van Hoff

* @author Neal Gafter

* @see Object#hashCode（）

* @see Collection

* @see Map

* @see TreeMap

* @see Hashtable

* @since 1.2

public class HashMap<K,V> extends AbstractMap<K,V>

implements Map<K,V>, Cloneable, Serializable {

…

}

1.2 翻译

Map接口是基于哈希表实现的，这个实现为map提供了许多操作方法，允许key是null，或者values是null。HashMap大致和Hashtable一样，不同之处在于HashMap线程不同步，同时允许有null值。这个类不保证map的顺序，特别是它不能保证每次都顺序不变。

HashMap的实现为基础操作get和put方法提供了不变的性能表现，假设哈希函数在buckets中按比例分散元素，迭代器需要时间比例对hashmap实例（buckets的数量）的容量进行扩容（key-value映射的数量）。因此，如果迭代器性能很重要时，就不要给它的初始容量设置太高（或者加载因子太低）

一个Hashmap的实例会有两个参数影响它的性能表现：initial capacity初始值和load factor加载因子，capacity是哈希表中buckets的数量，初始值就是哈希表在被创建时的容量。加载因子是用来判断当哈希表到多满时才被允许自动扩容。当哈希表中的键值对超过了负载因子的乘积和当前的容量时，哈希表会通过rehashed的方法重新计算，（重建内部数据结构）这样一来哈希表大致会变成原有buckets的数量的两倍。

作为一条普遍的规则，默认的加载因子是0.75，它提供了很好的性能表现在时间和空间成本上。比0.75高就会降低整体空间，但是增加了查找的成本（这反应在绝大多数的HashMap操作，包括get和put方法）。当设置init capacity初始化值时，键值对的数量和它的负载因子应该被考虑进去，这样才能最小程度的降低重新计算哈希值的操作。如果初始化值比在被负载因子分发后的最大键值对数还要大时，重新计算哈希值的操作就会发生。

如果许多映射关系被存到一个hashMap实例中，创建足够大的容量让键值对被存储，会比让他们自动计算哈希值去扩容更高效。要注意的是，拥有同样哈希值的keys会降低所有哈希表的性能。为了改进这个影响，当keys在调用Comparable接口时，这个类利用比较顺序来帮助keys打破僵局。

需要注意的是这个方法不是同步的。如果多线程同时进入一个哈希映射时，并且至少有一个线程改变了map的结构时，它必须要在外面被同步。（结构性的修改可以时任何添加，删除更多键值对的操作，几乎不改变一个键的值，就说明这不是一个结构性的改变）。这些通常发生在当一些对象要被同步放进map时。

如果没有这样的对象存在，map必须要使用Collections.synchronizedMap方法包围，这是在创建时最好的做法，阻止意外的不同步map操作。

Map m = Collections.synchronizedMap（new HashMap（…））;

迭代器返回了几乎所有集合都有的fail-fast的方法：如果map在迭代器被创建后的任何时间被结构性的改变，那么除了迭代器使用自己的remove方法，其他情况下迭代器都被抛出一个一场ConcurrentModificationException。因此，在同时被改变的时候，迭代器会比随时可能发生的风险，或者在不确定的时间下发生的不确定行为，失败的更快更利落。

需要注意的是，一个迭代器的快速失败行为不能被保证，一般来说，不可能完全保证不同步情况下的的同时修改结构的事情发生。快速失败迭代器会在一个最好的基础上抛出ConcurrentModificationException异常。因此，如果一个程序通过这个异常来判断它对错的行为是错误的，迭代器的快速失败行为只能用于检测bugs

如需转载，请注明文章出处和来源网址：http://www.divcss5.com/html/h64829.shtml