一、HashMap简介
HashMap顶部有一段很长的注释,大概的介绍了HashMap。
1.1 原文
/**
* Hash table based implementation of the <tt>Map</tt> interface. This
* implementation provides all of the optional map operations, and permits
* <tt>null</tt> values and the <tt>null</tt> key. (The <tt>HashMap</tt>
* class is roughly equivalent to <tt>Hashtable</tt>, except that it is
* unsynchronized and permits nulls.) This class makes no guarantees as to
* the order of the map; in particular, it does not guarantee that the order
* will remain constant over time.
*
* <p>This implementation provides constant-time performance for the basic
* operations (<tt>get</tt> and <tt>put</tt>), assuming the hash function
* disperses the elements properly among the buckets. Iteration over
* collection views requires time proportional to the "capacity" of the
* <tt>HashMap</tt> instance (the number of buckets) plus its size (the number
* of key-value mappings)。 Thus, it's very important not to set the initial
* capacity too high (or the load factor too low) if iteration performance is
* important.
*
* <p>An instance of <tt>HashMap</tt> has two parameters that affect its
* performance: <i>initial capacity</i> and <i>load factor</i>. The
* <i>capacity</i> is the number of buckets in the hash table, and the initial
* capacity is simply the capacity at the time the hash table is created. The
* <i>load factor</i> is a measure of how full the hash table is allowed to
* get before its capacity is automatically increased. When the number of
* entries in the hash table exceeds the product of the load factor and the
* current capacity, the hash table is <i>rehashed</i> (that is, internal data
* structures are rebuilt) so that the hash table has approximately twice the
* number of buckets.
*
* <p>As a general rule, the default load factor (。75) offers a good
* tradeoff between time and space costs. Higher values decrease the
* space overhead but increase the lookup cost (reflected in most of
* the operations of the <tt>HashMap</tt> class, including
* <tt>get</tt> and <tt>put</tt>)。 The expected number of entries in
* the map and its load factor should be taken into account when
* setting its initial capacity, so as to minimize the number of
* rehash operations. If the initial capacity is greater than the
* maximum number of entries divided by the load factor, no rehash
* operations will ever occur.
*
* <p>If many mappings are to be stored in a <tt>HashMap</tt>
* instance, creating it with a sufficiently large capacity will allow
* the mappings to be stored more efficiently than letting it perform
* automatic rehashing as needed to grow the table. Note that using
* many keys with the same {@code hashCode()} is a sure way to slow
* down performance of any hash table. To ameliorate impact, when keys
* are {@link Comparable}, this class may use comparison order among
* keys to help break ties.
*
* <p><strong>Note that this implementation is not synchronized.</strong>
* If multiple threads access a hash map concurrently, and at least one of
* the threads modifies the map structurally, it <i>must</i> be
* synchronized externally. (A structural modification is any operation
* that adds or deletes one or more mappings; merely changing the value
* associated with a key that an instance already contains is not a
* structural modification.) This is typically accomplished by
* synchronizing on some object that naturally encapsulates the map.
*
* If no such object exists, the map should be "wrapped" using the
* {@link Collections#synchronizedMap Collections.synchronizedMap}
* method. This is best done at creation time, to prevent accidental
* unsynchronized access to the map:<pre>
* Map m = Collections.synchronizedMap(new HashMap(…));</pre>
*
* <p>The iterators returned by all of this class's "collection view methods"
* are <i>fail-fast</i>: if the map is structurally modified at any time after
* the iterator is created, in any way except through the iterator's own
* <tt>remove</tt> method, the iterator will throw a
* {@link ConcurrentModificationException}. Thus, in the face of concurrent
* modification, the iterator fails quickly and cleanly, rather than risking
* arbitrary, non-deterministic behavior at an undetermined time in the
* future.
*
* <p>Note that the fail-fast behavior of an iterator cannot be guaranteed
* as it is, generally speaking, impossible to make any hard guarantees in the
* presence of unsynchronized concurrent modification. Fail-fast iterators
* throw <tt>ConcurrentModificationException</tt> on a best-effort basis.
* Therefore, it would be wrong to write a program that depended on this
* exception for its correctness: <i>the fail-fast behavior of iterators
* should be used only to detect bugs.</i>
*
* <p>This class is a member of the
* <a href="{@docRoot}//technotes/guides/collections/index.html">
* Java Collections Framework</a>.
*
* @param <K> the type of keys maintained by this map
* @param <V> the type of mapped values
*
* @author Doug Lea
* @author Josh Bloch
* @author Arthur van Hoff
* @author Neal Gafter
* @see Object#hashCode()
* @see Collection
* @see Map
* @see TreeMap
* @see Hashtable
* @since 1.2
*/
public class HashMap<K,V> extends AbstractMap<K,V>
implements Map<K,V>, Cloneable, Serializable {
…
}
1.2 翻译
Map接口是基于哈希表实现的,这个实现为map提供了许多操作方法,允许key是null,或者values是null。HashMap大致和Hashtable一样,不同之处在于HashMap线程不同步,同时允许有null值。这个类不保证map的顺序,特别是它不能保证每次都顺序不变。
HashMap的实现为基础操作get和put方法提供了不变的性能表现,假设哈希函数在buckets中按比例分散元素,迭代器需要时间比例对hashmap实例(buckets的数量)的容量进行扩容(key-value映射的数量)。因此,如果迭代器性能很重要时,就不要给它的初始容量设置太高(或者加载因子太低)
一个Hashmap的实例会有两个参数影响它的性能表现:initial capacity初始值和load factor加载因子,capacity是哈希表中buckets的数量,初始值就是哈希表在被创建时的容量。加载因子是用来判断当哈希表到多满时才被允许自动扩容。当哈希表中的键值对超过了负载因子的乘积和当前的容量时,哈希表会通过rehashed的方法重新计算,(重建内部数据结构)这样一来哈希表大致会变成原有buckets的数量的两倍。
作为一条普遍的规则,默认的加载因子是0.75,它提供了很好的性能表现在时间和空间成本上。比0.75高就会降低整体空间,但是增加了查找的成本(这反应在绝大多数的HashMap操作,包括get和put方法)。当设置init capacity初始化值时,键值对的数量和它的负载因子应该被考虑进去,这样才能最小程度的降低重新计算哈希值的操作。如果初始化值比在被负载因子分发后的最大键值对数还要大时,重新计算哈希值的操作就会发生。
如果许多映射关系被存到一个hashMap实例中,创建足够大的容量让键值对被存储,会比让他们自动计算哈希值去扩容更高效。要注意的是,拥有同样哈希值的keys会降低所有哈希表的性能。为了改进这个影响,当keys在调用Comparable接口时,这个类利用比较顺序来帮助keys打破僵局。
需要注意的是这个方法不是同步的。如果多线程同时进入一个哈希映射时,并且至少有一个线程改变了map的结构时,它必须要在外面被同步。(结构性的修改可以时任何添加,删除更多键值对的操作,几乎不改变一个键的值,就说明这不是一个结构性的改变)。这些通常发生在当一些对象要被同步放进map时。
如果没有这样的对象存在,map必须要使用Collections.synchronizedMap方法包围,这是在创建时最好的做法,阻止意外的不同步map操作。
Map m = Collections.synchronizedMap(new HashMap(…));
迭代器返回了几乎所有集合都有的fail-fast的方法:如果map在迭代器被创建后的任何时间被结构性的改变,那么除了迭代器使用自己的remove方法,其他情况下迭代器都被抛出一个一场ConcurrentModificationException。因此,在同时被改变的时候,迭代器会比随时可能发生的风险,或者在不确定的时间下发生的不确定行为,失败的更快更利落。
需要注意的是,一个迭代器的快速失败行为不能被保证,一般来说,不可能完全保证不同步情况下的的同时修改结构的事情发生。快速失败迭代器会在一个最好的基础上抛出ConcurrentModificationException异常。因此,如果一个程序通过这个异常来判断它对错的行为是错误的,迭代器的快速失败行为只能用于检测bugs
如需转载,请注明文章出处和来源网址:http://www.divcss5.com/html/h64829.shtml