The HashMap
A hash table is made up of two parts: an array (the actual table where the data to be searched is stored) and a mapping function, known as a hash function.
The hash function is a mapping from the input space to the integer space that defines the indices of the array. In other words, the hash function provides a way of assigning numbers to the input data so that the data can then be stored at the array index corresponding to the assigned number.
For example, if I want to store "Durant", I pass "Durant" into the hash function and get back (let's say) the number 3. The hash table then stores "Durant" at index 3.
In this way, a lookup in a HashMap can achieve O(1) time in the best case, just like an array access.
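To make the idea concrete, here is a minimal sketch in Java (the class and method names are my own invention, not a real library API, and collisions are ignored for now): the key is hashed, the hash is reduced to an array index, and the value is stored at and read from that index.

```java
// Minimal illustration: hash the key, turn the hash into an array index,
// and store/read the entry at that index. Collisions are ignored here.
public class TinyHashTable {
    private final String[] table = new String[8]; // the backing array

    private int indexFor(String key) {
        // Mask off the sign bit so the hash code maps to a valid index.
        return (key.hashCode() & 0x7fffffff) % table.length;
    }

    public void put(String key) {
        table[indexFor(key)] = key;          // e.g. "Durant" might land at index 3
    }

    public String get(String key) {
        return table[indexFor(key)];         // one array access: O(1) in the best case
    }

    public static void main(String[] args) {
        TinyHashTable t = new TinyHashTable();
        t.put("Durant");
        System.out.println(t.get("Durant")); // prints "Durant"
    }
}
```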
However, in the average case the cost is really (as the Wikipedia page says) O(1 + n/k), where n is the number of stored entries and k is the hash table size (the number of buckets). If k is large enough relative to n, the result is effectively O(1). But suppose k is 10 and n is 100. In that case each bucket holds 10 entries on average, so a search is definitely not O(1); it is a linear scan through up to 10 entries.
In practice, though, we usually just assume that a HashMap lookup is O(1).
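As a rough sketch of where the n/k term comes from (a toy separate-chaining table, not how java.util.HashMap is actually implemented): every key that hashes to the same bucket lands in the same list, and a lookup has to walk that list.

```java
import java.util.ArrayList;
import java.util.List;

// Toy separate-chaining table with a fixed number of buckets (k).
// With k = 10 buckets and n = 100 entries, each bucket holds about
// 10 entries on average, so contains() walks a short list rather
// than doing a single array access.
public class ChainedTable {
    private final List<List<Integer>> buckets = new ArrayList<>();

    public ChainedTable(int k) {
        for (int i = 0; i < k; i++) buckets.add(new ArrayList<>());
    }

    private List<Integer> bucketFor(int key) {
        return buckets.get(Math.floorMod(key, buckets.size()));
    }

    public void add(int key) {
        bucketFor(key).add(key);
    }

    public boolean contains(int key) {
        // Linear scan of one bucket: O(n/k) entries on average.
        return bucketFor(key).contains(key);
    }

    public static void main(String[] args) {
        ChainedTable t = new ChainedTable(10);   // k = 10
        for (int i = 0; i < 100; i++) t.add(i);  // n = 100
        System.out.println(t.contains(37));      // true, found after scanning ~10 entries
    }
}
```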
Below is a good summary of why.
Hash tables have O(1) average and amortized case complexity, but they suffer from O(n) worst-case time complexity, for two reasons:

- If too many elements hash to the same key, looking inside that key may take O(n) time.
- Once a hash table exceeds its load factor, it has to rehash: create a new, bigger table and re-insert every element into it.

However, it is said to be O(1) average and amortized because:

- It is very rare that many items hash to the same key, provided you choose a good hash function and the load factor is not too high.
- The rehash operation, which is O(n), can happen at most once every n/2 operations, all of which are assumed O(1). So when you sum the average time per operation, you get (n*O(1) + O(n)) / n = O(1).
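To see where the amortized argument comes from, here is a hedged sketch of the rehash step (again a toy model with made-up names, not the real java.util.HashMap resizing code): once the number of entries passes a load-factor threshold, the table doubles its bucket array and re-inserts every element. That single insert costs O(n), but it can only happen after many O(1) inserts, so the averaged cost per insert stays O(1).

```java
import java.util.ArrayList;
import java.util.List;

// Toy table that doubles its bucket array when the load factor passes 0.75.
// The doubling (rehash) costs O(n), but it only happens after many O(1)
// inserts, which is why the amortized cost per insert stays O(1).
public class ResizingTable {
    private List<List<Integer>> buckets = newBuckets(4);
    private int size = 0;

    private static List<List<Integer>> newBuckets(int k) {
        List<List<Integer>> b = new ArrayList<>();
        for (int i = 0; i < k; i++) b.add(new ArrayList<>());
        return b;
    }

    public void add(int key) {
        if (size + 1 > buckets.size() * 0.75) rehash();
        buckets.get(Math.floorMod(key, buckets.size())).add(key);
        size++;
    }

    private void rehash() {
        // O(n): allocate a table twice as big and re-insert every element.
        List<List<Integer>> old = buckets;
        buckets = newBuckets(old.size() * 2);
        for (List<Integer> bucket : old)
            for (int key : bucket)
                buckets.get(Math.floorMod(key, buckets.size())).add(key);
    }

    public static void main(String[] args) {
        ResizingTable t = new ResizingTable();
        for (int i = 0; i < 1000; i++) t.add(i); // triggers a handful of rehashes
        System.out.println("done");
    }
}
```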