Differences

This shows you the differences between two versions of the page.

--- data-structures-algorithms:hash-table-concept [2026/05/23 22:47] – created phong2018
+++ data-structures-algorithms:hash-table-concept [2026/05/23 22:55] (current) – phong2018
@@ Line 7: / Line 7: @@
 When we write:
-  set("abc") = "xyz"
+  set("abc", "xyz")
 Internally, it becomes:
@@ Line 20: / Line 20: @@
 - The value is stored together with the key at that index
-===== Why store (key, value)? =====
+===== Important Correction: Real Structure =====
-We store both key and value because different keys can produce the same index (collision).
+In real implementations, each slot is usually a "bucket", not a single value:
-Example:
+  A[index] = list of (key, value)
-  "abc" → index 2
+Even if there is only one element, it is still stored inside a bucket.
-  "def" → index 2
-If we store only values:
+===== Case 1: No Collision =====
-  A[2] = "xyz"   ❌ (data will be overwritten)
+If no other key maps to the same index:
-Instead, we store:
+  A[index] = [("abc", "xyz")]
+This is the simplest case.
+===== Case 2: Collision (Different Keys, Same Index) =====
+Different keys can map to the same index:
+  "abc" → index 2
+  "def" → index 2
+In this case:
   A[2] = [
@@ Line 40: / Line 50: @@
   ]
-This method is called "chaining".
+This is called "chaining".
-===== Correct Mental Model =====
+===== Case 3: Same Key (Update Value) =====
-The correct way to think about a hash table is:
+If the same key is inserted again:
-  key → hash(key) → index → A[index] = (key, value)
+  set("abc", "v1")
+  set("abc", "v2")
-or in real systems:
+The hash table does NOT create duplicates.
-  A[index] = list of (key, value)
+Instead, it updates the existing value:
+Before:
+  A[index] = [("abc", "v1")]
+After:
+  A[index] = [("abc", "v2")]
+Rule:
+  Same key → overwrite value
+===== Case 4: Lookup (get operation) =====
+When we do:
+  get("abc")
+Steps:
+. Compute index = hash("abc") % array_size
+. Go to A[index]
+. Search inside the bucket
+. Match the key exactly
+. Return its value
+Example:
+  A[index] = [("abc", "xyz"), ("def", "123")]
+Only return value where key == "abc"
+===== Correct Mental Model =====
+Two-level thinking:
+High-level:
+  key → hash(key) → index → value
+Real implementation:
+  key → hash(key) → index → A[index] = list of (key, value)
 ===== Summary =====
@@ Line 56: / Line 105: @@
 - Hash table maps keys to values using a hash function
 - Index is computed as: hash(key) % array_size
-- Each slot stores (key, value), not just value
+- Each slot is a bucket storing (key, value)
-- Collisions are handled using a list (chaining)
+- Same key overwrites old value
+- Different keys with same index are stored together (collision handling)
 - Average time complexity for get/set is O(1)
 ===== Key Insight =====
-Hash tables are fast because they avoid searching through all elements by directly jumping to a computed index.
+Hash tables are fast because they convert a key directly into an index, avoiding full scanning of data structures like arrays or lists.