Java之HashMap原理与实战

yuziyue

13 Jul 2023 · 4 min read

一. HashMap的原理

HashMap 由数组+链表的方式实现，一个 HashMap 对象主要维护了如下对象：

数组table：数组的元素是一个链表，key经过hash后取余，相同索引的数据存在同一个链表里面。
数据量size：已存入的 kv 对数量。
数组 table 初始化容量：数组的元素个数。
扩容因子：当数组达到一定的阈值后进行扩容。

二. 自定义HashMap实现代码

下面实现一个自定义的的HashMap，实现主要put、get、clear功能，主要步骤如下

初始化：创建 HashMap 时创建一个数组，数组的元素类似是链表。
插入数据：当得到一个 k v 时，将 k 计算哈希值，然后根据数组长度取余，确定该 k v 对放到哪个索引里面，然后把数据放到对应索引的链表里面。
获取数据：同样根据 k 获取 hash 值，然后获取数据。
扩容：当数组里面的某一个链表 > 数组长度 * 扩容因子时，新建一个长度为当前数组的长度的两倍的数组。把旧的数据按照新的数组长度，重新计算hash值，放到新的数组里面。
说明：下面的示例没有实现红黑树部分。

package org.example;

import java.util.Arrays;
import java.util.LinkedList;
import java.util.Objects;


public class CustomMap<K, V> {

    // table 是一个数组，数组的元素是一个链表，相同索引的数据存在同一个链表里面
    private LinkedList<Node<K, V>>[] table;

    // 记录 table 已存入数据量
    public int size = 0;

    // table 的初始化容量
    private static final int INIT_CAPACITY = 102400;

    // 扩容参数
    private static final double LOAD_FACTOR = 0.5;

    public CustomMap() {
        table = new LinkedList[INIT_CAPACITY];
    }

    public void put(K key, V value) {
        int hash = hash(key, table.length);
        LinkedList<Node<K, V>> oldNodes = table[hash];
        if (oldNodes == null) {
            LinkedList<Node<K, V>> newNodes = new LinkedList<Node<K, V>>() {{
                add(new Node<>(key, value));
            }};
            table[hash] = newNodes;
            size++;
            checkResize(newNodes);
        } else {
            for (Node<K, V> oldNode : oldNodes) {
                if (oldNode.getKey().equals(key)) {
                    oldNode.setValue(value);
                    return;
                }
            }
            oldNodes.add(new Node<K, V>(key, value));
            size++;
            checkResize(oldNodes);
        }
    }

    // 每个链表的长度 > (table的长度 * LOAD_FACTOR) 时扩容
    private void checkResize(LinkedList<Node<K, V>> nodes) {
        if (nodes.size() > table.length * LOAD_FACTOR) {
            resize(table.length * 2);
        }
    }

    // table 扩容
    private void resize(int newLength) {
        LinkedList<Node<K, V>>[] newTable = new LinkedList[newLength];
        for (LinkedList<Node<K, V>> nodeList : table) {
            if (nodeList == null) {
                continue;
            }
            for (Node<K, V> node : nodeList) {
                int hash = hash(node.getKey(), newLength);
                LinkedList<Node<K, V>> tmpNodes = newTable[hash];
                if (tmpNodes == null) {
                    newTable[hash] = tmpNodes = new LinkedList<>();
                }
                tmpNodes.add(node);
            }
        }
        table = newTable;
    }

    public V get(K key) {
        int hash = hash(key, table.length);
        LinkedList<Node<K, V>> nodes = table[hash];
        if (nodes == null) {
            return null;
        }
        for (Node<K, V> node : nodes) {
            if (node.getKey().equals(key)) {
                return node.getValue();
            }
        }
        return null;
    }

    private int hash(K key, int length) {
        return (Objects.hashCode(key) & Integer.MAX_VALUE) % length;
    }

    public void clear() {
        Arrays.fill(table, null);
    }

    @Override
    public String toString() {
        StringBuilder stringBuffer = new StringBuilder();
        int tmpSize = 0;
        for (LinkedList<Node<K, V>> nodes : table) {
            if (nodes == null) {
                continue;
            }
            if (tmpSize > 20) {
                break;
            }
            for (Node<K, V> node : nodes) {
                if (tmpSize > 20) {
                    break;
                }
                tmpSize++;
                stringBuffer.append(node.toString());
                if (tmpSize != size) {
                    stringBuffer.append(",");
                }
            }
        }
        stringBuffer.append("...");
        return stringBuffer.toString();
    }

    private static class Node<K, V> {
        K key;
        V value;

        Node(K key, V value) {
            this.key = key;
            this.value = value;
        }

        public K getKey() {
            return key;
        }

        public V getValue() {
            return value;
        }

        public void setValue(V newValue) {
            value = newValue;
        }

        public String toString() {
            return key + "=" + value;
        }
    }

}

三. 使用自定义HashMap

package org.example;

public class Main {
    public static void main(String[] args) throws Exception {
        CustomMap<String, String> map1 = new CustomMap<>();
        long startAt = System.currentTimeMillis();
        for (int i = 0; i < 1000000; i++) {
            map1.put(i + "aa", i + "bb");
        }
        System.out.println("take mill seconds = " + (System.currentTimeMillis() - startAt));
        System.out.println("map1.get = " + map1.get("22330aa"));
        System.out.println("take mill seconds = " + (System.currentTimeMillis() - startAt));
        System.out.println("map1.size = " + map1.size);
        System.out.println(map1);
        map1.clear();
        Thread.sleep(1000 * 10);
    }
}

四. hash冲突

所谓的 hash 冲突就是，不同的key，经过一个hash函数处理后，可能会得到相同的hash值。实际上，我们在插入数据、查询数据的时候即便没有hash冲突，大部分的情况下数组的一个元素会存储很多的数据节点。我们会根据 key 的不同来确定我们要查下或插入的数据。

比较重要的hash函数，key 就是我的数据key，length是数组的长度，取余可以保证我数据一定或落到数组的某个索引上。


    private int hash(K key, int length) {
        return (Objects.hashCode(key) & Integer.MAX_VALUE) % length;
    }

🔗

阅读原文：https://yuchaoshui.com/4fab4db/

Java之HashMap原理与实战

一. HashMap的原理

二. 自定义HashMap实现代码

三. 使用自定义HashMap

四. hash冲突

Scala发送邮件完整示例

Python多版本管理工具pyenv

Java之HashMap原理与实战

一. HashMap的原理

二. 自定义HashMap实现代码

三. 使用自定义HashMap

四. hash冲突

Scala发送邮件完整示例

Python多版本管理工具pyenv

文章推荐

到底什么是ChatGPT的函数调用function calling

使用Django开发简单的后台管理

Python线程池处理多任务并发

Python错误 certificate verify failed: unable to get local issuer certificate 解决

Java命令行解析jcommander总结

Java远程执行linux命令总结