Resizing an array when the number of elements exceeds half its size

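I am trying to resize my array when the number of elements N is greater than m/2, where m is the size of the array, but it does not work and I do not understand why. The array is supposed to work like a hash table, so I apply a hash function before each insertion, and after resizing I want to reinsert every element with its new hash (because the value of m has changed). This is the error: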

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at JumpHashing.resize(JumpHashing.java:55)
    at JumpHashing.put(JumpHashing.java:50)
    at JumpHashing.hashing(JumpHashing.java:40)
    at JumpHashing.resize(JumpHashing.java:61)
    at JumpHashing.put(JumpHashing.java:50)
    at JumpHashing.hashing(JumpHashing.java:40)
    at JumpHashing.resize(JumpHashing.java:61)
    at JumpHashing.put(JumpHashing.java:50)
    at JumpHashing.hashing(JumpHashing.java:40)
    at JumpHashing.resize(JumpHashing.java:61)
    at JumpHashing.put(JumpHashing.java:50)
    at JumpHashing.hashing(JumpHashing.java:40)
    at JumpHashing.resize(JumpHashing.java:61)
    at JumpHashing.put(JumpHashing.java:50)
    at JumpHashing.hashing(JumpHashing.java:40)
    at JumpHashing.resize(JumpHashing.java:61)
    at JumpHashing.put(JumpHashing.java:50)
    at JumpHashing.hashing(JumpHashing.java:40)
    at JumpHashing.resize(JumpHashing.java:61)
    at JumpHashing.put(JumpHashing.java:50)
    at JumpHashing.hashing(JumpHashing.java:40)
    at JumpHashing.resize(JumpHashing.java:61)
    at JumpHashing.put(JumpHashing.java:50)
    at JumpHashing.hashing(JumpHashing.java:40)
    at JumpHashing.resize(JumpHashing.java:61)
    at JumpHashing.put(JumpHashing.java:50)
    at JumpHashing.hashing(JumpHashing.java:40)
    at JumpHashing.resize(JumpHashing.java:61)
    at JumpHashing.put(JumpHashing.java:50)
    at JumpHashing.hashing(JumpHashing.java:40)
    at JumpHashing.resize(JumpHashing.java:61)
    at JumpHashing.put(JumpHashing.java:50)

The problem is clearly the resizing: without it (with fewer than 23 elements) everything works.

The initial value of m is 23, and this is the actual code (the class "In" from algs4 is used to read the file):

public class JumpHashing{
    private int m;
    private int[] hashTable; 
    private static int id;
    private int N;

    public JumpHashing(){
        m = 23;
        hashTable = new int[m];
        N = 0;
    }

    public void hashing(int value) {
            int key = (value*11)%m;
            put(key, value);
    }

    public void put(int key, int value) {
        if(N <m/2) {
            hashTable[key] = value;
            N++;
        } else {
            m=m*2;
            N=0;
            resize(m);
        }
    }

    public void resize(int m) { 
        int[] temp = new int[m];
        for(int i=0; i<hashTable.length; i++) {
            temp[i] = hashTable[i];
        }
        hashTable = new int[m];
        for(int i=0; i<temp.length; i++) {
            hashing(temp[i]);
        }
    }

    public static void main(String[] args) {
        JumpHashing hashT1 = new JumpHashing();

        In file = new In("random.txt");
        while(file.hasNextLine()) {
            int value = Integer.parseInt(file.readLine());
            hashT1.hashing(value);
        }   
        for(int j=0; j<hashT1.hashTable.length; j++) {
            StdOut.println("Key: "+j+" Value: "+hashT1.hashTable[j]);
        }
    }
}

You end up calling resize over and over until you run out of memory. The problem is in this function:

    public void resize(int m) { 
        int[] temp = new int[m];  // <-- m is already the new, doubled size
        for(int i=0; i<hashTable.length; i++) {
            temp[i] = hashTable[i];
        }
        hashTable = new int[m];
        for(int i=0; i<temp.length; i++) {  // <-- here we go too far
            hashing(temp[i]);
        }
    }

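Your second loop iterates over the brand-new array of size m rather than the original array of size m/2. Every iteration calls hashing, which calls put and increments N, so partway through the loop N reaches m/2 again and put calls resize once more. That nested resize does the same thing with an array twice as large, and so on until the heap is exhausted. For example, after the first resize m is 46, so the loop makes 46 calls; the 24th finds N = 23 = m/2 and triggers a resize to 92.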
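To watch the runaway recursion without exhausting the heap, the counter logic of put and resize can be simulated on its own. This is only a rough sketch, not the original class: the name BuggyResizeTrace, the maxDepth cap and the capacities list are introduced here for illustration, and no array is ever allocated.

```java
import java.util.ArrayList;
import java.util.List;

// Simulates only the counters of the question's put/resize; no arrays are allocated.
class BuggyResizeTrace {
    private int m = 23;          // capacity, as in the question
    private int n = 0;           // element counter (N in the question)
    private final int maxDepth;  // hypothetical cap so the simulation terminates
    private final List<Integer> capacities = new ArrayList<>();

    BuggyResizeTrace(int maxDepth) {
        this.maxDepth = maxDepth;
    }

    // Mirrors put(): returns false once the depth cap is hit so callers unwind.
    private boolean put(int depth) {
        if (n < m / 2) {
            n++;
            return true;
        }
        m = m * 2;
        n = 0;
        return resize(depth + 1);
    }

    // Mirrors the buggy resize(): one put per slot of the NEW array.
    private boolean resize(int depth) {
        capacities.add(m);
        if (depth >= maxDepth) {
            return false;
        }
        int slots = m;  // temp.length in the question: fixed when this call starts
        for (int i = 0; i < slots; i++) {
            if (!put(depth)) {
                return false;
            }
        }
        return true;
    }

    // Inserts until the cap is reached and returns every capacity a resize asked for.
    static List<Integer> trace(int maxDepth) {
        BuggyResizeTrace t = new BuggyResizeTrace(maxDepth);
        while (t.put(0)) {
            // keep inserting; the 12th call triggers the first resize
        }
        return t.capacities;
    }
}
```

Calling BuggyResizeTrace.trace(4) returns [46, 92, 184, 368]: each resize starts the next one before its own loop can finish, which matches the repeating resize/put/hashing frames in the stack trace.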

This is what the function should look like:

public void resize(int m) {
    int[] oldHash = hashTable;
    hashTable = new int[m];
    for(int i=0; i<oldHash.length; i++) {
        if (oldHash[i] != 0) {     // <-- don't hash empty slots
            hashing(oldHash[i]);
        }
    }
}

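This also performs better, because you loop over the data only once and the loop runs at most m/2 times (the old size) instead of m.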
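One more detail in the question's put is worth a look: the else branch resizes but never stores the value that triggered the resize, so that value is dropped. Below is a minimal sketch of how put and resize might fit together so that it is kept. It is not the original class: JumpHashingSketch, hash, capacity and contains are hypothetical names, hashing and put are merged into one method for brevity, and it assumes positive values small enough that value * 11 does not overflow. As in the question, 0 marks an empty slot and colliding values still overwrite each other.

```java
// Sketch only: keeps the question's hash, the 0-means-empty convention
// and overwrite-on-collision behaviour.
class JumpHashingSketch {
    private int m = 23;                    // current capacity
    private int[] hashTable = new int[m];
    private int n = 0;                     // insertions since the last resize

    // Same hash as the question; assumes value > 0 and value * 11 fits in an int.
    private int hash(int value) {
        return (value * 11) % m;
    }

    public void put(int value) {
        if (n >= m / 2) {
            resize(m * 2);                 // grow first ...
        }
        hashTable[hash(value)] = value;    // ... then store, so the triggering value is kept
        n++;
    }

    private void resize(int newSize) {
        int[] oldHash = hashTable;
        m = newSize;
        n = 0;
        hashTable = new int[m];
        for (int value : oldHash) {        // loop over the OLD array only
            if (value != 0) {              // skip empty slots
                put(value);                // rehash against the new m
            }
        }
    }

    // Hypothetical helpers, added only so the sketch can be checked.
    public int capacity() {
        return m;
    }

    public boolean contains(int value) {
        return hashTable[hash(value)] == value;
    }
}
```

With this version, inserting the values 1 through 12 grows the table from 23 to 46 slots on the twelfth call, and all twelve values are still found afterwards. The rehash can never trigger a nested resize, because it reinserts at most half of the old capacity, which is a quarter of the new one.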