How to count collisions in a list of strings in Java

How can I count all colliding pairs in a list of strings, using the hash code of each string?

import java.nio.charset.StandardCharsets;

public class HashCollisions {

    public static void main(String[] args) {
        String[] strings = {"AaAaAa", "AaAaBB", "AaBBAa", "AaBBBB"};
        int collisions = 0;

        // Visit every unordered pair (i, j) with i < j exactly once.
        for (int i = 0; i < strings.length - 1; i++) {
            for (int j = i + 1; j < strings.length; j++) {
                // Same hash but different content counts as a collision.
                if (hash(strings[i]) == hash(strings[j]) && !strings[i].equals(strings[j])) {
                    collisions++;
                }
            }
        }

        System.out.println(collisions);
    }

    // Custom 8-bit hash: XOR of all bytes of the string.
    private static byte hash(String s) {
        // Fixed charset, so the result does not depend on the platform default.
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        byte result = 0; // starting from 0 also covers the empty string

        for (byte b : bytes) {
            result ^= b;
        }

        return result;
    }
}

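Why not use a Set? Put the hash code of every value in your List into a Set and compute List.size() - Set.size(). (Adding the strings themselves would only detect duplicate strings, not hash collisions.) The result is the number of values that land on an already-used hash, not the number of colliding pairs.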
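
A minimal sketch of the Set idea, assuming the goal is to count how many distinct strings land on a hash code that another string already produced. The names SetCollisionCount and countSurplusHashes are made up for illustration. Duplicate strings are dropped first, on the assumption that equal strings should not count as a collision (this mirrors the equals check in the question), and the Set stores hash codes rather than the strings themselves.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class SetCollisionCount {

    // Hypothetical helper: number of distinct strings whose hash code
    // was already produced by another distinct string.
    static int countSurplusHashes(List<String> strings) {
        // Drop duplicate strings first; equal strings always share a hash.
        Set<String> distinct = new LinkedHashSet<>(strings);

        // Collect the hash codes; a Set keeps each value only once.
        Set<Integer> hashes = new HashSet<>();
        for (String s : distinct) {
            hashes.add(s.hashCode());
        }
        return distinct.size() - hashes.size();
    }

    public static void main(String[] args) {
        List<String> strings = Arrays.asList("foo", "bar", "AaAa", "foobar",
                "BBBB", "AaBB", "FB", "Ea", "foo");
        System.out.println(countSurplusHashes(strings)); // prints 3
    }
}
```

A group of three strings sharing one hash contributes 2 here, not 3 pairs, so this is a quick collision indicator rather than a pair count.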

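You can group the list of strings by hashCode and then work with the resulting map. As soon as a given key has more than one value, there is a collision: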

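import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.stream.Collectors;

// The class name is arbitrary; the original snippet showed only main().
public class GroupByHash {
    public static void main(String[] args) {
        List<String> strings = Arrays.asList("foo", "bar", "AaAa", "foobar",
                "BBBB", "AaBB", "FB", "Ea", "foo");
        // Key: hash code; value: all strings that produce this hash code.
        Map<Integer, List<String>> stringsByHash = strings.stream()
                .collect(Collectors.groupingBy(String::hashCode));
        for (Entry<Integer, List<String>> entry : stringsByHash.entrySet()) {
            List<String> value = entry.getValue();
            // Every string beyond the first one in a group collides.
            int collisions = value.size() - 1;
            if (collisions > 0) {
                System.out.println(
                        "Got " + collisions + " collision(s) for strings "
                                + value + " (hash: " + entry.getKey() + ")");
            }
        }
    }
}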

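This prints the following. Note that the duplicate "foo" is reported as a collision too, because equal strings always have the same hash code: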

Got 1 collision(s) for strings [foo, foo] (hash: 101574)
Got 1 collision(s) for strings [FB, Ea] (hash: 2236)
Got 2 collision(s) for strings [AaAa, BBBB, AaBB] (hash: 2031744)
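
Since the question asks for colliding pairs, a group of k distinct strings that share one hash contributes k(k-1)/2 pairs rather than k-1. The following is a hedged sketch built on the same grouping idea; CollisionPairs and countCollidingPairs are illustrative names, and duplicates are removed first on the assumption that equal strings should not count as a collision (as in the question's equals check).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CollisionPairs {

    // Hypothetical helper: number of unordered pairs of distinct strings
    // that have the same String.hashCode().
    static long countCollidingPairs(List<String> strings) {
        // Count how many distinct strings fall on each hash code.
        Map<Integer, Long> groupSizes = strings.stream()
                .distinct()
                .collect(Collectors.groupingBy(String::hashCode, Collectors.counting()));

        // A group of k strings yields k * (k - 1) / 2 unordered pairs.
        return groupSizes.values().stream()
                .mapToLong(k -> k * (k - 1) / 2)
                .sum();
    }

    public static void main(String[] args) {
        List<String> strings = Arrays.asList("foo", "bar", "AaAa", "foobar",
                "BBBB", "AaBB", "FB", "Ea", "foo");
        System.out.println(countCollidingPairs(strings)); // prints 4
    }
}
```

For the list above this gives 3 pairs from [AaAa, BBBB, AaBB] plus 1 pair from [FB, Ea]; the repeated "foo" adds nothing. Unlike the nested loop in the question, the work grows roughly linearly with the number of strings.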