Common techniques to measure execution time provide different values (Java)

Question

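I am trying to measure the time taken by different operations on different types of collections so that I can compare them, but the values I get vary widely even for the same type of collection, by a factor of up to 1000. I am using the common techniques I read about here: How do I time a method's execution in Java?

I compared HashSet, TreeSet and LinkedHashSet. I filled each set with 1,000,000 integers, called the contains() method, and iterated over the set. I measured the time of each operation, and the values differed a lot. So I repeated the whole thing with new sets of the same types, and the execution times I got do not seem plausible.

The same type of set takes 1400 milliseconds to fill one time and 300 milliseconds the next. Why is that?

Here is a code sample that may make my point clearer: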

    public static void main(String[] args) {

        HashSet<Integer> firstHashSet = new HashSet<>(predefinedSize);
        HashSet<Integer> secondHashSet = new HashSet<>(predefinedSize);
        LinkedHashSet<Integer> firstLinkedHashSet = new LinkedHashSet<>(predefinedSize);
        LinkedHashSet<Integer> secondLinkedHashSet = new LinkedHashSet<>(predefinedSize);
        TreeSet<Integer> firstTreeSet = new TreeSet<>();
        TreeSet<Integer> secondTreeSet = new TreeSet<>();
        int x = 9432;
        System.out.println("filling hashSet:        <" + fillSet(firstHashSet) + "> milliSeconds");
        System.out.println("filling linkedSet:      <" + fillSet(firstLinkedHashSet) + "> milliSeconds");
        System.out.println("filling treeSet:        <" + fillSet(firstTreeSet) + "> milliSeconds");
        System.out.println("-------------------------------------------------------------");
        System.out.println("filling hashSet:        <" + fillSet(secondHashSet) + "> milliSeconds");
        System.out.println("filling linkedSet:      <" + fillSet(secondLinkedHashSet) + "> milliSeconds");
        System.out.println("filling treeSet:        <" + fillSet(secondTreeSet) + "> milliSeconds");
    }

This is what one of my fillSet methods looks like:

    private static int size = 1000000;
    private static int predefinedSize = 2000000;

    public static double fillSet(LinkedHashSet<Integer> myHashSet){
        double timeStart = System.nanoTime();
        for(int i=0; i<size; i++){
            myHashSet.add(i);
        }
        double time = (System.nanoTime() - timeStart)/ Math.pow(10, 6);
        return time;
    }

The output looks like this:

    filling hashSet:        <52.14022> milliSeconds
    filling linkedSet:      <95.599435> milliSeconds
    filling treeSet:        <2172.773956> milliSeconds
    -------------------------------------------------------------
    filling hashSet:        <59.096929> milliSeconds
    filling linkedSet:      <1006.638126> milliSeconds
    filling treeSet:        <241.36395> milliSeconds

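As you can see, the outputs differ a lot. I assume this depends on the computing power of my PC, but I am not running any other programs in the background. Can someone give me an explanation and/or a solution?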

Answer 1

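As @kan's comment mentions, reading the system timer around an operation that you perform a million times will give wildly varying results. What you are looking for is microbenchmarking: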

How do I write a correct micro-benchmark in Java?
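To give a rough idea of what a microbenchmark does differently from a single timed run, here is a minimal hand-rolled sketch. It assumes that warm-up rounds followed by the median of several measured rounds is good enough for a first look. The class name `FillBenchmark`, the helpers `fillNanos` and `medianFillMillis`, and the round counts are made up for illustration and are not from the question. A real harness such as JMH handles much more than this (forked JVMs, dead-code elimination, statistics), so treat it as a sketch and not as a replacement.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Supplier;

public class FillBenchmark {

    // Hypothetical helper: fills any Set implementation with 0..size-1
    // and returns the elapsed time in nanoseconds.
    public static long fillNanos(Set<Integer> set, int size) {
        long start = System.nanoTime(); // nanoTime returns a long, so keep it as one
        for (int i = 0; i < size; i++) {
            set.add(i);
        }
        return System.nanoTime() - start;
    }

    // Runs some untimed warm-up rounds, then returns the median of the
    // measured rounds in milliseconds. Every round gets a fresh set.
    public static double medianFillMillis(Supplier<Set<Integer>> factory, int size,
                                          int warmupRounds, int measuredRounds) {
        for (int i = 0; i < warmupRounds; i++) {
            fillNanos(factory.get(), size); // timing discarded on purpose
        }
        long[] samples = new long[measuredRounds];
        for (int i = 0; i < measuredRounds; i++) {
            samples[i] = fillNanos(factory.get(), size);
        }
        Arrays.sort(samples);
        // Middle element; for an even number of rounds this is the upper median.
        return samples[measuredRounds / 2] / 1_000_000.0;
    }

    public static void main(String[] args) {
        int size = 1_000_000;  // same element count as in the question
        int warmup = 5;        // assumed value, tune for your machine
        int measured = 11;     // assumed value, odd so the median is a real sample
        System.out.printf("HashSet:       %.2f ms%n",
                medianFillMillis(HashSet::new, size, warmup, measured));
        System.out.printf("LinkedHashSet: %.2f ms%n",
                medianFillMillis(LinkedHashSet::new, size, warmup, measured));
        System.out.printf("TreeSet:       %.2f ms%n",
                medianFillMillis(TreeSet::new, size, warmup, measured));
    }
}
```

Taking `Set<Integer>` as the parameter type means one method serves all three implementations, and reporting the median makes a single slow round (for example one hit by a garbage collection pause) less visible in the result.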

As for why your timings are all over the place, you will have to read up on computer architecture and the JVM. Some possibilities:

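- Dynamic clock speed technology in the processor (https://electronics.stackexchange.com/questions/62353/how-can-a-cpu-dynamically-change-its-clock-frequency). You can rule this out by turning off the CPU's ability to change its clock speed.
- Your set holds 1 million integers, which is about 4 MiB of raw int data (more once the values are boxed as Integer objects inside the set). Given that non-server CPUs have roughly 1 to 8 MiB of cache, that size is right around the limit of what fits in the processor cache. If your 1 million elements stay in the cache longer during one run than during another, you will get wildly different execution times. You can rule this out by making the set either small enough that it definitely fits in the cache (tens of KB at most) or so large that the cache cannot help at all (perhaps a hundred megabytes).
- You may not be running any other applications, but other things are still running in the background on your computer (antivirus, update services, 10 to 20 other tasks related to the inner workings of the OS).
- The Java Virtual Machine may behave differently from run to run. I can't be too sure about this, since I am not an expert on the JIT, the GC and other inner workings that may affect execution time. A microbenchmarking library will largely eliminate this possible source of variance.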
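If you want to probe the cache explanation yourself, one way is to time the same number of contains() calls against a small and a large set. The sketch below is only an illustration under assumptions: the class name `WorkingSetDemo`, the helper `nanosPerLookup`, the sizes and the stride of 7919 are all made up here, and whether you see a difference at all depends on your CPU and JVM.

```java
import java.util.HashSet;
import java.util.Set;

public class WorkingSetDemo {

    // Hypothetical helper: average nanoseconds per contains() call on a
    // HashSet that holds the integers 0..elements-1.
    public static double nanosPerLookup(int elements, int lookups) {
        Set<Integer> set = new HashSet<>();
        for (int i = 0; i < elements; i++) {
            set.add(i);
        }
        int hits = 0;
        int key = 0;
        long start = System.nanoTime();
        for (int i = 0; i < lookups; i++) {
            // Step by a prime so that consecutive lookups land far apart.
            key = (key + 7919) % elements;
            if (set.contains(key)) {
                hits++; // using the result keeps the loop from being optimized away
            }
        }
        long elapsed = System.nanoTime() - start;
        if (hits != lookups) {
            throw new IllegalStateException("every key should be present");
        }
        return (double) elapsed / lookups;
    }

    public static void main(String[] args) {
        int lookups = 5_000_000; // assumed value
        // Untimed warm-up so both measurements run mostly compiled code.
        nanosPerLookup(1_000, lookups);
        nanosPerLookup(1_000_000, lookups);

        System.out.printf("1,000 elements:     %.1f ns per lookup%n",
                nanosPerLookup(1_000, lookups));
        System.out.printf("1,000,000 elements: %.1f ns per lookup%n",
                nanosPerLookup(1_000_000, lookups));
    }
}
```

Both measurements include the cost of boxing the key, so only the difference between the two numbers is meaningful, not their absolute values.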
