Does the break statement make my code faster?

Question

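I'm supposed to write code that tells whether an array contains duplicates. Running time doesn't matter. I think my code below is O(n²) because I use nested for loops. I know there is much better and faster code than what I wrote, but my question is if the break statement I made inside the if-statement will make my code (a bit) faster? It should make it faster, because the program knows "hey, we found a duplicate and we can stop searching for more". I once heard from a fellow student that the code is better / more stable if I avoid statements like return or break. Too bad I didn't care enough at the time to ask why. Maybe you can tell me whether this is true?

And if he is right and these statements "hurt" my code, are there better workarounds?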

public class FindDuplicate{
    public static void main(String[] args){
        int[] A={1,2,3,4,5,6,7,8,4};
        boolean bool=false;
        for(int i=0; i<A.length; i++){
            for(int j=0; j<A.length; j++){
                if(A[i]==A[j] && i!=j){
                    bool=true;
                    break;
                }
            }
        }
        if(bool==true){
            System.out.print("Duplicate found");
        }else{
            System.out.print("No duplicate found");
        }
    }
}

Answer 1

my question is if the break statement I made inside the if-statement will make my code (a bit) faster?

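Not in all cases, but in most cases it does make your code faster, since you don't have to keep iterating once you've found what you're looking for.

The algorithm below contains two nested loops. The outer loop iterates over all N items of the array, so it takes O(N) steps. For each pass of the outer loop, the inner loop also iterates over the N items, so it also takes O(N) steps. Because one loop is nested inside the other, the combined cost is O(N × N) = O(N²).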

for(int i = 0; i < A.length; i++){
    for(int j=0; j < A.length; j++){
       if(A[i] == A[j] && i != j){
           bool = true;
           break;
       }
    }
}
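To put a number on "a bit faster", here is a minimal sketch that counts how often the inner-loop condition is evaluated with and without the break. The class name `ComparisonCount`, the method `countComparisons` and the `stopEarly` flag are my own names for illustration, not part of your code.

```java
class ComparisonCount {
    // Counts how often the inner-loop condition is evaluated.
    // stopEarly switches the break on or off so both variants can be compared.
    static int countComparisons(int[] a, boolean stopEarly) {
        int comparisons = 0;
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < a.length; j++) {
                comparisons++;
                if (a[i] == a[j] && i != j && stopEarly) {
                    break; // leaves only the inner loop
                }
            }
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4, 5, 6, 7, 8, 4};
        System.out.println(countComparisons(a, false)); // prints 81
        System.out.println(countComparisons(a, true));  // prints 76
    }
}
```

For your array the break saves only 5 of 81 comparisons, because it skips the rest of one inner pass while the outer loop keeps going.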

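We can make your algorithm a bit faster by not restarting at j = 0 on each iteration of the outer loop. Starting at j = i + 1 compares each pair only once, which roughly halves the work, although the worst case is still O(N²).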

for(int i = 0; i < A.length; i++){
  for(int j = i+1; j < A.length; j++){
     if(A[i] == A[j]){
         bool = true;
         break;
     }
  }
}

Note that in this case we don't need the && i != j check, because i and j can never be equal.
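One thing to keep in mind with both versions above: an unlabeled break in Java only leaves the innermost loop, so the outer loop keeps running after a duplicate has been found. If you want to stop searching completely, a labeled break is one option. A minimal sketch, where the label `outer` and the class name `LabeledBreakDemo` are just names I picked:

```java
class LabeledBreakDemo {
    static boolean hasDuplicate(int[] a) {
        boolean found = false;
        outer: // label on the outer loop
        for (int i = 0; i < a.length; i++) {
            for (int j = i + 1; j < a.length; j++) {
                if (a[i] == a[j]) {
                    found = true;
                    break outer; // leaves both loops at once
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(hasDuplicate(new int[]{1, 2, 3, 4, 5, 6, 7, 8, 4})); // prints true
        System.out.println(hasDuplicate(new int[]{1, 2, 3}));                   // prints false
    }
}
```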

I once heard from a fellow student that the code is better / more stable if I avoid statements like return or break

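The JVM specification does not say that using break carries any performance penalty. In short, there is no evidence (none that I know of, anyway) that using break or return makes your code unstable. The only case where I would say "oh this is not a good practice" is when you overuse break. In many cases, though, break is the only way to finish a task sooner, as in your current solution. Basically, why keep iterating once you've found what you want? I don't consider return "a bad practice" either: as with break, there is no reason to keep executing code when you don't need to, and stopping early certainly makes your code faster.

Can we make the duplicate-finding algorithm faster?

Sure. Consider the Set interface in Java, which doesn't allow duplicates. HashSet, a general-purpose Set implementation, is backed by a hash table, so insertion takes O(1) time on average. Using a HashSet we can find duplicates in O(n) time. Since a HashSet holds only unique elements, add() leaves the set unchanged and returns false when you try to add a duplicate.

Solution: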
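As a quick check of that add() behavior, here is a small sketch that records what add() returns for each value. The class `AddReturnValueDemo` and the helper `addResults` are hypothetical names for this illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class AddReturnValueDemo {
    // Records what add() returns for each value, in order.
    static List<Boolean> addResults(int... values) {
        Set<Integer> seen = new HashSet<>();
        List<Boolean> results = new ArrayList<>();
        for (int value : values) {
            results.add(seen.add(value)); // false means the value was already in the set
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(addResults(1, 4, 4)); // prints [true, true, false]
    }
}
```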

import java.util.HashSet;
import java.util.Set;

public static boolean hasDuplicate(int[] array) {
    Set<Integer> seen = new HashSet<Integer>();
    for (int value : array) {
        // add() returns false if the value is already in the set
        if (!seen.add(value)) {
            return true; // we have found a duplicate
        }
    }
    return false; // no duplicate
}

Answer 2

You don't actually need the bool flag variable, and you don't need break either. return stops the iteration, and if no duplicate is found the method returns false:

private static boolean findDuplicateOriginal(int[] A) {
    for(int i=0; i<A.length; i++){
        for(int j=0; j<A.length; j++){
            if(A[i]==A[j] && i!=j){
                return true;
            }
        }
    }
    return false;
}

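Just pointing out that performance shouldn't be your only goal when coding. You should care about maintainability and about writing less/cleaner code as much as you care about performance. Which way you choose depends on the context of your application (how often the function is called, how many iterations it has to perform, whether it will run using parallelStream, ...).

There are plenty of posts discussing loop performance versus stream performance, with opinions for and against: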

https://blog.jooq.org/2015/12/08/3-reasons-why-you-shouldnt-replace-your-for-loops-by-stream-foreach/
Java 8 Iterable.forEach() vs foreach loop
http://endoflineblog.com/benchmarking-java8-streams

I just wanted to show you how clean it is to achieve the same thing (in 1 line!) with Java 8 syntax:

import java.util.Arrays;

public class test {

public static void main(String[] args) {
    int[] A = {1,2,3,4,5,6,7,8,9};
    System.out.println(Arrays.toString(A) + " using findDuplicate >> " + findDuplicate(A));
    System.out.println(Arrays.toString(A) + " using findDuplicateOriginal >>" + findDuplicateOriginal(A));

    int[] B = {1,1,3,4,5,6,7,8,4};
    System.out.println(Arrays.toString(B) + " using findDuplicate >> " + findDuplicate(B));
    System.out.println(Arrays.toString(B) + " using findDuplicateOriginal >> " + findDuplicateOriginal(B));
}

// using streams: distinct() drops repeated values, so a smaller count means a duplicate existed
private static boolean findDuplicate(int[] items) {
    return Arrays.stream(items).distinct().count() != items.length;
}

// refactored original version
private static boolean findDuplicateOriginal(int[] A) {
    for(int i=0; i<A.length; i++){
        for(int j=0; j<A.length; j++){
            if(A[i]==A[j] && i!=j){
                return true;
            }
        }
    }
    return false;
}
}

Output:

[1, 2, 3, 4, 5, 6, 7, 8, 9] using findDuplicate >> false
[1, 2, 3, 4, 5, 6, 7, 8, 9] using findDuplicateOriginal >>false
[1, 1, 3, 4, 5, 6, 7, 8, 4] using findDuplicate >> true
[1, 1, 3, 4, 5, 6, 7, 8, 4] using findDuplicateOriginal >> true
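Since the question was about stopping early: the distinct().count() version always walks the whole array, even if the first two elements are equal. If that matters, a stream can also stop at the first duplicate. This is only a sketch, and it assumes a sequential stream, because the lambda mutates a plain HashSet; `findDuplicateShortCircuit` and `ShortCircuitStreamDemo` are names I made up for it.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class ShortCircuitStreamDemo {
    static boolean findDuplicateShortCircuit(int[] items) {
        Set<Integer> seen = new HashSet<>();
        // anyMatch stops at the first element for which add() returns false.
        // Assumption: sequential stream only, the HashSet is not thread-safe.
        return Arrays.stream(items).boxed().anyMatch(x -> !seen.add(x));
    }

    public static void main(String[] args) {
        System.out.println(findDuplicateShortCircuit(new int[]{1, 2, 3, 4, 5, 6, 7, 8, 9})); // prints false
        System.out.println(findDuplicateShortCircuit(new int[]{1, 1, 3, 4, 5, 6, 7, 8, 4})); // prints true
    }
}
```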
