Calculating hashes in parallel with multiple threads and adding the outputs to an ArrayList<String>

I wrote the following code to calculate the hash of a String (based on SHA-256) and then insert all of the outputs into an ArrayList<String>:

        ArrayList<Thread> threadList = new ArrayList<Thread>();
        ArrayList<String> threadListStr = new ArrayList<String>();
        int threadNumber = 100;
        for (int i = 0; i < threadNumber; i++) {
            String tId = String.valueOf(i);
            Thread thr = new Thread(() -> {
                threadListStr.add(calculateHash(tId));
            });
            threadList.add(thr);
        }

        // START the threads
        for (int i = 0; i < threadNumber; i++) {
            threadList.get(i).start();
        }
        // STOP the threads
        for (int i = 0; i < threadNumber; i++) {
            threadList.get(i).interrupt();
        }

        System.out.println("Size of ArrayList<String> is: " + threadListStr.size());
        System.out.println("Size of ArrayList<Thread> is: " + threadList.size());
        
        /////////////////////
        
        public static String calculateHash(String tId) {
            String tIdStr = org.apache.commons.codec.digest.DigestUtils.sha256Hex(tId);
            return tIdStr;
        }

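However, the ArrayList<String> does not end up complete. As you can see from running the code 5 times, the ArrayList<String> has a different size each time (whereas ArrayList<Thread> threadList is always complete, since the number of threads is 100).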

//1st run
Size of ArrayList<String> is: 60
Size of ArrayList<Thread> is: 100

//2nd run
Size of ArrayList<String> is: 30
Size of ArrayList<Thread> is: 100

//3rd run
Size of ArrayList<String> is: 10
Size of ArrayList<Thread> is: 100

//4th run
Size of ArrayList<String> is: 61
Size of ArrayList<Thread> is: 100

//5th run
Size of ArrayList<String> is: 69
Size of ArrayList<Thread> is: 100

How should the code be modified so that the ArrayList<String> stores all of the outputs?

EDIT: I changed the code as follows, but the output is the same.

        ArrayList<Thread> threadList = new ArrayList<Thread>();
        //ArrayList<String> threadListStr = new ArrayList<String>();
        List<String> threadListStrSync = Collections.synchronizedList(new ArrayList<>());
        int threadNumber = 100;
        for (int i = 0; i < threadNumber; i++) {
            String tId = String.valueOf(i);
            Thread thr = new Thread(() -> {
                threadListStrSync.add(calculateHash(tId));
            });
            threadList.add(thr);
        }

        // START the threads
        for (int i = 0; i < threadNumber; i++) {
            threadList.get(i).start();
        }
        // STOP the threads
        for (int i = 0; i < threadNumber; i++) {
            threadList.get(i).interrupt();
        }

        System.out.println("Size of ArrayList<String> is: " + threadListStrSync.size());
        System.out.println("Size of ArrayList<Thread> is: " + threadList.size());

Note: I also commented out interrupt(), but the output is still the same.

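There are multiple problems:

  1. Use a thread-safe collection or synchronize access manually. A simple option is to wrap your list with Collections.synchronizedList().
  2. There is no need for interrupt(); a thread terminates anyway when it reaches the end of its run() method.
  3. You need to wait for all threads to terminate before printing the result. To do that, call join() instead of interrupt().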
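The three points above can be put together in a minimal, self-contained sketch. The class name ParallelHashDemo and the method collectHashes are hypothetical names chosen for illustration, and calculateHash here is a stand-in built on the JDK's MessageDigest rather than the Apache Commons Codec call from the question, so that the example runs without extra dependencies.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ParallelHashDemo {

    // Stand-in for the question's calculateHash(): SHA-256 as lowercase hex,
    // computed with the JDK instead of Apache Commons Codec (an assumption
    // made only to keep this sketch dependency-free).
    static String calculateHash(String input) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] bytes = digest.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(bytes.length * 2);
            for (byte b : bytes) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical helper: starts one thread per id, waits for all of them,
    // and returns the collected hashes.
    static List<String> collectHashes(int threadNumber) {
        // Point 1: the shared list is wrapped so concurrent add() calls are safe.
        List<String> hashes = Collections.synchronizedList(new ArrayList<>());
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < threadNumber; i++) {
            String tId = String.valueOf(i);
            Thread thr = new Thread(() -> hashes.add(calculateHash(tId)));
            threads.add(thr);
            thr.start();
        }
        // Points 2 and 3: no interrupt(); join() blocks until each thread has finished.
        for (Thread thr : threads) {
            try {
                thr.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                throw new IllegalStateException("Interrupted while waiting", e);
            }
        }
        return hashes;
    }

    public static void main(String[] args) {
        List<String> hashes = collectHashes(100);
        System.out.println("Size of List<String> is: " + hashes.size());
    }
}
```

Because every thread is joined before the size is read, each run should report 100 entries, although the order of the hashes in the list can differ between runs.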

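You have two problems: 1) some threads may never get to add their hash ID to the collection before the result is printed, and 2) the collection of hash IDs is being accessed by multiple threads, so you should use a thread-safe collection.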

    ArrayList<Thread> threadList = new ArrayList<Thread>();
    Collection<String> threadListStr = Collections.synchronizedCollection( new ArrayList<String>() );
    int threadNumber = 100;
    for (int i = 0; i < threadNumber; i++) {
        String tId = String.valueOf(i);
        Thread thr = new Thread(() -> {
            threadListStr.add(calculateHash(tId));
        });
        threadList.add(thr);
    }

    // START the threads
    for (int i = 0; i < threadNumber; i++) {
        threadList.get(i).start();
    }
    // WAIT for all threads to finish
    for (int i = 0; i < threadNumber; i++) {
       try {
           threadList.get(i).join();
       } catch( InterruptedException exc ) {
           // restore the interrupt status and stop waiting
           Thread.currentThread().interrupt();
           break;
       }
    }

    System.out.println("Size of ArrayList<String> is: " + threadListStr.size());
    System.out.println("Size of ArrayList<Thread> is: " + threadList.size());
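As a possible alternative to managing Thread objects by hand, the same work could be submitted to an ExecutorService. This is a different technique from the one in the answers above, shown only as a sketch: the class name ExecutorHashDemo, the method collectHashes, and the pool size of 8 are all assumptions, and calculateHash is again a JDK-based stand-in for the question's Apache Commons Codec call. Here invokeAll() waits for every task, and each result is read from its Future, so no shared list is written from several threads.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ExecutorHashDemo {

    // Stand-in for the question's calculateHash(): SHA-256 as lowercase hex.
    static String calculateHash(String input) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] bytes = digest.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(bytes.length * 2);
            for (byte b : bytes) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical helper: hashes the ids 0..taskCount-1 on a fixed thread pool.
    static List<String> collectHashes(int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(8); // pool size is arbitrary
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                String tId = String.valueOf(i);
                tasks.add(() -> calculateHash(tId));
            }
            // invokeAll() blocks until all tasks are done and returns the
            // futures in the same order as the submitted tasks.
            List<String> hashes = new ArrayList<>();
            for (Future<String> future : pool.invokeAll(tasks)) {
                hashes.add(future.get());
            }
            return hashes;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IllegalStateException("Interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A hashing task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> hashes = collectHashes(100);
        System.out.println("Size of List<String> is: " + hashes.size());
    }
}
```

One side effect of this design is that the results come back in submission order, which the plain-thread version does not guarantee.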