How to answer follow-up questions to the "find duplicates in an array" problem?

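I was in a technical interview and was given the "find duplicates in an array" question. I solved it without trouble in O(n) time using a hash table, and then I was hit with a series of follow-up questions.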

Orig: Determine if an array contains duplicate entries.

  F1: Now what if the array is very large and has to be distributed across multiple machines?

  F2: What if the network connection between these machines is prone to failure?

  F3: What if the hardware itself is not 100% reliable and may occasionally give wrong answers?

  F4: Design a system in which multiple simultaneous users can update this array while you maintain the uniqueness of its entries.

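I got as far as F1 and said that using one giant hash table would be unwise, and that we could accept O(n²) running time in exchange for O(1) extra memory, but I'm not sure about the rest. Any help?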
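
For reference, the two single-machine approaches mentioned above could look roughly like this. It is only a sketch in Python; the function names are made up for illustration and are not from the interview.

```python
def has_duplicates_hash(values):
    """O(n) expected time, O(n) extra memory: remember every value seen so far."""
    seen = set()
    for v in values:
        if v in seen:
            return True
        seen.add(v)
    return False


def has_duplicates_pairwise(values):
    """O(n^2) time, O(1) extra memory: compare every pair of positions."""
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            if values[i] == values[j]:
                return True
    return False
```

Both functions return the same answer for the same input; they differ only in the time and memory trade-off.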

F2: Replicate the data across different machines, either all of it or only part of it.
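
One way to picture the replication idea: a reader that tries each machine holding a copy of a chunk until one responds. This is a hedged sketch, not a real networking layer. `ReplicaUnavailable`, `read_chunk`, and the notion of a replica as a plain callable are all assumptions made for the example.

```python
class ReplicaUnavailable(Exception):
    """Raised by a replica that cannot be reached (simulated in this sketch)."""


def read_chunk(replicas, chunk_id):
    """Return the chunk from the first replica that answers.

    `replicas` is a list of callables taking a chunk id; each one stands in
    for a remote machine holding a copy of that chunk (hypothetical interface).
    """
    last_error = None
    for fetch in replicas:
        try:
            return fetch(chunk_id)
        except ReplicaUnavailable as err:
            last_error = err  # this copy is unreachable, try the next one
    raise ReplicaUnavailable(f"no replica could serve chunk {chunk_id}") from last_error
```

The more replicas hold a given chunk, the more link failures the read can survive, at the cost of extra storage.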

F3: Use checksums when transferring data between machines.
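
A minimal sketch of the checksum idea, assuming SHA-256 from Python's standard `hashlib` as the checksum (any checksum or hash would do; the choice here is illustrative). The helper names `pack` and `verify` are hypothetical.

```python
import hashlib


def pack(payload):
    """Sender side: return the payload together with its SHA-256 digest."""
    return payload, hashlib.sha256(payload).hexdigest()


def verify(payload, digest):
    """Receiver side: recompute the digest and compare it with the one sent."""
    return hashlib.sha256(payload).hexdigest() == digest
```

If `verify` returns False, the receiver can ask for the data again or fetch it from another copy. Note that this only detects data corrupted in transit or storage; it does not by itself catch a machine that computes a wrong result from correct data.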

F4: Use some form of synchronization (such as a semaphore) to ensure that updates do not happen at the same time.
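
As a rough illustration of the synchronization point, here is a single-process sketch that uses a `threading.Lock` (a mutex standing in for the semaphore mentioned above) so that the "check, then insert" step is atomic. The class name `UniqueStore` is invented for the example, and a real multi-machine system would need a distributed equivalent of the lock.

```python
import threading


class UniqueStore:
    """Set-backed store that rejects duplicate entries under concurrent updates."""

    def __init__(self):
        self._items = set()
        self._lock = threading.Lock()

    def add(self, value):
        """Insert value; return True if it was new, False if already present."""
        with self._lock:  # only one thread may check and insert at a time
            if value in self._items:
                return False
            self._items.add(value)
            return True

    def __len__(self):
        with self._lock:
            return len(self._items)
```

Without the lock, two users could both pass the membership check before either inserts, which is exactly the race the synchronization is meant to prevent.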