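Return an array containing, for each element of a given array, the number of elements in another array that are less than or equal to it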

Question

I came across this problem and am wondering whether it can be solved with a better complexity.

For example:

Array a = [1,4,2,4]
Array b = [3,5]

Expected output ==> [2, 4]

Edit: another example

Array a = [1,4,2,4]
Array b = [3, 1000000]

Expected output ==> [2,4]
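
To make the expected outputs above easy to check, here is a minimal brute-force sketch. It is not one of the approaches from the question, and the name `counts_bruteforce` is made up for illustration. It compares every query in b against every element of a, so it runs in O(n * b) time and is useful only as a reference for testing faster versions.

```cpp
#include <vector>

// Hypothetical reference helper (not from the question): for each query q in b,
// count the elements of a that are <= q. Runs in O(|a| * |b|) time.
std::vector<int> counts_bruteforce(const std::vector<int>& a, const std::vector<int>& b)
{
    std::vector<int> result;
    result.reserve(b.size());
    for (int q : b) {
        int count = 0;
        for (int x : a) {
            if (x <= q) ++count; // x contributes to the answer for q
        }
        result.push_back(count);
    }
    return result;
}
```

Calling `counts_bruteforce({1,4,2,4}, {3,5})` yields the two counts 2 and 4, matching the first example.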

So far, the solutions I have found and tried run in O(nlogn) + O(blogn) and O(n).

O(nlogn) + O(blogn) approach:

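#include <algorithm>
#include <vector>
using namespace std;

// Returns the number of elements in the sorted range arr[l..r] that are <= target.
int binarysearch(const vector<int>& arr, int l, int r, int target)
{
    while (l <= r)
    {
        int mid = l + (r - l) / 2; // avoids overflow of (r + l)

        if (arr[mid] > target)
            r = mid - 1;
        else
            l = mid + 1;
    }

    return r + 1; // r is the index of the last element <= target, so the count is r + 1
}

vector<int> counts(vector<int> a, vector<int> b)
{
    vector<int> result;
    sort(a.begin(), a.end()); // O(nlogn)

    for (auto i : b) {
        // one O(log n) search per element of b
        int count = binarysearch(a, 0, static_cast<int>(a.size()) - 1, i);
        result.push_back(count);
    }
    return result;
}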
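
The sort-then-search idea above can also be written with the standard library's `std::upper_bound`, which returns an iterator to the first element greater than the query. The sketch below is one possible way to do that, not the asker's code; the name `counts_sorted` is an assumption made for illustration.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical variant of the sort + binary search approach.
// a is taken by value because it is sorted locally.
std::vector<int> counts_sorted(std::vector<int> a, const std::vector<int>& b)
{
    std::sort(a.begin(), a.end()); // O(n log n)

    std::vector<int> result;
    result.reserve(b.size());
    for (int q : b) {
        // First position whose element is > q; everything before it is <= q.
        auto it = std::upper_bound(a.begin(), a.end(), q); // O(log n)
        result.push_back(static_cast<int>(it - a.begin()));
    }
    return result;
}
```

Because the distance from `a.begin()` is already a count, there is no off-by-one adjustment to get wrong, and an empty a or a query smaller than every element simply produces 0.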

O(n) approach:

vector<int> counts(vector<int> a, vector<int> b)
{
    vector<int> result;
    int maxi = *max_element(b.begin(), b.end()) + 1;
    int mymap[maxi] = {0};

    for(auto i : a) mymap[i]++;

    for(int i = 1; i < maxi; i++){
        mymap[i] = mymap[i] + mymap[i-1];
    }

    for(auto i : b){
        result.push_back(mymap[i]);
    }
    return result;
}

Answer 1

[I am] wondering if there could be a better complexity to solve the problem.

Time complexity.

O(n) approach:

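No, there is no solution with less than linear time complexity, since every input element has to be read at least once.

That said, your linear solution is incorrect. If the input array contains the value 1000000 or greater, or a negative number, then you access mymap outside its bounds and the behaviour is undefined. Furthermore, i <= 1000000 also accesses mymap out of bounds on the last iteration. Finally, int[1000000] is far too big for a local variable. On some systems, even one such variable can cause a stack overflow.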
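
As a rough illustration of the points above, the sketch below keeps the counting idea but stores the table in a `std::vector` (heap memory rather than the stack), skips elements larger than the largest query, and shifts indices by the smallest element so that negative values are handled. The name `counts_bounded` and the offset trick are assumptions added here, not part of the question. Time and memory still grow with the value range max(b) - min(a), so a query such as 1000000 still allocates about a million counters.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical bounds-safe counting sketch.
// Assumes max(b) - min(a) is small enough to allocate that many counters.
std::vector<std::size_t> counts_bounded(const std::vector<int>& a, const std::vector<int>& b)
{
    std::vector<std::size_t> result(b.size(), 0);
    if (a.empty() || b.empty()) return result;

    const long long lo = *std::min_element(a.begin(), a.end()); // smallest element
    const long long hi = *std::max_element(b.begin(), b.end()); // largest query
    if (hi < lo) return result; // every query is below every element

    // After the prefix-sum pass, prefix[v - lo] is the number of elements of a that are <= v.
    std::vector<std::size_t> prefix(static_cast<std::size_t>(hi - lo + 1), 0);
    for (int x : a) {
        if (x <= hi) ++prefix[static_cast<std::size_t>(x - lo)]; // larger values never matter
    }
    for (std::size_t i = 1; i < prefix.size(); ++i) prefix[i] += prefix[i - 1];

    for (std::size_t k = 0; k < b.size(); ++k) {
        // Queries below lo keep their initial count of 0.
        if (b[k] >= lo) result[k] = prefix[static_cast<std::size_t>(b[k] - lo)];
    }
    return result;
}
```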

Answer 2

There is no better way to do this than O(n).

So this is also O(n), but written in STL style:

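#include <algorithm>
#include <cstddef>
#include <vector>

// Assumes all values are non-negative integers, because they are used as indices.
template <class Iter1, class Iter2>
std::vector<std::size_t> counts(const Iter1 beg_a, const Iter1 end_a, const Iter2 beg_b, const Iter2 end_b)
{
    std::vector<std::size_t> result;
    if (beg_b == end_b)
    {
        return result; // no queries; also avoids dereferencing the end iterator below
    }
    const auto max = *std::max_element(beg_b, end_b);
    // mymap[v] first counts the elements of a equal to v, for 0 <= v <= max
    std::vector<std::size_t> mymap(max + 1, 0);
    for (auto iter = beg_a; iter != end_a; iter++)
    {
        if (*iter <= max) // larger values can never be <= any query
        {
            mymap[*iter]++;
        }
    }
    // prefix sums: mymap[v] becomes the number of elements of a that are <= v
    for (std::size_t i = 1; i < mymap.size(); i++)
    {
        mymap[i] = mymap[i] + mymap[i - 1];
    }
    for (auto iter = beg_b; iter != end_b; iter++)
    {
        result.push_back(mymap[*iter]);
    }
    return result;
}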

Answer 3

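OK, it turns out there is a faster way to compute the map index. For example, given a = {1,4,2,4,5,8,80} and b = {3,1000000}, the expected output would be [2,7].

With my previous approach, I would need to compute mymap[4], mymap[5] .. mymap[9999] .. mymap[1000000]. That is why the program crashed and returned a runtime error.

The way we handle this is to use for(auto& entry : mymap) to visit only the entries that actually exist in the dictionary/map. Then we use the C++ STL upper_bound to find the correct entry for each query.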

    vector<int> counts(vector<int> nums, vector<int> maxes){
        vector<int> result;

        map<int,unsigned int> mymap;
        for(auto i : nums) mymap[i]++;

        // turn per-value counts into running totals; this visits only the
        // distinct values, so it doesn't need to run 1000000 times
        unsigned int temp = 0;
        for(auto& entry: mymap){
            entry.second = entry.second + temp;
            temp = entry.second;
        }

        for(auto i : maxes){
            auto itl = mymap.upper_bound(i); // first key greater than i
            if(itl == mymap.begin())
                result.push_back(0); // no element is <= i, so there is nothing to step back to
            else
                result.push_back((--itl)->second); // running total at the largest key <= i
        }

        return result;
    }

Answer 4

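First, let's adopt a better notation: let A be the number of elements in a, B the number of elements in b, and assume the elements are M-bit values. As others have said, your solution takes A log(A) to build the map plus B log(A) to obtain the returned values.

Using https://en.wikipedia.org/wiki/Y-fast_trie you can get (A+B) log M, which is faster if A >> M (but is probably slower in practice for most use cases).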
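
For comparison, the running times discussed in this thread can be put side by side in the notation above. The symbol V is an assumption introduced here for the size of the counting table (the largest query plus one). The counting-array bound includes the prefix-sum pass over all V entries, so it is linear in the input only when V is O(A + B). The y-fast trie bound is the one claimed in this answer.

```latex
\begin{aligned}
T_{\text{sort + binary search}} &= O(A \log A + B \log A) \\
T_{\text{counting array}} &= O(A + B + V), \qquad V = \max(b) + 1 \\
T_{\text{y-fast trie}} &= O\big((A + B) \log M\big)
\end{aligned}
```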

