Time complexity of program determining if two strings are permutations of each other

Question

I wrote a program to determine whether two strings are permutations of each other. I am trying to do this with a hash table. Here is my code:

#include <iostream>
#include <string>
#include <unordered_map>

using namespace std;

bool permutation(string word1, string word2) {

    unordered_map<char, int> myMap1;
    unordered_map<char, int> myMap2;
    int count1 = 0;
    int count2 = 0;

    if (word1.length() == word2.length()) {
        for (int i = 0; i < word1.length(); i++) {
            count1++;
            count2++;
            for (int j = 0; j < word1.length(); j++) {
                if (word1[i] == word1[j] && myMap1.find(word1[i]) == myMap1.end()) {
                    count1++;
                }
                if (word2[i] == word2[j] && myMap2.find(word1[i]) == myMap2.end()) {
                    count2++;
                }
            }
            myMap1.insert({word1[i], count1});
            myMap2.insert({word2[i], count2});
        }
    }
    else {
        return false;
    }
    return (myMap1.size() == myMap2.size());
}

int main() {

    string word1;
    string word2;
    getline(cin, word1);
    getline(cin, word2);

    bool result = permutation(word1, word2);

    return 0;
}

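I believe the time complexity of the code above is O(n^2). I couldn't think of an algorithm that doesn't involve nested loops. Is there a faster way to do this using a hash table?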

Answer 1

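Yes. Count each character of the first string up, count each character of the second string down, and check that every count ends at zero. That is one pass per string, so O(n) expected time with a hash table.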

#include <climits>
#include <cstddef>
#include <iostream>
#include <string>
#include <unordered_map>

namespace {

bool permutation(const std::string& word1, const std::string& word2) {
  std::unordered_map<char, std::size_t> freqdiff;
  // alternatively, std::size_t freqdiff[UCHAR_MAX + 1] = {};
  for (char c : word1) {
    // alternatively, freqdiff[(unsigned char)c]++;
    freqdiff[c]++;
  }
  for (char c : word2) {
    // alternatively, freqdiff[(unsigned char)c]--;
    freqdiff[c]--;
  }
  for (auto i : freqdiff) {
    // alternatively, i != 0
    if (i.second != 0) {
      return false;
    }
  }
  return true;
}

bool permutation_with_array(const std::string& word1,
                            const std::string& word2) {
  std::size_t freqdiff[UCHAR_MAX + 1] = {};
  for (char c : word1) {
    freqdiff[static_cast<unsigned char>(c)]++;
  }
  for (char c : word2) {
    freqdiff[static_cast<unsigned char>(c)]--;
  }
  for (std::size_t i : freqdiff) {
    if (i != 0) {
      return false;
    }
  }
  return true;
}
}

int main() {
  std::string word1;
  std::string word2;
  std::getline(std::cin, word1);
  std::getline(std::cin, word2);
  std::cout << permutation(word1, word2) << '\n';
  std::cout << permutation_with_array(word1, word2) << '\n';
}
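
As a small variation on the array version above, the sketch below adds an early exit when the two lengths differ and uses a signed counter so that negative differences read naturally. The name `permutation_early_exit` is made up for illustration and is not part of the code above; like the original, it indexes the table through `unsigned char`.

```cpp
#include <array>
#include <climits>
#include <cstddef>
#include <string>

// Hypothetical variant of permutation_with_array: same counting idea,
// plus an early exit when the lengths differ.
bool permutation_early_exit(const std::string& a, const std::string& b) {
  if (a.size() != b.size()) {
    return false;  // strings of different length cannot be permutations
  }
  // One signed counter per possible char value, zero-initialized.
  std::array<long long, UCHAR_MAX + 1> freqdiff{};
  for (char c : a) {
    ++freqdiff[static_cast<unsigned char>(c)];
  }
  for (char c : b) {
    --freqdiff[static_cast<unsigned char>(c)];
  }
  // Any nonzero entry means some character occurs a different number of times.
  for (long long d : freqdiff) {
    if (d != 0) {
      return false;
    }
  }
  return true;
}
```

For example, `permutation_early_exit("listen", "silent")` returns true, while `permutation_early_exit("aab", "abb")` returns false. Both passes are linear in the string length, and the final scan over the table is a constant amount of work.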

Answer 2

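TL;DR: I wanted to test the solutions (including my own). David's map-based solution performs decently (and is more general), his array-based solution performs very well, and my own solution is only slightly faster but somewhat less readable (probably not worth it).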

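Honestly, when I saw this, I couldn't believe that David's answer using an unordered map could have the lowest time complexity. (It may well in theory, but not in practice.)

I usually write C, so I don't know what optimizations C++ provides for these data structures or how they perform in real life. So I decided to test it.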

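I set up some tests on my i7 to measure the performance of the various solutions, with slight tweaks (source code here).

I ran each program 100000 times on 1) two permutations of the same word and 2) two different words.

The results are as follows: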
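
The timings in this answer are reported as real/user/sys, and the actual test source is only linked. Purely as an illustration, an in-process measurement of the same kind of repeated run could be sketched as follows. `time_calls` and `is_permutation_array` are hypothetical names introduced here, and using `std::chrono::steady_clock` is an assumption, not the method used for the numbers in this answer.

```cpp
#include <chrono>
#include <climits>
#include <cstddef>
#include <string>

// Array-based check, restated so this sketch compiles on its own.
bool is_permutation_array(const std::string& a, const std::string& b) {
  std::size_t freqdiff[UCHAR_MAX + 1] = {};
  for (char c : a) freqdiff[static_cast<unsigned char>(c)]++;
  for (char c : b) freqdiff[static_cast<unsigned char>(c)]--;
  for (std::size_t d : freqdiff) {
    if (d != 0) return false;
  }
  return true;
}

// Hypothetical helper: calls check(a, b) `runs` times and returns the
// elapsed wall-clock time in seconds. The number of true results is
// written to `hits` so that the result of every call is used.
template <typename Check>
double time_calls(Check check, const std::string& a, const std::string& b,
                  int runs, int& hits) {
  hits = 0;
  const auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < runs; ++i) {
    if (check(a, b)) ++hits;
  }
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double>(stop - start).count();
}
```

Calling `time_calls(is_permutation_array, word1, word2, 100000, hits)` once with a pair of permutations and once with two different words would mirror the two cases described above. The absolute numbers depend on the compiler, optimization level, and input length, so they are not directly comparable to the results in this answer.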

PERM original
======================
PERMUTATIONS OF SAME WORD
real 104.73
user 104.61
sys 0.06

DIFFERENT WORDS
real 104.24
user 104.16
sys 0.02

PERM David map
======================
PERMUTATIONS OF SAME WORD
real 2.46
user 2.44
sys 0.00

DIFFERENT WORDS
real 2.45
user 2.42
sys 0.02

PERM David array
======================
PERMUTATIONS OF SAME WORD
real 0.15
user 0.14
sys 0.00

DIFFERENT WORDS
real 0.14
user 0.14
sys 0.00

PERM Me
======================
PERMUTATIONS OF SAME WORD
real 0.13
user 0.13
sys 0.00

DIFFERENT WORDS
real 0.14
user 0.12
sys 0.01
