C# Find Like Strings In Array

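Question: I have an array of strings and I am trying to find the closest match to a provided string. I have made a few attempts below and looked at other solutions such as Levenshtein distance, which seems to work only when all the strings are of similar length.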

Expectation: If I use "two are better" as the string to match, it should match "Two are better than one".

Thought: Would it help to split stringToMatch at the spaces and then check whether each part is found in the current array item (arrayOfStrings[i])?

// Test array and string to search
string[] arrayOfStrings = new string[] { "A hot potato", "Two are better than one", "Best of both worlds", "Curiosity killed the cat", "Devil's Advocate", "It takes two to tango", "a twofer" };
string stringToMatch = "two are better";


// Contains attempt (Contains is case-sensitive here, so "two are better" is not found in "Two are better than one")
List<string> likeNames = new List<string>();
for (int i = 0; i < arrayOfStrings.Count(); i++)
{
    if (arrayOfStrings[i].Contains(stringToMatch))
    {
        Console.WriteLine("Hit1");
        likeNames.Add(arrayOfStrings[i]);                    
    }

    if (stringToMatch.Contains(arrayOfStrings[i]))
    {
        Console.WriteLine("Hit2");
        likeNames.Add(arrayOfStrings[i]);
    }
}


// StringComparison attempt (Equals ignores case but only succeeds when the whole string matches)
var matches = arrayOfStrings.Where(s => s.Equals(stringToMatch, StringComparison.InvariantCultureIgnoreCase)).ToList();



// Display matched array items
Console.WriteLine("List likeNames");
likeNames.ForEach(Console.WriteLine);

Console.WriteLine("\n");

Console.WriteLine("var matches");
matches.ForEach(Console.WriteLine);

You can try the code below.

I created a List<string> from your stringToMatch by splitting it on spaces, then selected every string in arrayOfStrings that contains each word in toMatch, ignoring case.

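// Split the search string into words
List<string> toMatch = stringToMatch.Split(' ').ToList();
// Keep only the strings that contain every word, ignoring case
List<string> match = arrayOfStrings
    .Where(x => toMatch.All(ele => x.ToLower().Contains(ele.ToLower())))
    .ToList();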
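
To try this filter on the sample data, it can be wrapped in a small helper. This is only a minimal sketch: the class name LikeStringFilter and the method ContainsAllWords are hypothetical names chosen for illustration, and it uses IndexOf with StringComparison.OrdinalIgnoreCase instead of ToLower, which is an assumption about how case should be handled.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LikeStringFilter
{
    // Hypothetical helper: returns the candidates that contain every
    // space-separated word of the query, ignoring case.
    public static List<string> ContainsAllWords(IEnumerable<string> candidates, string query)
    {
        // Drop empty entries so repeated spaces do not produce empty "words"
        string[] words = query.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        return candidates
            .Where(c => words.All(w => c.IndexOf(w, StringComparison.OrdinalIgnoreCase) >= 0))
            .ToList();
    }
}
```

Called with the array from the question and "two are better", this should return only "Two are better than one". "It takes two to tango" and "a twofer" contain "two" but not "are" or "better", so they are filtered out.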

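For your implementation, I split stringToMatch and then counted how many of its parts each string contains.

The code below gives you a list ordered by match count, with the highest count first.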

string[] arrayOfStrings = new string[] { "A hot potato", "Two are better than one", "Best of both worlds", "Curiosity killed the cat", "Devil's Advocate", "It takes two to tango", "a twofer" };
string stringToMatch = "two are better";

// Split the search string once, outside the query
string[] words = stringToMatch.Split(' ');

// Count how many words each candidate contains (ignoring case, so "two" also matches "Two"),
// then order the candidates by that count, best match first
var matches = arrayOfStrings
    .Select(s => new
    {
        count = words.Count(w => s.IndexOf(w, StringComparison.OrdinalIgnoreCase) >= 0),
        s
    })
    .OrderByDescending(d => d.count);

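I used a very simple string comparison to check for matches. The algorithm can vary depending on your specific requirements (for example, whether the order of the matched words matters).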
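
If the order of the words should matter, one possible variation is to search for each word starting from where the previous one ended. The sketch below is an assumption about what "order" means here, not part of the original answer. The names OrderedWordMatch and ContainsWordsInOrder are hypothetical, and matching is still by substring, not by whole word.

```csharp
using System;

public static class OrderedWordMatch
{
    // Hypothetical helper: true if every word of the query occurs in the
    // candidate, ignoring case, in the same left-to-right order.
    public static bool ContainsWordsInOrder(string candidate, string query)
    {
        int position = 0; // where the search for the next word starts

        foreach (string word in query.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
        {
            int found = candidate.IndexOf(word, position, StringComparison.OrdinalIgnoreCase);
            if (found < 0)
                return false; // word missing, or only present before the previous word

            position = found + word.Length; // continue after this word
        }

        return true;
    }
}
```

It could be combined with the query above, for example arrayOfStrings.Where(s => OrderedWordMatch.ContainsWordsInOrder(s, stringToMatch)), so that "better are two" would no longer match "Two are better than one".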