Java: Compare Strings with keywords in different order

I have two strings, as shown below:

String str1 = "[0.7419,0.7710,0.2487]";
String str2 = "[\"0.7710\",\"0.7419\",\"0.2487\"]";

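I want to compare them. Although the order differs, they should be treated as equal...

What is the fastest and simplest way to do this?

Should I split each string into an array and compare the two arrays? Or not? I figured I had to remove the "[", "]" and "\"" characters to make things cleaner, so I did that. I also replaced "," with "", but I don't know whether that helps...

Thanks in advance :)

Edit: my strings won't always be a set of doubles or floats. They may also be actual words or a set of characters.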

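Here is a very simple solution for you, using HashSet.

Advantages of a Set:

  • It cannot contain duplicates.
  • Insertion and deletion of elements are O(1) on average.
  • Membership checks are much faster than scanning an Array. Element order doesn't matter here either, so losing it is fine.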

    String str1 = "[0.7419,0.7710,0.2487]";
    String str2 = "[\"0.7710\",\"0.7419\",\"0.2487\"]";
    
    Set<String> set1 = new HashSet<>();
    Set<String> set2 = new HashSet<>();
    
    String[] split1 = str1.replace("[", "").replace("]", "").split(",");
    String[] split2 = str2.replace("[", "").replace("]", "").replace("\"", "").split(",");
    set1.addAll(Arrays.asList(split1));
    set2.addAll(Arrays.asList(split2));
    
    System.out.println("set1: "+set1);
    System.out.println("set2: "+set2);
    
    boolean isEqual = false;
    if(set1.size() == set2.size()){
        set1.removeAll(set2);
        if(set1.size() ==0){
            isEqual = true;
        }
    }
    
    System.out.println("str1 and str2 "+( isEqual ? "Equal" : "Not Equal") );
    

Output:

set1: [0.7710, 0.2487, 0.7419]
set2: [0.7710, 0.2487, 0.7419]
str1 and str2 Equal
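
The size check plus removeAll above can probably be folded into a single Set.equals call, which already compares size and membership and leaves both sets unmodified. A minimal sketch, assuming no element contains a comma, bracket, or double quote of its own; the class name SetCompare and the helper toSet are made up for illustration:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class SetCompare {
    // Hypothetical helper: strips brackets and double quotes, then splits on commas.
    // Assumes no element contains ',', '[', ']' or '"' itself.
    static Set<String> toSet(String s) {
        String body = s.replace("[", "").replace("]", "").replace("\"", "");
        return new HashSet<>(Arrays.asList(body.split(",")));
    }

    static boolean sameElements(String a, String b) {
        // Set.equals checks size and membership without mutating either set.
        return toSet(a).equals(toSet(b));
    }

    public static void main(String[] args) {
        String str1 = "[0.7419,0.7710,0.2487]";
        String str2 = "[\"0.7710\",\"0.7419\",\"0.2487\"]";
        System.out.println(sameElements(str1, str2)); // true
    }
}
```

Because nothing is removed from either set, both can still be printed or reused after the comparison.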

Like this:

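    // Strip one leading '[' and one trailing ']' (the brackets must be escaped in the regex)
    String[] a1 = str1.replaceAll("^\\[|\\]$", "").split(",", -1);
    String[] a2 = str2.replaceAll("^\\[|\\]$", "").split(",", -1);
    for (int i = 0; i < a2.length; i++)
        a2[i] = a2[i].replaceAll("^\"|\"$", ""); // drop the surrounding double quotes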
    Arrays.sort(a1);
    Arrays.sort(a2);
    boolean stringsAreEqual = Arrays.equals(a1, a2);

Or you can use a fully functional approach (probably slightly less efficient):

    boolean stringsAreEqual = Arrays.equals(
            Arrays.stream(str1.replaceAll("^\\[|\\]$", "").split(",", -1))
                    .sorted()
                    .toArray(),
            Arrays.stream(str2.replaceAll("^\\[|\\]$", "").split(",", -1))
                    .map(s -> s.replaceAll("^\"|\"$", ""))
                    .sorted()
                    .toArray()
    );

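The advantage of using arrays over sets (as others have suggested) is that arrays typically use less memory and can hold duplicates. If your problem domain allows duplicate elements within a string, you cannot use sets.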
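
To make the duplicate case concrete, here is a small sketch that compares the same two inputs both ways. The inputs "[x,x,y]" and "[x,y,y]" and the class and method names are invented for illustration, and the parsing assumes unquoted elements without embedded commas:

```java
import java.util.Arrays;
import java.util.HashSet;

final class DuplicateDemo {
    // Hypothetical helper: strip one leading '[' and one trailing ']', then split on commas.
    static String[] parse(String s) {
        return s.replaceAll("^\\[|\\]$", "").split(",", -1);
    }

    // Sorted arrays keep every occurrence, so element counts must match.
    static boolean equalAsSortedArrays(String a, String b) {
        String[] x = parse(a);
        String[] y = parse(b);
        Arrays.sort(x);
        Arrays.sort(y);
        return Arrays.equals(x, y);
    }

    // Sets collapse duplicates, so only the distinct elements are compared.
    static boolean equalAsSets(String a, String b) {
        return new HashSet<>(Arrays.asList(parse(a)))
                .equals(new HashSet<>(Arrays.asList(parse(b))));
    }

    public static void main(String[] args) {
        String a = "[x,x,y]";
        String b = "[x,y,y]";
        System.out.println(equalAsSets(a, b));         // true: duplicates collapse
        System.out.println(equalAsSortedArrays(a, b)); // false: counts differ
    }
}
```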

Since you have mixed result types, you first need to treat this as mixed input.

Here is how I would approach it instead, especially for longer strings.

private Stream<String> parseStream(String in) {
    //we'll skip regex for now and can simply hard-fail bad input later
    //you can also do some sanity checks outside this method
    return Arrays.stream(in.substring(1, in.length() - 1).split(",")) //remove braces
        .map(s -> !s.startsWith("\"") ? s : s.substring(1, s.length() - 1)); //remove quotes
}

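Next, we have a stream of strings that needs to be parsed into either primitives or strings (I'm assuming we don't have some strange form of object serialization):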

private Object parse(String in) {
    //attempt to parse as number first. Any number can be parsed as a double/long
    try {
        return in.contains(".") ? Double.parseDouble(in) : Long.parseLong(in);
    } catch (NumberFormatException ex) {
        //it's not a number, so it's either a boolean or a plain string
        //Boolean.parseBoolean returns false for anything that isn't "true", so check the text explicitly
        if (in.equalsIgnoreCase("true") || in.equalsIgnoreCase("false")) {
            return Boolean.parseBoolean(in);
        }
        return in; //anything else stays a string
    }
}

Using this, we can turn the mixed stream into a mixed set:

Set<Object> objs = this.parseStream(str1).map(this::parse).collect(Collectors.toSet());
Set<Object> comp = this.parseStream(str2).map(this::parse).collect(Collectors.toSet());
//we're using sets, keep in mind the nature of different collections and how they compare their elements here
if (objs.equals(comp)) {
    //we have a matching set
}
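
The comment about how collections compare their elements matters once values are parsed into boxed types. A short sketch of the two effects I would expect to matter most; the class MixedSetCaveat and its method names are hypothetical, and only standard JDK calls are used:

```java
import java.util.Collections;

final class MixedSetCaveat {
    // Parsed doubles compare by numeric value, so "0.7710" and "0.771" become equal.
    static boolean sameAsDouble(String x, String y) {
        return Double.valueOf(x).equals(Double.valueOf(y));
    }

    // Single-element sets are equal only if the elements' equals() agrees.
    static boolean singletonSetsMatch(Object x, Object y) {
        return Collections.singleton(x).equals(Collections.singleton(y));
    }

    public static void main(String[] args) {
        System.out.println(sameAsDouble("0.7710", "0.771")); // true

        // A Long and a Double are never equal to each other, even for the same number.
        System.out.println(singletonSetsMatch(Long.valueOf(1L), Double.valueOf(1.0))); // false
    }
}
```

So with a parser like the one above, "1" and "1.0" end up as different elements, while "0.7710" and "0.771" end up as the same one.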

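Finally, one example of a sanity check is making sure the input string has the proper brackets, and so on. Regardless of what others say, I learned set notation as {a, b, ...c} and series/list notation as [a, b, ...c], and the two compare differently here.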
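
One possible shape for such a sanity check, as a sketch only: the class InputCheck, the method stripBrackets, and the choice of IllegalArgumentException are my assumptions, not something the code above prescribes.

```java
final class InputCheck {
    // Hypothetical validation: require a leading '[' and a trailing ']' before parsing.
    static String stripBrackets(String in) {
        if (in == null || in.length() < 2
                || in.charAt(0) != '[' || in.charAt(in.length() - 1) != ']') {
            throw new IllegalArgumentException("Expected input wrapped in [ and ]: " + in);
        }
        // Return only the comma-separated body between the brackets.
        return in.substring(1, in.length() - 1);
    }

    public static void main(String[] args) {
        System.out.println(stripBrackets("[a,b]")); // a,b
    }
}
```

Failing fast here keeps malformed input from being silently truncated by a blind substring(1, length - 1).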

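This can be done with the approach below, which uses a TreeSet of strings so that sorting is handled by the built-in implementation. It simply converts each string to a set and compares the sets with the equals method. Try the code below: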

String str1 = "[0.7419,0.7710,0.2487]";
String str2 = "[\"0.7710\",\"0.7419\",\"0.2487\"]";
String jsonArray = new JSONArray(str2).toString();
Set<String> set1 = new TreeSet<String>(Arrays.asList(str1.replace("[", "").replace("]", "").split(",")));
Set<String> set2 = new TreeSet<String>(Arrays.asList(jsonArray.replace("[", "").replace("]", "").replace("\"", "").split(",")));
if (set1.equals(set2)) {
    System.out.println(" str1 and str2 are equal");
}

In the code above, I removed the "\" characters with the help of JSONArray.

Note:

This will not work if a duplicated element appears a different number of times in the two strings, because a set does not keep duplicates.

Try using a list, which keeps duplicate elements, to solve your problem.

String str1 = "[0.7419,0.7710,0.2487]";
String str2 = "[\"0.7710\",\"0.7419\",\"0.2487\"]";
String jsonArray = new JSONArray(str2).toString();
List<String> list1 = new ArrayList<String>(Arrays.asList(str1.replace("[", "").replace("]", "").split(",")));
List<String> list2 = new ArrayList<String>(Arrays.asList(jsonArray.replace("[", "").replace("]", "").replace("\"", "").split(",")));
Collections.sort(list1);
Collections.sort(list2);
if (list1.equals(list2)) {
    System.out.println("str1 and str2 are equal");
}
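
An alternative to sorting, which is not what the answer above does: count how often each element occurs and compare the counts. This also respects duplicates. A rough sketch using only the JDK, with the class name CountCompare and its helpers invented for illustration, and assuming elements contain no commas, brackets, or quotes:

```java
import java.util.Arrays;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

final class CountCompare {
    // Hypothetical helper: maps each element to the number of times it occurs.
    static Map<String, Long> counts(String s) {
        String body = s.replace("[", "").replace("]", "").replace("\"", "");
        return Arrays.stream(body.split(","))
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    // Two strings match when every element occurs the same number of times in both.
    static boolean sameMultiset(String a, String b) {
        return counts(a).equals(counts(b));
    }

    public static void main(String[] args) {
        String str1 = "[0.7419,0.7710,0.2487]";
        String str2 = "[\"0.7710\",\"0.7419\",\"0.2487\"]";
        System.out.println(sameMultiset(str1, str2));           // true
        System.out.println(sameMultiset("[x,x,y]", "[x,y,y]")); // false
    }
}
```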

Google GSON can handle this task very neatly by reading the values as a Set<String>:

    final String str1 = "[0.7419,0.7710,0.2487]";
    final String str2 = "[\"0.7710\",\"0.7419\",\"0.2487\"]";
    final String str3 = "[\"0.3310\",\"0.7419\",\"0.2487\"]";
    final Gson gson = new Gson();
    final Type setOfStrings = new TypeToken<Set<String>>() {}.getType();
    final Set<String> set1 = gson.fromJson(str1, setOfStrings);
    final Set<String> set2 = gson.fromJson(str2, setOfStrings);
    final Set<String> set3 = gson.fromJson(str3, setOfStrings);

    System.out.println("Set #1:" + set1);
    System.out.println("Set #2:" + set2);
    System.out.println("Set #3:" + set3);
    System.out.println("Set #1 is equivalent to Set #2: " + set1.equals(set2));
    System.out.println("Set #1 is equivalent to Set #3: " + set1.equals(set3));

The output is:

Set #1:[0.7419, 0.7710, 0.2487]
Set #2:[0.7710, 0.7419, 0.2487]
Set #3:[0.3310, 0.7419, 0.2487]
Set #1 is equivalent to Set #2: true
Set #1 is equivalent to Set #3: false