Replacing a Unicode character in a string using C#

string str = "our guests will experience \u001favor in an area";
 bool exists = str.IndexOf("\u001", StringComparison.CurrentCultureIgnoreCase) > -1;

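I want to find the \u001 character in the string and replace it. I have tried hard to solve this but have not been able to.

Please help me solve this problem. Thanks in advance for your valuable help.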

Deep in the C# specification you can find the following:

[Note: The use of the \x hexadecimal-escape-sequence production can be error-prone and hard to read due to the variable number of hexadecimal digits following the \x. For example, in the code:

string good = "\x9Good text";

string bad = "\x9Bad text";

it might appear at first that the leading character is the same (U+0009, a tab character) in both strings. In fact the second string starts with U+9BAD as all three letters in the word "Bad" are valid hexadecimal digits. As a matter of style, it is recommended that \x is avoided in favour of either specific escape sequences (\t in this example) or the fixed-length \u escape sequence. end note]

And also:

unicode-escape-sequence::

\u hex-digit hex-digit hex-digit hex-digit

\U hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit

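To put it more simply: \u is followed by exactly 4 hexadecimal digits (and \U by exactly 8), never 3. Your string is therefore interpreted as "our guests will experience ", then the single character \u001f (U+001F), then "avor in an area".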
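To see how the literal is actually parsed, here is a minimal sketch that inspects the string from the question. The class name EscapeInspection and the helper DescribeAt are hypothetical names chosen for illustration only.

```csharp
using System;

public static class EscapeInspection
{
    // Hypothetical helper: formats the UTF-16 code unit at the given index, e.g. "U+001F".
    public static string DescribeAt(string s, int index)
    {
        return "U+" + ((int)s[index]).ToString("X4");
    }

    public static void Main()
    {
        string str = "our guests will experience \u001favor in an area";

        // The compiler consumes exactly four hex digits after \u: 0, 0, 1, f.
        // IndexOf(char) performs an ordinal search.
        int pos = str.IndexOf('\u001f');

        Console.WriteLine(pos);                       // 27
        Console.WriteLine(DescribeAt(str, pos));      // U+001F
        Console.WriteLine(str.Substring(pos + 1, 4)); // avor
    }
}
```

The letter "f" never reaches the string as a letter; it is the last digit of the escape sequence.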

If we look at the C# language specification ECMA-334, section 7.4.2 "Unicode character escape sequences", we find:

A Unicode escape sequence represents a Unicode code point. Unicode escape sequences are processed in identifiers (§7.4.3), character literals (§7.4.5.5), and regular string literals (§7.4.5.6). A Unicode escape sequence is not processed in any other location (for example, to form an operator, punctuator, or keyword).

unicode-escape-sequence:: \u hex-digit hex-digit hex-digit hex-digit
                                         \U hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit

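So you must use exactly four hexadecimal digits with \u.

In your example, it takes "001f" as those four hexadecimal digits.

The "\u001" in your example should produce an "Unrecognized escape sequence" error in Visual Studio.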
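Once the escape is written with all four digits, plain string methods are enough. The following is a minimal sketch, not necessarily what the asker needs: it assumes the character to find is U+001F, that removing it is the desired replacement, and it uses the hypothetical names ControlCharReplace and ReplaceUnitSeparator.

```csharp
using System;

public static class ControlCharReplace
{
    // Hypothetical helper: replaces every U+001F in the input with the given text.
    public static string ReplaceUnitSeparator(string input, string replacement)
    {
        // "\u001f" is a complete four-digit escape, so it compiles.
        // The two-argument string.Replace overload compares ordinally.
        return input.Replace("\u001f", replacement);
    }

    public static void Main()
    {
        string str = "our guests will experience \u001favor in an area";

        // Ordinal is assumed here because a culture-sensitive search
        // may treat some characters as ignorable.
        bool exists = str.IndexOf("\u001f", StringComparison.Ordinal) > -1;
        Console.WriteLine(exists); // True

        string cleaned = ReplaceUnitSeparator(str, "");
        Console.WriteLine(cleaned); // our guests will experience avor in an area
    }
}
```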

Use a regular expression:

var unicodeRegexp = new Regex(@"\x1f");
var testWord = "our guests will experience \u001favor in an area";
var newWord = unicodeRegexp.Replace(testWord, "text for replacement");

\x1f is an alternative to \u001f with the leading zeros dropped; see https://www.regular-expressions.info/unicode.html#codepoint
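As a quick check of that equivalence, the sketch below runs both patterns through the .NET regex engine, where \x takes two hexadecimal digits and \u takes four. The class name RegexEscapes, the helper ReplaceWith, and the "*" replacement text are hypothetical and chosen for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexEscapes
{
    // Hypothetical helper: replaces every match of the pattern in the input.
    public static string ReplaceWith(string pattern, string input, string replacement)
    {
        return Regex.Replace(input, pattern, replacement);
    }

    public static void Main()
    {
        string testWord = "our guests will experience \u001favor in an area";

        // Verbatim strings (@"...") pass the backslash through unchanged, so the
        // regex engine, not the C# compiler, interprets \x1f and \u001f.
        string viaX = ReplaceWith(@"\x1f", testWord, "*");
        string viaU = ReplaceWith(@"\u001f", testWord, "*");

        Console.WriteLine(viaX == viaU); // True
        Console.WriteLine(viaX);         // our guests will experience *avor in an area
    }
}
```

Using a verbatim string for the pattern also sidesteps the compile-time escape rules discussed above, since the C# compiler no longer processes the backslash.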