Replace Unicode characters in a string using C#

Question

string str = "our guests will experience \u001favor in an area";
bool exists = str.IndexOf("\u001", StringComparison.CurrentCultureIgnoreCase) > -1;

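I want to find the character \u001 in the string and replace it. I have tried hard to solve this but have not managed to.

Please help me solve this problem. Thanks in advance for your valuable help.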

Answer 1

Deep in the C# specification, you can find the following:

[Note: The use of the \x hexadecimal-escape-sequence production can be error-prone and hard to read due to the variable number of hexadecimal digits following the \x. For example, in the code:

string good = "\x9Good text";

string bad = "\x9Bad text";

it might appear at first that the leading character is the same (U+0009, a tab character) in both strings. In fact the second string starts with U+9BAD as all three letters in the word "Bad" are valid hexadecimal digits. As a matter of style, it is recommended that \x is avoided in favour of either specific escape sequences (\t in this example) or the fixed-length \u escape sequence. end note]

And also:

unicode-escape-sequence::

\u hex-digit hex-digit hex-digit hex-digit

\U hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit

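To put it more simply: \u must be followed by exactly 4 hexadecimal digits (and \U by exactly 8), not 3. Your string is therefore interpreted as "our guests will experience " + "\u001f" + "avor in an area", where \u001f is the single character U+001F.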
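As a minimal sketch of what that means for the question: once the escape is written with all four digits, the character can be found and replaced with ordinary string methods. The class and method names below (EscapeDemo, ContainsUnitSeparator, ReplaceUnitSeparator) are made up for illustration, and the replacement text "fl" is only a guess at what the original word was meant to be.

```csharp
using System;

public static class EscapeDemo
{
    // The literal from the question. The compiler reads \u001f as ONE character (U+001F),
    // so the text after it starts with "avor", not "favor".
    public const string Source = "our guests will experience \u001favor in an area";

    // Returns true if the text contains the U+001F control character.
    // The char overload of IndexOf performs an ordinal search.
    public static bool ContainsUnitSeparator(string text)
    {
        return text.IndexOf('\u001f') > -1;
    }

    // Replaces every U+001F with the given replacement text (ordinal match).
    public static string ReplaceUnitSeparator(string text, string replacement)
    {
        return text.Replace("\u001f", replacement);
    }

    public static void Main()
    {
        Console.WriteLine(ContainsUnitSeparator(Source));
        // "fl" is an assumed replacement, chosen only for illustration.
        Console.WriteLine(ReplaceUnitSeparator(Source, "fl"));
    }
}
```

An ordinal search is used here instead of StringComparison.CurrentCultureIgnoreCase, because culture-sensitive comparisons can apply special rules to control characters.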

Answer 2

If we look at the C# language specification, ECMA-334, section 7.4.2 "Unicode character escape sequences", we find:

A Unicode escape sequence represents a Unicode code point. Unicode escape sequences are processed in identifiers (§7.4.3), character literals (§7.4.5.5), and regular string literals (§7.4.5.6). A Unicode escape sequence is not processed in any other location (for example, to form an operator, punctuator, or keyword).

unicode-escape-sequence:: \u hex-digit hex-digit hex-digit hex-digit
\U hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit

So you must use four hexadecimal digits with \u.

In your example, it takes "001f" as those four hexadecimal digits.

The "\u001" in your example should give the error "Unrecognized escape sequence." in Visual Studio.
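For comparison, here is a small sketch of the different ways to spell U+001F in a C# string literal, plus the variable-length \x pitfall that the specification note warns about. The class and field names (EscapeForms and its members) are hypothetical.

```csharp
using System;

public static class EscapeForms
{
    // \u takes exactly four hex digits.
    public static readonly string FourDigit = "\u001f";

    // \U takes exactly eight hex digits.
    public static readonly string EightDigit = "\U0000001f";

    // The same character built from its numeric code point.
    public static readonly string FromCodePoint = ((char)0x1F).ToString();

    // \x consumes up to four hex digits. Here it swallows "1fa" (U+01FA),
    // so the string is U+01FA followed by "vor", not U+001F followed by "avor".
    public static readonly string VariableLength = "\x1favor";

    public static void Main()
    {
        Console.WriteLine(FourDigit == EightDigit);
        Console.WriteLine(FourDigit == FromCodePoint);
        Console.WriteLine(VariableLength.Length);
    }
}
```

This is why the fixed-length \u form is the safer choice when the escaped character is followed by letters such as a to f.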

Answer 3

Use a regular expression:

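using System.Text.RegularExpressions;

// @"\x1f" is a verbatim string, so the regex engine (not the C# compiler) interprets \x1f as U+001F.
var unicodeRegexp = new Regex(@"\x1f");
var testWord = "our guests will experience \u001favor in an area";
var newWord = unicodeRegexp.Replace(testWord, "text for replacement");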

In the regex pattern, \x1f matches the same character as \u001f, written without the leading zeros. See https://www.regular-expressions.info/unicode.html#codepoint
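A short sketch to check that both spellings behave the same in a .NET regex pattern, where \x is followed by two hex digits and \u by four. The class and method names (RegexForms, ReplaceWithX, ReplaceWithU) are made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexForms
{
    // Pattern uses the two-digit \x form; the verbatim string passes the backslash to the regex engine.
    public static string ReplaceWithX(string input, string replacement)
    {
        return Regex.Replace(input, @"\x1F", replacement);
    }

    // Pattern uses the four-digit \u form, also interpreted by the regex engine.
    public static string ReplaceWithU(string input, string replacement)
    {
        return Regex.Replace(input, @"\u001F", replacement);
    }

    public static void Main()
    {
        var testWord = "our guests will experience \u001favor in an area";
        Console.WriteLine(ReplaceWithX(testWord, "text for replacement"));
        Console.WriteLine(ReplaceWithU(testWord, "text for replacement"));
    }
}
```

For a single fixed character like this, a plain string.Replace call would do the same job; the regex form is mainly useful if the pattern later needs to grow.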

c#

asp.net-mvc

asp.net-mvc-4