How to fix "The surrogate pair (0xD83D, 0x27) is invalid"
I have a byte array whose text is XML containing "Hei 😊". I do:
var bodyText = Encoding.UTF8.GetString(transportMessage.Body);
var bodyXml = XElement.Parse(bodyText);
The resulting string has the emoji encoded as &#xD83D;&#xDE0A;
so XElement.Parse throws:
System.InvalidOperationException: There was an error generating the XML document. ---> System.ArgumentException: The surrogate pair (0xD83D, 0x27) is invalid. A high surrogate character (0xD800 - 0xDBFF) must always be paired with a low surrogate character (0xDC00 - 0xDFFF).
How can I remove this emoji (or any other emoji)? I tried a regex for invalid XML characters, [^\x09\x0A\x0D\x20-\xD7FF\xE000-\xFFFD\x10000-x10FFFF],
but it does not match the emoji.
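A likely reason the character-class regex finds nothing: the string holds the emoji as the literal text of two numeric character references (the characters `&`, `#`, `x`, `D`, `8`, ...), not as actual surrogate characters. A small sketch to check this, where `sample` is a hypothetical input mirroring the string described above:

```csharp
using System;
using System.Linq;

class Program
{
    static void Main()
    {
        // Hypothetical sample; the real input comes from transportMessage.Body.
        string sample = "<a>Hei &#xD83D;&#xDE0A;</a>";

        // Every character is plain ASCII, so a regex over raw invalid
        // XML characters has nothing to match.
        bool hasSurrogate = sample.Any(c => char.IsSurrogate(c));
        bool allAscii = sample.All(c => c < 128);
        Console.WriteLine(hasSurrogate); // False
        Console.WriteLine(allAscii);     // True

        // Taken together, the two referenced code units would form one code point.
        int codePoint = char.ConvertToUtf32('\uD83D', '\uDE0A');
        Console.WriteLine(codePoint.ToString("X")); // 1F60A
    }
}
```

So the references themselves have to be matched and handled as text before the parser sees them.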
I removed it with this code:
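    // Needs: using System; using System.Globalization;
    //        using System.Text.RegularExpressions; using System.Xml;

    // Matches decimal (&#1234;) and hexadecimal (&#x1F60A;) numeric character references.
    private static readonly Regex EmojiRegex = new Regex("&#x?[A-Fa-f0-9]+;");

    private static string ReplaceInvalidXmlCharacterReferences(string input)
    {
        // Fast path: no numeric character references at all.
        if (input.IndexOf("&#", StringComparison.Ordinal) == -1)
            return input;

        return EmojiRegex.Replace(input, match =>
        {
            string ncr = match.Value;
            uint num;
            var frmt = NumberFormatInfo.InvariantInfo;
            bool isParsed =
                ncr[2] == 'x' ? // the x must be lowercase in XML documents
                uint.TryParse(ncr.Substring(3, ncr.Length - 4), NumberStyles.AllowHexSpecifier, frmt, out num) :
                uint.TryParse(ncr.Substring(2, ncr.Length - 3), NumberStyles.Integer, frmt, out num);
            if (!isParsed)
                return ncr; // leave unparseable references untouched

            // Casting a value above 0xFFFF to char would truncate it (0x10000 becomes 0),
            // so supplementary code points are range-checked instead.
            bool isValid = num <= 0xFFFF ? XmlConvert.IsXmlChar((char)num) : num <= 0x10FFFF;
            return isValid ? ncr : "";
        });
    }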
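If the emoji should survive rather than be dropped, a different technique (not part of the code above) is to merge an adjacent high/low surrogate reference pair into a single reference to the real code point before parsing. This is only a minimal sketch: it assumes hexadecimal references, and `SurrogateRefFixer`, `MergeSurrogateReferences`, and the sample string are hypothetical names and data chosen for illustration.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;
using System.Xml.Linq;

static class SurrogateRefFixer
{
    // A hex reference to a high surrogate (D800-DBFF) immediately
    // followed by a hex reference to a low surrogate (DC00-DFFF).
    private static readonly Regex PairRegex = new Regex(
        "&#x0*([Dd][89ABab][0-9A-Fa-f]{2});&#x0*([Dd][C-Fc-f][0-9A-Fa-f]{2});");

    public static string MergeSurrogateReferences(string input)
    {
        return PairRegex.Replace(input, m =>
        {
            var inv = CultureInfo.InvariantCulture;
            char high = (char)ushort.Parse(m.Groups[1].Value, NumberStyles.AllowHexSpecifier, inv);
            char low = (char)ushort.Parse(m.Groups[2].Value, NumberStyles.AllowHexSpecifier, inv);

            // Combine the two UTF-16 code units into one Unicode code point.
            int codePoint = char.ConvertToUtf32(high, low);
            return "&#x" + codePoint.ToString("X", inv) + ";";
        });
    }
}

class Program
{
    static void Main()
    {
        string fixedXml = SurrogateRefFixer.MergeSurrogateReferences("<a>Hei &#xD83D;&#xDE0A;</a>");
        Console.WriteLine(fixedXml); // <a>Hei &#x1F60A;</a>

        // The merged reference is a valid XML character, so parsing succeeds.
        XElement element = XElement.Parse(fixedXml);
        Console.WriteLine(element.Value.Length); // 6 ("Hei " plus two UTF-16 code units)
    }
}
```

Unpaired surrogate references are left alone by this sketch, so it can be run first and followed by ReplaceInvalidXmlCharacterReferences to strip whatever remains invalid.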