ReadAsStringAsync returns dashed description
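I have a method, ReadJsonUrl, that takes a URL (a string address, for example https://www.ah.nl/service/rest/delegate?url=%2Fproducten%2Fproduct%2Fwi224732%2Fsmiths-nibb-it-happy-ones-kruis-rond-paprika ) pointing to a JSON file.
The method reads the JSON and writes some of the data to the console.
The problem is that the product description is printed as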
Smiths Nibb-it hap-py on-es kruis-rond pa-pri-ka
but if I inspect the JSON in the browser, it shows
Smiths Nibb-it happy ones kruis-rond paprika
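which is how I want it printed.
I think the problem is that the request is made as if by a browser with a resolution of 0px x 0px, so the response splits the words to keep them readable. If I make my browser window very small, it also shows the description with dashes.
I added a User-Agent header in my code, but that did not help.
Does anyone know how to solve this?
My code: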
public static async Task<object> ReadJsonUrl(string address)
{
    using (HttpClient client = new HttpClient())
    {
        client.DefaultRequestHeaders.Add("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36");
        HttpResponseMessage response = await client.GetAsync(address);
        var content = await response.Content.ReadAsStringAsync();
        //JObject obj = JObject.Parse(content);
        var data = Empty.FromJson(content);
        var product = data.Embedded.Lanes[4].Embedded.Items[0].Embedded.Product;
        Console.WriteLine(product.Id);
        Console.WriteLine(product.Description);
        Console.WriteLine(product.PriceLabel.Now);
        Console.WriteLine(product.Availability.Label);
        Console.WriteLine("-------------------------------------");
        System.Threading.Thread.Sleep(5000);
        //the return value is for later use
        return product;
    }
}
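If you copy and paste the second string (the expected output) into a hex editor, it will show that the string contains 0xAD characters. These are soft hyphens.
Browsers such as Internet Explorer or Firefox display these soft hyphens only when necessary (at a line break), but the console displays them every time.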
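If no hex editor is at hand, the same check can be made from code. The following is only a minimal sketch: the class name SoftHyphenInspector, both method names, and the "[SHY]" marker are hypothetical and not part of the original code. It counts the U+00AD characters in a string and makes them visible.

```csharp
using System;
using System.Linq;

public static class SoftHyphenInspector
{
    // U+00AD SOFT HYPHEN, the character a hex editor reports as 0xAD.
    private const char SoftHyphen = '\u00AD';

    // Counts how many soft hyphens are hidden in the text.
    public static int CountSoftHyphens(string text)
    {
        return text.Count(c => c == SoftHyphen);
    }

    // Replaces every soft hyphen with a visible "[SHY]" marker (an arbitrary
    // choice for illustration) so its position is obvious in console output.
    public static string RevealSoftHyphens(string text)
    {
        // string.Replace(string, string) does an ordinal search, so the
        // soft hyphen is matched rather than ignored.
        return text.Replace("\u00AD", "[SHY]");
    }
}
```

Calling SoftHyphenInspector.CountSoftHyphens(product.Description) with a non-zero result would suggest that the dashes come from the data itself rather than from the request.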
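To add to Thomas Weller's answer, which explains the problem well, here is a function that removes all soft hyphens from a string. It is written as an extension method, so you can use it easily like this: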
Console.WriteLine(product.Description.RemoveSoftHyphens());
The extension method:
using System.Text; // needed for StringBuilder

public static class StringExtensions
{
    public static string RemoveSoftHyphens(this string input)
    {
        var output = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            // 0xAD is the soft hyphen (U+00AD); copy every other character
            if (c != 0xAD)
            {
                output.Append(c);
            }
        }
        return output.ToString();
    }
}
As some additional information, here is how HTML4 describes the use of soft hyphens:
In HTML, there are two types of hyphens: the plain hyphen and the soft hyphen. The plain hyphen should be interpreted by a user agent as just another character. The soft hyphen tells the user agent where a line break can occur. Those browsers that interpret soft hyphens must observe the following semantics. If a line is broken at a soft hyphen, a hyphen character must be displayed at the end of the first line. If a line is not broken at a soft hyphen, the user agent must not display a hyphen character. For operations such as searching and sorting, the soft hyphen should always be ignored.
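The last sentence of that description (ignore soft hyphens when searching) can be sketched in C# as well. This is only an illustration under stated assumptions: SoftHyphenSearch, StripSoftHyphens, and ContainsIgnoringSoftHyphens are hypothetical names, and the case-insensitive match is a choice made here, not something the specification requires.

```csharp
using System;

public static class SoftHyphenSearch
{
    // Removes every U+00AD soft hyphen; string.Replace(string, string)
    // performs an ordinal search, so each occurrence is matched.
    public static string StripSoftHyphens(string text)
    {
        return text.Replace("\u00AD", string.Empty);
    }

    // Returns true when 'term' occurs in 'text' once soft hyphens have been
    // removed from both. The comparison is ordinal and ignores case.
    public static bool ContainsIgnoringSoftHyphens(string text, string term)
    {
        string cleanText = StripSoftHyphens(text);
        string cleanTerm = StripSoftHyphens(term);
        return cleanText.IndexOf(cleanTerm, StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

With this helper, a search for "happy ones" would match a description stored as "hap\u00ADpy on\u00ADes", which a plain ordinal comparison would miss.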