How to parse a CSV column that contains JSON?
Suppose I have a CSV file, example.csv, that looks like this (the double quotes were added by Excel):
Id,Name,requestJson
12345,Albert,"{
""latitude"": -43.518703,
""longitude"": -71.69634,
""tags"": [
""aliqua"",
""ad"",
""dolor"",
""culpa"",
""sunt"",
""consequat"",
""irure""
],
""friends"": [
{
""id"": 0,
""name"": ""Bryan Montoya""
},
{
""id"": 1,
""name"": ""Marcella Tillman""
},
{
""id"": 2,
""name"": ""Leola Calderon""
}
],
""greeting"": ""Hello, undefined! You have 7 unread messages."",
""favoriteFruit"": ""strawberry""
}"
The requestJson column should deserialize into the following objects:
public class Friend
{
    public int id { get; set; }
    public string name { get; set; }
}

public class Request
{
    public double latitude { get; set; }
    public double longitude { get; set; }
    public List<string> tags { get; set; }
    public List<Friend> friends { get; set; }
    public string greeting { get; set; }
    public string favoriteFruit { get; set; }
}
My attempt starts by reading example.csv, skipping the header row, and passing each line to FromCsv to turn it into a Request object.
public static List<Request> LoadFiles()
{
    List<Request> requests = File.ReadAllLines("./example.csv")
        .Skip(1)
        .Select(v => FromCsv(v))
        .ToList();
    return requests;
}
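Here I use array indexes because I know where the first two elements are. The problem is that when I try to retrieve values[2], splitting on the delimiter fails because the JSON contains escaped characters and commas.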
public static Request FromCsv(string csvLine)
{
    string[] values = csvLine.Split(',');
    Request request = new Request
    {
        Id = values[0],
        Name = values[1],
        Request = JsonConvert.DeserializeObject<Request>(values[2])
    };
    return request;
}
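To make the failure concrete, here is a minimal sketch that counts what a naive split produces. The shortened record and the `SplitDemo` class are illustrative assumptions, not part of the original file.

```csharp
using System;

public static class SplitDemo
{
    // A shortened version of the record: the third column is a quoted,
    // multi-line JSON value whose inner quotes are doubled by the CSV writer.
    public const string Record =
        "12345,Albert,\"{\n\"\"latitude\"\": -43.518703,\n\"\"tags\"\": [\n\"\"aliqua\"\",\n\"\"ad\"\"\n]\n}\"";

    // Splitting on every comma also cuts inside the JSON column,
    // so the record yields more than the three expected pieces.
    public static int NaivePartCount(string record) => record.Split(',').Length;

    // Reading the file line by line hands the parser only the first
    // physical line of the record, which ends right after the opening brace.
    public static string FirstPhysicalLine(string record) => record.Split('\n')[0];
}
```

With this record, `NaivePartCount` returns 5 rather than 3, and `FirstPhysicalLine` returns `12345,Albert,"{`, so neither a per-line read nor an unbounded split delivers the complete JSON in values[2].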
How can I parse the requestJson column into the Request object I want?
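Thanks @luuk, I found a working solution.
The fix is to pass a maximum split count of 3, so that the commas inside the JSON stay in the last piece, and then to collapse the doubled double quotes and strip the outer pair that the CSV export added (CsvColumnToJson). Note that this only works when the whole record, including its line breaks, is passed in as a single string.
Solution: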
public static string GetJson(string csvLine)
{
    // Split into at most 3 pieces so the JSON column is kept whole.
    string[] values = csvLine.Split(',', 3);
    return CsvColumnToJson(values[2]);
}

/* CSV escapes each double quote inside a field by doubling it, and wraps the whole field in one pair of double quotes. */
public static string CsvColumnToJson(string csvColumn)
{
    var duplicatedDoubleQuotesRemoved = csvColumn.Replace("\"\"", "\"");
    // Drop the first and last character (the wrapping quotes).
    return duplicatedDoubleQuotesRemoved.Substring(1, duplicatedDoubleQuotesRemoved.Length - 2);
}
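The solution above relies on the first two columns never containing a comma and on the record arriving as one string. A more general alternative is a small quote-aware parser that follows the usual CSV quoting convention (fields wrapped in double quotes, inner quotes doubled). The sketch below is only one possible way to do this: `CsvRow`, `CsvSketch`, `ParseRecords` and `LoadRows` are hypothetical names, and it uses `System.Text.Json` in place of the Json.NET `JsonConvert` call from the question so that it has no external dependency.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.Json;

public class Friend
{
    public int id { get; set; }
    public string name { get; set; }
}

public class Request
{
    public double latitude { get; set; }
    public double longitude { get; set; }
    public List<string> tags { get; set; }
    public List<Friend> friends { get; set; }
    public string greeting { get; set; }
    public string favoriteFruit { get; set; }
}

// Hypothetical wrapper for one CSV row (not part of the original post).
public class CsvRow
{
    public string Id { get; set; }
    public string Name { get; set; }
    public Request Request { get; set; }
}

public static class CsvSketch
{
    // Splits CSV text into records, honouring quoted fields that may
    // contain commas, line breaks and doubled double quotes.
    public static List<string[]> ParseRecords(string text)
    {
        var records = new List<string[]>();
        var fields = new List<string>();
        var field = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (inQuotes)
            {
                if (c == '"' && i + 1 < text.Length && text[i + 1] == '"')
                {
                    field.Append('"'); // doubled quote becomes one literal quote
                    i++;
                }
                else if (c == '"') inQuotes = false; // closing quote of the field
                else field.Append(c);
            }
            else if (c == '"') inQuotes = true; // opening quote of a field
            else if (c == ',')
            {
                fields.Add(field.ToString());
                field.Clear();
            }
            else if (c == '\n')
            {
                fields.Add(field.ToString());
                field.Clear();
                records.Add(fields.ToArray());
                fields.Clear();
            }
            else if (c != '\r') field.Append(c);
        }

        // Flush the last record when the text does not end with a newline.
        if (field.Length > 0 || fields.Count > 0)
        {
            fields.Add(field.ToString());
            records.Add(fields.ToArray());
        }
        return records;
    }

    // Skips the header row and deserializes the third column of each record.
    public static List<CsvRow> LoadRows(string csvText) =>
        ParseRecords(csvText)
            .Skip(1)
            .Where(r => r.Length >= 3)
            .Select(r => new CsvRow
            {
                Id = r[0],
                Name = r[1],
                Request = JsonSerializer.Deserialize<Request>(r[2])
            })
            .ToList();
}
```

A call such as `CsvSketch.LoadRows(File.ReadAllText("./example.csv"))` (with `using System.IO;`) would feed it the whole file. Reading the full text instead of using `File.ReadAllLines` matters here, because each record spans several physical lines.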