Reading a text file and checking that the column count is the same for all rows
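I have a txt file with n rows, and each row has n columns separated by a delimiter.
How can I read the file line by line and check that every row has the same number of columns? If any row has extra columns, I want to display its line number along with a message.
Suppose my txt file contains the following lines: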
147789-00,67,KB08,2007,12,0.000 ,0.000 ,0.000
A22951,67,RN3W,2007,12,0.000 ,0.000 ,0.000
946106-00,67,RN1W,2007,12,0.000 ,0.000 ,0.000,000
A22951,67,RN3W,2007,12,0.000 ,0.000 ,0.000
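Line 3 of the 4 has an extra column, and other rows could have extra columns in the same way. I want to find those extra columns. In other words, if any row has an extra delimiter, the program should display that row's line number with a message.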
foreach (string line in File.ReadLines(@"c:\file.txt", Encoding.UTF8))
{
// how to match the columns
}
Am I on the right track? Any help would be appreciated.
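string line;
// Verbatim string (@"...") so that "\f" is not interpreted as a form-feed escape
using (var file = new System.IO.StreamReader(@"c:\file.txt"))
{
    while ((line = file.ReadLine()) != null)
    {
        // process each line here
    }
}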
Now you can split each line on whatever delimiter you need.
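A minimal sketch of how the loop body might be filled in, assuming a comma delimiter and that the first line defines the expected column count. The names `ColumnCheck` and `FindMismatchedLines` are hypothetical, and the method takes a `TextReader` so that it works with a `StreamReader` as well as an in-memory `StringReader`.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ColumnCheck
{
    // Hypothetical helper: returns the 1-based numbers of the lines whose
    // column count differs from that of the first line.
    public static List<int> FindMismatchedLines(TextReader reader, char delimiter)
    {
        var mismatched = new List<int>();
        int expected = -1;   // taken from the first line read
        int lineNumber = 0;
        string line;

        while ((line = reader.ReadLine()) != null)
        {
            lineNumber++;
            int count = line.Split(delimiter).Length;

            if (expected < 0)
                expected = count;           // first line sets the baseline
            else if (count != expected)
                mismatched.Add(lineNumber); // remember the offending line
        }

        return mismatched;
    }
}
```

Passing `new StreamReader(@"c:\file.txt")` and `','` for the four sample lines in the question would return a list containing only 3, since that line splits into 9 items while the first line splits into 8.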
char delimiter = ','; // This can be modified
int numberOfCols = 8; // expected number of columns per row (the sample data in the question has 8)
var lines = File
    .ReadLines("Path here")
    .Where(l => l.Split(delimiter).Count() == numberOfCols);
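This gives you a collection containing the lines that have the specified number of columns. To collect the invalid lines instead, you can use the following: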
var invalidLines = File
    .ReadLines("Path here")
    // lineNumber is a zero-based index, so add 1 when showing it to a user
    .Select((l, lineNumber) => new { key = lineNumber, value = l })
    .Where(l => l.value.Split(delimiter).Count() != numberOfCols);
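As a rough illustration of how the invalid lines could be turned into readable messages, the sketch below applies the same query to any sequence of lines and reports 1-based line numbers. The class and method names (`InvalidLineReport`, `Describe`) and the message format are assumptions made for this example, not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class InvalidLineReport
{
    // Hypothetical helper: describes every line whose column count
    // differs from numberOfCols, using 1-based line numbers.
    public static List<string> Describe(IEnumerable<string> lines, char delimiter, int numberOfCols)
    {
        return lines
            .Select((l, index) => new { key = index + 1, value = l }) // index is zero-based
            .Where(l => l.value.Split(delimiter).Length != numberOfCols)
            .Select(l => "Line " + l.key + ": " + l.value)
            .ToList();
    }
}
```

It could be called as `InvalidLineReport.Describe(File.ReadLines("Path here"), ',', numberOfCols)` (with `using System.IO;`), and each returned string then written out with `Console.WriteLine`.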
If you don't know the actual number of columns but want to make sure that this unknown number is the same for all rows:
char delimiter = ',';
int columnCount = -1; // or put the number if it's known
var errors = File
    .ReadLines(@"c:\file.txt", Encoding.UTF8) // UTF-8 is the default and can be omitted
    .Select((line, index) => {
        int count = line.Split(delimiter).Length;
        // the first line read sets the expected column count
        if (columnCount < 0)
            columnCount = count;
        return new {
            line = line,
            count = count,
            index = index
        };
    })
    .Where(chunk => chunk.count != columnCount)
    .Select(chunk => String.Format("Line #{0} \"{1}\" has {2} items when {3} expected",
        chunk.index + 1, chunk.line, chunk.count, columnCount));
// To check if file has any wrong lines:
if (errors.Any()) {
    ...
}
// To print out a report on wrong lines
Console.Write(String.Join(Environment.NewLine, errors));
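One thing to keep in mind is that `errors` is a lazily evaluated query, so calling `Any()` and then `String.Join` enumerates it twice and reads the file again each time. A possible variation, sketched below, runs the same query once and stores the messages in a list. The names `ColumnReport` and `Collect` are hypothetical, and the method takes the lines as a parameter so it can be tried without a file.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ColumnReport
{
    // Hypothetical helper: runs the query once and returns the messages as a list.
    public static List<string> Collect(IEnumerable<string> lines, char delimiter)
    {
        int columnCount = -1; // set from the first line, as in the answer above

        return lines
            .Select((line, index) => {
                int count = line.Split(delimiter).Length;
                if (columnCount < 0)
                    columnCount = count;
                return new { line, count, index };
            })
            .Where(chunk => chunk.count != columnCount)
            .Select(chunk => String.Format("Line #{0} \"{1}\" has {2} items when {3} expected",
                chunk.index + 1, chunk.line, chunk.count, columnCount))
            .ToList(); // materialize so later checks do not re-read the source
    }
}
```

With `var errors = ColumnReport.Collect(File.ReadLines(@"c:\file.txt"), ',');` (and `using System.IO;`), both `errors.Count > 0` and `String.Join(Environment.NewLine, errors)` work on the stored list without touching the file again.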