How to read a file that contains one row with multiple records - C#

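I have a text file that contains only one line. Each file holds a single customer name but multiple items and descriptions.
The record that starts with 00 (company name) has a character length of 10
01 (Item#) - character length of 10
02 (Description) - character length of 50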

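I know how to read the file, but I don't know how to loop through a single line, find records 00, 01, and 02, grab the text according to each length, and then start the loop again from the position where the last record ended. Can someone show me how to read a file like this?

Output: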

companyName     16622        Description
companyName     15522        Description

Sample input text file:

00Init    0115522   02Description                                     0116622   02Description                                    

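If you read the entire file into a single string, you have a couple of options.

One is string.Split, which may be useful.

Another is string.IndexOf. Once you have the index, you can use string.Substring.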
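
As a rough illustration of the IndexOf/Substring idea, here is a minimal sketch. The names `MarkerSearch` and `FieldAfter` are made up for this example, and it assumes the two-character marker does not also occur inside a field value (which is not guaranteed for numeric item codes, so treat it as a starting point rather than a robust parser).

```csharp
using System;

public static class MarkerSearch
{
    // Returns the fixed-width field that follows the first occurrence of
    // `marker` at or after `startIndex`. Returns null if the marker is not
    // found or the line ends before the field is complete.
    public static string FieldAfter(string line, string marker, int width, int startIndex = 0)
    {
        int markerPos = line.IndexOf(marker, startIndex, StringComparison.Ordinal);
        if (markerPos < 0)
            return null;

        int fieldStart = markerPos + marker.Length;
        if (fieldStart + width > line.Length)
            return null;

        // Trim removes the padding that fills the fixed-width field.
        return line.Substring(fieldStart, width).Trim();
    }
}
```

For a line laid out with the widths from the question, `MarkerSearch.FieldAfter(line, "01", 10)` would return the item number without its padding.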

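You have to split the line based on a delimiter. In your case, it looks like you are using spaces as the delimiter.

The method you are looking for is String.Split(), and it should do what you need :) The documentation is at https://msdn.microsoft.com/en-us/library/system.string.split(v=vs.110).aspx - it also includes examples.

I would do it like this: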

string myLineOfText = "MyCompany    12345    The description of my company";
string[] partsOfMyLine = myLineOfText.Split(new string[] { "    " }, StringSplitOptions.RemoveEmptyEntries);

Good luck! :)

Assuming the format is fixed-width, let's create two simple classes to hold the client and its related data as a list:

    // can hold as many items (data) as there are in the line
    public class Client
    {
        public string name;
        public List<ClientData> data;
    };

    // one single item in the client data
    public class ClientData
    {
        public string code;
        public string description;
    };

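To parse one line (assuming it holds one client followed by a consecutive list of item/description pairs), we can do the following (note: to keep things simple, I am just creating a static class with a static method in it):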

    // this parser will read as many items as there are in the line
    // and return a Client instance with those inside.
    public static class Parser
    {
        public static Client ParseData(string line)
        {
            Client client = new Client ();
            client.data = new List<ClientData> ();
            client.name = line.Substring (2, 10);

            // remove the client name
            line = line.Substring (12);

            while (line.Length > 0)
            {
                // create new item
                ClientData data = new ClientData ();
                data.code = line.Substring (2, 10);
                data.description = line.Substring (14, 50);
                client.data.Add (data);

                // next item
                line = line.Substring (64);
            }

            return client;
        }
    }

So, in your main loop, right after reading a new line from the file, you can call the method above to receive a new client. Like this:

        // should be from a file but this is just an example
        string[] lines = {
            "00XXXXXXXXXX01YYYYYYYYYY02XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXXX",
            "00XXXXXXXXXX01YYYYYYYYYY02XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXXX01YYYYYYYYYY02XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXXX",
            "00XXXXXXXXXX01YYYYYYYYYY02XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXXX",
            "00XXXXXXXXXX01YYYYYYYYYY02XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXXX",
            "00XXXXXXXXXX01YYYYYYYYYY02XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXXX",
        };

        // loop through each line
        // (lines can have multiple items)
        foreach (string line in lines)
        {
            Client client = Parser.ParseData (line);
            Console.WriteLine ("Read: " + client.name);
        }

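This solution assumes the data is fixed-width and that the item number comes before the description (01 before 02). It emits a record every time it encounters a description record, and it handles multiple products for the same company.

First, define a class to hold your data: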

public class Record
{
    public string CompanyName { get; set; }
    public string ItemNumber { get; set; }
    public string Description { get; set; }
}

Then, walk through your string and return a record whenever you reach a description:

public static IEnumerable<Record> ReadFile(string input)
{
    // Alter these as appropriate
    const int RECORDTYPELENGTH = 2;
    const int COMPANYNAMELENGTH = 41;
    const int ITEMNUMBERLENGTH = 8;
    const int DESCRIPTIONLENGTH = 48;

    int index = 0;
    string companyName = null;
    string itemNumber = null;

    while (index < input.Length)
    {
        string recordType = input.Substring(index, RECORDTYPELENGTH);
        index += RECORDTYPELENGTH;

        if (recordType == "00")
        {
            companyName = input.Substring(index, COMPANYNAMELENGTH).Trim();
            index += COMPANYNAMELENGTH;
        }
        else if (recordType == "01")
        {
            itemNumber = input.Substring(index, ITEMNUMBERLENGTH).Trim();
            index += ITEMNUMBERLENGTH;
        }
        else if (recordType == "02")
        {
            string description = input.Substring(index, DESCRIPTIONLENGTH).Trim();
            index += DESCRIPTIONLENGTH;

            yield return new Record
            {
                CompanyName = companyName,
                ItemNumber = itemNumber,
                Description = description
            };
        }
        else 
        {
            throw new FormatException("Unexpected record type " + recordType);
        }
    }
}

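Note that the field lengths in your question do not match the sample data, so I adjusted them so that the solution works with the data you provided. You can change the field lengths by adjusting the constants.

Use it like this: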

string input = "00CompanyName                              0115522   02Description                                     0116622   02Description                                     ";

foreach (var record in ReadFile(input))
{
    Console.WriteLine("{0}\t{1}\t{2}", record.CompanyName, record.ItemNumber, record.Description);
}
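
Because the widths in the question and in the sample data differ, one possible variation is to keep the widths in a lookup table instead of constants. The sketch below is only an illustration of that idea: `FixedWidthTokenizer` and `Tokenize` are hypothetical names, and the choice to tolerate a shortened final field (trailing padding stripped) is an assumption, not something the question specifies.

```csharp
using System;
using System.Collections.Generic;

public static class FixedWidthTokenizer
{
    // Splits `input` into (recordType, value) pairs. `widths` maps each
    // two-character record type to the width of the field that follows it.
    public static List<KeyValuePair<string, string>> Tokenize(
        string input, IDictionary<string, int> widths)
    {
        const int TypeLength = 2;
        var tokens = new List<KeyValuePair<string, string>>();
        int index = 0;

        while (index < input.Length)
        {
            if (index + TypeLength > input.Length)
                throw new FormatException("Truncated record type at offset " + index);

            string recordType = input.Substring(index, TypeLength);
            index += TypeLength;

            int width;
            if (!widths.TryGetValue(recordType, out width))
                throw new FormatException("Unexpected record type " + recordType);

            // Assumption: accept a final field that is shorter than its
            // declared width, e.g. when trailing padding was stripped.
            int available = Math.Min(width, input.Length - index);
            tokens.Add(new KeyValuePair<string, string>(
                recordType, input.Substring(index, available).Trim()));
            index += available;
        }

        return tokens;
    }
}
```

With a table such as `{ "00": 10, "01": 10, "02": 50 }`, switching to the widths of a different file only means passing a different dictionary.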

Contents of Sample.txt:

00Company1  0115522     02This is a description for company 1.              00Company2  0115523     02This is a description for company 2.              00Company3  0115524     02This is a description for company 3               

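Note that in the code below, the fields are 2 characters longer than specified in the original question. This is because I included the header in each field's length, so a field of length 10 is actually 12 once the 00 header is included. If this is undesirable, adjust the offsets of the entries in the fieldLengths array.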

String directory = Environment.GetFolderPath(Environment.SpecialFolder.Desktop);
String file = "Sample.txt";
String path = Path.Combine(directory, file);
Int32[] fieldLengths = new Int32[] { 12, 12, 52 };

List<RowData> rows = new List<RowData>();

Byte[] buffer = new Byte[fieldLengths.Sum()];
using (var stream = File.OpenRead(path))
{
    while (stream.Read(buffer, 0, buffer.Length) > 0)
    {
        List<String> fieldValues = new List<String>();

        Int32 offset = 0;
        for (int i = 0; i < fieldLengths.Length; i++)
        {
            var value = Encoding.UTF8.GetString(buffer, offset, fieldLengths[i]);
            fieldValues.Add(value);
            offset += fieldLengths[i];
        }

        String companyName = fieldValues[0];
        String itemNumber = fieldValues[1];
        String description = fieldValues[2];

        var row = new RowData(companyName, itemNumber, description);
        rows.Add(row);
    }
}
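
One caveat with the loop above: Stream.Read is allowed to return fewer bytes than requested, so a single call does not guarantee a full record in the buffer. A small helper along the following lines could be used to fill the buffer before decoding. `StreamHelpers` and `ReadBlock` are illustrative names, not part of the original answer.

```csharp
using System.IO;

public static class StreamHelpers
{
    // Keeps reading until `buffer` is full or the stream ends.
    // Returns the number of bytes actually placed in the buffer, so the
    // caller can detect a truncated final record (result < buffer.Length).
    public static int ReadBlock(Stream stream, byte[] buffer)
    {
        int total = 0;
        while (total < buffer.Length)
        {
            int read = stream.Read(buffer, total, buffer.Length - total);
            if (read == 0)
                break; // end of stream

            total += read;
        }
        return total;
    }
}
```

The outer loop could then continue only while `ReadBlock(stream, buffer) == buffer.Length`, which also avoids decoding leftover bytes from the previous record when the file ends mid-record.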

Definition of the RowData class:

public class RowData
{
    public String Company { get; set; }
    public String Number { get; set; }
    public String Description { get; set; }

    public RowData(String company, String number, String description)
    {
        Company = company;
        Number = number;
        Description = description;
    }
}

The results will be in the rows variable.
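
Since each field was read together with its two-character header and its padding, the stored values still contain both. A minimal sketch of a cleanup step is shown here. `FieldCleaner` and `StripHeader` are hypothetical names, and the default header length of 2 is taken from the record markers in the question.

```csharp
using System;

public static class FieldCleaner
{
    // Drops the leading record-type header and the surrounding padding
    // from a raw fixed-width field such as "00Company1  ".
    public static string StripHeader(string rawField, int headerLength = 2)
    {
        if (rawField.Length <= headerLength)
            return string.Empty;

        return rawField.Substring(headerLength).Trim();
    }
}
```

It could be applied to each value before constructing a RowData, for example `FieldCleaner.StripHeader(fieldValues[0])`.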