Reading csv with fstream, and only fstream

Question

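I've seen other answers on this subject, but they all rely on std::stringstream, a temporary char or std::string array, or various external libraries. I would like to try using only the fstream header to read a file that contains only numbers (char and short as well as float), separated by commas and spread over more than one line of text; some may be arrays or vectors. Example: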

1,1.1,11.1,11
2,2.2,22.2,22
3,3.3,33.3,33
...

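The order is known, since each line follows the variables from a struct. The number of lines may vary, but, for now, let's assume it is also known. Also for the sake of example, let's only consider this order, and these types: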

int, double, double, int

Based on a piece of code that I've seen, I tried this simplistic (and, most probably, naive) approach:

int a, d;
double b, c;
char fileName[] {"file.txt"};
std::fstream fs {fileName};
if(!fs.is_open())
    // open with fs.out, write some defaults; this works, no need to mention
else
{
    char comma;
    while(fs.getline(fileName, 100, '\n'))
    {
        fs >> a >> comma >> b >> comma >> c >> comma >> d;
        std::cout << 2*a << ", " << 2*b << ", " << 2*c << ", " << 2*d << '\n';
    }
}

If the file has the three lines above, plus a terminating \n, it outputs:

4, 4.4, 44.4, 44
6, 6.6, 66.6, 66
6, 6.6, 66.6, 66
*** stack smashing detected ***: <unknown> terminated
Aborted (core dumped)

If I add a \n at the beginning of the file, it outputs:

2, 2.2, 22.2, 22
4, 4.4, 44.4, 44
6, 6.6, 66.6, 66
6, 6.6, 66.6, 66

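If I delete the last \n, it works as intended. I have a few questions:

What else can I do when writing the file, besides adding a leading \n and not inserting a terminating one, so that it works as intended?
If the number of variables is larger, say 100 per line, what can I do to avoid going 'round the Earth with fs >> a >> c >> ...?
If I only need to read a specific line, or only a few, one method would probably be to count the occurrences of \n, or the lines, somehow. How could I do that?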

(edit)

Lastly, as the title mentions, is it possible to do this without involving other headers, using only fstream (as it currently stands, for example)?

Answer 1

The order is known, since each line follows the variables from a struct. The number of lines may vary, but, for now, let's assume it is also known. Also for the sake of example, let's only consider this order, and these types:
int, double, double, int

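If the number and order of the fields is known, then you can simply read with either >> or getline, using the ',' or '\n' delimiter as required. While it is much wiser to use line-oriented input to read an entire line and then a stringstream to parse the fields, there is no reason you can't do the same thing using only fstream, as you have indicated is your goal. It isn't as elegant a solution, but it is a valid one nonetheless.

Using the >> Operator

Your data has 4 fields: the first 3 are delimited by a comma, the last by the newline. You can simply loop continually, read with the >> operator, and test the stream state (fail() or eof()) after each read, e.g.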

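#include <iostream>
#include <fstream>

#define MAXW 128

int main (int argc, char **argv) {

    int a, d;
    double b, c;
    char comma;

    if (argc < 2) {     /* validate that a filename was given */
        std::cerr << "usage: " << argv[0] << " <file>\n";
        return 1;
    }

    std::fstream f (argv[1]);
    if (!f.is_open()) {
        std::cerr << "error: file open failed " << argv[1] << ".\n";
        return 1;
    }

    for (;;) {          /* loop continually */
        f >> a >> comma >> b >> comma >> c >> comma >> d;
        /* stop on a failed extraction; eof alone is not a failure, so a
         * last line without a trailing '\n' is still processed */
        if (f.fail())
            break;
        std::cout << 2*a << "," << 2*b << "," << 2*c << "," << 2*d << '\n';
        f.ignore (MAXW, '\n');  /* discard the remainder of the line */
    }
    f.close();
}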

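By keeping a simple field counter n, you can use a simple switch statement on the field number to read each value into the corresponding variable, and once all fields have been read, output (or otherwise store) all 4 values that make up your struct. (Obviously you can also fill each member as it is read.) Nothing special is required, e.g.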

#include <iostream>
#include <fstream>

#define NFIELD 4

int main (int argc, char **argv) {

    int a, d, n = 0;
    double b, c;
    char comma;

    if (argc < 2) {     /* validate that a filename was given */
        std::cerr << "usage: " << argv[0] << " <file>\n";
        return 1;
    }

    std::fstream f (argv[1]);
    if (!f.is_open()) {
        std::cerr << "error: file open failed " << argv[1] << ".\n";
        return 1;
    }

    for (;;) {          /* loop continually */
        switch (n) {    /* coordinate read based on field number */
            /* leave the loop on any failed read, not only on eof, so that
             * malformed input cannot cause an endless loop */
            case 0: if (!(f >> a >> comma)) goto done; break;
            case 1: if (!(f >> b >> comma)) goto done; break;
            case 2: if (!(f >> c >> comma)) goto done; break;
            case 3: if (!(f >> d)) goto done; break;
        }
        if (++n == NFIELD) {    /* if all fields read */
            std::cout << 2*a << "," << 2*b << "," << 2*c << "," << 2*d << '\n';
            n = 0;      /* reset field number */
        }
    }
    done:;
    f.close();
}

Example Input File

Using your provided sample input.

$ cat dat/mixed.csv
1,1.1,11.1,11
2,2.2,22.2,22
3,3.3,33.3,33

Example Use/Output

The output, with each field simply doubled, is what you would expect:

$ ./bin/csv_mixed_read dat/mixed.csv
2,2.2,22.2,22
4,4.4,44.4,44
6,6.6,66.6,66

(the output is the same for both programs above)
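On the question of reading only one specific line: the f.ignore (MAXW, '\n') call used in the first example can also discard whole lines before the one you want. The following is only a minimal sketch, assuming every line is shorter than MAXW characters. The Record struct and the read_line_n helper are hypothetical names introduced here for illustration; they are not part of the programs above.

```cpp
#include <fstream>

const int MAXW = 1024;  /* assumption: no line is longer than this */

/* hypothetical struct mirroring the int, double, double, int layout */
struct Record {
    int a;
    double b, c;
    int d;
};

/* Hypothetical helper: read the record found on 0-based line `target`.
 * Returns false if the file has no such line or the line cannot be parsed. */
bool read_line_n (std::fstream& f, int target, Record& r)
{
    char comma;

    for (int i = 0; i < target; i++)    /* discard the preceding lines */
        f.ignore (MAXW, '\n');

    /* a read past the end of the file fails, which is reported as false */
    return static_cast<bool>(f >> r.a >> comma >> r.b >> comma
                               >> r.c >> comma >> r.d);
}
```

With the sample file, read_line_n (f, 2, r) would be expected to fill r from the line 3,3.3,33.3,33. Counting lines this way still reads every preceding character, since a text file offers no way to jump straight to line n.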

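Using getline Delimited by ',' and '\n'

You can use a slight variation on the logic to employ getline. Here, you read the first 3 fields with f.getline(buf, MAXC, ','), and once the 3rd field has been found, you read the final field with f.getline(buf, MAXC). For example,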

#include <iostream>
#include <fstream>
#include <string>   /* std::stoi, std::stod */

#define NFIELD  4
#define MAXC  128

int main (int argc, char **argv) {

    int a = 0, d = 0, n = 0;
    double b = 0.0, c = 0.0;
    char buf[MAXC];

    if (argc < 2) {     /* validate that a filename was given */
        std::cerr << "usage: " << argv[0] << " <file>\n";
        return 1;
    }

    std::fstream f (argv[1]);
    if (!f.is_open()) {
        std::cerr << "error: file open failed " << argv[1] << ".\n";
        return 1;
    }

    while (f.getline(buf, MAXC, ',')) { /* read each field */
        switch (n) {    /* coordinate read based on field number */
            case 0: a = std::stoi (buf); break;
            case 1: b = std::stod (buf); break;
            case 2: c = std::stod (buf); 
                if (!f.getline(buf, MAXC))  /* read d with '\n' delimiter */
                    goto done;
                d = std::stoi (buf);
                break;
        }
        /* d is read together with c, so a record is complete after 3 passes */
        if (++n == NFIELD - 1) {
            std::cout << 2*a << "," << 2*b << "," << 2*c << "," << 2*d << '\n';
            n = 0;      /* reset field number */
        }
    }
    done:;
    f.close();
}

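(note: unlike the >> operator, getline as used above does not skip whitespace. Any whitespace following a comma ends up in buf and has to be tolerated by whatever converts the field. std::stoi and std::stod do skip leading whitespace, but other conversions may not.)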

Example Use/Output

The output is the same.

$ ./bin/csv_mixed_read2 dat/mixed.csv
2,2.2,22.2,22
4,4.4,44.4,44
6,6.6,66.6,66

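Whether you use something like the examples above or a stringstream, you will have to know the number and order of the fields. Whether you use a loop with if..else if..else or a switch, the logic is the same: you need some way of coordinating each read with the correct field. Keeping a simple field counter is about as simple as anything else. Look things over and let me know if you have further questions.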
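For lines with many fields, the same field counter can index an array instead of selecting a named variable, which avoids a long chain of >> calls. This is only a sketch under a simplifying assumption: every field is read as a double, which is not true of the mixed int/double struct in the question. NCOL and read_row are hypothetical names introduced here for illustration.

```cpp
#include <fstream>

const int NCOL = 4;     /* assumption: number of fields on every line */

/* Hypothetical helper: read one line of NCOL comma-separated numbers into
 * row[], treating every field as a double. Returns true only when a
 * complete row was read. */
bool read_row (std::fstream& f, double (&row)[NCOL])
{
    char comma;

    for (int n = 0; n < NCOL; n++) {        /* n is the field counter */
        if (!(f >> row[n]))                 /* read the value itself */
            return false;
        if (n < NCOL - 1 && !(f >> comma))  /* consume the ',' between fields */
            return false;
    }
    return true;
}
```

Calling read_row in a loop such as while (read_row (f, row)) would process the file one line at a time. The >> operator skips the newline before the first field of the next row, so a missing or extra trailing '\n' should not change the result.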


c++

fstream