Why are tabs written as \t in a CSV file using Python

Question

Suppose I have a list that contains a tab character:

mylist = [['line 1', '<a href="//<% serverNames[0].getHostname() %>:'],
          ['line 2', '     <% master.getConfiguration()>']]

When I save the list to a CSV file, the tab in the code on line 2 is written as \t.

line | code
-----------------------------------------------------
   1 | <a href="//<% serverNames[0].getHostname() %>:
   2 | \t   <% master.getConfiguration()>

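I need the tab preserved because I want to compare the code against other lists, so I don't want to replace it with spaces or any other character.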

The code I wrote:

import csv

with open('codelist.csv', 'w') as file:
   header = ['line','code']
   writers = csv.writer(file)
   writers.writerow(header)
   for row in mylist:
      writers.writerow(row)

How can I solve this problem?

Answer 1

I can't reproduce the exact error in either Python 2 or Python 3, but I have a guess about what might be happening.

According to the documentation for csv.writer:

All other non-string data are stringified with str() before being written.

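Also note that if you pass a string containing an actual tab character to Python's str function, the interactive interpreter displays exactly what you describe (the argument below is a literal tab character):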

 >>> str('  ')
 '\t'
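
As a quick check of what str returns here, the following is a minimal sketch with illustrative variable names. It suggests that the '\t' shown at the prompt is the repr of the result, while the string itself still holds a single tab character.

```python
# A string whose first character is a real tab (spelled with the \t escape).
s = '\t   <% master.getConfiguration()>'

converted = str(s)

# str() of a str returns an equal string; the tab is still one character.
same = converted == s
first_char_code = ord(converted[0])  # 9 is the code point of a tab

# repr() is what the interactive prompt displays; it spells the tab
# as a backslash followed by the letter t.
shown = repr('\t')

print(same, first_char_code, shown)
```

So for a genuine str value, str() alone would not turn the tab into two characters; something else about the data would have to differ.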

Of course, what you have is string data, but the documentation above doesn't really say what "other" means. Here is what I found in the writerows implementation in _csv.c:

    if (PyUnicode_Check(field)) {
        append_ok = join_append(self, field, quoted);
        Py_DECREF(field);
    }
    else if (field == Py_None) {
        append_ok = join_append(self, NULL, quoted);
        Py_DECREF(field);
    }
    else {
        PyObject *str;

        str = PyObject_Str(field);
        Py_DECREF(field);
        if (str == NULL) {
            Py_DECREF(iter);
            return NULL;
        }
        append_ok = join_append(self, str, quoted);
        Py_DECREF(str);
    }

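So I suspect that your list somehow contains string data in a form that is not recognized as a unicode string. It would then fail the PyUnicode_Check branch, go through str (called PyObject_Str in the C code), and come out with the escape sequence embedded.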
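
One way this could happen in Python 3 is if a field is a bytes object rather than a str. The sketch below assumes that case purely for illustration (nothing in the question confirms it) and writes to an in-memory buffer instead of codelist.csv.

```python
import csv
import io

# Assumption for illustration: the second field arrived as bytes, not str.
row = ['line 2', b'\t   <% master.getConfiguration()>']

buffer = io.StringIO()
csv.writer(buffer).writerow(row)
written = buffer.getvalue()

# The bytes field is not a str, so it is converted with str(). The result
# is the text b'\t ...' containing a backslash and the letter t, not a tab.
has_real_tab = '\t' in written
has_escaped_tab = '\\t' in written

print(repr(written))
print(has_real_tab, has_escaped_tab)
```

If the file shows a leading b' before the code, this is a likely cause, and decoding the bytes to str before writing should keep the real tab.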

So you may want to check how this data is getting into your list.
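
A rough sketch of such a check follows. The names non_str, literal_escapes and round_trip are made up for this example, and the list is written here with a real tab via the \t escape. It looks for fields that are not str, for fields that already hold a backslash plus t instead of a tab, and then round-trips the list through an in-memory CSV.

```python
import csv
import io

mylist = [['line 1', '<a href="//<% serverNames[0].getHostname() %>:'],
          ['line 2', '\t   <% master.getConfiguration()>']]

# Fields that are not str would be passed through str() by csv.writer.
non_str = [(i, type(field).__name__)
           for i, row in enumerate(mylist)
           for field in row
           if not isinstance(field, str)]

# Fields that hold a backslash followed by 't' rather than a real tab.
literal_escapes = [field
                   for row in mylist
                   for field in row
                   if isinstance(field, str) and '\\t' in field]

# Round trip through an in-memory CSV to see whether the tab survives.
buffer = io.StringIO()
csv.writer(buffer).writerows(mylist)
round_trip = list(csv.reader(io.StringIO(buffer.getvalue(), newline='')))

print(non_str, literal_escapes, round_trip == mylist)
```

If both lists come back empty and the round trip matches, the tab is being written correctly and the \t is more likely coming from whatever tool is used to view the file.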

Or perhaps the source code I'm looking at doesn't correspond to the Python version you are using, and your version simply runs everything through str.
