Python csv: format all rows to one line

Question

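I have a csv file in which I want every row to be in a single column. I have tried importing it into MS Excel and formatting it with Notepad++, but every attempt treats part of a record as a new row. How can I use Python's csv module to format the file so that it removes the string "BRAS" and corrects the format? Each row is enclosed in quotes " and the delimiter is a pipe |. Update: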

 "aa|bb|cc|dd|
 ee|ff"
 "ba|bc|bd|be|
 bf"
 "ca|cb|cd|
 ce|cf"

The above should be 3 rows, but my editor sees them as 5 or 6, and so on.

import csv
import fileinput


with open('ventoya.csv') as f, open('ventoya2.csv', 'w') as w:
    for line in f:
        if 'BRAS' not in line:
            w.write(line)

N.B. I get a unicode error when I try this in Python:

 return codecs.charmap_decode(input,self.errors,decoding_table)[0]
 UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 18: character maps to <undefined>

Answer 1

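To solve this you don't even need to write code. 1: Just open the file in Notepad++. 2: In the first line, select from the | symbol to the start of the next line. 3: Go to Replace and replace the selected pattern with |.

The search mode can be either Normal or Extended :)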
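
The same find/replace can also be sketched in Python for anyone who prefers a script. This is only a minimal sketch, assuming every unwanted break comes directly after a | (as in the sample data in the question); the function name join_after_pipe is made up for illustration.

```python
import re

def join_after_pipe(text):
    """Remove a line break (and the indentation that follows it) after a '|'."""
    # \|      a literal pipe at the end of a line
    # \r?\n   the line break itself (Windows or Unix style)
    # [ \t]*  leading spaces or tabs on the continuation line
    return re.sub(r'\|\r?\n[ \t]*', '|', text)

sample = '"aa|bb|cc|dd|\n ee|ff"\n"ba|bc|bd|be|\n bf"\n"ca|cb|cd|\n ce|cf"\n'
fixed = join_after_pipe(sample)
print(fixed)
```

Breaks that follow a closing quote are left alone, so each record still ends up on its own line.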

Answer 2

Well, since the line breaks are consistent, you could go in and do a find/replace as suggested, but you could also do a quick conversion with a Python script:

linecount = 0
with open('ventoya.csv') as f, open('ventoya2.csv', 'w') as w:
    for line in f:
        line = line.rstrip()

        # remove unwanted breaks by concatenating pairs of rows
        if linecount % 2 == 0:
            line1 = line
        else:
            full_line = line1 + line

            # remove spaces from front of 2nd half of line
            full_line = full_line.replace(' ', '')

            # if you want comma delimiters, uncomment next line:
            # full_line = full_line.replace('|', ',')

            # write only once both halves of the row have been joined
            if 'BRAS' not in full_line:
                w.write(full_line + '\n')

        linecount += 1

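This worked for me on the test data, and you can change the delimiter when writing the file if you want. The benefits of doing it in code: 1. you get to do it in code (always fun), and 2. you can remove the line breaks and filter what gets written to the file at the same time.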
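
If some records span more than two lines, pairing lines by count will drift out of step. One possible variation, sketched here under the assumption that every record starts and ends with a double quote and contains no other quotes, is to keep appending lines until the quote count is balanced. The helper name merge_records is hypothetical and the input list is made-up test data.

```python
def merge_records(lines, unwanted='BRAS'):
    """Yield one string per quoted record, skipping records that contain `unwanted`."""
    buffer = ''
    for line in lines:
        buffer += line.strip()          # drop the line break and any indentation
        if buffer.count('"') % 2 == 0:  # quotes balanced: the record is complete
            if buffer and unwanted not in buffer:
                yield buffer
            buffer = ''

# Made-up input: the second record spans three lines and contains BRAS.
lines = ['"aa|bb|cc|dd|\n', ' ee|ff"\n',
         '"ba|BRAS|bd|\n', ' be|\n', ' bf"\n',
         '"ca|cb|cd|\n', ' ce|cf"\n']
merged = list(merge_records(lines))
print(merged)
```

For a real file, an open file object can be passed in place of the list. If reading fails with a UnicodeDecodeError like the one in the question, passing an explicit encoding= argument to open() in Python 3 may be worth trying, though the file's actual encoding cannot be told from the post.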

Answer 3

Here is a quick hack for small input files (the content is read into memory).

#!python2

fnameIn = 'ventoya.csv'
fnameOut = 'ventoya2.csv'
with open(fnameIn) as fin, open(fnameOut, 'w') as fout:
    data = fin.read()              # content of the input file
    data = data.replace('\n', '')  # make it one line
    data = data.replace('""', '|') # split char instead of doubled ""
    data = data.replace('"', '')   # remove the first and last "
    print data
    for x in data.split('|'):      # split by bar
        fout.write(x + '\n')       # write to separate lines

Or, if the goal is only to fix the extra (unwanted) line breaks to form a single-column CSV file, the file can be fixed first and then read through the csv module:

#!python2
import csv

fnameIn = 'ventoya.csv'
fnameFixed = 'ventoyaFixed.csv'
fnameOut = 'ventoya2.csv'

# Fix the input file.
with open(fnameIn) as fin, open(fnameFixed, 'w') as fout:
    data = fin.read()                   # content of the file
    data = data.replace('\n', '')       # remove the newlines
    data = data.replace('""', '"\n"')   # add the newlines back between the cells
    fout.write(data)

# It is an overkill, but now the fixed file can be read using
# the csv module.
with open(fnameFixed, 'rb') as fin, open(fnameOut, 'wb') as fout:
    reader = csv.reader(fin)
    writer = csv.writer(fout)
    for row in reader:
        writer.writerow(row)
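
On Python 3, the csv module can arguably do the joining itself, because its reader treats a line break inside a quoted field as part of that field. The following is a sketch rather than the method used above: it assumes each record is one quoted field whose opening quote is the first character of a line, and the function name one_line_per_record is invented for the example.

```python
import csv
import io

def one_line_per_record(infile, unwanted='BRAS'):
    """Return each quoted multi-line record as a single-line string."""
    records = []
    # Everything between the quotes is a single field, so each
    # record comes back as a row with exactly one field.
    for row in csv.reader(infile):
        if not row:
            continue                      # skip blank lines
        # Rejoin the pieces of the field, dropping the embedded line
        # breaks and the indentation of the continuation lines.
        text = ''.join(part.strip() for part in row[0].splitlines())
        if unwanted not in text:
            records.append(text)
    return records

# io.StringIO stands in for open('ventoya.csv', newline='')
sample = io.StringIO(
    '"aa|bb|cc|dd|\n ee|ff"\n"ba|bc|bd|be|\n bf"\n"ca|cb|cd|\n ce|cf"\n',
    newline='')
records = one_line_per_record(sample)
print(records)
```

The reader strips the surrounding quotes, so the returned strings hold only the pipe-separated values; text.split('|') would then give the individual cells.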
