Python csv: format all rows to one line
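I have a CSV file in which I want every record to sit on one line, in a single column. I tried importing it into MS Excel and reformatting it with Notepad++, but every attempt treats part of a record as a new row.
How can I use Python's csv module to format the file so that it removes the string "BRAS" and corrects the formatting? Each record is enclosed in double quotes ("), and the delimiter is a pipe (|).
Update: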
"aa|bb|cc|dd|
ee|ff"
"ba|bc|bd|be|
bf"
"ca|cb|cd|
ce|cf"
The above should be 3 rows, but my editors see 5 or 6 rows, and so on.
import csv
import fileinput

with open('ventoya.csv') as f, open('ventoya2.csv', 'w') as w:
    for line in f:
        if 'BRAS' not in line:
            w.write(line)
N.B. I get a Unicode error when I try this in Python:
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 18: character maps to <undefined>
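You don't even need to write code to solve this.
1: Open the file in Notepad++.
2: On the first line, select from the | symbol to the start of the next line.
3: Go to Replace and replace the selected pattern with |.
The search mode can be Normal or Extended :)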
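If you would rather script the same replacement, here is a minimal sketch in Python. It assumes, as in the sample from the question, that every unwanted line break comes directly after a | character. The function name join_after_pipe is made up for this illustration.

```python
def join_after_pipe(text):
    """Remove every line break that directly follows a pipe character."""
    # Normalise Windows line endings first so one replacement covers both cases.
    return text.replace("\r\n", "\n").replace("|\n", "|")


# Sample data from the question: three records spread over six lines.
sample = '"aa|bb|cc|dd|\nee|ff"\n"ba|bc|bd|be|\nbf"\n"ca|cb|cd|\nce|cf"\n'
print(join_after_pipe(sample))
```

Line breaks that follow the closing quote are left alone, so each record still ends up on its own line.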
Well, since the line breaks are consistent, you could go in and do the find/replace as suggested, but you could also do a quick conversion with a Python script:
import csv
import fileinput

linecount = 0
with open('ventoya.csv') as f, open('ventoya2.csv', 'w') as w:
    for line in f:
        line = line.rstrip()
        # remove unwanted breaks by concatenating pairs of rows
        if linecount % 2 == 0:
            line1 = line
        else:
            full_line = line1 + line
            # remove spaces (including those at the front of the 2nd half of the line)
            full_line = full_line.replace(' ', '')
            # if you want comma delimiters, uncomment next line:
            # full_line = full_line.replace('|', ',')
            if 'BRAS' not in full_line:
                w.write(full_line + '\n')
        linecount += 1
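This worked for me on the test data, and you can change the delimiter while writing the file if you want. The nice things about doing it in code: 1. you get to do it in code (always fun), and 2. you can remove the line breaks and filter what gets written to the file in the same pass.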
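The pairing approach relies on every record being split across exactly two lines. If that does not hold for the whole file, a variation is to keep joining lines until the double quotes balance. This is only a sketch: merge_quoted_records and its skip_word parameter are names invented here, it assumes the data contains no quote characters other than the ones wrapping each record, and unlike the script above it does not strip spaces.

```python
def merge_quoted_records(lines, skip_word="BRAS"):
    """Yield one string per record, joining physical lines until the quotes balance."""
    buffer = ""
    for line in lines:
        buffer += line.rstrip("\r\n")
        # An odd number of quotes means the record has not been closed yet.
        if buffer.count('"') % 2 == 1:
            continue
        record, buffer = buffer, ""
        # Skip blank lines and any record containing the unwanted word.
        if record and skip_word not in record:
            yield record
    # A trailing record whose closing quote never appears is silently dropped.


sample = ['"aa|bb|cc|dd|\n', 'ee|ff"\n',
          '"ba|BRAS|bd|be|\n', 'bf"\n',
          '"ca|cb|cd|\n', 'ce|cf"\n']
for record in merge_quoted_records(sample):
    print(record)
```

Because the function takes any iterable of lines, an open file object can be passed in directly in place of the sample list.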
Here is a quick hack for small input files (the whole content is read into memory).
#!python2

fnameIn = 'ventoya.csv'
fnameOut = 'ventoya2.csv'

with open(fnameIn) as fin, open(fnameOut, 'w') as fout:
    data = fin.read()                  # content of the input file
    data = data.replace('\n', '')      # make it one line
    data = data.replace('""', '|')     # split char instead of doubled ""
    data = data.replace('"', '')       # remove the first and last "
    print data
    for x in data.split('|'):          # split by bar
        fout.write(x + '\n')           # write to separate lines
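Or, if the goal is only to fix the extra (unwanted) newlines so that the result is a single-column CSV file, you can fix the file first and then read it through the csv module: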
#!python2
import csv

fnameIn = 'ventoya.csv'
fnameFixed = 'ventoyaFixed.csv'
fnameOut = 'ventoya2.csv'

# Fix the input file.
with open(fnameIn) as fin, open(fnameFixed, 'w') as fout:
    data = fin.read()                    # content of the file
    data = data.replace('\n', '')        # remove the newlines
    data = data.replace('""', '"\n"')    # add the newlines back between the cells
    fout.write(data)

# It is an overkill, but now the fixed file can be read using
# the csv module.
with open(fnameFixed, 'rb') as fin, open(fnameOut, 'wb') as fout:
    reader = csv.reader(fin)
    writer = csv.writer(fout)
    for row in reader:
        writer.writerow(row)
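The two scripts above target Python 2. For Python 3, here is a rough sketch that skips the intermediate file: csv.reader already treats a quoted field that spans several physical lines as one field, so the line breaks only need to be stripped from each cell. The helper name flatten_records and the skip_word parameter are invented for this example. The encoding shown in the commented file-based usage is a guess as well: the UnicodeDecodeError in the question suggests the file is not in the platform's default encoding, so replace "utf-8" with whatever the file actually uses.

```python
import csv
import io


def flatten_records(src, dst, skip_word="BRAS"):
    """Copy records from src to dst, one per line, dropping embedded line breaks."""
    reader = csv.reader(src)  # a quoted field may span several physical lines
    writer = csv.writer(dst, lineterminator="\n")
    for row in reader:
        if not row:  # skip blank lines
            continue
        cells = [cell.replace("\r", "").replace("\n", "") for cell in row]
        if any(skip_word in cell for cell in cells):
            continue  # filter out records containing the unwanted word
        writer.writerow(cells)


# In-memory demonstration with the sample from the question.
src = io.StringIO('"aa|bb|cc|dd|\nee|ff"\n"ba|bc|bd|be|\nbf"\n"ca|cb|cd|\nce|cf"\n',
                  newline="")
dst = io.StringIO()
flatten_records(src, dst)
print(dst.getvalue())

# With real files (the encoding is an assumption, adjust it to the actual file):
# with open("ventoya.csv", newline="", encoding="utf-8", errors="replace") as f, \
#      open("ventoya2.csv", "w", newline="", encoding="utf-8") as w:
#     flatten_records(f, w)
```

The writer drops the surrounding quotes when a cell no longer needs them, so the sample comes out as three plain lines with the pipes intact.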