Reading the header of a csv file, checking whether it matches a dictionary key, then writing that key's value to the row
Basically I'll have a bunch of small dictionaries, like this:
dictionary_list = [
    {"eight": "yes", "queen": "yes", "we": "yes", "eighteen": "yes"},
    {"nine": "yes", "king": "yes", "we": "yes", "nineteen": "yes"}
]
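Then I have a csv file with a whole bunch of columns that have words in the header, like this:
There may be 500 columns, each with 1 word, and I don't know the order in which the columns appear. However, I do know that any word in my small dictionaries should match a word in one of the columns.
I want to iterate over the file's headers (first skipping ahead to the 5th column header) and check each time whether the header name can be found in the dictionary. If it can, add the value to the row; if not, add "no". This is done row by row, where each row corresponds to one small dictionary. The result for this file using the dictionaries above would be:
So far I have tried the following, but it doesn't actually work: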
f = open("file.csv", "r")
writer = csv.DictWriter(f)
for dict in dictionary_list: # this is the collection of little dictionaries
    # do some other stuff
    for r in writer:
        #not sure how to skip 10 columns here. next() seems to work on rows
        for col in r:
            if col in dict.keys():
                writer.writerow(dict.values())
            else:
                writer.writerow("no")
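Pandas might help you here.
The documentation is at http://pandas.pydata.org/pandas-docs/stable/.
You can use the pandas.read_csv() method to process the csv file, and the DataFrame.append() method to add data as needed. Note that DataFrame.append() was removed in pandas 2.0; pandas.concat() is its replacement in current versions.
Hope this helps.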
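The suggestion above comes without code, so here is a minimal sketch of how it might look. It is not a definitive implementation: the shortened header line and the name `header_text` are made up for illustration, and `io.StringIO` stands in for the real file. It also fills every unmatched column with "no", including the leading columns that the question wants to skip.

```python
import io

import pandas as pd

# Hypothetical, shortened header line standing in for the real csv file.
header_text = "row1,row2,eight,nine,we\n"

dictionary_list = [
    {"eight": "yes", "we": "yes"},
    {"nine": "yes", "we": "yes"},
]

# nrows=0 reads only the header, so we get the column order without any data.
headers = pd.read_csv(io.StringIO(header_text), nrows=0).columns

# One row per dictionary; reindex puts the columns in header order and
# creates the missing ones, then fillna covers gaps in existing columns.
df = (
    pd.DataFrame(dictionary_list)
    .reindex(columns=headers, fill_value="no")
    .fillna("no")
)

print(df.to_csv(index=False))
```

With a real file, `pd.read_csv("file.csv", nrows=0)` would take the place of the in-memory header.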
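Your question seems to be asking how to make sure that the fields in your dictionary_list exist in each record. If a field already exists in the record, its value is set to yes; otherwise the field is added to the record with the value no.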
#!/usr/bin/env python3
import csv

dictionary_list = [
    {"eight": "yes", "queen": "yes", "we": "yes", "eighteen": "yes"},
    {"nine": "yes", "king": "yes", "them": "yes", "nineteen": "yes"}
]

# Flatten all the dictionary keys into a unique set, as the
# key names will be used for field names and can't be duplicated.
field_check = {k for d in dictionary_list for k in d}

if __name__ == "__main__":
    with open("file.csv", "r", newline="") as f:
        reader = csv.DictReader(f)
        # do not consider the first 10 columns
        field_tail = set(reader.fieldnames[10:])
        # Initialize the yes and no fields once, as they
        # should be the same for every row in the file.
        yes_fields = field_check & field_tail
        no_fields = field_check - yes_fields
        yes_dict = {k: "yes" for k in yes_fields}
        no_dict = {k: "no" for k in no_fields}
        for row in reader:
            row.update(yes_dict)
            row.update(no_dict)
            print(row)
Given an input file headers.csv:
row1,row2,row3,row4,row5,bad,good,eight,nine,queen,three,eighteen,nineteen,king,jack,ace,we,them,you,two
The following code generates your output:
import csv

dictionary_list = [{"eight": "yes", "queen": "yes", "we": "yes", "eighteen": "yes"},
                   {"nine": "yes", "king": "yes", "we": "yes", "nineteen": "yes"}]

# Read the input header line as a list
with open('headers.csv', newline='') as f:
    reader = csv.reader(f)
    headers = next(reader)

# Generate the fixed values for the first 5 columns.
rowvals = dict(zip(headers[:5], ['x'] * 5))

with open('file.csv', 'w', newline='') as f:
    # When writing a row, restval is the default value for any column missing from the row dict.
    # extrasaction='ignore' prevents an error if the row dict has keys that are not in headers.
    writer = csv.DictWriter(f, headers, restval='no', extrasaction='ignore')
    writer.writeheader()
    for dictionary in dictionary_list:
        D = dictionary.copy()  # needed if the original shouldn't be modified.
        D.update(rowvals)
        writer.writerow(D)
Output:
row1,row2,row3,row4,row5,bad,good,eight,nine,queen,three,eighteen,nineteen,king,jack,ace,we,them,you,two
x,x,x,x,x,no,no,yes,no,yes,no,yes,no,no,no,no,yes,no,no,no
x,x,x,x,x,no,no,no,yes,no,no,no,yes,yes,no,no,yes,no,no,no
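If the leading columns should be skipped entirely rather than filled with a fixed value, one possible variant is sketched below. It swaps `csv.DictWriter` for a plain `csv.writer` combined with `dict.get()`, and writes to an in-memory buffer so it runs without any files. The function name `write_rows`, its `skip` and `default` parameters, and the choice to leave the skipped columns empty are all assumptions made for illustration, not part of the original question.

```python
import csv
import io


def write_rows(headers, dictionaries, skip=5, default="no"):
    """Return CSV text with a header line and one row per dictionary.

    The first `skip` columns are left empty (an assumption); every other
    column gets the dictionary's value for that header, or `default`.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(headers)
    for d in dictionaries:
        leading = [""] * skip
        # Look each remaining header up in the dictionary, in header order.
        rest = [d.get(name, default) for name in headers[skip:]]
        writer.writerow(leading + rest)
    return out.getvalue()


# Shortened, hypothetical header and dictionaries for the demonstration.
headers = ["row1", "row2", "eight", "nine", "we"]
dictionary_list = [{"eight": "yes", "we": "yes"}, {"nine": "yes", "we": "yes"}]

result = write_rows(headers, dictionary_list, skip=2)
print(result)
```

Because only `headers[skip:]` is looked up, a dictionary key that happens to share a name with one of the leading columns never overwrites it.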