Summing an Attribute Based on Other Attributes in a List
Basically, I have a csv file containing data like this:
['Store A', '2015-03-04', '00948', 'Red','A','AA']
['Store C', '2015-05-06', '00948', 'Blue','A','BB']
['Store B', '2015-07-08', '101130', 'Red','B','CC']
['Store A', '2015-09-10', '111011', 'Blue','C','DD']
['Store C', '2015-10-11', '101510', 'Red','A','EE']
['Store B', '2015-11-12', '101459', 'Red','B','FF']
['Store C', '2015-15-04', '01836', 'Blue','C','GG']
['Store B', '2015-30-05', '02201', 'Blue','A','HH']
['Store A', '2015-18-06', '04022', 'Red','C','II']
['Store C', '2015-07-07', '11056', 'Blue','B','JJ']
['Store C', '2015-08-05', '10149', 'Red','D','KK']
['Store A', '2015-10-04', '113569', 'Red','A','LL']
['Store B', '2015-12-03', '005410', 'Blue','C','MM']
['Store A', '2015-15-02', '053410', 'Blue','E','NN']
['Store A', '2015-16-04', '113410', 'Red','J','OO']
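I want to determine how many times the word 'Blue' appears for each store, so the output is essentially the count of 'Blue' rows grouped by the first attribute, i.e. Store A, B and C. The desired output is: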
['Store A','Blue','2']
['Store B','Blue','2']
['Store C','Blue','3']
My code is as follows:
csvReader = csv.reader(open('count.csv','rb'), delimiter=',', quotechar='"')
for line in csvReader:
    print line.count('Blue')
Obviously, the result is:
>>>
0
0
0
.
.
.
.
0
0
I also tried this code:
csvReader = csv.reader(open('count.csv','rb'), delimiter=',', quotechar='"')
for line in csvReader:
    count_blue= [[x, line.count('Blue')] for x in set(line)]
    print count_blue
It doesn't give me the desired output either. What am I doing wrong? Thanks for your help.
This doesn't look like a CSV file; it looks like one Python list per line. Use literal_eval to read each line and feed the result to a Counter:
from ast import literal_eval
from collections import Counter

blues = Counter()
with open("count.csv") as f:
    for line in f:
        ls = literal_eval(line)  # parse the line as a Python list
        if ls[3] == 'Blue':
            blues[ls[0]] += 1    # count one more 'Blue' row for this store
If you want to print it in the desired output format:
for key in blues:
    print("['{}', 'Blue', {}]".format(key, blues[key]))
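For trying the literal_eval approach without touching a file, here is a minimal sketch that runs on a few of the sample lines from the question held in memory. The function name count_by_store and its color parameter are made up for illustration, and the blank-line check is an added assumption (literal_eval raises on an empty string); the count is printed as a string only to match the quoted format the question asks for.

```python
from ast import literal_eval
from collections import Counter


def count_by_store(lines, color="Blue"):
    """Count rows whose 4th field equals `color`, keyed by the 1st field (the store)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:  # assumption: skip blank lines, which literal_eval would reject
            continue
        row = literal_eval(line)  # each line is the repr of a Python list
        if row[3] == color:
            counts[row[0]] += 1
    return counts


# A subset of the rows shown in the question
sample = [
    "['Store A', '2015-03-04', '00948', 'Red','A','AA']",
    "['Store C', '2015-05-06', '00948', 'Blue','A','BB']",
    "['Store A', '2015-09-10', '111011', 'Blue','C','DD']",
    "['Store C', '2015-15-04', '01836', 'Blue','C','GG']",
]

blue_counts = count_by_store(sample)
for store in sorted(blue_counts):
    print([store, "Blue", str(blue_counts[store])])
```

Passing an open file object instead of the list should work the same way, since the function only iterates over lines.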
I'm assuming your CSV file really is a CSV file, where the comma is the delimiter and the quotechar is the single-quote character '.
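Counting how often a value occurs in column 3 (zero-based) for each store in column 0 requires grouping the data by column 0. One way to do that is with a dictionary. collections.defaultdict is a kind of dictionary that makes it easy to collect lists of values that share a common key. Once you have that, you can count the "Blue" items, or "Red", or any other value you might have.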
import csv
from collections import defaultdict

d = defaultdict(list)
with open('count.csv') as f:
    for row in csv.reader(f, quotechar="'"):
        d[row[0]].append(row[3])  # group column 3 values by store

for k in sorted(d):
    print('{},{}'.format(k, d[k].count('Blue')))
Output
Store A,2
Store B,2
Store C,3
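As a variation on the grouping idea, the sketch below counts every (store, color) pair in one pass with a Counter instead of collecting lists. It assumes a hypothetical file layout that is not confirmed by the question: plain CSV rows with single-quoted fields and a space after some commas, which is why skipinitialspace=True is passed (without it the csv module does not treat a quote that follows a space as an opening quote). The sample rows are taken from the question with the brackets removed.

```python
import csv
import io
from collections import Counter

# Assumed layout: single-quoted fields, a space after some commas, no brackets
sample = io.StringIO(
    "'Store A', '2015-03-04', '00948', 'Red','A','AA'\n"
    "'Store C', '2015-05-06', '00948', 'Blue','A','BB'\n"
    "'Store A', '2015-09-10', '111011', 'Blue','C','DD'\n"
    "'Store C', '2015-15-04', '01836', 'Blue','C','GG'\n"
)

pair_counts = Counter()
for row in csv.reader(sample, quotechar="'", skipinitialspace=True):
    # key on (store, color) so every color is counted in the same pass
    pair_counts[(row[0], row[3])] += 1

for (store, color), n in sorted(pair_counts.items()):
    print('{},{},{}'.format(store, color, n))
```

Keying on the pair avoids keeping a list of colors per store in memory, at the cost of losing the original row order within each store.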