How to count specific words in each row of a column in Python pandas
I have the following DataFrame in pandas:
import pandas as pd

test = pd.DataFrame({'Food': ['Apple Cake Apple', 'Orange Tomato Cake', 'Broccoli Apple Orange', 'Cake Orange Cake', 'Tomato Apple Orange'],
                     'Type': ['Fruit Dessert Fruit', 'Fruit Veggie Dessert', 'Veggie Fruit Fruit', 'Dessert Fruit Dessert', 'Veggie Fruit Fruit']})
test
                    Food                   Type
0       Apple Cake Apple    Fruit Dessert Fruit
1     Orange Tomato Cake   Fruit Veggie Dessert
2  Broccoli Apple Orange     Veggie Fruit Fruit
3       Cake Orange Cake  Dessert Fruit Dessert
4    Tomato Apple Orange     Veggie Fruit Fruit
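I want to create a new column that counts how often each value occurs in the "Type" column of a row, with the counts ordered from largest to smallest. For example, this is exactly what I'm looking for: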
test = pd.DataFrame({'Food': ['Apple Cake Apple', 'Orange Tomato Cake', 'Broccoli Apple Orange', 'Cake Orange Cake', 'Tomato Apple Orange'],
                     'Type': ['Fruit Dessert Fruit', 'Fruit Veggie Dessert', 'Veggie Fruit Fruit', 'Dessert Fruit Dessert', 'Veggie Fruit Fruit'],
                     'Count': ['2 1', '1 1 1', '2 1', '2 1', '2 1']})
test
                    Food                   Type  Count
0       Apple Cake Apple    Fruit Dessert Fruit    2 1
1     Orange Tomato Cake   Fruit Veggie Dessert  1 1 1
2  Broccoli Apple Orange     Veggie Fruit Fruit    2 1
3       Cake Orange Cake  Dessert Fruit Dessert    2 1
4    Tomato Apple Orange     Veggie Fruit Fruit    2 1
How can I do this? Many thanks!
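IIUC:
# Split each "Type" string into words, one word per row (the original row index is repeated)
s = test['Type'].str.split().explode()
# Count each word per original row, sort the counts in descending order, then join them per row
s = s.groupby([s.index, s]).size().sort_values(ascending=False).groupby(level=0).agg(lambda x: ' '.join(x.astype(str)))
test['Count'] = s
s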
0      2 1
1    1 1 1
2      2 1
3      2 1
4      2 1
Name: Type, dtype: object
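If the chained groupby is hard to follow, the same counts can probably be produced row by row with collections.Counter from the standard library. This is only a minimal sketch, not part of the original answer: the helper name count_words_desc is hypothetical, and the frame is rebuilt here with the three-word "Type" values from the question so that the snippet runs on its own.

```python
from collections import Counter

import pandas as pd

# Same data as in the question, rebuilt so the snippet is self-contained
test = pd.DataFrame({
    'Food': ['Apple Cake Apple', 'Orange Tomato Cake', 'Broccoli Apple Orange',
             'Cake Orange Cake', 'Tomato Apple Orange'],
    'Type': ['Fruit Dessert Fruit', 'Fruit Veggie Dessert', 'Veggie Fruit Fruit',
             'Dessert Fruit Dessert', 'Veggie Fruit Fruit'],
})


def count_words_desc(text):
    """Hypothetical helper: word counts of `text`, largest first, joined by spaces."""
    counts = Counter(text.split())
    # most_common() yields (word, count) pairs ordered from highest to lowest count
    return ' '.join(str(n) for _, n in counts.most_common())


# Apply the helper to every row of the "Type" column
test['Count'] = test['Type'].apply(count_words_desc)
print(test['Count'].tolist())
```

Because apply calls a Python function once per row, this is likely slower than the explode/groupby version on large frames, but each step is easier to inspect.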