How to count specific words in each row of a column in Python pandas

Question

I have the following DataFrame in pandas:

import pandas as pd

test = pd.DataFrame({'Food': ['Apple Cake Apple', 'Orange Tomato Cake', 'Broccoli Apple Orange', 'Cake Orange Cake', 'Tomato Apple Orange'],
                     'Type': ['Fruit Dessert Fruit', 'Fruit Veggie Dessert', 'Veggie Fruit Fruit', 'Dessert Fruit Dessert', 'Veggie Fruit Fruit']})
test

    Food                    Type
0   Apple Cake Apple        Fruit Dessert Fruit
1   Orange Tomato Cake      Fruit Veggie Dessert
2   Broccoli Apple Orange   Veggie Fruit Fruit
3   Cake Orange Cake        Dessert Fruit Dessert
4   Tomato Apple Orange     Veggie Fruit Fruit

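I want to create a new column that counts how many times each value occurs in the "Type" column of each row and lists those counts from largest to smallest, regardless of which food type each count belongs to. For example, this is exactly what I'm looking for: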

test = pd.DataFrame({'Food': ['Apple Cake Apple', 'Orange Tomato Cake', 'Broccoli Apple Orange', 'Cake Orange Cake', 'Tomato Apple Orange'],
                     'Type' : ['Fruit Dessert Fruit', 'Fruit Veggie Dessert', 'Veggie Fruit Fruit', 'Dessert Fruit Dessert', 'Veggie Fruit Fruit'],
                     'Count': ['2 1', '1 1 1', '2 1', '2 1', '2 1']})
test

    Food                    Type                    Count
0   Apple Cake Apple        Fruit Dessert Fruit     2 1
1   Orange Tomato Cake      Fruit Veggie Dessert    1 1 1
2   Broccoli Apple Orange   Veggie Fruit Fruit      2 1
3   Cake Orange Cake        Dessert Fruit Dessert   2 1
4   Tomato Apple Orange     Veggie Fruit Fruit      2 1

How can I do this? Thank you very much!

Answer 1

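IIUC, you can split each "Type" string into words, explode them into one word per row, and count per original row:

# One word per row; explode() keeps the original row index
s = test.Type.str.split().explode()
# Count each word within its original row, sort the counts in descending order,
# then join the counts of each row back into a single string
s = (s.groupby([s.index, s]).size()
      .sort_values(ascending=False)
      .groupby(level=0)
      .agg(lambda x: ' '.join(x.astype(str))))
# Attach the result to the question's frame (named test, not df)
test['C'] = s
s
0      2 1
1    1 1 1
2      2 1
3      2 1
4      2 1
Name: Type, dtype: object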
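
For comparison, here is a minimal sketch of a row-wise alternative that is not part of the answer above. It assumes the "Type" values are whitespace-separated words, as in the question, and uses `collections.Counter` from the standard library. The helper name `count_types` and the column name `Count` are only illustrative choices.

```python
from collections import Counter

import pandas as pd

test = pd.DataFrame({
    'Food': ['Apple Cake Apple', 'Orange Tomato Cake', 'Broccoli Apple Orange',
             'Cake Orange Cake', 'Tomato Apple Orange'],
    'Type': ['Fruit Dessert Fruit', 'Fruit Veggie Dessert', 'Veggie Fruit Fruit',
             'Dessert Fruit Dessert', 'Veggie Fruit Fruit'],
})


def count_types(cell: str) -> str:
    """Count each word in one cell and join the counts, largest first."""
    counts = Counter(cell.split())
    # most_common() yields (word, count) pairs ordered by count, descending
    return ' '.join(str(n) for _, n in counts.most_common())


# Hypothetical column name taken from the question's desired output
test['Count'] = test['Type'].apply(count_types)
print(test)
```

This processes one row at a time with `apply`, so it may be slower than the explode/groupby approach on large frames, but each step is easy to follow.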
