How to reorder a pandas DataFrame based on a dictionary condition
I have a df like this:
case step deep value
0 case 1 1 ram in India ram,cricket
1 NaN 2 ram plays cricket NaN
2 case 2 1 ravi played football ravi
3 NaN 2 ravi works welll NaN
4 case 3 1 Sri bought a car sri
5 NaN 2 sri went out NaN
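and a dictionary, my_dict = {'ram':1,'cricket':1,'ravi':2.5,'sri':1}
I am trying to reorder the DataFrame based on the values in this dictionary, which I built with a tf-idf method. I am having trouble because each row that has a value must move together with the rows that follow it: every case should stay in one block.
My expected output is: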
case step deep value
2 case 2 1 ravi played football ravi
3 NaN 2 ravi works welll NaN
0 case 1 1 ram in India ram,cricket
1 NaN 2 ram plays cricket NaN
4 case 3 1 Sri bought a car sri
5 NaN 2 sri went out NaN
Please help, thanks in advance!
You can create a MultiIndex for sorting, provided every value in the value column is a key in my_dict:
my_dict = {'ram':1,'cricket':1,'ravi':2.5,'sri':1}
#create DataFrame from value column, replace and sum columns
a = df['value'].str.split(',', expand=True).replace(my_dict).sum(axis=1)
#create groups
b = df['step'].diff().le(0).cumsum()
#create Series by summing per groups
c = a.groupby(b).transform('sum')
#create MultiIndex
df.index = [c,b]
print (df)
case step deep value
step
2.0 0 case 1 1 ram in India ram,cricket
0 NaN 2 ram plays cricket NaN
2.5 1 case 2 1 ravi played football ravi
1 NaN 2 ravi works welll NaN
1.0 2 case 3 1 Sri bought a car sri
2 NaN 2 sri went out NaN
#sort by the MultiIndex in descending order, then drop it
df = df.sort_index(ascending=False).reset_index(drop=True)
print (df)
case step deep value
0 case 2 1 ravi played football ravi
1 NaN 2 ravi works welll NaN
2 case 1 1 ram in India ram,cricket
3 NaN 2 ram plays cricket NaN
4 case 3 1 Sri bought a car sri
5 NaN 2 sri went out NaN
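For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It is an assumption-laden variant, not the exact code above: the DataFrame constructor is reconstructed from the table in the question, and the per-row score uses explode and map instead of split(expand=True).replace(...), which avoids relying on how replace handles mixed object columns. The names scores, group, group_score and order are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question (assumed column types).
df = pd.DataFrame({
    "case": ["case 1", np.nan, "case 2", np.nan, "case 3", np.nan],
    "step": [1, 2, 1, 2, 1, 2],
    "deep": ["ram in India", "ram plays cricket", "ravi played football",
             "ravi works welll", "Sri bought a car", "sri went out"],
    "value": ["ram,cricket", np.nan, "ravi", np.nan, "sri", np.nan],
})
my_dict = {"ram": 1, "cricket": 1, "ravi": 2.5, "sri": 1}

# Per-row score: split the keywords, look each one up, sum back per row.
# Rows whose value is NaN end up with a score of 0.
scores = (
    df["value"].str.split(",").explode().map(my_dict)
    .groupby(level=0).sum()
)

# Group id: a new group starts whenever step stops increasing.
group = df["step"].diff().le(0).cumsum()

# Broadcast each group's total score to all of its rows.
group_score = scores.groupby(group).transform("sum")

# Stable sort on the negated score: highest group first, and rows with
# equal scores keep their original relative order.
order = (-group_score).sort_values(kind="stable").index
result = df.loc[order]
print(result)
```

This keeps the original index labels, as in the expected output of the question; append .reset_index(drop=True) to renumber the rows from 0 as the answer does. Because the sort is stable, groups with equal scores stay in their original order, whereas sort_index(ascending=False) on the MultiIndex would put the later group first.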