How to refer to the index field of a pandas DataFrame?
I have the following data frame:
                                              payment_method_id  payment_plan_days  plan_list_price  actual_amount_paid        date
msno
YyO+tlZtAXYXoZhNr3Vg3+dfVQvrBVGO8j1mfqe4ZHc=                 41                 30              129                 129  2015-01-01
AZtu6Wl0gPojrEQYB8Q3vBSmE2wnZ3hi1FbK1rQQ0A4=                 41                 30              149                 149  2015-01-01
UkDFI97Qb6+s2LWcijVVv4rMAsORbVDT2wNXF0aVbns=                 41                 30              129                 129  2015-01-02
The key is "msno". I need to find out whether most "msno" values use only one payment_method_id across different dates.
So I tried grouping by "msno" and "payment_method_id", using
transactions.groupby(['msno', 'payment_method_id']).count()
But I get an error: KeyError: 'msno'
Grouping by other fields works fine, for example:
transactions.groupby(['payment_plan_days', 'payment_method_id']).count()
For msno on its own, I can even group with level=0:
transactions.groupby(level=0)
But I cannot group by two keys when one of them is the first column (msno).
This is what transactions.columns looks like:
Index(['payment_method_id', 'payment_plan_days', 'plan_list_price',
       'actual_amount_paid', 'date'],
      dtype='object')
Any suggestions?
I think you need reset_index to convert the index to a column, because "msno" is the index rather than a column and your pandas version is below 0.20.1. The release notes for that version say:
Strings passed to DataFrame.groupby() as the by parameter may now reference either column names or index level names. Previously, only column names could be referenced. This allows to easily group by a column and index level at the same time.
transactions.reset_index().groupby(['msno', 'payment_method_id']).count()
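To make the difference concrete, here is a minimal sketch on made-up data (the user ids "a", "b", "c" and the values are hypothetical; only the shape, with "msno" as the index, mirrors the question). It also shows pd.Grouper(level=...), which as far as I know lets you mix an index level with a column without resetting the index:

```python
import pandas as pd

# Hypothetical sample data; only the shape (msno as the index) mirrors the question.
transactions = pd.DataFrame(
    {"payment_method_id": [41, 38, 41, 41], "payment_plan_days": [30, 30, 30, 30]},
    index=pd.Index(["a", "a", "b", "c"], name="msno"),
)

# reset_index() moves "msno" into a regular column, so groupby can find it by name.
by_reset = transactions.reset_index().groupby(["msno", "payment_method_id"]).count()

# pd.Grouper(level=...) points at the index level directly, without resetting it.
by_grouper = transactions.groupby(
    [pd.Grouper(level="msno"), "payment_method_id"]
).count()

print(by_reset)
print(by_grouper)
```

Both results should contain the same groups, so which one to use is mostly a matter of whether you want "msno" to remain the index afterwards.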
So after upgrading, your code should work as written:
transactions.groupby(['msno', 'payment_method_id']).count()
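For the underlying question (whether most "msno" values use only one payment_method_id), counting distinct methods per user may be more direct than counting rows per pair. A rough sketch, again on hypothetical data, where the names methods_per_user and share_single are my own:

```python
import pandas as pd

# Hypothetical sample data: user "a" has used two payment methods, "b" and "c" one each.
transactions = pd.DataFrame(
    {"payment_method_id": [41, 38, 41, 41]},
    index=pd.Index(["a", "a", "b", "c"], name="msno"),
)

# Number of distinct payment methods per user, grouping on the index level.
methods_per_user = transactions.groupby(level="msno")["payment_method_id"].nunique()

# Fraction of users who have only ever used a single payment method.
share_single = (methods_per_user == 1).mean()

print(methods_per_user)
print(share_single)
```

Grouping with level="msno" works on the index directly, so this part does not depend on the pandas version or on reset_index.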