Adding a new column from an aggregated DataFrame via .loc returns NaN

Question

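I am having trouble understanding why my code does not work as expected.

I have a dataframe with this structure: [screenshot of the dataframe] (sorry, I don't have enough reputation to post images)

I aggregate it as follows to get the sum of testBytes: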

aggregation = {'testBytes' : ['sum']}
tests_DL_groupped = tests_DL_short.groupby(['measDay','_p_live','_p_compositeId','Latitude','Longitude','testType']).agg(aggregation).reset_index()

Now the real question is why this code does not work as expected and produces NaN:

tests_DL_groupped.loc[:,'testMBytes'] = tests_DL_groupped['testBytes']/1000/1000

[Screenshot: not working]

While this one works fine:

tests_DL_groupped['testMBytes'] = tests_DL_groupped['testBytes']/1000/1000

[Screenshot: working]

Which one should be the preferred pandas way?

Many thanks!

Answer 1

The problem is a MultiIndex in the columns.
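A minimal sketch of how that MultiIndex appears, using a small made-up frame in place of tests_DL_short (the data and the single grouping column are assumptions for illustration). With a list of functions in the aggregation dict, the column labels become two-level tuples, and selecting 'testBytes' gives back a one-column DataFrame labelled 'sum' rather than a Series. Since .loc aligns assigned values on their labels, that mismatch is presumably what leads to the NaN values.

```python
import pandas as pd

# Hypothetical stand-in for tests_DL_short, reduced to two columns.
tests_DL_short = pd.DataFrame({
    "measDay": ["d1", "d1", "d2"],
    "testBytes": [1_000_000, 2_000_000, 4_000_000],
})

# A list of functions in the dict produces two-level column labels.
grouped = tests_DL_short.groupby("measDay").agg({"testBytes": ["sum"]}).reset_index()
print(grouped.columns.tolist())

# Selecting the top level returns a DataFrame whose only column is 'sum'.
selected = grouped["testBytes"]
print(type(selected).__name__, selected.columns.tolist())
```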

The solution is to change:

aggregation = {'testBytes' : ['sum']}

to:

aggregation = {'testBytes' : 'sum'}

This avoids the MultiIndex.
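As a rough check of this fix, the sketch below runs the .loc assignment from the question on a small made-up frame (the data and the single grouping column are assumptions, not the asker's real table). With the function given as a plain string, the columns stay flat and the selection is an ordinary Series.

```python
import pandas as pd

# Hypothetical stand-in for tests_DL_short, reduced to two columns.
tests_DL_short = pd.DataFrame({
    "measDay": ["d1", "d1", "d2"],
    "testBytes": [1_000_000, 2_000_000, 4_000_000],
})

# A plain string instead of a list keeps the column labels single-level.
grouped = tests_DL_short.groupby("measDay").agg({"testBytes": "sum"}).reset_index()

# grouped['testBytes'] is now a Series, so it aligns on the row index only.
grouped.loc[:, "testMBytes"] = grouped["testBytes"] / 1000 / 1000
print(grouped)
```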

Or use GroupBy.sum:

cols = ['measDay','_p_live','_p_compositeId','Latitude','Longitude','testType']
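# Option 1: sum the selected column, then turn the group keys back into columns
tests_DL_groupped = tests_DL_short.groupby(cols)['testBytes'].sum().reset_index()

# Option 2: keep the group keys as columns from the start
tests_DL_groupped = tests_DL_short.groupby(cols, as_index=False)['testBytes'].sum()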
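The two GroupBy.sum lines above are alternatives and should give the same flat result. A small sketch to compare them, again on made-up data with only two of the grouping columns (both are assumptions for illustration):

```python
import pandas as pd

# Hypothetical stand-in for tests_DL_short with two grouping columns.
tests_DL_short = pd.DataFrame({
    "measDay": ["d1", "d1", "d2"],
    "testType": ["DL", "DL", "DL"],
    "testBytes": [1_000_000, 2_000_000, 4_000_000],
})
cols = ["measDay", "testType"]

# Variant 1: aggregate, then move the group keys out of the index.
via_reset = tests_DL_short.groupby(cols)["testBytes"].sum().reset_index()

# Variant 2: ask groupby to keep the keys as regular columns.
via_flag = tests_DL_short.groupby(cols, as_index=False)["testBytes"].sum()

print(via_reset.columns.tolist())
print(via_flag.columns.tolist())
```

Either variant leaves a single-level 'testBytes' column, so both the .loc form and the plain bracket assignment from the question can add 'testMBytes' afterwards.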
