sort_values() with 'key' to sort a column of tuples in a dataframe

Question

I have the following dataframe:

import pandas as pd

df = pd.DataFrame({'Params': {0: (400, 30),
  1: (2000, 10),
  2: (1200, 10),
  3: (2000, 30),
  4: (1600, None)},
 'mean_test_score': {0: -0.6197478578718253,
  1: -0.6164605619489576,
  2: -0.6229674626212879,
  3: -0.7963084775995496,
  4: -0.7854265341671137}})

I would like to sort it by the first element of the tuples in the first column.

First column of the desired output:

{'Params': {0: (400, 30),
  2: (1200, 10),
  4: (1600, None),
  1: (2000, 10),
  3: (2000, 30)}}

I tried df.sort_values(by=('Params'), key=lambda x:x[0]), the way I would with a list and .sort, but I get the following error: ValueError: User-provided key function must not change the shape of the array.

I looked at the documentation for sort_values(), but it didn't help much in explaining why the lambda doesn't work.

Edit: Following @DeepSpace's suggestion, I can't simply use df.sort_values(by='Params'), because it raises TypeError: '<' not supported between instances of 'NoneType' and 'int'

Answer 1

The documentation for sort_values() says:

key should expect a Series and return a Series with the same shape as the input.

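In df.sort_values(by=('Params'), key=lambda x:x[0]), x is actually the whole Params column, passed in as a Series. Indexing it with x[0] returns the first element of that Series, the single tuple (400, 30), which does not have the same shape as the input Series. That is why you get the error.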
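To see this for yourself, it can help to record what the key function actually receives. The following is only an illustrative sketch, not part of the original answer: it rebuilds just the Params column from the question, and the names bad_key, seen, received and error are made up for the example.

```python
import pandas as pd

# Only the Params column from the question, rebuilt for illustration.
df = pd.DataFrame({
    "Params": [(400, 30), (2000, 10), (1200, 10), (2000, 30), (1600, None)],
})

seen = []  # hypothetical helper list: stores whatever pandas passes to the key


def bad_key(col):
    seen.append(col)  # record the argument before doing anything with it
    return col[0]     # one tuple (length 2), not one value per row (length 5)


try:
    df.sort_values(by="Params", key=bad_key)
    error = None
except ValueError as exc:
    error = exc

received = seen[0]
print(type(received))  # the key got a whole Series, not a single tuple
print(len(received))   # 5
print(received[0])     # (400, 30)
print(error)
```

The key is called on the column as a whole, so anything it returns has to contain one sort value per row.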

If you want to sort by the first element of each tuple, you can do this:

df.sort_values(by='Params', key=lambda col: col.map(lambda x: x[0]))
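For completeness, here is a self-contained sketch of that approach run against the data from the question. The helper name first_element is made up for the example, and kind="stable" is an addition of mine, not part of the original answer: it is there only so that the two rows starting with 2000 keep their original relative order.

```python
import pandas as pd

df = pd.DataFrame({
    "Params": [(400, 30), (2000, 10), (1200, 10), (2000, 30), (1600, None)],
    "mean_test_score": [-0.6197478578718253, -0.6164605619489576,
                        -0.6229674626212879, -0.7963084775995496,
                        -0.7854265341671137],
})


def first_element(col):
    # col is the whole Params Series; map works element by element,
    # so the result has one integer per row and the shape is preserved.
    return col.map(lambda t: t[0])


# kind="stable" (an assumption added here) keeps tied rows in input order.
sorted_df = df.sort_values(by="Params", key=first_element, kind="stable")
print(sorted_df)
```

Because only the first element of each tuple is ever compared, the None in (1600, None) never takes part in a comparison, which should avoid the TypeError mentioned in the question's edit.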

