Is there some other way to create row-wise data that's derived from all other rows, in Pandas?

Question

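As part of a university data science course, we were asked to work out the most remote capital city. I'm asking here because I wasn't happy with my answer, but I didn't get a better alternative after submitting it.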

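As I understand it, the task has 3 parts:

Get the capital city location data
Create a distance function for lat/long pairs
Use pandas to find the minimum distance from each capital to any other city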
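
For the second part, a minimal sketch of a haversine distance function might look like the following. The signature matches the haversine(lat1, lng1, lat2, lng2) calls used in this post; the body and the EARTH_RADIUS_KM constant are assumptions, since the original implementation is not shown.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius in kilometres


def haversine(lat1, lng1, lat2, lng2):
    """Great-circle distance in km between two points given in degrees."""
    lat1, lng1, lat2, lng2 = map(radians, (lat1, lng1, lat2, lng2))
    dlat = lat2 - lat1
    dlng = lng2 - lng1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlng / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin
    return 2 * EARTH_RADIUS_KM * asin(min(1.0, sqrt(a)))
```

The function works on scalars, which is why it has to be called once per pair of rows.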

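The first 2 parts were trivial. However, I struggled to find a way to solve the third without resorting to iterators. The distance function takes a pair of lat/long values, so I need a way to apply it to every row, against every other row.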

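from math import inf  # haversine is the distance function from part 2

capitals['closest'] = inf
for idx, row_x in capitals.iterrows():
    # Distance from row_x to every row; the row's own city is masked with inf
    capitals.at[idx, 'closest'] = capitals.apply(
        lambda row_y: haversine(row_x['lat'], row_x['lng'], row_y['lat'], row_y['lng'])
        if row_x['city'] != row_y['city']
        else inf,
        axis=1).min()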

Is there a way to nest calls to the DataFrame apply method? Is there some other way to create row-wise data that's derived from all other rows?
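
One way to read "every row against every row" is as a cross join of the table with itself. The sketch below is an assumption-laden illustration, not the code from the notebook: haversine_km is a hypothetical vectorised helper, the three sample points are made up, and merge(how='cross') requires pandas 1.2 or later.

```python
import numpy as np
import pandas as pd


def haversine_km(lat1, lng1, lat2, lng2, radius_km=6371.0):
    """Vectorised haversine (hypothetical helper); inputs are in degrees."""
    lat1, lng1, lat2, lng2 = (np.radians(v) for v in (lat1, lng1, lat2, lng2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lng2 - lng1) / 2) ** 2)
    return 2 * radius_km * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))


# Made-up sample points on the equator, for illustration only
capitals = pd.DataFrame({
    'city': ['A', 'B', 'C'],
    'lat': [0.0, 0.0, 0.0],
    'lng': [0.0, 10.0, 40.0],
})

# Every row paired with every row, then drop the self-pairs
pairs = capitals.merge(capitals, how='cross', suffixes=('_x', '_y'))
pairs = pairs[pairs['city_x'] != pairs['city_y']].copy()
pairs['dist'] = haversine_km(pairs['lat_x'], pairs['lng_x'],
                             pairs['lat_y'], pairs['lng_y'])

# Minimum distance per source city, mapped back onto the original rows
closest = pairs.groupby('city_x')['dist'].min()
capitals['closest'] = capitals['city'].map(closest)
```

The paired table has n * n rows, so this trades memory for the absence of Python-level loops.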

Edit: Here is my final answer. It previously used an iterator (see the commit history) but has since been updated to a better solution: https://github.com/maccaroo/worldcities/blob/main/world_cities.ipynb

Answer 1

I found the solution in the 'Similar questions' search just as I was about to post, but I feel my answer is different enough to warrant posting anyway.

First, this is the post which (mostly) answered my question. However, I kept getting this error: KeyError: ('city', 'occurred at index city', 'occurred at index city')

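This article got me over the line. The solution is the axis=1 argument, which tells apply to pass each row to the function, so the value is indexed by column labels rather than row labels and row['city'] resolves.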
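
The difference between the two axis settings can be seen in a small sketch. The two-row frame here is made up for illustration; only DataFrame.apply and its axis parameter come from the post.

```python
import pandas as pd

df = pd.DataFrame({'city': ['A', 'B'], 'lat': [1.0, 2.0]})

# axis=1: each call receives one row, indexed by column labels
by_row = df.apply(lambda row: row['city'], axis=1)

# Default axis=0: each call receives one column, indexed by row labels (0, 1),
# so looking up 'city' raises KeyError
try:
    df.apply(lambda col: col['city'])
    failed = False
except KeyError:
    failed = True
```

With the default axis, the lambda is handed the 'city' column itself, which explains why the error message mentions "index city".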

Here is my final code:

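from math import inf  # haversine is the distance function from the question

capitals['closest'] = capitals.apply(
    lambda row: capitals.apply(
        lambda x: haversine(row['lat'], row['lng'], x['lat'], x['lng'])
        if row['city'] != x['city']
        else inf,
        axis=1).min(),  # reduce inside the outer lambda so each row yields one scalar
    axis=1)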
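
As one possible alternative to nested apply calls, the pairwise distances can be computed in a single step with NumPy broadcasting. This is a sketch under assumptions, not the code from the linked notebook: closest_distances is a hypothetical helper, the three sample points are made up, and lat and lng are assumed to be in degrees. It masks each row's distance to itself by position (the diagonal) rather than by comparing city names.

```python
import numpy as np
import pandas as pd


def closest_distances(df, radius_km=6371.0):
    """Distance in km from each row to its nearest other row (hypothetical helper)."""
    lat = np.radians(df['lat'].to_numpy(dtype=float))
    lng = np.radians(df['lng'].to_numpy(dtype=float))
    # Broadcasting a column vector against a row vector gives an n x n matrix
    dlat = lat[:, None] - lat[None, :]
    dlng = lng[:, None] - lng[None, :]
    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlng / 2) ** 2)
    dist = 2 * radius_km * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))
    np.fill_diagonal(dist, np.inf)  # ignore each row's distance to itself
    return pd.Series(dist.min(axis=1), index=df.index)


# Made-up sample points on the equator, for illustration only
capitals = pd.DataFrame({
    'city': ['A', 'B', 'C'],
    'lat': [0.0, 0.0, 0.0],
    'lng': [0.0, 10.0, 40.0],
})
capitals['closest'] = closest_distances(capitals)
most_remote = capitals.loc[capitals['closest'].idxmax(), 'city']
```

The intermediate matrix holds n * n floats, which is small for a few hundred capitals but would need chunking for much larger tables.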

python

apply

pandas