为 haversine 地理定位创建循环内许多循环的数据框

Creating a dataframe of many loops within loops for haversine geolocation

我有一个 3 人(“成员”)的 df,我想测量这些人与 3 个位置的距离。最终结果将是一个 df 对所有 3 个人从最近到更远的 3 个位置进行排名。我有所有 3 个人和所有 3 个位置的地理坐标,这是我迄今为止尝试过的,但我不知道如何完成循环以将帧连接到主帧。请帮忙!:

df = []
df_2 = []

for m in range(len(members)):

    df_temp_member = pd.DataFrame({'member_id': members.iloc[[m]]['member_id']
                                   })

    for s in range(len(locations)):
        dist = haversine(lon1 = members.iloc[[m]]['longitude']
                        ,lat1 = members.iloc[[m]]['latitude']
                        ,lon2 = locations.iloc[[s]]['Longitude']
                        ,lat2 = locations.iloc[[s]]['Latitude'])

        df_temp = pd.DataFrame({'location_name': locations.iloc[[s]]['location_name'],
                                'Distance': dist,
                                })

        df.append(df_temp)

    df = pd.concat(df)
    df = df.sort_values(by='Distance', ascending=True, na_position='first').reset_index(drop = True).reset_index(drop = True)

    df_temp_1 = pd.DataFrame({'location_1': df.iloc[[0]]['location_name'],
                              'Distance_1': df.iloc[[0]]['Distance'],
                               })

    df_temp_2 = pd.DataFrame({'location_2': df.iloc[[1]]['location_name'].reset_index(drop = True),
                              'Distance_2': df.iloc[[1]]['Distance'].reset_index(drop = True),
                               })

    df_temp_3 = pd.DataFrame({'location_3': df.iloc[[2]]['location_name'].reset_index(drop = True),
                              'Distance_3': df.iloc[[2]]['Distance'].reset_index(drop = True),
                               })
    
    frames = [df_temp_1, df_temp_2, df_temp_3]

    df_2 = pd.concat(frames, axis = 1)

示例数据*:

>>> members
      Name   Longitude   Latitude
0   Sherie   16.196499  44.040776
1    Cathi  107.000799  -7.018167
2  Grissel  118.152148  24.722747

>>> locations
       Location   Longitude   Latitude
0     Quarteira   -8.098960  37.102928
1       Weishan  100.307174  25.227212
2  Šuto Orizare   21.429841  41.992429

Haversine 函数** 特别修改为 Series:

def haversine_series(sr):
    lon1, lat1, lon2, lat2 = sr[["Longitude1", "Latitude1", "Longitude2", "Latitude2"]]
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = np.sin(dlat / 2.0)**2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2.0)**2
    c = 2 * np.arcsin(np.sqrt(a))
    km = 6371 * c
    return km

交叉 memberslocations 数据帧并计算距离:

distances = members.merge(locations, how="cross", suffixes=('1', '2'))
distances["Distance"] = distances.apply(haversine_series, axis="columns")
>>> distances
      Name  Longitude1  Latitude1      Location  Longitude2  Latitude2      Distance
0   Sherie   16.196499  44.040776     Quarteira   -8.098960  37.102928   2182.362810
1   Sherie   16.196499  44.040776       Weishan  100.307174  25.227212   7640.729330
2   Sherie   16.196499  44.040776  Šuto Orizare   21.429841  41.992429    482.470815
3    Cathi  107.000799  -7.018167     Quarteira   -8.098960  37.102928  12695.443489
4    Cathi  107.000799  -7.018167       Weishan  100.307174  25.227212   3657.950305
5    Cathi  107.000799  -7.018167  Šuto Orizare   21.429841  41.992429  10165.429008
6  Grissel  118.152148  24.722747     Quarteira   -8.098960  37.102928  11135.298789
7  Grissel  118.152148  24.722747       Weishan  100.307174  25.227212   1798.285195
8  Grissel  118.152148  24.722747  Šuto Orizare   21.429841  41.992429   8719.611566

排名:

>>> distances.pivot(index="Location", columns="Name", values="Distance") \
             .rank(axis="columns").astype(int)
Name          Cathi  Grissel  Sherie
Location
Quarteira         3        2       1
Weishan           2        1       3
Šuto Orizare      3        2       1

学分:

* Mockaroo

生成的数据

** 来源自