Creating a dataframe of two loops using haversine geolocation

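I have a df of 3 people ("members"), and I want to measure each person's distance to 3 locations. The end result should be a df that ranks the 3 locations from closest to farthest for each of the 3 people. Here is what I'm working with and the result I want: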

In[154]: members
Out[154]: 
   member_id  latitude  longitude
0          1    7.1899    52.2080
1          2   -5.9209    37.4827
2          3   83.1072    54.8490

In[155]: locations
Out[155]: 
   location  latitude  longitude
0   theater   36.8381    -2.4597
1       bar   41.6561    -0.8773
2  car_shop   37.2829    -5.9209

In[156]: results
Out[156]: 
  member location_1  distance_1 location_2  distance_2 location_3  distance_3
0      1    theater           9        bar          15   car_shop          17
1      2   car_shop          13        bar          25    theater          35
2      3        bar          16    theater          25   car_shop          41

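This is what I've tried so far, but I don't know how to finish the loop so that the per-member frames get concatenated into the main frame. Please help!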

df = []
df_2 = []

for m in range(len(members)):

    df_temp_member = pd.DataFrame({'member_id': members.iloc[[m]]['member_id']
                                   })

    for s in range(len(locations)):
        dist = haversine(lon1 = members.iloc[[m]]['longitude']
                        ,lat1 = members.iloc[[m]]['latitude']
                        ,lon2 = locations.iloc[[s]]['longitude']
                        ,lat2 = locations.iloc[[s]]['latitude'])

        df_temp = pd.DataFrame({'location': locations.iloc[[s]]['location'],
                                'Distance': dist,
                                })

        df.append(df_temp)

df = pd.concat(df)
df = df.sort_values(by='Distance', ascending=True, na_position='first').reset_index(drop = True).reset_index(drop = True)

df_temp_1 = pd.DataFrame({'location_1': df.iloc[[0]]['location'],
                          'Distance_1': df.iloc[[0]]['Distance'],
                           })

df_temp_2 = pd.DataFrame({'location_2': df.iloc[[1]]['location'].reset_index(drop = True),
                          'Distance_2': df.iloc[[1]]['Distance'].reset_index(drop = True),
                           })

df_temp_3 = pd.DataFrame({'location_3': df.iloc[[2]]['location'].reset_index(drop = True),
                          'Distance_3': df.iloc[[2]]['Distance'].reset_index(drop = True),
                           })

frames = [df_temp_1, df_temp_2, df_temp_3]

df_2 = pd.concat(frames, axis = 1)

Use a cross merge to pair every row in members with every row in locations, haversine_vector to calculate the distances, sort_values to order them from closest to farthest, then pivot_table to go from long to wide format, and finally collapse the MultiIndex:

import pandas as pd
from haversine import haversine_vector, Unit

# Cross merge to pair every row in members with every row in locations
df3 = pd.merge(members, locations, how='cross')

# Calculate the haversine distance
df3['distance'] = haversine_vector(df3.filter(like='_x'),
                                   df3.filter(like='_y'),
                                   Unit.KILOMETERS)

# Use a Pivot Table to go from long to wide format
df3 = (
    df3.pivot_table(index='member_id',
                    columns=(
                        # Create Groups based on Sorted Distance
                            df3.sort_values('distance', ascending=True)
                            .groupby('member_id').cumcount() + 1
                    ),
                    values=['location', 'distance'],
                    aggfunc='first')
        .sort_index(level=[1, 0], axis=1, ascending=(True, False))
)

# Collapse MultiIndex
df3.columns = df3.columns.map(lambda x: '_'.join(map(str, x)))
df3 = df3.reset_index()

df3:

   member_id location_1   distance_1 location_2   distance_2 location_3   distance_3
0          1    theater  6416.753469        bar  6460.611645   car_shop  6725.829125
1          2    theater  6308.847913        bar  6566.958894   car_shop  6579.375371
2          3        bar  4974.492016   car_shop  5516.266902    theater  5523.801936
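
To sanity-check a single distance without the haversine package, the formula can be written out by hand. This is only a sketch: haversine_km and EARTH_RADIUS_KM are names made up for this example, and the radius is an assumed mean Earth radius, so the result should land close to the member 1 / theater value in the table above rather than match it to the last digit.

```python
import math

# Assumed mean Earth radius in km; a library may use a slightly different constant
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    # Haversine term: sin^2(dphi/2) + cos(phi1) * cos(phi2) * sin^2(dlambda/2)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Member 1 to the theater, using the coordinates from the question
d = haversine_km(7.1899, 52.2080, 36.8381, -2.4597)
print(round(d, 1))
```

Note the argument order: latitude first, then longitude, which is also the column order that filter(like='_x') and filter(like='_y') produce in the answer.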

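The key here is the cross merge, which puts every member/location pair on its own row so the distance can be computed row by row: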

df3 = pd.merge(members, locations, how='cross')

df3:

   member_id  latitude_x  longitude_x  location  latitude_y  longitude_y
0          1      7.1899      52.2080   theater     36.8381      -2.4597
1          1      7.1899      52.2080       bar     41.6561      -0.8773
2          1      7.1899      52.2080  car_shop     37.2829      -5.9209
3          2     -5.9209      37.4827   theater     36.8381      -2.4597
4          2     -5.9209      37.4827       bar     41.6561      -0.8773
5          2     -5.9209      37.4827  car_shop     37.2829      -5.9209
6          3     83.1072      54.8490   theater     36.8381      -2.4597
7          3     83.1072      54.8490       bar     41.6561      -0.8773
8          3     83.1072      54.8490  car_shop     37.2829      -5.9209
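
Once every pair sits on its own row, the ranking is a sort followed by a per-member counter. Here is a minimal sketch of that step on its own, using a toy frame (pairs, rank and nearest are names chosen for this example, and the distances are the placeholder values from the desired output in the question, not real haversine results):

```python
import pandas as pd

# Toy long-format frame: one row per (member, location) pair
pairs = pd.DataFrame({
    'member_id': [1, 1, 1, 2, 2, 2],
    'location': ['theater', 'bar', 'car_shop', 'theater', 'bar', 'car_shop'],
    'distance': [9.0, 15.0, 17.0, 35.0, 25.0, 13.0],
})

# Sort by distance, then number the rows within each member: 1 = closest.
# The result is aligned back to the original rows by index.
pairs['rank'] = (
    pairs.sort_values('distance')
         .groupby('member_id')
         .cumcount() + 1
)

# One row per member, one column per rank
nearest = pairs.pivot(index='member_id', columns='rank', values='location')
print(nearest)
```

The pivot_table call in the answer does the same thing, except that it passes the rank series directly as columns instead of storing it first, and pivots location and distance together.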