How to apply a function to multiple columns in pandas?
I need to calculate the distance between 2 postcodes in a pandas DataFrame.
I am using the pgeocode library to calculate the distance between 2 UK postcodes.
dist = samp.apply(lambda x: dist.query_postal_code(x['a'], x['b']), axis=1)
It does not work; I get the error "DataFrame constructor not properly called!".
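You need to reference the actual column names of your DataFrame to access the underlying values.
Change it to:
# dist is assumed to be a pgeocode.GeoDistance instance; the result gets its own name so dist is not overwritten
distances = samp.apply(lambda x: dist.query_postal_code(x['Postcode.x'], x['Postcode.y']), axis=1)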
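As a minimal, self-contained sketch of the same row-wise pattern, the snippet below swaps pgeocode for a stand-in function so that it runs offline. The contents of `samp` and the helper `same_outward_code` are hypothetical; only the column names `Postcode.x` and `Postcode.y` come from the answer above.

```python
import pandas as pd

# Hypothetical sample frame; only the column names come from the answer.
samp = pd.DataFrame({
    "Postcode.x": ["HR4 9AR", "HR9 5AB"],
    "Postcode.y": ["HR5 3EA", "HR9 5AB"],
})


def same_outward_code(a: str, b: str) -> bool:
    """Stand-in for a two-argument function such as a distance lookup.

    Compares the outward part of two UK postcodes (the text before the space).
    """
    return a.split()[0] == b.split()[0]


# axis=1 passes each row to the lambda; the row is indexed by column name.
# Column names containing a dot need bracket access (row["Postcode.x"]).
samp["same_outward"] = samp.apply(
    lambda row: same_outward_code(row["Postcode.x"], row["Postcode.y"]),
    axis=1,
)

print(samp)
```

Any function that takes two scalar values can be dropped in place of `same_outward_code`, as long as the names used inside the lambda match the DataFrame's columns exactly.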
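It's all in your preparation:
- source the geo data from a web service
- prepare it as tuples and generate your combinations
- calculate the distance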
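The steps above can be sketched with the standard library alone, which makes the logic easy to check without a network call. This is only an illustration under stated assumptions: the coordinates are hard-coded from the geo values that appear in the output on this page (reordered as latitude, longitude), the haversine great-circle formula stands in for geopy's geodesic distance, and the names `coords`, `haversine_miles`, and `pairs` are made up for the example.

```python
from itertools import permutations
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # approximate mean Earth radius

# Step 1 (stand-in for the web service): (latitude, longitude) per postcode.
coords = {
    "HR4 9AR": (52.055269, -2.717611),
    "HR5 3EA": (52.199361, -3.022556),
    "HR9 5AB": (51.911934, -2.582971),
}


def haversine_miles(p1, p2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    lat1, lon1 = map(radians, p1)
    lat2, lon2 = map(radians, p2)
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


# Step 2: every ordered pair of distinct postcodes.
# Step 3: the distance for each pair.
pairs = {
    (a, b): haversine_miles(coords[a], coords[b])
    for a, b in permutations(coords, 2)
}

for (a, b), miles in pairs.items():
    print(f"{a} -> {b}: {miles:.1f} miles")
```

The haversine formula treats the Earth as a sphere, so its results differ slightly from an ellipsoidal geodesic, but it is usually close enough for a sanity check.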
import geopy.distance
import requests
import pandas as pd

# a few postcodes
df = pd.DataFrame({'Name': ['BROAD STREET DENTAL SURGERY',
                            'KINGTON SURGERY',
                            'ALTON STREET SURGERY'],
                   'PostCode': ['HR4 9AR', 'HR5 3EA', 'HR9 5AB']})

# get geo data for the postcodes
dfgeo = (pd.json_normalize(requests.post("http://api.postcodes.io/postcodes",
                                         json={"postcodes": df.PostCode.unique().tolist()}).json()["result"])
         # make (latitude, longitude) a tuple; geopy reads the first element as latitude
         .assign(geo=lambda d: d.loc[:, ["result.latitude", "result.longitude"]].apply(lambda r: tuple(r), axis=1))
         .loc[:, ["result.postcode", "geo"]]
         .rename(columns={"result.postcode": "PostCode"})
         )

# merge geo data with the original postcode data
df = df.merge(dfgeo, on="PostCode")

# generate all combinations, then drop each postcode paired with itself
dfd = (df.assign(foo=1).merge(df.assign(foo=1), on="foo", suffixes=("_addr1", "_addr2"))
       .drop(columns=["foo"]).loc[lambda d: d.PostCode_addr1 != d.PostCode_addr2]
       # all prep done, let's calculate the distance
       .assign(miles=lambda d: d.apply(lambda r: geopy.distance.distance(r["geo_addr1"], r["geo_addr2"]).miles, axis=1))
       )
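Output from a run with the geo tuples in (longitude, latitude) order. geopy reads a tuple as (latitude, longitude), so these miles values overstate the true distances; with (latitude, longitude) tuples the figures come out smaller.

| | Name_addr1 | PostCode_addr1 | geo_addr1 | Name_addr2 | PostCode_addr2 | geo_addr2 | miles |
|---|---|---|---|---|---|---|---|
| 1 | BROAD STREET DENTAL SURGERY | HR4 9AR | (-2.717611, 52.055269) | KINGTON SURGERY | HR5 3EA | (-3.022556, 52.199361) | 23.1971 |
| 2 | BROAD STREET DENTAL SURGERY | HR4 9AR | (-2.717611, 52.055269) | ALTON STREET SURGERY | HR9 5AB | (-2.582971, 51.911934) | 13.5525 |
| 3 | KINGTON SURGERY | HR5 3EA | (-3.022556, 52.199361) | BROAD STREET DENTAL SURGERY | HR4 9AR | (-2.717611, 52.055269) | 23.1971 |
| 5 | KINGTON SURGERY | HR5 3EA | (-3.022556, 52.199361) | ALTON STREET SURGERY | HR9 5AB | (-2.582971, 51.911934) | 36.1468 |
| 6 | ALTON STREET SURGERY | HR9 5AB | (-2.582971, 51.911934) | BROAD STREET DENTAL SURGERY | HR4 9AR | (-2.717611, 52.055269) | 13.5525 |
| 7 | ALTON STREET SURGERY | HR9 5AB | (-2.582971, 51.911934) | KINGTON SURGERY | HR5 3EA | (-3.022556, 52.199361) | 36.1468 |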
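The result lists every pair twice, once in each direction. If only one row per pair is wanted, a possible variation (not part of the answer above) is to build the combinations with `merge(how="cross")`, available in pandas 1.2 and later, and keep the rows where the first postcode sorts before the second. The frame below is a cut-down, hypothetical stand-in for `df` with just the `PostCode` column.

```python
import pandas as pd

# Cut-down stand-in for df: the postcode column only.
df = pd.DataFrame({"PostCode": ["HR4 9AR", "HR5 3EA", "HR9 5AB"]})

# how="cross" builds the Cartesian product without a dummy join key.
pairs = df.merge(df, how="cross", suffixes=("_addr1", "_addr2"))

# A strict "<" comparison drops self-pairs and mirrored duplicates in one step.
unique_pairs = (
    pairs.loc[pairs.PostCode_addr1 < pairs.PostCode_addr2]
    .reset_index(drop=True)
)

print(unique_pairs)
```

With three postcodes this leaves three rows instead of six, which halves the number of distance calculations.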