Apply postcode API call to each row in dataframe
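In the code block below I have a dataframe, `geo`, that I want to iterate over to get the easting, northing, longitude and latitude for each UK postcode in `geo`. I've written one function to call the API and another to return the four variables.
I've tested the `get_data` call with a postcode to prove it works (this is a public API that anyone can use):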
    import requests
    import pandas as pd

    geo = spark.table('property_address').toPandas()

    def call_api(url: str) -> dict:
        postcode_response = requests.get(url)
        return postcode_response.json()

    def get_data(postcode):
        url = f"http://api.getthedata.com/postcode/{postcode}"
        req = requests.get(url)  # was r.get(url); `r` is never defined in this snippet
        results = req.json()['data']
        easting = results['easting']
        northing = results['northing']
        latitude = results['latitude']
        longitude = results['longitude']
        return easting, northing, latitude, longitude

    get_data('SW1A 1AA')
Which returns:

    Out[108]: (529090, 179645, '51.501009', '-0.141588')
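What I'd like to do is run this for every row in `geo` and return the result as a dataset. My research led me to `apply`, and I've based my attempt on this guide.
I'm trying to pass a column called `property_postcode` in `geo` and iterate over each row to return the values. Here is my attempt: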
    def get_columns(row):
        column_name = 'property_postcode'
        api_param = row[column_name]
        easting, northing, latitude, longitude = get_data(api_param)
        row['east'] = easting
        row['north'] = northing
        row['lat'] = latitude
        row['long'] = longitude
        return row

    geo = geo.apply(get_columns, axis=1)
    display(geo)
The error I get is
`JSONDecodeError: Expecting value: line 1 column 1 (char 0)`
which doesn't tell me much. Looking for assistance/pointers.
Instead of trying to set the values of the `east`, `north`, `lat` and `long` columns inside the function, return them from the function.
    import requests
    import pandas as pd

    # geo = spark.table('property_address').toPandas()

    def call_api(url: str) -> dict:
        postcode_response = requests.get(url)
        return postcode_response.json()

    def get_data(postcode):
        url = f"http://api.getthedata.com/postcode/{postcode}"
        req = requests.get(url)
        payload = req.json()  # parse the body once and reuse it
        if payload["status"] == "match":
            results = payload["data"]
            easting = results.get("easting")
            northing = results.get("northing")
            latitude = results.get("latitude")
            longitude = results.get("longitude")
        else:
            # No match for this postcode: return empty values for all four fields
            easting = None
            northing = None
            latitude = None
            longitude = None
        return easting, northing, latitude, longitude

    def get_columns(code):
        api_param = code
        return get_data(api_param)

    df = pd.DataFrame(
        {
            "property_postcode": [
                "BE21 6NZ",
                "SW1A 1AA",
                "W1A 1AA",
                "DE21",
                "B31",
                "ST16 2NY",
                "S65 1EN",
            ]
        }
    )

    # result_type="expand" spreads the returned tuple across the four new columns
    df[["east", "north", "lat", "long"]] = df.apply(
        lambda row: get_columns(row["property_postcode"]), axis=1, result_type="expand"
    )
    print(df)
property_postcode | east | north | lat | long |
---|---|---|---|---|
BE21 6NZ | NaN | NaN | None | None |
SW1A 1AA | 529090 | 179645 | 51.501009 | -0.141588 |
W1A 1AA | 528887 | 181593 | 51.518561 | -0.143799 |
DE21 | NaN | NaN | None | None |
B31 | NaN | NaN | None | None |
ST16 2NY | 391913 | 323540 | 52.809346 | -2.121413 |
S65 1EN | 444830 | 394082 | 53.44163 | -1.326573 |
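
The `JSONDecodeError: Expecting value: line 1 column 1 (char 0)` from the question means the parser found no JSON value at the very first character, which is what happens when a response body is empty or is not JSON at all. With `apply`, a single row like that aborts the whole run. Below is a minimal sketch of a guard, assuming the response shape used in the code above (a `status` key and a `data` key). `parse_postcode_body` is a hypothetical helper name introduced here for illustration; it is not part of the API or of `requests`.

```python
import json
from typing import Optional, Tuple

Fields = Tuple[Optional[int], Optional[int], Optional[str], Optional[str]]
EMPTY: Fields = (None, None, None, None)


def parse_postcode_body(body: str) -> Fields:
    """Extract (easting, northing, latitude, longitude) from a raw response body.

    Hypothetical helper: returns four Nones instead of raising when the body
    is empty, is not JSON, or does not report a match.
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # Empty or non-JSON body, for example an HTML error page.
        return EMPTY

    # Assumed shape: {"status": "match", "data": {...}} as used in get_data above.
    if not isinstance(payload, dict) or payload.get("status") != "match":
        return EMPTY

    data = payload.get("data")
    if not isinstance(data, dict):
        return EMPTY

    return (
        data.get("easting"),
        data.get("northing"),
        data.get("latitude"),
        data.get("longitude"),
    )
```

Inside `get_data`, one possible use would be `return parse_postcode_body(req.text)`, so that a bad response yields a row of empty values rather than stopping the `apply`.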