How to web scrape data from only specific cells in Python?
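I am trying to scrape some data from https://il.water.usgs.gov/gmaps/precip/. I only want specific cells from the row named "RAIN GAGE AT PING TOM PARK AT CHICAGO, IL": just the cells containing the 1-, 3-, and 12-hour rainfall predictions. What should I fix?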
import pandas as pd
url = "https://il.water.usgs.gov/gmaps/precip/"
df = pd.read_html(url, flavor="bs4")[0]
print(df.loc[df[0] == "RAIN GAGE AT PING TOM PARK AT CHICAGO, IL"])
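The data is loaded dynamically from a separate endpoint that returns JSON, so it is not present in the page's HTML when pandas reads it. You can write a function that calls that endpoint and pass in the location and the hour fields you want: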
import requests

def get_precipitation(location: str, hrs: list):
    url = "https://il.water.usgs.gov/gmaps/precip/data/rainfall_outIL_WSr2.json"
    # Fetch the JSON feed that the page builds its table from
    r = requests.get(url).json()
    # Take the first item whose title matches the requested gauge
    data = [i for i in r['value']['items'] if i['title'] == location][0]
    # Print only the requested precipitation fields
    for k, v in data.items():
        if k in hrs:
            print(f'{k}={v}')

if __name__ == "__main__":
    location = "RAIN GAGE AT PING TOM PARK AT CHICAGO, IL"
    hrs = ['precip1hrvalue', 'precip3hrvalue', 'precip12hrvalue']
    get_precipitation(location, hrs)
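If you would rather get the values back than print them, a variant along these lines may help. It is only a sketch: the function name `extract_precipitation` and the `sample` payload are hypothetical, the sample numbers are made up, and it assumes the feed keeps the `value` / `items` / `title` layout used above. It works on an already-parsed payload, so it can be tried without hitting the network.

```python
from typing import Any, Dict, List


def extract_precipitation(payload: Dict[str, Any], location: str, keys: List[str]) -> Dict[str, Any]:
    """Return the requested fields for the gauge whose title equals `location`."""
    items = payload.get("value", {}).get("items", [])
    # next() with a default avoids the IndexError that [0] raises when nothing matches
    match = next((item for item in items if item.get("title") == location), None)
    if match is None:
        raise LookupError(f"No gauge titled {location!r} in payload")
    # Fields missing from the item come back as None instead of raising
    return {key: match.get(key) for key in keys}


# Hypothetical payload mirroring the assumed layout; the values are invented.
sample = {
    "value": {
        "items": [
            {
                "title": "RAIN GAGE AT PING TOM PARK AT CHICAGO, IL",
                "precip1hrvalue": "0.00",
                "precip3hrvalue": "0.05",
                "precip12hrvalue": "0.12",
                "precip24hrvalue": "0.30",
            },
            {"title": "SOME OTHER GAGE", "precip1hrvalue": "0.01"},
        ]
    }
}

if __name__ == "__main__":
    wanted = ["precip1hrvalue", "precip3hrvalue", "precip12hrvalue"]
    print(extract_precipitation(sample, "RAIN GAGE AT PING TOM PARK AT CHICAGO, IL", wanted))
```

With live data you would pass `requests.get(url).json()` as the payload instead of `sample`. Returning a dict also makes it easy to feed the result into pandas afterwards if you still want a DataFrame.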