Python script to get temperature from google search
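I am writing a Python script that gets the temperature from Google by searching for the keyword "temperature".
Using Inspect Element, I found that the temperature value is stored in the span with id="wob_tm":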
<div>
<div class="vk_bk sol-tmp" style="float:left;margin-top:-3px;font-size:64px"><span id="wob_tm" class="wob_t" style="display:inline">
18
</span><span id="wob_ttm" class="wob_t" style="display:none"> … </span>
</div>
As you can see, the temperature 18 is inside the span with id="wob_tm".
So my Python script is:
from bs4 import BeautifulSoup
import requests,sys,webbrowser
str="temperature"
res = requests.get('http://google.com/search?q=%s'%str)
res.raise_for_status()
examplesoup= BeautifulSoup(res.text,"lxml")
linkelems=examplesoup.findAll("span",{"id":"wob_tm"})
print linkelems.string.strip()
It gives me this error:
AttributeError: 'NoneType' object has no attribute 'string'
How can I fix this? It means linkelems has no elements.
The 0 you are printing is the length of the span tag's contents, not the contents themselves. The string attribute will give you the contents of the span tag:
from bs4 import BeautifulSoup
s = """<div>
<div class="vk_bk sol-tmp" style="float:left;margin-top:-3px;font-size:64px">
<span id="wob_tm" class="wob_t" style="display:inline">
18
</span><span id="wob_ttm" class="wob_t" style="display:none"> … </span>
</div>"""
soup = BeautifulSoup(s, "html.parser")  # name the parser explicitly to avoid the bs4 warning
temperature = soup.find("span", id="wob_tm")
print(temperature.string.strip())
# 18
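The AttributeError in the question is what you get when the lookup comes back empty: find() returns None if nothing matches, while find_all() (findAll) returns a list-like ResultSet that has no .string of its own. Below is a minimal sketch of guarding against both cases. It assumes the html.parser backend; the two HTML strings and the helper name read_temperature are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical snippets for illustration only (not real Google output).
html_without_widget = "<div><span class='wob_t'>n/a</span></div>"
html_with_widget = '<div><span id="wob_tm" class="wob_t">18</span></div>'


def read_temperature(html):
    """Return the text of span#wob_tm, or None if the span is missing."""
    soup = BeautifulSoup(html, "html.parser")
    span = soup.find("span", id="wob_tm")  # a single Tag, or None
    if span is None:
        return None
    return span.get_text(strip=True)


missing = read_temperature(html_without_widget)
found = read_temperature(html_with_widget)

# find_all() always returns a ResultSet (possibly empty), so pick an
# element out of it before reading .string.
matches = BeautifulSoup(html_with_widget, "html.parser").find_all("span", id="wob_tm")
first = matches[0].string if matches else None

print(missing, found, first)
```

Checking for None before touching .string turns the crash into a case the script can handle, for example by retrying with different headers.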
I ran this code (using Python 3 and bs4) and got the string of the span tag.
from bs4 import BeautifulSoup
html_snippet = """<div>
<div class="vk_bk sol-tmp" style="float:left;margin-top:-3px;font-size:64px"><span id="wob_tm" class="wob_t" style="display:inline">18</span><span id="wob_ttm" class="wob_t" style="display:none"> ... </span></div>"""
soup = BeautifulSoup(html_snippet, "html.parser")  # name the parser explicitly to avoid the bs4 warning
temp = soup.find("span", id='wob_tm')
print(temp.string)
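Based on some experimentation, Google appears to send slightly different results depending on which browser it thinks you are using. For example, when I use Firefox I see the span with id 'wob_tm', but when your code runs with the default settings the span is not there. (I did get a span with class wob_t that contained the temperature, but I also got 10 other wob_t spans.) Try setting the user agent to that of a popular browser, like this: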
from bs4 import BeautifulSoup
import requests

query = "temperature"  # renamed from `str`, which shadows the built-in type
headers = {
'User-Agent': 'Mozilla/5.0 (Windows NT6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.1'
}
res = requests.get('http://www.google.com/search?q=%s' % query, headers=headers)
res.raise_for_status()
examplesoup = BeautifulSoup(res.text, 'lxml')
linkelems = examplesoup.findAll('span', {'id': 'wob_tm'})  # This now has an element in it
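To check which User-Agent requests would send without contacting Google, you can prepare the request and inspect it locally. This is a minimal sketch that assumes only the public requests API (requests.utils.default_user_agent and Session.prepare_request). The URL, query and browser string come from the snippet above.

```python
import requests

# The User-Agent that requests uses when none is supplied.
default_ua = requests.utils.default_user_agent()

browser_headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.1"
}

# Build the request without sending it, so the final headers and URL
# can be inspected offline.
session = requests.Session()
request = requests.Request(
    "GET",
    "https://www.google.com/search",
    params={"q": "temperature"},
    headers=browser_headers,
)
prepared = session.prepare_request(request)

print(default_ua)                      # starts with "python-requests/"
print(prepared.headers["User-Agent"])  # the browser string set above
print(prepared.url)
```

Passing the query through params also lets requests take care of URL encoding, which matters once the search phrase contains spaces.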
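Make sure you send a user-agent header so that Google does not treat your request as coming from python-requests, which is the default requests User-Agent. If you only need to extract the temperature, you can use the bs4 .select_one() method.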
>>> soup.select_one('#wob_tm').text
'85°F'
Code that extracts more, with an example in the online IDE:
from bs4 import BeautifulSoup
import requests, lxml
headers = {
"User-Agent":
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.19582"
}
params = {
"q": "london weather",
"hl": "en",
}
response = requests.get('https://www.google.com/search', headers=headers, params=params).text
soup = BeautifulSoup(response, 'lxml')
temperature = soup.select_one('#wob_tm').text
print(f'Temperature: {temperature}')
---
# Temperature: 73°F
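Google may omit the weather widget or change its markup, in which case soup.select_one('#wob_tm') returns None and .text raises the same AttributeError as in the question. The sketch below separates fetching from parsing so the parsing step can be checked offline. The function names fetch_html and parse_temperature, the 10-second timeout and the html.parser backend are my own assumptions, not part of the answer above.

```python
from typing import Optional

import requests
from bs4 import BeautifulSoup

HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.19582"
}


def fetch_html(query: str) -> str:
    """Download the Google results page for `query` (needs network access)."""
    response = requests.get(
        "https://www.google.com/search",
        headers=HEADERS,
        params={"q": query, "hl": "en"},
        timeout=10,  # assumed value, so a stalled connection does not hang forever
    )
    response.raise_for_status()
    return response.text


def parse_temperature(html: str) -> Optional[str]:
    """Return the text of #wob_tm, or None when the element is absent."""
    soup = BeautifulSoup(html, "html.parser")
    element = soup.select_one("#wob_tm")
    return element.get_text(strip=True) if element is not None else None


# Offline check against hand-written snippets (not real Google output).
sample = '<div><span id="wob_tm" class="wob_t">18</span></div>'
print(parse_temperature(sample))
print(parse_temperature("<div>no weather widget here</div>"))
```

With network access, parse_temperature(fetch_html("london weather")) would give the temperature string, or None if Google served a page without the widget.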
Alternatively, you can use the Google Direct Answer Box API from SerpApi. It is a paid API with a free plan.
Code to integrate:
from serpapi import GoogleSearch
import os
params = {
"engine": "google",
"q": "london weather",
"api_key": os.getenv("API_KEY"),
"hl": "en",
}
search = GoogleSearch(params)
results = search.get_dict()
loc = results['answer_box']['location']
weather_date = results['answer_box']['date']
weather = results['answer_box']['weather']
temp = results['answer_box']['temperature']
unit = results['answer_box']['unit']
precipitation = results['answer_box']['precipitation']
humidity = results['answer_box']['humidity']
wind = results['answer_box']['wind']
forecast = results['answer_box']['forecast']
print(f'{loc}\n{weather_date}\n{weather}\n{temp}\n{unit}\n{precipitation}\n{humidity}\n{wind}\n{forecast}')
---------
'''
London, UK
Wednesday 1:00 PM
Partly cloudy
73°F
0%
55%
7 mph
[{'day': 'Wednesday', 'weather': 'Partly cloudy', 'temperature': {'high': '74', 'low': '59'}, 'thumbnail': 'https://ssl.gstatic.com/onebox/weather/48/partly_cloudy.png'}..]
'''
Disclaimer, I work for SerpApi.
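Indexing results['answer_box'] directly raises KeyError when a query returns no answer box. The sketch below reads the same fields defensively with dict.get. The sample dictionary is hand-written to resemble the output shown above and is not a real API response, and summarize_weather is a hypothetical helper name.

```python
def summarize_weather(results):
    """Pick weather fields from a results dict, tolerating missing keys."""
    box = results.get("answer_box") or {}
    fields = ("location", "date", "weather", "temperature", "unit",
              "precipitation", "humidity", "wind")
    return {name: box.get(name) for name in fields}


# Hand-written sample shaped like the output above (illustrative only);
# "temperature" and "unit" are left out on purpose to show the fallback.
sample_results = {
    "answer_box": {
        "location": "London, UK",
        "date": "Wednesday 1:00 PM",
        "weather": "Partly cloudy",
        "precipitation": "0%",
        "humidity": "55%",
        "wind": "7 mph",
    }
}

summary = summarize_weather(sample_results)
print(summary["location"], summary["weather"])
print(summary["temperature"])                     # missing key becomes None
print(summarize_weather({})["location"])          # no answer box at all
```

Missing fields come back as None instead of stopping the script, which leaves it to the caller to decide how to report them.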