How to search Dictionary.com for a word and ONLY extract the first definition?
Here is my code:
import colored
import requests
from bs4 import BeautifulSoup
inputcolor = colored.fg(2)
headers = {
    'Access-Control-Allow-Origin': '*',
    'Access-Control-Allow-Methods': 'GET',
    'Access-Control-Allow-Headers': 'Content-Type',
    'Access-Control-Max-Age': '3600',
    'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0'
}
url = "https://www.dictionary.com/browse/balance-of-power"
# Pass headers by keyword; the second positional argument of requests.get is params.
req = requests.get(url, headers=headers)
soup = BeautifulSoup(req.content, 'html.parser')
print(soup.find_all("div", {"value": "1"}))
I am using soup.find_all("div", {"value": "1"}) because that is where the first result sits in the site's markup:
<div value="1" class="css-10ul8x e1q3nk1v2"><span class="one-click-content css-nnyc96 e1q3nk1v1" data-term="distribution" data-linkid="nn1ov4">a distribution and opposition of forces among nations such that no single nation is strong enough to assert its will or dominate all the others.</span></div>
My code returns this:
<div value="1" class="css-10ul8x e1q3nk1v2"><span class="one-click-content css-nnyc96 e1q3nk1v1" data-term="distribution" data-linkid="nn1ov4">a distribution and opposition of forces among nations such that no single nation is strong enough to assert its will or dominate all the others.</span></div>
That is close, but it prints the whole element rather than just the definition. How can I print only the definition and nothing else?
find_all returns a list, so you have to take the first item with index 0. You can then extract its text with get_text():
soup.find_all("div", {"value": "1"})[0].get_text()
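If the page has no matching element, indexing with [0] raises an IndexError. As a rough sketch of a more defensive variant, the snippet below uses find, which returns the first match or None. It runs against the markup quoted in the question rather than the live site, so it assumes the page still has that structure; first_definition and SAMPLE_HTML are names made up for this example.

```python
from bs4 import BeautifulSoup

# Markup copied from the question; the live page may differ (assumption).
SAMPLE_HTML = (
    '<div value="1" class="css-10ul8x e1q3nk1v2">'
    '<span class="one-click-content css-nnyc96 e1q3nk1v1" '
    'data-term="distribution" data-linkid="nn1ov4">'
    'a distribution and opposition of forces among nations such that '
    'no single nation is strong enough to assert its will or dominate '
    'all the others.</span></div>'
)


def first_definition(html):
    """Hypothetical helper: text of the first <div value="1">, or None."""
    soup = BeautifulSoup(html, "html.parser")
    # find returns the first matching tag, or None when nothing matches.
    node = soup.find("div", {"value": "1"})
    if node is None:
        return None
    # strip=True trims surrounding whitespace from the extracted text.
    return node.get_text(strip=True)


print(first_definition(SAMPLE_HTML))
```

With the real page, you would pass req.content instead of SAMPLE_HTML and check for None before printing.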