Exact text match in an if statement with Python BeautifulSoup
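I am trying to find an exact text match using the code below. The website is https://www.girafferestaurant.co.nz/menu. When I print soup.find_all(text=True) I get the text back and can search it, but I only want a Match or No Match result, depending on whether a word or phrase (in this case 'offering at Giraffe') appears in the text.
Below is what I have tried.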
text = soup.find_all(text=True)
if 'offering at Giraffe' in text:
    print("Match")
else:
    print("No Match")
I also tried text = soup.find_all('p'), but the text is not always inside a p tag, because it varies between sites.
There are several ways to search by text with BeautifulSoup:
A search function. Pass a function as the text value:
results = soup.find_all(text=lambda text: text and 'offering at Giraffe' in text)
A regular expression. Pass a compiled regular expression pattern as the text value:
import re
results = soup.find_all(text=re.compile(r'offering at Giraffe'))
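Either call returns a list of the matching strings, so the Match / No Match decision can be a simple truthiness check on that list. Below is a minimal sketch of both approaches. It assumes an inline HTML snippet (made up here to stand in for the fetched page, so the example runs offline) and uses the string argument, which is the name BeautifulSoup 4.4.0 and later use for text.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page; not the real site content.
html = """
<div><h2>Menu</h2>
<span>Our current offering at Giraffe changes weekly.</span>
<p>Bookings essential.</p></div>
"""
soup = BeautifulSoup(html, "html.parser")
phrase = "offering at Giraffe"

# Function filter: called for each string in the document.
by_function = soup.find_all(string=lambda s: s and phrase in s)

# Regex filter: re.escape keeps the phrase literal even if it has special characters.
by_regex = soup.find_all(string=re.compile(re.escape(phrase)))

print("Match" if by_function else "No Match")
print("Match" if by_regex else "No Match")
```

Because the filter runs on strings rather than tags, it does not matter whether the phrase sits in a p, span, or any other element.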
import bs4
import requests

url = 'https://www.girafferestaurant.co.nz/menu'
r = requests.get(url)
soup = bs4.BeautifulSoup(r.text, 'html.parser')

# Every piece of text in the page, as a list of strings.
text = soup.find_all(text=True)

# Keep each string that contains the phrase as a substring.
matches = []
for item in text:
    if 'offering at Giraffe' in item:
        matches.append(item)

if matches:
    print('Match')
else:
    print('No Match')
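The loop above is needed because find_all(text=True) returns a list of separate strings, and the in operator on a list tests whether some element equals the phrase, not whether some element contains it. Here is a small sketch of the difference using a plain list of strings (the sample values are made up), with any() as a compact equivalent of the loop:

```python
# Hypothetical strings standing in for the result of soup.find_all(text=True).
strings = [
    "Menu",
    "Our current offering at Giraffe changes weekly.",
    "Bookings essential.",
]
phrase = "offering at Giraffe"

# List membership compares whole elements, so a partial phrase is not found.
whole_element_match = phrase in strings

# Substring test applied to each element; any() stops at the first hit.
substring_match = any(phrase in s for s in strings)

print("list membership:", whole_element_match)
print("Match" if substring_match else "No Match")
```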
Edit: as a follow-up, if you only want to check against the whole page text:
import bs4
import requests

url = 'https://www.girafferestaurant.co.nz/menu'
r = requests.get(url)
soup = bs4.BeautifulSoup(r.text, 'html.parser')

# All of the page text joined into a single string.
text = soup.text

# Match only if the first phrase is present and the second is absent.
matches = []
if 'offering at Giraffe' in text and 'customised set' not in text:
    matches.append(text)

if matches:
    print('Match')
else:
    print('No Match')
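One caveat with matching against the whole page text: the text can contain line breaks or repeated spaces inside a phrase, in which case a plain in test misses it. The sketch below collapses whitespace before comparing. The sample string and the contains_phrase helper are illustrative assumptions, not part of the original answer.

```python
def contains_phrase(page_text, phrase):
    """Return True if phrase occurs in page_text, ignoring whitespace differences."""
    # str.split() with no arguments splits on any run of whitespace,
    # so joining with single spaces collapses newlines, tabs, and repeats.
    normalized_text = " ".join(page_text.split())
    normalized_phrase = " ".join(phrase.split())
    return normalized_phrase in normalized_text


# Hypothetical page text with a line break inside the phrase.
sample = "Our current offering\n   at Giraffe changes weekly."

print("Match" if contains_phrase(sample, "offering at Giraffe") else "No Match")
```

With a real page, soup.text (or soup.get_text()) would be passed in place of sample.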