BeautifulSoup is not working on a website
from urllib.request import urlopen
from bs4 import BeautifulSoup
html = urlopen("https://homeshopping.pk/categories/Mobile-Phones-Price-Pakistan")
soup = BeautifulSoup(html,features="lxml")
x = soup.find("div",{"class":"innerp product-box Even product_300564"})
print(x)
This is probably because the site loads its content with JavaScript. Try Selenium instead of BeautifulSoup.
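from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

# Selenium 4 takes the driver path through a Service object
chrome_driver_path = 'chromedriver.exe'
driver = webdriver.Chrome(service=Service(chrome_driver_path))
driver.get("https://homeshopping.pk/categories/Mobile-Phones-Price-Pakistan")
# find_element_by_css_selector was removed in Selenium 4; use find_element with By
element = driver.find_element(By.CSS_SELECTOR, 'div.innerp.product-box.Even.product_300564')
print(element.text)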
>>> element.text
'Samsung Galaxy A51 (4G, 6GB, 128GB, Prism Black) With Official Warranty\nRs 53,999'
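Before switching tools, it may be worth checking whether JavaScript rendering is really the cause by looking for the class in the HTML the server returns. The following is only a rough sketch using the standard library: ClassCollector, static_classes and the sample markup are hypothetical names and data for illustration, not taken from the site. In practice the text passed in would be the decoded response body from urlopen.

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collect every class token that appears in the static markup."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                # split() with no arguments ignores repeated whitespace
                self.classes.update(value.split())


def static_classes(html_text):
    """Return the set of class tokens found in html_text (hypothetical helper)."""
    collector = ClassCollector()
    collector.feed(html_text)
    collector.close()
    return collector.classes


# Hypothetical sample standing in for the downloaded page source
sample = '<div class="innerp product-box   Even   product_300564"><h5>Phone</h5></div>'
found = static_classes(sample)
print("product_300564" in found)
```

If the token shows up in the static HTML, the element is there without a browser and the problem is more likely in how it is being matched. If it does not, the content is probably injected by JavaScript and a browser-driven tool such as Selenium is a reasonable next step.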
Your approach is right; just remove the extra spaces before and after Even in the class definition.
Like this:
x = soup.find("div",{"class":"innerp product-box Even product_300564"})
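You probably copied the class definition straight from the HTML, but BeautifulSoup appears to normalize the attribute so that it contains no extra spaces.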
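To avoid depending on the exact spacing of a copied class attribute, the string can be normalized before it is used. The following is a minimal sketch, assuming the copied value had runs of spaces around Even; normalize_class and class_to_css_selector are hypothetical helper names and the raw string is an assumed example.

```python
def normalize_class(raw):
    """Collapse any run of whitespace in a class attribute to a single space."""
    return " ".join(raw.split())


def class_to_css_selector(tag, raw):
    """Build a CSS selector such as div.a.b from a tag name and a class attribute."""
    return tag + "".join("." + token for token in raw.split())


# Assumed example of a class attribute copied from the page source
raw = "innerp product-box   Even   product_300564"

print(normalize_class(raw))
print(class_to_css_selector("div", raw))
```

The normalized string can be passed as the class value to soup.find, and the selector form can be passed to soup.select_one or to Selenium. The selector form has the advantage that it does not depend on the order or spacing of the class names.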