Unable to extract item title in Amazon

When I try to get the title of a pair of Sony headphones with the following code, the result is None.

import requests    
from bs4 import BeautifulSoup

URL = 'https://www.amazon.com/Sony-Noise-Cancelling-Headphones-WH1000XM3/dp/B07G4MNFS1/ref=sxin_0_ac_d_rm?ac_md=0-0-c29ueQ%3D%3D-ac_d_rm&keywords=sony&pd_rd_i=B07G4MNFS1&pd_rd_r=3e6d5325-8ee4-4ba8-a84f-1b7cf2ce98bf&pd_rd_w=BVSFq&pd_rd_wg=I0LMZ&pf_rd_p=e2f20af2-9651-42af-9a45-89425d5bae34&pf_rd_r=VGT25BXXZNDE3B61A994&psc=1&qid=1577253649&smid=ATVPDKIKX0DER'

headers = {"User-Agent": "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36"}

page = requests.get(URL, headers=headers)    
soup = BeautifulSoup(page.content, "html.parser")
soup.prettify()

#print(soup)

title = soup.find_all('span', {'id':'productTitle'})                        

print(title, len(title))   

Current output:

[] 0

import requests
from bs4 import BeautifulSoup

r = requests.get("https://www.amazon.com/Sony-Noise-Cancelling-Headphones-WH1000XM3/dp/B07G4MNFS1/ref=sxin_0_ac_d_rm?ac_md=0-0-c29ueQ==-ac_d_rm&keywords=sony&pd_rd_i=B07G4MNFS1&pd_rd_r=3e6d5325-8ee4-4ba8-a84f-1b7cf2ce98bf&pd_rd_w=BVSFq&pd_rd_wg=I0LMZ&pf_rd_p=e2f20af2-9651-42af-9a45-89425d5bae34&pf_rd_r=VGT25BXXZNDE3B61A994&psc=1&qid=1577253649&smid=ATVPDKIKX0DER")
soup = BeautifulSoup(r.text, 'html.parser')

for item in soup.find_all("span", {'id': 'productTitle'}):
    print(item.get_text(strip=True))

Output:

Sony Noise Cancelling Headphones WH1000XM3: Wireless Bluetooth Over the Ear Headphones with Mic and Alexa voice control - Industry Leading Active Noise Cancellation - Black


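I spent the last two hours trying to scrape that title with BeautifulSoup. I also tried scraping other elements on the page, with no luck. I tried writing the raw content to a file, but that broke because of strange characters in it.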
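
On the file problem: one common cause of a crash on "strange characters" is writing decoded text with the platform's default encoding. A minimal sketch, assuming that is what happened here, writes the undecoded bytes (`page.content` in the question's code) in binary mode instead. `dump_raw` and the file name are hypothetical, chosen only for illustration.

```python
from pathlib import Path


def dump_raw(content: bytes, path: str) -> int:
    """Write a response body to disk without decoding it.

    Binary mode avoids UnicodeEncodeError because no text
    encoding is applied on the way out. Returns the byte count.
    """
    return Path(path).write_bytes(content)


# Usage (hypothetical file name), with `page` from the question's code:
# dump_raw(page.content, "amazon_response.html")
```

Opening the saved file in a browser or editor then shows exactly what the server returned, which may not be the product page at all.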

I tried Ahmed's answer but still got None. I tried plenty of other solutions I found online and still got None. For the life of me, I can't figure out how to scrape it with BeautifulSoup.
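
When the same code works for one person and returns nothing for another, it may be worth checking what the server actually sent back before blaming the parser. A rough sketch, assuming Amazon sometimes answers scripted requests with an error status or a CAPTCHA page instead of the product page. The marker strings are guesses at what such a page contains, and `diagnose_response` is a hypothetical helper, not part of requests or BeautifulSoup.

```python
# Marker strings are assumptions; adjust them to whatever the returned page shows.
BLOCK_MARKERS = (
    "captcha",
    "robot check",
    "automated access",
)


def diagnose_response(status_code: int, html: str) -> str:
    """Classify a fetched page as 'ok', 'bad-status', 'blocked', or 'missing-title'."""
    lowered = html.lower()
    if status_code != 200:
        return "bad-status"
    # Look for the title first: a real product page may mention
    # "captcha" somewhere in its scripts.
    if 'id="producttitle"' in lowered:
        return "ok"
    if any(marker in lowered for marker in BLOCK_MARKERS):
        return "blocked"
    return "missing-title"


# Usage with `page` from the question's code:
# print(page.status_code, diagnose_response(page.status_code, page.text))
```

If this reports `blocked` or `bad-status`, changing the `find_all` call will not help, because the title is not in the HTML that was returned.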

I understand you use Selenium, so here is a Selenium solution.

from selenium import webdriver
from selenium.webdriver.common.by import By

bot = webdriver.Chrome()
bot.get("https://www.amazon.com/Sony-Noise-Cancelling-Headphones-WH1000XM3/dp/B07G4MNFS1/ref=sxin_0_ac_d_rm?ac_md=0-0-c29ueQ==-ac_d_rm&keywords=sony&pd_rd_i=B07G4MNFS1&pd_rd_r=3e6d5325-8ee4-4ba8-a84f-1b7cf2ce98bf&pd_rd_w=BVSFq&pd_rd_wg=I0LMZ&pf_rd_p=e2f20af2-9651-42af-9a45-89425d5bae34&pf_rd_r=VGT25BXXZNDE3B61A994&psc=1&qid=1577253649&smid=ATVPDKIKX0DER")
# find_element_by_id was removed in Selenium 4; find_element(By.ID, ...) replaces it
title = bot.find_element(By.ID, 'productTitle').text
print(title)
# quit() closes the window and also shuts down the driver process
bot.quit()