Not able to iterate through multiple pages while scraping data

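I have to scrape the reviews and ratings for this product on Flipkart, and I need to collect at least 30-40 of them. Since the first page shows only 10 reviews, I have to click through to the next page. Below is the code I used to check whether my script can click the Next button.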

from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
import time

driver = webdriver.Chrome(r"chromedriver.exe")

driver.get('https://www.flipkart.com/hp-15s-ryzen-3-dual-core-3250u-8-gb-1-tb-hdd-256-gb-ssd-windows-10-home-15s-gr0012au-laptop/product-reviews/itm9e1f8deeed35f?pid=COMFZHFWBE7APPH2&lid=LSTCOMFZHFWBE7APPH2AR705G&marketplace=FLIPKART&page=2')

for page in range(4):
    try:
        next_butt = driver.find_element_by_xpath("//nav[@class='yFHi8N']/a/span")

        if next_butt.text == 'NEXT':
            next_butt.click()
    except NoSuchElementException:
        continue
time.sleep(1)

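Whenever I run this code, I can see that it clicks the Next button, but after the first iteration it clicks the Previous button instead, so I never make any progress.

Please help.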

Take a look at the URL you shared:

https://www.flipkart.com/hp-15s-ryzen-3-dual-core-3250u-8-gb-1-tb-hdd-256-gb-ssd-windows-10-home-15s-gr0012au-laptop/product-reviews/itm9e1f8deeed35f?pid=COMFZHFWBE7APPH2&lid=LSTCOMFZHFWBE7APPH2AR705G&marketplace=FLIPKART&page=2

It ends with page=2. If I change that to page=3, I get the third page of reviews without the Selenium bot having to click the Next button at all.
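To illustrate that point, here is a minimal sketch that swaps the page value in the URL using only the standard library. The helper name with_page is made up for this example; it assumes the page number always travels in the page query parameter, as it does in the URL above.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def with_page(url: str, page: int) -> str:
    """Return `url` with its `page` query parameter set to `page`.

    Hypothetical helper for illustration; not part of Selenium.
    """
    parts = urlsplit(url)
    # Keep every other parameter (pid, lid, marketplace) in its original order.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "page"]
    query.append(("page", str(page)))
    return urlunsplit(parts._replace(query=urlencode(query)))


url = ("https://www.flipkart.com/hp-15s-ryzen-3-dual-core-3250u-8-gb-1-tb-hdd-"
       "256-gb-ssd-windows-10-home-15s-gr0012au-laptop/product-reviews/"
       "itm9e1f8deeed35f?pid=COMFZHFWBE7APPH2&lid=LSTCOMFZHFWBE7APPH2AR705G"
       "&marketplace=FLIPKART&page=2")

# Same URL as above, but pointing at the third page of reviews.
print(with_page(url, 3))
```

The string returned by with_page can be passed straight to driver.get.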

So what I would do here is put an integer page_number variable into the URL, like this:

Sample code:

from time import sleep  # needed for the sleep() call below

driver.maximize_window()  # `driver` is the webdriver instance created in the question
page_number = 1
for page in range(4):
    driver.get("https://www.flipkart.com/hp-15s-ryzen-3-dual-core-3250u-8-gb-1-tb-hdd-256-gb-ssd-windows-10-home-15s-gr0012au-laptop/product-reviews/itm9e1f8deeed35f?pid=COMFZHFWBE7APPH2&lid=LSTCOMFZHFWBE7APPH2AR705G&marketplace=FLIPKART&page=%s" % page_number)
    # scrape anything you want here
    page_number = page_number + 1
    sleep(5)  # give the page time to load before requesting the next one
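Since the goal is 30-40 reviews at roughly 10 per page, the loop can also stop as soon as enough have been collected instead of running a fixed number of times. The sketch below shows only that stopping logic. The names collect_reviews, fetch_page and fake_fetch are assumptions made for this example, and fake_fetch merely stands in for the real scraping step so the snippet runs without a browser.

```python
from typing import Callable, List


def collect_reviews(fetch_page: Callable[[int], List[str]],
                    target: int = 30,
                    max_pages: int = 10) -> List[str]:
    """Request pages 1, 2, 3, ... until `target` reviews have been collected."""
    reviews: List[str] = []
    for page_number in range(1, max_pages + 1):
        batch = fetch_page(page_number)
        if not batch:
            # An empty page is treated as the end of the review list.
            break
        reviews.extend(batch)
        if len(reviews) >= target:
            break
    return reviews[:target]


def fake_fetch(page_number: int) -> List[str]:
    """Stand-in for the real scraper: pretends every page holds 10 reviews."""
    return ["review %d-%d" % (page_number, i) for i in range(10)]


reviews = collect_reviews(fake_fetch, target=30)
print(len(reviews))  # 30
```

With Selenium, fetch_page would call driver.get on the URL for that page number and return the text of the review elements. The selector for those elements is not given in this thread, so it is left out here.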