Find the URL after a button click on a website using Selenium (Python)

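Any button on a website may lead to a link. On the site below, how can I find the URL that opens in the next tab?

I want to print and scrape the URL after clicking the button. I am using the Firefox web driver.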

driver.get("https://www.dove.com/us/en/skin-care/body-lotion/cream-oil-intensive-body-lotion.html")
driver.find_element_by_xpath("//span[contains(text(),'Ingredients')]").click()
time.sleep(3)
driver.find_element_by_xpath("//button[contains(text(),'Go to SmartLabel™')]").click()

This should be simple: just use driver.current_url after switching to the new tab. With your code, you can try:

driver.get("https://www.dove.com/us/en/skin-care/body-lotion/cream-oil-intensive-body-lotion.html")
driver.find_element_by_xpath("//span[contains(text(),'Ingredients')]").click()
time.sleep(3)
driver.find_element_by_xpath("//button[contains(text(),'Go to SmartLabel™')]").click()
time.sleep(5)
driver.switch_to.window(driver.window_handles[1])

print(driver.current_url)

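I found a few problems:

1. Waits. Get rid of time.sleep() and replace it with explicit or implicit waits. I noticed that these elements are the last to load on the page: picture[class='loaded']. So I added a wait for them.

2. To switch between tabs, use driver.switch_to.window(driver.window_handles[1]) to move to the new tab and driver.switch_to.window(driver.window_handles[0]) to return to the initial tab.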
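
Indexing window_handles with [1] assumes the new tab is the second entry. A minimal sketch of a slightly more defensive variant follows, which waits until a handle appears that was not there before the click. The helper name switch_to_new_tab and its click parameter are hypothetical, not part of Selenium; wait is assumed to be a WebDriverWait like the one in the solutions below.

```python
def switch_to_new_tab(driver, wait, click):
    """Run click(), wait for the tab it opens, and switch the driver to it.

    Returns the handle of the original tab so the caller can switch back.
    """
    original = driver.current_window_handle
    before = set(driver.window_handles)

    click()  # the action that opens the new tab, e.g. button.click

    # WebDriverWait.until calls the function with the driver and retries
    # until it returns a truthy value (here: the non-empty set of new handles).
    new_handles = wait.until(lambda d: set(d.window_handles) - before)
    driver.switch_to.window(new_handles.pop())
    return original


# Hypothetical usage, assuming `button` is the "Go to SmartLabel" element:
#   original = switch_to_new_tab(driver, wait, button.click)
#   print(driver.current_url)
#   driver.close()
#   driver.switch_to.window(original)
```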

Solution for Chrome:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC


driver = webdriver.Chrome(executable_path='/snap/bin/chromium.chromedriver')
# driver.implicitly_wait(10)
driver.get("https://www.dove.com/us/en/skin-care/body-lotion/cream-oil-intensive-body-lotion.html")
wait = WebDriverWait(driver, 30)
wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "picture[class='loaded']")))
# element_to_be_clickable returns the element once it is ready, so click it directly
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, ".collapsed>a[title='Ingredients']"))).click()
wait.until(EC.element_to_be_clickable((By.XPATH, "//button[contains(text(),'Go to SmartLabel')]"))).click()
driver.switch_to.window(driver.window_handles[1])
print(driver.current_url)
driver.close()
driver.switch_to.window(driver.window_handles[0])
print(driver.current_url)

Output:

https://smartlabel.unileverusa.com/011111375512-0001-en-US/index.html
https://www.dove.com/us/en/skin-care/body-lotion/cream-oil-intensive-body-lotion.html

For Firefox, you need to wait for at least one element on the second page; otherwise the output will not give you the expected link:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC


driver = webdriver.Firefox()
driver.implicitly_wait(10)
driver.get("https://www.dove.com/us/en/skin-care/body-lotion/cream-oil-intensive-body-lotion.html")
wait = WebDriverWait(driver, 30)
wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "picture[class='loaded']")))
# element_to_be_clickable returns the element once it is ready, so click it directly
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, ".collapsed>a[title='Ingredients']"))).click()
wait.until(EC.element_to_be_clickable((By.XPATH, "//button[contains(text(),'Go to SmartLabel')]"))).click()
driver.switch_to.window(driver.window_handles[1])
wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".container-fluid.content-section")))
print(driver.current_url)
driver.close()
driver.switch_to.window(driver.window_handles[0])
print(driver.current_url)
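
If you would rather not depend on a selector from the second page, another option is to wait on the URL itself. The sketch below assumes that the unexpected value Firefox reports is a placeholder such as about:blank, which I have not verified for this site; url_has_loaded is a hypothetical helper, not a Selenium function.

```python
def url_has_loaded(placeholders=("", "about:blank")):
    """Build a wait condition that returns the current URL once it is a real one."""
    def condition(driver):
        url = driver.current_url
        # Returning False tells WebDriverWait.until to keep polling.
        return url if url not in placeholders else False
    return condition


# Hypothetical usage after switching to the new tab:
#   url = wait.until(url_has_loaded())
#   print(url)
```

Because WebDriverWait.until returns whatever truthy value the condition produces, the call hands back the URL directly.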

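P.S. If you are looking for a way to read the link from an attribute of the button, that is not possible, because this button has no such attribute. The link is generated.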