Using Selenium to get the text inside an 'h1' tag with an id

I am trying to get the following text: 'Jarrow Formulas, Methyl Folate, 400 mcg, 60 Veggie Caps'

I used this code, but it did not work:

driver = webdriver.Chrome(chrome_path)
driver.get("https://www.iherb.com/c/Vitamin-B?sr=2")
wait = WebDriverWait(driver, 10)

item_name = list()

#close the pop up
wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR,"svg[data-ga-event-action='list-close']"))).click()

#store all the links in a list
item_links = [item.get_attribute("href") for item in wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR,".absolute-link-wrapper > a.product-link")))]

for item_link in item_links:
    driver.get(item_link)
    item_name.append(driver.find_element_by_css_selector('[id="name"]').text)  # this line doesn't work

To print the text value, you can use either of the following approaches:

  • Using xpath and the text attribute:

    print(driver.find_element_by_xpath("//section[@class='column image-fixed']//following::section[2]//div[@id='product-summary-header']//h1[@id='name']").text)
    
  • Using xpath and get_attribute():

    print(driver.find_element_by_xpath("//section[@class='column image-fixed']//following::section[2]//div[@id='product-summary-header']//h1[@id='name']").get_attribute("innerHTML"))
    
  • Console output:

    Jarrow Formulas, Methyl Folate, 400 mcg, 60 Veggie Caps
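
The long XPath above ultimately targets an h1 whose id is 'name'. Since an id is meant to be unique within a page, a shorter expression keyed on the id alone may be enough. The following is a minimal sketch of that idea using the standard library's xml.etree.ElementTree rather than Selenium; the markup is a hypothetical, simplified stand-in for the product page, not a copy of it.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup (an assumption); the real page is far larger.
PAGE = """
<html><body>
  <section class="column image-fixed"><img alt="product"/></section>
  <section><p>thumbnails</p></section>
  <section>
    <div id="product-summary-header">
      <h1 id="name">Jarrow Formulas, Methyl Folate, 400 mcg, 60 Veggie Caps</h1>
    </div>
  </section>
</body></html>
"""

root = ET.fromstring(PAGE.strip())

# Look the heading up by tag and id only, ignoring the surrounding sections.
heading = root.find(".//h1[@id='name']")
product_name = heading.text.strip()
print(product_name)
```

In Selenium the analogous locators would be (By.XPATH, "//h1[@id='name']") or (By.ID, "name"). Whether they match on the live page has not been verified here.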
    

Ideally, you should use WebDriverWait with visibility_of_element_located(), and you can use either of the following:

  • Using xpath and the text attribute:

    driver.get('https://ca.iherb.com/pr/Jarrow-Formulas-Methyl-Folate-400-mcg-60-Veggie-Caps/42778')
    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//section[@class='column image-fixed']//following::section[2]//div[@id='product-summary-header']//h1[@id='name']"))).text)
    
  • Using xpath and get_attribute():

    driver.get('https://ca.iherb.com/pr/Jarrow-Formulas-Methyl-Folate-400-mcg-60-Veggie-Caps/42778')
    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//section[@class='column image-fixed']//following::section[2]//div[@id='product-summary-header']//h1[@id='name']"))).get_attribute("innerHTML"))
    
  • Console output:

    Jarrow Formulas, Methyl Folate, 400 mcg, 60 Veggie Caps
    
  • Note: you have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
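
To show why the explicit wait helps, here is a rough sketch of the polling idea behind it: a condition is called repeatedly until it returns something truthy or a timeout expires. This is not Selenium's implementation. wait_until and heading_text are hypothetical names made up for this illustration, and the "element" is simulated so the snippet runs without a browser.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() until it returns a truthy value or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll_interval)


# Simulated page: the heading text only becomes available on the third check,
# much like an element that is rendered some time after the page load starts.
attempts = {"count": 0}


def heading_text():
    attempts["count"] += 1
    if attempts["count"] >= 3:
        return "Jarrow Formulas, Methyl Folate, 400 mcg, 60 Veggie Caps"
    return None


name = wait_until(heading_text, timeout=2.0, poll_interval=0.01)
print(name)
```

A lookup made once, immediately after driver.get(), corresponds to the first call of heading_text() here, which finds nothing. The waiting version keeps checking until the text is there.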
    


References

Links to useful documentation:

  • The get_attribute() method gets the given attribute or property of the element.
  • The text attribute returns the text of the element.
  • Difference between text and innerHTML using Selenium
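
For the product heading both approaches print the same string because the h1 holds plain text only. As a rough illustration of how the two can diverge once the element has child tags, the sketch below again uses xml.etree.ElementTree instead of Selenium, on hypothetical markup; it only approximates what .text and get_attribute("innerHTML") would return in a browser.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup (an assumption): a heading with a nested tag.
SNIPPET = '<h1 id="name">Methyl Folate, <span>400 mcg</span></h1>'

h1 = ET.fromstring(SNIPPET)

# Approximates .text: the text content with all tags stripped.
text_only = "".join(h1.itertext())

# Approximates innerHTML: everything between the opening and closing h1 tags.
inner_html = (h1.text or "") + "".join(
    ET.tostring(child, encoding="unicode") for child in h1
)

print(text_only)
print(inner_html)
```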