How to select all tags with Selenium and Python
<a title="Citrate of Magnesia for Consumers" href="/cdi/citrate-of-magnesia-solution.html">
<b>Citrate of Magnesia</b>
I'm trying to extract data from a drug website. How do I select all of the text inside the <b></b> tags? That is the text I want.
I tried *//a[@b] but it didn't work.
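Assuming you want to rely on the preceding a element, use the following-sibling axis, for example:
//a/following-sibling::b
Python code:
from selenium.webdriver.common.by import By
# `driver` is assumed to be an already-created WebDriver instance
b = driver.find_element(By.XPATH, "//a/following-sibling::b")
print(b.text)
If you want every b tag that has an a as a preceding element:
for b in driver.find_elements(By.XPATH, "//a/following-sibling::b"):
    print(b.text)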
Solution worked out after discussion in chat:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome()
driver.get("http://www.drugs.com/drug-class/laxatives.html?condition_id=&generic=0&sort=rating&order=desc")
# wait for the table list to load
table = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, "table.data-list")))
# print the text of every <b> that is a direct child of a link inside a table cell
for b in table.find_elements(By.CSS_SELECTOR, "tr td > a[href] > b"):
    print(b.text)
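As a side note on why the original attempt matched nothing: in XPath, `[@b]` tests for an attribute named b, not for a child element. The final selector, `a[href] > b`, suggests that on the real page each b is a child of its a. The sketch below illustrates the difference without a browser, using the limited XPath subset in Python's standard-library `xml.etree.ElementTree` as a stand-in for Selenium. The markup is hypothetical: it is modelled on the snippet in the question, and the closing `</a>`, the table wrapper, and the second row ("Example Drug", `/example.html`) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup modelled on the question's snippet. The closing </a>,
# the table wrapper, and the second row are assumptions for illustration.
SAMPLE = """
<table>
  <tr><td><a title="Citrate of Magnesia for Consumers" href="/cdi/citrate-of-magnesia-solution.html"><b>Citrate of Magnesia</b></a></td></tr>
  <tr><td><a href="/example.html"><b>Example Drug</b></a></td></tr>
</table>
""".strip()

root = ET.fromstring(SAMPLE)

# [@b] is an attribute test: it matches <a> elements that carry an
# attribute named "b". None of the links here has one.
by_attribute = root.findall(".//a[@b]")

# a/b selects <b> elements that are direct children of an <a>.
by_child = [b.text for b in root.findall(".//a/b")]

print(len(by_attribute))  # 0
print(by_child)           # ['Citrate of Magnesia', 'Example Drug']
```

With Selenium the equivalent child lookup would be the XPath `//a/b` or the CSS selector `a > b`; `following-sibling::b` only matches when the b sits next to the a rather than inside it.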