How to extract the text from the HTML using Selenium and Python
I have this HTML:
and I want to get the text "rataoriginal". (This text changes; I need to read this part of the HTML as text.)
I tried:
xpath = "//span[@class='_5h6Y_ _3Whw5 selectable-text invisible-space copyable-text']"
auxa = driver.find_element_by_xpath(xpath).text
print(auxa)
But it prints the same as print("\n"). I don't want to use BeautifulSoup for now.
This HTML is from 'https://web.whatsapp.com'.
Use this XPath: //*[contains(text(),"rataoriginal")]
The element is dynamic, so to print its text you have to induce WebDriverWait for visibility_of_element_located(). You can use either of the following locator strategies:
Using CSS_SELECTOR:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "span.selectable-text.invisible-space.copyable-text[dir='auto']"))).text)
Using XPATH:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//span[contains(@class, 'selectable-text') and contains(@class, 'invisible-space')][contains(@class, 'copyable-text') and @dir='auto']"))).text)
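One caveat with contains(@class, ...) is that it is a substring test, so a fragment such as 'text' would also match a class like 'copyable-text'. Below is a minimal sketch of a stricter, whole-token match. The helper name xpath_with_classes is hypothetical and not part of Selenium; it only builds the XPath string, using the class names from the CSS selector above.

```python
def xpath_with_classes(tag, classes, attrs=None):
    """Build an XPath matching `tag` elements that carry every class in `classes`.

    Hypothetical helper (not a Selenium API). Each class is matched as a whole
    space-separated token instead of as a substring of the class attribute.
    """
    predicates = [
        f"contains(concat(' ', normalize-space(@class), ' '), ' {name} ')"
        for name in classes
    ]
    # Optional exact-match attribute tests, e.g. {"dir": "auto"}.
    for key, value in (attrs or {}).items():
        predicates.append(f"@{key}='{value}'")
    return f"//{tag}[{' and '.join(predicates)}]"


xpath = xpath_with_classes(
    "span",
    ["selectable-text", "invisible-space", "copyable-text"],
    {"dir": "auto"},
)
print(xpath)
```

The resulting string can be passed as the second item of a (By.XPATH, ...) locator tuple in place of the hand-written expression.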
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
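Putting the pieces together: .text only returns text that Selenium considers visible, which is consistent with the empty output in the question. The sketch below is one possible variation, not the answer's exact method. It assumes a hypothetical condition non_empty_text (not part of expected_conditions) that keeps polling until a matching element reports some text; WebDriverWait.until accepts any callable that takes the driver. The function name read_span_text is also hypothetical, the 20 second timeout mirrors the answer, and a driver that already has the page loaded is assumed.

```python
def non_empty_text(by, value):
    """Hypothetical wait condition: first non-blank .text among matching elements."""
    def _condition(driver):
        for element in driver.find_elements(by, value):
            text = element.text.strip()
            if text:
                return text  # truthy: until() stops waiting and returns this value
        return False  # falsy: WebDriverWait polls again until the timeout
    return _condition


def read_span_text(driver, timeout=20):
    """Wait up to `timeout` seconds for the span to show text, then return it."""
    # Imported here so non_empty_text stays usable without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    selector = "span.selectable-text.invisible-space.copyable-text[dir='auto']"
    return WebDriverWait(driver, timeout).until(
        non_empty_text(By.CSS_SELECTOR, selector)
    )
```

Calling print(read_span_text(driver)) would then print the text once it is rendered, or raise TimeoutException if nothing appears within the timeout.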