Access html text through python selenium

I am trying to get the text marked with hash signs (#) below.

<div class="XYZ">
        <h5>
    "
          #######Reports due by##############
    "
          <span class="hbl" data-hint="task due date">
            <i class="icon-boxy-sign"></i>
          </span>
        </h5>
        <script type="jsv#61^"></script><script type="jsv#123_"></script>
        <script type="jsv#60^"></script><script type="jsv#124_"></script>
      <script type="jsv#59^"></script><p>#################07/10/2020#######################</p><script type="jsv/125^"></script>
    <script type="jsv/52_"></script><script type="jsv/24^"></script>
        <script type="jsv/42_"></script><script type="jsv/23^"></script>
      </div>

The Python line I use to get the text inside the hash marks:

txt = dat =wait(driver, 10).until(EC.visibility_of_element_located((By.CSS_SELECTOR, 'div[class="XYZ"]'))).text

I want to print the lines "Reports due by" and "07/10/2020", but I keep getting timeout exceptions and "unable to locate element" errors.

Change By.CSS_SELECTOR to By.XPATH and update the locator to "//div[@class='XYZ']". That should work.

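You seem to be close. To extract the text (marked with hash signs), you have to induce WebDriverWait for visibility_of_element_located(), and you can use either of the following locator strategies:

  • Using CSS_SELECTOR: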

    print(WebDriverWait(browser, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.XYZ"))).text)
    
  • Using XPATH:

    print(WebDriverWait(browser, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='XYZ']"))).text)
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
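
Putting the pieces above together, here is a minimal sketch of how the heading and the date could be read as two separate strings. It assumes the `div.XYZ` element from the question is not inside an iframe and that its visible text comes back as one line per rendered block; the function names `split_visible_text` and `read_due_date` are made up for illustration and are not part of Selenium.

```python
def split_visible_text(text):
    """Return the non-empty, stripped lines of an element's visible text."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def read_due_date(driver, timeout=20):
    """Wait for div.XYZ to become visible and return its text lines.

    `driver` is an already-created Selenium WebDriver (assumption).
    """
    # Imported here so the pure helper above can be used without Selenium.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    container = WebDriverWait(driver, timeout).until(
        EC.visibility_of_element_located((By.CSS_SELECTOR, "div.XYZ"))
    )
    # .text returns only rendered text, so the <script> tags contribute nothing.
    return split_visible_text(container.text)
```

Calling `read_due_date(driver)` after the page has loaded should then give a list whose items can be printed one by one. If the wait still times out, the element may live in a frame or be rendered under a different class name, which this sketch does not handle.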
    
