Python Selenium: How do I print the values from a website in a text file?

Question

I am trying to write a script that will take the following 6 values from the website tulsaspca.org and print them in a .txt file.

The final output should be:

HTML for "Animals Placed":

<span class="number" data-to="905">905</span>
</div>
<p class="title">Animals Placed</p>

I wrote the code below, but it does not seem to work.

for element in driver.find_elements_by_class_name('Animals Placed'):
  print(element.text)

Answer 1

I don't see the HTML for all 6 numbers.

But for this HTML

<span class="number" data-to="905">905</span>
</div>
<p class="title">Animals Placed</p>

your locator should look like this:

XPath

//p[text()='Animals Placed']/preceding-sibling::div/span[@class='number']

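Please check in the dev tools (Google Chrome) whether we have a unique entry in the HTML DOM.

Steps to check:

Press F12 in Chrome -> go to the Elements section -> press CTRL + F -> paste the XPath and see whether the desired element is highlighted with a 1/1 matching node.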
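
The same uniqueness check can also be scripted. A minimal sketch, assuming driver is an already-started Selenium WebDriver; count_matches and ANIMALS_PLACED_XPATH are hypothetical names chosen for illustration, not part of Selenium:

```python
def count_matches(driver, xpath):
    """Return how many nodes in the current page match the XPath."""
    # "xpath" is the string value behind Selenium's By.XPATH constant,
    # so this call is equivalent to driver.find_elements(By.XPATH, xpath).
    return len(driver.find_elements("xpath", xpath))


ANIMALS_PLACED_XPATH = (
    "//p[text()='Animals Placed']"
    "/preceding-sibling::div/span[@class='number']"
)

# Usage (needs a live browser session, so it is left commented out):
# assert count_matches(driver, ANIMALS_PLACED_XPATH) == 1
```

A count of 1 corresponds to the 1/1 match shown in the dev tools search box; a count of 0 usually means the page has not finished rendering or the locator is wrong.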

Code trial 1:

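# Crude fixed pause so the page can render; needs "import time"
time.sleep(5)
# find_element_by_xpath was removed in recent Selenium 4 releases; use find_element(By.XPATH, ...)
animal_num = driver.find_element(By.XPATH, "//p[text()='Animals Placed']/preceding-sibling::div/span[@class='number']").text
print(animal_num)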

Code trial 2:

animal_num = WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//p[text()='Animals Placed']/preceding-sibling::div/span[@class='number']"))).text
print(animal_num)

Imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

Update:

Please use the XPath below:

//span[@class='number' and @data-to]

It should match all the number nodes in the HTML DOM.

driver.maximize_window()
driver.get("https://tulsaspca.org/")
driver.execute_script("window.scrollTo(0, 250)")
all_numbers = driver.find_elements(By.XPATH, "//span[@class='number' and @data-to]")
for number in all_numbers:
    print(number.text)
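
The loop above prints to the console, while the question asks for a text file. A minimal sketch of that last step, assuming the texts have already been collected into a list; write_values and the file name numbers.txt are hypothetical names chosen for illustration:

```python
from pathlib import Path


def write_values(values, path):
    """Write each non-empty value on its own line and return the lines written."""
    # Strip whitespace and skip blanks, e.g. counters that have not rendered yet.
    lines = [str(v).strip() for v in values if str(v).strip()]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines


# Usage with the elements found above (needs a live browser session):
# write_values([number.text for number in all_numbers], "numbers.txt")
```

Returning the written lines makes it easy to check that all six values were found before trusting the file.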

Answer 2

To scrape the six values from the TULSASPCA website and print them in a text file, you need to induce WebDriverWait for visibility_of_all_elements_located(). Then, using a list comprehension, you can create a list, build a DataFrame from it, and finally export the values to a text file excluding the index, as follows:

Code block:

driver.get("https://tulsaspca.org/")
driver.execute_script("window.scrollTo(0, 250)")
# read into a DataFrame
df = pd.DataFrame([my_elem.get_attribute("data-to") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//span[@class='number']")))])
# Exporting as TEXT file excluding the Index (raw string, so "\n" in the path is not read as a newline)
df.to_csv(r"C:\Data_Files\output_files\new_text_marks.txt", index=False)
driver.quit()
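
One detail worth checking in the export step is the Windows path: in an ordinary string literal, \n is read as a newline, so the file name would be split in two. A small sketch of the difference, using only the path from the code above (the variable names plain and raw are illustrative):

```python
# Ordinary literal: the two harmless backslashes are escaped here to avoid
# warnings, but "\n" is still interpreted as a newline character.
plain = "C:\\Data_Files\\output_files\new_text_marks.txt"

# Raw literal: every backslash is kept exactly as typed.
raw = r"C:\Data_Files\output_files\new_text_marks.txt"

print("\n" in plain)  # True: the path contains a line break
print("\n" in raw)    # False: the path is intact
```

Forward slashes or pathlib.Path would avoid the problem as well; the raw string is simply the smallest change to the original line.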

Note: You have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import pandas as pd

PS: You may want to drop the first row from the DataFrame.
