Python Selenium to Scrape USPS

Question

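I am trying to create a script that logs in to the USPS website and retrieves the list of incoming packages from Informed Delivery.

I have tried two approaches:

Requests
Selenium

Requests

I captured the login request and imported it into Postman. When I send the request, I receive this error: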

{
    "actionErrors": [
        "We have encountered an error.  Please refresh the page and try again."
    ],
    "actionMessages": [],
    "fieldErrors": {}
}

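The request body sends a token value (taken from the login form). The request headers also include several headers whose names start with x-jfuguzwb-. These appear to be tokens as well, each with a different value.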
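One possible reason a replayed request fails is that the token captured from an earlier page load is no longer valid. The sketch below shows how a hidden form token could be read from freshly fetched login page HTML using only the standard library. The form markup and the field name `token` are assumptions for illustration, not the real USPS form, and the sketch makes no attempt to reproduce the x-jfuguzwb- headers; if those are computed by client-side script, fetching the form alone would not be enough.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.hidden_fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr_map = dict(attrs)
        is_hidden = (attr_map.get("type") or "").lower() == "hidden"
        if is_hidden and attr_map.get("name"):
            self.hidden_fields[attr_map["name"]] = attr_map.get("value") or ""


def extract_hidden_fields(html):
    """Return a dict of hidden input names to values found in the HTML."""
    parser = HiddenInputParser()
    parser.feed(html)
    return parser.hidden_fields


# Hypothetical login form markup; the real field names may differ.
SAMPLE_FORM = """
<form id="loginForm" method="post">
  <input type="hidden" name="token" value="abc123">
  <input type="text" name="username">
  <input type="password" name="password">
</form>
"""

fields = extract_hidden_fields(SAMPLE_FORM)
print(fields)
```

The extracted values would then be sent back in the body of the login request made in the same session, instead of reusing values copied from an earlier capture.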

Selenium

This does not work either, even with a headless browser.

LOGIN_URL = "https://reg.usps.com/entreg/LoginAction_input?app=Phoenix&appURL=https://www.usps.com/"
driver.get(LOGIN_URL)
username = driver.find_element_by_name('username')
username.send_keys(USERNAME)
password = driver.find_element_by_name('password')
password.send_keys(PASSWORD)
driver.find_element_by_id('btn-submit').click()

The page shows the error "Our apologies that you are having issues with your login."

There is a Python module called myusps, but it has not been updated in several years.

Any suggestions on how I can accomplish this?

Answer 1

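A bit more information about your use case and the error you are seeing ("Our apologies that you are having issues with your login.") would have helped us debug the issue more effectively. However, I was able to send a character sequence to the username and password fields and invoke click() on the login button by inducing WebDriverWait for element_to_be_clickable(), and you can use either of the following:

Using css-selectors: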

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

options = webdriver.ChromeOptions() 
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
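# executable_path is the Selenium 3 form; Selenium 4 expects a Service object instead
driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
driver.get('https://reg.usps.com/entreg/LoginAction_input?app=Phoenix&appURL=https://www.usps.com/')
# Wait up to 20 seconds for the username field to become clickable before typing
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input#username"))).send_keys("Bijan")
# find_element(By..., ...) replaces the find_element_by_* helpers removed in Selenium 4
driver.find_element(By.CSS_SELECTOR, "input#password").send_keys("Bijan")
driver.find_element(By.CSS_SELECTOR, "button#btn-submit").click()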

Using xpath:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

options = webdriver.ChromeOptions() 
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)

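# executable_path is the Selenium 3 form; Selenium 4 expects a Service object instead
driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
driver.get('https://reg.usps.com/entreg/LoginAction_input?app=Phoenix&appURL=https://www.usps.com/')
# Wait up to 20 seconds for the username field to become clickable before typing
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[@id='username']"))).send_keys("Bijan")
# find_element(By..., ...) replaces the find_element_by_* helpers removed in Selenium 4
driver.find_element(By.XPATH, "//input[@id='password']").send_keys("Bijan")
driver.find_element(By.XPATH, "//button[@id='btn-submit']").click()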

Note: You have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
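
For readers unfamiliar with explicit waits, the idea behind WebDriverWait(driver, 20).until(...) is roughly a polling loop: check a condition, sleep briefly, and give up once a deadline passes. The following is a simplified stand-in written with the standard library only, not Selenium's implementation. The names `wait_until`, `WaitTimeout`, and `fake_element_ready` are hypothetical, and Selenium's own class has more to it (for example, it can ignore certain exceptions while polling).

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never becomes truthy (hypothetical name)."""


def wait_until(condition, timeout=20.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Stand-in for a page element that only becomes clickable on the third check.
attempts = {"count": 0}


def fake_element_ready():
    attempts["count"] += 1
    return "username-field" if attempts["count"] >= 3 else None


element = wait_until(fake_element_ready, timeout=5.0, poll_interval=0.01)
print(element, attempts["count"])
```

This is why the wait is applied to the first field only in the snippets above: once the username input is clickable, the rest of the form is assumed to be present as well.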

Answer 2

The answer quoted below helped me automate the login on a site that I will leave unnamed. I recommend looking at the answer by user @colossatr0n.

You can use vim, or as @Vic Seedoubleyew has pointed out in the answer by @Erti-Chris Eelmaa, perl, to replace the cdc_ variable in chromedriver (see the post by @Erti-Chris Eelmaa to learn more about that variable). Using vim or perl prevents you from having to recompile source code or use a hex editor. Make sure to make a copy of the original chromedriver before attempting to edit it. Also, the methods below were tested on chromedriver version 2.41.578706.
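
The same kind of in-place substitution can be sketched in Python instead of vim or perl. This is a minimal illustration under stated assumptions: the helper name `patch_copy` and the replacement string `dog_` are arbitrary choices, the replacement is kept the same length as `cdc_` so that nothing else in the binary shifts, and the patched result is written to a separate copy so the original stays untouched. It is demonstrated on a throwaway file rather than a real chromedriver, and there is no guarantee the technique has any effect on chromedriver versions other than the one mentioned above.

```python
import shutil
import tempfile
from pathlib import Path


def patch_copy(source, destination, old=b"cdc_", new=b"dog_"):
    """Write a copy of a binary file with every occurrence of old replaced by new."""
    if len(old) != len(new):
        raise ValueError("replacement must be the same length as the original")
    source, destination = Path(source), Path(destination)
    data = source.read_bytes()
    count = data.count(old)
    destination.write_bytes(data.replace(old, new))
    # Preserve permission bits (such as the executable flag) on the copy.
    shutil.copymode(source, destination)
    return count


# Demonstration on a throwaway file, not a real chromedriver binary.
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "fake_driver.bin"
    patched = Path(tmp) / "fake_driver_patched.bin"
    original.write_bytes(b"\x00\x01var cdc_abc123 = 1;\x00cdc_xyz\x02")
    replaced = patch_copy(original, patched)
    original_bytes = original.read_bytes()
    patched_bytes = patched.read_bytes()
    print(replaced)
```

Writing to a separate destination follows the advice above about keeping a copy of the original chromedriver before editing it.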
