Python Selenium to Scrape USPS

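I'm trying to create a script that logs in to the USPS website and retrieves the list of incoming packages from Informed Delivery.

I've tried two approaches:

  1. Requests
  2. Selenium

Requests

I captured the login request and imported it into Postman. When I send the request, I get this error: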

{
    "actionErrors": [
        "We have encountered an error.  Please refresh the page and try again."
    ],
    "actionMessages": [],
    "fieldErrors": {}
}

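The request body sends a token value (taken from the login form). The request headers also include several headers whose names begin with x-jfuguzwb-. These appear to be tokens with differing values.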
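For anyone reproducing the token part, here is a minimal sketch of pulling hidden form fields out of a page's HTML with only the standard library. The sample markup and the field name `token` are hypothetical, since the real login form's field names are not shown here. It also does nothing about the x-jfuguzwb- headers, so on its own it would not be expected to make the replayed request succeed.

```python
from html.parser import HTMLParser


class HiddenInputCollector(HTMLParser):
    """Collect name/value pairs from <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        # Attribute values keep their original case, so normalise the type.
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def hidden_fields(html):
    """Return a dict of the hidden form fields found in the given HTML."""
    parser = HiddenInputCollector()
    parser.feed(html)
    parser.close()
    return parser.fields


# Hypothetical markup; the real form's field names are an assumption.
SAMPLE_FORM = """
<form id="loginForm" method="post">
  <input type="hidden" name="token" value="abc123">
  <input type="text" name="username">
  <input type="password" name="password">
</form>
"""

print(hidden_fields(SAMPLE_FORM))  # {'token': 'abc123'}
```

The resulting dict could be merged into the POST body alongside the username and password, assuming the server expects the token back under the same field name.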


Selenium

Even using a headless browser doesn't work.

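from selenium.webdriver.common.by import By

# `driver` is an already-created WebDriver; USERNAME and PASSWORD are defined elsewhere.
LOGIN_URL = "https://reg.usps.com/entreg/LoginAction_input?app=Phoenix&appURL=https://www.usps.com/"
driver.get(LOGIN_URL)
# find_element(By.X, ...) replaces the find_element_by_* helpers removed in Selenium 4
username = driver.find_element(By.NAME, 'username')
username.send_keys(USERNAME)
password = driver.find_element(By.NAME, 'password')
password.send_keys(PASSWORD)
driver.find_element(By.ID, 'btn-submit').click()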

It shows the error "Our apologies that you are having issues with your login."


There is a Python module called myusps, but it hasn't been updated in a few years.

Any suggestions on how I can accomplish this?

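A bit more information about your use case, and about the error "Our apologies that you are having issues with your login" that you are seeing, would have helped us debug the issue. However, I was able to send a character sequence to the username and password fields and invoke click() on the login button by inducing WebDriverWait for element_to_be_clickable(). You can use either of the following locator strategies:

  • Using CSS_SELECTOR: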

    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    
    options = webdriver.ChromeOptions()
    options.add_argument("start-maximized")
    # Drop the enable-automation switch and the automation extension
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option('useAutomationExtension', False)
    # Selenium 4 takes the driver path through a Service object instead of executable_path
    driver = webdriver.Chrome(service=Service(executable_path=r'C:\WebDrivers\chromedriver.exe'), options=options)
    driver.get('https://reg.usps.com/entreg/LoginAction_input?app=Phoenix&appURL=https://www.usps.com/')
    # Wait up to 20 seconds for the username field to become clickable
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input#username"))).send_keys("Bijan")
    driver.find_element(By.CSS_SELECTOR, "input#password").send_keys("Bijan")
    driver.find_element(By.CSS_SELECTOR, "button#btn-submit").click()
    
  • Using XPATH:

    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    
    options = webdriver.ChromeOptions()
    options.add_argument("start-maximized")
    # Drop the enable-automation switch and the automation extension
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option('useAutomationExtension', False)
    
    # Selenium 4 takes the driver path through a Service object instead of executable_path
    driver = webdriver.Chrome(service=Service(executable_path=r'C:\WebDrivers\chromedriver.exe'), options=options)
    driver.get('https://reg.usps.com/entreg/LoginAction_input?app=Phoenix&appURL=https://www.usps.com/')
    # Wait up to 20 seconds for the username field to become clickable
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[@id='username']"))).send_keys("Bijan")
    driver.find_element(By.XPATH, "//input[@id='password']").send_keys("Bijan")
    driver.find_element(By.XPATH, "//button[@id='btn-submit']").click()
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    
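To make the explicit wait less of a black box: WebDriverWait polls a condition until it returns something truthy or the timeout expires. The following is a simplified, Selenium-free stand-in for that idea, not Selenium's actual implementation. The names `wait_until`, `WaitTimeout` and `fake_element_ready` are made up for illustration, and unlike Selenium this sketch treats every exception from the condition as "not ready yet".

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition is not met before the deadline."""


def wait_until(condition, timeout=20.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Any exception raised by condition() is treated as "not ready yet",
    which is broader than what Selenium ignores by default.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except Exception:
            pass
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} seconds")
        time.sleep(poll)


# Stand-in for a page element that only becomes available on the third check.
attempts = {"count": 0}


def fake_element_ready():
    attempts["count"] += 1
    return "username-field" if attempts["count"] >= 3 else None


print(wait_until(fake_element_ready, timeout=2.0, poll=0.01))  # username-field
```

In the snippets above, the role of `condition` is played by EC.element_to_be_clickable(...), and the value returned by until() is the element that send_keys() is then called on.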

The answer below helped me automate the login for a site I won't name. I'd suggest looking at the answer by user @colossatr0n.

You can use vim or, as @Vic Seedoubleyew has pointed out in the answer by @Erti-Chris Eelmaa, perl to replace the cdc_ variable in chromedriver (see the post by @Erti-Chris Eelmaa to learn more about that variable). Using vim or perl saves you from having to recompile the source code or use a hex editor. Make sure to make a copy of the original chromedriver before attempting to edit it. Also, the methods below were tested on chromedriver version 2.41.578706.
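Since the rest of this thread is in Python, here is a rough Python equivalent of the same idea. It is a sketch, not the vim or perl method from that answer, and it has not been tested against any chromedriver version. It assumes the marker is the byte string `cdc_` and that a same-length replacement is enough; `xyz_` and the file names in the usage comment are arbitrary placeholders.

```python
import shutil
from pathlib import Path


def replace_marker(data, old=b"cdc_", new=b"xyz_"):
    """Return data with every occurrence of old replaced by new.

    The replacement must have the same length as the marker so that
    offsets inside the binary do not shift.
    """
    if len(old) != len(new):
        raise ValueError("replacement must be the same length as the marker")
    return data.replace(old, new)


def patch_copy(original, patched, old=b"cdc_", new=b"xyz_"):
    """Copy original to patched, rewrite the marker in the copy only,
    and return how many occurrences were replaced."""
    original, patched = Path(original), Path(patched)
    shutil.copy2(original, patched)  # the original file is never modified
    data = patched.read_bytes()
    patched.write_bytes(replace_marker(data, old, new))
    return data.count(old)


# In-memory demonstration on made-up bytes:
print(replace_marker(b"var cdc_abc = 1;"))  # b'var xyz_abc = 1;'

# Hypothetical usage on a real file (paths are placeholders):
# patch_copy("chromedriver", "chromedriver_patched")
```

Working on a copy follows the advice above about keeping the original chromedriver, and the length check guards against shifting offsets inside the binary.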