Response 403 Error When Trying to Log In to a Website Using Python Requests

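I am trying to extract data from this website, but I get a 403 response when I run session.post. Please see the code below for reference. Any help would be greatly appreciated.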

import requests
from bs4 import BeautifulSoup
import re

username = 'username'
password = 'password'
scrape_url = 'https://app.mapro.us/en/manage/owners/houses'

login_url = 'https://app.mapro.us/en/login'
login_info = {'login': username, 'pwd': password}
headers = {
            'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11',
            'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
            'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3',
            'Accept-Encoding': 'gzip, deflate, br',
            'Accept-Language': 'en-US,en;q=0.9',
            'Connection': 'keep-alive'
          }

#Start session.
session = requests.session()

#Login using your authentication information.
p = session.post(url=login_url, data=login_info, headers=headers)

print(p)

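I don't have an account, so I can't test this with a valid login and password, but when I checked the request in the browser (DevTools in Firefox/Chrome, Network tab) I noticed some differences.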

The main difference is:

  • It sends the POST to the address https://app.mapro.us/ajax?login=

If I use this URL, I get status 200 and JSON data:

{
    "status": 0,
    "msg": "Authorization denied."
}

With a valid account it might return a different message.
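To act on that reply in a script, you can parse the body instead of just printing it. Below is a minimal sketch that uses the sample body shown above. The helper name read_login_reply is mine, and I only know that status 0 came together with "Authorization denied."; what the server sends for a successful login is unknown to me.

```python
import json

# Body copied from the response shown above (received without a valid account).
SAMPLE_BODY = '{"status": 0, "msg": "Authorization denied."}'


def read_login_reply(body):
    """Return (status, msg) from the JSON text sent back by the login endpoint.

    Hypothetical helper: the field names come from the sample reply only,
    so missing fields are returned as None.
    """
    reply = json.loads(body)
    return reply.get("status"), reply.get("msg")


status, msg = read_login_reply(SAMPLE_BODY)
print(status, msg)
```

With a real response you would pass p.text to the helper, or call p.json() directly.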


There are other differences, which may or may not matter:

  • It sends the POST as an AJAX request, so it has the header

    'X-Requested-With': 'XMLHttpRequest'
    
  • It expects a JSON response, so it has a different Accept header

    'Accept': 'application/json, text/javascript, */*; q=0.01'
    
  • It sends the POST together with the cookie SID that it received in an earlier GET, so you may need to run session.get('https://app.mapro.us/en/login', ....) before the POST.

By the way: this GET always gets 403 in the browser too, so the 403 on that request does not seem to matter.
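The header differences listed above can be kept in one place so the GET and the POST don't repeat the whole dictionary. This is only a sketch: BASE_HEADERS and build_headers are names I made up, and the values are the ones from the question and from what I saw in DevTools.

```python
# Headers shared by both requests (values taken from the question's code).
BASE_HEADERS = {
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11',
    'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'en-US,en;q=0.9',
    'Connection': 'keep-alive',
}


def build_headers(ajax=False):
    """Return a fresh header dict for a page request or an AJAX request."""
    headers = dict(BASE_HEADERS)  # copy, so the shared dict is never modified
    if ajax:
        # The browser's login POST expects JSON and marks itself as AJAX.
        headers['Accept'] = 'application/json, text/javascript, */*; q=0.01'
        headers['X-Requested-With'] = 'XMLHttpRequest'
    else:
        # A normal page load expects HTML.
        headers['Accept'] = 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'
    return headers
```

You would then use build_headers() for session.get and build_headers(ajax=True) for session.post.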


import requests

session = requests.session()

# --- GET ---

headers = {
            'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11',
            'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
            'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3',
            'Accept-Encoding': 'gzip, deflate, br',
            'Accept-Language': 'en-US,en;q=0.9',
            'Connection': 'keep-alive',
          }

url_get = 'https://app.mapro.us/en/login'

p = session.get(url_get, headers=headers)

print(p)
#print(p.text)
print('Cookies SID:', session.cookies.get('SID'))

# --- POST ---

username = 'username'
password = 'password'

login_info = {'login': username, 'pwd': password}

headers = {
            'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11',
            'Accept': 'application/json, text/javascript, */*; q=0.01',  # expect JSON 
            'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3',
            'Accept-Encoding': 'gzip, deflate, br',
            'Accept-Language': 'en-US,en;q=0.9',
            'Connection': 'keep-alive',
            'X-Requested-With': 'XMLHttpRequest',  # send AJAX
          }

url_post = 'https://app.mapro.us/ajax?login='

p = session.post(url_post, headers=headers, data=login_info)

print(p)
print(p.text)