Multiple POST requests, second request gets a 404 error code
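I'm new to Python, and I'm running into the same problem described in an issue on the requests GitHub. I'm trying to authenticate to a website that redirects you to a security question after the initial login. Both the initial login and the follow-up page use the same "action URL", and the second POST request returns a 404. My code is below; any help would be appreciated. When I raised this on GitHub, they said it was a question to ask here because it isn't a problem on their side (even though they do have an issue about this on GitHub):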
from bs4 import BeautifulSoup as bs
import requests
import time

sources = ["https://www.dandh.com/v4/view?pageReq=dhMainNS"]
req = requests.Session()

def login():
    authentication_url = "https://www.dandh.com/v4/dh"
    header = {"user-agent": "Mozilla/5.0 (Windows NT 10.0; rv:60.0) Gecko/20100101 Firefox/60.0"}
    payload = {"Login": "12345",
               "PW": "12345",
               "Request": "Login"}
    payload2 = {"securityAnswer": "12345",
                "Request": "postForm"}
    req.post(authentication_url, data=payload, headers=header)
    time.sleep(3)
    req.post(authentication_url, data=payload2, headers=header)
    time.sleep(3)

def print_object(sources):
    for url in sources:
        soup_object = bs(req.get(url).text, "html.parser")
        print(soup_object.get_text())

def main():
    login()
    print_object(sources)

main()
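Part 1
After looking through the site, the problematic part turns out to be payload2. You only need to add one more item to it: "formName": "loginChallengeValidation". So overall, payload2 should look like this:
payload2 = {"formName": "loginChallengeValidation", "securityAnswer": your_security_answer,
            "Request": "postForm"}
This will stop you from getting the 404 status code. Hope this helps.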
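As a rough offline check of the Part 1 fix, the sketch below builds the corrected payload2 and prints the form body it would encode to. `build_challenge_payload` is a hypothetical helper name, not part of the original code, and `urllib.parse.urlencode` is used here only to illustrate how a dict is form-encoded. The field names come from the answer above; whether the site still expects them is an assumption.

```python
from urllib.parse import urlencode


def build_challenge_payload(security_answer):
    """Return the form fields for the security-question POST (hypothetical helper)."""
    return {
        "formName": "loginChallengeValidation",  # the field missing from the question's payload2
        "securityAnswer": security_answer,
        "Request": "postForm",
    }


payload2 = build_challenge_payload("12345")
# Form-encode the dict to see the body a form POST would carry.
encoded_body = urlencode(payload2)
print(encoded_body)
```

No request is sent here, so this only confirms the shape of the data, not the server's response.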
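Part 2
Although that answers the problem in your question, I doubt it is what you really want (because the code in Part 1 redirects you to another validation form). To get into the site itself, you have to add the following lines:
header2 = {"user-agent": "Mozilla/5.0 (Windows NT 10.0; rv:60.0) Gecko/20100101 Firefox/60.0", "Referer": "https://www.dandh.com/v4/view?pageReq=LoginChallengeValidation"}
and
req.post("https://www.dandh.com/v4/view?pageReq=LoginChallengeValidation", headers=header)
So your final code should look like this: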
from bs4 import BeautifulSoup as bs
import requests
import time

sources = ["https://www.dandh.com/v4/view?pageReq=dhMainNS"]
req = requests.Session()

def login():
    authentication_url = "https://www.dandh.com/v4/dh"
    header = {"user-agent": "Mozilla/5.0 (Windows NT 10.0; rv:60.0) Gecko/20100101 Firefox/60.0"}
    header2 = {"user-agent": "Mozilla/5.0 (Windows NT 10.0; rv:60.0) Gecko/20100101 Firefox/60.0", "Referer": "https://www.dandh.com/v4/view?pageReq=LoginChallengeValidation"}
    payload = {"Login": your_username,
               "PW": your_password,
               "Request": "Login"}
    payload2 = {"formName": "loginChallengeValidation", "securityAnswer": your_security_answer,
                "Request": "postForm", "btContinue": ""}
    req.post(authentication_url, data=payload, headers=header)
    req.post("https://www.dandh.com/v4/view?pageReq=LoginChallengeValidation", headers=header)
    time.sleep(3)
    req.post(authentication_url, data=payload2, headers=header2)
    time.sleep(3)

def print_object(sources):
    for url in sources:
        soup_object = bs(req.get(url).text, "html.parser")
        print(soup_object.get_text())

def main():
    login()
    print_object(sources)

main()
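(PS: you should replace your_username, your_password, and your_security_answer with your own credentials.)
Also, I want to point out that I don't think time.sleep(3) serves any purpose in the code.
I sincerely hope this helps.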
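To see which step fails without relying on sleeps, one option is to check every response as it comes back. The sketch below is only an illustration under stated assumptions: `post_checked` and `run_steps` are hypothetical helpers, and `session` is expected to be a `requests.Session` (or any object with a compatible `post` method). It relies on `Response.raise_for_status()`, which in requests raises an exception for 4xx and 5xx responses such as the 404 in the question.

```python
def post_checked(session, url, data=None, headers=None):
    """POST and fail loudly on an HTTP error status (hypothetical helper)."""
    response = session.post(url, data=data, headers=headers)
    # In requests, raise_for_status() raises HTTPError for 4xx/5xx responses.
    response.raise_for_status()
    return response


def run_steps(session, steps):
    """Run (url, data, headers) steps in order and collect their status codes."""
    statuses = []
    for url, data, headers in steps:
        response = post_checked(session, url, data=data, headers=headers)
        statuses.append(response.status_code)
    return statuses
```

With the login flow from the answer, `steps` would hold the three POSTs in order and `session` would be `req`, so a 404 on any step surfaces as an exception at that step instead of going unnoticed.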