jsoup 403 error. Simple webpage works fine
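I've looked at other similar posts, but nothing obvious jumped out. If I missed it, I'm sure someone will point me in the right direction!
The problem is that this code in my app used to work and now it doesn't, so I assume something changed on the website. I use exactly the same code for three other websites in the same app, and they work fine. Logcat shows the following error: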
org.jsoup.HttpStatusException: HTTP error fetching URL. Status=403, URL=http://notamweb.aviation-civile.gouv.fr/Script/IHM/Bul_Aerodrome.php
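I made this simple webpage, which I can launch from my local drive, and it works (if you try it yourself, you need to adjust the date and time to the current UTC time):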
<form method="post" action="http://notamweb.aviation-civile.gouv.fr/Script/IHM/Bul_Aerodrome.php">
Enter aerodrome ID(s)
<input type="text" name="AERO_Tab_Aero[0]">
<input type="hidden" name="AERO_Date_DATE" value="2016/01/25">
<input type="hidden" name="AERO_Date_HEURE" value="07:12">
<input type="hidden" name="bResultat" value="true">
<input type="hidden" name="ModeAffichage" value="COMPLET">
<input type="hidden" name="AERO_Duree" value="96">
<input type="hidden" name="AERO_CM_REGLE" value="1">
<input type="hidden" name="AERO_CM_GPS" value="2">
<input type="hidden" name="AERO_CM_INFO_COMP" value="1">
<p>
<input type="Submit" value="Get the bulletins">
</p>
</form>
This code returns the error:
doc = Jsoup.connect("http://notamweb.aviation-civile.gouv.fr/Script/IHM/Bul_Aerodrome.php")
.data("bResultat", "true").data("ModeAffichage", "COMPLET")
.data("AERO_Date_DATE", date).data("AERO_Date_HEURE", time).data("AERO_Duree", "96").data("AERO_CM_REGLE", "1").data("AERO_CM_GPS", "2")
.data("AERO_CM_INFO_COMP", "1").data("AERO_Tab_Aero[0]", params[0].substring(0, params[0].length() - 1))
.userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.63 Safari/537.36")
.timeout(6000).post();
Thoughts?
Edit #1:
The headers I see when using the mini webpage are:
REQUEST HEADERS
Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding:gzip, deflate
Accept-Language:en-US,en;q=0.8,en-AU;q=0.6
Cache-Control:max-age=0
Connection:keep-alive
Content-Length:180
Content-Type:application/x-www-form-urlencoded
Host:notamweb.aviation-civile.gouv.fr
Origin:null
Upgrade-Insecure-Requests:1
User-Agent:Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.111 Safari/537.36
FORM DATA
AERO_Tab_Aero[0]:KLAX
AERO_Date_DATE:2016/01/25
AERO_Date_HEURE:11:21
bResultat:true
ModeAffichage:COMPLET
AERO_Duree:96
AERO_CM_REGLE:1
AERO_CM_GPS:2
AERO_CM_INFO_COMP:1
A helpful idea from JonasCz:
A way to fix this would be to load the page in your desktop browser, and look at the network tab of the developer tools to see what exactly it's sending, especially the cookies and headers. My guess is that you need to send other / additional cookies, or maybe a Referer header, as the website may be checking for this, and then send same or similar headers / cookies with your request.
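One way to follow that suggestion is to paste the header dump from the browser's network tab into the app and replay it. The sketch below is only an illustration: the class name `CapturedHeaders` and the choice to skip `Host` and `Content-Length` (which the HTTP client normally sets itself) are my assumptions, not something from the original post. Each entry of the resulting map could then be passed to jsoup with `Connection.header(name, value)`.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class CapturedHeaders {
    // Assumption: these are computed by the HTTP client, so copying them by hand is skipped.
    private static final Set<String> SKIP =
            new HashSet<>(Arrays.asList("host", "content-length"));

    // Parses "Name:value" lines (as copied from the browser dev tools) into an ordered map.
    static Map<String, String> parse(String dump) {
        Map<String, String> headers = new LinkedHashMap<>();
        for (String line : dump.split("\\R")) {
            int colon = line.indexOf(':');
            if (colon <= 0) {
                continue; // not a header line (blank line, section title, ...)
            }
            String name = line.substring(0, colon).trim();
            String value = line.substring(colon + 1).trim();
            if (!SKIP.contains(name.toLowerCase(Locale.ROOT))) {
                headers.put(name, value);
            }
        }
        return headers;
    }

    public static void main(String[] args) {
        String dump = "Accept-Language:en-US,en;q=0.8,en-AU;q=0.6\n"
                + "Cache-Control:max-age=0\n"
                + "Content-Length:180\n"
                + "Host:notamweb.aviation-civile.gouv.fr\n"
                + "Origin:null";
        // With jsoup on the classpath, each entry could be applied via connection.header(k, v).
        for (Map.Entry<String, String> e : parse(dump).entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

Comparing this map with what the app actually sends makes it easier to spot a missing header or cookie.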
Problem solved. The emulator's clock was wrong, which caused the website to reject the request.
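For anyone hitting the same thing: the post does not show how `date` and `time` were computed, so the following is only a minimal sketch of one way to build the two form values in UTC, matching the `2016/01/25` and `07:12` formats used in the form. The class and method names are hypothetical. Because the values come from the device clock, a wrong emulator clock produces a wrong `AERO_Date_DATE` / `AERO_Date_HEURE` pair. Note that `java.time` on Android needs API level 26 or core library desugaring.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class NotamTimestamp {
    // Patterns mirror the form values in the question ("2016/01/25" and "07:12").
    private static final DateTimeFormatter DATE =
            DateTimeFormatter.ofPattern("yyyy/MM/dd").withZone(ZoneOffset.UTC);
    private static final DateTimeFormatter TIME =
            DateTimeFormatter.ofPattern("HH:mm").withZone(ZoneOffset.UTC);

    // Value for the AERO_Date_DATE field, in UTC.
    static String utcDate(Instant now) {
        return DATE.format(now);
    }

    // Value for the AERO_Date_HEURE field, in UTC.
    static String utcTime(Instant now) {
        return TIME.format(now);
    }

    public static void main(String[] args) {
        // Instant.now() reads the device clock, so check the emulator's date and time first.
        Instant now = Instant.now();
        System.out.println(utcDate(now) + " " + utcTime(now));
    }
}
```

Printing these two values next to the real UTC time is a quick way to confirm whether the clock is the culprit before digging into headers or cookies.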