POST request not working with Scrapy but works with requests
requests code:
    import requests

    listings_url = "https://www.biltorvet.dk/Api/Search/Page"
    form_data = {
        "pageNumber": "1",
        "searchOrigin": "1",
        "searchValue": "22526899",
        "sort": ""
    }
    # json= serializes the dict as JSON and sets Content-Type: application/json
    response = requests.post(listings_url, json=form_data)
    if response.status_code == 200:
        data = response.json()
        print(data)
Scrapy code:
    import json

    import scrapy
    from scrapy import FormRequest

    class BiltorvetScraperSpider(scrapy.Spider):
        name = 'biltorvet'
        listings_url = "https://www.biltorvet.dk/Api/Search/Page"
        form_data = {
            "pageNumber": "1",
            "searchOrigin": "1",
            "searchValue": "22526899",
            "sort": ""
        }

        def start_requests(self):
            # This is the request that comes back with a 400
            yield FormRequest(url=self.listings_url, callback=self.parse, body=json.dumps(self.form_data))

        def parse(self, response):
            print(response.text)
I get a 400 from the Scrapy request. I also tried setting headers, but the result was the same. Changing the parameter from body to json made no difference either.
This should serve the purpose:
    import json

    import scrapy
    from scrapy.http.request import Request

    class BiltorvetScraperSpider(scrapy.Spider):
        name = 'biltorvet'
        start_url = "https://www.biltorvet.dk/Api/Search/Page"
        form_data = {
            "pageNumber": "1",
            "searchOrigin": "1",
            "searchValue": "22526899",
            "sort": ""
        }
        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36',
            'Content-Type': 'application/json; charset=UTF-8',
        }

        def start_requests(self):
            # Send the payload as a JSON body with an explicit POST method
            yield Request(
                self.start_url,
                headers=self.headers,
                callback=self.parse,
                method='POST',
                body=json.dumps(self.form_data)
            )

        def parse(self, response):
            print(response.json())
Alternatively, going by the documentation, you can try it like this:
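    from scrapy.http import JsonRequest

    # These two methods replace the ones in the spider class above
        def start_requests(self):
            # JsonRequest serializes `data` to a JSON body for you
            yield JsonRequest(
                self.start_url,
                headers=self.headers,
                callback=self.parse,
                data=self.form_data
            )

        def parse(self, response):
            print(response.json())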
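To see why the request type matters here, it may help to compare the two encodings side by side. This is a rough standard-library sketch, not part of either spider: `encode_json` and `encode_form` are hypothetical helper names made up for illustration, and the assumption (based only on the `requests` version working with `json=`) is that the endpoint expects a JSON body.

```python
import json
from urllib.parse import urlencode

form_data = {
    "pageNumber": "1",
    "searchOrigin": "1",
    "searchValue": "22526899",
    "sort": "",
}


def encode_json(payload):
    """Hypothetical helper: JSON body, as sent by requests.post(..., json=...)."""
    return "application/json", json.dumps(payload).encode("utf-8")


def encode_form(payload):
    """Hypothetical helper: URL-encoded body, as an HTML form would send it."""
    return "application/x-www-form-urlencoded", urlencode(payload).encode("utf-8")


json_type, json_body = encode_json(form_data)
form_type, form_body = encode_form(form_data)

# Same data, two different wire formats and Content-Type headers
print(json_type, json_body)
print(form_type, form_body)
```

Both approaches in the answer send the first form: a JSON body announced by a matching `Content-Type` header. A server that only parses JSON will typically reject anything else, which would be consistent with the 400 in the question.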