How to use requests with query parameters in Python?
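I want to get data from an e-commerce website, but I can't get more than 24 products. The site uses a page index query parameter. I am sending the query parameter, but the script still gets only 24 products.
Code, result, and URL --> Screenshots
Code: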
import requests
import sqlite3
from bs4 import BeautifulSoup

db = sqlite3.connect('veritabani.sqlite')
cursor = db.cursor()
cursor.execute("CREATE TABLE products (id, product, price)")
url = 'https://www.trendyol.com/cep-telefonu-x-c103498'
html_text = requests.get(url, params={'q': 'pi:5'}).text
soup = BeautifulSoup(html_text, 'lxml')
print(soup.contents)
products = soup.find_all("div", {"class": "p-card-wrppr"})
for product in products:
    product_id = product['data-id']
    product_name = product.find("div", {"class": "prdct-desc-cntnr-ttl-w two-line-text"}).find("span", {"class": "prdct-desc-cntnr-name"})["title"]
    price = product.find_all("div", {"class": "prc-box-sllng"})[0].text
    cursor.execute("INSERT INTO products VALUES (?,?,?)", (product_id, product_name, price))
    print(product_id, product_name, price)
db.commit()
db.close()
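The site you mentioned uses scroll pagination, but you can still get the data you want.
First, the parameter you are passing is wrong: `params={'q': 'pi:5'}` sends a single parameter named `q` with the value `pi:5`, not a parameter named `pi`. Try changing this line
html_text = requests.get(url, params={'q': 'pi:5'}).text
to this:
html_text = requests.get(url, params={'pi': '5'}).text
This gets you the 24 products on page 5.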
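To see why the original call did not change the page, it can help to look at the query string each `params` dict produces. The sketch below uses the standard library's `urllib.parse.urlencode` rather than `requests` itself, so it runs without network access. `build_url` is a hypothetical helper, and the sketch assumes `requests` encodes `params` the same way.

```python
from urllib.parse import urlencode

URL = "https://www.trendyol.com/cep-telefonu-x-c103498"


def build_url(base, params):
    """Hypothetical helper: append URL-encoded params to base."""
    return f"{base}?{urlencode(params)}"


# One parameter named "q" whose value is the string "pi:5".
wrong = build_url(URL, {"q": "pi:5"})
# One parameter named "pi" whose value is "5".
right = build_url(URL, {"pi": "5"})

print(wrong)  # ends with ?q=pi%3A5
print(right)  # ends with ?pi=5
```

With a live request, printing the `url` attribute of the response returned by `requests.get` shows the final URL that was sent, which is a quick way to check the parameters.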
So basically, you can do it like this:
# Reuses url, cursor, and db from the setup code in the question.
# Assumption: the site's page index starts at 1, so this covers pages 1 to 10.
for i in range(1, 11):
    html_text = requests.get(url, params={'pi': str(i)}).text
    soup = BeautifulSoup(html_text, 'lxml')
    print(soup.contents)
    products = soup.find_all("div", {"class": "p-card-wrppr"})
    for product in products:
        product_id = product['data-id']
        product_name = product.find("div", {"class": "prdct-desc-cntnr-ttl-w two-line-text"}).find("span", {"class": "prdct-desc-cntnr-name"})["title"]
        price = product.find_all("div", {"class": "prc-box-sllng"})[0].text
        cursor.execute("INSERT INTO products VALUES (?,?,?)", (product_id, product_name, price))
        print(product_id, product_name, price)
db.commit()
db.close()
This should get you the products from 10 pages.
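The number of available pages is not known in advance, so a fixed-range loop may request empty pages or collect the same product twice. Here is a minimal sketch of one way to handle that: stop at the first empty page and skip ids already seen. It assumes a hypothetical `fetch_page(page)` callable that returns a list of `(product_id, product_name, price)` tuples, for example a wrapper around the request and parsing code above. The names `collect_products`, `fetch_page`, and `max_pages` are illustrative and not part of `requests` or BeautifulSoup.

```python
def collect_products(fetch_page, max_pages=10):
    """Collect rows from pages 1..max_pages, stopping at the first empty page.

    Assumption: page numbers start at 1 and an empty list means no more pages.
    """
    seen_ids = set()
    rows = []
    for page in range(1, max_pages + 1):
        page_rows = fetch_page(page)
        if not page_rows:
            break  # no products on this page: assume we ran past the last one
        for row in page_rows:
            product_id = row[0]
            if product_id in seen_ids:
                continue  # skip products already returned by an earlier page
            seen_ids.add(product_id)
            rows.append(row)
    return rows


# Made-up stand-in for the real scraper, so the sketch runs without network access.
fake_site = {
    1: [("1", "Phone A", "100 TL"), ("2", "Phone B", "200 TL")],
    2: [("2", "Phone B", "200 TL"), ("3", "Phone C", "300 TL")],
}

result = collect_products(lambda page: fake_site.get(page, []))
print(result)
```

The returned rows can then be written in one call with `cursor.executemany("INSERT INTO products VALUES (?,?,?)", rows)`.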