How to get the last pagination number with BeautifulSoup?
Hello, I have this website, https://alfagift.id/find/ayam, and I have been trying to get its last page number, so far without success.
from fake_useragent import UserAgent
import requests
from bs4 import BeautifulSoup
import pandas as pd

URL = "https://alfagift.id/find/ayam"

# get soup
ua = UserAgent()
USER_AGENT = ua.random
headers = {"User-Agent": str(USER_AGENT), "Accept-Encoding": "*", "Connection": "keep-alive"}
resp = requests.get(URL, headers=headers)
soup = BeautifulSoup(resp.content, "html.parser")

# get page
pages = []
page = soup.find_all("a", {"class": "page-link"})
for p in page:
    pages.append(int(p.text))
# max() over the collected numbers; None when no page links were found
maxpage = max(pages, default=None)
However, this does not return anything. How can I find the pagination correctly?
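The pagination, along with all the products, is added dynamically by JavaScript from JSON embedded in the page, so the page links are not in the HTML that requests receives.
To get the total number of pages, try this: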
import json
import requests
import re

print(
    json.loads(
        re.search(
            r"var pageList = (.*);",
            requests.get("https://alfagift.id/find/ayam").text,
        ).group(1),
    )["totalPage"]
)
Output:
5
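The same idea can be wrapped in a small helper that is a little more tolerant of whatever follows the JSON on the same line. This is only a sketch: the function name `extract_total_pages`, the `var_name` parameter, and the sample HTML string (including its `currentPage` key) are made up for illustration, and it assumes the page still embeds a `var pageList = {...}` assignment containing a `totalPage` key, as in the answer above. Instead of capturing up to the last `;` with a greedy regex, it lets `json.JSONDecoder().raw_decode` stop at the end of the first JSON value.

```python
import json
import re


def extract_total_pages(html: str, var_name: str = "pageList") -> int:
    """Return the "totalPage" value from a `var <var_name> = {...}` assignment in html.

    Hypothetical helper: assumes the JSON object directly follows the "=" sign.
    """
    # Locate the start of the JavaScript assignment, allowing flexible whitespace.
    match = re.search(r"var\s+" + re.escape(var_name) + r"\s*=\s*", html)
    if match is None:
        raise ValueError(f"no 'var {var_name} = ...' assignment found")

    # raw_decode parses one JSON value starting at the given index and
    # ignores any trailing text (such as ';' or further statements).
    data, _end = json.JSONDecoder().raw_decode(html, match.end())
    return int(data["totalPage"])


# Made-up sample that mimics the structure the answer relies on.
sample_html = (
    '<script>var pageList = {"totalPage": 5, "currentPage": 1}; '
    "var other = 1;</script>"
)

print(extract_total_pages(sample_html))
```

Against the live site, the call would be `extract_total_pages(requests.get("https://alfagift.id/find/ayam").text)`, which works only as long as the page keeps that variable in its source.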