难以理解的解析器行为
Incomprehensible parser behavior
请帮帮我!我编写了一个简单的解析器,但它不能正常工作,而且我不知道这与什么有关。
import requests
from bs4 import BeautifulSoup
URL = 'https://stopgame.ru//topgames'
HEADERS = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:71.0) Gecko/20100101 Firefox/71.0', 'accept': '*/*'}
HOST = 'https://stopgame.ru'
def get_html(url, params=None):
r = requests.get(url, headers=HEADERS, params=params)
return r
def get_content(html):
soup = BeautifulSoup(html, 'html.parser')
items = soup.find_all('a', class_="lent-block game-block")
print(items)
def parse():
html = get_html(URL)
if html.status_code == 200:
items = get_content(html.text)
else:
print('Error')
parse()
我得到了这个输出:
[]
Process finished with exit code 0
items = soup.find_all('a', class_="lent-block game-block")
You are trying to find out "lent-block game-block" class for anchor
tag which actually is not there in html and hence you are getting
blank list.
试试这个 div 项目你会得到匹配项目的列表。
items = soup.find_all('div', class_="lent-block lent-main")
请帮帮我!我编写了一个简单的解析器,但它不能正常工作,而且我不知道这与什么有关。
import requests
from bs4 import BeautifulSoup
URL = 'https://stopgame.ru//topgames'
HEADERS = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:71.0) Gecko/20100101 Firefox/71.0', 'accept': '*/*'}
HOST = 'https://stopgame.ru'
def get_html(url, params=None):
r = requests.get(url, headers=HEADERS, params=params)
return r
def get_content(html):
soup = BeautifulSoup(html, 'html.parser')
items = soup.find_all('a', class_="lent-block game-block")
print(items)
def parse():
html = get_html(URL)
if html.status_code == 200:
items = get_content(html.text)
else:
print('Error')
parse()
我得到了这个输出:
[]
Process finished with exit code 0
items = soup.find_all('a', class_="lent-block game-block")
You are trying to find out "lent-block game-block" class for anchor tag which actually is not there in html and hence you are getting blank list.
试试这个 div 项目你会得到匹配项目的列表。
items = soup.find_all('div', class_="lent-block lent-main")