How to get HTML changes after pressing a button with Beautiful Soup and Requests

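I want the HTML of this website, https://www.forebet.com/en/football-predictions, after the More [+] button has been pressed enough times to load all the games. Each time the More [+] button at the bottom of the page is clicked, the HTML changes and shows more football matches. How can I request the page with all football matches loaded?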

from bs4 import BeautifulSoup
import requests

leagues = {"EPL", "UCL", "Es1", "De1", "Fr1", "Pt1", "It1", "UEL"}

class ForeBet:

    # Gets all games from the leagues in `leagues`, returning them as a list of strings.
    # Game format is League|Date|Hour|Home Team|Away Team|Prob Home|Prob Tie|Prob Away
    def get_games_and_probs(self):

        response = requests.get('https://www.forebet.com/en/football-predictions')
        soup = BeautifulSoup(response.text, 'html.parser')
        results = []

        # Match rows alternate between the two class sets 'rcnt tr_0' and 'rcnt tr_1'.
        games = soup.find_all(class_='rcnt tr_0') + soup.find_all(class_='rcnt tr_1')

        for game in games:
            if game.find(class_='shortTag').text.strip() in leagues:
                # 'date_bah' holds "date hour"; the three tags after 'fprc' hold the probabilities.
                line = game.find(class_='shortTag').text + "|" + \
                    game.find(class_='date_bah').text.split(" ")[0] + "|" + \
                    game.find(class_='date_bah').text.split(" ")[1] + "|" + \
                    game.find(class_='homeTeam').text + "|" + \
                    game.find(class_='awayTeam').text + "|" + \
                    game.find(class_='fprc').find_next().text + "|" + \
                    game.find(class_='fprc').find_next().find_next().text + "|" + \
                    game.find(class_='fprc').find_next().find_next().find_next().text
                print(line)
                results.append(line)

        return results

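As stated, Requests and BeautifulSoup are for fetching and parsing data, not for interacting with a site: Requests returns only the HTML the server sends, and BeautifulSoup parses that HTML, but neither runs JavaScript or clicks buttons. For that, you need Selenium.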
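A minimal sketch of the Selenium route, under several assumptions: Chrome and a matching driver are installed, the button can be found with an XPath such as `//*[contains(text(), 'More')]` (a guess based on the button label, not taken from the site's markup), and the button stops being displayed once every game is loaded. The helper names `click_until_gone` and `fetch_full_html` are made up for this illustration.

```python
import time


def click_until_gone(driver, xpath, max_clicks=200, pause=1.0):
    """Click the first displayed element matching `xpath` until none is left.

    Returns the number of clicks performed. `max_clicks` is a safety cap in
    case the button never disappears.
    """
    clicks = 0
    while clicks < max_clicks:
        # "xpath" is the locator strategy string that By.XPATH stands for.
        buttons = [b for b in driver.find_elements("xpath", xpath) if b.is_displayed()]
        if not buttons:
            break
        # Click through JavaScript so overlays or banners do not intercept it.
        driver.execute_script("arguments[0].click();", buttons[0])
        clicks += 1
        time.sleep(pause)  # crude wait for the new rows to be added
    return clicks


def fetch_full_html(url, xpath):
    """Open `url` in Chrome, expand the page, and return the final HTML."""
    # Imported here so click_until_gone can be used without Selenium installed.
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        click_until_gone(driver, xpath)
        return driver.page_source
    finally:
        driver.quit()
```

The returned string can be passed to `BeautifulSoup(html, 'html.parser')` exactly as `response.text` was in the question, for example `html = fetch_full_html('https://www.forebet.com/en/football-predictions', "//*[contains(text(), 'More')]")`.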

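Your other option is to see whether you can get the data directly, and whether there is a parameter you can change to request the next batch, just as clicking More [+] would. Does this work for you?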

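import pandas as pd
import requests

# Endpoint queried for each additional batch of games.
url = 'https://m.forebet.com/scripts/getrs.php'

results = pd.DataFrame()
i = 0
while True:
    print(i)
    payload = {
        'ln': 'en',
        'tp': '1x2',
        'in': '%s' % (i + 11),
        'ord': '0'}

    jsonData = requests.get(url, params=payload).json()
    # DataFrame.append was removed in pandas 2.0; pd.concat is the replacement.
    results = pd.concat([results, pd.DataFrame(jsonData[0])], sort=False).reset_index(drop=True)

    # Keep requesting until some id appears twice, i.e. the server has started repeating games.
    if max(results['id'].value_counts()) <= 1:
        i += 1
    else:
        results = results.drop_duplicates()
        break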

Output:

print(results)
          id  pr_under  ...    country         full_name
0    1473708        31  ...    England   Isthmian League
1    1473713        35  ...    England   Isthmian League
2    1473745        28  ...    England   Isthmian League
3    1473710        35  ...    England   Isthmian League
4    1473033        28  ...    England  Premier League 2
..       ...       ...  ...        ...               ...
515  1419208        47  ...  Argentina  Torneo Federal A
516  1419156        57  ...  Argentina  Torneo Federal A
517  1450589        50  ...    Armenia    Premier League
518  1450590        35  ...    Armenia    Premier League
519  1450591        52  ...    Armenia    Premier League

[518 rows x 73 columns]
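The loop above has no upper bound and stops only once an id repeats. As a hedged variant, the stopping rule can be pulled into a small function that is easy to test without the network. This sketch stops when a page adds no unseen ids (a slightly different rule from the one above) and caps the number of requests. The names `collect_until_repeat` and `fetch_forebet_page` are hypothetical, and it assumes the first element of the JSON response is a list of dicts that each carry an `id` key.

```python
def collect_until_repeat(fetch_page, start=11, max_pages=500):
    """Call fetch_page(n) for n = start, start + 1, ... and gather rows by id.

    Stops when a page is empty or contributes no unseen id, or after
    max_pages requests. Rows are returned in first-seen order.
    """
    seen = {}
    for n in range(start, start + max_pages):
        rows = fetch_page(n)
        new_rows = [row for row in rows if row["id"] not in seen]
        if not new_rows:
            break
        for row in new_rows:
            seen[row["id"]] = row
    return list(seen.values())


def fetch_forebet_page(n):
    """Fetch one batch from the endpoint used in the answer above."""
    # Imported here so collect_until_repeat can be tested without requests.
    import requests

    payload = {"ln": "en", "tp": "1x2", "in": str(n), "ord": "0"}
    response = requests.get(
        "https://m.forebet.com/scripts/getrs.php", params=payload, timeout=30
    )
    response.raise_for_status()
    # Assumption: element 0 of the JSON is a list of per-game dicts.
    return response.json()[0]
```

Calling `pd.DataFrame(collect_until_repeat(fetch_forebet_page))` should then give a frame comparable to `results`, provided the response has the assumed shape.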