Using Beautiful Soup to extract key-value pairs from data-op-info

The code below does not fail, but it is incomplete. From this point, I am trying to put only the fullgame values into a DataFrame.

import json
from bs4 import BeautifulSoup
import urllib.request

source = urllib.request.urlopen('https://www.oddsshark.com/nfl/odds').read()
soup = BeautifulSoup(source, 'html.parser')

results = soup.find_all(class_ = "op-item op-spread op-opening")

for result in (results):
    print(json.loads(result['data-op-info']).items())

I used print at the end because I was only trying to extract the row values and look at them.

Note that there is a similar question on this site, but its solution only works for a single div. If the variable holds multiple divs, it fails: How to parse information between {} on web page using Beautifulsoup

You're almost there. Note where I use a list comprehension to capture the parsed results and then pass the list to json_normalize():

import json
import urllib.request

import pandas as pd  # needed for pd.json_normalize below
from bs4 import BeautifulSoup

source = urllib.request.urlopen('https://www.oddsshark.com/nfl/odds').read()
soup = BeautifulSoup(source, 'html.parser')

results = soup.find_all(class_="op-item op-spread op-opening")

# Parse the JSON stored in each tag's data-op-info attribute into a dict
rlist = [json.loads(result['data-op-info']) for result in results]
# One row per dict, one column per key
pd.json_normalize(rlist)

   fullgame firsthalf secondhalf firstquarter secondquarter thirdquarter fourthquarter
0      -4.5      -2.5       -1.5         -0.5          -0.5         -0.5          -0.5
1      +4.5      +2.5       +1.5         +0.5          +0.5         +0.5          +0.5
2        +7        +4       +3.5           +3            +3         +2.5            +2
3        -7        -4       -3.5           -3            -3         -2.5            -2
4        -3        -3       -2.5         -0.5            -2         -0.5          -0.5
5        +3        +3       +2.5         +0.5            +2         +0.5          +0.5
6        +3      +2.5       +0.5         +0.5          +0.5         +0.5          +0.5
7        -3      -2.5       -0.5         -0.5          -0.5         -0.5          -0.5
8        -3      -0.5       -0.5         -0.5          -0.5         -0.5          -0.5
9        +3      +0.5       +0.5         +0.5          +0.5         +0.5          +0.5
10       -3      -2.5         -1         -0.5            -1         -0.5          -0.5
11       +3      +2.5         +1         +0.5            +1         +0.5          +0.5
12       -1      +0.5       -0.5         +0.5          -0.5         -0.5          -0.5
13       +1      -0.5       +0.5         -0.5          +0.5         +0.5          +0.5
14     +2.5      +3.5         +3         +0.5          +2.5         +0.5            +1
15     -2.5      -3.5         -3         -0.5          -2.5         -0.5            -1
16       +4        +3         +2         +0.5            +1         +0.5          +0.5
17       -4        -3         -2         -0.5            -1         -0.5          -0.5
18     -2.5      -0.5       -0.5         +0.5          -0.5         -0.5          -0.5
19     +2.5      +0.5       +0.5         -0.5          +0.5         +0.5          +0.5
20     -2.5      -1.5       -0.5         -0.5          -0.5         -0.5          -0.5
21     +2.5      +1.5       +0.5         +0.5          +0.5         +0.5          +0.5
22     +2.5      +1.5       +0.5         +0.5          +0.5         +0.5          +0.5
23     -2.5      -1.5       -0.5         -0.5          -0.5         -0.5          -0.5
24     +1.5      +1.5         Ev         +0.5          -0.5         -0.5          -0.5
25     -1.5      -1.5         Ev         -0.5          +0.5         +0.5          +0.5
26     +5.5        +3       +2.5         +0.5          +0.5         +0.5          +0.5
27     -5.5        -3       -2.5         -0.5          -0.5         -0.5          -0.5
28     -3.5      -0.5         Ev         -0.5          +0.5         +0.5          +0.5
29     +3.5      +0.5         Ev         +0.5          -0.5         -0.5          -0.5
30       -5
31       +5
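
Because the live page can change, here is a minimal offline sketch of the same approach. The HTML string is hypothetical markup that only imitates the structure implied by the code above; it is not copied from the site. The `if result.get("data-op-info")` guard is my own addition for tags that lack the attribute. The sketch also shows that json_normalize() fills keys missing from a dict with NaN.

```python
import json

import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical markup that imitates the page structure; not copied from the site.
html = """
<div class="op-item op-spread op-opening"
     data-op-info='{"fullgame": "-4.5", "firsthalf": "-2.5"}'></div>
<div class="op-item op-spread op-opening"
     data-op-info='{"fullgame": "+4.5", "firsthalf": "+2.5"}'></div>
<div class="op-item op-spread op-opening"
     data-op-info='{"fullgame": "-5"}'></div>
<div class="op-item op-spread op-opening"></div>
"""

soup = BeautifulSoup(html, "html.parser")
results = soup.find_all(class_="op-item op-spread op-opening")

# Tag.get() returns None when the attribute is absent, so such tags are skipped.
rlist = [
    json.loads(result["data-op-info"])
    for result in results
    if result.get("data-op-info")
]

# One row per dict; keys missing from a dict become NaN in that row.
df = pd.json_normalize(rlist)
print(df)
```

Indexing with result['data-op-info'] raises KeyError when the attribute is missing, which is why the sketch checks with get() first.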

Or, if you really only want a single key from the dictionary:

# Keep only the 'fullgame' value from each parsed dict
rlist = [json.loads(result['data-op-info'])['fullgame'] for result in results]
pd.DataFrame({'fullgame': rlist})

   fullgame
0      -4.5
1      +4.5
2        +7
3        -7
4        -3
5        +3
6        +3
7        -3
8        -3
9        +3
10       -3
11       +3
12       -1
13       +1
14     +2.5
15     -2.5
16       +4
17       -4
18     -2.5
19     +2.5
20     -2.5
21     +2.5
22     +2.5
23     -2.5
24     +1.5
25     -1.5
26     +5.5
27     -5.5
28     -3.5
29     +3.5
30       -5
31       +5
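
The values come out of the JSON as strings such as "+4.5" or "Ev", so they will need converting if you want to do arithmetic on them. Below is a minimal sketch of one way to do that. parse_spread is a hypothetical helper name. Mapping non-numeric text to NaN is my own assumption; if "Ev" means an even line on this site, you may prefer to map it to 0 instead.

```python
import math


def parse_spread(text):
    """Return the spread as a float, or NaN when the text is not numeric."""
    try:
        return float(text)  # float() accepts a leading sign, e.g. "+4.5"
    except (TypeError, ValueError):
        return math.nan  # e.g. "Ev", or a missing value such as None


samples = ["-4.5", "+7", "Ev"]
parsed = [parse_spread(s) for s in samples]
print(parsed)
```

On the DataFrame above, the helper could be applied with df['fullgame'].map(parse_spread).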