Getting <span> attribute values

I have a large chunk of HTML, and I want to extract every value of the span attribute named "data-content".

import requests
from bs4 import BeautifulSoup

with open(r"C:\Users\stasiek\Desktop\Atom-PYTHON\Python-Udemy\web-scraping\strona.html") as raw_results:
    results = BeautifulSoup(raw_results, "html.parser")

for element in results.find_all("span"):
    print(element['data-content'])

This code returns only the value of the first "data-content" in the file (just a single word) and then throws an error:

 File "niemiecki.py", line 10, in <module>
    print(element['data-content'])
  File "C:\Users\stasiek\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\bs4\element.py", line 1406, in __getitem__
    return self.attrs[key]
KeyError: 'data-content'

Any idea what I'm doing wrong?

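find_all("span") returns every span, and indexing a tag that lacks the attribute raises KeyError. Select only the spans that have the attribute, e.g.: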

from bs4 import BeautifulSoup
from io import BytesIO

data = b'''\
<body>
<span data-content="foo">1</span>
<span>2</span>
<span data-content="bar">3</span>
<span>4</span>
<span>5</span>
</body>
'''

f = BytesIO(data)
soup = BeautifulSoup(f, 'html.parser')
for span in soup.select('span[data-content]'):
    print(span['data-content'])
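
Two other ways to avoid the KeyError, as a rough sketch that assumes markup like the sample above: find_all can filter on the presence of an attribute when given attrs={"data-content": True}, and a tag's .get() method returns None instead of raising when the attribute is missing. The names html, with_attr and all_values are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, modelled on the snippet above.
html = """\
<body>
<span data-content="foo">1</span>
<span>2</span>
<span data-content="bar">3</span>
</body>
"""

soup = BeautifulSoup(html, "html.parser")

# Option 1: let find_all match only spans that carry the attribute.
with_attr = [
    span["data-content"]
    for span in soup.find_all("span", attrs={"data-content": True})
]

# Option 2: visit every span, but use .get(), which returns None
# instead of raising KeyError when the attribute is absent.
all_values = [span.get("data-content") for span in soup.find_all("span")]

print(with_attr)
print(all_values)
```

Option 1 behaves like the CSS selector span[data-content]. Option 2 is handy if you also need to know which spans lack the attribute.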