Get the content of tags with an empty id in BeautifulSoup

from bs4 import BeautifulSoup

page = """<span id="something">useless</span>
          <span id="">some text</span>
          <span id="different">useless</span>"""
soup = BeautifulSoup(page, 'html.parser')

How can I get only some text? Using soup.find_all('span', {'id': ""}) finds everything.

You have two options:

  1. Use a custom filter: pass in a function that is called with each element and returns True or False:

    soup.find_all(lambda e: e.name == 'span' and e.attrs.get('id') == '')
    
  2. Use a CSS selector with an exact attribute match:

    soup.select('span[id=""]')
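
Both options test for an id attribute that is present but empty, which is not the same as a tag with no id at all. The sketch below illustrates the difference; the sample HTML and the variable names are made up for this example and are not from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical sample: one span with id="", one with no id, one with a real id.
html = '<span id="">empty id</span><span>no id</span><span id="x">has id</span>'
soup = BeautifulSoup(html, "html.parser")

# attrs.get('id') returns None when the attribute is missing, so only id="" passes.
empty_via_filter = soup.find_all(
    lambda e: e.name == "span" and e.attrs.get("id") == ""
)

# [id=""] requires the attribute to exist and to equal the empty string.
empty_via_css = soup.select('span[id=""]')

print([t.get_text() for t in empty_via_filter])
print([t.get_text() for t in empty_via_css])
```

If spans without any id should be matched as well, the filter would need to compare against None too; that is a different question from the one asked here.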
    

Demo:

>>> from bs4 import BeautifulSoup
>>> page = """<span id="something">useless</span>
...           <span id="">some text</span>
...           <span id="different">useless</span>"""
>>> soup = BeautifulSoup(page, 'html.parser')
>>> soup.find_all(lambda e: e.name == 'span' and e.attrs.get('id') == '')
[<span id="">some text</span>]
>>> soup.select('span[id=""]')
[<span id="">some text</span>]
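
Both calls return the matching tags rather than their text. Since the question asks for some text itself, here is a minimal sketch of one way to pull the string out, assuming the same page as in the question; the names texts and first are only illustrative.

```python
from bs4 import BeautifulSoup

page = """<span id="something">useless</span>
          <span id="">some text</span>
          <span id="different">useless</span>"""
soup = BeautifulSoup(page, "html.parser")

# Collect the text of every span whose id attribute is the empty string.
texts = [tag.get_text() for tag in soup.select('span[id=""]')]
print(texts)

# select_one returns the first match, or None when nothing matches.
first = soup.select_one('span[id=""]')
if first is not None:
    print(first.get_text())
```

The same get_text() call works on the tags returned by the find_all variant, so either option can be used to get at the text.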