bs4 `next_sibling` VS `find_next_sibling`

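I am struggling with next_sibling (and with next_element too). When I use it as an attribute I get nothing back, but when I call find_next_sibling (or find_next) it works.

The docs say find_next_sibling relies on .next_siblings. So what does next_sibling rely on, and why does it return nothing?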

from bs4 import BeautifulSoup

html = """
<div class="......">
 <div class="one-ad-desc">
  <div class="one-ad-title">
   <a class="one-ad-link" href="www this is the URL!">
    <h5>
     Text needed
    </h5>
   </a>
  </div>
  <div class="one-ad-desc">
    ...and some more needed text here!
  </div>
 </div>
</div>
"""

soup = BeautifulSoup(html, 'lxml')

for div in soup.find_all('div', class_="one-ad-title"):
    print('-> ', div.next_element)
    print('-> ', div.next_sibling)
    print('-> ', div.find_next_sibling())
    break

Output:

->  

->  

->  <div class="one-ad-desc">
    ...and some more needed text here!
  </div>

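I think the key point is that .next_sibling and .next_element return the very next node in the parse tree, whatever its type. In this markup that node is a whitespace-only text node: the newline and indentation between the tags.

.find_next_sibling(), by contrast, walks .next_siblings and returns the first Tag it finds, so it skips over those whitespace strings.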
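To make that relationship concrete, here is a minimal sketch of what find_next_sibling() with no arguments amounts to. The helper name first_tag_sibling is hypothetical and this is not the library's actual implementation; it also uses the built-in "html.parser" instead of 'lxml' purely to avoid the extra dependency.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag


def first_tag_sibling(node):
    """Hypothetical helper: return the first following sibling that is a Tag."""
    for sibling in node.next_siblings:
        # Text nodes (including whitespace between tags) are not Tag instances.
        if isinstance(sibling, Tag):
            return sibling
    return None


# Two tags separated by a newline and some indentation.
soup = BeautifulSoup("<p><b>one</b>\n  <i>two</i></p>", "html.parser")
b = soup.b

print(repr(b.next_sibling))      # the whitespace text node between the tags
print(first_tag_sibling(b))      # the <i> tag
print(first_tag_sibling(b) is b.find_next_sibling())
```

The attribute gives you the raw neighbour; the method filters the neighbours for you.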

So if you print the name of each node, you will see that the next node is not a tag: it is a text node, and its .name is None:

for div in soup.find_all('div', class_="one-ad-title"):
    print('-> ', div.next_element.name)
    print('-> ', div.next_sibling.name)
    print('-> ', div.find_next_sibling().name)

#output
->  None
->  None
->  div
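If you want to see what those "empty" results really are, printing the type and repr() of each node helps. The snippet below is a small sketch on a trimmed-down version of the markup (the href value "#" is a placeholder, and "html.parser" is assumed here instead of 'lxml').

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString

html = """
<div class="one-ad-desc">
  <div class="one-ad-title">
    <a class="one-ad-link" href="#">Text needed</a>
  </div>
  <div class="one-ad-desc">more text</div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
title = soup.find("div", class_="one-ad-title")

# repr() makes whitespace-only strings visible instead of printing blank lines.
for node in (title.next_element, title.next_sibling, title.find_next_sibling()):
    kind = "text" if isinstance(node, NavigableString) else "tag"
    print(kind, type(node).__name__, repr(node))
```

The first two entries come out as text nodes containing only a newline and spaces, which is why the plain print() calls looked empty.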

So if you change the input to a single line with no whitespace between the tags, you get the following:

from bs4 import BeautifulSoup

html = """
<div class="......"><div class="one-ad-desc"><div class="one-ad-title"><a class="one-ad-link" href="www this is the URL!"><h5>Text needed</h5></a></div><div class="one-ad-desc">...and some more needed text here!</div></div></div>"""

soup = BeautifulSoup(html, 'lxml')

for div in soup.find_all('div', class_="one-ad-title"):
    print('-> ', div.next_element)
    print('-> ', div.next_sibling)
    print('-> ', div.find_next_sibling())

Output:

->  <a class="one-ad-link" href="www this is the URL!"><h5>Text needed</h5></a>
->  <div class="one-ad-desc">...and some more needed text here!</div>
->  <div class="one-ad-desc">...and some more needed text here!</div>

Note that "Text needed" is not in a sibling of your selected tag; it is inside one of its children. To select "Text needed", use print('-> ', div.find_next().text)
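Putting it together, one way to pull both pieces of text out of the original, indented markup could look like the sketch below. It names the tags explicitly with find("h5") and find_next_sibling("div", ...) rather than relying on find_next(), and uses get_text(strip=True) to drop the surrounding whitespace. The "#" href and the use of "html.parser" are assumptions for the sake of a self-contained example.

```python
from bs4 import BeautifulSoup

html = """
<div class="one-ad-desc">
  <div class="one-ad-title">
    <a class="one-ad-link" href="#">
      <h5>
        Text needed
      </h5>
    </a>
  </div>
  <div class="one-ad-desc">
    ...and some more needed text here!
  </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
title = soup.find("div", class_="one-ad-title")

# The heading lives in a descendant of the title div, not in a sibling.
heading = title.find("h5").get_text(strip=True)

# The description is the next *tag* sibling; whitespace strings are skipped.
description = title.find_next_sibling("div", class_="one-ad-desc").get_text(strip=True)

print(heading)
print(description)
```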