Selenium: how to print the elements of this HTML in the order they appear?
If this is the HTML of a WhatsApp message ("how are you"), how can I iterate over the elements of this message with Selenium and get (print) them in order?
<span dir="ltr" class="i0jNr selectable-text copyable-text">
<span>
<img crossorigin="anonymous"
src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7" alt=""
draggable="false" class="b75 emoji wa i0jNr selectable-text copyable-text" data-plain-text=""
style="background-position: -60px -40px;">
" how "
<img crossorigin="anonymous"
src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7" alt=""
draggable="false" class="b60 emoji wa i0jNr selectable-text copyable-text" data-plain-text=""
style="background-position: -60px -40px;">
" are you"
<img crossorigin="anonymous"
src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7" alt=""
draggable="false" class="b25 emoji wa i0jNr selectable-text copyable-text" data-plain-text=""
style="background-position: -40px -40px;">
</span>
</span>
The output should be:
how
are you
or the output could also look like this:
how are you
I tried:
chats = driver.find_elements_by_class_name("message-in")
for i in range(0, len(chats)):
    messages = chats[i].find_elements_by_class_name("i0jNr")
    for j in range(0, len(messages)):
        if messages[j].text == "":
            emojis = chats[i].find_elements_by_class_name("emoji")
            for emoji in emojis:
                print(emoji.get_attribute('alt'))
            break
        else:
            print(messages[j].text)
This gives the output:
how
are you
So how do I get the elements inside it in the order they appear?
You can iterate over the children of the span element and print the text when the child is a string, or the alt text when it is an img tag.
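from bs4 import BeautifulSoup, NavigableString, Tag

# html is assumed to hold the message markup shown in the question
soup = BeautifulSoup(html, 'html.parser')
# outer span of the message, then the inner span that holds the content
s = soup.find('span', attrs={'class': 'i0jNr'})
s = s.find('span')
for i in s.children:
    if isinstance(i, NavigableString):
        # text node: print it unless it is only whitespace
        if i.strip():
            print(i.strip())
    elif isinstance(i, Tag):
        # img tag: print its alt text; .get avoids a KeyError when alt is missing
        print(i.get('alt', ''))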
This is a code sample for your use case.
For this message, its output is:
how
are you
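If you would rather not add BeautifulSoup as a dependency, the same idea (walk the children in document order, emit text for strings and alt for img tags) can be sketched with the standard library's html.parser. This is only a rough sketch, not the only way to do it: the names MessageParts and message_parts are made up for illustration, and the alt values in the sample ([wave], [smile], [heart]) are placeholders, since the alt attributes in the snippet above are empty.

```python
from html.parser import HTMLParser


class MessageParts(HTMLParser):
    """Collect text nodes and non-empty <img alt="..."> values in document order."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # An emoji is an <img>; its readable form lives in the alt attribute.
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.parts.append(alt)

    def handle_data(self, data):
        # Plain text between tags; skip whitespace-only chunks.
        text = data.strip()
        if text:
            self.parts.append(text)


def message_parts(html):
    """Return the message pieces (text and emoji alt text) in order."""
    parser = MessageParts()
    parser.feed(html)
    parser.close()
    return parser.parts


# Hypothetical sample: the alt values are placeholders, not real WhatsApp data.
sample = (
    '<span class="i0jNr"><span>'
    '<img class="emoji" alt="[wave]"> how '
    '<img class="emoji" alt="[smile]"> are you'
    '<img class="emoji" alt="[heart]">'
    '</span></span>'
)

print(" ".join(message_parts(sample)))  # one line, pieces in order
```

With Selenium, the html argument could presumably come from the message element itself, for example messages[j].get_attribute("innerHTML"), so each message is parsed separately and its pieces stay in order.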