Find data using Beautiful Soup
I have been trying to retrieve some data using BeautifulSoup.
<div class="chartAreaContainer spm-bar-chart">
<div class="grid custom_popover" data-content="<b>Advertising</b>" data-html="true" data-original-title="" data-placement="top" data-toggle="popover" data-trigger="hover" role="button" style="width: 40%" title="">40%</div>
<div class="grid custom_popover" data-content="<b>Media Planning & Buying</b>" data-html="true" data-original-title="" data-placement="top" data-toggle="popover" data-trigger="hover" role="button" style="width: 35%" title="">35%</div>
<div class="grid custom_popover" data-content="<b>Branding</b>" data-html="true" data-original-title="" data-placement="top" data-toggle="popover" data-trigger="hover" role="button" style="width: 20%" title="">20%</div>
<div class="grid custom_popover" data-content="<b>Event Marketing & Planning</b>" data-html="true" data-original-title="" data-placement="top" data-toggle="popover" data-trigger="hover" role="button" style="width: 5%" title="">5%</div>
</div>
How can I get both the data-content name and the percentage of each element?
I tried .text, but it only gives the percentage.
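Simply select the elements and read the attribute by name: ['data-content']
- Since the question is not very detailed, this answer only points in a direction.
Example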
from bs4 import BeautifulSoup

html = '''
<div class="chartAreaContainer spm-bar-chart">
<div class="grid custom_popover" data-content="<b>Advertising</b>" data-html="true" data-original-title="" data-placement="top" data-toggle="popover" data-trigger="hover" role="button" style="width: 40%" title="">40%</div>
<div class="grid custom_popover" data-content="<b>Media Planning & Buying</b>" data-html="true" data-original-title="" data-placement="top" data-toggle="popover" data-trigger="hover" role="button" style="width: 35%" title="">35%</div>
<div class="grid custom_popover" data-content="<b>Branding</b>" data-html="true" data-original-title="" data-placement="top" data-toggle="popover" data-trigger="hover" role="button" style="width: 20%" title="">20%</div>
<div class="grid custom_popover" data-content="<b>Event Marketing & Planning</b>" data-html="true" data-original-title="" data-placement="top" data-toggle="popover" data-trigger="hover" role="button" style="width: 5%" title="">5%</div>
</div>
'''
# Name the parser explicitly so the result does not depend on what is installed
soup = BeautifulSoup(html, 'html.parser')
# Each bar carries its label in the data-content attribute and its value as text
for e in soup.select('.custom_popover'):
    print(f"{e['data-content']}: {e.text}")
Output
<b>Advertising</b>: 40%
<b>Media Planning & Buying</b>: 35%
<b>Branding</b>: 20%
<b>Event Marketing & Planning</b>: 5%
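The output above still contains the <b> markup, because the attribute value is itself a small HTML fragment. One possible way to clean it up, offered only as a sketch, is to parse that fragment a second time and take its text. The markup below is a trimmed copy of the question's HTML, and `strip_tags` and `shares` are hypothetical names introduced for illustration.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the question's markup: only the attributes used here are kept.
html = '''
<div class="chartAreaContainer spm-bar-chart">
<div class="grid custom_popover" data-content="<b>Advertising</b>" style="width: 40%">40%</div>
<div class="grid custom_popover" data-content="<b>Media Planning & Buying</b>" style="width: 35%">35%</div>
<div class="grid custom_popover" data-content="<b>Branding</b>" style="width: 20%">20%</div>
<div class="grid custom_popover" data-content="<b>Event Marketing & Planning</b>" style="width: 5%">5%</div>
</div>
'''


def strip_tags(fragment):
    """Hypothetical helper: return only the visible text of an HTML fragment."""
    return BeautifulSoup(fragment, "html.parser").get_text(strip=True)


soup = BeautifulSoup(html, "html.parser")

# Map each category name (tags removed) to its percentage text.
shares = {
    strip_tags(e["data-content"]): e.get_text(strip=True)
    for e in soup.select(".custom_popover")
}

for name, percent in shares.items():
    print(f"{name}: {percent}")
```

A dict is used here on the assumption that category names are unique within one chart; a list of tuples would be the safer choice if they can repeat.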
If you are using parsel:
from parsel import Selector
html = """
<div class="chartAreaContainer spm-bar-chart">
<div class="grid custom_popover" data-content="<b>Advertising</b>" data-html="true" data-original-title="" data-placement="top" data-toggle="popover" data-trigger="hover" role="button" style="width: 40%" title="">40%</div>
<div class="grid custom_popover" data-content="<b>Media Planning & Buying</b>" data-html="true" data-original-title="" data-placement="top" data-toggle="popover" data-trigger="hover" role="button" style="width: 35%" title="">35%</div>
<div class="grid custom_popover" data-content="<b>Branding</b>" data-html="true" data-original-title="" data-placement="top" data-toggle="popover" data-trigger="hover" role="button" style="width: 20%" title="">20%</div>
<div class="grid custom_popover" data-content="<b>Event Marketing & Planning</b>" data-html="true" data-original-title="" data-placement="top" data-toggle="popover" data-trigger="hover" role="button" style="width: 5%" title="">5%</div>
</div>
"""
selector = Selector(text=html)
# regular for loop
for result in selector.css(".grid.custom_popover::text"):
    print(result.get())
# or list comprehension
list_result = "\n".join([result.get() for result in selector.css(".grid.custom_popover::text")])
print(list_result)
Output:
40%
35%
20%
5%
40%
35%
20%
5%