span tags: how do I sum the numbers inside span tags on an HTML page?
I have a URL and am trying to sum the numbers on that page that sit inside span tags. Can anyone help me modify the code below to do that? (The URL is: http://py4e-data.dr-chuck.net/comments_1050359.html)
from urllib.request import urlopen
from bs4 import BeautifulSoup
import ssl
# Ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE
url = input('Enter - ')
html = urlopen(url, context=ctx).read()
soup = BeautifulSoup(html, "html.parser")
# Retrieve all of the anchor tags
tags = soup('a')
for tag in tags:
    # Look at the parts of a tag
    print('TAG:', tag)
    print('URL:', tag.get('href', None))
    print('Contents:', tag.contents[0])
    print('Attrs:', tag.attrs)
To sum all the numbers, you can try this:
tags = soup.find_all('span', class_ = 'comments')
total = sum([int(tag.text) for tag in tags])
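To see what those two lines match without hitting the network, here is a minimal sketch that runs the same find_all-and-sum idea on an inline HTML string. The sample markup and the helper name sum_span_numbers are made up for illustration (the real page may be laid out differently), and the isdecimal() check is an added assumption so that a non-numeric span does not make int() raise.

```python
from bs4 import BeautifulSoup

# Made-up sample markup for illustration; the real page may differ.
SAMPLE_HTML = """
<table>
<tr><td>Alice</td><td><span class="comments">97</span></td></tr>
<tr><td>Bob</td><td><span class="comments"> 3 </span></td></tr>
<tr><td>Carol</td><td><span class="other">50</span></td></tr>
<tr><td>Dave</td><td><span class="comments">n/a</span></td></tr>
</table>
"""


def sum_span_numbers(html, css_class="comments"):
    """Add up the integer text of every <span> carrying css_class.

    Hypothetical helper, not part of BeautifulSoup.
    """
    soup = BeautifulSoup(html, "html.parser")
    total = 0
    for tag in soup.find_all("span", class_=css_class):
        text = tag.get_text(strip=True)  # drop surrounding whitespace
        if text.isdecimal():  # skip empty or non-numeric spans
            total += int(text)
    return total


print(sum_span_numbers(SAMPLE_HTML))  # prints 100
```

Only the two numeric spans with class "comments" are counted (97 + 3); the span with a different class and the "n/a" span are skipped. Negative numbers would also be skipped by isdecimal(), so loosen that check if the data can contain them.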
Full code:
from urllib.request import urlopen
from bs4 import BeautifulSoup
import ssl
# Ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE
url = 'http://py4e-data.dr-chuck.net/comments_1050359.html'
html = urlopen(url, context=ctx).read()
soup = BeautifulSoup(html, "html.parser")
tags = soup.find_all('span', class_ = 'comments')
total = sum([int(tag.text) for tag in tags])
print(total)
Output:
2692
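If installing BeautifulSoup is not an option, roughly the same sum can be sketched with the standard library's html.parser module instead. This is a swapped-in technique, not what the answer above uses; the class name SpanSummer and the helper sum_spans_stdlib are hypothetical, and the sketch assumes the target spans are not nested inside each other.

```python
from html.parser import HTMLParser


class SpanSummer(HTMLParser):
    """Hypothetical parser that sums integer text inside <span class=...>."""

    def __init__(self, css_class="comments"):
        super().__init__()
        self.css_class = css_class
        self.in_target = False  # True while inside a matching <span>
        self.total = 0

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            classes = (dict(attrs).get("class") or "").split()
            self.in_target = self.css_class in classes

    def handle_endtag(self, tag):
        if tag == "span":
            self.in_target = False

    def handle_data(self, data):
        text = data.strip()
        if self.in_target and text.isdecimal():
            self.total += int(text)


def sum_spans_stdlib(html, css_class="comments"):
    """Feed an HTML string through SpanSummer and return the total."""
    parser = SpanSummer(css_class)
    parser.feed(html)
    parser.close()
    return parser.total


# Made-up snippet for illustration.
snippet = ('<p><span class="comments">10</span> '
           '<span class="comments">32</span> <span>5</span></p>')
print(sum_spans_stdlib(snippet))  # prints 42
```

To use it on the real page, pass the decoded response, for example sum_spans_stdlib(html.decode()), where html is the bytes object returned by urlopen(...).read() in the code above.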