How to add a new HTML tag without a closing tag in Python using BS4
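I'm trying to generate some HTML output with Python, but I can't get the formatting right. I want the break tags to be written without a closing tag. At the moment I can produce the following HTML:
item = {}
item["PRICE"] = 'US$ 68.83'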
item["PUB_DATE"] = '1974'
item["SHIPPING"] = 'US$ 14.16 Shipping'
from bs4 import BeautifulSoup
html = """
<html>
<head>
</head>
<body>
</html>
"""
# Create the soup; naming the parser explicitly avoids bs4's "no parser specified" warning
soup = BeautifulSoup(html, "html.parser")
body = soup.find("body")
# Next, add the br elements: PRICE
PRICE = soup.new_tag("br")
PRICE.string = item["PRICE"]
soup.body.append(PRICE)
#PUB_DATE
PUB_DATE = soup.new_tag("br")
PUB_DATE.string = item["PUB_DATE"]
soup.body.append(PUB_DATE)
#SHIPPING
SHIPPING = soup.new_tag("br")
SHIPPING.string = item["SHIPPING"]
soup.body.append(SHIPPING)
print(soup)
#Yields
<html>
<head>
</head>
<body>
<br>US$ 68.83</br>
<br>1974</br>
<br>US$ 14.16 Shipping</br>
</body></html>
Desired result:
<html>
<head>
</head>
<body>
<br>US$ 68.83
<br>1974
<br>US$ 14.16 Shipping
</body></html>
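The last output leaves no blank space between the lines, whereas the first one does. I can't find anything in the documentation for .new_tag() about omitting the closing tag. Also, needing three lines to add a single <br> tag with its text seems rather unpythonic to begin with?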
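You're right, I don't see it in the documentation either. It would be nice to have a parameter that controls whether the closing tag is included, say one that defaults to True but can be set to False when needed. I suppose you could write a simple function to do that yourself if you wanted.
Without that, though, I think you have three options here.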
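- Option 1: simply pass div instead of br to .new_tag(). That gives the desired result: each piece of content on its own line, with no extra space.
- Option 2: since this is a relatively simple task, bypass bs4's .new_tag() and just insert the tag and string you want directly.
- Option 3: add the string to the new tag, then remove the closing tag.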
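Option 1 comes without code, so here is a minimal sketch of what it might look like. It assumes the same item dictionary as in the question and an explicit "html.parser" parser; the loop over the keys is shorthand added for this illustration and is not part of the original code.

```python
from bs4 import BeautifulSoup

item = {
    "PRICE": "US$ 68.83",
    "PUB_DATE": "1974",
    "SHIPPING": "US$ 14.16 Shipping",
}

# Start from an empty document with a body to append into
soup = BeautifulSoup("<html><head></head><body></body></html>", "html.parser")

# One div per field: a div is a block element, so each renders on its own line
for key in ("PRICE", "PUB_DATE", "SHIPPING"):
    div = soup.new_tag("div")
    div.string = item[key]
    soup.body.append(div)

print(soup)
```

Unlike br, a div is expected to have a closing tag, so the serialized markup stays valid HTML while the browser still shows one value per line.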
Option 2:
item = {}
item["PRICE"] = 'US$ 68.83'
item["PUB_DATE"] = '1974'
item["SHIPPING"] = 'US$ 14.16 Shipping'
from bs4 import BeautifulSoup
html = """
<html>
<head>
</head>
<body>
</html>
"""
# Create the soup; naming the parser explicitly avoids bs4's "no parser specified" warning
soup = BeautifulSoup(html, "html.parser")
body = soup.find("body")
# Next, add the br elements: PRICE
soup.body.append(BeautifulSoup(f'<br>{item["PRICE"]}\n', 'html.parser'))
#PUB_DATE
soup.body.append(BeautifulSoup(f'<br>{item["PUB_DATE"]}\n', 'html.parser'))
#SHIPPING
soup.body.append(BeautifulSoup(f'<br>{item["SHIPPING"]}\n', 'html.parser'))
print(soup)
Option 3:
item = {}
item["PRICE"] = 'US$ 68.83'
item["PUB_DATE"] = '1974'
item["SHIPPING"] = 'US$ 14.16 Shipping'
from bs4 import BeautifulSoup
html = """
<html>
<head>
</head>
<body>
</html>
"""
# Create the soup; naming the parser explicitly avoids bs4's "no parser specified" warning
soup = BeautifulSoup(html, "html.parser")
body = soup.find("body")
# Next, add the br elements: PRICE
PRICE = soup.new_tag("br")
PRICE.string = item["PRICE"]
PRICE = BeautifulSoup(str(PRICE).replace('</br>', '\n'), 'html.parser')
soup.body.append(PRICE)
#PUB_DATE
PUB_DATE = soup.new_tag("br")
PUB_DATE.string = item["PUB_DATE"]
PUB_DATE = BeautifulSoup(str(PUB_DATE).replace('</br>', '\n'), 'html.parser')
soup.body.append(PUB_DATE)
#SHIPPING
SHIPPING = soup.new_tag("br")
SHIPPING.string = item["SHIPPING"]
SHIPPING = BeautifulSoup(str(SHIPPING).replace('</br>', '\n'), 'html.parser')
soup.body.append(SHIPPING)
print(soup)
Output:
<html>
<head>
</head>
<body>
<br/>US$ 68.83
<br/>1974
<br/>US$ 14.16 Shipping
</body></html>
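The "simple function" mentioned at the start of this answer could be sketched roughly as follows. Instead of putting the text inside the br tag, it appends an empty br followed by a separate text node, so there is never a </br> to strip out. The name append_line is hypothetical, and the formatter="html5" call assumes a bs4 version that supports that formatter, which the bs4 documentation describes as omitting the closing slash on void tags such as br.

```python
from bs4 import BeautifulSoup


def append_line(soup, parent, text):
    """Append an empty <br> and then `text` as a sibling text node (hypothetical helper)."""
    parent.append(soup.new_tag("br"))  # empty void tag, so no </br> is written
    parent.append(text + "\n")         # a plain str is stored as a text node


item = {
    "PRICE": "US$ 68.83",
    "PUB_DATE": "1974",
    "SHIPPING": "US$ 14.16 Shipping",
}

soup = BeautifulSoup("<html><head></head><body>\n</body></html>", "html.parser")

for key in ("PRICE", "PUB_DATE", "SHIPPING"):
    append_line(soup, soup.body, item[key])

default_html = str(soup)                     # default formatter writes <br/>
html5_html = soup.decode(formatter="html5")  # html5 formatter writes <br>
print(html5_html)
```

This keeps each addition to a single call and leaves the tree in the shape a browser would build for <br> followed by text, which is an empty element with the text as its sibling rather than its child.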