Replace "\n" in an element string with a <br> tag in BeautifulSoup

I am creating a new tag and assigning it a string that contains a newline:

from bs4 import BeautifulSoup

soup = BeautifulSoup("", "html.parser")

myTag = soup.new_tag("div")
myTag.string = "My text \n with a new line"

soup.insert(0, myTag)

The result is

<div>My text 
 with a new line</div>

as expected. However, the newline needs a <br> tag in order to render correctly in a browser.

How can I achieve this?

I think it may be better to set the CSS white-space property of the div to pre-wrap:

pre-wrap -- Whitespace is preserved by the browser. Text will wrap when necessary, and on line breaks.

An example (here \n stands for an actual line break in the markup):

<div style="white-space:pre-wrap"> Some \n text here </div>

And the corresponding code in BeautifulSoup:

myTag = soup.new_tag("div", style="white-space:pre-wrap")
myTag.string = "My text \n with a new line"
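
As a quick check of the snippet above, here is a minimal self-contained sketch that builds the styled div and prints it. It assumes the html.parser builder, as in the question. The newline stays in the text node untouched, and it is the browser's CSS handling that turns it into a visible line break.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("", "html.parser")

# Extra keyword arguments to new_tag() become attributes on the tag.
my_tag = soup.new_tag("div", style="white-space:pre-wrap")
my_tag.string = "My text \n with a new line"
soup.append(my_tag)

# The newline is kept as-is inside the div; no <br> tag is involved.
print(soup)
```

This keeps the document structure simple, but it only helps where the stylesheet is honored, so it may not suit plain-text consumers of the HTML.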

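Replacing \n directly does not seem to be straightforward, because BeautifulSoup escapes HTML special characters by default, so a "<br>" placed inside the string comes out as text rather than as a tag. An alternative is to split the input string and build the tag structure yourself from text nodes and <br> tags: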

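from bs4 import BeautifulSoup

def replace_newline_with_br(s, soup):
    # Split on newlines and rebuild the text as text nodes separated by <br> tags.
    lines = s.split('\n')
    div = soup.new_tag('div')
    div.append(lines[0])
    for line in lines[1:]:
        div.append(soup.new_tag('br'))
        div.append(line)
    soup.append(div)

mytext = "My text with a few \n newlines \n"
mytext2 = "Some other text \n with a few more \n newlines \n here"

# Name the parser explicitly so the result does not depend on which parsers are installed.
soup = BeautifulSoup("", "html.parser")
replace_newline_with_br(mytext, soup)
replace_newline_with_br(mytext2, soup)
print(soup.prettify())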

This prints:

<div>
 My text with a few
 <br/>
 newlines
 <br/>
</div>
<div>
 Some other text
 <br/>
 with a few more
 <br/>
 newlines
 <br/>
 here
</div>
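
The same idea can be applied to a tag that already exists, which is closer to the situation in the question. The sketch below is only an illustration: the helper name newlines_to_br is hypothetical, and it assumes the tag holds a single string and that html.parser is used. It also shows what a naive str.replace produces, for contrast.

```python
from bs4 import BeautifulSoup

def newlines_to_br(tag, soup):
    """Hypothetical helper: rewrite tag's single string, turning "\n" into <br> tags."""
    text = tag.string
    if text is None:
        # No string, or several children: leave the tag alone.
        return
    parts = str(text).split("\n")
    tag.clear()
    for i, part in enumerate(parts):
        if i > 0:
            tag.append(soup.new_tag("br"))
        tag.append(part)

soup = BeautifulSoup("", "html.parser")

my_tag = soup.new_tag("div")
my_tag.string = "My text \n with a new line"
soup.append(my_tag)
newlines_to_br(my_tag, soup)
print(my_tag)

# For contrast: putting "<br>" into the string gets escaped on output.
naive = soup.new_tag("div")
naive.string = "My text \n with a new line".replace("\n", "<br>")
print(naive)
```

The first print shows a real <br/> element between the two text nodes, while the second shows the escaped text &lt;br&gt;, which a browser would display literally.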