Select multiple elements with BeautifulSoup and manage them individually
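I am using BeautifulSoup to parse a web page of poetry. Each poem is split into an h3 for the poem title and .line elements for the individual lines. I can get both kinds of element and add them to a list. But I want to convert the h3 to upper case, mark a line break after it, and then insert it into the list of lines.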
linesArr = []
for lines in full_text:
    booktitles = lines.select('h3')
    for booktitle in booktitles:
        linesArr.append(booktitle.text.upper())
        linesArr.append('')
    for line in lines.select('h3, .line'):
        linesArr.append(line.text)
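This code appends all the book titles to the start of the list, and then goes on to fetch the h3 and .line items. I have also tried handling the titles inline, like this: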
linesArr = []
for lines in full_text:
    for line in lines.select('h3, .line'):
        if line.find('h3'):
            linesArr.append(line.text.upper())
            linesArr.append('')
        else:
            linesArr.append(line.text)
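I'm not sure exactly what you are trying to do, but here is a way to get an array containing the title in upper case followed by all of your lines: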
#!/usr/bin/python3
# coding: utf8
from bs4 import BeautifulSoup
import requests

page = requests.get("https://quod.lib.umich.edu/c/cme/CT/1:1?rgn=div2;view=fulltext")
soup = BeautifulSoup(page.text, 'html.parser')
title = soup.find('h3')
full_lines = soup.find_all('div',{'class':'line'})
linesArr = []
linesArr.append(title.get_text().upper())
for line in full_lines:
    linesArr.append(line.get_text())
# Print full array with the title and text
print(linesArr)
# Print text here with line break
for linea in linesArr:
    print(linea + '\n')
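If the page holds several h3 titles and each one should stay in its original position among the lines, one possible approach is to walk the combined select('h3, .line') result, which comes back in document order, and check each tag's own name. The second attempt in the question tests line.find('h3'), which searches the descendants of the matched tag, so it returns None for the h3 itself. The sketch below is only an illustration: SAMPLE_HTML and collect_lines are made-up names, and the markup is a guess at the page structure rather than the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the real page (an assumption).
SAMPLE_HTML = """
<div class="poem">
  <h3>Book One</h3>
  <div class="line">First line</div>
  <div class="line">Second line</div>
  <h3>Book Two</h3>
  <div class="line">Third line</div>
</div>
"""


def collect_lines(html):
    """Return titles (upper-cased, followed by '') and lines in page order."""
    soup = BeautifulSoup(html, "html.parser")
    lines_arr = []
    # select() with a comma-separated selector yields matches in document order.
    for tag in soup.select("h3, .line"):
        text = tag.get_text(strip=True)
        if tag.name == "h3":
            # The matched tag itself is the title: check its name, not find().
            lines_arr.append(text.upper())
            lines_arr.append("")  # empty entry marks the line break
        else:
            lines_arr.append(text)
    return lines_arr


print(collect_lines(SAMPLE_HTML))
```

With the sample markup above, each upper-cased title is followed by an empty string and then by its own lines, so the titles no longer pile up at the start of the list.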