After writing data from a list to a CSV file, some column cells are empty

Question

I have code that scrapes the top 100 movies from the Rotten Tomatoes website. After parsing, the data is put into a list. Here is the code:

# create and write headers to a list 
rows = []
rows.append(['Rank', 'Rating', 'Title', 'No. of Reviews'])
print(rows)

# loop over results
for result in results:
    # find all columns per result
    data = result.find_all('td')
    # check that columns have data 
    if len(data) == 0: 
        continue
        
    # write columns to variables
    rank = data[0].getText()
    rating = data[1].getText()
    title = data[2].getText()
    reviews = data[3].getText()
    
    # write each result to rows
    rows.append([rank, rating, title, reviews])
    
print(rows)

The output looks like this:

[['Rank', 'Rating', 'Title', 'No. of Reviews'], ['1.', '\n\n\n\xa096%\n\n', '\n\n            Black Panther (2018)\n', '503'], ['2.', '\n\n\n\xa094%\n\n', '\n\n            Avengers: Endgame (2019)\n', '514'], ['3.', '\n\n\n\xa093%\n\n', '\n\n            Us (2019)\n', '520'], ['4.', '\n\n\n\xa097%\n\n', '\n\n            Toy Story 4 (2019)\n', '433'], ['5.', '\n\n\n\xa098%\n\n', '\n\n           The Wizard of Oz (1939)\n', '117'], ['6.', '\n\n\n\xa099%\n\n', '\n\n  Lady Bird (2017)\n', '388']...

Then I write the data to a CSV file.

# Create csv and write rows to output file
with open('rottentomato.csv','w', newline='') as f_output:
    csv_output = csv.writer(f_output)
    csv_output.writerows(rows)

But only the 'Rank' and 'No. of Reviews' columns have data. The 'Rating' and 'Title' columns are empty.

Answer 1

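I tried to reproduce your problem, but the only issue I found was the special characters that produce whitespace (newlines and the non-breaking space '\xa0'). You can clean them up with strip: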

import csv
rows = [['Rank', 'Rating', 'Title', 'No. of Reviews'], ['1.', '\n\n\n\xa096%\n\n', '\n\nBlack Panther (2018)\n', '503'], ['2.', '\n\n\n\xa094%\n\n', '\n\nAvengers: Endgame (2019)\n', '514'], ['3.', '\n\n\n\xa093%\n\n', '\n\nUs (2019)\n', '520'], ['4.', '\n\n\n\xa097%\n\n', '\n\nToy Story 4 (2019)\n', '433'], ['5.', '\n\n\n\xa098%\n\n', '\n\nThe Wizard of Oz (1939)\n', '117'], ['6.', '\n\n\n\xa099%\n\n', '\n\nLady Bird (2017)\n', '388']]
for i, row in enumerate(rows):
    for j, data in enumerate(row):
        rows[i][j] = data.strip()

with open('rottentomato.csv','w', newline='') as f_output:
    csv_output = csv.writer(f_output)
    csv_output.writerows(rows)

This is the output I get:

Rank,Rating,Title,No. of Reviews
1.,96%,Black Panther (2018),503
2.,94%,Avengers: Endgame (2019),514
3.,93%,Us (2019),520
4.,97%,Toy Story 4 (2019),433
5.,98%,The Wizard of Oz (1939),117
6.,99%,Lady Bird (2017),388
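
To see why the cells may only look empty, here is a minimal sketch that serializes one row from the question before and after stripping. It writes to an in-memory buffer rather than a file, and the helper name `to_csv_text` is made up for this illustration. In the unstripped version the 'Rating' and 'Title' fields are quoted and begin with blank lines, so a spreadsheet that shows only the first line of each cell would likely display them as empty.

```python
import csv
import io

# One header row and one data row, copied from the question's output.
rows = [
    ['Rank', 'Rating', 'Title', 'No. of Reviews'],
    ['1.', '\n\n\n\xa096%\n\n', '\n\n            Black Panther (2018)\n', '503'],
]


def to_csv_text(table):
    """Serialize a list of rows to CSV text in memory (hypothetical helper)."""
    buffer = io.StringIO(newline='')
    csv.writer(buffer).writerows(table)
    return buffer.getvalue()


# Unstripped: fields with embedded newlines are quoted and span several lines.
raw_text = to_csv_text(rows)

# Stripped: str.strip() removes leading/trailing whitespace, including '\xa0'.
clean_rows = [[cell.strip() for cell in row] for row in rows]
clean_text = to_csv_text(clean_rows)

print(repr(raw_text))
print(repr(clean_text))
```

The list comprehension does the same job as the nested enumerate loop above, but builds a new list instead of modifying rows in place.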

Answer 2

You can use pandas to do most of the heavy lifting.

import pandas as pd

pd.read_html(
    'https://www.rottentomatoes.com/top/bestofrt/'
)[2].to_csv(
    'rottentomatoes.csv',
    index=False
)
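
If you would rather keep the existing scraping loop, pandas can also clean and write the `rows` list directly. The following is only a sketch under that assumption: it uses two sample rows copied from the question, requires pandas to be installed, and writes to an in-memory buffer instead of a file so it runs without touching disk.

```python
import io

import pandas as pd

# Sample rows copied from the question (header first, then data).
rows = [
    ['Rank', 'Rating', 'Title', 'No. of Reviews'],
    ['1.', '\n\n\n\xa096%\n\n', '\n\n            Black Panther (2018)\n', '503'],
    ['2.', '\n\n\n\xa094%\n\n', '\n\n            Avengers: Endgame (2019)\n', '514'],
]

# The first row is the header; the remaining rows are records.
df = pd.DataFrame(rows[1:], columns=rows[0])

# Apply Python's str.strip to every cell to drop newlines, spaces and '\xa0'.
df = df.apply(lambda col: col.map(str.strip))

# Write to a buffer here; pass a filename such as 'rottentomato.csv' instead
# to produce a real file.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```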

