How can I rewrite this Python code without using pandas?

I'm trying to 'translate' code that uses the pandas module into code that does not use pandas.

The code looks like this:

my code

import pandas as pd
data=pd.read_csv('review.csv')
data
titles=data['book_title']
temp=[]
for name in titles:
    temp.append(name)
temp_set=set(temp)
temp_list=list(temp_set)
temp_list
data_simple=data.filter(items=['book_title','stars_given'])
data_simple=data_simple.set_index('book_title')
result_table=[]
for title in temp_list:
    book_data=data_simple.filter(like=title,axis=0)
    average=book_data['stars_given'].mean()
    result_table.append([title,average])
result=pd.DataFrame(data=result_table, columns=['book_title', 'average_rating'])
result
result.to_csv('average_rating.csv', index=False, encoding='cp949')

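(See the image; my typing may not be accurate.)

Can someone help me rewrite this code using only built-in modules (for example, starting with 'import csv') instead of pandas?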

Suggested approach:

  • The csv module
  • List comprehensions to filter the data

Code

import csv

# Load Data
with open('review.csv', 'r', newline='') as csv_file:  # newline='' is recommended by the csv docs
    data = []
    csv_reader = csv.DictReader(csv_file, delimiter=',')
    
    for row in csv_reader:
        data.append(row)  # each row is a dictionary containing
                          # column names as keys
                          # and data in CSV file row as values

print(data)

# Names of unique book titles (a set drops duplicates; its order is arbitrary)
temp_list = list({row['book_title'] for row in data})
print(temp_list)

# Keep only the book_title and stars_given columns
# Each row is a dictionary, so use a dictionary comprehension per row
data_simple = [{column:row[column] for column in ['book_title', 'stars_given']} for row in data]
print(data_simple)

# Mean of stars by title
result_table = []
for title in temp_list:
    # Filter to rows with title
    book_data = [row for row in data_simple if row['book_title']==title]
    
    # Sum up number of stars for book
    sum_ = sum(int(row['stars_given']) for row in book_data)
    average = sum_ / len(book_data)
    result_table.append((title, average))   # store each as tuple
    
print(result_table)

# Create resulting CSV
with open('average_rating.csv', 'w', newline = '', encoding = 'cp949') as csv_file:
    csv_writer = csv.writer(csv_file, delimiter=',')
    csv_writer.writerow(['book_title', 'average_rating'])  # Header
    for row in result_table:
        csv_writer.writerow(row)
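
The loop above re-filters the whole list once per title. As an alternative, here is a minimal sketch that accumulates a running sum and count per title in a single pass. The helper name `average_ratings` and the in-memory `sample` text are my own assumptions for illustration (the sample mirrors the test file shown in this answer), and `float` is used instead of `int` on the assumption that fractional ratings should also be accepted.

```python
import csv
import io
from collections import defaultdict


def average_ratings(rows):
    """Return {book_title: mean of stars_given} from an iterable of dict rows.

    Hypothetical helper, not part of the csv module.
    """
    totals = defaultdict(lambda: [0.0, 0])  # title -> [sum of stars, row count]
    for row in rows:
        entry = totals[row['book_title']]
        entry[0] += float(row['stars_given'])
        entry[1] += 1
    return {title: total / count for title, (total, count) in totals.items()}


# In-memory stand-in for review.csv so the sketch runs without any files
sample = io.StringIO(
    "book_title,stars_given,comment\n"
    "abc,5,loved it\n"
    "def,3,okay to watch\n"
    "bce,2,too long\n"
    "abc,4,very funny\n"
)
averages = average_ratings(csv.DictReader(sample))
print(averages)
```

To use it with the real file, pass the `csv.DictReader` built from the opened 'review.csv' instead of `sample`, then write `averages.items()` out with `csv_writer.writerows`.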

Test

File: review.csv

book_title,stars_given,comment
abc,5,loved it
def,3,okay to watch
bce,2,too long
abc,4,very funny

File: average_rating.csv (row order may vary between runs, because the titles come from a set)

book_title,average_rating
def,3.0
abc,4.5
bce,2.0
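
If a reproducible row order matters, one option is to sort the unique titles before averaging. The sketch below reruns the test data entirely in memory, so `review_csv` and the `io.StringIO` buffers are stand-ins I am assuming in place of the real files, and `statistics.fmean` (Python 3.8+) replaces the manual sum and division.

```python
import csv
import io
from statistics import fmean

# Same contents as the review.csv test file above
review_csv = (
    "book_title,stars_given,comment\n"
    "abc,5,loved it\n"
    "def,3,okay to watch\n"
    "bce,2,too long\n"
    "abc,4,very funny\n"
)

rows = list(csv.DictReader(io.StringIO(review_csv)))

# Sorting the unique titles makes the output order the same on every run
titles = sorted({row['book_title'] for row in rows})
result_table = [
    (title, fmean(float(row['stars_given'])
                  for row in rows if row['book_title'] == title))
    for title in titles
]

# Write to an in-memory buffer instead of average_rating.csv
out = io.StringIO()
csv_writer = csv.writer(out, lineterminator='\n')
csv_writer.writerow(['book_title', 'average_rating'])  # Header
csv_writer.writerows(result_table)
print(out.getvalue())
```

The averages match the file above; only the row order differs, since the titles are now alphabetical.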

I think NumPy might be able to do this?

import numpy as np

# using loadtxt()
arr = np.loadtxt("review.csv",
                 delimiter=",", dtype=str)

I'm not sure, but give NumPy a try.
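
To take the NumPy idea one step further, here is a rough sketch of how the per-title averages could be computed once the array is loaded. Note that NumPy is a third-party package, so it does not meet the "built-in modules only" requirement in the question. The `sample` buffer is an assumed stand-in for 'review.csv', and the sketch assumes no field contains a comma, because `np.loadtxt` splits on every delimiter and does not understand CSV quoting.

```python
import io

import numpy as np

# In-memory stand-in for review.csv (assumed contents, same as the test file)
sample = io.StringIO(
    "book_title,stars_given,comment\n"
    "abc,5,loved it\n"
    "def,3,okay to watch\n"
    "bce,2,too long\n"
    "abc,4,very funny\n"
)

# skiprows=1 drops the header line; every field is loaded as a string
arr = np.loadtxt(sample, delimiter=",", dtype=str, skiprows=1)

titles = arr[:, 0]
stars = arr[:, 1].astype(float)

# inverse[i] is the index into unique_titles for row i
unique_titles, inverse = np.unique(titles, return_inverse=True)
sums = np.bincount(inverse, weights=stars)  # total stars per title
counts = np.bincount(inverse)               # number of reviews per title
averages = sums / counts

result = dict(zip(unique_titles.tolist(), averages.tolist()))
print(result)
```

`np.unique` returns the titles in sorted order, so the result is deterministic, but for real review files with quoted commas in the comments the csv module is the safer reader.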