How can I rewrite this Python code without using pandas?
I'm trying to 'translate' code that uses the pandas module into code that doesn't use pandas.
The code looks like this:
My code:
import pandas as pd

data = pd.read_csv('review.csv')
data

titles = data['book_title']
temp = []
for name in titles:
    temp.append(name)
temp_set = set(temp)
temp_list = list(temp_set)
temp_list

data_simple = data.filter(items=['book_title', 'stars_given'])
data_simple = data_simple.set_index('book_title')

result_table = []
for title in temp_list:
    book_data = data_simple.filter(like=title, axis=0)
    average = book_data['stars_given'].mean()
    result_table.append([title, average])

result = pd.DataFrame(data=result_table, columns=['book_title', 'average_rating'])
result
result.to_csv('average_rating.csv', index=False, encoding='cp949')
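(See the picture; my typing may not be accurate.)
Could someone help me modify the code so that it uses only built-in modules instead of pandas (for example, starting with 'import csv')?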
Suggested approach:
- the csv module
- list comprehensions to filter the data
Code
import csv

# Load data
with open('review.csv', 'r', newline='') as csv_file:
    data = []
    csv_reader = csv.DictReader(csv_file, delimiter=',')
    for row in csv_reader:
        data.append(row)  # each row is a dictionary with
                          # column names as keys
                          # and the data in that CSV row as values
print(data)

# Names of unique book titles
temp = []
for name in [row['book_title'] for row in data]:  # list comprehension for the titles column
    temp.append(name)
temp_set = set(temp)
temp_list = list(temp_set)
print(temp_list)

# Filter to book_title and stars_given
# Each row is a dictionary, so use a dictionary comprehension
data_simple = [{column: row[column] for column in ['book_title', 'stars_given']} for row in data]
print(data_simple)

# Mean of stars by title
result_table = []
for title in temp_list:
    # Filter to rows with this title
    book_data = [row for row in data_simple if row['book_title'] == title]
    # Sum up the number of stars for the book
    sum_ = sum(int(row['stars_given']) for row in book_data)
    average = sum_ / len(book_data)
    result_table.append((title, average))  # store each result as a tuple
print(result_table)

# Create the resulting CSV
with open('average_rating.csv', 'w', newline='', encoding='cp949') as csv_file:
    csv_writer = csv.writer(csv_file, delimiter=',')
    csv_writer.writerow(['book_title', 'average_rating'])  # header
    for row in result_table:
        csv_writer.writerow(row)
Test
File: review.csv
book_title,stars_given,comment
abc,5,loved it
def,3,okay to watch
bce,2,too long
abc,4,very funny
File: average_rating.csv
book_title,average_rating
def,3.0
abc,4.5
bce,2.0
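The per-title filtering above rescans every row once for each title. As a possible alternative, here is a minimal sketch that groups the stars in a single pass with collections.defaultdict. The function name average_ratings and the in-memory sample (copied from the review.csv test above) are only for illustration and are not part of the original code.

```python
import csv
import io
from collections import defaultdict


def average_ratings(csv_file):
    """Return {book_title: average stars_given} for an open CSV file object."""
    stars_by_title = defaultdict(list)
    for row in csv.DictReader(csv_file):
        # Group the star ratings by exact book title in one pass
        stars_by_title[row['book_title']].append(int(row['stars_given']))
    return {title: sum(stars) / len(stars) for title, stars in stars_by_title.items()}


# Sample rows copied from the review.csv test file (hypothetical in-memory stand-in)
sample = io.StringIO(
    "book_title,stars_given,comment\n"
    "abc,5,loved it\n"
    "def,3,okay to watch\n"
    "bce,2,too long\n"
    "abc,4,very funny\n"
)
averages = average_ratings(sample)
print(averages)
```

To use it on the real file, pass the object returned by open('review.csv', newline='') instead of the StringIO sample. Note that this, like the code above, groups on exact title equality, whereas the pandas original used filter(like=title, axis=0), which keeps every index label that contains the title as a substring, so the two can differ when one title is contained in another.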
I think NumPy might be able to do this?
import numpy as np

# using loadtxt()
arr = np.loadtxt("review.csv",
                 delimiter=",", dtype=str)
I'm not sure, but give NumPy a try.