How can I prevent users from entering the same input twice when writing directly to a .csv file in Python?
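I am trying to build a dataset from user input, and I want to prevent one of the fields from being duplicated. I ask each user to pick a letter that stands for their name, then enter their age, GPA, and major. I want to make sure each letter is entered only once, but I'm not sure how to do that when writing directly to a .csv file.
This is what I have so far.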
import csv
from colorama import Fore, Back, Style

with open('students2.csv', 'w+', newline='') as csvfile:
    columnheaders = ['NAME','AGE','GPA','MAJOR']
    writer = csv.DictWriter(csvfile, fieldnames=columnheaders)
    writer.writeheader()
    for i in range(0,10):
        askname=input('Please select the letter that matches your name from the following: (A, B, C, D, E, F, G, H, I, J), ')
        askage=input('Please enter your Age: ')
        askgpa=input('Please enter your GPA: ')
        askmajor=input('Please select your major from the following (CS, CET, CIS, CE) ')
        writer.writerow({'NAME': askname,'AGE': askage,'GPA': askgpa,'MAJOR': askmajor})
        print(Back.BLACK +'My name starts with the letter:', askname ,' and I am ', askage, 'years old. I study ', askmajor, 'and my GPA is: ', askgpa)
        print(Style.RESET_ALL)
I know how to do this using a list:
namelist = []

while True:
    #Input name
    while True:
        name = str(input('What is your name? '))
        if name.upper() not in ('A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'):
            print("Please use (A, B, C, D, E, F, G, H, I, J).")
            continue
        if name in namelist:
            print("This name has already been used.")
            continue
        else:
            namelist.append(name)
            break
But is it possible to do this without going through a list and then converting it to a .csv?
Any help would be appreciated.
Thanks in advance.
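You will need to keep a copy of the names in memory (you could rescan the CSV every time, but that causes a lot of unnecessary disk I/O).
My suggestion is to cache the names in a set: add something like nameseen = set() at the top of your script, then check it before writing the row. Something like: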
if askname not in nameseen:
    writer.writerow({'NAME': askname,'AGE': askage,'GPA': askgpa,'MAJOR': askmajor})
    nameseen.add(askname)
    print(Back.BLACK +'My name starts with the letter:', askname ,' and I am ', askage, 'years old. I study ', askmajor, 'and my GPA is: ', askgpa)
    print(Style.RESET_ALL)
else:
    print("This name has already been used.")
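The check above skips a duplicate row but still uses up one of the ten loop iterations. A minimal sketch of one way to re-prompt instead is below. The names `ask_unique_letter`, `collect_students`, and the `ask` parameter are hypothetical helpers introduced here for illustration (they are not from the question), and normalising the letter with `upper()` is an assumption about the desired behaviour.

```python
import csv

# Allowed name letters, taken from the prompt in the question.
ALLOWED = set("ABCDEFGHIJ")


def ask_unique_letter(seen, ask=input):
    """Prompt until the user gives an allowed letter that is not in `seen`.

    `ask` is a hypothetical hook (defaults to input) so the prompt can be
    replaced in tests.
    """
    while True:
        letter = ask("Please select the letter that matches your name (A-J): ").strip().upper()
        if letter not in ALLOWED:
            print("Please use one of A, B, C, D, E, F, G, H, I, J.")
        elif letter in seen:
            print("This name has already been used.")
        else:
            seen.add(letter)
            return letter


def collect_students(path, count, ask=input):
    """Write `count` rows to `path`, never repeating a NAME letter."""
    fieldnames = ["NAME", "AGE", "GPA", "MAJOR"]
    seen = set()  # in-memory cache of the letters written so far
    with open(path, "w", newline="") as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        writer.writeheader()
        for _ in range(count):
            name = ask_unique_letter(seen, ask)
            age = ask("Please enter your Age: ")
            gpa = ask("Please enter your GPA: ")
            major = ask("Please select your major from the following (CS, CET, CIS, CE) ")
            writer.writerow({"NAME": name, "AGE": age, "GPA": gpa, "MAJOR": major})
    return seen
```

Calling `collect_students('students2.csv', 10)` would reproduce the loop from the question. Because the set lives only in memory, this guards against duplicates within a single run, not across runs.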
If you can use pandas, you can do this:
import pandas as pd

# NAME becomes the index because of index_col=0
df = pd.read_csv('kd.csv', index_col=0)
df.to_csv()
# 'NAME,AGE,GPA,MAJOR\nBill,18,4.0,CS\nMike,20,2.9,BS\nWill,20,2.4,CS\nBill,18,4.0,CS\n'

# Note: drop_duplicates compares column values only and ignores the index (NAME here)
df.drop_duplicates(subset=None, inplace=True)
df.to_csv()
# 'NAME,AGE,GPA,MAJOR\nBill,18,4.0,CS\nMike,20,2.9,BS\nWill,20,2.4,CS\n'
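One caveat with the snippet above: `drop_duplicates` ignores the index, so with NAME as the index it compares only AGE, GPA, and MAJOR. A hedged sketch of deduplicating on the name itself follows. It assumes NAME is kept as an ordinary column, and the inline CSV text is made-up sample data used only for illustration.

```python
import io

import pandas as pd

# Made-up sample data: "Bill" appears twice with different details.
raw = (
    "NAME,AGE,GPA,MAJOR\n"
    "Bill,18,4.0,CS\n"
    "Mike,20,2.9,BS\n"
    "Will,20,2.4,CS\n"
    "Bill,19,3.1,CE\n"
)

# Keep NAME as a regular column (no index_col) so it can be used in `subset`.
df = pd.read_csv(io.StringIO(raw))

# Keep the first row seen for each NAME and drop later repeats.
deduped = df.drop_duplicates(subset=["NAME"], keep="first")
print(deduped.to_csv(index=False))
```

This removes duplicates after the fact, so it suits cleaning an existing file more than rejecting input while the user is typing.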
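Update
I changed the code in response to your comment. It now creates the file if it does not exist, and I am still working on improving it based on your comments. If you get stuck in an infinite loop, press CTRL-D.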
$ cat kd2.csv
NAME AGE GPA MAJOR
A 20 3.2 CIS
B 31 4.0 CS
C 34 3.5 CE
D 18 2.0 CS
E 4.0 3.2 CE
import io
import pandas as pd

def new_student_add():
    only_allowed = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J']
    # Header-only stub used when the file does not exist yet
    stub = io.StringIO('NAME AGE GPA MAJOR\n')
    while True:
        try:
            # sep=r'\s+' splits on whitespace (replaces the deprecated delim_whitespace=True)
            df = pd.read_csv('kd4.csv', sep=r'\s+', index_col=0)
        except FileNotFoundError:
            stub.seek(0)
            df = pd.read_csv(stub, sep=r'\s+', index_col=0)
        new_csv_dict = {}
        try:
            new_name =input('Please select the letter that matches your name from the following: (A, B, C, D, E, F, G, H, I, J): ')
        except EOFError:
            # CTRL-D ends the prompt loop
            break
        if new_name not in only_allowed:
            print("Only letters in {} are allowed".format(only_allowed))
            continue
        if new_name in df.index:
            print("This name has already been used.")
            continue
        new_csv_dict['AGE'] =input('Please enter your Age: ')
        new_csv_dict['GPA'] =input('Please enter your GPA: ')
        new_csv_dict['MAJOR'] =input('Please select your major from the following (CS, CET, CIS, CE) ')
        df.loc[new_name] = new_csv_dict
        break
    # Write the whole table back, space-separated
    df.to_csv(r'kd4.csv', sep=' ')
    return df

for x in range (0,9):
    df = new_student_add()

for name, row in df.iterrows():
    print("My name starts with the letter {} and I am {} years old. I study {} and my GPA is: {}".format(name, int(row['AGE']), row['MAJOR'], row['GPA']))

# This may be much faster, so adding it in in case the author needs a faster algorithm. Thanks AlexanderCécile
# for item in df.itertuples():
#     print(f"My name starts with the letter {item[0]} and I am {item[1]} years old. I study {item[3]} and my GPA is: {item[2]}")
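If pandas is not available, the same idea of remembering names across runs can be sketched with only the csv module: load the NAME column into a set once at startup, then append new rows. This is an illustrative alternative and not the code from the answer above. `load_seen_names` and `append_student` are hypothetical helper names, and the comma-separated layout is an assumption.

```python
import csv
import os

FIELDNAMES = ["NAME", "AGE", "GPA", "MAJOR"]


def load_seen_names(path):
    """Return the set of NAME values already stored in the CSV at `path`."""
    if not os.path.exists(path):
        return set()
    with open(path, newline="") as f:
        return {row["NAME"] for row in csv.DictReader(f)}


def append_student(path, seen, row):
    """Append `row` unless its NAME is already in `seen`.

    Returns True if the row was written, False if it was rejected.
    """
    if row["NAME"] in seen:
        return False
    # Write the header only when the file is new or empty.
    write_header = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
        if write_header:
            writer.writeheader()
        writer.writerow(row)
    seen.add(row["NAME"])
    return True
```

The file is read once per run instead of once per prompt, which keeps disk I/O low while still rejecting letters entered in earlier sessions.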