Using pandas.read_csv() is conflicting with csv.reader() - ValueError: I/O operation on closed file

Question

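I am parsing a CSV file sent via POST FormData() and then converting it to JSON. The problem appears when I use a package to validate the CSV before passing it to pandas. The validator function does its job, and then the ordinary pandas read fails with ValueError: I/O operation on closed file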


if request.method == 'POST':
        content = request.form
        data_header = json.loads(content.get('JSON'))
        filename = data_header['data'][0]['name']
        
        # Here! starts the problem
        # validator = validCSV(validator={'header': ["id","type","name","subtype","tag","block","latitude","longitude","height","max_alt","min_alt","power","tia","fwl"]})
        # print(validator.verify_header(request.files[filename]))
        # then pseudo-code: if returned false, will abort(404)
        
        try:
            df = pd.read_csv(request.files[filename], dtype='object')
            dictObj = df.to_dict(orient='records')

If we trace the problem inside the package, this is what we see:

def verify_header(self, inputfile):
        with TextIOWrapper(inputfile, encoding="utf-8") as wrapper:
            header = next(csv.reader(wrapper))

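It seems that once TextIOWrapper has opened and closed the file, pandas can no longer open it with read_csv(). Making a copy of the file seems wasteful just to read the header, though, and I like the idea of using csv.reader() because other examples show it reading CSV files more efficiently than pandas.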
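The closing behaviour described above can be reproduced without Flask or pandas. The following is a minimal sketch, assuming an in-memory io.BytesIO stream as a stand-in for the uploaded file (the name `upload` and the sample data are hypothetical): leaving the `with` block closes the TextIOWrapper, which also closes the binary stream it wraps.

```python
import csv
import io

# Hypothetical stand-in for the uploaded file: an in-memory binary stream.
upload = io.BytesIO(b"id,type,name\n1,a,foo\n")

# Same pattern as verify_header(): wrap the binary stream to read text from it.
with io.TextIOWrapper(upload, encoding="utf-8") as wrapper:
    header = next(csv.reader(wrapper))

# Leaving the with block closed the wrapper AND the stream underneath it.
stream_closed = upload.closed

# Any later read on the original object now fails, which is what pandas hits.
error_message = None
try:
    upload.read()
except ValueError as exc:
    error_message = str(exc)

print(header, stream_closed, error_message)
```

So the error does not come from pandas itself; any reader that receives the same file object after the wrapper has been closed would fail the same way.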

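How can I prevent the I/O error after another package has opened the file? Or is there a simple, efficient way to validate a CSV without using heavyweight pandas?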

Answer 1

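The solution is to seek() the file pointer back to the beginning of the file after reading the first line. The reading process is much the same as what pandas does. The only obvious advantage is that it does not depend on importing/installing pandas.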

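        # Read only the first line as bytes; inputfile is never wrapped, so it stays open
        wrapper = StringIO(inputfile.readline().decode('utf-8'))
        header = next(csv.reader(wrapper, delimiter=','))
        # Rewind so a later pd.read_csv() starts from the beginning of the file
        inputfile.seek(0, 0)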
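For reference, here is a self-contained sketch of the same approach, assuming the upload behaves like a seekable binary stream. The helper name `read_header` and the io.BytesIO sample are hypothetical and not part of the validator package. Note that readline() stops at the first newline, so a header containing a quoted line break would be cut short.

```python
import csv
import io


def read_header(inputfile, encoding="utf-8"):
    """Return the first CSV row of a binary stream, then rewind the stream.

    Hypothetical helper: the stream is never wrapped, so nothing closes it.
    """
    first_line = inputfile.readline().decode(encoding)
    # An empty file yields no rows; fall back to an empty header in that case.
    header = next(csv.reader(io.StringIO(first_line), delimiter=","), [])
    # Move the pointer back to the start for whoever reads the file next.
    inputfile.seek(0)
    return header


# Hypothetical stand-in for the uploaded file.
upload = io.BytesIO(b"id,type,name\n1,a,foo\n")
header = read_header(upload)

# The stream is still open and positioned at the start, so a full read works.
still_open = not upload.closed
remaining = upload.read()
print(header, still_open, remaining)
```

After a call like this, the same file object can be handed to pd.read_csv(), because it is still open and its pointer is back at offset 0.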

Tags: python, csv, io, pandas