pandas read_csv throwing ValueError: Invalid file path or buffer object type: <class 'list'>
I want to read a CSV file passed as a command-line argument. I thought I could use argparse's FileType object directly, but I am getting an error.
from argparse import ArgumentParser, FileType
from pandas import read_csv

if __name__ == "__main__":
    parser = ArgumentParser()
    parser.add_argument("input_file_path", help="Input CSV file", type=FileType('r'), nargs=1)
    df = read_csv(parser.parse_args().input_file_path, sep="|")
    print(df.to_string())
pandas read_csv fails to read the FileType object when I run the program as shown below. What am I missing?
python csv_splitter.py test.csv
Traceback (most recent call last):
File "csv_splitter.py", line 7, in <module>
df = read_csv(parser.parse_args().input_file_path, sep="|")
File "C:\Users\kakkrah\AppData\Roaming\Python\Python38\site-packages\pandas\io\parsers.py", line 605, in read_csv
return _read(filepath_or_buffer, kwds)
File "C:\Users\kakkrah\AppData\Roaming\Python\Python38\site-packages\pandas\io\parsers.py", line 457, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "C:\Users\kakkrah\AppData\Roaming\Python\Python38\site-packages\pandas\io\parsers.py", line 814, in __init__
self._engine = self._make_engine(self.engine)
File "C:\Users\kakkrah\AppData\Roaming\Python\Python38\site-packages\pandas\io\parsers.py", line 1045, in _make_engine
return mapping[engine](self.f, **self.options) # type: ignore[call-arg]
File "C:\Users\kakkrah\AppData\Roaming\Python\Python38\site-packages\pandas\io\parsers.py", line 1862, in __init__
self._open_handles(src, kwds)
File "C:\Users\kakkrah\AppData\Roaming\Python\Python38\site-packages\pandas\io\parsers.py", line 1357, in _open_handles
self.handles = get_handle(
File "C:\Users\kakkrah\AppData\Roaming\Python\Python38\site-packages\pandas\io\common.py", line 558, in get_handle
ioargs = _get_filepath_or_buffer(
File "C:\Users\kakkrah\AppData\Roaming\Python\Python38\site-packages\pandas\io\common.py", line 371, in _get_filepath_or_buffer
raise ValueError(msg)
ValueError: Invalid file path or buffer object type: <class 'list'>
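pd.read_csv cannot read a list of files; it reads only one file at a time.
To read multiple files into a single DataFrame, use pd.concat with a generator expression or with map: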
df = pd.concat(pd.read_csv(p) for p in paths)
df = pd.concat(map(pd.read_csv, paths))
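As a small self-contained sketch of the pd.concat pattern above, the following uses two in-memory buffers in place of real files. The buffer contents and the name paths are made up for illustration, and ignore_index=True is an optional addition so that the combined frame gets a fresh 0..n-1 index.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-ins for real CSV files; read_csv also accepts file-like objects.
paths = [StringIO("a|b\n1|2\n"), StringIO("a|b\n3|4\n")]

# Read each source on its own, then stack the resulting frames row-wise.
df = pd.concat((pd.read_csv(p, sep="|") for p in paths), ignore_index=True)
print(df)
```

With real files, paths would simply be a list of path strings.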
In the OP's case, even though nargs=1 restricts the argument parser to a single file, it still returns a list containing that one file object:
print(parser.parse_args().input_file_path)
# [ <_io.TextIOWrapper> ]
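The list-wrapping comes from nargs=1 itself and does not depend on FileType. A minimal sketch, assuming plain string arguments and an explicit argument list instead of the real command line, compares the two forms:

```python
from argparse import ArgumentParser

# nargs=1 wraps the single value in a list.
with_nargs = ArgumentParser()
with_nargs.add_argument("input_file_path", nargs=1)
as_list = with_nargs.parse_args(["test.csv"]).input_file_path

# Without nargs, the single value is stored directly.
without_nargs = ArgumentParser()
without_nargs.add_argument("input_file_path")
as_value = without_nargs.parse_args(["test.csv"]).input_file_path

print(as_list)   # ['test.csv']
print(as_value)  # test.csv
```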
So just index the single file (and keep sep="|" if the file is pipe-delimited, as in the question):
df = pd.read_csv(parser.parse_args().input_file_path[0])
#                                                   ^^^
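An alternative to indexing is to drop nargs=1 altogether, so that argparse stores the open file object itself. The sketch below is one possible arrangement, not the OP's final code: the helper name load_frame and the temporary file standing in for test.csv are assumptions made so the example runs on its own.

```python
import os
import tempfile
from argparse import ArgumentParser, FileType

from pandas import read_csv


def load_frame(argv=None):
    """Parse the CSV path from argv (or sys.argv) and return a DataFrame."""
    parser = ArgumentParser()
    # No nargs here: the attribute is the open file object, not a list.
    parser.add_argument("input_file_path", help="Input CSV file", type=FileType("r"))
    args = parser.parse_args(argv)
    # Close the handle once pandas has finished reading it.
    with args.input_file_path as handle:
        return read_csv(handle, sep="|")


# Demo with a throwaway pipe-delimited file standing in for test.csv.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write("a|b\n1|2\n")
df = load_frame([tmp.name])
os.remove(tmp.name)
print(df.to_string())
```

In a real script, calling load_frame() with no arguments would read the path from the command line, as in python csv_splitter.py test.csv.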