Extract the CSV files from subdirectories in Python

I have a local path (a parent directory). I want to extract only the paths that contain CSV files and save them in a CSV file.

What have I tried so far?

import os

directory = os.path.join("path/to/dir","path")
for root,dirs,files in os.walk(directory): 
    for file in files: 
        if file.endswith(".csv"): 
            f=open(file, 'r')
            f.close()

This does not extract all the CSV paths and save them. How can I do that?

If I understand correctly, you only want to record the paths of the CSV files in the folder.

To collect all the paths recursively, this should do it:

import os

directory = os.path.join("path/to/dir","path")

with open("all_files.csv", "w") as f:
    for root, dirs, files in os.walk(directory):
        for file in files:
            if file.endswith(".csv"):
                # One path per line; os.path.join picks the right separator.
                f.write(os.path.join(root, file) + "\n")
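
If the paths are also needed elsewhere in the program, the same walk can be wrapped in a function. This is only a sketch: `find_csv_paths` and `save_paths` are names made up here, and the case-insensitive extension check is an assumption about what you want.

```python
import csv
import os


def find_csv_paths(directory):
    """Return the sorted paths of all .csv files under directory (recursive)."""
    paths = []
    for root, dirs, files in os.walk(directory):
        for name in files:
            # lower() also matches names such as DATA.CSV (an assumption).
            if name.lower().endswith(".csv"):
                paths.append(os.path.join(root, name))
    return sorted(paths)


def save_paths(paths, out_file):
    """Write one path per row to out_file."""
    with open(out_file, "w", newline="") as f:
        writer = csv.writer(f)
        for path in paths:
            writer.writerow([path])
```

It could be called as `save_paths(find_csv_paths("path/to/dir"), "all_files.csv")`. Writing the output file outside the directory being scanned keeps it from appearing in its own list on a second run.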

If you want one CSV file per directory instead:

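for root, dirs, files in os.walk(directory):
    # A files.csv is written into every directory visited, even ones without CSVs.
    with open(os.path.join(root, "files.csv"), "w") as f:
        for file in files:
            if file.endswith(".csv"):
                # One path per line.
                f.write(os.path.join(root, file) + "\n")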

I don't think you really need os.walk here. glob has a recursive mode that gets you what you want.

from glob import glob
import csv
import os

parent_directory = "/parent/directory/"
save_file = "/save/directory/csv_of_csvs.csv"

# With recursive=True, "**" matches zero or more directory levels.
pattern = os.path.join(parent_directory, "**", "*.csv")
csv_files_list = glob(pattern, recursive=True)

# Keep each folder that contains a CSV file only once.
folder_list = sorted({os.path.dirname(i) for i in csv_files_list})

with open(save_file, "w", newline="") as csv_file:
    writer = csv.writer(csv_file, delimiter=",")
    for folder in folder_list:
        writer.writerow([folder])
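
To see what the recursive `**` pattern matches, here is a small self-contained check against a throwaway directory tree. The file names and the helper name `csv_folders` are invented for the demonstration and are not part of the answer above.

```python
import os
import tempfile
from glob import glob


def csv_folders(parent_directory):
    """Return the sorted, de-duplicated folders under parent_directory holding a .csv file."""
    pattern = os.path.join(parent_directory, "**", "*.csv")
    # With recursive=True, "**" matches zero or more directory levels,
    # so CSV files directly inside parent_directory are included too.
    matches = glob(pattern, recursive=True)
    return sorted({os.path.dirname(path) for path in matches})


with tempfile.TemporaryDirectory() as parent:
    # Build a tiny tree: one CSV at the top, two in a/b, none in a or empty.
    os.makedirs(os.path.join(parent, "a", "b"))
    os.makedirs(os.path.join(parent, "empty"))
    for rel in ("top.csv", os.path.join("a", "b", "one.csv"),
                os.path.join("a", "b", "two.csv"), os.path.join("a", "notes.txt")):
        open(os.path.join(parent, rel), "w").close()

    folders = csv_folders(parent)
    # Print paths relative to the temporary root so the output is readable.
    print([os.path.relpath(folder, parent) for folder in folders])
```

The printed list should contain the root itself and the nested `b` folder, but neither `a` nor `empty`, since those hold no CSV files directly. Note that glob skips names starting with a dot by default, so hidden folders would need separate handling.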