How can I grab files from many subfolders via the main folder?
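Suppose I have a main folder with many subfolders, and my target files sit inside those subfolders. How can I set my path correctly so that the program can fetch these target files directly through the main folder?
For example,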
Main_folder
>sub_1
>>sub_1_v1
>>>targeted_file.txt # file I need
>>sub_2_v2
>>>targeted_file.txt # file I need
>sub_2
>>sub_1_v1
>>>targeted_file.txt # file I need
>>sub_2_v2
>>>targeted_file.txt # file I need
Here is the program, written by Julien Spronck:
def get_all_files(path):
    ## get a generator with all file names
    import os
    import glob
    return glob.iglob(os.path.join(path,'*.txt'))

def get_all_data(files):
    ## get a generator with all the data from all the files
    for fil in files:
        with open(fil, 'r') as the_file:
            for line in the_file:
                yield line

def write_lines_to_file(lines, outfile):
    with open(outfile, 'w') as the_file:
        for line in lines:
            the_file.write(line+'\n')

path = 'blah blah' # path should be given here!
outfile = 'blah.csv'
files = get_all_files(path)
lines = get_all_data(files)
write_lines_to_file(lines, outfile)
My question is: how do I give the path (starting from the main folder) so that I can grab all the target files in one go?
Thanks.
To walk through the folders, and then the files inside them, use:
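import os

def list_files(root):
    # Collect the full path of every file under root, at any depth.
    r = []
    # os.walk yields (directory, subdirectory names, file names) for each folder.
    for subdir, _dirs, files in os.walk(root):
        for name in files:
            # os.path.join uses the right separator on every platform.
            r.append(os.path.join(subdir, name))
    return r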
As shown here:
Python: Iterate through folders, then subfolders and print filenames with path to text file
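To connect this back to the program in the question, here is a minimal sketch of a recursive replacement for get_all_files. It keeps the original function name so the rest of the script can stay as it is. The pattern parameter and its '*.txt' default are assumptions added for illustration, not part of the original code.

```python
import fnmatch
import os


def get_all_files(path, pattern='*.txt'):
    """Yield the path of every file under `path`, at any depth,
    whose file name matches `pattern` (hypothetical parameter)."""
    for dirpath, _dirnames, filenames in os.walk(path):
        # fnmatch.filter keeps only the names that match the shell-style pattern.
        for name in fnmatch.filter(filenames, pattern):
            yield os.path.join(dirpath, name)
```

With this definition, setting path to the main folder (for example path = 'Main_folder') should be enough for files = get_all_files(path) to reach every targeted_file.txt in the tree. On Python 3.5 or later, glob.iglob(os.path.join(path, '**', '*.txt'), recursive=True) is another option that stays closer to the original glob call.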