Pass a directory into a variable in Python
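I am trying to modify a script from GitHub that accesses a TAR file and processes it. The code has a variable that needs to point to the root directory where the files are located (I think...). Here is the code: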
    def make_Dictionary(root_dir):
        emails_dirs = [os.path.join(root_dir,f) for f in os.listdir(root_dir)]
        all_words = []
        for emails_dir in emails_dirs:
            emails = [os.path.join(emails_dir,f) for f in os.listdir(emails_dir)]
            for mail in emails:
                with open(mail) as m:
                    for line in m:
                        words = line.split()
                        all_words += words
        dictionary = Counter(all_words)
        list_to_remove = dictionary.keys()
        for item in list_to_remove:
            if item.isalpha() == False:
                del dictionary[item]
            elif len(item) == 1:
                del dictionary[item]
        dictionary = dictionary.most_common(4000)
        np.save('dict_movie.npy',dictionary)
        return dictionary

    root_dir = sys.path[0]
    dictionary = make_Dictionary(root_dir)
root_dir is throwing:
File "C:\Users\seand\eclipse-workspace\sentiment_project\src\root\nested\movie-polarity.py", line 22, in make_Dictionary
emails = [os.path.join(emails_dir,f) for f in os.listdir(emails_dir)]
NotADirectoryError: [WinError 267] The directory name is invalid: 'C:\Users\seand\eclipse-workspace\sentiment_project\src\root\nested\movie-polarity-tfidf.py'
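The directions state "Note : Directory path of corpus in movie-polarity-tfidf.py and movie-polarity.py needs to be set accordingly." But the path I specified contains the corpus TAR file that the script needs. I don't understand why this .py file is being picked up if the script is looking for a directory.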
The list comprehension for emails_dirs is returning some entries that are not directories. This can probably be fixed with:
emails_dirs = [os.path.join(root_dir,f) for f in os.listdir(root_dir)
if os.path.isdir(os.path.join(root_dir,f))]
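To see why the filter has to test the joined path, here is a minimal sketch run against a throwaway directory. The helper name list_subdirs and the folder names pos and neg are made up for illustration; they are not taken from the original script.

```python
import os
import tempfile


def list_subdirs(root_dir):
    """Return full paths of the immediate subdirectories of root_dir."""
    return sorted(
        os.path.join(root_dir, name)
        for name in os.listdir(root_dir)  # bare names, files and dirs alike
        if os.path.isdir(os.path.join(root_dir, name))  # test the full path
    )


# Build a small layout: two folders plus a stray script, similar to the
# situation in the question (folder names here are hypothetical).
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "pos"))
    os.mkdir(os.path.join(root, "neg"))
    with open(os.path.join(root, "movie-polarity.py"), "w") as fh:
        fh.write("# stray script\n")

    subdirs = list_subdirs(root)
    names = [os.path.basename(p) for p in subdirs]
    print(names)  # the .py file is filtered out
```

With the filter in place, the script file never reaches the inner os.listdir call, so NotADirectoryError should no longer be raised for it.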
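In the first line of the function you call os.path.join(root_dir,f) on every entry, so emails_dirs is a list of full paths to everything in root_dir, files included, not only directories. When the loop reaches one of those file paths, os.listdir raises the exception.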
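os.listdir lists everything in a directory. That includes both files and directories. I assume you want only directories the first time (when building the emails_dirs list) and only files the second time (when building the emails list).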
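As an alternative sketch of the same idea, os.scandir yields entries that already know whether they are files or directories, so one pass can separate the two. The helper name split_entries is hypothetical and is not part of the original script.

```python
import os
import tempfile


def split_entries(path):
    """Return (directories, files) found directly inside path, as full paths."""
    dirs, files = [], []
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_dir():
                dirs.append(entry.path)   # entry.path already includes `path`
            elif entry.is_file():
                files.append(entry.path)
    return sorted(dirs), sorted(files)


# Small demonstration on a temporary directory (names are made up).
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "corpus"))
    with open(os.path.join(root, "notes.txt"), "w") as fh:
        fh.write("hello\n")

    found_dirs, found_files = split_entries(root)
    dir_names = [os.path.basename(p) for p in found_dirs]
    file_names = [os.path.basename(p) for p in found_files]
    print(dir_names, file_names)
```

Because entry.path is already the joined path, this avoids the mistake of testing a bare name against the current working directory.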
    import os
    import sys
    from collections import Counter

    import numpy as np

    def make_Dictionary(root_dir):
        # Keep only directories. os.listdir returns bare names, so test the joined path.
        emails_dirs = [os.path.join(root_dir, f) for f in os.listdir(root_dir)
                       if os.path.isdir(os.path.join(root_dir, f))]
        all_words = []
        for emails_dir in emails_dirs:
            # Keep only files, again testing the joined path.
            emails = [os.path.join(emails_dir, f) for f in os.listdir(emails_dir)
                      if os.path.isfile(os.path.join(emails_dir, f))]
            for mail in emails:
                with open(mail) as m:
                    for line in m:
                        words = line.split()
                        all_words += words
        dictionary = Counter(all_words)
        # Copy the keys first: deleting while iterating a live view fails in Python 3.
        list_to_remove = list(dictionary.keys())
        for item in list_to_remove:
            if not item.isalpha():
                del dictionary[item]
            elif len(item) == 1:
                del dictionary[item]
        dictionary = dictionary.most_common(4000)
        np.save('dict_movie.npy', dictionary)
        return dictionary

    # sys.path[0] is the directory of the running script; point this at the corpus directory instead if it lives elsewhere.
    root_dir = sys.path[0]
    dictionary = make_Dictionary(root_dir)