Making a list of the first parts of files with certain extensions in a directory in Python

Question

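I am trying to extract the first part of the names of files with a certain extension (.txt), and I would like to keep it as short as possible, ideally a single line: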

path = "/home/inputs"
text_files = [f for f in os.listdir("path") if f.endswith('.txt')]

print(text_files)
>['new_categorized.txt', 'new.txt', '2017_input.txt']

So up to here, it works. However, I cannot get this required list:

>['new_categorized', 'new', '2017_input']

I tried:

print(os.path.splitext(text_files[0])[0])
> new_categorized

But this way I lose the other file names. How can I get all of them?

Answer 1

For Python 3.4 and later, try the new pathlib module:

from pathlib import Path

print([path.stem for path in Path('/home/inputs').glob('*.txt')])

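Path.glob() achieves the same effect as os.listdir plus f.endswith('.txt'). Then, to get the part of each path after the last slash but before the extension, we simply use the .stem attribute on each path.

With your existing code, you "lose the other file names" because you only call os.path.splitext on text_files[0]. To do this on all of them, use a list comprehension: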

print([os.path.splitext(path)[0] for path in text_files])
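
As a minimal sketch of the two approaches in this answer, the following builds a temporary directory so it does not depend on /home/inputs existing. The file names are the ones from the question, plus a hypothetical notes.md added only to exercise the filter. The results are sorted because neither glob nor os.listdir guarantees any particular order.

```python
import os
import tempfile
from pathlib import Path

# Build a throwaway directory holding the example files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["new_categorized.txt", "new.txt", "2017_input.txt", "notes.md"]:
        Path(tmp, name).touch()

    # pathlib: glob filters by pattern, .stem drops the final extension.
    stems_pathlib = sorted(p.stem for p in Path(tmp).glob("*.txt"))

    # os.path: filter with endswith, then strip the extension with splitext.
    stems_os = sorted(
        os.path.splitext(f)[0] for f in os.listdir(tmp) if f.endswith(".txt")
    )

print(stems_pathlib)  # ['2017_input', 'new', 'new_categorized']
print(stems_os)       # ['2017_input', 'new', 'new_categorized']
```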

Answer 2

You need a small trick:

path = "/home/inputs"
text_files = ['.'.join(f.split('.')[:-1]) for f in os.listdir(path) if f.endswith('.txt')]

The trick is this:

'.'.join(f.split('.')[:-1])

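It first splits the file name on dots, then drops the last piece, then joins the remaining pieces back together with dots. This effectively removes the last dot and everything after it. A name with no dot at all would come back as an empty string, but the endswith('.txt') filter guarantees that every name here contains at least one dot.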
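
To see how the expression behaves on different inputs, here is a small sketch that wraps it in a helper. The name strip_last_extension and the sample file names are hypothetical, chosen only for illustration.

```python
def strip_last_extension(filename):
    # Hypothetical helper wrapping the split/join expression from this answer.
    return ".".join(filename.split(".")[:-1])

print(strip_last_extension("new.txt"))          # new
print(strip_last_extension("archive.tar.txt"))  # archive.tar
print(repr(strip_last_extension("README")))     # ''
```

Only the last extension is removed, so a name with several dots keeps its earlier ones.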

Answer 3

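I changed just 2 main things in your code. First, I pass path as a variable rather than as the string "path". Second, I use slicing to get the desired result.

So you can try something like this: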

>>> import os
>>> path = "/home/shashank"

>>> text_files = [f for f in os.listdir(path) if f.endswith('.txt')]
>>> text_files
['temp.txt', 'myfile.txt', 'angular.txt', 'y.txt']
>>>
>>> text_files = [f[:-4] for f in os.listdir(path) if f.endswith('.txt')]
>>> text_files
['temp', 'myfile', 'angular', 'y']
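
The slice f[:-4] hard-codes the length of '.txt'. A sketch of the same idea for an arbitrary extension could derive the slice from len(ext) instead. The helper name strip_suffix and the sample names are assumptions made for this example, and ext is assumed to be non-empty.

```python
def strip_suffix(names, ext):
    # Keep only names ending in ext, then cut exactly len(ext) characters.
    # Assumes ext is non-empty: name[:-0] would return an empty string.
    return [name[:-len(ext)] for name in names if name.endswith(ext)]

names = ["temp.txt", "myfile.txt", "photo.jpeg", "y.txt"]
print(strip_suffix(names, ".txt"))   # ['temp', 'myfile', 'y']
print(strip_suffix(names, ".jpeg"))  # ['photo']
```

On Python 3.9 and later, str.removesuffix performs the same trimming without any length arithmetic.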

Answer 4

If you want it as short as possible, use the map function with a lambda expression:

print(list(map(lambda f: os.path.splitext(f)[0], text_files)))

Answer 5

You can do it like this:

[f.split(".")[0] for f in os.listdir(path) if f.endswith('.txt')]
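
One thing worth checking with this approach is how it treats names that contain more than one dot. The sketch below compares it with os.path.splitext. The file name report.v2.txt is a made-up example.

```python
import os

names = ["new.txt", "report.v2.txt"]

# split(".")[0] keeps only the text before the FIRST dot.
first_part = [f.split(".")[0] for f in names]

# os.path.splitext removes only the LAST extension.
without_ext = [os.path.splitext(f)[0] for f in names]

print(first_part)   # ['new', 'report']
print(without_ext)  # ['new', 'report.v2']
```

For the file names in the question both give the same result, so the choice only matters when names may contain extra dots.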

Answer 6

A purely functional approach is also possible:

import os

text_files = ['new_categorized.txt', 'new.txt', '2017_input.txt']
list(zip(*map(os.path.splitext, text_files)))[0]

# ('new_categorized', 'new', '2017_input')

Note that the output here is a tuple rather than a list.
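
If a list is required, one possible variation is to unpack the zipped result and convert the first part. This is only a sketch: the variable names stems and extensions are chosen for illustration, and the unpacking assumes text_files is not empty.

```python
import os

text_files = ['new_categorized.txt', 'new.txt', '2017_input.txt']

# zip(*...) transposes the (root, ext) pairs into two tuples.
# Unpacking raises ValueError if text_files is empty.
stems, extensions = zip(*map(os.path.splitext, text_files))

stem_list = list(stems)
print(stem_list)   # ['new_categorized', 'new', '2017_input']
print(extensions)  # ('.txt', '.txt', '.txt')
```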
