How can I keep this function from producing a list of lists within a list of lists?

Question

I have this code segment working through a rather ugly data file that contains data I need to store in a tidy form.

That is, the data file will contain:

...
...
...
alphabetical text
13 42 54 67
31 12
different alphabetical text
25 41 23 76
98 45 38
...
...
...

I need this written into a list of lists, like so:

[..., [13, 42, 54, 67, 31, 12], [25, 41, 23, 76, 98, 45, 38] ...]

I currently have this code:

if next_line[0].isalpha == True and line[0] == '1' or line[0] == '2' or line[0] == '3' or line[0] == '4' or line[0] == '5' or line[0] == '6' or line[0] == '7' or line[0] == '8' or line[0] == '9': #pardon my hard coding
    h = line.split()
    self.distances.append(h)
else:
    line_queue = []
    num_list = []
    for j in range(i, len(self.datlines)):
        check_line = self.datlines[j]
        if j != len(self.datlines)-1:
            next_check = self.datlines[j+1]
        if check_line[0] == '1' or check_line[0] == '2' or check_line[0] == '3' or check_line[0] == '4' or check_line[0] == '5' or check_line[0] == '6' or check_line[0] == '7' or check_line[0] == '8' or check_line[0] == '9':
            h = check_line.split()
            line_queue.append(h)
            for s in line_queue:
                if s != ' ' and s != '\n':
                    num_list.append(s)
            self.distances.append(num_list)
        if check_line[0].isalpha() == True:
            break

What it occasionally gives me is a list of lists like this:

[..., [13, 42, 54, 67, 31, 12], [[25, 41, 23, 76, 98, 45, 38]] ...]

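I have gone over it again and again, but I can't find where it introduces the extra layer of lists.

What exactly is causing this, and how can I fix it?

Many thanks.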

Answer 1

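You don't need separate line and next_line variables. Just iterate over the lines, extending the current list of numbers whenever the line is numeric. When you reach a line of alphabetical text, append that list to the result list and start a new, empty list of numbers.

Append the last list of numbers at the end, in case the file does not end with an alphabetical line.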

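curlist = []            # numbers collected since the last text line
self.distances = []

for line in self.datlines:
    if line[0].isalpha():
        # A text line closes the current group; skip empty groups.
        if curlist:
            self.distances.append(curlist)
        curlist = []
    else:
        # extend() adds the tokens themselves, so no nested list is created.
        # Tokens stay strings; use map(int, line.split()) if integers are needed.
        curlist.extend(line.split())

# Append the final group in case the file does not end with a text line.
if curlist:
    self.distances.append(curlist)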
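
The extra layer in the question's code most likely comes from using append where extend is needed: str.split() returns a list, and list.append adds that whole list as a single element, whereas list.extend adds its items one by one. The sketch below isolates that difference; the names first, second, nested and flat are made up for illustration and do not appear in the original code.

```python
# Hypothetical sample: tokens from two consecutive numeric lines.
first = "25 41 23 76".split()
second = "98 45 38".split()

nested = []
nested.append(first)    # append stores the list itself as ONE element
nested.append(second)

flat = []
flat.extend(first)      # extend stores each token individually
flat.extend(second)

print(nested)  # [['25', '41', '23', '76'], ['98', '45', '38']]
print(flat)    # ['25', '41', '23', '76', '98', '45', '38']
```

Appending nested to an outer list gives the doubly nested shape from the question, while appending flat gives the desired one.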

Answer 2

Here is another approach, which uses a regular expression to read and capture all the numbers in the file in one pass:

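import re

with open(path_to_data_file, "r") as f:
    # Skip an optional text line, then capture the digits and spaces after the newline.
    lines = re.findall(r"[a-zA-Z ]*\n([ \d]*)", f.read())

# An empty capture means the next line is text, so it starts a new group.
cleaned_lines = [[]]
for line in lines:
    if line:
        cleaned_lines[-1].extend(map(int, line.split()))
    else:
        cleaned_lines.append([])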

With this text file:

alphabetical text
13 42 54 67
31 12
different alphabetical text
25 41 23 76
98 45 38
hello world
532 15 52
5225 321 4789
999 999 999

Output:

[[13, 42, 54, 67, 31, 12],
 [25, 41, 23, 76, 98, 45, 38],
 [532, 15, 52, 5225, 321, 4789, 999, 999, 999]]

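I think there is actually a regex pattern that would capture each group of numbers separately, but I haven't figured it out yet. For now, the pattern returns a result like this: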

['13 42 54 67',
 '31 12',
 '',
 '25 41 23 76',
 '98 45 38',
 '',
 '532 15 52',
 '5225 321 4789',
 '999 999 999']

I then split it on the empty strings.
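
One candidate for a pattern that captures each group in a single match is sketched below. It is not the answerer's pattern, only an assumption about what could work: with re.MULTILINE, it matches runs of consecutive lines that contain nothing but digits and spaces. The sample text is the file shown above, and the names text, blocks and groups are illustrative.

```python
import re

# Sample text from the answer above (no trailing newline).
text = """alphabetical text
13 42 54 67
31 12
different alphabetical text
25 41 23 76
98 45 38
hello world
532 15 52
5225 321 4789
999 999 999"""

# With re.MULTILINE, ^ and $ anchor at line boundaries, so each match is a
# run of consecutive lines made up only of digits and spaces.
blocks = re.findall(r"(?:^[ \d]+$\n?)+", text, flags=re.MULTILINE)

# split() with no argument splits on any whitespace, including newlines.
groups = [[int(tok) for tok in block.split()] for block in blocks]
print(groups)
```

Because every match already spans one whole group, no empty-string separators are produced and the second loop is not needed.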

python

list