如何制作未知级别的嵌套列表,然后将其组合为单独的列表

how to make an unknown level nested list and then get combinations of it as separated lists

我想获取未知级别的嵌套列表的所有不同组合。

我在寻找解决方案,但所有类似的问题都不是我要找的。

考虑到水平可能比这更深,例如对于 5 或​​ 6 级内部。

主要问题是为 CKY 算法实现一个反向指针以获得所有可能的语法树,这是一个真正未知级别的嵌套列表!

我有一个反向指针:

backpointer = {
    (0, 2, 'NP'): {
        (1, 'AD', 'NP')
    }, 
    (1, 3, 'X1'): {
        (2, 'NP', 'PA')
    }, 
    (1, 3, 'NP'): {
        (2, 'NP', 'NP')
    }, (0, 3, 'X1'): {
        (2, 'NP', 'PA'), 
        (1, 'DT', 'NP')
    }, 
    (2, 4, 'X2'): {
        (3, 'PA', 'VP')
    }, 
    (1, 4, 'S'): {
        (2, 'NP', 'X2'), 
        (3, 'X1', 'VP')
    }, 
    (0, 4, 'S'): {
        (2, 'NP', 'X2'), 
        (3, 'X1', 'VP')
    }
}

我从 (0, 4, 'S') 考虑所有可能的方式后退。

我现在的输出是这样的,没有分类。:

[
    (0, 4, 'S'), (0, 3, 'X1'), (0, 2, 'NP'), (0, 1, 'AD'), (1, 2, 'NP'), (2, 3, 'PA'), 
    (0, 1, 'DT'), (1, 3, 'NP'), (1, 2, 'NP'), (2, 3, 'NP'), (3, 4, 'VP'), (0, 2, 'NP'), 
    (0, 1, 'AD'), (1, 2, 'NP'), (2, 4, 'X2'), (2, 3, 'PA'), (3, 4, 'VP')
]

我正在尝试将其作为如下嵌套列表进行分类

[
    (0, 4, 'S'), 
    [
        (0, 2, 'NP'), (2, 4, 'X2'), (0, 1, 'AD'), (1, 2, 'NP'), (2, 3, 'PA'), (3, 4, 'VP')
    ], 
    [
        (0, 3, 'X1'), 
        (3, 4, 'VP'), 
        [
            (0, 2, 'NP'), (2, 3, 'PA'), (0, 1, 'AD'), (1, 2, 'NP')
        ], 
        [
            (0, 1, 'AD'), (1, 3, 'NP'), (1, 2, 'NP'), (2, 3, 'NP')
        ]
    ]
]

然后将其作为每个可能的唯一树的一些列表显示给用户。

[
    (0, 4, 'S'), (0, 2, 'NP'), (2, 4, 'X2'), (0, 1, 'AD'), 
    (1, 2, 'NP'), (2, 3, 'PA'), (3, 4, 'VP')
]

[
    (0, 4, 'S'), (0, 3, 'X1'),(3, 4, 'VP'), (0, 2, 'NP'), 
    (2, 3, 'PA'), (0, 1, 'AD'), (1, 2, 'NP')
]

[
    (0, 4, 'S'), (0, 3, 'X1'),(3, 4, 'VP'), (0, 1, 'AD'), 
    (1, 3, 'NP'), (1, 2, 'NP'), (2, 3, 'NP')
]

IIUC,您可以先为 "un-nest" 嵌套的 list 编写一个递归生成器。这是一个快速而肮脏的方法1:

def unnest(lst, append=False):
    chunk = []
    for x in lst:
        if isinstance(x, list):
            if chunk:
                yield chunk
            yield from unnest(x, True)
            chunk = []
        else:
            if append:
                chunk.append(x)
            else:
                yield [x]
    if chunk:
        yield chunk

lst = [0, 1, [2, 3, 4, [5, 6]], 7, [8, 9]]  # per original question
print(list(unnest(lst)))
#[[0], [1], [2, 3, 4], [5, 6], [7], [8, 9]]

现在使用 itertools.product 获得所需的元素组合:

from itertools import product
print(list(product(*unnest(lst))))
#[(0, 1, 2, 5, 7, 8),
# (0, 1, 2, 5, 7, 9),
# (0, 1, 2, 6, 7, 8),
# (0, 1, 2, 6, 7, 9),
# (0, 1, 3, 5, 7, 8),
# (0, 1, 3, 5, 7, 9),
# (0, 1, 3, 6, 7, 8),
# (0, 1, 3, 6, 7, 9),
# (0, 1, 4, 5, 7, 8),
# (0, 1, 4, 5, 7, 9),
# (0, 1, 4, 6, 7, 8),
# (0, 1, 4, 6, 7, 9)]

备注:

  1. yield from 仅适用于 python 3

它运行良好,无需创建嵌套列表。直接从根检索所有可能的树。

backpointer = {}
rules_prob = {}
syn_tags = []
syn_probs = []

def get_syn_tree(bp, ix):

    if bp[1] - bp[0] == 1:  # end - start = 1   (2, 3, N)
        return

    current_ix = ix
    for i in range(len(backpointer[bp])-1):
        syn_tags.append(syn_tags[current_ix].copy())

    counter = 0
    for item in list(backpointer[bp]):
        # (0, 6, S) -> (4, N,VP) =>     (0, 4, N) , (4, 6, VP)
        syn_tags[current_ix + counter].add((bp[0], item[0], item[1]))
        syn_tags[current_ix + counter].add((item[0], bp[1], item[2]))

        get_syn_tree((bp[0], item[0], item[1]), current_ix + counter)
        get_syn_tree((item[0], bp[1], item[2]), current_ix + counter)

        counter += 1

syn_tags.append({(start, end, A)})
syn_probs.append(0)  # for + = 0, for × = 1
get_syn_tree((start, end, A), 0)