如何制作未知级别的嵌套列表,然后将其组合为单独的列表
how to make an unknown level nested list and then get combinations of it as separated lists
我想获取未知级别的嵌套列表的所有不同组合。
我在寻找解决方案,但所有类似的问题都不是我要找的。
考虑到水平可能比这更深,例如对于 5 或 6 级内部。
主要问题是为 CKY 算法实现一个反向指针以获得所有可能的语法树,这是一个真正未知级别的嵌套列表!
我有一个反向指针:
backpointer = {
(0, 2, 'NP'): {
(1, 'AD', 'NP')
},
(1, 3, 'X1'): {
(2, 'NP', 'PA')
},
(1, 3, 'NP'): {
(2, 'NP', 'NP')
}, (0, 3, 'X1'): {
(2, 'NP', 'PA'),
(1, 'DT', 'NP')
},
(2, 4, 'X2'): {
(3, 'PA', 'VP')
},
(1, 4, 'S'): {
(2, 'NP', 'X2'),
(3, 'X1', 'VP')
},
(0, 4, 'S'): {
(2, 'NP', 'X2'),
(3, 'X1', 'VP')
}
}
我从 (0, 4, 'S') 考虑所有可能的方式后退。
我现在的输出是这样的,没有分类。:
[
(0, 4, 'S'), (0, 3, 'X1'), (0, 2, 'NP'), (0, 1, 'AD'), (1, 2, 'NP'), (2, 3, 'PA'),
(0, 1, 'DT'), (1, 3, 'NP'), (1, 2, 'NP'), (2, 3, 'NP'), (3, 4, 'VP'), (0, 2, 'NP'),
(0, 1, 'AD'), (1, 2, 'NP'), (2, 4, 'X2'), (2, 3, 'PA'), (3, 4, 'VP')
]
我正在尝试将其作为如下嵌套列表进行分类
[
(0, 4, 'S'),
[
(0, 2, 'NP'), (2, 4, 'X2'), (0, 1, 'AD'), (1, 2, 'NP'), (2, 3, 'PA'), (3, 4, 'VP')
],
[
(0, 3, 'X1'),
(3, 4, 'VP'),
[
(0, 2, 'NP'), (2, 3, 'PA'), (0, 1, 'AD'), (1, 2, 'NP')
],
[
(0, 1, 'AD'), (1, 3, 'NP'), (1, 2, 'NP'), (2, 3, 'NP')
]
]
]
然后将其作为每个可能的唯一树的一些列表显示给用户。
[
(0, 4, 'S'), (0, 2, 'NP'), (2, 4, 'X2'), (0, 1, 'AD'),
(1, 2, 'NP'), (2, 3, 'PA'), (3, 4, 'VP')
]
[
(0, 4, 'S'), (0, 3, 'X1'),(3, 4, 'VP'), (0, 2, 'NP'),
(2, 3, 'PA'), (0, 1, 'AD'), (1, 2, 'NP')
]
[
(0, 4, 'S'), (0, 3, 'X1'),(3, 4, 'VP'), (0, 1, 'AD'),
(1, 3, 'NP'), (1, 2, 'NP'), (2, 3, 'NP')
]
IIUC,您可以先为 "un-nest" 嵌套的 list
编写一个递归生成器。这是一个快速而肮脏的方法1:
def unnest(lst, append=False):
chunk = []
for x in lst:
if isinstance(x, list):
if chunk:
yield chunk
yield from unnest(x, True)
chunk = []
else:
if append:
chunk.append(x)
else:
yield [x]
if chunk:
yield chunk
lst = [0, 1, [2, 3, 4, [5, 6]], 7, [8, 9]] # per original question
print(list(unnest(lst)))
#[[0], [1], [2, 3, 4], [5, 6], [7], [8, 9]]
现在使用 itertools.product
获得所需的元素组合:
from itertools import product
print(list(product(*unnest(lst))))
#[(0, 1, 2, 5, 7, 8),
# (0, 1, 2, 5, 7, 9),
# (0, 1, 2, 6, 7, 8),
# (0, 1, 2, 6, 7, 9),
# (0, 1, 3, 5, 7, 8),
# (0, 1, 3, 5, 7, 9),
# (0, 1, 3, 6, 7, 8),
# (0, 1, 3, 6, 7, 9),
# (0, 1, 4, 5, 7, 8),
# (0, 1, 4, 5, 7, 9),
# (0, 1, 4, 6, 7, 8),
# (0, 1, 4, 6, 7, 9)]
备注:
yield from
仅适用于 python 3
它运行良好,无需创建嵌套列表。直接从根检索所有可能的树。
backpointer = {}
rules_prob = {}
syn_tags = []
syn_probs = []
def get_syn_tree(bp, ix):
if bp[1] - bp[0] == 1: # end - start = 1 (2, 3, N)
return
current_ix = ix
for i in range(len(backpointer[bp])-1):
syn_tags.append(syn_tags[current_ix].copy())
counter = 0
for item in list(backpointer[bp]):
# (0, 6, S) -> (4, N,VP) => (0, 4, N) , (4, 6, VP)
syn_tags[current_ix + counter].add((bp[0], item[0], item[1]))
syn_tags[current_ix + counter].add((item[0], bp[1], item[2]))
get_syn_tree((bp[0], item[0], item[1]), current_ix + counter)
get_syn_tree((item[0], bp[1], item[2]), current_ix + counter)
counter += 1
syn_tags.append({(start, end, A)})
syn_probs.append(0) # for + = 0, for × = 1
get_syn_tree((start, end, A), 0)
我想获取未知级别的嵌套列表的所有不同组合。
我在寻找解决方案,但所有类似的问题都不是我要找的。
考虑到水平可能比这更深,例如对于 5 或 6 级内部。
主要问题是为 CKY 算法实现一个反向指针以获得所有可能的语法树,这是一个真正未知级别的嵌套列表!
我有一个反向指针:
backpointer = {
(0, 2, 'NP'): {
(1, 'AD', 'NP')
},
(1, 3, 'X1'): {
(2, 'NP', 'PA')
},
(1, 3, 'NP'): {
(2, 'NP', 'NP')
}, (0, 3, 'X1'): {
(2, 'NP', 'PA'),
(1, 'DT', 'NP')
},
(2, 4, 'X2'): {
(3, 'PA', 'VP')
},
(1, 4, 'S'): {
(2, 'NP', 'X2'),
(3, 'X1', 'VP')
},
(0, 4, 'S'): {
(2, 'NP', 'X2'),
(3, 'X1', 'VP')
}
}
我从 (0, 4, 'S') 考虑所有可能的方式后退。
我现在的输出是这样的,没有分类。:
[
(0, 4, 'S'), (0, 3, 'X1'), (0, 2, 'NP'), (0, 1, 'AD'), (1, 2, 'NP'), (2, 3, 'PA'),
(0, 1, 'DT'), (1, 3, 'NP'), (1, 2, 'NP'), (2, 3, 'NP'), (3, 4, 'VP'), (0, 2, 'NP'),
(0, 1, 'AD'), (1, 2, 'NP'), (2, 4, 'X2'), (2, 3, 'PA'), (3, 4, 'VP')
]
我正在尝试将其作为如下嵌套列表进行分类
[
(0, 4, 'S'),
[
(0, 2, 'NP'), (2, 4, 'X2'), (0, 1, 'AD'), (1, 2, 'NP'), (2, 3, 'PA'), (3, 4, 'VP')
],
[
(0, 3, 'X1'),
(3, 4, 'VP'),
[
(0, 2, 'NP'), (2, 3, 'PA'), (0, 1, 'AD'), (1, 2, 'NP')
],
[
(0, 1, 'AD'), (1, 3, 'NP'), (1, 2, 'NP'), (2, 3, 'NP')
]
]
]
然后将其作为每个可能的唯一树的一些列表显示给用户。
[
(0, 4, 'S'), (0, 2, 'NP'), (2, 4, 'X2'), (0, 1, 'AD'),
(1, 2, 'NP'), (2, 3, 'PA'), (3, 4, 'VP')
]
[
(0, 4, 'S'), (0, 3, 'X1'),(3, 4, 'VP'), (0, 2, 'NP'),
(2, 3, 'PA'), (0, 1, 'AD'), (1, 2, 'NP')
]
[
(0, 4, 'S'), (0, 3, 'X1'),(3, 4, 'VP'), (0, 1, 'AD'),
(1, 3, 'NP'), (1, 2, 'NP'), (2, 3, 'NP')
]
IIUC,您可以先为 "un-nest" 嵌套的 list
编写一个递归生成器。这是一个快速而肮脏的方法1:
def unnest(lst, append=False):
chunk = []
for x in lst:
if isinstance(x, list):
if chunk:
yield chunk
yield from unnest(x, True)
chunk = []
else:
if append:
chunk.append(x)
else:
yield [x]
if chunk:
yield chunk
lst = [0, 1, [2, 3, 4, [5, 6]], 7, [8, 9]] # per original question
print(list(unnest(lst)))
#[[0], [1], [2, 3, 4], [5, 6], [7], [8, 9]]
现在使用 itertools.product
获得所需的元素组合:
from itertools import product
print(list(product(*unnest(lst))))
#[(0, 1, 2, 5, 7, 8),
# (0, 1, 2, 5, 7, 9),
# (0, 1, 2, 6, 7, 8),
# (0, 1, 2, 6, 7, 9),
# (0, 1, 3, 5, 7, 8),
# (0, 1, 3, 5, 7, 9),
# (0, 1, 3, 6, 7, 8),
# (0, 1, 3, 6, 7, 9),
# (0, 1, 4, 5, 7, 8),
# (0, 1, 4, 5, 7, 9),
# (0, 1, 4, 6, 7, 8),
# (0, 1, 4, 6, 7, 9)]
备注:
yield from
仅适用于 python 3
它运行良好,无需创建嵌套列表。直接从根检索所有可能的树。
backpointer = {}
rules_prob = {}
syn_tags = []
syn_probs = []
def get_syn_tree(bp, ix):
if bp[1] - bp[0] == 1: # end - start = 1 (2, 3, N)
return
current_ix = ix
for i in range(len(backpointer[bp])-1):
syn_tags.append(syn_tags[current_ix].copy())
counter = 0
for item in list(backpointer[bp]):
# (0, 6, S) -> (4, N,VP) => (0, 4, N) , (4, 6, VP)
syn_tags[current_ix + counter].add((bp[0], item[0], item[1]))
syn_tags[current_ix + counter].add((item[0], bp[1], item[2]))
get_syn_tree((bp[0], item[0], item[1]), current_ix + counter)
get_syn_tree((item[0], bp[1], item[2]), current_ix + counter)
counter += 1
syn_tags.append({(start, end, A)})
syn_probs.append(0) # for + = 0, for × = 1
get_syn_tree((start, end, A), 0)