Generate all combinations from a nested Python dictionary and segregate them

My sample dictionary is:

sample_dict = {
    'company': {
        'employee': {
            'name': [
                {'explore': ["noname"],
                 'valid': ["john","tom"],
                 'boundary': ["aaaaaaaaaa"],
                 'negative': ["$"]}],
            'age': [
                {'explore': [200],
                 'valid': [20,30],
                 'boundary': [1,99],
                 'negative': [-1,100]}],
            'others':{
                'grade':[
                    {'explore': ["star"],
                     'valid': ["A","B"],
                     'boundary': ["C"],
                     'negative': ["AB"]}]}
        }
    }
}

This is a "follow-on" to an earlier question.
I want to get segregated lists of combinations, as follows:

Valid combinations: [generated only from the valid data lists]
Complete output for the VALID category:

{'company': {'employee': {'age': 20}, 'name': 'john', 'others': {'grade': 'A'}}}
{'company': {'employee': {'age': 20}, 'name': 'john', 'others': {'grade': 'B'}}}
{'company': {'employee': {'age': 20}, 'name': 'tom', 'others': {'grade': 'A'}}} 
{'company': {'employee': {'age': 20}, 'name': 'tom', 'others': {'grade': 'B'}}} 
{'company': {'employee': {'age': 30}, 'name': 'john', 'others': {'grade': 'A'}}}
{'company': {'employee': {'age': 30}, 'name': 'john', 'others': {'grade': 'B'}}}
{'company': {'employee': {'age': 30}, 'name': 'tom', 'others': {'grade': 'A'}}} 
{'company': {'employee': {'age': 30}, 'name': 'tom', 'others': {'grade': 'B'}}}

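Negative combinations: [this is a bit tricky, because negative values should be combined with the "valid" pool, with at least one negative value in each combination]
Expected complete output for the NEGATIVE category:
=> [basically, exclude every combination in which all values are valid, so that at least one value in each combination comes from the negative group]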

{'company': {'employee': {'age': 20}, 'name': 'john', 'others': {'grade': 'AB'}}}
{'company': {'employee': {'age': -1}, 'name': 'tom', 'others': {'grade': 'A'}}}
{'company': {'employee': {'age': 100}, 'name': 'john', 'others': {'grade': 'A'}}}
{'company': {'employee': {'age': 30}, 'name': '$', 'others': {'grade': 'A'}}}
{'company': {'employee': {'age': 30}, 'name': '$', 'others': {'grade': 'AB'}}}
{'company': {'employee': {'age': -1}, 'name': '$', 'others': {'grade': 'AB'}}}
{'company': {'employee': {'age': 100}, 'name': '$', 'others': {'grade': 'AB'}}}

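In the output above, the first line tests the negative grade value AB while keeping everything else valid. There is therefore no need to generate the same case again with age 30, because the intent is to test only the negative set. Any valid data will do for the remaining parameters.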


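Boundary combinations: similar to valid -> combinations of all the values in the boundary pool
Explore: similar to negative - mixed with the valid pool, with at least one explore value in every combination.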

Sample dictionary - revised version

sample_dict2 = {
    'company': {
        'employee_list': [
            {'employee': {'age': [{'boundary': [1,99],
                                   'explore': [200],
                                   'negative': [-1,100],
                                   'valid': [20, 30]}],
                          'name': [{'boundary': ['aaaaaaaaaa'],
                                    'explore': ['noname'],
                                    'negative': ['$'],
                                    'valid': ['john','tom']}],
                          'others': {
                              'grade': [
                                  {'boundary': ['C'],
                                   'explore': ['star'],
                                   'negative': ['AB'],
                                   'valid': ['A','B']},
                                  {'boundary': ['C'],
                                   'explore': ['star'],
                                   'negative': ['AB'],
                                   'valid': ['A','B']}]}}},
            {'employee': {'age': [{'boundary': [1, 99],
                                   'explore': [200],
                                   'negative': [],
                                   'valid': [20, 30]}],
                          'name': [{'boundary': [],
                                    'explore': [],
                                    'negative': ['$'],
                                    'valid': ['john', 'tom']}],
                          'others': {
                              'grade': [
                                  {'boundary': ['C'],
                                   'explore': ['star'],
                                   'negative': [],
                                   'valid': ['A', 'B']},
                                  {'boundary': [],
                                   'explore': ['star'],
                                   'negative': ['AB'],
                                   'valid': ['A', 'B']}]}}}
        ]
    }
}

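sample_dict2 contains a list of dictionaries. Here the whole "employee" hierarchy is a list element, and the leaf node "grade" is also a list.
Also, apart from "valid" and "boundary", the other data sets can be empty - [] - and we need to handle those too.
The valid combinations would look like: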

{'company': {'employee_list':[{'employee': {'age': 20}, 'name': 'john', 'others': {'grade': ['A','A']}},{'employee': {'age': 1}, 'name': 'john', 'others': {'grade': ['A','A']}}]}}
{'company': {'employee_list':[{'employee': {'age': 20}, 'name': 'john', 'others': {'grade': ['A','A']}},{'employee': {'age': 1}, 'name': 'john', 'others': {'grade': ['A','B']}}]}}
{'company': {'employee_list':[{'employee': {'age': 20}, 'name': 'john', 'others': {'grade': ['A','A']}},{'employee': {'age': 1}, 'name': 'tom', 'others': {'grade': ['A','A']}}]}}
{'company': {'employee_list':[{'employee': {'age': 20}, 'name': 'john', 'others': {'grade': ['A','A']}},{'employee': {'age': 1}, 'name': 'tom', 'others': {'grade': ['A','B']}}]}}
{'company': {'employee_list':[{'employee': {'age': 20}, 'name': 'john', 'others': {'grade': ['A','B']}},{'employee': {'age': 1}, 'name': 'john', 'others': {'grade': ['A','A']}}]}}
{'company': {'employee_list':[{'employee': {'age': 20}, 'name': 'john', 'others': {'grade': ['A','B']}},{'employee': {'age': 1}, 'name': 'john', 'others': {'grade': ['A','B']}}]}}
{'company': {'employee_list':[{'employee': {'age': 20}, 'name': 'john', 'others': {'grade': ['A','B']}},{'employee': {'age': 1}, 'name': 'tom', 'others': {'grade': ['A','A']}}]}}
{'company': {'employee_list':[{'employee': {'age': 20}, 'name': 'john', 'others': {'grade': ['A','B']}},{'employee': {'age': 1}, 'name': 'tom', 'others': {'grade': ['A','B']}}]}}

plus the combinations with age=30 and name=tom in employee index 0

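This is an open-ended hornet's nest of a question.

  1. Look at the white papers for Agitar's other tools to see whether this is what you have in mind.

  2. Look at Knuth's work on combinatorics. It is hard to read.

  3. Consider just writing a recursive-descent generator that uses 'yield'.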
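
As a rough illustration of the 'yield' suggestion, here is a minimal sketch of a recursive generator that walks the nested structure and lazily yields one combination at a time. The function name iter_combinations, its pool parameter, and the small spec dictionary are hypothetical, introduced only for this example; it assumes leaf dictionaries carry their candidate values under a key such as 'valid'.

```python
import itertools

def iter_combinations(thing, pool="valid"):
    """Lazily yield every combination built from the `pool` lists in `thing`."""
    if isinstance(thing, dict):
        if pool in thing:
            # Leaf dictionary: yield each candidate value from the chosen pool.
            yield from thing[pool]
        else:
            # Branch dictionary: pick one sub-result per key, in every possible way.
            keys = list(thing)
            choices = [list(iter_combinations(thing[key], pool)) for key in keys]
            for combo in itertools.product(*choices):
                yield dict(zip(keys, combo))
    elif isinstance(thing, (list, tuple)):
        # List: each element contributes its own results to a flat stream.
        for element in thing:
            yield from iter_combinations(element, pool)
    else:
        raise TypeError("Unexpected type")

# Hypothetical, cut-down input in the same shape as sample_dict.
spec = {
    'person': {
        'name': [{'valid': ['john', 'tom'], 'negative': ['$']}],
        'age': [{'valid': [20], 'negative': [-1]}],
    }
}

combos = list(iter_combinations(spec))
for combo in combos:
    print(combo)
# {'person': {'name': 'john', 'age': 20}}
# {'person': {'name': 'tom', 'age': 20}}
```

The caller can stop iterating early, which matters when the full product is large. Note that the per-key choices are still materialized as lists before being handed to itertools.product.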
import itertools

def generate_combinations(thing, positive="valid", negative=None):

    """ Generate all possible combinations, walking and mimicking structure of "thing" """

    if isinstance(thing, dict):  # if dictionary, distinguish between two types of dictionary
        if positive in thing:
            return thing[positive] if negative is None else [thing[positive][0]] + thing[negative]
        else:
            results = []
            for key, value in thing.items():  # generate all possible key: value combinations
                subresults = []
                for result in generate_combinations(value, positive, negative):
                    subresults.append((key, result))
                results.append(subresults)
            return [dict(result) for result in itertools.product(*results)]

    elif isinstance(thing, list) or isinstance(thing, tuple):  # added tuple just to be safe
        results = []
        for element in thing:  # generate recursive result sets for each element of list
            for result in generate_combinations(element, positive, negative):
                results.append(result)
        return results

    else:  # not a type we know how to handle
        raise TypeError("Unexpected type")


def generate_invalid_combinations(thing):

    """ Generate all possible combinations and weed out the valid ones """

    valid = generate_combinations(thing)

    return [result for result in generate_combinations(thing, negative='negative') if result not in valid]


def generate_boundary_combinations(thing):

    """ Generate all possible boundary combinations """

    return generate_combinations(thing, positive="boundary")


def generate_explore_combinations(thing):

    """ Generate all possible explore combinations and weed out the valid ones """

    valid = generate_combinations(thing)

    return [result for result in generate_combinations(thing, negative='explore') if result not in valid]

Calling generate_combinations(sample_dict) returns:

[
{'company': {'employee': {'age': 20, 'name': 'john', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 20, 'name': 'john', 'others': {'grade': 'B'}}}},
{'company': {'employee': {'age': 20, 'name': 'tom', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 20, 'name': 'tom', 'others': {'grade': 'B'}}}},
{'company': {'employee': {'age': 30, 'name': 'john', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 30, 'name': 'john', 'others': {'grade': 'B'}}}},
{'company': {'employee': {'age': 30, 'name': 'tom', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 30, 'name': 'tom', 'others': {'grade': 'B'}}}}
]

Calling generate_invalid_combinations(sample_dict) returns:

[
{'company': {'employee': {'age': 20, 'name': 'john', 'others': {'grade': 'AB'}}}},
{'company': {'employee': {'age': 20, 'name': '$', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 20, 'name': '$', 'others': {'grade': 'AB'}}}},
{'company': {'employee': {'age': -1, 'name': 'john', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': -1, 'name': 'john', 'others': {'grade': 'AB'}}}},
{'company': {'employee': {'age': -1, 'name': '$', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': -1, 'name': '$', 'others': {'grade': 'AB'}}}},
{'company': {'employee': {'age': 100, 'name': 'john', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 100, 'name': 'john', 'others': {'grade': 'AB'}}}},
{'company': {'employee': {'age': 100, 'name': '$', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 100, 'name': '$', 'others': {'grade': 'AB'}}}}
]

Calling generate_boundary_combinations(sample_dict) returns:

[
{'company': {'employee': {'age': 1, 'name': 'aaaaaaaaaa', 'others': {'grade': 'C'}}}},
{'company': {'employee': {'age': 99, 'name': 'aaaaaaaaaa', 'others': {'grade': 'C'}}}}
]

Calling generate_explore_combinations(sample_dict) returns:

[
{'company': {'employee': {'age': 20, 'name': 'john', 'others': {'grade': 'star'}}}},
{'company': {'employee': {'age': 20, 'name': 'noname', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 20, 'name': 'noname', 'others': {'grade': 'star'}}}},
{'company': {'employee': {'age': 200, 'name': 'john', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 200, 'name': 'john', 'others': {'grade': 'star'}}}},
{'company': {'employee': {'age': 200, 'name': 'noname', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 200, 'name': 'noname', 'others': {'grade': 'star'}}}}
]
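
The counts above follow from simple arithmetic, sketched here on a flattened copy of the pools. The flat pools layout and the variable names are assumptions made for illustration, not part of the solution. Each field offers its first valid value plus all of its negative values, so the product has 2 x 3 x 2 = 12 entries, and removing the single all-valid entry leaves the 11 invalid combinations listed earlier.

```python
import itertools

# Flattened, illustrative copy of the three leaf pools in sample_dict.
pools = {
    "name":  {"valid": ["john", "tom"], "negative": ["$"]},
    "age":   {"valid": [20, 30],        "negative": [-1, 100]},
    "grade": {"valid": ["A", "B"],      "negative": ["AB"]},
}

# One representative valid value plus every negative value, per field.
candidates = {field: [p["valid"][0]] + p["negative"] for field, p in pools.items()}

# The only candidate combination in which every value is valid.
all_valid = {field: p["valid"][0] for field, p in pools.items()}

combos = [dict(zip(candidates, values))
          for values in itertools.product(*candidates.values())]
invalid = [combo for combo in combos if combo != all_valid]

print(len(combos), len(invalid))  # 12 11
```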

Revised solution (matching the revised question)

import itertools
import random

def generate_combinations(thing, positive="valid", negative=None):

    """ Generate all possible combinations, walking and mimicking structure of "thing" """

    if isinstance(thing, dict):  # if dictionary, distinguish between two types of dictionary
        if positive in thing:
            if negative is None:
                return thing[positive]  # here it's OK if it's empty
            elif thing[positive]:  # here it's not OK if it's empty
                return [random.choice(thing[positive])] + thing[negative]
            else:
                return []
        else:
            results = []
            for key, value in thing.items():  # generate all possible key: value combinations
                results.append([(key, result) for result in generate_combinations(value, positive, negative)])
            return [dict(result) for result in itertools.product(*results)]

    elif isinstance(thing, (list, tuple)):  # added tuple just to be safe (thanks Padraic!)
        # generate recursive result sets for each element of list
        results = [generate_combinations(element, positive, negative) for element in thing]
        return [list(result) for result in itertools.product(*results)]

    else:  # not a type we know how to handle
        raise TypeError("Unexpected type")


def generate_boundary_combinations(thing):

    """ Generate all possible boundary combinations """

    valid = generate_combinations(thing)

    return [result for result in generate_combinations(thing, negative='boundary') if result not in valid]
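
One consequence of returning [] for a leaf is worth seeing in isolation: itertools.product yields nothing as soon as any one of its inputs is empty, so a single empty leaf removes every combination of its parent. The short sketch below shows only that standard-library behavior; the variable names are made up for the example.

```python
import itertools

ages = [20, 30]
names_empty = []        # a leaf whose pool turned out to be empty
names_full = ["john"]

with_empty = list(itertools.product(ages, names_empty))
with_full = list(itertools.product(ages, names_full))

print(with_empty)  # []
print(with_full)   # [(20, 'john'), (30, 'john')]
```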

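generate_invalid_combinations() and generate_explore_combinations() are the same as before. Subtle differences:

When generating negative combinations, it now grabs a random item from the valid array instead of the first item.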
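
Because the valid representative is picked at random, two runs can produce different negative combinations. If reproducible test data matters, one option is to draw from a seeded random.Random instance, as in this sketch. The helper pick_candidates and the seed value are hypothetical and not part of the solution above.

```python
import random

def pick_candidates(valid, negative, rng):
    """Return one randomly chosen valid value followed by all negative values."""
    return [rng.choice(valid)] + negative

valid = ["john", "tom"]
negative = ["$"]

# Two generators created with the same seed make the same choices.
first = pick_candidates(valid, negative, random.Random(1234))
second = pick_candidates(valid, negative, random.Random(1234))

print(first == second)  # True
```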

Values for items such as 'age': [30] are returned as lists, because that is how they were specified:

'age': [{'boundary': [1, 99],
    'explore': [200],
    'negative': [-1, 100],
    'valid': [20, 30]}],

If you want 'age': 30, as in the earlier output examples, modify the definition accordingly:

'age': {'boundary': [1, 99],
    'explore': [200],
    'negative': [-1, 100],
    'valid': [20, 30]},
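
The difference between the two definitions can be checked with a cut-down copy of the revised function. This is a sketch: it keeps only the positive-pool path, renames the parameter to pool, and uses two tiny hypothetical inputs, wrapped and bare.

```python
import itertools

def generate_combinations(thing, pool="valid"):
    """Simplified copy of the revised function (positive pool only)."""
    if isinstance(thing, dict):
        if pool in thing:
            return thing[pool]
        results = [[(key, result) for result in generate_combinations(value, pool)]
                   for key, value in thing.items()]
        return [dict(result) for result in itertools.product(*results)]
    if isinstance(thing, (list, tuple)):
        # Lists are combined element by element, so results come back as lists.
        results = [generate_combinations(element, pool) for element in thing]
        return [list(result) for result in itertools.product(*results)]
    raise TypeError("Unexpected type")

wrapped = {'age': [{'valid': [20, 30]}]}   # pool dictionary inside a list
bare = {'age': {'valid': [20, 30]}}        # pool dictionary used directly

print(generate_combinations(wrapped))  # [{'age': [20]}, {'age': [30]}]
print(generate_combinations(bare))     # [{'age': 20}, {'age': 30}]
```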

The boundary property is now treated as one of the 'negative' values.

Just for reference, I'm not going to list all of the output this time. Calling generate_combinations(sample_dict2) returns:

[
{'company': {'employee_list': [{'employee': {'name': ['john'], 'others': {'grade': ['A', 'A']}, 'age': [20]}}, {'employee': {'name': ['john'], 'others': {'grade': ['A', 'A']}, 'age': [20]}}]}},
{'company': {'employee_list': [{'employee': {'name': ['john'], 'others': {'grade': ['A', 'A']}, 'age': [20]}}, {'employee': {'name': ['john'], 'others': {'grade': ['A', 'A']}, 'age': [30]}}]}},
{'company': {'employee_list': [{'employee': {'name': ['john'], 'others': {'grade': ['A', 'A']}, 'age': [20]}}, {'employee': {'name': ['john'], 'others': {'grade': ['A', 'B']}, 'age': [20]}}]}},
{'company': {'employee_list': [{'employee': {'name': ['john'], 'others': {'grade': ['A', 'A']}, 'age': [20]}}, {'employee': {'name': ['john'], 'others': {'grade': ['A', 'B']}, 'age': [30]}}]}},
{'company': {'employee_list': [{'employee': {'name': ['john'], 'others': {'grade': ['A', 'A']}, 'age': [20]}}, {'employee': {'name': ['john'], 'others': {'grade': ['B', 'A']}, 'age': [20]}}]}},
...
{'company': {'employee_list': [{'employee': {'name': ['tom'], 'others': {'grade': ['B', 'B']}, 'age': [30]}}, {'employee': {'name': ['tom'], 'others': {'grade': ['A', 'B']}, 'age': [30]}}]}},
{'company': {'employee_list': [{'employee': {'name': ['tom'], 'others': {'grade': ['B', 'B']}, 'age': [30]}}, {'employee': {'name': ['tom'], 'others': {'grade': ['B', 'A']}, 'age': [20]}}]}},
{'company': {'employee_list': [{'employee': {'name': ['tom'], 'others': {'grade': ['B', 'B']}, 'age': [30]}}, {'employee': {'name': ['tom'], 'others': {'grade': ['B', 'A']}, 'age': [30]}}]}},
{'company': {'employee_list': [{'employee': {'name': ['tom'], 'others': {'grade': ['B', 'B']}, 'age': [30]}}, {'employee': {'name': ['tom'], 'others': {'grade': ['B', 'B']}, 'age': [20]}}]}},
{'company': {'employee_list': [{'employee': {'name': ['tom'], 'others': {'grade': ['B', 'B']}, 'age': [30]}}, {'employee': {'name': ['tom'], 'others': {'grade': ['B', 'B']}, 'age': [30]}}]}}
]