Using the itertools library to get combinations of objects?

Question

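I want to generate combinations of objects from lists in Python, and I have been looking at itertools.product or a similar function to compute them, since itertools.product generates combinations from multiple arrays.

I have an object that looks like this: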

{
    "Cities": [
        {
            "Id": 1,
            "Value": "New York"
        },
        {
            "Id": 2,
            "Value": "Boston"
        }
    ],
    "People": [
        {
            "Id": 1,
            "Value": "Michael"
        },
        {
            "Id": 2,
            "Value": "Ryan"
        },
        {
            "Id": 3,
            "Value": "Jim"
        },
        {
            "Id": 4,
            "Value": "Phyllis"
        }
    ]
}

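I want to generate a list showing every combination of a person living in each city. For the example above, that would be a list of 8 values.

My code is as follows: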

import json
import itertools


def main():
    combinations = []
    with open('people.json') as f:
        data = json.load(f)

    combinations = list(itertools.product(*data))
    print(combinations)


if __name__ == "__main__":
    main()

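When I run it, I get a completely different result.

How can I modify my code to get the result I want?

Note: it doesn't have to be itertools; I just thought itertools was meant for computing this kind of thing.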

Answer 1

To do what you describe with the data you show, this script will work:

import json
import itertools


def main():
    combinations = []
    with open('people.json') as f:
        data = json.load(f)

    combinations = list(itertools.product(data['Cities'], data['People']))
    print(combinations)


if __name__ == "__main__":
    main()

The only difference is that I specify which parts of the data structure to use.

Output (formatted for readability):

[
  ({"Id": 1, "Value": "New York"}, 
   {"Id": 1, "Value": "Michael"}), 
  ({"Id": 1, "Value": "New York"}, 
   {"Id": 2, "Value": "Ryan"}), 
  ({"Id": 1, "Value": "New York"}, 
   {"Id": 3, "Value": "Jim"}), 
  ({"Id": 1, "Value": "New York"}, 
   {"Id": 4, "Value": "Phyllis"}), 
  ({"Id": 2, "Value": "Boston"}, 
   {"Id": 1, "Value": "Michael"}), 
  ({"Id": 2, "Value": "Boston"}, 
   {"Id": 2, "Value": "Ryan"}), 
  ({"Id": 2, "Value": "Boston"}, 
   {"Id": 3, "Value": "Jim"}), 
  ({"Id": 2, "Value": "Boston"}, 
   {"Id": 4, "Value": "Phyllis"})
]

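If you want the product across all keys in the data set, you would use itertools.product(*data.values()); note the * that unpacks the lists into separate arguments. For this case, though, the code I showed is clearer.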
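
A minimal sketch of that generalized form, assuming every value in the dict is a list. The inline `data` literal is a stand-in for the parsed contents of people.json, and the argument order relies on dicts preserving insertion order (Python 3.7+):

```python
import itertools

# Inline stand-in for the parsed contents of people.json (assumption for illustration).
data = {
    "Cities": [{"Id": 1, "Value": "New York"}, {"Id": 2, "Value": "Boston"}],
    "People": [
        {"Id": 1, "Value": "Michael"},
        {"Id": 2, "Value": "Ryan"},
        {"Id": 3, "Value": "Jim"},
        {"Id": 4, "Value": "Phyllis"},
    ],
}

# The * unpacks the dict's values so each list becomes a separate argument to product().
pairs = list(itertools.product(*data.values()))
print(len(pairs))  # 8
```

Without the *, product() would receive a single iterable and return one 1-tuple per list instead of city/person pairs.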

Answer 2

Why you get that output:

When you do list(itertools.product(*data)), product receives the same items you would see when you run:

for x in data:
    print(x)

In other words, you effectively did

itertools.product('Cities', 'People')

which is why you got the product of the characters in those two strings (yay duck typing!)

[
 ('C', 'P'),
 ('C', 'e'),
 ('C', 'o'),
 ('C', 'p'),
 ('C', 'l'),
 ('C', 'e'),
 ('i', 'P'),
 ('i', 'e'),
 ('i', 'o'),
 ('i', 'p'),
 ('i', 'l'),
 ('i', 'e'),
 ...
]
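
This can be checked directly. A small sketch, using an inline dict with the same two keys as a stand-in for the loaded JSON (the empty lists are placeholders; only the keys matter here):

```python
import itertools

# Placeholder dict: only the key names matter for this demonstration.
data = {"Cities": [], "People": []}

# Iterating over a dict (and therefore unpacking it with *) yields its keys.
print(list(data))  # ['Cities', 'People']

# So product(*data) pairs the characters of 'Cities' with those of 'People'.
result = list(itertools.product(*data))
print(len(result))  # 36
print(result[:3])   # [('C', 'P'), ('C', 'e'), ('C', 'o')]
```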

How to get the output you want:

Using product() is correct, but you gave it the wrong data.

cities = [c['Value'] for c in data['Cities']]  # Extract each city's Value from the list of dicts
people = [p['Value'] for p in data['People']]  # Extract each person's Value from the list of dicts
print(list(itertools.product(cities, people)))  # Cartesian product of the two lists

This gives the output:

[
 ('New York', 'Michael'),
 ('New York', 'Ryan'),
 ('New York', 'Jim'),
 ('New York', 'Phyllis'),
 ('Boston', 'Michael'),
 ('Boston', 'Ryan'),
 ('Boston', 'Jim'),
 ('Boston', 'Phyllis')
]

If you want the dict objects instead of just their Value entries, simply pass those objects to product():

print(list(itertools.product(data['Cities'], data['People']))) # Product

This gives

[
  ({'Id': 1, 'Value': 'New York'}, {'Id': 1, 'Value': 'Michael'}),
  ({'Id': 1, 'Value': 'New York'}, {'Id': 2, 'Value': 'Ryan'}),
  ({'Id': 1, 'Value': 'New York'}, {'Id': 3, 'Value': 'Jim'}),
  ({'Id': 1, 'Value': 'New York'}, {'Id': 4, 'Value': 'Phyllis'}),
  ({'Id': 2, 'Value': 'Boston'}, {'Id': 1, 'Value': 'Michael'}),
  ({'Id': 2, 'Value': 'Boston'}, {'Id': 2, 'Value': 'Ryan'}),
  ({'Id': 2, 'Value': 'Boston'}, {'Id': 3, 'Value': 'Jim'}),
  ({'Id': 2, 'Value': 'Boston'}, {'Id': 4, 'Value': 'Phyllis'})
]

As expected.
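
Since the question notes that itertools is not strictly required, the same pairing can also be sketched with a nested list comprehension. The `cities` and `people` lists below are written inline for illustration; in the real script they would come from the loaded JSON as shown earlier:

```python
import itertools

# Inline stand-ins for the Value lists extracted from the JSON data.
cities = ["New York", "Boston"]
people = ["Michael", "Ryan", "Jim", "Phyllis"]

# Outer loop over cities, inner loop over people: same order as itertools.product(cities, people).
pairs = [(city, person) for city in cities for person in people]

print(pairs == list(itertools.product(cities, people)))  # True
```

For two lists the comprehension is arguably just as readable; product() becomes more convenient once the number of lists grows or is not known in advance.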
