纸浆优化不重复排列组合

Question

我正在研究一个优化问题，其中我的 objective 是在售出两个产品对时利润最大化，但约束条件是，该对中的产品不应重复。

我正在使用 Pulp 来优化解决方案，但代码效率低下并且会进入无限循环。

file = pd.read_csv('input_file.csv')

main_product_1 = list(file['Product ID'].unique())
main_product_2 = list(file['Product ID 2'].unique())

file.set_index(['Product ID', 'Product ID 2'], inplace=True)
file = file['Profit']

# Target Variable
combine = pulp.LpVariable.dicts("combine",
                                ((product_1, product_2) for product_1 in main_product_1 for
                                 product_2 in main_product_2 if product_1 != product_2),
                                cat='Binary')

# Initializing the model

model = pulp.LpProblem("prof_max", pulp.LpMaximize)

# Objective Function optimization
model += pulp.lpSum(
        [combine[product_1, product_2] * file.loc[(product_1, product_2)] for product_1 in
         main_product_1 for product_2 in main_product_2 if product_1 != product_2])

# Constraints for optimization
for area in set_plant:
  model += pulp.lpSum([combine[area, other] for other in main_product_1 if area != other]
                      + [combine[other, area] for other in main_product_2 if area != other]) == 1

model.solve()
print(pulp.LpStatus[model.status])

# Check
set_index = set(file.index)
set_expected = set(
        [(product_1, product_2) for product_1 in main_product_1 for product_2 in main_product_2 if
         product_1 != product_2])
len(set_expected - set_index)

问题是代码进入了无限循环，我没有得到任何结果，有没有更优化的方法运行这个方法？

Answer 1

问题是您要添加非常非常多的整数变量，整数线性规划是一个极其难以解决且效率低下的问题。但是，您可能还有一些额外的改进。

考虑以下结构：

如果您有 M 个产品 1 的实例和 N 个产品 2 的实例，那么您有 MN 个二元变量 x_mn;
如果 x_mn 为 1，则这些变量中的每一个都对 objective 函数贡献 P；
你在约束中说 x_mn + x_nm == 1，但实际上，它应该是 x_mn + x_nm <= 1。否则，你说你必须有你的列表中的每个组合。这可能会导致不可行的解决方案；
如果您考虑的是组合而不是排列（即 [1000, 1001] 和 [1001, 1000] 相同，那么这意味着 M = N 并且您只能删除大约一半的变量剩下 M^2 / 2。（如果你看到值 space 是一个 MxM 正方形，你只接受大约一个三角形，因为另一个三角形是等价的；
如果你像上面提到的那样限制变量space，你实际上不需要任何约束；如果 x_nm 不存在，那么 x_mn <= 1 显然是二进制变量。

import pandas as pd
import numpy as np
import pulp

file = pd.read_csv('file.csv')
file = file[file['Product ID'] < file['Product ID 2']]
file.set_index(['Product ID', 'Product ID 2'], inplace=True)
file = file['Profit']

combinations = file.index
individual_products = set()

for product_1, product_2 in combinations:
individual_products.add(product_1)
individual_products.add(product_2)

# Target Variable
combine = pulp.LpVariable.dicts("combine", combinations, cat='Binary')

# Initializing the model

model = pulp.LpProblem("prof_max", pulp.LpMaximize)

# Objective Function optimization
model += pulp.lpSum([combine[i] for i in file.index] * file)

# All individual products can only be used once
for product in individual_products:
matching_combinations = combinations[(combinations.to_frame() == product).any(axis=1)]
model += pulp.lpSum([combine[i] for i in matching_combinations]) <= 1

model.solve()
print(pulp.LpStatus[model.status])
print([v for v in model.variables() if v.varValue > 0])

通过这些更改，在不更改问题或固有实现的情况下，您实际上已经删除了 75% 的约束。

纸浆优化不重复排列组合

Pulp optimization not repeated permutation combination

python

math

optimization

pulp