适用于从记录中提取一对多关系的约束编程

Constraint programming suitable for extracting OneToMany relationships from records

也许有人可以帮助我解决 Prolog 或任何约束编程语言的问题。想象一下 table 个项目(学生与母亲一起做某事的学校项目)。每个项目都有一个或多个 children 参与。对于每个 child,我们存储它的名字和它母亲的名字。但是对于每个项目,只有一个包含所有母亲的单元格和一个包含所有 children 的单元格。两个单元格的排序方式不一定相同。

示例:

+-----------+-----------+------------+
|           |           |            |
|   Project |   Parents |   Children |
|           |           |            |
+-----------+-----------+------------+
|           |           |            |
|   1       |   Jane;   |   Brian;   |
|           |   Claire  |   Stephen  |
|           |           |            |
+-----------+-----------+------------+
|           |           |            |
|   2       |   Claire; |   Emma;    |
|           |   Jane    |   William  |
|           |           |            |
+-----------+-----------+------------+
|           |           |            |
|   3       |   Jane;   |   William; |
|           |   Claire  |   James    |
|           |           |            |
+-----------+-----------+------------+
|           |           |            |
|   4       |   Jane;   |   Brian;   |
|           |   Sophia; |   James;   |
|           |   Claire  |   Isabella |
|           |           |            |
+-----------+-----------+------------+
|           |           |            |
|   4       |   Claire  |   Brian    |
|           |           |            |
+-----------+-----------+------------+
|           |           |            |
|   5       |   Jane    |   Emma     |
|           |           |            |
+-----------+-----------+------------+

我希望这个例子能直观地说明问题。正如我所说,两个单元格仅包含由分隔符分隔的名称,但不一定以类似的方式排序。因此,对于技术应用程序,您可以将数据转换为:

+-------------+-----------+----------+
|   Project   |   Name    |   Role   |
+-------------+-----------+----------+
|   1         |   Jane    |   Mother |
+-------------+-----------+----------+
|   1         |   Claire  |   Mother |
+-------------+-----------+----------+
|   1         |   Brian   |   Child  |
+-------------+-----------+----------+
|   1         |   Stephen |   Child  |
+-------------+-----------+----------+
|   2         |   Jane    |   Mother |
+-------------+-----------+----------+
|   2         |   Claire  |   Mother |
+-------------+-----------+----------+
|   2         |   Emma    |   Child  |
+-------------+-----------+----------+
|   2         |   William |   Child  |
+-------------+-----------+----------+
|             |           |          |
|                                    |
|              And so on             |

每个项目parents和child人的人数相等。因此,对于每笔交易,我们有 n 个母亲和 n children,每个母亲恰好属于一个 child。有了这些限制,就可以通过逻辑推理将每个母亲分配给她所有的 children,从只涉及一个 child(即 4 和 5)的项目开始。

结果:

简有艾玛、斯蒂芬和詹姆斯;

克莱尔有布莱恩和威廉;

索菲亚有伊莎贝拉

我想知道如何使用约束规划来解决这个问题。此外,数据集可能不确定,我想知道是否有可能隔离记录,当手动解决时(即当 mother-child 分配是手动完成时),会破坏不确定性。

我不确定我是否理解问题的所有要求,但这是 MiniZinc 中的约束编程模型 (http://www.minizinc.org/). The full model is here: http://hakank.org/minizinc/one_to_many.mzn .

稍后注意:项目约束的第一个版本不正确。我已经删除了不正确的代码。查看原始答案的编辑历史。

enum mothers = {jane,claire,sophia};
enum children = {brian,stephen,emma,william,james,isabella};      

% decision variables

% who is the mother of this child?
array[children] of var mothers: x;


solve satisfy;

constraint
  % All mothers has at least one child
  forall(m in mothers) (
    exists(c in children) (
      x[c] = m
    )
  )
;

constraint
% NOTE: This is a more correct version of the project constraints.
% project 1
(
  ( x[brian] = jane /\ x[stephen] = claire) \/
  ( x[stephen] = jane /\ x[brian] = claire)
) 
/\
% project 2
(
  ( x[emma] = claire /\ x[william] = jane) \/
  ( x[william] = claire /\ x[emma] = jane) 
)
/\
% project 3
(
  ( x[william] = claire /\ x[james] = jane) \/
  ( x[james] = claire /\ x[william] = jane) 
)
/\
% project 4
( 
  ( x[brian] = jane /\ x[james] = sophia /\ x[isabella] = claire) \/
  ( x[james] = jane /\ x[brian] = sophia /\ x[isabella] = claire) \/
  ( x[james] = jane /\ x[isabella] = sophia /\ x[brian] = claire) \/
  ( x[brian] = jane /\ x[isabella] = sophia /\ x[james] = claire) \/
  ( x[isabella] = jane /\ x[brian] = sophia /\ x[james] = claire) \/
  ( x[isabella] = jane /\ x[james] = sophia /\ x[brian] = claire) 
)
/\

% project 4(sic!)
( x[brian] = claire) /\

% project 5
( x[emma] = jane)
;


output [
  "\(c): \(x[c])\n"
  | c in children
];

唯一解是

brian: claire
stephen: jane
emma: jane
william: claire
james: jane
isabella: sophia

Edit2:这是一个更通用的解决方案。有关完整模型,请参阅 http://hakank.org/minizinc/one_to_many.mzn

include "globals.mzn"; 

enum mothers = {jane,claire,sophia};
enum children = {brian,stephen,emma,william,james,isabella};      

% decision variables
% who is the mother of this child?
array[children] of var mothers: x;

% combine all the combinations of mothers and children in a project
predicate check(array[int] of mothers: mm, array[int] of children: cc) =
  let {
    int: n = length(mm);
    array[1..n] of var 1..n: y;
  } in
  all_different(y) /\
  forall(i in 1..n) (
     x[cc[i]] = mm[y[i]]
  )
;    

solve satisfy;

constraint
% All mothers has at least one child.
forall(m in mothers) (
  exists(c in children) (
    x[c] = m
  )
)
;


constraint
% project 1    
check([jane,claire], [brian,stephen]) /\
% project 2
check([claire,jane],[emma,william]) /\
% project 3
check([claire,jane],[william,james]) /\
% project 4
check([claire,sophia,jane],[brian,james,isabella]) /\
% project 4(sic!)
check([claire],[brian]) /\
% project 5
check([jane],[emma])
;

output [
 "\(c): \(x[c])\n"
 | c in children
];

此模型使用以下谓词来确保考虑母亲与 children 的所有组合:

predicate check(array[int] of mothers: mm, array[int] of children: cc) =
   let {
     int: n = length(mm);
     array[1..n] of var 1..n: y;
  } in
  all_different(y) /\
  forall(i in 1..n) (
    x[cc[i]] = mm[y[i]]
  )
;    

它使用全局约束all_different(y)来确保mm[y[i]]mm中的母亲之一,然后将第i个child分配给那个具体妈妈。

这是我的第一个CHR程序,所以我希望有人能来给我一些改进的建议。

我的想法是,您需要将所有列表扩展为事实。从那里,如果您知道一个项目只有一个父项和一个子项,您可以从中建立父项关系。此外,一旦您拥有 parent-child 关系,您就可以从其他项目的其他事实中删除该集合,并将问题的基数减一。最终你会想出你能想出的一切。完全确定的数据集和不完全确定的数据集之间的唯一区别在于减少的程度。如果它没有完全到达那里,它会留下一些事实,这样你就可以看到哪些 projects/parents/children 仍在制造歧义。

:- use_module(library(chr)).

:- chr_constraint project/3, project_parent/2, project_child/2, 
   project_parents/2, project_children/2, project_size/2, parent/2.

%% turn a project into a fact about its size plus 
%% facts for each parent and child in this project
project(N, Parents, Children) <=>
    length(Parents, Len),
    project_size(N, Len),
    project_parents(N, Parents),
    project_children(N, Children).

%% expand the list of parents for this project into a fact per parent per project
project_parents(_, []) <=> true.
project_parents(N, [Parent|Parents]) <=>
    project_parent(N, Parent),
    project_parents(N, Parents).

%% same for the children
project_children(_, []) <=> true.
project_children(N, [Child|Children]) <=>
    project_child(N, Child),
    project_children(N, Children).

%% a single parent-child combo on a project is exactly what we need
one_parent @ project_size(Project, 1), 
             project_parent(Project, Parent), 
             project_child(Project, Child) <=>
    parent(Parent, Child).

%% if I have a parent relationship for project of size N,
%% remove this parent and child from the project and decrease
%% the number of parents and children by one
parent_det @ parent(Parent, Child) \ project_size(Project, N), 
                                     project_parent(Project, Parent), 
                                     project_child(Project, Child) <=>
    succ(N0, N),
    project_size(Project, N0).

我 运行 通过创建一个 main/0 谓词来完成您的示例:

main :-
    project(1, [jane, claire], [brian, stephen]),
    project(2, [claire, jane], [emma, william]),
    project(3, [jane, claire], [william, james]),
    project(4, [jane, sophia, claire], [brian, james, isabella]),
    project(5, [claire], [brian]),
    project(6, [jane], [emma]).

这输出:

parent(sophia, isabella),
parent(jane, james),
parent(claire, william),
parent(jane, emma),
parent(jane, stephen),
parent(claire, brian).

为了证明不完整的决心,我添加了第七个项目:

project(7, [sally,sandy], [grace,miriam]).

然后程序输出:

project_parent(7, sandy),
project_parent(7, sally),
project_child(7, miriam),
project_child(7, grace),
project_size(7, 2),
parent(sophia, isabella),
parent(jane, james),
parent(claire, william),
parent(jane, emma),
parent(jane, stephen),
parent(claire, brian).

如您所见,任何剩余的 project_size/2 都会告诉您有待解决的基数(项目 7 有两个 parent/children 关系仍有待确定),您可以准确地返回待处理的 parents/children,以及所有可以确定的 parent/2 关系。

我对这个结果很满意,但希望其他人可以来改进我的代码!

编辑:我的代码有一个在邮件列表中发现的缺点,即某些输入将无法收敛,即使可以计算解决方案,例如:

project(1,[jane,claire],[brian, stephan]),
project(2,[jane,emma],[stephan, jones]).

有关详细信息,请参阅 Ian's solution,它使用集合交集来确定映射。

有点跑题了,但是从 SWI-Prolog manual:

Plain Prolog can be regarded as CLP(H), where H stands for Herbrand terms. Over this domain, =/2 and dif/2 are the most important constraints that express, respectively, equality and disequality of terms.

我觉得有权提出 Prolog 解决方案,比您建议的算法更通用(基于单对单关系逐步减少关系):

solve2(Projects,ParentsChildren) :-
    foldl([_-Ps-Cs,L,L1]>>try_links(Ps,Cs,L,L1),Projects,[],ChildrenParent),
    transpose_pairs(ChildrenParent,ParentsChildrenFlat),
    group_pairs_by_key(ParentsChildrenFlat,ParentsChildren).

try_links([],[],Linked,Linked).
try_links(Ps,Cs,Linked,Linked2) :-
    select(P,Ps,Ps1),
    select(C,Cs,Cs1),
    link(C,P,Linked,Linked1),
    try_links(Ps1,Cs1,Linked1,Linked2).

link(C,P,Assigned,Assigned1) :-
    (   memberchk(C-Q,Assigned)
    ->  P==Q,
        Assigned1=Assigned
    ;   Assigned1=[C-P|Assigned]
    ).

这接受自然格式的数据,例如

data(1,
    [1-[jane,claire]-[brian,stephen]
    ,2-[claire,jane]-[emma,william]
    ,3-[jane,claire]-[william,james]
    ,4-[jane,sophia,claire]-[brian,james,isabella]
    ,5-[claire]-[brian]
    ,6-[jane]-[emma]
    ]).
data(2,
    [1-[jane,claire]-[brian,stephen]
    ,2-[claire,jane]-[emma,william]
    ,3-[jane,claire]-[william,james]
    ,4-[jane,sophia,claire]-[brian,james,isabella]
    ,5-[claire]-[brian]
    ,6-[jane]-[emma]
    ,7-[sally,sandy]-[grace,miriam]
    ]).

?- data(2,Ps),solve2(Ps,S).
Ps = [1-[jane, claire]-[brian, stephen], 2-[claire, jane]-[emma, william], 3-[jane, claire]-[william, james], 4-[jane, sophia, claire]-[brian, james, isabella], 5-[claire]-[brian], 6-[jane]-[emma], 7-[...|...]-[grace|...]],
S = [claire-[william, brian], jane-[james, emma, stephen], sally-[grace], sandy-[miriam], sophia-[isabella]].