SPARQL: group subjects based on predicates

Question

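In a semantic web graph, I have a set of subjects (S1, S2, ..., Sn) and a set of predicates (P1, P2, ..., Pn). I want to group the instances based on their predicates (i.e. select all instances that have the same set of predicates, regardless of the object values).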

For example, if I have

S1 P1 v1.
S1 P2 v2.
S2 P3 v3.
S2 P4 v4.
S3 P1 v5.
S3 P2 v6.

I would like to get two groups, {S1, S3} and {S2}. I generate the graph myself, so I can change its structure if that helps to meet this requirement.

Answer 1

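This is a bit more complicated than it sounds, and I'm not entirely sure that it can be done in a completely general way, but I think you can do it on most endpoints. If you want to group by the set of predicates that a subject has, you first need to be able to get that set in a form that can be compared with other sets of predicates. SPARQL has no notion of a set-valued datatype, but with group_concat and distinct you can get a string containing all of a subject's predicates. If you use order by when you select them, most endpoints will keep that order intact, so the group_concat string is essentially canonical. As far as I know, though, the specification does not guarantee this behavior.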

@prefix : <urn:ex:> .

:S1 :P1 :v1 .
:S1 :P2 :v2 .
:S2 :P3 :v3 .
:S2 :P4 :v4 .
:S3 :P1 :v5 .
:S3 :P2 :v6 .

prefix : <urn:ex:>

#-- The behavior in most (all?) endpoints seems to be
#-- to preserve the order during the group_concat
#-- operation, so you'll get "normalized" values
#-- for ?preds.  I don't think this is *guaranteed*, though.
select ?s (group_concat(?p) as ?preds) where {
  #-- Get the distinct values of ?s and ?p and ensure
  #-- that they're in some kind of standardized order.
  #-- Ordering by ?p alone should be enough here.
  { select distinct ?s ?p {
      ?s ?p ?o
    }
    order by ?p
  }
}
group by ?s

-------------------------------
| s   | preds                 |
===============================
| :S2 | "urn:ex:P3 urn:ex:P4" |
| :S3 | "urn:ex:P1 urn:ex:P2" |
| :S1 | "urn:ex:P1 urn:ex:P2" |
-------------------------------

Now you just need to go one step further and group these results by ?preds:

prefix : <urn:ex:>

select (group_concat(?s) as ?subjects) {
  select ?s (group_concat(?p) as ?preds) where {
    { select distinct ?s ?p {
        ?s ?p ?o
      }
      order by ?p
    }
  }
  group by ?s
}
group by ?preds

-------------------------
| subjects              |
=========================
| "urn:ex:S1 urn:ex:S3" |
| "urn:ex:S2"           |
-------------------------
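
Because the canonical string depends on the endpoint preserving order inside group_concat, it may be useful to cross-check the result outside SPARQL. The following is only a minimal sketch in Python, not part of the queries above: it assumes the triples have already been fetched (for instance with a plain select ?s ?p ?o query) into a list of tuples, and the function name group_by_predicate_set is hypothetical. It groups subjects by a frozenset of their predicates, which is order-independent, so no ordering guarantee is needed.

```python
from collections import defaultdict

# Triples from the example graph, written as (subject, predicate, object).
# Assumption: in practice these would come from a "select ?s ?p ?o" query.
triples = [
    ("S1", "P1", "v1"),
    ("S1", "P2", "v2"),
    ("S2", "P3", "v3"),
    ("S2", "P4", "v4"),
    ("S3", "P1", "v5"),
    ("S3", "P2", "v6"),
]


def group_by_predicate_set(triples):
    """Map each distinct predicate set to the sorted list of subjects that have it."""
    # Step 1: collect the set of predicates used by each subject.
    # Object values are ignored, as in the "select distinct ?s ?p" subquery.
    predicates_of = defaultdict(set)
    for s, p, _o in triples:
        predicates_of[s].add(p)

    # Step 2: group subjects by that set. A frozenset is hashable and
    # order-independent, so it plays the role of the canonical ?preds string.
    groups = defaultdict(list)
    for s, preds in predicates_of.items():
        groups[frozenset(preds)].append(s)
    return {preds: sorted(subjects) for preds, subjects in groups.items()}


groups = group_by_predicate_set(triples)
for preds, subjects in groups.items():
    print(sorted(preds), subjects)
# prints:
# ['P1', 'P2'] ['S1', 'S3']
# ['P3', 'P4'] ['S2']
```

On the example data this yields the same two groups as the nested query, {S1, S3} and {S2}.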

semantic-web

sparql