Subqueries with Beam

I am trying to run a subquery on each row of a table. Here is a minimal working example with a single table, "students".

data StudentT f
  = StudentT
  { _studentId   :: C f Int
  , _studentName :: C f String
  , _score       :: C f Int
  } deriving Generic

type Student = StudentT Identity
type StudentId = PrimaryKey StudentT Identity

deriving instance Show Student

instance Beamable StudentT
instance Beamable (PrimaryKey StudentT)

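instance Table StudentT where
  data PrimaryKey StudentT f = StudentId (Columnar f Int) deriving Generic
  -- The Table class also requires primaryKey; here it is the id column.
  primaryKey = StudentId . _studentId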

data SchoolDb f
  = SchoolDb
  { _students :: f (TableEntity StudentT)
  } deriving Generic

instance Database be SchoolDb

schoolDb :: DatabaseSettings be SchoolDb
schoolDb = defaultDbSettings

What I want to achieve is a query like this:

SELECT s.id,
       s.name,
       s.score,
       (SELECT COUNT(*) FROM students AS t WHERE s.score >= t.score) AS percentile
FROM students AS s

My attempt is as follows:

main = do
  conn <- open "test.db"
  runBeamSqliteDebug putStrLn conn $ do
    (students :: [(Student, Int)]) <- runSelectReturningList $ select tablePercentile
    liftIO $ mapM_ print students


tablePercentile :: Q _ _ _ _
tablePercentile = do
  student <- all_ (_students schoolDb)
  let percentile =  subquery_ $ aggregate_ (const countAll_) $ filter_ (\s -> _score s <=. (_score student)) (all_ (_students schoolDb))
  return (student, percentile)

Can someone point me in the right direction?

Edit: Here is the full error message. I figured that subquery_ returns a QGenExpr, so instead of binding it (<-) I put it in a let statement. That simplified the error message a little.

src/Main.hs:52:71: error:
    • Couldn't match type ‘Database.Beam.Query.Internal.QNested s0’
                     with ‘Database.Beam.Query.QueryInaccessible’
      Expected type: Q SqliteSelectSyntax
                       SchoolDb
                       Database.Beam.Query.QueryInaccessible
                       (StudentT
                          (QExpr
                             Database.Beam.Sqlite.Syntax.SqliteExpressionSyntax
                             (Database.Beam.Query.Internal.QNested s0)),
                        QGenExpr
                          QValueContext
                          (Database.Beam.Backend.SQL.SQL92.Sql92SelectTableExpressionSyntax
                             (Database.Beam.Backend.SQL.SQL92.Sql92SelectSelectTableSyntax
                                SqliteSelectSyntax))
                          s0
                          Int)
        Actual type: Q SqliteSelectSyntax
                       SchoolDb
                       (Database.Beam.Query.Internal.QNested s0)
                       (StudentT
                          (QExpr
                             (Database.Beam.Backend.SQL.SQL92.Sql92SelectTableExpressionSyntax
                                (Database.Beam.Backend.SQL.SQL92.Sql92SelectSelectTableSyntax
                                   SqliteSelectSyntax))
                             (Database.Beam.Query.Internal.QNested s0)),
                        QGenExpr
                          QValueContext
                          (Database.Beam.Backend.SQL.SQL92.Sql92SelectTableExpressionSyntax
                             (Database.Beam.Backend.SQL.SQL92.Sql92SelectSelectTableSyntax
                                SqliteSelectSyntax))
                          s0
                          Int)
    • In the first argument of ‘select’, namely ‘tablePercentile’
      In the second argument of ‘($)’, namely ‘select tablePercentile’
      In a stmt of a 'do' block:
        (students :: [(Student, Int)]) <- runSelectReturningList
                                            $ select tablePercentile
   |
52 |     (students :: [(Student, Int)]) <- runSelectReturningList $ select tablePercentile
   |                                                                       ^^^^^^^^^^^^^^^

This is my first time using Beam, and I found it easier to start from scratch, using the examples involving aggregates in the user guide as a reference, than to fix the code here:

tablePercentile =
  aggregate_ (\(student, student') -> (group_ (_studentId student), countAll_))
    . filter_ (\(student, student') -> (_score student <=. _score student'))
    $ (,) <$> all_ (_students schoolDb) <*> all_ (_students schoolDb)

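This amounts to an inner join of the table with itself, with filter_ setting the join condition and aggregate_ handling the grouping and counting. Note that this query retrieves only the student ids, not the full records. That is because a query that uses GROUP BY generally cannot return anything beyond the aggregates and the columns used for grouping. One way to deal with that is to use the aggregate as a subquery and join its ids back against the table: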

tablePercentile = do
  (sid, cou) <- aggregate_ (\(student, student') -> (group_ (_studentId student), countAll_))
    . filter_ (\(student, student') -> (_score student <=. _score student'))
    $ (,) <$> all_ (_students schoolDb) <*> all_ (_students schoolDb)
  (\student -> (student, cou))
    <$> filter_ (\student -> _studentId student ==. sid) (all_ (_students schoolDb))
-- N.B.: The last line of the do-block might be written as
-- (,) <$> filter_ (\student -> _studentId student ==. sid) (all_ (_students schoolDb)) <*> pure cou

This works as expected:

sqlite> SELECT * from Students;
Id|Name|Score
1|Alice|9
2|Bob|7
3|Carol|6
4|David|8
5|Esther|10
6|Francis|6
GHCi> :main
SELECT "t1"."id" AS "res0", "t1"."name" AS "res1", "t1"."score" AS "res2", "t0"."res1" AS "res3" FROM (SELECT "t0"."id" AS "res0", COUNT(*) AS "res1" FROM "students" AS "t0" INNER JOIN "students" AS "t1" WHERE ("t0"."score")<=("t1"."score") GROUP BY "t0"."id") AS "t0" INNER JOIN "students" AS "t1" WHERE ("t1"."id")=("t0"."res0");
-- With values: []
(StudentT {_studentId = 1, _studentName = "Alice", _score = 9},2)
(StudentT {_studentId = 2, _studentName = "Bob", _score = 7},4)
(StudentT {_studentId = 3, _studentName = "Carol", _score = 6},6)
(StudentT {_studentId = 4, _studentName = "David", _score = 8},3)
(StudentT {_studentId = 5, _studentName = "Esther", _score = 10},1)
(StudentT {_studentId = 6, _studentName = "Francis", _score = 6},6)
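As a sanity check on those counts, here is a minimal plain-list sketch of what the join, filter and grouping compute. It does not use Beam at all: the rows are copied from the sqlite session above, and the names Row, score, countMatching, asInAnswer and asInQuestion are made up for this illustration. It also makes the direction of the comparison visible: the Beam query above counts the students scoring at least as high as each student, whereas the SQL in the question counts those scoring at most as high.

```haskell
-- Plain-list model of the self-join query; no Beam involved.
-- The rows are the ones shown in the sqlite session above.
type Row = (Int, String, Int) -- (id, name, score)

students :: [Row]
students =
  [ (1, "Alice", 9), (2, "Bob", 7), (3, "Carol", 6)
  , (4, "David", 8), (5, "Esther", 10), (6, "Francis", 6)
  ]

score :: Row -> Int
score (_, _, s) = s

-- For each row s, count the rows t of the same table whose scores satisfy
-- the comparison. Ids are unique, so counting per row is the same as
-- GROUP BY id followed by COUNT(*).
countMatching :: (Int -> Int -> Bool) -> [(Row, Int)]
countMatching cmp =
  [ (s, length [ t | t <- students, score s `cmp` score t ]) | s <- students ]

-- Same direction as the Beam query: _score student <=. _score student'
asInAnswer :: [(Row, Int)]
asInAnswer = countMatching (<=)

-- Same direction as the SQL in the question: s.score >= t.score
asInQuestion :: [(Row, Int)]
asInQuestion = countMatching (>=)
```

In GHCi, map snd asInAnswer evaluates to [2,4,6,3,1,6], matching the output above, while map snd asInQuestion evaluates to [5,3,2,4,6,2]. To get the latter from Beam, it should be enough to flip the comparison in filter_ to (>=.).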

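Finally, as far as I can tell, the error in your code comes from the (<=.) condition trying to compare things that cannot be compared: judging by the error message, it looks like the two sides end up in different query scopes (QNested s0 versus QueryInaccessible). If the filter_ is commented out, your original code (with a monadic bind for percentile) compiles. This may be related to the GROUP BY issue I mentioned, though I am not sure.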