Inner join of an EXCEPT subquery on column names that do not appear in the output

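I have a table of students and classes. I want to find which classes were dropped from one semester to the next (and a similar query for classes that were added).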

Student    Class       Semester
==============================
Alice      English     11
Alice      Geometry    11
Alice      English     12
Bob        Spanish     11
Bob        Spanish     12

My approach is to use except (equivalent to minus):

select distinct Class
from table
where table.Student = 'Alice'
and table.Semester = 11
except
select distinct Class
from table
where table.Student = 'Alice'
and table.Semester = 12

This works correctly, returning Geometry. However, I need to use it as a subquery, like this:

select Student, string_agg(X.Class, ', ') as 'Deleted_Classes', 
count(X) as 'Num_deleted',
SemesterTable.Semester as semester, 
lag(Semester, 1) 
over (partition by StudentTable.Student 
order by SemesterTable.Semester) as Prev_Semester,
from 
StudentTable
SemesterTable
inner join (
<<<Same query from above>>>
) X on _______
where X.Num_deleted > 0

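My problem is the ____ part: an inner join can only join on columns that appear in the subquery's output, but my except query does not return the previous and current semester values (and it may return nothing at all if no classes were dropped). So how do I join the subquery to the main table? My desired output is: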

Student     Semester     Prev Semester   Deleted_Classes
========================================================
Alice        12          11              Geometry

Alice appears because her schedule changed, while Bob is omitted because his did not.

I would do this with a Left Join, using Exists in the Where clause to check that the student actually has a following semester.

Select T.Student, T.Semester+1 As Semester, T.Semester As [Prev Semester], 
       string_agg(T.Class, ',') As Deleted_Classes
From Tbl As T Left Join Tbl As T1 On (T.Student=T1.Student 
                                      And T.Semester+1=T1.Semester
                                      And T.Class=T1.Class)
Where Exists (Select * From Tbl 
              Where Student=T.Student 
                    And Semester=T.Semester+1) And
      T1.Semester Is Null
Group by T.Student, T.Semester+1, T.Semester
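
To try the anti-join logic without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. It is an adaptation, not the query above verbatim: SQLite's group_concat stands in for string_agg, the bracketed alias becomes Prev_Semester, and the data is the sample from the question.

```python
import sqlite3

# In-memory SQLite database loaded with the question's sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tbl (Student TEXT, Class TEXT, Semester INTEGER)")
conn.executemany("INSERT INTO Tbl VALUES (?, ?, ?)", [
    ("Alice", "English", 11), ("Alice", "Geometry", 11), ("Alice", "English", 12),
    ("Bob", "Spanish", 11), ("Bob", "Spanish", 12),
])

query = """
SELECT T.Student, T.Semester + 1 AS Semester, T.Semester AS Prev_Semester,
       group_concat(T.Class, ', ') AS Deleted_Classes  -- stand-in for string_agg
FROM Tbl AS T
LEFT JOIN Tbl AS T1
  ON T1.Student = T.Student
 AND T1.Semester = T.Semester + 1
 AND T1.Class = T.Class
WHERE T1.Semester IS NULL              -- class has no match in the next semester
  AND EXISTS (SELECT 1 FROM Tbl AS N   -- but the student does have a next semester
              WHERE N.Student = T.Student AND N.Semester = T.Semester + 1)
GROUP BY T.Student, T.Semester
"""
dropped = conn.execute(query).fetchall()
print(dropped)  # [('Alice', 12, 11, 'Geometry')]
```

The Exists check is what keeps every class of a student's last recorded semester from being reported as dropped.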

If your semester IDs do not increase strictly by 1, you can apply the same logic with a CTE and dense_rank to number each student's semesters in order, as follows:

With CTE As (
Select Student, Semester, Class, 
       Dense_Rank() Over (Partition by Student Order by Semester) As N
From Tbl
)
Select T.Student, Max(T2.Semester) As Semester, Max(T.Semester) As [Prev Semester], 
        string_agg(T.Class, ',') As Deleted_Classes
From CTE As T Left Join CTE As T1 On (T.Student=T1.Student 
                                      And T.N+1=T1.N
                                      And T.Class=T1.Class)
              Cross Apply (Select Distinct Semester 
                           From CTE 
                           Where Student=T.Student          
                                 And N=T.N+1) As T2
Where T1.N Is Null
Group by T.Student, T.N+1, T.N

Result:

Student    Semester    Prev Semester    Deleted_Classes
=======================================================
Alice      12          11               Geometry
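
To see why the ranking matters, here is a small sketch in Python's sqlite3 where the semester IDs have a gap (11, then 14; these values are hypothetical). It assumes SQLite 3.25 or later for window functions, uses group_concat in place of string_agg, and replaces Cross Apply with a plain join, since SQLite has no Apply operator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tbl (Student TEXT, Class TEXT, Semester INTEGER)")
# Hypothetical data: semester IDs jump from 11 to 14.
conn.executemany("INSERT INTO Tbl VALUES (?, ?, ?)", [
    ("Alice", "English", 11), ("Alice", "Geometry", 11), ("Alice", "English", 14),
    ("Bob", "Spanish", 11), ("Bob", "Spanish", 14),
])

query = """
WITH CTE AS (
    SELECT Student, Semester, Class,
           DENSE_RANK() OVER (PARTITION BY Student ORDER BY Semester) AS N
    FROM Tbl
)
SELECT T.Student, Nxt.Semester AS Semester, T.Semester AS Prev_Semester,
       group_concat(T.Class, ', ') AS Deleted_Classes
FROM CTE AS T
JOIN (SELECT DISTINCT Student, Semester, N FROM CTE) AS Nxt  -- the student's next semester
  ON Nxt.Student = T.Student AND Nxt.N = T.N + 1
LEFT JOIN CTE AS T1                                          -- same class in that semester
  ON T1.Student = T.Student AND T1.N = T.N + 1 AND T1.Class = T.Class
WHERE T1.N IS NULL
GROUP BY T.Student, Nxt.Semester, T.Semester
"""
dropped = conn.execute(query).fetchall()
print(dropped)  # [('Alice', 14, 11, 'Geometry')]
```

A query that compared Semester+1 directly would return nothing for this data, because no semester 12 exists.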

To get both the dropped and the added classes in a single query, you can use the following:

With CTE As (
Select Student, Semester, Class, 
       Dense_Rank() Over (Partition by Student Order by Semester) As N
From Tbl
)
Select T.Student, Max(T1.Semester) As Semester, Max(T.Semester) As [Prev Semester], 
        Max(T2.Deleted_Classes) As Deleted_Classes, Max(T3.Added_Classes) As Added_Classes
From (Select Distinct Student, Semester, N From CTE) As T Cross Apply
     (Select Distinct Student, Semester, N From CTE Where Student=T.Student And N=T.N+1) As T1
              Outer Apply (Select Student, Semester, string_agg(Class, ', ') As Deleted_Classes
                           From CTE 
                           Where Student=T.Student          
                                 And N=T.N
                                 And Class Not In (Select Class From CTE Where Student=T.Student And N=T.N+1)
                           Group by Student, Semester) As T2
              Outer Apply (Select Student, Semester, string_agg(Class, ', ') As Added_Classes
                           From CTE 
                           Where Student=T.Student          
                                 And N=T.N+1
                                 And Class Not In (Select Class From CTE Where Student=T.Student And N=T.N)
                           Group by Student, Semester) As T3
Group by T.Student, T.N+1, T.N
Having Max(T2.Deleted_Classes) Is Not Null Or Max(T3.Added_Classes) Is Not Null

Result:

Student    Semester    Prev Semester    Deleted_Classes    Added_Classes
=============================================================================
Alice      11          10               Portuguese         English, Math
Alice      12          11               Geometry, Math     Biology, Geography
Bob        12          11               Portuguese         Math

Using NOT EXISTS seems like a good fit for this.

create table StudentSemesters (
 Student varchar(30),
 Class varchar(30),
 Semester int
);

insert into StudentSemesters
(Student, Class, Semester) values
  ('Alice',     'English',     11) 
, ('Alice',     'Geometry',    11)
, ('Alice',     'English',     12)
, ('Bob',       'Spanish',     11) 
, ('Bob',       'Spanish',     12) 
;
select Student
, Semester+1 as [Semester] 
, Semester as [Prev_Semester]
, STRING_AGG(Class, ', ') as [Deleted_Classes] 
from StudentSemesters t
where not exists (
    select 1
    from StudentSemesters t2
    where t2.Student = t.Student
      and t2.Class = t.Class
      and t2.Semester = t.Semester+1
    )
  and exists (
    select 1
    from StudentSemesters t2
    where t2.Student = t.Student
      and t2.Semester = t.Semester+1
    )
group by Student, Semester;

Student    Semester    Prev_Semester    Deleted_Classes
=======================================================
Alice      12          11               Geometry

select 
  Student
, Semester as [Semester] 
, Semester-1 as [Prev_Semester]
, STRING_AGG(Class, ', ') as [Added_Classes] 
from StudentSemesters t
where not exists (
    select 1
    from StudentSemesters t2
    where t2.Student = t.Student
      and t2.Class = t.Class
      and t2.Semester = t.Semester-1
    )
group by Student, Semester;

Student    Semester    Prev_Semester    Added_Classes
==========================================================
Alice      11          10               English, Geometry
Bob        11          10               Spanish
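
Note that the added-classes query reports every class in a student's first recorded semester as added, because there is no earlier semester to compare against. If that is not wanted, one option is an EXISTS check that mirrors the one in the dropped-classes query. This is a sketch of that variation in Python's sqlite3, not part of the original answer: group_concat stands in for STRING_AGG, and the Biology row is hypothetical so that there is something to find.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE StudentSemesters (Student TEXT, Class TEXT, Semester INTEGER)")
conn.executemany("INSERT INTO StudentSemesters VALUES (?, ?, ?)", [
    ("Alice", "English", 11), ("Alice", "Geometry", 11), ("Alice", "English", 12),
    ("Alice", "Biology", 12),  # hypothetical extra row: a class added in semester 12
    ("Bob", "Spanish", 11), ("Bob", "Spanish", 12),
])

query = """
SELECT Student, Semester, Semester - 1 AS Prev_Semester,
       group_concat(Class, ', ') AS Added_Classes
FROM StudentSemesters AS t
WHERE NOT EXISTS (                 -- class was not taken in the previous semester
        SELECT 1 FROM StudentSemesters AS t2
        WHERE t2.Student = t.Student
          AND t2.Class = t.Class
          AND t2.Semester = t.Semester - 1)
  AND EXISTS (                     -- but the student did have a previous semester
        SELECT 1 FROM StudentSemesters AS t2
        WHERE t2.Student = t.Student
          AND t2.Semester = t.Semester - 1)
GROUP BY Student, Semester
"""
added = conn.execute(query).fetchall()
print(added)  # [('Alice', 12, 11, 'Biology')]
```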

Combined, using a full join:

select 
  coalesce(t1.Student, t2.Student) as Student
, coalesce(t2.Semester, t1.Semester+1) as [Semester] 
, coalesce(t1.Semester, t2.Semester-1) as [Prev_Semester]
, STRING_AGG(t1.Class, ', ') as [Deleted_Classes] 
, STRING_AGG(t2.Class, ', ') as [Added_Classes]  
from StudentSemesters t1
full join StudentSemesters t2
  on t2.Student = t1.Student
 and t2.Class = t1.Class
 and t2.Semester = t1.Semester+1
where (t1.Class is null or
 (t2.Class is null
  and exists (
    select 1
    from StudentSemesters t3
    where t3.Student = t1.Student
      and t3.Semester = t1.Semester+1
    )))
group by 
  coalesce(t1.Student, t2.Student), 
  coalesce(t2.Semester, t1.Semester+1), 
  coalesce(t1.Semester, t2.Semester-1) 
order by Student, Semester;

Student    Semester    Prev_Semester    Deleted_Classes    Added_Classes
=============================================================================
Alice      11          10               null               English, Geometry
Alice      12          11               Geometry           null
Bob        11          10               null               Spanish

with data as (
    select *,
        min(Semester) over (partition by Student, Class) as minSemester,
        max(Semester) over (partition by Student, Class) as maxSemester,
        count(*) over (partition by Student, Class) as cntSemester
    from T
    where Semester in (11, 12)
)
select Student, Class,
    case when minSemester = 12 then 'Added'   else '' end as Added,
    case when maxSemester = 11 then 'Dropped' else '' end as Dropped
from data
where maxSemester = 11;

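You can derive all kinds of information from these values. For example, if the latest semester for a class is not 12, the class was dropped. You can do something similar for additions.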
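
As a sketch of the "something similar for additions", the version below reports both cases at once by keeping rows where either minSemester = 12 (added) or maxSemester = 11 (dropped). It runs against SQLite through Python's sqlite3 (assuming SQLite 3.25+ for window functions); the single Status column and the Biology row are illustrative additions, not part of the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (Student TEXT, Class TEXT, Semester INTEGER)")
conn.executemany("INSERT INTO T VALUES (?, ?, ?)", [
    ("Alice", "English", 11), ("Alice", "Geometry", 11), ("Alice", "English", 12),
    ("Alice", "Biology", 12),  # hypothetical extra row: a class added in semester 12
    ("Bob", "Spanish", 11), ("Bob", "Spanish", 12),
])

query = """
WITH data AS (
    SELECT Student, Class,
           MIN(Semester) OVER (PARTITION BY Student, Class) AS minSemester,
           MAX(Semester) OVER (PARTITION BY Student, Class) AS maxSemester
    FROM T
    WHERE Semester IN (11, 12)
)
SELECT DISTINCT Student, Class,
       -- first seen in 12 means added; otherwise last seen in 11 means dropped
       CASE WHEN minSemester = 12 THEN 'Added' ELSE 'Dropped' END AS Status
FROM data
WHERE minSemester = 12 OR maxSemester = 11
ORDER BY Student, Class
"""
changes = conn.execute(query).fetchall()
print(changes)  # [('Alice', 'Biology', 'Added'), ('Alice', 'Geometry', 'Dropped')]
```

Classes present in both semesters have minSemester = 11 and maxSemester = 12, so they are filtered out.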

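Take the rows ordered by student and semester and group them by student; within each group, group the rows by semester and perform inter-row calculations. The difference between the spring semester's set of classes and the fall semester's set gives the added classes, and the difference between the fall semester's set and the spring semester's set gives the dropped classes. Writing this in SQL is cumbersome, because you need window functions, CROSS APPLY and OUTER APPLY, and the resulting statement is long and hard to follow. An alternative is to move the data out of the database and process it in Python or SPL. SPL, an open-source Java package, is easy to integrate into a Java program and produces simple code. Two lines are enough: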

     A
1    =MSSQL.query@x("select * from Classes order by 1,3")
2    =A1.group@o(#1).conj(~.group@o(#1,#3;~.(#2)).new(Student,#2[+1]:Semester,#2:Prev_Semester,(#3\#3[+1]).concat@c():Deleted_Classes,(#3[+1]\#3).concat@c():Added_Classes).m(:-2))
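
For the Python route mentioned above, a minimal sketch of the same group-then-diff idea might look like the following. It assumes the rows have already been fetched as (Student, Class, Semester) tuples; the function name schedule_changes is made up for illustration.

```python
from collections import defaultdict

# Sample rows from the question, as (Student, Class, Semester) tuples.
rows = [
    ("Alice", "English", 11), ("Alice", "Geometry", 11), ("Alice", "English", 12),
    ("Bob", "Spanish", 11), ("Bob", "Spanish", 12),
]

def schedule_changes(rows):
    """Compare each student's consecutive semesters; report dropped and added classes."""
    # student -> semester -> set of classes
    by_student = defaultdict(lambda: defaultdict(set))
    for student, cls, semester in rows:
        by_student[student][semester].add(cls)

    changes = []
    for student in sorted(by_student):
        semesters = sorted(by_student[student])
        # Pair each semester with the next one the student actually attended.
        for prev, cur in zip(semesters, semesters[1:]):
            dropped = by_student[student][prev] - by_student[student][cur]
            added = by_student[student][cur] - by_student[student][prev]
            if dropped or added:
                changes.append((student, cur, prev, sorted(dropped), sorted(added)))
    return changes

changes = schedule_changes(rows)
print(changes)  # [('Alice', 12, 11, ['Geometry'], [])]
```

Because semesters are paired by sorted order rather than by adding 1, gaps in the semester IDs are handled without any ranking step.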