Create flag column after comparing two tables
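I have two different report tables that contain a datetime and the report owner. I want to select the people who have written at least one report. I also need a computed field showing which report they wrote. Report 1 takes priority, so if someone has authored report 1 at any time, the new report_number column should show 1, otherwise 2 (for report 2).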
'people' table
| person_id | full_name
--------------------------
| 1 | John L Smith
| 2 | Carl M Selt
| 3 | Another Person
'report_1' table
| report_1_id | author_person_id | date_entered | other_columns
---------------------------------------------------------------
| 1 | 1 | 2018-01-12 | foo
| 2 | 1 | 2018-02-18 | foo foo
'report_2' table
| report_2_id | author_person_id | date_entered | other_columns
---------------------------------------------------------------
| 1 | 1 | 2018-03-21 | bar
| 2 | 1 | 2018-03-28 | bar bar
| 3 | 2 | 2018-04-16 | baz
| 4 | 2 | 2018-04-30 | baz baz
Desired result:
| full_name | report_number
---------------------------
| John L Smith | 1
| Carl M Selt | 2
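Note that John's report_number is 1, even though he also authored report 2.
Report 1 and report 2 have different additional columns, even though they look the same above.
What I tried: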
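For anyone who wants to try the queries against the sample data above, the tables could be set up roughly like this. This is only a sketch: the column types are assumptions, since the question gives only column names, and `people` is deliberately left without a primary key because it is said to contain duplicates.

```sql
-- Column types are assumptions; the question does not state them.
CREATE TABLE people (
    person_id INT,
    full_name VARCHAR(100)
);

CREATE TABLE report_1 (
    report_1_id      INT,
    author_person_id INT,
    date_entered     DATE,
    other_columns    VARCHAR(100)
);

CREATE TABLE report_2 (
    report_2_id      INT,
    author_person_id INT,
    date_entered     DATE,
    other_columns    VARCHAR(100)
);

-- Sample rows copied from the tables shown above.
INSERT INTO people (person_id, full_name) VALUES
    (1, 'John L Smith'),
    (2, 'Carl M Selt'),
    (3, 'Another Person');

INSERT INTO report_1 (report_1_id, author_person_id, date_entered, other_columns) VALUES
    (1, 1, '2018-01-12', 'foo'),
    (2, 1, '2018-02-18', 'foo foo');

INSERT INTO report_2 (report_2_id, author_person_id, date_entered, other_columns) VALUES
    (1, 1, '2018-03-21', 'bar'),
    (2, 1, '2018-03-28', 'bar bar'),
    (3, 2, '2018-04-16', 'baz'),
    (4, 2, '2018-04-30', 'baz baz');
```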
/* Get people from both reports */
WITH report_1_people AS (
SELECT P.full_name
FROM report_1 R1
INNER JOIN people P ON R1.author_person_id = P.person_id
WHERE P.full_name IS NOT NULL
AND P.full_name <> ''
), report_2_people AS (
SELECT P2.full_name
FROM report_2 R2
INNER JOIN people P2 ON R2.author_person_id = P2.person_id
WHERE P2.full_name IS NOT NULL
AND P2.full_name <> ''
)
SELECT
P.full_name,
CASE WHEN P.full_name IN ( /* Check if in report 1 */
SELECT full_name
FROM report_1)
THEN 1
ELSE 2
END AS report_number
FROM people P
WHERE P.full_name IS NOT NULL AND P.full_name <> ''
/* Eliminate duplicate names */
GROUP BY P.full_name
/* Filter only who either authored report 1 or report 2 */
HAVING P.full_name IN (SELECT full_name
FROM report_1_people)
OR P.full_name IN (SELECT full_name
FROM report_2_people)
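Note: the GROUP BY is there because the people table has duplicate entries for some reason.
The query ran so long that the database connection dropped (more than 24 hours), so I assume I am doing something wrong. Is there a better way to build this computed flag column from the two tables? I am relatively new to SQL, so I would also like to know whether there is a different way to think about the logic.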
You can use OUTER APPLY:
SELECT person_id, full_name, COALESCE(ca1.report_num, ca2.report_num) AS report_number
FROM people
OUTER APPLY (SELECT TOP (1) 1 FROM report_1 WHERE author_person_id = people.person_id) AS ca1(report_num)
OUTER APPLY (SELECT TOP (1) 2 FROM report_2 WHERE author_person_id = people.person_id) AS ca2(report_num)
/* Keep only people who authored at least one report */
WHERE ca1.report_num IS NOT NULL OR ca2.report_num IS NOT NULL
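The IN against the CTEs is most likely what is killing it.
Another approach is to use EXISTS to check whether a person has authored a report. A CASE expression can handle the priority.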
SELECT p.full_name,
CASE
WHEN EXISTS (SELECT *
FROM report_1 r1
WHERE r1.author_person_id = p.person_id) THEN
1
WHEN EXISTS (SELECT *
FROM report_2 r2
WHERE r2.author_person_id = p.person_id) THEN
2
END report_number
FROM people p
WHERE EXISTS (SELECT *
FROM report_1 r1
WHERE r1.author_person_id = p.person_id)
OR EXISTS (SELECT *
FROM report_2 r2
WHERE r2.author_person_id = p.person_id);
For performance, try indexes on report_1 (author_person_id) and report_2 (author_person_id). For people, you can try an index on person_id (which probably already exists), or a composite index on person_id and full_name.
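As a rough sketch, the indexes suggested above could be created like this. The index names are hypothetical, and the composite index assumes that full_name has a type short enough to be used as an index key.

```sql
-- Hypothetical index names; adjust to your own naming conventions.
-- These let each EXISTS check seek on the author instead of scanning the report table.
CREATE INDEX IX_report_1_author_person_id ON report_1 (author_person_id);
CREATE INDEX IX_report_2_author_person_id ON report_2 (author_person_id);

-- Composite index so person_id and full_name can both be read from the index.
CREATE INDEX IX_people_person_id_full_name ON people (person_id, full_name);
```

Whether the optimizer actually uses them depends on the data, so it is worth comparing the execution plan before and after.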
This is just another way to get the result.
SELECT
P.full_name,
MIN( R.Report_Number) AS report_number
FROM people P
CROSS APPLY (SELECT 1 WHERE EXISTS(SELECT * FROM report_1 R1 WHERE R1.author_person_id = P.person_id)
UNION ALL
SELECT 2 WHERE EXISTS(SELECT * FROM report_2 R2 WHERE R2.author_person_id = P.person_id)) AS R(Report_Number)
WHERE P.full_name IS NOT NULL AND P.full_name <> ''
/* Eliminate duplicate names */
GROUP BY P.full_name;