N1QL join and aggregate with additional values
I have a bucket (Couchbase Community Edition 6.5) that contains the following documents:
employees {
    employeeGroupId: string,
    type: "Employee"
}
clocks {
    employeeId: string,
    areaId: string,
    date: string,
    type: "Clock"
}
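Each employee has several corresponding clock entries per day. For each employee I need:
- the first clock -> clockIn
- the last clock -> clockOut
I wrote the following query, which fetches the first and last clock entries and runs in under 100 ms: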
SELECT META(employee).id AS employeeId,
       employee.employeeGroupId,
       MIN(clock.date) AS clockIn,
       MAX(clock.date) AS clockOut
FROM `bucket` employee
LEFT JOIN `bucket` clock ON clock.employeeId = META(employee).id
    AND clock.type = "Clock"
    AND clock.date BETWEEN "2020-06-01T00:00:00.000Z" AND "2020-06-02T00:00:00.000Z"
WHERE employee.type = "Employee"
GROUP BY employee;
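The problem is that I also need the areaId of the matching clock entries.
I wrote the following query. It uses two separate subqueries that each sort all of the day's clock entries, one ascending and one descending, and select the first entry.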
CREATE INDEX adv_employeeId_type_date_blockId ON `bucket`(`employeeId`,`type`,`date`,`blockId`);
CREATE INDEX adv_employeeId_type_date ON `bucket`(`employeeId`,`type`,`date`);
CREATE INDEX adv_type_employeeId_date ON `bucket`(`type`,`employeeId`,`date`);
SELECT META(employee).id AS employeeId,
       employee.employeeGroupId,
       clockIn,
       clockOut
FROM `bucket` employee
LEFT JOIN (
    SELECT obj.employeeId,
           obj.date,
           obj.areaId
    FROM `bucket` obj
    WHERE obj.employeeId = META(employee).id
        AND obj.type = "Clock"
        AND obj.date BETWEEN "2020-06-01T00:00:00.000Z" AND "2020-06-02T00:00:00.000Z"
    ORDER BY obj.date
    LIMIT 1) clockIn ON clockIn.employeeId = META(employee).id
LEFT JOIN (
    SELECT obj.employeeId,
           obj.date,
           obj.areaId
    FROM `bucket` obj
    WHERE obj.employeeId = META(employee).id
        AND obj.type = "Clock"
        AND obj.date BETWEEN "2020-06-01T00:00:00.000Z" AND "2020-06-02T00:00:00.000Z"
    ORDER BY obj.date DESC
    LIMIT 1) clockOut ON clockOut.employeeId = META(employee).id
WHERE employee.type = "Employee"
GROUP BY employee,
         clockIn,
         clockOut;
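The problem is that the query above is inefficient: it takes more than 10 seconds to run.
In other words, I need to get additional object values alongside the MIN() and MAX() aggregates.
I'm sure the second query is not the most efficient way to do this. Does anyone have other suggestions?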
CREATE INDEX ix1 ON `bucket`(type, `employeeGroupId`) WHERE type = "Employee";
CREATE INDEX ix2 ON `bucket`(`employeeId`, date, areaId) WHERE type = "Clock";
SELECT META(employee).id AS employeeId,
       employee.employeeGroupId,
       minclock[0] AS clockIn,
       minclock[1] AS clockInAreaId,
       maxclock[0] AS clockOut,
       maxclock[1] AS clockOutAreaId
FROM `bucket` AS employee
LEFT JOIN `bucket` AS clock ON clock.employeeId = META(employee).id
    AND clock.type = "Clock"
    AND clock.date BETWEEN "2020-06-01T00:00:00.000Z" AND "2020-06-02T00:00:00.000Z"
WHERE employee.type = "Employee"
GROUP BY employee
LETTING minclock = MIN([clock.date, clock.areaId]),
        maxclock = MAX([clock.date, clock.areaId]);
OR
SELECT META(employee).id AS employeeId,
       employee.employeeGroupId,
       MIN([clock.date, {clock.date, clock.areaId}])[1] AS clockIn,
       MAX([clock.date, {clock.date, clock.areaId}])[1] AS clockOut
FROM `bucket` AS employee
LEFT JOIN `bucket` AS clock ON clock.employeeId = META(employee).id
    AND clock.type = "Clock"
    AND clock.date BETWEEN "2020-06-01T00:00:00.000Z" AND "2020-06-02T00:00:00.000Z"
WHERE employee.type = "Employee"
GROUP BY employee;
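Use MIN/MAX on an ARRAY. The 0th element is the expression you actually want the MIN/MAX of. The remaining array positions are used only to break ties (similar to ORDER BY with multiple fields). The result is the complete ARRAY expression.
Pick the position you want to project. This technique lets you project expressions that are not part of the GROUP BY.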
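To see the array comparison in isolation, here is a minimal sketch that runs MIN/MAX over a small inline array instead of the bucket. The three entries and their areaId values ("area-1" and so on) are made-up sample data standing in for one employee's Clock documents, not values from the question.

```sql
/* Sketch only: the inline array is hypothetical sample data, not real
   Clock documents. Each row becomes a [date, areaId] array, and MIN/MAX
   compare those arrays element by element, starting with the date. */
SELECT MIN([c.date, c.areaId]) AS minclock,
       MAX([c.date, c.areaId]) AS maxclock
FROM [
    {"date": "2020-06-01T08:00:00.000Z", "areaId": "area-1"},
    {"date": "2020-06-01T12:30:00.000Z", "areaId": "area-2"},
    {"date": "2020-06-01T17:00:00.000Z", "areaId": "area-3"}
] AS c;
```

Because the date sits in position 0, minclock should be the array built from the 08:00 entry and maxclock the one built from the 17:00 entry, so position 1 of each carries the areaId of that same entry. This relies on the dates being ISO 8601 strings in the same format and time zone, which sort chronologically when compared as strings.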