Trying to fetch the first record from a group in SQL
I'm trying to write a SQL query that returns the earliest timestamp for each candidate, project, and contract.
spark.sql(
"""
|SELECT
| DISTINCT
| timestamp,
| candidate_id,
| project_id,
| contract_id
|FROM candidatesHistory
|GROUP BY timestamp, candidate_id, project_id, contract_id
|ORDER BY timestamp DESC
|LIMIT 1
|""".stripMargin)
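This code doesn't do that: it returns only a single record. How can I get the earliest timestamp for each candidate, project, and contract?
Any help is appreciated.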
If the table has only these four columns, you can use aggregation:
select candidate_id, project_id, contract_id, min(timestamp) first_timestamp
from candidateshistory
group by candidate_id, project_id, contract_id
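To see the aggregation at work without a Spark session, here is a minimal sketch that runs the same query against an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for Spark SQL here, and the sample rows are invented for illustration only.

```python
import sqlite3

# Assumption: SQLite stands in for Spark SQL; the rows below are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE candidatesHistory ("
    "timestamp TEXT, candidate_id INTEGER, project_id INTEGER, contract_id INTEGER)"
)
conn.executemany(
    "INSERT INTO candidatesHistory VALUES (?, ?, ?, ?)",
    [
        ("2021-01-03", 1, 10, 100),
        ("2021-01-01", 1, 10, 100),
        ("2021-01-02", 1, 11, 101),
        ("2021-01-05", 2, 10, 102),
        ("2021-01-04", 2, 10, 102),
    ],
)

# One output row per (candidate_id, project_id, contract_id) group,
# carrying the smallest timestamp seen in that group.
earliest = conn.execute(
    """
    SELECT candidate_id, project_id, contract_id, MIN(timestamp) AS first_timestamp
    FROM candidatesHistory
    GROUP BY candidate_id, project_id, contract_id
    ORDER BY candidate_id, project_id, contract_id
    """
).fetchall()

for row in earliest:
    print(row)
```

Note that timestamp is left out of the GROUP BY clause. Grouping by it, as in the question, makes every distinct timestamp its own group, so nothing gets collapsed.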
If there are more columns and you want to return all of them, you can filter the table with row_number():
select ch.*
from (
select ch.*,
row_number() over(partition by candidate_id, project_id, contract_id order by timestamp) rn
from candidateshistory ch
) ch
where rn = 1
For each (candidate_id, project_id, contract_id) tuple, this gives you the row with the earliest timestamp.
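As a rough illustration of the row_number() filter, the sketch below runs the same pattern in an in-memory SQLite database (window functions need SQLite 3.25 or newer). The extra status column and the sample rows are hypothetical, added only to show that the non-key columns of the earliest row come along.

```python
import sqlite3

# Assumption: SQLite stands in for Spark SQL; `status` is a hypothetical
# extra column and the rows are invented for this example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE candidatesHistory ("
    "timestamp TEXT, candidate_id INTEGER, project_id INTEGER, "
    "contract_id INTEGER, status TEXT)"
)
conn.executemany(
    "INSERT INTO candidatesHistory VALUES (?, ?, ?, ?, ?)",
    [
        ("2021-01-03", 1, 10, 100, "hired"),
        ("2021-01-01", 1, 10, 100, "applied"),
        ("2021-01-05", 2, 10, 102, "rejected"),
        ("2021-01-04", 2, 10, 102, "interviewed"),
    ],
)

# Number the rows inside each group by ascending timestamp,
# then keep only the first row of every group.
first_rows = conn.execute(
    """
    SELECT timestamp, candidate_id, project_id, contract_id, status
    FROM (
        SELECT ch.*,
               ROW_NUMBER() OVER (
                   PARTITION BY candidate_id, project_id, contract_id
                   ORDER BY timestamp
               ) AS rn
        FROM candidatesHistory ch
    ) ranked
    WHERE rn = 1
    ORDER BY candidate_id
    """
).fetchall()

for row in first_rows:
    print(row)
```

The outer query lists its columns explicitly so the helper column rn stays out of the result; selecting everything from the subquery would include it.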
This should work, though I don't know whether it's the best approach:
SELECT candidate_id
, project_id
, contract_id
, timestamp
FROM (
SELECT RANK() OVER (PARTITION BY candidate_id, project_id, contract_id ORDER BY timestamp) AS RNK
, candidate_id
, project_id
, contract_id
, timestamp -- must be selected here so the outer query can reference it
FROM candidatesHistory
) as CH
WHERE CH.RNK = 1;
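One difference between RANK() and ROW_NUMBER() is worth keeping in mind: when two rows in a group share the earliest timestamp, RANK() gives both of them rank 1, while ROW_NUMBER() numbers them 1 and 2. The sketch below shows this on an in-memory SQLite database standing in for Spark SQL. The helper count_first and the tied sample rows are hypothetical, made up for this illustration.

```python
import sqlite3

# Assumption: SQLite stands in for Spark SQL; the tied rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE candidatesHistory ("
    "timestamp TEXT, candidate_id INTEGER, project_id INTEGER, contract_id INTEGER)"
)
conn.executemany(
    "INSERT INTO candidatesHistory VALUES (?, ?, ?, ?)",
    [
        ("2021-01-01", 1, 10, 100),
        ("2021-01-01", 1, 10, 100),  # same earliest timestamp: a tie
        ("2021-01-02", 1, 10, 100),
    ],
)


def count_first(window_fn: str) -> int:
    """Count rows numbered 1 by the given window function (hypothetical helper)."""
    query = f"""
        SELECT COUNT(*)
        FROM (
            SELECT {window_fn}() OVER (
                       PARTITION BY candidate_id, project_id, contract_id
                       ORDER BY timestamp
                   ) AS rnk
            FROM candidatesHistory
        ) AS t
        WHERE rnk = 1
    """
    return conn.execute(query).fetchone()[0]


rank_count = count_first("RANK")              # both tied rows get rank 1
row_number_count = count_first("ROW_NUMBER")  # exactly one row gets number 1

print(rank_count, row_number_count)
```

If exactly one row per group is required, ROW_NUMBER() is the safer choice. Which of the tied rows it keeps is arbitrary unless the ORDER BY includes a tie-breaking column.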