Hive/Impala. Query to get rows in ranges which satisfy a condition
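I am quite new to Impala/Hive queries and I am not sure how to write this one.
The objective of the query is to get the data inside defined ranges (between two points that satisfy a condition).
To make it clearer: we have a table with 3 columns: date, A and B.
With the table ordered by date, we want all rows from every interval between two consecutive A=1 rows that contains no B=1. (So the ranges are between every two A=1, and the condition is that there is no B=1 inside.)
I drew the concept I am looking for, so it becomes clearer.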
Link: https://drive.google.com/open?id=0B_zAJFzI2slWQnRwN2gwWk9NSG8
select  dt,A,B
from   (select  dt,A,B
               -- nearest A=1 / B=1 strictly before the current row
               ,max (case when A=1 then dt end) over p as p_A1_dt
               ,max (case when B=1 then dt end) over p as p_B1_dt
               -- nearest A=1 / B=1 strictly after the current row
               ,min (case when A=1 then dt end) over f as f_A1_dt
               ,min (case when B=1 then dt end) over f as f_B1_dt
        from    mytable
        window  p as (order by dt rows between unbounded preceding and 1 preceding)
               ,f as (order by dt rows between 1 following and unbounded following)
       ) t
where   (   p_A1_dt >= p_B1_dt
         or (    p_A1_dt is not null
             and p_B1_dt is null
            )
        )
    and (   f_A1_dt <= f_B1_dt
         or (    f_A1_dt is not null
             and f_B1_dt is null
            )
        )
    and coalesce(A,-1) <> 1
    -- a row with B=1 disqualifies its own interval, so it is never returned
    and coalesce(B,-1) <> 1
The same, but without the window declaration:
select  dt,A,B
from   (select  dt,A,B
               -- nearest A=1 / B=1 strictly before the current row
               ,max (case when A=1 then dt end) over (order by dt rows between unbounded preceding and 1 preceding) as p_A1_dt
               ,max (case when B=1 then dt end) over (order by dt rows between unbounded preceding and 1 preceding) as p_B1_dt
               -- nearest A=1 / B=1 strictly after the current row
               ,min (case when A=1 then dt end) over (order by dt rows between 1 following and unbounded following) as f_A1_dt
               ,min (case when B=1 then dt end) over (order by dt rows between 1 following and unbounded following) as f_B1_dt
        from    mytable
       ) t
where   (   p_A1_dt >= p_B1_dt
         or (    p_A1_dt is not null
             and p_B1_dt is null
            )
        )
    and (   f_A1_dt <= f_B1_dt
         or (    f_A1_dt is not null
             and f_B1_dt is null
            )
        )
    and coalesce(A,-1) <> 1
    -- a row with B=1 disqualifies its own interval, so it is never returned
    and coalesce(B,-1) <> 1
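
To sanity-check the logic on a handful of rows without a cluster, the same idea can be sketched in plain Python. This is only an illustration, not Hive or Impala code: the function name `rows_between_a_markers` and the sample rows are made up, and `dt` is assumed to be any sortable value (integers here). The sketch also skips rows whose own B is 1, on the assumption that such a row makes its interval non-qualifying.

```python
def rows_between_a_markers(rows):
    """rows: iterable of (dt, a, b) tuples. Returns the rows that lie between
    two A=1 rows with no B=1 in between (hypothetical helper)."""
    rows = sorted(rows, key=lambda r: r[0])
    n = len(rows)

    # Forward pass: latest dt with A=1 / B=1 strictly before each row
    # (plays the role of the "unbounded preceding .. 1 preceding" window).
    prev_a, prev_b = [None] * n, [None] * n
    last_a = last_b = None
    for i, (dt, a, b) in enumerate(rows):
        prev_a[i], prev_b[i] = last_a, last_b
        if a == 1:
            last_a = dt
        if b == 1:
            last_b = dt

    # Backward pass: earliest dt with A=1 / B=1 strictly after each row
    # (plays the role of the "1 following .. unbounded following" window).
    next_a, next_b = [None] * n, [None] * n
    first_a = first_b = None
    for i in range(n - 1, -1, -1):
        dt, a, b = rows[i]
        next_a[i], next_b[i] = first_a, first_b
        if a == 1:
            first_a = dt
        if b == 1:
            first_b = dt

    result = []
    for i, (dt, a, b) in enumerate(rows):
        # An A=1 precedes the row, and no B=1 is more recent than it.
        opened = prev_a[i] is not None and (prev_b[i] is None or prev_a[i] >= prev_b[i])
        # An A=1 follows the row, and no B=1 comes sooner than it.
        closed = next_a[i] is not None and (next_b[i] is None or next_a[i] <= next_b[i])
        if opened and closed and a != 1 and b != 1:
            result.append((dt, a, b))
    return result


# Made-up sample: A=1 at dt 1, 4 and 8; B=1 at dt 6.
SAMPLE = [
    (1, 1, 0), (2, 0, 0), (3, 0, 0), (4, 1, 0), (5, 0, 0),
    (6, 0, 1), (7, 0, 0), (8, 1, 0), (9, 0, 0), (10, 0, 0),
]

print(rows_between_a_markers(SAMPLE))  # [(2, 0, 0), (3, 0, 0)]
```

Only the interval between dt 1 and dt 4 qualifies. The interval between dt 4 and dt 8 contains a B=1 at dt 6, and the rows after dt 8 are not followed by another A=1.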