Filling in NULLS with previous records - Netezza SQL
I am using Netezza SQL in Aginity Workbench and have the following data:
id DATE1 DATE2
1 2013-07-27 NULL
2 NULL NULL
3 NULL 2013-08-02
4 2013-09-10 2013-09-23
5 2013-12-11 NULL
6 NULL 2013-12-19
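I need to fill every NULL in DATE1 with the previous populated value of DATE1. For DATE2 I need to do the same, but in the opposite direction: each NULL should take the next populated value. So my desired output is: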
id DATE1 DATE2
1 2013-07-27 2013-08-02
2 2013-07-27 2013-08-02
3 2013-07-27 2013-08-02
4 2013-09-10 2013-09-23
5 2013-12-11 2013-12-19
6 2013-12-11 2013-12-19
I only have read access to the data, so creating tables or views is not an option.
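I think Netezza supports an order by clause for max() and min() when they are used as window functions. So you can do the following, which works as long as the dates increase with id, as they do in the sample data:
select max(date1) over (order by id) as date1,
       min(date2) over (order by id desc) as date2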
. . .
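EDIT:
In Netezza, you can do this with last_value() and first_value():
select coalesce(date1,
                last_value(date1 ignore nulls) over (order by id rows between unbounded preceding and 1 preceding)) as date1,
       coalesce(date2,
                first_value(date2 ignore nulls) over (order by id rows between 1 following and unbounded following)) as date2
Netezza doesn't seem to support IGNORE NULLS on LAG(), but it does support it on these functions. The frames above exclude the current row, so coalesce() keeps a row's own value when it is not NULL.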
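For reference, here is what those two window functions are expected to compute, written as a plain Python sketch over the sample rows from the question. The helper names `fill_forward` and `fill_backward` are made up for illustration; this is not Netezza code, only a way to check the desired output by hand.

```python
from typing import List, Optional


def fill_forward(values: List[Optional[str]]) -> List[Optional[str]]:
    """Replace each None with the most recent non-None value before it."""
    filled = []
    last = None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def fill_backward(values: List[Optional[str]]) -> List[Optional[str]]:
    """Replace each None with the next non-None value after it."""
    return fill_forward(values[::-1])[::-1]


# Sample columns from the question, ordered by id 1..6.
date1 = ["2013-07-27", None, None, "2013-09-10", "2013-12-11", None]
date2 = [None, None, "2013-08-02", "2013-09-23", None, "2013-12-19"]

filled = zip(fill_forward(date1), fill_backward(date2))
for row_id, (d1, d2) in enumerate(filled, start=1):
    print(row_id, d1, d2)
```

A leading NULL in DATE1 (or a trailing NULL in DATE2) stays NULL in this sketch, which matches what the window functions would return when there is no earlier (or later) value to pick up.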
I have only tested this in Oracle; hopefully it works in Netezza too:
Fiddle: http://www.sqlfiddle.com/#!4/7533f/1/0
select id,
       -- date1 falls back to the nearest earlier value, then the nearest later one
       coalesce(date1, t1_date1, t2_date1) as date1,
       -- date2 falls back to the nearest later value, then the nearest earlier one
       coalesce(date2, t4_date2, t3_date2) as date2
from (select t.*,
             t1.date1 as t1_date1,
             t2.date1 as t2_date1,
             t3.date2 as t3_date2,
             t4.date2 as t4_date2,
             -- rn = 1 is the row joined to the nearest non-null neighbours
             row_number() over(partition by t.id
                               order by t1.id desc, t2.id, t4.id, t3.id desc) as rn
      from tbl t
      left join tbl t1
        on t1.id < t.id
       and t1.date1 is not null
      left join tbl t2
        on t2.id > t.id
       and t2.date1 is not null
      left join tbl t3
        on t3.id < t.id
       and t3.date2 is not null
      left join tbl t4
        on t4.id > t.id
       and t4.date2 is not null) x
where rn = 1
order by id
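Here is a way to fill the NULL dates with the nearest min/max non-null dates using self-joins. This query should work on most databases. Note that it relies on the dates increasing with id, as they do in the sample data.
select t1.id, max(t2.date1) as date1, min(t3.date2) as date2
from tbl t1
join tbl t2 on t1.id >= t2.id
join tbl t3 on t1.id <= t3.id
group by t1.id
order by t1.id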
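The self-join can be tried without any Netezza access. The sketch below is only a local check, not the original poster's setup: it loads the question's six rows into an in-memory SQLite database through Python's standard `sqlite3` module and runs the same query. Storing the dates as ISO-formatted text is an assumption made here so that max() and min() compare them correctly.

```python
import sqlite3

# Sample rows from the question; None becomes NULL.
ROWS = [
    (1, "2013-07-27", None),
    (2, None, None),
    (3, None, "2013-08-02"),
    (4, "2013-09-10", "2013-09-23"),
    (5, "2013-12-11", None),
    (6, None, "2013-12-19"),
]

QUERY = """
select t1.id, max(t2.date1) as date1, min(t3.date2) as date2
from tbl t1
join tbl t2 on t1.id >= t2.id   -- every row at or before t1
join tbl t3 on t1.id <= t3.id   -- every row at or after t1
group by t1.id
order by t1.id
"""

conn = sqlite3.connect(":memory:")
conn.execute("create table tbl (id integer primary key, date1 text, date2 text)")
conn.executemany("insert into tbl values (?, ?, ?)", ROWS)

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Because the aggregates skip NULLs, max(t2.date1) is the latest DATE1 seen so far and min(t3.date2) is the earliest DATE2 still to come, which equals the nearest neighbour only while the dates rise with id.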
How about this?
select
id
,last_value(date1 ignore nulls) over (
order by id
rows between unbounded preceding and current row
) date1
,first_value(date2 ignore nulls) over (
order by id
rows between current row and unbounded following
) date2
from tbl -- table name assumed, as in the other answers
order by id
You can also compute this by hand instead of relying on window functions.
with chain as (
select
this.*,
prev.date1 prev_date1,
case when prev.date1 is not null then abs(this.id - prev.id) else null end prev_distance,
next.date2 next_date2,
case when next.date2 is not null then abs(this.id - next.id) else null end next_distance
from
Table1 this
left outer join Table1 prev on this.id >= prev.id
left outer join Table1 next on this.id <= next.id
), min_distance as (
select
id,
min(prev_distance) min_prev_distance,
min(next_distance) min_next_distance
from
chain
group by
id
)
select
chain.id,
chain.prev_date1,
chain.next_date2
from
chain
join min_distance on
min_distance.id = chain.id
and chain.prev_distance = min_distance.min_prev_distance
and chain.next_distance = min_distance.min_next_distance
order by chain.id
If you can't compute the distance between ids by subtraction, just replace the ordering scheme with a row_number() call.
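A rough sketch of that row_number() variant follows, assuming ids that cannot be subtracted (the letters 'a' to 'f' are invented here for illustration). It is run against an in-memory SQLite database via Python's `sqlite3` module rather than Netezza, needs SQLite 3.25 or later for window functions, and renames the aliases to `cur`, `prv` and `nxt` to stay clear of reserved words.

```python
import sqlite3

# Same dates as the question, but with non-numeric ids (hypothetical).
ROWS = [
    ("a", "2013-07-27", None),
    ("b", None, None),
    ("c", None, "2013-08-02"),
    ("d", "2013-09-10", "2013-09-23"),
    ("e", "2013-12-11", None),
    ("f", None, "2013-12-19"),
]

QUERY = """
with numbered as (
    -- rn gives every row a position that can be subtracted
    select t.*, row_number() over (order by id) as rn
    from Table1 t
), chain as (
    select
        cur.id,
        prv.date1 as prev_date1,
        case when prv.date1 is not null then cur.rn - prv.rn end as prev_distance,
        nxt.date2 as next_date2,
        case when nxt.date2 is not null then nxt.rn - cur.rn end as next_distance
    from numbered cur
    left outer join numbered prv on cur.rn >= prv.rn
    left outer join numbered nxt on cur.rn <= nxt.rn
), min_distance as (
    select id,
           min(prev_distance) as min_prev_distance,
           min(next_distance) as min_next_distance
    from chain
    group by id
)
select chain.id, chain.prev_date1, chain.next_date2
from chain
join min_distance on
    min_distance.id = chain.id
    and chain.prev_distance = min_distance.min_prev_distance
    and chain.next_distance = min_distance.min_next_distance
order by chain.id
"""

conn = sqlite3.connect(":memory:")
conn.execute("create table Table1 (id text primary key, date1 text, date2 text)")
conn.executemany("insert into Table1 values (?, ?, ?)", ROWS)

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

As with the original query, a row that has no earlier DATE1 or no later DATE2 at all drops out of the result, because its minimum distance is NULL and the final join condition never matches.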