Proc sql: Creating a view by copying data from one year to another
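I have a problem for which I could not find a solution on Stack Overflow or elsewhere, and I am not sure whether it is possible at all with SAS proc sql.
My goal is to generate a view (z) based on a dataset (a) and a view (b). The problem is that (a) can be updated to the current year at the beginning of the year, while (b) can only be updated later in the year. Nevertheless, I want my view (z) to produce data at the beginning of the year already (even if it is provisional), which is naturally only possible once data is available in both (a) and (b). So what I want is for view (z) to take the latest available year in (b) and use it as the current year, that is, basically impute the data in (b) for the latest data year in (a). I tried to do this with the code below, but it does not quite work the way I want: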
proc sql;
  create view x1 as
  select ste.jahr, ste.gnr, ste.einwg, stk.stkabs
  from d18.fg_gji_steinh as ste
  inner join
  d18.fg_gji_skraft as stk
  on
    case when exists (select stkabs from d18.fg_gji_skraft where ste.jahr=stk.jahr)
      then ste.jahr=stk.jahr and ste.gnr=stk.gnr
      else input(ste.jahr,4.)=input(stk.jahr,4.)+1 and ste.gnr=stk.gnr
    end
  order by ste.jahr, ste.gnr ;
quit;
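It produces the expected data for the first and the last year, but not for the years in between, because there it generates two rows for a single observation. The first row contains the data of the current year, the other one the data of the previous year.
Does anybody have an idea how to solve this?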
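It looks like you just want to join twice: once matching on YEAR and once matching on YEAR+1. Then use the COALESCE() function to pick the value to report. So something like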
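select a.jahr
     , a.gnr
     , a.einwg
       /* prefer the same-year value, fall back to the previous year's value */
     , coalesce(b.stkabs,c.stkabs) as stkabs
from d18.fg_gji_steinh a
left join d18.fg_gji_skraft b
  on a.gnr=b.gnr and a.jahr=b.jahr
left join d18.fg_gji_skraft c
  /* jahr appears to be character (the question converts it with input()),
     so convert it to numeric before adding 1 */
  on a.gnr=c.gnr and input(a.jahr,4.)=input(c.jahr,4.)+1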
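The double join can be tried out in isolation with a small sketch like the one below. It assumes that jahr and gnr are character variables, as the question's code suggests. The demo_ table names and all values are made up for illustration and are not taken from the real d18 library.

```sas
/* Hypothetical sample data: demo_skraft has no row yet for 2020 */
data work.demo_steinh;
  length jahr $4 gnr $4;
  jahr='2019'; gnr='1001'; einwg=1.6; output;
  jahr='2020'; gnr='1001'; einwg=1.7; output;
run;

data work.demo_skraft;
  length jahr $4 gnr $4;
  jahr='2019'; gnr='1001'; stkabs=10; output;
run;

proc sql;
  create view work.demo_z as
  select a.jahr, a.gnr, a.einwg,
         /* same-year value first, otherwise the previous year's value */
         coalesce(b.stkabs, c.stkabs) as stkabs
  from work.demo_steinh as a
  left join work.demo_skraft as b
    on a.gnr=b.gnr and a.jahr=b.jahr
  left join work.demo_skraft as c
    on a.gnr=c.gnr and input(a.jahr,4.)=input(c.jahr,4.)+1;

  /* list the view to inspect the imputed row */
  select * from work.demo_z;
quit;
```

With this data the 2019 row gets its own value through alias b, while the 2020 row has no same-year match and picks up the 2019 value through alias c. Because each alias can match at most one row per key, no observation is duplicated.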
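The proposed join (see Tom's answer of Feb 6) works fine as long as the key variable values (jahr, gnr) stay the same across years.
However, there is an additional problem: several municipalities may merge into a single municipality at the beginning of a year. This shows up as a change in the municipality number (gnr).
The merged municipalities get the same number, which is either the number of the larger municipality or a new number.
The last known value of stkabs has to be copied to the following year(s), filling in the missing data.
If two municipalities merge, their values have to be added up.
Can this be expressed in a view?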
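Here is my solution, which uses some subqueries. Maybe there is a simpler way to do this?
I modelled the different territorial levels with a format.
It takes the territorial year and the municipality number
as its argument and returns the resulting municipality number at the new territorial level.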
/*
Input data to view: tables fg_gji_steinh and fg_gji_skraft,
the tax units and definitive tax force, respectively.
Data in fg_gji_skraft is missing in the last two years. This has to be filled
by the view with the last known data, in this example from year 2019.
Table fg_gji_steinh has to be joined with table fg_gji_skraft.
*/
data work.fg_gji_steinh;
jahr='2018'; gnr='1001'; einwg=1.5; output;
jahr='2018'; gnr='1002'; einwg=1.8; output;
jahr='2018'; gnr='1003'; einwg=2.0; output;
jahr='2019'; gnr='1001'; einwg=1.6; output;
jahr='2019'; gnr='1002'; einwg=1.8; output;
jahr='2019'; gnr='1003'; einwg=2.0; output;
jahr='2020'; gnr='1002'; einwg=1.8; output;
jahr='2020'; gnr='1010'; einwg=1.9; output;
jahr='2021'; gnr='1010'; einwg=2.1; output;
run;
data work.fg_gji_skraft;
jahr='2018'; gnr='1001'; stkabs=10; output;
jahr='2018'; gnr='1002'; stkabs=20; output;
jahr='2018'; gnr='1003'; stkabs=30; output;
jahr='2019'; gnr='1001'; stkabs=10; output;
jahr='2019'; gnr='1002'; stkabs=22; output;
jahr='2019'; gnr='1003'; stkabs=35; output;
jahr='2020'; gnr='1002'; stkabs=.; output;
jahr='2020'; gnr='1010'; stkabs=.; output;
jahr='2021'; gnr='1010'; stkabs=.; output;
run;
/*
Municipal merger format.
We assume that in 2018 there exist three municipalities: 1001, 1002, 1003.
The same in 2019.
At the beginning of 2020, municipalities 1001 and 1003 merge into 1010.
At the beginning of 2021, municipalities 1002 and 1010 merge into 1010.
*/
proc format;
value $cjrgdgdf
'20201001'='1010'
'20201002'='1002'
'20201003'='1010'
'20201010'='1010'
'20211001'='1010'
'20211002'='1010'
'20211003'='1010'
'20211010'='1010'
;
run;
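A quick way to see what the format does is a one-off lookup in a DATA step. This is only a sketch and assumes the proc format step above has already been run in the same session; the variable name gnr_new is made up for illustration.

```sas
/* Look up the 2020 successor of municipality 1001 */
data _null_;
  length jahr $4 gnr $4 gnr_new $4;
  jahr='2020'; gnr='1001';
  /* key = territorial year followed by municipality number */
  gnr_new = put(cats(jahr, gnr), $cjrgdgdf.);
  put jahr= gnr= gnr_new=;
run;
```

The log line should show gnr_new=1010, since '20201001' is mapped to '1010' in the format.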
/* definitive tax force */
%macro select1;
  select
    jahr,
    gnr,
    stkabs
  from fg_gji_skraft
%mend select1;
/* join tax units and definitive tax force */
%macro select2;
  select
    t1.*,
    t2.stkabs
  from
    fg_gji_steinh as t1
    left join
    ( %select1 ) as t2
    on t1.jahr=t2.jahr and t1.gnr=t2.gnr
%mend select2;
/* last data year of definitive tax force */
%macro select3;
  select max(jahr) as maxjahr
  from ( %select1 )
  where stkabs ne .
%mend select3;
/* last known definitive tax force */
%macro select4;
  select
    t1.jahr,
    t1.gnr,
    t1.stkabs
  from ( %select2 ) as t1
  where t1.jahr=( %select3 )
%mend select4;
/* convert last known tax force to territorial level of first missing data year. */
%macro select5a;
  select
    put(input(jahr,4.)+1,4.) as jahr,
    put(calculated jahr||gnr,$cjrgdgdf.) as gnr,
    sum(stkabs) as stkabs
  from ( %select4 )
  group by jahr, calculated gnr
%mend select5a;
/* convert last known tax force to territorial level of second missing data year. */
%macro select5b;
  select
    put(input(jahr,4.)+2,4.) as jahr,
    put(calculated jahr||gnr,$cjrgdgdf.) as gnr,
    sum(stkabs) as stkabs
  from ( %select4 )
  group by jahr, calculated gnr
%mend select5b;
/* join definitive tax force with provisional tax force. */
%macro select6;
  select
    t1.jahr,
    t1.gnr,
    coalesce(t1.stkabs, t2.stkabs, t3.stkabs) as stkabs
  from
    ( %select2 ) as t1
    left join
    ( %select5a ) as t2
    on t1.jahr=t2.jahr and t1.gnr=t2.gnr
    left join
    ( %select5b ) as t3
    on t1.jahr=t3.jahr and t1.gnr=t3.gnr
%mend select6;
/* finally join tax units with tax force */
proc sql;
  create view fg_gji_steuerfuss_gb as
  select
    t1.jahr,
    t1.gnr,
    t1.einwg,
    t2.stkabs
  from
    fg_gji_steinh as t1,
    ( %select6 ) as t2
  where
    t1.jahr=t2.jahr and t1.gnr=t2.gnr
  ;
quit;
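To check the view against the sample data, a query like the following could be run. It is a verification sketch only and assumes the data steps, the format and the %select macros above were run in the same session.

```sas
/* Sanity check: list only the years that had to be filled by the view */
proc sql;
  select jahr, gnr, einwg, stkabs
  from fg_gji_steuerfuss_gb
  where jahr in ('2020', '2021')
  order by jahr, gnr;
quit;
```

Working through the sample data by hand, and provided the view behaves as intended, one would expect stkabs to be 22 for 2020/1002 (carried over from 2019), 45 for 2020/1010 (10 + 35 from the merged 1001 and 1003) and 67 for 2021/1010 (10 + 22 + 35).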