SAS: using group by in proc sql doesn't separate out instances chronologically

Consider the following SAS code:

data test;
    format dt date9.
           ctry_cd $2. /* two-character country code */
           sn 8.;
    input ctry_cd sn dt;
    datalines;
    US 1 20000
    US 1 20001
    US 1 20002
    CA 1 20003
    CA 1 20004
    US 1 20005
    US 1 20006
    US 1 20007
    ES 2 20001
    ES 2 20002
    ;
run;

proc sql;
    create table check as
    select
        sn,
        ctry_cd,
        min(dt) as begin_dt format date9.,
        max(dt) as end_dt format date9.
    from test
    group by sn, ctry_cd;
quit;

This returns:

1 CA 07OCT2014 08OCT2014
1 US 04OCT2014 11OCT2014
2 ES 05OCT2014 06OCT2014

I would like proc sql to distinguish between the separate stays in each country; that is, to return:

1 US 04OCT2014 06OCT2014
1 CA 07OCT2014 08OCT2014
1 US 09OCT2014 11OCT2014
2 ES 05OCT2014 06OCT2014 

So it should still group the instances by sn and ctry_cd, but take the dates into account, so that I end up with a timeline.

Then you need to create another grouping variable:

data test;
  set test;
  prev_ctry_cd=lag(ctry_cd); /* ctry_cd from the previous row */
  if prev_ctry_cd ^= ctry_cd then group+1; /* start a new group whenever the country changes */
run;

proc sql;
    create table check as
    select 
        sn,
        ctry_cd,
        min(dt) as begin_dt format date9.,
        max(dt) as end_dt format date9.
    from test
    group by group,  sn, ctry_cd
    order by group;
quit;
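
As a variation, the counter can probably also be built with BY-group processing instead of lag(). This is only a sketch: test_runs and run_id are made-up names, and it assumes the rows are already in chronological order within each sn. With the notsorted option, first.ctry_cd is set whenever either sn or ctry_cd changes from the previous row, so each consecutive run gets its own number.

```sas
/* Sketch only: test_runs and run_id are hypothetical names. */
data test_runs;
    set test;
    by sn ctry_cd notsorted;           /* notsorted: ctry_cd values may repeat later */
    if first.ctry_cd then run_id + 1;  /* new run whenever sn or ctry_cd changes */
run;

proc sql;
    create table check as
    select
        sn,
        ctry_cd,
        min(dt) as begin_dt format date9.,
        max(dt) as end_dt format date9.
    from test_runs
    group by run_id, sn, ctry_cd
    order by run_id;
quit;
```

This leaves the original test data set untouched and avoids using group as a column name inside proc sql.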

If the data is sorted as in your example, then you can achieve your goal in a single data step, without creating any extra variables.

data want;
keep sn ctry_cd begin_dt end_dt; /* keeps required variables and sets variable order */
set test;
by sn ctry_cd notsorted; /* notsorted option needed as ctry_cd is not in order */
retain begin_dt; /* retains value until needed */
if first.ctry_cd then begin_dt=dt; /* store first date for each new ctry_cd */
if last.ctry_cd then do;
    end_dt=dt; /* store last date for each new ctry_cd */
    output; /* output result */
end;
format begin_dt end_dt date9.;
run;
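
Both approaches depend on the rows already being in chronological order within each sn, as they are in the sample data. If that is not guaranteed, sorting first should restore the order they expect. A minimal sketch, assuming dt alone determines the order within each sn:

```sas
/* Assumption: chronological order within sn is fully determined by dt. */
proc sort data=test;
    by sn dt;
run;
```

Run this before either the lag step or the by-group data step.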