SAS conditional sum into new field

I am new to SAS and have a simple dataset called ORIG_DATA, from which I need to create a new dataset, SUMMARY, that shows the sum of total for each Salesman_ID by Day_ID.

Essentially, the SUMMARY output should look like the following, where the figures are the sums of total.

Salesman_ID|Day_1|Day_2
A          |30   |40
B          |60   |0
C          |20   |70

In SQL, I would write:

Select salesman_id, 
sum(case when day_id=1 then total else 0 end) as day_1,
sum(case when day_id=2 then total else 0 end) as day_2
from ORIG_DATA group by salesman_id

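However, for this problem I am not allowed to use proc sql. How else can I do this in SAS? I don't have the foggiest idea at the moment. Apologies for the non-tabular formatting.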

ORIG_DATA is as follows:

Day_ID|Salesman_ID|Other_field|total
1     |A          |R000       |10
1     |A          |R002       |20
2     |A          |R000       |10
2     |A          |R004       |30
1     |B          |R002       |20
1     |B          |R000       |40
1     |B          |R004       |0
2     |C          |R003       |40
2     |C          |R004       |10
1     |C          |R002       |20
2     |C          |R002       |20

How about this? I don't know whether each salesman_id has only two other_field records per day_id, so the following works for 1 to n records:

Input data:

data ORIG_DATA  ;
input Day_ID Salesman_ID $ Other_field $ total ;
cards ;
1  A  R000  10
1  A  R002  20
2  A  R000  10
2  A  R004  30
1  B  R002  20
1  B  R000  40
1  B  R004  0
2  C  R003  40
2  C  R004  10
1  C  R002  20
2  C  R002  20
;run; 

Transpose, sum, and transpose back:

proc sort data=ORIG_DATA ;
  by salesman_id day_id ;
proc transpose data=ORIG_DATA out=D1 ;
  by salesman_id  day_id ;
  var total ;
run ;

data D2 ;
  set D1 ;
  array D(*) col: ;
  _name_=cats('day_',day_id) ;
  by salesman_id day_id;
  total=sum(of D(*)) ; 
run ;

proc transpose data=D2 out=SUMMARY(drop=_name_) name=_name_;
  by salesman_id  ;
  var total ;
run ;

*Add zeros for missing values ;
data SUMMARY ;
  set SUMMARY ;
  array days day_: ;
  do over days ;
    if missing(days) then days=0;
  end ;
run ;

Another approach:

proc summary data=orig_data nway;
class day_id salesman_id;
var total;
output out=sum(drop=_:) sum=;
run;

proc sort data=sum;
by salesman_id day_id;
run;

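proc transpose data=sum out=want(drop=_name_) prefix=day_;
by salesman_id;
* Name the columns from day_id, so a salesman with no day 1 rows does not get day 2 totals in day_1 ;
id day_id;
var total;
run;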
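
Note that PROC TRANSPOSE leaves a missing value wherever a salesman has no rows for a day (salesman B has no day 2 rows in the sample data), whereas the desired output shows 0. A minimal sketch of one way to fill those in, assuming the transposed dataset is called want and its day columns all start with day_ as above; want_zero and i are names made up for this illustration:

```sas
/* Sketch only: replace missing day totals with zero.            */
/* Assumes WANT exists and its day columns share the day_ prefix. */
data want_zero;
  set want;
  array days[*] day_:;            /* every variable whose name starts with day_ */
  do i = 1 to dim(days);
    if missing(days[i]) then days[i] = 0;
  end;
  drop i;                          /* loop counter is not needed in the output */
run;
```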

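You can solve this with a simple data step; see the code below. Sort the data first, then use BY-group processing: reset day_1 and day_2 to zero at the start of each new group, and write a row to the output dataset only on the last observation of the group.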

Let me know if you have any questions.

data ORIG_DATA  ;
input Day_ID Salesman_ID $ Other_field $ total ;
cards ;
1  A  R000  10
1  A  R002  20
2  A  R000  10
2  A  R004  30
1  B  R002  20
1  B  R000  40
1  B  R004  0
2  C  R003  40
2  C  R004  10
1  C  R002  20
2  C  R002  20
;run;

proc sort data=orig_data;
   by salesman_id; 
RUN; 

data salesman_id (drop=Day_ID Other_field total); 
  set orig_data; 
  by salesman_id; 
   if first.salesman_id then do; 
     day_1 = 0; 
     day_2 = 0;
   end; 
  if day_id=1 then day_1 + total; 
  if day_id=2 then day_2 + total;
  if last.salesman_id then output; 
RUN; 
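
If there are more than two days, the same BY-group idea can be written with an array instead of one statement per day. The following is a sketch under stated assumptions, not a drop-in solution: it assumes day_id is an integer from 1 to a known maximum, and n_days, sorted, summary_n, d and i are hypothetical names introduced here.

```sas
/* Sketch only: BY-group accumulation generalised to N days.   */
/* Assumes day_id takes integer values 1..&n_days (assumption). */
%let n_days = 2;

proc sort data=orig_data out=sorted;
  by salesman_id;
run;

data summary_n (keep=salesman_id day_1-day_&n_days);
  set sorted;
  by salesman_id;
  array d[&n_days] day_1-day_&n_days;   /* one accumulator per day */
  retain day_1-day_&n_days;             /* keep values across rows */

  /* reset all accumulators at the start of each salesman */
  if first.salesman_id then do i = 1 to &n_days;
    d[i] = 0;
  end;

  /* add this row's total to the matching day; skip out-of-range day_id */
  if 1 <= day_id <= &n_days then d[day_id] = sum(d[day_id], total);

  /* write one row per salesman */
  if last.salesman_id then output;
run;
```

With the sample data and n_days set to 2, this should give the same three rows as the step above, including the 0 for salesman B on day 2.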

Similarly:

proc sort data = orig_data(drop = Other_field);
by salesman_id day_id;
run;

data test (drop = total);
  retain salesman_id day_id;
    set orig_data ;
  by salesman_id day_id notsorted;

  if first.day_id then sum = total;
  else sum + total;

  if last.day_id then output;

run;

proc transpose data = test out = t(drop=_:) prefix = day_id_;
by salesman_id;
id day_id;
var sum;
run;