SAS dynamic arrays with changing start and stop values
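I am working with a longitudinal dataset in which each row is a subject and each column is an event. There is no limit on the number of events a subject can have, and the events are coded in several ways. For this example, assume that one of those codings is binary (good, bad).
I am looking for:
1) All strings of 3 or more events (with no upper limit on the count) that fall within 24 hours of each other from start to finish, on the same subject. A single subject may meet this criterion more than once.
2) For each success (a string of 3 or more events within 24 hours), the number of good events.
I have included code that generates data similar to mine. For now I am simplifying to 26 observations, but a single subject can have up to 42.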
data examp;
informat subject 4. epdt1-epdt26 datetime20. good1-good26 1.;
input subject epdt1-epdt26 good1-good26;
format subject: 4. epdt: datetime20. good: 1.;
datalines;
3098 . . 25JUL1998:01:46:27 25JUL1998:02:16:05 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3021 13JAN1999:17:31:37 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1982 01FEB1998:02:29:01 12APR1999:19:49:00 03JUN2018:21:00:00 13AUG1999:13:39:00 . . . . . . . . . . . . . . . . . . . . . . 1 . . . . . . . . . . . . . . . . . . . . . . . . .
1093 11APR2015:16:10:57 30AUG2015:00:52:28 14SEP2015:08:24:25 09MAY1999:00:28:37 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4089 29JUN1998:05:18:34 23JUL1998:18:31:11 07FEB1999:05:25:45 07FEB1999:05:29:26 07FEB1999:05:32:04 07FEB1999:05:34:05 14FEB1999:18:00:13 14FEB1999:18:01:02 14FEB1999:18:03:24 14FEB1999:18:05:55 14FEB1999:18:16:45 14FEB1999:18:19:04 14FEB1999:18:31:57 14FEB1999:18:35:22 28JUL1998:18:32:02 31DEC1998:00:22:33 . . . . . . . . 1 1 1 1 1 1 1 1 1 1 1 . 1 . 1 . . . . . . . . . .
3055 18FEB1998:11:34:00 14JUL1998:01:20:34 13OCT1998:10:49:08 30OCT1998:18:14:58 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1239 07MAR1998:06:02:18 01JUN1998:08:18:20 23JUN1998:07:52:11 04JUL1998:08:47:04 29JUL1998:23:16:41 29JUL1998:23:30:03 29JUL1998:23:42:56 30JUL1998:00:08:03 30JUL1998:00:12:30 30JUL1998:00:14:58 30JUL1998:00:36:00 30JUL1998:00:38:33 30JUL1998:00:57:56 30JUL1998:01:01:03 30JUL1998:01:06:10 30JUL1998:01:16:50 30JUL1998:01:24:19 30JUL1998:01:32:30 30JUL1998:01:42:55 30JUL1998:01:50:24 30JUL1998:02:08:46 30JUL1998:02:20:18 30JUL1998:02:22:08 30JUL1998:02:28:52 30JUL1998:02:31:29 30JUL1998:02:51:29 . . 1 . 1 1 1 1 1 1 1 . 1 1 1 1 1 1 1 1 1 1 1 1 . 1
9834 10JUL1999:20:22:24 14JUL1999:00:52:02 14JUL1999:17:02:38 14JUL1999:17:30:06 21FEB2000:12:41:34 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
run;
proc sort data=examp; by subject;
data epwide_dt1;
format apppair 00.;
set examp;
by subject;
%macro loops;
array eptm (*)epdt1-epdt26; array apptm (*) good1-good26;
*********using the starting value for identifying pairs;
*******trimmed then for the sake of making the macro work;
%do start=1 %to 26;
%do stop=3 %to 26;
%if &start.<&stop. %then %do ;
/***********to figure out if the difference between the pairs of times are 24 hours;*/
tbtw=eptm[&stop.]-eptm[&start.];
/* *********number of points between them;*/
diff=(&stop.)- (&start.);
*******calculate the summaries between all episodes from start to stop;
array appr&start.&stop. (*) ap&start.-ap&stop.;
array stmct&start.&stop.(*) st&start.-st&stop.;
%do i=&start. %to &stop.;
******calculate the number of appropriate episodes;
if apptm[&i] ne . then appr&start.&stop.[&i]=apptm[&i];
else appr&start.&stop.[&i]=0;
totapp=sum(of appr&start.&stop.(*));
if totapp=. then totapp=0;
****after you calculate the total value dump the array before the next iteration;
/*call missing(of appr&start.&stop.{*});*/
if (eptm[&start.] ne . and eptm[&stop.] ne . and diff>=2 and .<tbtw<86400 and totapp>1 ) then do;
appPair=catx(" ",apppair,"(",strip(put(&start., 3.)),"-",strip(put(&stop.,3.)),":", strip(put(totapp,3.)),"Good)");
end;
%end;
%end;
%end;
%end;
%mend;
%loops ;
run;
This results in the following error message:
ERROR: Array subscript out of range at line 1 column 2.
apppair= subject=1093 epdt1=11APR2015:16:10:57 epdt2=30AUG2015:00:52:28 epdt3=14SEP2015:08:24:25
epdt4=09MAY1999:00:28:37 epdt5=. epdt6=. epdt7=. epdt8=. epdt9=. epdt10=. epdt11=. epdt12=. epdt13=. epdt14=. epdt15=.
epdt16=. epdt17=. epdt18=. epdt19=. epdt20=. epdt21=. epdt22=. epdt23=. epdt24=. epdt25=. epdt26=. good1=. good2=.
good3=. good4=. good5=. good6=. good7=. good8=. good9=. good10=. good11=. good12=. good13=. good14=. good15=. good16=.
good17=. good18=. good19=. good20=. good21=. good22=. good23=. good24=. good25=. good26=. FIRST.subject=1
LAST.subject=1 tbtw=1323117 diff=1 ap1=0 ap2=0 ap3=0 st1=. st2=. st3=. totapp=0 ap4=0 st4=. ap5=0 st5=. ap6=0 st6=.
ap7=0 st7=. ap8=0 st8=. ap9=0 st9=. ap10=0 st10=. ap11=0 st11=. ap12=0 st12=. ap13=0 st13=. ap14=0 st14=. ap15=0
st15=. ap16=0 st16=. ap17=0 st17=. ap18=0 st18=. ap19=0 st19=. ap20=0 st20=. ap21=0 st21=. ap22=0 st22=. ap23=0 st23=.
ap24=0 st24=. ap25=0 st25=. ap26=0 st26=. _ERROR_=1 _N_=1
NOTE: Missing values were generated as a result of performing an operation on missing values.
Each place is given by: (Number of times) at (Line):(Column).
1 at 35:20 1 at 57:20 1 at 83:20 1 at 113:20 1 at 147:20 1 at 185:20 1 at 227:20
1 at 273:20 1 at 323:20 1 at 377:20 1 at 435:20 1 at 497:20 1 at 563:20 1 at 633:20
1 at 707:20 1 at 785:20 1 at 867:20 1 at 953:20 1 at 1043:20 1 at 1137:20 1 at 1235:20
1 at 1337:20
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 2 observations read from the data set WORK.EXAMP.
WARNING: The data set WORK.EPWIDE_DT1 may be incomplete. When this step was stopped there were 0 observations and
109 variables.
WARNING: Data set WORK.EPWIDE_DT1 was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 2.35 seconds
cpu time 2.13 seconds
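Thanks in advance for any suggestions!
I am not sure I fully understand your whole problem. But consider that if you want to sum a subset of the values in an array, from index START to index STOP, you only need a DO loop.
For example, to sum X10 through X20 you could use code like this: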
array x (100) ;
start=10;
stop=20;
do i=start to stop;
total=sum(total,0,x(i));
end;
So you should be able to solve this without macro code, which should also make debugging easier.
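Applied to the data in the question, the DO-loop idea might look like the sketch below. It assumes the EXAMP data set and the epdt1-epdt26 / good1-good26 variables from the question; the output data set name `windows` and the variables `startidx`, `stopidx`, and `totgood` are hypothetical names chosen for illustration. It writes one row per qualifying window and, unlike the original attempt, does not filter on the number of good events, so that filter would still need to be added if it is wanted.

```sas
/* Sketch only: assumes the EXAMP data set from the question.        */
/* windows, startidx, stopidx and totgood are illustrative names.    */
data windows;
  set examp;
  array eptm  (*) epdt1-epdt26;
  array apptm (*) good1-good26;

  /* stopidx starts at startidx + 2, so every window spans 3+ slots */
  do startidx = 1 to dim(eptm) - 2;
    do stopidx = startidx + 2 to dim(eptm);
      /* only compare when both endpoints are present */
      if nmiss(eptm[startidx], eptm[stopidx]) = 0 then do;
        tbtw = eptm[stopidx] - eptm[startidx];   /* seconds between endpoints */
        if 0 < tbtw < 86400 then do;             /* less than 24 hours apart */
          totgood = 0;
          do i = startidx to stopidx;
            totgood = sum(totgood, apptm[i]);    /* SUM ignores missing values */
          end;
          output;                                /* one row per window */
        end;
      end;
    end;
  end;

  keep subject startidx stopidx tbtw totgood;
run;
```

Because the loop bounds come from `dim(eptm)`, moving from 26 to 42 events should only require changing the two ARRAY statements.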
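I finally got it working!!! I used @Tom's suggestion to eliminate the need to create a sub-array for each pair, since that was causing a lot of problems. I also simplified the output and had it write out each "Good" pair so that I can evaluate them more easily. Previously it was building appPair (my summary of the evaluation at each iteration of the start/stop loop), which produced a lot of extraneous output.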
data epwide_dt1;
set examp;
by subject;
if first.subject then totapp=0;
%macro loops;
array eptm (*)epdt1-epdt26;
array apptm (*) good1-good26;
*********using the starting value for identifying pairs;
%do start=1 %to 24;
%do stop=3 %to 26;
%if &start.<&stop. %then %do ;
totapp=0;
/***********to figure out if the difference between the pairs of times are 24 hours;*/
tbtw=eptm[&stop.]-eptm[&start.];
/* *********number of points between them;*/
diff=(&stop.)- (&start.);
%do i=&start. %to &stop.;
******calculate the number of good events;
totapp=sum(totapp, 0,apptm[&i]);
***output the summary on the pair that can be evaluated in the next step;
if &i=&stop. and (eptm[&start.] ne . and eptm[&stop.] ne . and diff>=2 and 0<tbtw<86400 and totapp>1 ) then do;
appPair=catx(" ","(",strip(put(&start., 3.)),"-",strip(put(&stop.,3.)),":", strip(put(totapp,3.)),"Good)");
output;
end;
%end;
%end;
%end;
%end;
%mend;
%loops ;
run;
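For reference, the "Array subscript out of range" error in the first attempt appears to come from the per-pair sub-arrays rather than from the data. A statement such as `array appr23 (*) ap2-ap3;` defines an array whose subscripts run from 1 to 2, yet the generated code indexes it with `&i`, which runs from `&start.` to `&stop.` (here 2 to 3). The log values `diff=1` and `tbtw=1323117` (epdt3 minus epdt2 for subject 1093) are consistent with the failure occurring at start=2, stop=3. The small sketch below, using hypothetical array names `sub` and `sub2`, shows the bounds SAS assigns and how explicit bounds would keep the original numbering:

```sas
/* Sketch only: sub, sub2 and the bp variables are illustrative names. */
data _null_;
  array sub  (*)   ap2-ap3;   /* two elements, subscripts 1 and 2      */
  array sub2 (2:3) bp2-bp3;   /* explicit bounds: subscripts 2 and 3   */

  n   = dim(sub);
  lo  = lbound(sub);
  hi  = hbound(sub);
  lo2 = lbound(sub2);
  hi2 = hbound(sub2);

  /* writes n=2 lo=1 hi=2 lo2=2 hi2=3 to the log */
  put n= lo= hi= lo2= hi2=;
run;
```

Indexing the full `apptm` array directly, as the working version above does, avoids the problem without needing sub-arrays at all.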