How to use NOBS in SAS to compute percentage frequency
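I imported a data file that has 246 observations. How can I use NOBS to replace the hard-coded 246 in the last statement of my code when computing the percentage?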
proc import datafile='G:\Data file\Dec 2014.csv'
out=out.datafile
dbms=csv
replace;
*SUMMARY TABLE IS MADE;
proc summary data = out.datafile missing;
class AGE sentiment; *age, sentiment are two columns in datafile;
var ;
output out = out.summ_age ;
run;
*SORTING SUMMARY TABLE BY AGE;
proc sort data = out.summ_age;
by age;
run;
*TRANSPOSING THE SORTED DATA WITH AGE AS OBS SENTIMENT VALUES AS COLUMNS;
PROC transpose data = out.summ_age out = out.hsi_age;
by age;
id sentiment;
var count;
run;
data out.hsi_age;
set out.hsi_age;
tot = d+s+i;
PERCENTAGE=round((tot/246)*100,0.01); /*PERCENTAGE PER GROUP*/
run;
If you want the number of observations in a data set, you can put that count into a macro variable. In your case:
proc sql ;
select count(*)
into :N
from out.datafile ;
quit ;
You can then reference it in the last step:
data out.hsi_age;
set out.hsi_age;
tot = d+s+i;
PERCENTAGE=round((tot/&N.)*100,0.01); /*PERCENTAGE PER GROUP*/
run;
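As a variation on the PROC SQL step above, the row count can also be read from SAS metadata instead of counting rows. This is a minimal sketch, assuming the library is assigned with the libref `out` and the data set is named `datafile` as in the question. `DICTIONARY.TABLES` stores librefs and member names in upper case, and the `TRIMMED` keyword (SAS 9.3 and later) strips the leading blanks that a plain `INTO :N` would leave in the macro variable.

```sas
proc sql noprint;
  /* NOBS in DICTIONARY.TABLES holds the stored observation count */
  select nobs into :N trimmed
    from dictionary.tables
    where libname = 'OUT' and memname = 'DATAFILE';
quit;

/* Write the value to the log to confirm it before using &N in a DATA step */
%put NOTE: out.datafile has &N observations;
```

The later DATA step can use `&N.` exactly as shown above.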
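So, the reason you have missing values and 200% is that your PROC SUMMARY produces more data than you want. You wanted age crossed with sentiment, but you got:
- age
- sentiment
- age*sentiment
- all data (grand total)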
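You need to ask for what you actually want. This does give us one small benefit: you don't have to go looking for NOBS, because you can get the total from one of those extra rows (the ones where age is not crossed with sentiment).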
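To see the extra rows described above, the `_TYPE_` variable that PROC SUMMARY adds to its output can be tabulated. A minimal sketch on `sashelp.class`, with `sex` standing in for `sentiment`; the data set name `summ_all` is only illustrative.

```sas
/* No TYPES statement and no NWAY option: every combination of class variables is output */
proc summary data=sashelp.class missing;
  class age sex;
  output out=summ_all;
run;

/* _TYPE_ identifies the level of each output row:
   0 = grand total, 1 = sex only, 2 = age only, 3 = age*sex */
proc freq data=summ_all;
  tables _type_;
run;
```

Adding the `NWAY` option to the PROC SUMMARY statement would keep only the `_TYPE_ = 3` rows, while a `TYPES` statement selects specific levels.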
Suppose you have sashelp.class, with sex standing in for sentiment. You could do this:
proc summary data = sashelp.class missing;
class AGE sex; *sex replaces sentiment here;
types sex age*sex; *we want the grand total for each separate sex, and crossed with age;
var ;
output out = summ_age ;
run;
*SORTING SUMMARY TABLE BY AGE;
proc sort data = summ_age; by age;
run;
*TRANSPOSING THE SORTED DATA WITH AGE AS OBS SEX VALUES AS COLUMNS;
PROC transpose data = summ_age out = hsi_age;
by age;
id sex;
var _freq_;
run;
data hsi_age;
set hsi_age;
retain grand_tot;
if _n_=1 then grand_tot=M+F; *the first row has the null age values, grand total by sex;
else do;
tot = sum(M,F);
PERCENTAGE=round((tot/grand_tot)*100,0.01); /*PERCENTAGE PER GROUP*/
output;
end;
run;
You could also produce what you're looking for with proc tabulate, but this works just as well.
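For comparison, here is a rough sketch of what the PROC TABULATE route might look like on `sashelp.class`. The table layout is an assumption about the desired report, not something taken from the question: counts per sex in each age row, plus the row total and its share of all observations.

```sas
proc tabulate data=sashelp.class missing;
  class age sex;
  /* Rows: each age plus a total row.
     Columns: N for each sex, then N and percent of the overall count for the row total */
  table age all, sex*n all*(n pctn);
run;
```

With no denominator definition, `PCTN` is computed against the total frequency of the table, so no observation count has to be supplied by hand.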
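The NOBS= option creates an ordinary variable that you can use like any other variable in the given DATA step (the only difference being that it is never written to the output data set). I think what you want can be achieved like this: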
data out.hsi_age;
set out.hsi_age nobs=my_nobs_var;
tot = d+s+i;
PERCENTAGE=round((tot/my_nobs_var)*100,0.01); /*PERCENTAGE PER GROUP*/
run;
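One caveat worth checking: `NOBS=` reports the number of observations in the data set named on that `SET` statement, so in the step above it would count the rows of `out.hsi_age` (one per age group) rather than the 246 rows of the imported file. A possible sketch, assuming the denominator should come from `out.datafile` and that the transposed columns are named `d`, `s`, and `i` as in the question; `n_total` is an illustrative name.

```sas
data out.hsi_age;
  set out.hsi_age;
  /* This SET never executes (the condition is always false), but NOBS= is
     assigned at compile time. KEEP= limits which variables enter the step. */
  if 0 then set out.datafile(keep=age) nobs=n_total;
  /* SUM() ignores missing values, unlike d+s+i */
  tot = sum(d, s, i);
  PERCENTAGE = round((tot / n_total) * 100, 0.01); /* percentage per group */
run;
```

As with any `NOBS=` variable, `n_total` is not written to the output data set.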