How to use a value that comes from a different table?

Question

I created a dataset to calculate thresholds:

Data black;
Set blue;
Lower=p20-2;
Upper=p20+2;
Run;

The output looks like this, and these are the values I want to use:

Variables n  lower upper
Val      123  -0.2  0.1

I want to use upper and lower as thresholds:

Proc sql;
Create table one as
Select * from two
Where (Val < upper and Val > lower)
;quit;

Upper and lower should come from black, while Val should come from two. The table two looks like this:

ID Val
42 1471
32 1742
74 4819
...

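How can I bring the thresholds into my dataset so that I can filter the values from two?

One possible solution would be to add lower and upper to two as extra columns, but I don't know how to assign the values to those columns.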

Answer 1

If the bounds are static for all rows, you can read them into macro variables and reference them in the SQL query.

data black;
    set blue;
    Lower=p20-2;
    Upper=p20+2;

    /* Save the value of lower/upper to macro variables &lower and &upper */
    call symputx('lower', lower);
    call symputx('upper', upper);
run;

proc sql;
    create table one as
    select * from two
    where &lower. < Val < &upper.
    ;
quit;
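If you would rather not go through macro variables, another option is to attach the single row of black to every row of two. What follows is a minimal sketch, assuming black contains exactly one row; the output name one_sql is hypothetical and is only there so that both variants can be run side by side.

```sas
/* Variant 1: DATA step. Read the one row of black once, on the first
   iteration. Variables read with SET are retained automatically, so
   lower and upper stay available for every row of two. */
data one;
    if _n_ = 1 then set black(keep=lower upper);
    set two;

    /* Keep only rows whose Val lies strictly between the bounds */
    if lower < Val < upper;

    drop lower upper;
run;

/* Variant 2: PROC SQL. A cross join against a one-row table adds the
   bounds to every row without multiplying the number of rows. */
proc sql;
    create table one_sql as
    select t.*
    from two as t, black as b
    where b.lower < t.Val < b.upper
    ;
quit;
```

Both variants rely on the one-row assumption. If black had several rows, the DATA step would use only the first one, and the cross join would return a row of two once for every row of black whose bounds it satisfies.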

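If each ID has its own threshold, you can use a hash table to look up each value by its key. Hash tables are very efficient in SAS and are a great way to look up values from a small table while reading a large table.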

data one;
    set two;

    if(_N_ = 1) then do;
        /* Load the per-ID thresholds into a hash table.
           This assumes black has an id column next to lower and upper. */
        dcl hash h(dataset: 'black');
            h.defineKey('id');
            h.defineData('lower', 'upper');
        h.defineDone();

        /* Initialize lower/upper to missing */
        call missing(lower, upper);
    end;

    /* Take the ID of the current row and look it up in the hash table.
       If there is a match (rc = 0), lower/upper are set to the values for that ID */
    rc = h.Find();

    /* Only output the row if its ID was found and Val is between that ID's thresholds */
    if(rc = 0 and lower < Val < upper);

    drop rc;
run;

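If you prefer to use SQL, you can nudge the SQL optimizer toward a hash join with the undocumented magic=103 option. A hash join is sometimes more efficient when one of the joined tables is small.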

proc sql magic=103;
    create table one as
        select t1.*
        from two as t1
        /* The WHERE clause drops unmatched rows anyway, so an inner join is enough */
        INNER JOIN
            black as t2
        ON t1.id = t2.id
        where t2.lower < t1.val < t2.upper
    ;
quit;

sas

proc-sql