选择sas中字符长度最大的行

Choosing the row with the maximum character length in sas

我有以下数据集:

dataseta:
No. Name1 Name2                       Sales Inv Comp
1   TC    Tribal Council Inc          100   100  0
2.  TC    Tribal Council Limited INC  20    25   65

desired output:
    datasetb:
    No. Name1 Name2                       Sales Inv Comp
    1   TC    Tribal Council Limited Inc  120   125  0

基本上,我需要为列名2选择字符长度最大的行。 我尝试了以下方法,但没有用

proc sql;
create table datasetb as select no,name1,name2,sum(sales),sum(inv),min(comp) from dataseta group by 1,2,3 having length(name2)=max(length(name2));quit;

如果我执行以下代码,它只会部分解决它,并且我会得到重复的行

proc sql;
create table datasetb as select no,name1,max(length(name2)),sum(sales),sum(inv),min(comp) from dataseta group by 1,2 having length(name2)=max(length(name2));quit;

您似乎正在连接两个单独的聚合计算的结果。

假设:

no 是唯一的,以便允许打破平局的标准,第一个(根据 no)最长的 name2 将与 costinvcomp 总计超过 name1

查询将进行很多...

  • name1 中第一个最长的 name2,嵌套子查询需要:

    • 确定最长的name2,然后
    • Select第一个,按no,如果超过一个。
  • 总数超过 name1

    • 总计将是一个连接的子查询,用于提供所需的结果集。

示例 (SQL)

data have;
length no 8 name1  name2  sales inv comp 8;
input 
no name1& name2&                      sales inv  comp; datalines;
1   TC    Tribal Council Inc          100   100  0    * name1=TC group
2   TC    Tribal Council Limited INC  20    25   65
3   TC    Tribal council co           0     0    0 
4   TC    The Tribal council Assoctn  10    10   10
7   LS    Longshore association       10    10   0    * name=LS group
8   LS    The Longshore Group, LLC    2     4    8
9   LS    The Longshore Group, llc    15    15   6
run;

proc sql;
  create table want as

  select 
    first_longest_name2.no,
    first_longest_name2.name1,
    first_longest_name2.name2,
    name1_totals.sales,
    name1_totals.inv,
    name1_totals.comp

  FROM

    (
      select 
        no, name1, name2 
      from 
        ( select 
            no, name1, name2 
          from have
          group by name1
          having length(name2) = max(length(name2))
        ) longest_name2s
      group by name1
      having no = min(no)
    ) as
    first_longest_name2

  LEFT JOIN 

    ( 
      select  
        name1,
        sum(sales) as sales,
        sum(inv) as inv,
        sum(comp) as comp
      from 
        have
      group by name1
    ) as
    name1_totals

  ON

    first_longest_name2.name1 = name1_totals.name1
;
quit;

示例(数据步)

name1 组是连续的行时,以串行方式处理数据,可以使用 DOW 循环技术来完成——这是一个包含 SET 语句的循环。

data want2;
  do until (last.name1);
    set have;
    by name1 notsorted;

    if length(name2) > longest then do;
      longest = length(name2);
      no_at_longest = no;
      name2_at_longest = name2;
    end;

    sales_sum = sum(sales_sum,sales);
    inv_sum   = sum(inv_sum,inv);
    comp_sum  = sum(comp_sum,comp);
  end;

  drop name2 no sales inv comp longest;

  rename 
    no_at_longest = no
    name2_at_longest = name2
    sales_sum = sales
    inv_sum = inv
    comp_sum = comp
  ;
run;