Pig script sum within a bag

Sum the doubles and triples for each birthCity/birthState combination. Output the top 5 birthCity/birthState combinations whose players hit the most doubles and triples.

This is what I have so far:

clean = FOREACH filtered_2 GENERATE id,city,state, dble + tripple AS combined;
dump clean; 

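My question is how to satisfy the requirement above. Obviously I have to group by (city, state). Once I group, how do I get the sum within each bag?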

 counter = foreach clean {
    sum1 = SUM(combined);
    generate id,city,state,sum1;
 };

I was thinking of something like the snippet above, but it doesn't work.

Group the relation clean by (city, state), then use SUM to get the total for each (city, state) group.

clean = FOREACH filtered_2 GENERATE id,city,state,(dble + tripple) AS combined;
clean_group = GROUP clean BY (city,state);
counter = FOREACH clean_group GENERATE FLATTEN(group) as (city,state),SUM(clean.combined) as sum1;
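
To get the top 5 combinations the task asks for, one option is to sort the grouped totals in descending order and keep the first five rows. This is a minimal sketch that assumes the counter relation from the script above; the relation names ordered and top5 are only illustrative.

-- sort the per-(city,state) totals, largest first
ordered = ORDER counter BY sum1 DESC;
-- keep only the first five rows of the sorted relation
top5 = LIMIT ordered 5;
DUMP top5;

The LIMIT is applied to the sorted relation rather than to counter directly, since LIMIT on an unsorted relation gives no guarantee about which rows are returned.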