How to extract distinct values from a bag of tuples?

I have the following data structure in Pig after a describe:

    --------------------------------------------------------------------------------------------------------------------------------------------------------
    | summed_hours_and_miles_by_driver     | group:int     | :bag{:tuple(driver_name:chararray)}             | total_hours:long     | total_miles:long     |
    --------------------------------------------------------------------------------------------------------------------------------------------------------
    |                                      | 27            | {(Mark Lochbihler), ..., (Mark Lochbihler)}     | 220                  | 11006                |
    --------------------------------------------------------------------------------------------------------------------------------------------------------

The problem is that the driver name (Mark Lochbihler) is repeated many times in the bag of tuples. How can I reduce it to a single name, like DISTINCT in SQL?

Use Distinct. Assuming A is your relation and grp is the result of grouping it, it looks like this:

    summed_hours_and_miles_by_driver = FOREACH grp GENERATE
        group,
        org.apache.pig.builtin.Distinct(A.driver_name),
        total_hours,
        total_miles;
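
An alternative that may read more naturally is a nested FOREACH with the DISTINCT operator. The sketch below is only an illustration under assumptions: the input file name `drivers.csv`, the column names `driver_id`, `hours_logged` and `miles_logged`, and the use of SUM to produce the totals are all hypothetical, since the question does not show how the relation was loaded or grouped.

    -- Hypothetical input: one row per trip (file and column names are assumptions)
    A = LOAD 'drivers.csv' USING PigStorage(',')
        AS (driver_id:int, driver_name:chararray, hours_logged:long, miles_logged:long);

    -- One group per driver; each group carries a bag named A with that driver's rows
    grp = GROUP A BY driver_id;

    summed_hours_and_miles_by_driver = FOREACH grp {
        -- Project the name column out of the grouped bag, then de-duplicate it
        names = A.driver_name;
        unique_names = DISTINCT names;
        GENERATE
            group,
            unique_names,
            SUM(A.hours_logged) AS total_hours,
            SUM(A.miles_logged) AS total_miles;
    };

    DUMP summed_hours_and_miles_by_driver;

If every group is known to contain exactly one distinct name, applying FLATTEN to `unique_names` in the GENERATE clause should turn the one-element bag into a plain chararray field.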