Pig Flatten without nulls
I have Pig tuples whose last field is a bag that contains empty tuples:
(1139-50052,Aquatic,Consumer,6,makarina,2,{(),(Unknown)})
(1139-50052,Aquatic,Consumer,6,jabong,2,{(),(),(),(Unknown)})
I need to flatten the bag without keeping the nulls, so that the result looks like this:
(1139-50052,Aquatic,Consumer,6,makarina,2,Unknown)
(1139-50052,Aquatic,Consumer,6,jabong,2,Unknown)
Please advise.
One option is to pass the bag to BagToString(), which discards the empty values, and then split the resulting string on its delimiter '_':
FLATTEN(STRSPLIT(BagToString(BagName),'_+'))
Besides your input, this also works for other combinations, as the example below shows.
Input:
1139-50052 Aquatic Consumer 6 makarina 2 {(),(Unknown)}
1139-50052 Aquatic Consumer 6 jabong 2 {(),(),(),(Unknown)}
1139-50052 Aquatic Consumer 6 test1 2 {(unknown1),(),(),(Unknown2)}
1139-50052 Aquatic Consumer 6 test2 2 {(unknown1),(unknown2),(),(Unknown3)}
PigScript:
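-- Load the tab-separated input; the last field is a bag of single-field tuples.
A = LOAD 'input' USING PigStorage() AS (f0,f1,f2,f3,f4,f5,B:{T:(f7)});
-- Join the bag values with '_', split them back into a tuple, then flatten that tuple.
B = FOREACH A GENERATE f0,f1,f2,f3,f4,f5,FLATTEN(STRSPLIT(BagToString(B),'_+'));
DUMP B;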
Output:
(1139-50052,Aquatic,Consumer,6,makarina,2,Unknown)
(1139-50052,Aquatic,Consumer,6,jabong,2,Unknown)
(1139-50052,Aquatic,Consumer,6,test1,2,unknown1,Unknown2)
(1139-50052,Aquatic,Consumer,6,test2,2,unknown1,unknown2,Unknown3)
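To see why the empty tuples disappear, the FLATTEN(STRSPLIT(BagToString(...),'_+')) expression can be modeled outside Pig. The following is only a rough Python sketch of that behaviour, not Pig code: the helper names bag_to_string, strsplit and flatten_row are hypothetical, and it assumes that an empty tuple contributes no field to the joined string.

```python
import re

def bag_to_string(bag, delimiter="_"):
    """Model of BagToString: join every field of every tuple in the bag.

    Assumption: an empty tuple has no fields, so it adds nothing to the string.
    """
    fields = [str(field) for tup in bag for field in tup]
    return delimiter.join(fields)

def strsplit(text, regex):
    """Model of STRSPLIT: split on a regular expression, return one tuple."""
    return tuple(re.split(regex, text))

def flatten_row(prefix, bag):
    """Model of FLATTEN on a tuple: append its fields to the leading columns."""
    return prefix + strsplit(bag_to_string(bag), "_+")

rows = [
    (("1139-50052", "Aquatic", "Consumer", "6", "makarina", "2"),
     [(), ("Unknown",)]),
    (("1139-50052", "Aquatic", "Consumer", "6", "test1", "2"),
     [("unknown1",), (), (), ("Unknown2",)]),
]

for prefix, bag in rows:
    print(flatten_row(prefix, bag))
```

One consequence of this approach is worth keeping in mind: because the values travel through a '_'-delimited string, a bag value that itself contains an underscore would be split into several fields.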