Find when a specific char is 2nd to last in a string in Pig
I have the following data:
address|some_mask_value
123 Main | 10100011110
124 Main | 10100011100
I am using Apache Pig version 0.15.0.2.4.2.0-258.
I am trying to create an indicator for rows where the second-to-last character of 'some_mask_value' is 1. I tried:
load_data = LOAD '/myfile.txt' USING PigStorage('|') AS (address:String, some_mask_value:String);
grunt> case_test = FOREACH load_data GENERATE (CASE trial
>> WHEN LAST_INDEX_OF(name, '1') 2 THEN yes
>> ELSE no);
2017-04-20 16:59:50,522 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <line 5, column 30> mismatched input '2' expecting THEN
Basically, if the second-to-last character is 1, I will filter that row out later.
a = load 'data.txt' using PigStorage('|')
as (address: chararray, some_mask_value:chararray);
If the mask field is fixed length, as it is in your sample data, then:
b = foreach a generate $0 .. , (
CASE SUBSTRING(some_mask_value, 9, 10)
WHEN '1' THEN 'YES'
ELSE 'NO'
END
) as indicator;
dump b;
(123 Main,10100011110,YES)
(124 Main,10100011100,NO)
If the mask is not fixed length:
b = foreach a generate $0 .. , (
CASE SUBSTRING(some_mask_value, (int)SIZE(some_mask_value) - 2, (int)SIZE(some_mask_value) - 1)
WHEN '1' THEN 'YES'
ELSE 'NO'
END
) as indicator;
dump b;
(123 Main,10100011110,YES)
(124 Main,10100011100,NO)
This assumes the mask field has no leading or trailing whitespace.
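The sample data in the question does show spaces around the `|` delimiter, so if those spaces are really in the file, something like the following might help. It is only a sketch: it trims both fields with the built-in TRIM before applying the same SUBSTRING/SIZE logic, then keeps the rows whose indicator is 'NO', since the question wants the '1' rows filtered out. The relation names `trimmed` and `kept` and the alias `mask` are made up for illustration, and masks shorter than two characters are not handled.

```pig
a = LOAD 'data.txt' USING PigStorage('|')
    AS (address:chararray, some_mask_value:chararray);

-- Strip leading/trailing whitespace so the index arithmetic lines up.
trimmed = FOREACH a GENERATE TRIM(address) AS address,
                             TRIM(some_mask_value) AS mask;

-- SIZE returns a long, so cast to int before using it as a SUBSTRING index.
b = FOREACH trimmed GENERATE address, mask, (
        CASE SUBSTRING(mask, (int)SIZE(mask) - 2, (int)SIZE(mask) - 1)
            WHEN '1' THEN 'YES'
            ELSE 'NO'
        END
    ) AS indicator;

-- Drop the rows whose second-to-last character is 1.
kept = FILTER b BY indicator == 'NO';
DUMP kept;
```

If the indicator column is not needed afterwards, the CASE expression could be skipped and the SUBSTRING comparison used directly in the FILTER.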