Find when a specific char is 2nd to last in a string in Pig

I have the following data:

address|some_mask_value
123 Main | 10100011110
124 Main | 10100011100

I am using Apache Pig version 0.15.0.2.4.2.0-258.

I am trying to create an indicator for rows where the second-to-last character of 'some_mask_value' is 1. I tried:

load_data = LOAD '/myfile.txt' USING PigStorage('|') AS (address:String, some_mask_value:String);

grunt> case_test = FOREACH load_data GENERATE (CASE trial
>> WHEN LAST_INDEX_OF(name, '1') 2 THEN yes
>> ELSE no);

2017-04-20 16:59:50,522 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <line 5, column 30>  mismatched input '2' expecting THEN

Basically, if the second-to-last character is 1, I will filter that row out later.

a = load 'data.txt' using PigStorage('|') 
       as (address: chararray, some_mask_value:chararray);

If the mask field has a fixed length, as in your sample data, then:

b = foreach a generate $0 .. , (
        CASE SUBSTRING(some_mask_value, 9, 10)
            WHEN '1' THEN 'YES'
            ELSE 'NO'
        END
    ) as indicator;

dump b;
(123 Main,10100011110,YES)
(124 Main,10100011100,NO)

If the mask is not fixed-length:

b = foreach a generate $0 .. , (
        CASE SUBSTRING(some_mask_value, (int)SIZE(some_mask_value) - 2, (int)SIZE(some_mask_value) - 1)
            WHEN '1' THEN 'YES'
            ELSE 'NO'
        END
    ) as indicator;
dump b;
(123 Main,10100011110,YES)
(124 Main,10100011100,NO)

This assumes the mask field has no leading or trailing whitespace.
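
The sample rows have spaces around the `|` delimiter, so the loaded fields would likely carry that whitespace. A minimal sketch, assuming the relation `a` loaded above and that trimming is acceptable for your data, applies TRIM first and then filters on the indicator. The alias names `trimmed`, `flagged`, and `kept` are illustrative only, and masks shorter than two characters are not handled here.

```pig
-- Strip surrounding whitespace so SIZE counts only the mask characters.
trimmed = foreach a generate TRIM(address) as address,
                             TRIM(some_mask_value) as some_mask_value;

-- Flag rows whose second-to-last mask character is '1'.
flagged = foreach trimmed generate $0 .. , (
        CASE SUBSTRING(some_mask_value, (int)SIZE(some_mask_value) - 2, (int)SIZE(some_mask_value) - 1)
            WHEN '1' THEN 'YES'
            ELSE 'NO'
        END
    ) as indicator;

-- Keep only the rows that are not flagged (assumption: flagged rows are the ones to drop).
kept = filter flagged by indicator == 'NO';
```

With the two sample rows, only the `124 Main` row should remain in `kept`, since its second-to-last mask character is 0.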