Thorn character delimiter is not recognized in Hive

As described in the post "Using the Icelandic Thorn character as a delimiter in Hive", the thorn character delimiter is not recognized in Hive.

Example table:

CREATE EXTERNAL TABLE IF NOT EXISTS zzzzz_raw (
  spot_id INT,
  activity_type_id INT,
  activity_type STRING,
  activity_id INT,
  activity_sub_type STRING,
  report_name STRING,
  tag_method_id INT
)
PARTITIONED BY ( dt DATE )
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\-2'
  LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/raw/data/networkmatchtablesactivity/activity_cat';

Output:

select * from activity_cat_raw limit 1;

4552126þ805759þeaasv101þ2275868þbfeaac01þBF_EA Access_Info Pageþ2       NULL    NULL    NULL    NULL    NULL    NULL    2015-03-24

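Am I missing something?

I found the answer. I used '\-61' as the delimiter instead of '\-2' (the thorn delimiter), and then used substring to remove the extra symbol left at the start of each field, as shown below.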

CREATE EXTERNAL TABLE IF NOT EXISTS SSSSSS (
  spot_id STRING,
  activity_type_id STRING,
  activity_type STRING,
  activity_id STRING,
  activity_sub_type STRING,
  report_name STRING,
  tag_method_id STRING
)
PARTITIONED BY ( dt STRING )
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\-61'
  LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION 'SSSSSS';
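
A likely explanation for why '\-61' matches where '\-2' does not, assuming the data file is UTF-8 encoded (the question does not say so): Hive appears to treat the number as a single signed byte, and þ (U+00FE) is the single byte 0xFE (-2) only in Latin-1. In UTF-8 it takes two bytes. The Python sketch below only does that byte arithmetic; `signed` is a helper name made up for illustration.

```python
# Assumption: the data file is UTF-8 encoded; the original post does not state this.
THORN = "\u00fe"  # the Icelandic thorn character, þ


def signed(b: int) -> int:
    """Reinterpret an unsigned byte (0..255) as a signed byte (-128..127)."""
    return b - 256 if b > 127 else b


# Byte values of þ in each encoding, shown as signed bytes.
latin1 = [signed(b) for b in THORN.encode("latin-1")]
utf8 = [signed(b) for b in THORN.encode("utf-8")]

print("latin-1:", latin1)  # [-2]
print("utf-8:  ", utf8)    # [-61, -66]
```

Under that assumption there is no lone -2 byte in the file at all, while -61 is the first byte of every thorn.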

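Then use substring to remove the extra symbol:

INSERT OVERWRITE TABLE vvvvvv PARTITION (dt)
SELECT
  spot_id,                       -- first field has no leading remnant
  substr(activity_type_id, 2),   -- drop the leftover symbol at position 1
  dt                             -- the remaining columns get the same substr treatment
FROM SSSSSS;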
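
To see why every column except the first needs the substring, here is a rough Python simulation of splitting one row on the single byte 0xC3 (-61). It assumes UTF-8 input, and the row is reconstructed from the output shown in the question; it imitates the effect of the delimiter and is not how Hive itself is implemented.

```python
# Assumption: UTF-8 data. The row is rebuilt from the sample output in the question.
row = "4552126þ805759þeaasv101þ2275868þbfeaac01þBF_EA Access_Info Pageþ2".encode("utf-8")

# Split on the single byte 0xC3 (-61 as a signed byte), as '\-61' would.
fields = row.split(b"\xc3")

# Every field after the first still begins with the orphaned second byte
# of the thorn, 0xBE (-66). Dropping it mirrors substr(col, 2) in the query.
cleaned = [fields[0].decode("utf-8")] + [f[1:].decode("utf-8") for f in fields[1:]]

print(fields[1])  # b'\xbe805759'
print(cleaned)
```

The first field comes out clean, which matches the query leaving spot_id untouched.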

Hope this helps.