Tag key & value using Teradata Regular Expression

I have a Teradata dataset whose values look like this:

'Project: Hercules IssueType: Improvement Components: core AffectsVersions: 2.4.1 Priority: Minor Time: 15:25:23 04/06/2020'

I want to extract a tag's value from the string above, based on its key.

For example:

with comm as 
(
select  'Project: Hercules IssueType: Improvement Components: core AffectsVersions: 2.4.1 Priority: Minor' as text
)
select regexp_substr(comm.text,'[^: ]+',1,4)
 from comm where regexp_substr(comm.text,'[^: ]+',1,3) = 'IssueType';

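Is there a way to write this query without changing the position argument for each tag? I also found the last field a bit tricky, because its value contains both a time and a date.

Any help is appreciated.

Thanks.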

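The NVP function gives access to name/value-pair data, but to split the string into multiple rows you need either strtok_split_to_table or regexp_split_to_table. The tricky part in your case is the delimiters; it would be easier if they were unique instead of ' ' and ':':
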
WITH comm AS 
 (
   SELECT 1 as keycol, -- should be a key column in your table, either numeric or varchar
      'Project: Hercules IssueType: Improvement Components: core AffectsVersions: 2.4.1 Priority: Minor Time: 15:25:23 04/06/2020' AS text
 )
SELECT id, tokennum, token, 
   -- get the key
   StrTok(token,':', 1) AS "Key",
   -- get the value (can't use StrTok because of ':' delimiter)
   Substring(token From Position(': ' IN token)+2) AS "Value"
FROM TABLE
 ( RegExp_Split_To_Table(comm.keycol
                         ,comm.text
                         ,'( )(?=[^ ]+: )' -- assuming names don't contain spaces: split at the last space before ': '
                         , 'c') 
RETURNS (id INT , tokennum INTEGER, token VARCHAR(1000) CHARACTER SET Latin)) AS dt
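
To look up a single tag by name, as in the question, the split query can be wrapped in a derived table and filtered on the key. This is a minimal, untested sketch: it reuses the comm CTE, the split pattern, and the functions from the query above. The column aliases tag_key and tag_value and the derived-table alias kv are illustrative names chosen here, not part of the original.

```sql
WITH comm AS
 (
   SELECT 1 AS keycol, -- stand-in for a real key column in your table
      'Project: Hercules IssueType: Improvement Components: core AffectsVersions: 2.4.1 Priority: Minor Time: 15:25:23 04/06/2020' AS text
 )
SELECT kv.id, kv.tag_value
FROM
 (
   SELECT id,
      -- everything before the first ':' is the key
      StrTok(token, ':', 1) AS tag_key,
      -- everything after the first ': ' is the value
      Substring(token From Position(': ' IN token) + 2) AS tag_value
   FROM TABLE
    ( RegExp_Split_To_Table(comm.keycol
                            ,comm.text
                            ,'( )(?=[^ ]+: )' -- same split pattern as above
                            ,'c')
      RETURNS (id INT, tokennum INTEGER, token VARCHAR(1000) CHARACTER SET Latin)) AS dt
 ) AS kv
WHERE kv.tag_key = 'IssueType'; -- change only this literal to pick another tag
```

Only the literal in the WHERE clause changes from tag to tag, so no position argument is needed. The Time value should also come through whole: the lookahead only splits at a space that is followed by a run of non-space characters ending in ': ', and '15:25:23' is followed by a space without a colon directly before it.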