How to use the REGEXP_SUBSTR function to extract a substring from a string?
I need to extract words and sequences of words from an error log.
Sample log lines:
2019.06.08 14:32:36 ERR 10298587 2019-06-07 PROJECT_NAME script.sql 4483 2646 HY000 [NCR] [Teradata DBMS] : No more spool space in aload50.
2019.06.08 14:32:36 ERR 10298587 2019-06-07 PROJECT_NAME script.sql 4483 2646 HY000 [NCR] [Teradata DBMS] : No more spool space in aload50. (ef)
2019.06.08 14:32:36 ERR 10298587 2019-06-07 PROJECT_NAME script.sql 4483 2646 HY000 [NCR] [Teradata DBMS] : No more spool space in dload50.
2019.06.08 14:32:36 ERR 10298587 2019-06-07 PROJECT_NAME script.sql 4483 2646 HY000 [NCR] [Teradata DBMS] : No more spool space in dload50. (ef)
message=[NCR] [Teradata DBMS] : No more spool space in aload50. (ef)
message=[NCR] [Teradata DBMS] : No more spool space in dload50. (ef)
message=[NCR] [Teradata DBMS] : No more spool space in aload50. (ee)
message=[NCR] [Teradata DBMS] : No more spool space in dload50. (ee)
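I need to extract these substrings:
error_log:
[Teradata DBMS] : No more spool space in aload50.
without the trailing marker such as (ef),
and the user name, for example:
aload50
The user name can be:
aload01 to aload999
and
dload01 to dload999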
select
REGEXP_SUBSTR('2019.06.08 14:32:36 ERR 10298587 2019-06-07 PROJECT_NAME script.sql 4483 2646 HY000 [NCR] error_message[Teradata DBMS] : No more spool space in aload50.',' regexp_for_error_log') AS error_log,
REGEXP_SUBSTR('2019.06.08 14:32:36 ERR 10298587 2019-06-07 PROJECT_NAME script.sql 4483 2646 HY000 [NCR] [Teradata DBMS] : No more spool space in aload50.',' regexp_for_user_name') AS user_name
FROM DUAL;
We can try using REGEXP_REPLACE here with capture groups:
SELECT
-- The pattern matches the whole line; \1 keeps only the text captured by the parentheses
REGEXP_REPLACE(log, '.*(\[Teradata DBMS\] : .* [^.]+)\..*', '\1') AS error_log,
REGEXP_REPLACE(log, '.*\[Teradata DBMS\] : .* ([^.]+)\..*', '\1') AS user_name
FROM yourTable;
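Since the question asks about REGEXP_SUBSTR specifically, a more direct version might look like the sketch below. It assumes Oracle-style regular expressions (the question selects FROM DUAL), that the message ends at the first period after "[Teradata DBMS]", and that user names are always aload or dload followed by two or three digits. log and yourTable are the same placeholder names used in the query above.

```sql
SELECT
    -- From "[Teradata DBMS]" up to and including the first period that follows it
    REGEXP_SUBSTR(log, '\[Teradata DBMS\] : [^.]+\.') AS error_log,
    -- "aload" or "dload" followed by two or three digits (aload01 to aload999)
    REGEXP_SUBSTR(log, '[ad]load[0-9]{2,3}') AS user_name
FROM yourTable;
```

One difference worth knowing: when a line does not match, REGEXP_SUBSTR returns NULL, whereas REGEXP_REPLACE returns the input line unchanged, so the REGEXP_SUBSTR form makes non-matching rows easier to spot.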