Regular expression: replacing a substring
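A column in my table holds the value 'ABCURLCBSURLDMSURLWER'. The text URL repeats within that value. I want to retrieve the text between two consecutive URLs, as shown below.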
Column
------
CBS
DMS
I wrote the following query, but it does not return the result I want.
SELECT
    REGEXP_SUBSTR((SELECT REPLACE(REPLACE('ABCURLCBSURLDMSURLWER','URL',','),'ABC',',') AS AB FROM DUAL),
                  '[^,]+',1,LEVEL) AS AB
FROM
    DUAL
CONNECT BY
    REGEXP_SUBSTR((SELECT REPLACE(REPLACE('ABCURLCBSURLDMSURLWER','URL',','),'ABC',',') AS AB FROM DUAL),
                  '[^,]+',1,LEVEL)
    IS NOT NULL;
AB
---
CBS
DMS
WER
How can I fix this?
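Try a query like this. It reuses your approach of replacing the URL text with a comma, which makes the string easier to split with a regular expression.
Updated query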
WITH some_data (ab) AS (SELECT 'ABCURLCBSURLDMSURLWER' FROM DUAL)
SELECT REGEXP_SUBSTR (REPLACE (sd.ab, 'URL', ','),
                      '[^,]+',
                      1,
                      lines.COLUMN_VALUE) AS ab
  FROM some_data sd,
       TABLE (CAST (MULTISET (    SELECT LEVEL AS level_num
                                    FROM DUAL
                              CONNECT BY INSTR (sd.ab,
                                                'URL',
                                                1,
                                                LEVEL) > 0) AS SYS.odciNumberList)) lines
 WHERE lines.COLUMN_VALUE > 1;
Output
AB
______
CBS
DMS
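To see why this returns only the middle pieces, it can help to look at the intermediate values. The sketch below is an illustration rather than part of the original answer; it inlines the same sample literal and assumes an Oracle version that provides REGEXP_COUNT (11g or later). The column aliases are made up for the example.

```sql
-- Illustrative only: intermediate values behind the query above.
SELECT REPLACE('ABCURLCBSURLDMSURLWER', 'URL', ',') AS replaced,   -- ABC,CBS,DMS,WER
       REGEXP_COUNT('ABCURLCBSURLDMSURLWER', 'URL')  AS url_count  -- 3
FROM   DUAL;
```

The row generator stops after LEVEL 3 because there is no fourth URL, so token 4 (WER) is never requested, and the WHERE clause discards token 1 (ABC).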
You do not need regular expressions; simple string functions are enough:
WITH bounds ( id, value, start_pos, end_pos ) AS (
  SELECT id,
         value,
         INSTR( value, 'URL', 1, 1 ) + 3,
         INSTR( value, 'URL', 1, 2 )
  FROM   table_name
UNION ALL
  SELECT id,
         value,
         end_pos + 3,
         INSTR( value, 'URL', end_pos + 3, 1 )
  FROM   bounds
  WHERE  end_pos > 0
)
SELECT id,
       start_pos,
       SUBSTR( value, start_pos, end_pos - start_pos ) AS url
FROM   bounds
WHERE  end_pos > 0
ORDER BY id, start_pos;
So, for the sample data:
CREATE TABLE table_name ( id, value ) AS
SELECT 1, 'ABCURLCBSURLDMSURLWER' FROM DUAL UNION ALL
SELECT 2, 'ABCURLURLDEFURLGHIURL' FROM DUAL;
This outputs:
ID | START_POS | URL
-: | --------: | :---
1 | 7 | CBS
1 | 13 | DMS
2 | 7 | null
2 | 10 | DEF
2 | 16 | GHI
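The first row of that output can be checked by hand. As a small sketch with the sample literal inlined (the column aliases are only for illustration):

```sql
-- Illustrative check of the anchor row for id = 1.
SELECT INSTR('ABCURLCBSURLDMSURLWER', 'URL', 1, 1) + 3 AS start_pos,  -- 7: just past the first URL
       INSTR('ABCURLCBSURLDMSURLWER', 'URL', 1, 2)     AS end_pos,    -- 10: where the second URL starts
       SUBSTR('ABCURLCBSURLDMSURLWER', 7, 10 - 7)      AS url         -- CBS
FROM   DUAL;
```

Each recursive step then restarts the search at end_pos + 3 and stops producing rows once INSTR returns 0.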
Option 2
If you do want to use regular expressions, you can use:
SELECT t.id,
       x.COLUMN_VALUE AS url
FROM   table_name t
       CROSS APPLY TABLE(
         CAST(
           MULTISET(
             SELECT REGEXP_SUBSTR(
                      t.value,
                      '(.*?)URL',
                      INSTR( t.value, 'URL' ) + 3,
                      LEVEL,
                      NULL,
                      1
                    )
             FROM   DUAL
             CONNECT BY
                    LEVEL <= REGEXP_COUNT(
                               t.value, '(.*?)URL', INSTR( t.value, 'URL' ) + 3
                             )
           )
           AS SYS.ODCIVARCHAR2LIST
         )
       ) x;
Which, for the same test data, outputs:
ID | URL
-: | :---
1 | CBS
1 | DMS
2 | null
2 | DEF
2 | GHI
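One difference worth noting between this pattern and a split on '[^,]+' is how empty segments are handled, such as the adjacent URLURL in the second test row. A brief sketch, assuming the same test value (the aliases are hypothetical):

```sql
-- Illustrative only: how the two patterns treat the empty segment in 'ABCURLURLDEF...'.
SELECT REGEXP_SUBSTR(REPLACE('ABCURLURLDEFURLGHIURL', 'URL', ','),
                     '[^,]+', 1, 2)               AS token_split,   -- DEF (the empty piece is skipped)
       REGEXP_SUBSTR('ABCURLURLDEFURLGHIURL',
                     '(.*?)URL', 7, 1, NULL, 1)   AS lazy_capture   -- NULL (the empty piece is kept)
FROM   DUAL;
```

If empty segments should appear as NULL rows, a query that first replaces URL with a delimiter and then matches '[^,]+' would need a different pattern.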
Here is another option; see the comments in the code:
with test (col) as
  -- sample data
  (select 'ABCURLCBSURLDMSURLWER' from dual),
rpl as
  -- replace URL with a semicolon (a single, simple delimiter)
  (select replace(col, 'URL', ';') col
   from test
  ),
rmv as
  -- remove everything before the first delimiter and everything after the last delimiter
  (select substr(col, instr(col, ';') + 1,
                 instr(col, ';', -1, 1) - instr(col, ';') - 1) val
   from rpl
  )
select regexp_substr(val, '[^;]+', 1, level) result
from rmv
connect by level <= regexp_count(val, ';') + 1;

RESULT
--------------------
CBS
DMS
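The trimming step in rmv can be checked on its own. A minimal sketch with the delimiter-replaced value inlined (the aliases first_delim and last_delim are made up for the example):

```sql
-- Illustrative only: what the rmv step computes for 'ABC;CBS;DMS;WER'.
SELECT INSTR('ABC;CBS;DMS;WER', ';')                 AS first_delim,  -- 4
       INSTR('ABC;CBS;DMS;WER', ';', -1, 1)          AS last_delim,   -- 12 (searching backwards from the end)
       SUBSTR('ABC;CBS;DMS;WER', 4 + 1, 12 - 4 - 1)  AS val           -- CBS;DMS
FROM   DUAL;
```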