Oracle REGEXP_REPLACE and retain part of it
I have some text in a column, something like
Hello World %UC#abc#UC%. How are you %UC#def#UC%. Have a nice day %UC#ghi#UC%.
I want to use REGEXP_REPLACE (or any other function) to replace %UC#<value>#UC% with UNISTR(<value>). For the example above, the result should be
Hello World (UNISTR of abc). How are you (UNISTR of def). Have a nice day (UNISTR of ghi).
Basically, it should strip the %UC# wrapper and replace the value inside it with the UNISTR of that value.
Is there any way to achieve this?
This could be one way in 11g and later:
with test(s) as (
  select 'Hello World %UC#abc#UC%. How are you %UC#def#UC%. Have a nice day %UC#ghi#UC%.' || '%UC##UC%'
  from dual
)
select listagg(str) within group (order by lev)
from (
  select regexp_substr(s, '(^|#UC%)(.*?)(%UC#)', 1, level, '', 2) ||
         UPPER(regexp_substr(s, '(%UC#)(.*?)(#UC%)', 1, level, '', 2)) as str,
         level as lev
  from test
  connect by instr(s, '%UC#', 1, level) > 0
)
This gives (I used UPPER instead of UNISTR to make the result clear):
Hello World ABC. How are you DEF. Have a nice day GHI.
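The idea here is to use a common string-splitting technique, treating the parts wrapped in '%UC#...#UC%' as separators. Note that I appended a short string ('%UC##UC%') to the input so that the last part of the input is handled: it makes the query think the string ends with an (empty) '%UC#...#UC%' sequence.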
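To see the split at work, the inner query can be run on its own. This is only a sketch under the same assumptions as the query above (11g or later, the sample string with '%UC##UC%' appended); the aliases lev, plain_part and wrapped_value are names picked here for illustration.

```sql
with test(s) as (
  select 'Hello World %UC#abc#UC%. How are you %UC#def#UC%. Have a nice day %UC#ghi#UC%.' || '%UC##UC%'
  from dual
)
select level as lev,
       -- text between the previous '#UC%' (or the start of the string) and the next '%UC#'
       regexp_substr(s, '(^|#UC%)(.*?)(%UC#)', 1, level, '', 2) as plain_part,
       -- value wrapped by the level-th '%UC#' ... '#UC%' pair
       regexp_substr(s, '(%UC#)(.*?)(#UC%)', 1, level, '', 2) as wrapped_value
from test
connect by instr(s, '%UC#', 1, level) > 0
order by lev
```

This should return one row per '%UC#' occurrence. The last row comes from the appended marker, so its wrapped_value is empty and its plain_part carries only the trailing text of the original string.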
In Oracle 10g, we cannot use listagg and regexp_substr the way I did above, so the solution is a bit more complicated.
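Here I use no regular expressions at all and compute the aggregation with SYS_CONNECT_BY_PATH; for this, I need to pick a string that will never appear in your input text, say '@@':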
with test as (
  select 'Hello World %UC#abc#UC%. How are you %UC#def#UC%. Have a nice day %UC#ghi#UC%.' || '%UC##UC%' as s
  from dual
)
select replace(
         sys_connect_by_path(
           -- plain text between the previous '#UC%' and the current '%UC#'
           substr(s,
                  case when level = 1 then 1 else instr(s, '#UC%', 1, level - 1) + 4 end,
                  instr(s, '%UC#', 1, level)
                    - case when level = 1 then 1 else instr(s, '#UC%', 1, level - 1) + 4 end) ||
           -- value wrapped by the current '%UC#' ... '#UC%' pair
           UPPER(substr(s,
                        instr(s, '%UC#', 1, level) + 4,
                        instr(s, '#UC%', 1, level) - (instr(s, '%UC#', 1, level) + 4))),
           '@@'),
         '@@') str
from test
where connect_by_isleaf = 1
connect by instr(s, '%UC#', 1, level) > 0
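The substr offsets above are easier to follow once the marker positions are visible per level. The following is a minimal sketch, assuming the same sample string; the aliases lev, open_pos and close_pos are hypothetical names used only here.

```sql
with test as (
  select 'Hello World %UC#abc#UC%. How are you %UC#def#UC%. Have a nice day %UC#ghi#UC%.' || '%UC##UC%' as s
  from dual
)
select level as lev,
       instr(s, '%UC#', 1, level) as open_pos,  -- position where the level-th '%UC#' starts
       instr(s, '#UC%', 1, level) as close_pos  -- position where the level-th '#UC%' starts
from test
connect by instr(s, '%UC#', 1, level) > 0
order by lev
```

Both markers are four characters long, which is where the + 4 in the substr arguments comes from: the wrapped value starts at open_pos + 4, and the next plain-text piece starts at close_pos + 4.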