Concatenate string rows up to a maximum length in Oracle SQL
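What I am trying to achieve is to concatenate strings, separated by a carriage return, into rows of at most 10 characters. If adding a row would push the length over 10, that row should go into the next concatenated row instead.
Example, with the following data set: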
SELECT '0123' col FROM DUAL
UNION ALL
SELECT '45 67' FROM DUAL
UNION ALL
SELECT '89A' FROM DUAL
UNION ALL
SELECT 'BC' FROM DUAL
UNION ALL
SELECT 'DEFGHI' FROM DUAL
The result I expect:
SELECT '0123
45 67' col FROM DUAL
UNION ALL
SELECT '89A
BC' FROM DUAL
UNION ALL
SELECT 'DEFGHI' FROM DUAL
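I am running Oracle 12.1 and, for performance reasons, I do not want to do this in PL/SQL. I am dealing with much larger numbers.
I posted a simple example to make things easier. My ultimate goal is to use LISTAGG somehow, with at most 4k characters per row.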
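If you want to group adjacent rows together, you need a column that defines the order of the rows. Let me assume that you have such a column, called id.
Then you can use a recursive query. The idea is to walk through the dataset row by row, concatenating values until the length would exceed 10, at which point a new group must start. The outer query returns the latest row of each group: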
with
    data (id, col, rn) as (
        select t.*, row_number() over(order by id) rn
        from mytable t
    ),
    cte (id, rn, newcol, grp) as (
        select id, rn, col, 1 from data d where rn = 1
        union all
        select d.id, d.rn,
            case when length(c.newcol) + length(d.col) < 10
                then c.newcol || chr(13) || d.col
                else d.col
            end,
            case when length(c.newcol) + length(d.col) < 10
                then c.grp
                else d.rn
            end
        from cte c
        inner join data d on d.rn = c.rn + 1
    )
select max(newcol) as newcol
from cte
group by grp order by min(id)
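The row-by-row walk described above can be modelled outside the database to check the grouping rule against the sample data. This is only a minimal Python sketch, not part of the answer: the function name pack_rows and its parameters are made up for illustration, and "\r" stands in for chr(13). The test len(current) + len(sep) + len(value) <= limit is the same condition as length(c.newcol) + length(d.col) < 10 in the query, because the lengths are integers and the separator is one character.

```python
def pack_rows(rows, limit=10, sep="\r"):
    """Greedily join consecutive strings so that each result is at most `limit` long.

    Hypothetical helper for illustration; `rows` must already be in the desired order.
    """
    groups = []      # finished concatenated rows
    current = None   # the row being built, or None before the first input
    for value in rows:
        if current is not None and len(current) + len(sep) + len(value) <= limit:
            # The value still fits, separator included: extend the current row.
            current = current + sep + value
        else:
            # It does not fit (or this is the first value): start a new row.
            if current is not None:
                groups.append(current)
            current = value
    if current is not None:
        groups.append(current)
    return groups


sample = ["0123", "45 67", "89A", "BC", "DEFGHI"]
print(pack_rows(sample))  # ['0123\r45 67', '89A\rBC', 'DEFGHI']
```

Like the query, the sketch never splits a single value, so a value longer than the limit ends up alone in an over-long row.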
You can use MATCH_RECOGNIZE to group the rows and then LISTAGG to concatenate them:
SELECT LISTAGG( col, CHR(10) ) WITHIN GROUP ( ORDER BY rn ) AS col
FROM   ( SELECT ROWNUM AS rn, col FROM table_name )
MATCH_RECOGNIZE(
  ORDER BY rn
  MEASURES
    MATCH_NUMBER() AS mno
  ALL ROWS PER MATCH
  PATTERN ( short_strings* last_string )
  DEFINE short_strings AS NEXT(LENGTH(col)) <= 10 - SUM(LENGTH(col) + 1)
)
GROUP BY mno;
Which, for the sample data:
CREATE TABLE table_name ( col ) AS
SELECT '0123' FROM DUAL UNION ALL
SELECT '45 67' FROM DUAL UNION ALL
SELECT '89A' FROM DUAL UNION ALL
SELECT 'BC' FROM DUAL UNION ALL
SELECT 'DEFGHI' FROM DUAL;
Outputs:
| COL |
| :----- |
| 0123 |
| 45 67 |
| ------ |
| 89A |
| BC |
| ------ |
| DEFGHI |
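Here is a match_recognize solution, which requires Oracle 12.1 or higher. I make the following additional assumptions: the line separator is chr(10), as in Unix; no newline is needed at the end of the last line; and every input string has a length at most equal to the limit. (The limit of 10 can be changed to a bind variable.) I also assume there is an ordering column, which I call ORD.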
with
  sample_data (ord, col) as (
    select 1, '0123'   from dual union all
    select 2, '45 67'  from dual union all
    select 3, '89A'    from dual union all
    select 4, 'BC'     from dual union all
    select 5, 'DEFGHI' from dual
  )
select rn, listagg(col, chr(10)) within group (order by ord) as fragment
from   sample_data
match_recognize (
  order by ord
  measures match_number() as rn
  all rows per match
  pattern (a+)
  define a as sum(length(col)) + count(*) - 1 <= 10
)
group by rn
order by rn
;
   RN FRAGMENT
----- ------------
    1 0123
      45 67
    2 89A
      BC
    3 DEFGHI
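The condition sum(length(col)) + count(*) - 1 <= 10 counts one separator for every gap between the rows of a fragment. A small Python sketch of that arithmetic, checked against the three fragments shown above, might look like the following. The helper names fragment_length and too_long are hypothetical, and "\n" stands in for chr(10).

```python
def fragment_length(values, sep="\n"):
    """Length of `values` joined by `sep`: sum of lengths plus one separator per gap."""
    gaps = max(len(values) - 1, 0)
    return sum(len(v) for v in values) + len(sep) * gaps


def too_long(values, limit):
    """Return the inputs that break the assumption that no string exceeds the limit."""
    return [v for v in values if len(v) > limit]


fragments = [["0123", "45 67"], ["89A", "BC"], ["DEFGHI"]]
for frag in fragments:
    # The formula agrees with actually joining the strings.
    assert fragment_length(frag) == len("\n".join(frag))

print([fragment_length(f) for f in fragments])  # [10, 6, 6]
print(too_long(["0123", "ABCDEFGHIJK"], 10))    # ['ABCDEFGHIJK']
```

A check like too_long is one way to confirm, before running the query with a larger limit such as 4000, that the stated assumption about input lengths actually holds for the data.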