Concatenate string rows up to a maximum length in Oracle SQL

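What I want to achieve is to concatenate strings, separated by a carriage return, up to a total length of 10. If adding a row would push the length over 10, that row should go into the next concatenated row instead.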

For example, with the following dataset:

SELECT '0123' col FROM DUAL
UNION ALL
SELECT '45 67' FROM DUAL
UNION ALL
SELECT '89A' FROM DUAL
UNION ALL
SELECT 'BC' FROM DUAL
UNION ALL
SELECT 'DEFGHI' FROM DUAL

This is the result I expect:

SELECT '0123
45 67' col FROM DUAL
UNION ALL
SELECT '89A
BC' FROM DUAL
UNION ALL
SELECT 'DEFGHI' FROM DUAL

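I am running Oracle 12.1 and, for performance reasons, I don't want to do this in PL/SQL. The real limit is much higher; I posted a small example to keep things simple. My end goal is to use LISTAGG in some way so that each resulting row has at most 4k characters.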

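If you want to group adjacent rows together, you need a column that defines the order of the rows. Let me assume that you have such a column, called id.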

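Then, you can use a recursive query. The idea is to walk the dataset row by row, concatenating values until the length would exceed 10, at which point a new group must start. The outer query returns the last (complete) row of each group: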

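with 
    -- number the rows in id order so that consecutive rows can be joined on rn
    data (id, col, rn) as (
        select t.id, t.col, row_number() over(order by t.id) rn 
        from mytable t
    ),
    cte (id, rn, newcol, grp) as (
        -- anchor: the first row starts group 1
        select id, rn, col, 1 from data d where rn = 1
        union all
        select d.id, d.rn, 
            -- "< 10" leaves room for the one-character separator
            case when length(c.newcol) + length(d.col) < 10
                then c.newcol || chr(13) || d.col
                else d.col
            end,
            -- keep the current group, or start a new one identified by d.rn
            case when length(c.newcol) + length(d.col) < 10
                then c.grp
                else d.rn
            end
        from cte c
        inner join data d on d.rn = c.rn + 1
    )
-- within a group every newcol extends the previous one, so max() picks the complete string
select max(newcol) as newcol 
from cte 
group by grp order by min(id)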
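
The row-by-row walk described above can also be written as a small reference model outside the database, which may help when checking the SQL output against other data. This is only a sketch in Python, not part of the answer's query: the function name pack_rows and its parameters are made up for illustration, and it assumes a one-character separator that counts toward the limit, as the `< 10` test in the query implies.

```python
def pack_rows(rows, limit=10, sep="\r"):
    """Greedily pack consecutive strings into groups whose joined length stays within limit."""
    groups = []
    current = None  # the group being built, or None before the first row
    for value in rows:
        # The separator counts toward the limit, like chr(13) in the SQL query.
        if current is not None and len(current) + len(sep) + len(value) <= limit:
            current = current + sep + value
        else:
            if current is not None:
                groups.append(current)  # close the finished group
            current = value             # start a new group with this row
    if current is not None:
        groups.append(current)
    return groups


sample = ["0123", "45 67", "89A", "BC", "DEFGHI"]
print(pack_rows(sample))
```

For the sample data this yields three groups: '0123' joined with '45 67', '89A' joined with 'BC', and 'DEFGHI' on its own. Like the query, the sketch puts a row that is longer than the limit into a group by itself rather than splitting it.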


You can use MATCH_RECOGNIZE to group the rows and then LISTAGG to concatenate them:

SELECT LISTAGG( col, CHR(10) ) WITHIN GROUP ( ORDER BY rn ) AS col
FROM   ( SELECT ROWNUM AS rn, col FROM table_name )
MATCH_RECOGNIZE(
  ORDER     BY rn
  MEASURES
    MATCH_NUMBER() AS mno
  ALL ROWS PER MATCH
  PATTERN ( short_strings* last_string )
  DEFINE short_strings AS NEXT(LENGTH(col)) <= 10 - SUM(LENGTH(col) + 1)
)
GROUP BY mno;

Which, for the sample data:

CREATE TABLE table_name ( col ) AS
SELECT '0123'   FROM DUAL UNION ALL
SELECT '45 67'  FROM DUAL UNION ALL
SELECT '89A'    FROM DUAL UNION ALL
SELECT 'BC'     FROM DUAL UNION ALL
SELECT 'DEFGHI' FROM DUAL;

Outputs (three rows, each cell holding newline-separated values; the dashed lines separate the rows):

| COL    |
| :----- |
| 0123   |
| 45 67  |
| ------ |
| 89A    |
| BC     |
| ------ |
| DEFGHI |


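Here is a match_recognize solution, which requires Oracle 12.1 or higher. I make the following additional assumptions: the newline character is chr(10), as in Unix; no newline is needed at the end of the last line; and no input string is longer than the limit. (The limit of 10 can be replaced with a bind variable.) I also assume there is an ordering column, which I call ORD.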

with
  sample_data (ord, col) as (
    select 1, '0123'   from dual union all
    select 2, '45 67'  from dual union all
    select 3, '89A'    from dual union all
    select 4, 'BC'     from dual union all
    select 5, 'DEFGHI' from dual
  )
select rn, listagg(col, chr(10)) within group (order by ord) as fragment
from   sample_data
match_recognize (
  order by ord
  measures match_number() as rn
  all rows per match
  pattern (a+)
  define  a as sum(length(col)) + count(*) - 1 <= 10
)
group  by rn
order  by rn
;

   RN  FRAGMENT
-----  ------------
    1  0123
       45 67
    2  89A
       BC
    3  DEFGHI
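
The condition `sum(length(col)) + count(*) - 1 <= 10` in the DEFINE clause is the length of the fragment once its rows are joined: the characters of every row plus one separator between each pair of neighbours. A short Python sketch of that arithmetic on the sample data follows; the name joined_length is hypothetical and a one-character separator is assumed.

```python
def joined_length(parts):
    """Length of parts joined by a one-character separator: sum of lengths plus (count - 1)."""
    return sum(len(p) for p in parts) + len(parts) - 1


LIMIT = 10
first = ["0123", "45 67"]           # candidate fragment that fits
too_long = ["89A", "BC", "DEFGHI"]  # candidate fragment that does not fit

# The formula agrees with actually joining the strings with chr(10).
assert joined_length(first) == len("\n".join(first))

print(joined_length(first), joined_length(first) <= LIMIT)        # 10 True
print(joined_length(too_long), joined_length(too_long) <= LIMIT)  # 13 False
```

This is why 'DEFGHI' cannot join the second fragment: 3 + 2 + 6 characters plus two separators gives 13, which is over the limit, so the match ends after 'BC'.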