Postgres: splitting text into substrings using a regular expression

Question

I have strings in the following pattern, and I want to split the text into 4 fields.

NIFTY21JUN11100CE --> NIFTY, 21JUN, 11100, CE

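In the string above, only two parts have a fixed format. The first is 21JUN, which represents the year and month and is always 5 characters long. The part before it is the name, which can be any number of characters. I think the regex for the year-month part would look something like (([1-2][0-9]))(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)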

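The second is the last 2 characters, which are always either PE or CE. The value between 21JUN and CE|PE is the strike price; it is always numeric but can have any number of digits.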

I want to split these into 4 fields and am struggling to come up with the regex. Is anyone familiar with a Postgres command for this requirement?

Answer 1

You can use SELECT regexp_match('NIFTY21JUN11100CE','^(\D+)(\d{2}[A-Z]{3})(\d+)(PE|CE)$');

Step by step:

^          Beginning of the string
(          start capture
\D+        one or more non-digit chars
)          end capture
(          start capture
\d{2}      exactly 2 digits
[A-Z]{3}   exactly 3 chars in the range from A to Z
)          end capture
(          start capture
\d+        one or more digit chars
)          end capture
(          start capture
PE|CE      one of 'PE' or 'CE'
)          end capture
$          end of the string
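
Since regexp_match returns a text array, the four captures can be pulled out as separate columns by subscripting it. The following is a minimal sketch, not a definitive implementation: the column names (name, expiry, strike, option_type) and the inline VALUES row are made up for illustration, and the cast to integer assumes the strike price always fits in an integer.

```sql
-- Split each symbol into 4 columns by indexing the array from regexp_match.
-- Column names and the sample row are illustrative assumptions.
SELECT m[1]          AS name,         -- leading non-digit characters
       m[2]          AS expiry,       -- 2 digits + 3 uppercase letters
       m[3]::integer AS strike,       -- assumes the value fits in an integer
       m[4]          AS option_type   -- 'PE' or 'CE'
FROM (
    SELECT regexp_match(symbol, '^(\D+)(\d{2}[A-Z]{3})(\d+)(PE|CE)$') AS m
    FROM (VALUES ('NIFTY21JUN11100CE')) AS t(symbol)
) AS s;
```

For the sample input this yields NIFTY, 21JUN, 11100 and CE. A symbol that does not match the pattern makes regexp_match return NULL, so all four columns come back NULL for that row.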

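The year-month regex from your question, which uses the character classes [1-2][0-9] and the alternation (JAN|FEB|...), is somewhat stricter and would also work.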
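
As a sketch of that stricter variant, the year-month group can be swapped into the same pattern. The non-capturing group (?:...) around the month names is my addition, used so that the result still contains exactly 4 elements; it assumes the month is always one of the twelve uppercase abbreviations listed in the question.

```sql
-- Stricter variant: the year must start with 1 or 2, and the month must be
-- one of the twelve listed abbreviations. (?:...) groups without capturing.
SELECT regexp_match(
    'NIFTY21JUN11100CE',
    '^(\D+)([1-2][0-9](?:JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC))(\d+)(PE|CE)$'
);
```

With this pattern a string such as NIFTY21XYZ11100CE no longer matches and regexp_match returns NULL, whereas the looser [A-Z]{3} version would accept it.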
