Postgres Regex: extract string after the last occurrence of the pattern and the end of line

Question

Please help me extract the text between the last occurrence of Schedule : and the end of that line.

Lane Closures : Lane one will be closed
Reason : Roadworks are planned
Status : Pending
Schedule : Expect disruption everyday between 20:00 and 06:00 from 5 October 2020 to 9 October 2020
Schedule : Expect disruption everyday between 20:00 and 06:00 from 12 October 2020 to 16 October 2020
Schedule : Expect disruption everyday between 20:00 and 06:00 from 19 October 2020 to 23 October 2020
Schedule : Expect disruption everyday between 20:00 and 06:00 from 26 October 2020 to 31 October 2020
Lanes Closed : There will be one of two lanes closed

In the example above, I need to extract Expect disruption everyday between 20:00 and 06:00 from 26 October 2020 to 31 October 2020

So far, I have only come up with the following:

(?<=Schedule : ).*(?![\s\S]*Schedule)

But it does not work in Postgres. It returns the error: invalid regular expression: invalid escape \ sequence

I also tried replacing \s and \S with [:space:] and ^[:space:], following the Postgres documentation, but that did not work either.

Thanks in advance.

Answer 1

Since . in a PostgreSQL regex matches any character, including line break characters, you need to make two changes:

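The first .* should be replaced with [^\r\n]+ to match any characters other than the common line break characters.
The [\s\S] in the lookahead should be replaced with just a . (the PostgreSQL documentation for versions before 14 lists \S inside a bracket expression as illegal, which is what triggers the invalid escape error).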
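
The two points above can be checked in isolation. This is a small sketch with made-up two-line inputs (not the data from the question); it assumes the default standard_conforming_strings = on and PostgreSQL 10 or later for regexp_match.

```sql
-- By default "." also matches a line break, so this comparison is true.
SELECT E'a\nb' ~ 'a.b' AS dot_matches_newline;

-- [^\r\n]+ cannot cross a line break, so only the first line is returned.
SELECT (regexp_match(E'first line\nsecond line', '[^\r\n]+'))[1] AS first_line;
```

The first query returns true and the second returns first line, which is why .* has to be narrowed to [^\r\n]+ while a plain . is enough inside the lookahead.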

You can use

(?<=Schedule : )[^\r\n]+(?!.*Schedule)

See the online demo:

SELECT REGEXP_MATCHES(
    E'Lane Closures : Lane one will be closed\nReason : Roadworks are planned\nStatus : Pending\nSchedule : Expect disruption everyday between 20:00 and 06:00 from 5 October 2020 to 9 October 2020\nSchedule : Expect disruption everyday between 20:00 and 06:00 from 12 October 2020 to 16 October 2020\nSchedule : Expect disruption everyday between 20:00 and 06:00 from 19 October 2020 to 23 October 2020\nSchedule : Expect disruption everyday between 20:00 and 06:00 from 26 October 2020 to 31 October 2020\nLanes Closed : There will be one of two lanes closed', 
    '(?<=Schedule : )[^\r\n]+(?!.*Schedule)', 
    'g')

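Output (a single row):

{"Expect disruption everyday between 20:00 and 06:00 from 26 October 2020 to 31 October 2020"}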
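
If a plain text value is preferred over a set of arrays, or lookbehind is not available on the server, one possible alternative is a greedy prefix with a capture group. This is only a sketch: the CTE name notes, the column body, and the shortened sample text are assumptions for illustration, and regexp_match requires PostgreSQL 10 or later.

```sql
-- Hypothetical sample data; "notes" and "body" are illustrative names.
WITH notes(body) AS (
    VALUES (E'Status : Pending\nSchedule : first window\nSchedule : last window\nLanes Closed : one of two')
)
SELECT
    -- The greedy .* consumes as much as possible (line breaks included),
    -- so the capture group lands on the last "Schedule : " line.
    (regexp_match(body, '.*Schedule : ([^\r\n]+)'))[1] AS last_schedule
FROM notes;
```

With this sample the query returns last window. The design choice here is to let greediness find the last occurrence instead of a negative lookahead, at the cost of matching the whole prefix of the text.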
