Remove substring from address containing word and the word after it

Question

I've been trying to work out this regex, but I've only got halfway there.

Imagine strings (addresses) like these:

White Lund Industrial Estate  Unit 11a Southgate
White Lund Industrial Estate  Suite 124 Southgate
White Lund Industrial Estate  flat A Southgate

I want them to become:

White Lund Industrial Estate Southgate
White Lund Industrial Estate Southgate
White Lund Industrial Estate Southgate

The rule is: if Unit, Flat or Suite appears in the string, remove that word and the word after it.

I'm doing this in Postgres, and so far I have:

select REGEXP_REPLACE(lower('White Lund Industrial Estate Unit 11 a Southgate'), 'unit\S*', '');

This gives me:

white lund industrial estate 11 a southgate

How can I make the regex also remove the word that follows it?

Thanks!

Answer 1

You can use

\s*\y(unit|flat|suite)\s+\S+

Alternatively, instead of lowercasing the input, you can make the pattern case-insensitive with the embedded flag option (?i):

(?i)\s*\y(unit|flat|suite)\s+\S+
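To see why the \y word boundary matters, here is a small sketch comparing the pattern with and without it. The address '10 Newflat Lane' is made up for illustration and does not come from the question.

```sql
-- Hypothetical address in which "flat" appears inside a longer word.
SELECT
    -- Without \y, "flat Lane" is matched inside "Newflat Lane" and removed.
    regexp_replace('10 Newflat Lane', '(?i)\s*(unit|flat|suite)\s+\S+', '')   AS without_boundary,
    -- With \y, "flat" must start at a word boundary, so nothing is removed.
    regexp_replace('10 Newflat Lane', '(?i)\s*\y(unit|flat|suite)\s+\S+', '') AS with_boundary;
```

The first column should come back as '10 New', while the second leaves the address untouched.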

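Details

\s* - zero or more whitespace characters (so the space(s) before the keyword are removed too)
\y - a word boundary
(unit|flat|suite) - one of the words unit, flat or suite
\s+ - one or more whitespace characters
\S+ - one or more non-whitespace characters (the word after the keyword)

Example queries: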

select REGEXP_REPLACE(lower('White Lund Industrial Estate Unit 11a Southgate'), '\s*\y(unit|flat|suite)\s+\S+', '');

Or,

select REGEXP_REPLACE('White Lund Industrial Estate Unit 11a Southgate', '(?i)\s*\y(unit|flat|suite)\s+\S+', '');

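Output:

white lund industrial estate southgate

for the first query (lower() lowercases the whole string), and

White Lund Industrial Estate Southgate

for the second one, which keeps the original case.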
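As a rough sketch of how this might be applied to several rows at once, the query below runs the case-insensitive pattern over the three sample addresses from the question. The column name address and the alias t are hypothetical, and the 'g' flag is an addition not in the queries above: it makes regexp_replace remove every occurrence rather than only the first.

```sql
-- Sample rows copied from the question; "address" and "t" are illustrative names.
SELECT address,
       regexp_replace(address,
                      '(?i)\s*\y(unit|flat|suite)\s+\S+',
                      '',
                      'g') AS cleaned   -- 'g' = replace all matches, not just the first
FROM (VALUES
        ('White Lund Industrial Estate  Unit 11a Southgate'),
        ('White Lund Industrial Estate  Suite 124 Southgate'),
        ('White Lund Industrial Estate  flat A Southgate')
     ) AS t(address);
```

Each row should yield White Lund Industrial Estate Southgate. Because \s* consumes the whitespace in front of the keyword, the double space in the sample inputs does not leave a gap behind. If an address starts with the keyword, a leading space remains, which btrim() can strip.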