Replace "or" with "|" when it is not inside double quotes, using PHP

I want to replace every "or" that is not inside double quotes with "|", using PHP.

Input:

"Supermajority Vote for State Taxes or fees" or taxes or "ssd or ffF"

Expected output:

"Supermajority Vote for State Taxes or fees" | taxes | "ssd or ffF"

I tried the following, but it cannot handle multiple occurrences:

preg_replace("/(\".*\")\s+(or)\s+(.*)/", " | ", $input);

Check that the number of quotes from the match to the end of the string is even:

\bor\b(?=([^\"]|\"[^\"]+\")+$)

Demo and some explanation:

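\b - word boundary

(?= - positive lookahead: tests what follows the current position

([^\"]|\"[^\"]+\") - a character that is not a quote, or "some things in quotes"

+$ - the group above, repeated up to the end of the string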
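As a rough sketch, the lookahead pattern above could be applied with preg_replace like this. The pattern is written in a single-quoted PHP string here, so the double quotes need no backslashes; the variable names are only illustrative.

```php
<?php
// Sketch: replace "or" only when the rest of the string contains balanced quotes.
$input = '"Supermajority Vote for State Taxes or fees" or taxes or "ssd or ffF"';

// Same pattern as above, without the \" escapes that a double-quoted PHP string would need.
$pattern = '/\bor\b(?=([^"]|"[^"]+")+$)/';

$output = preg_replace($pattern, '|', $input);

echo $output, PHP_EOL;
// "Supermajority Vote for State Taxes or fees" | taxes | "ssd or ffF"
```

Note that the lookahead rescans the remainder of the string for every candidate "or", so this is probably best kept to short inputs.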

There may well be a fix for the regex you provided in the question. But what if you need quotes inside the quoted input?

"Supermajority Vote for \"State Taxes\" or \"fees\"" or taxes or "ssd or ffF"

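OK, so now you are looking for strings between quotes, unless the quote is preceded by a backslash. But what if you want a backslash at the end of the string?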

"Supermajority Vote for State Taxes or fees\" or taxes or "ssd or ffF"

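So now you want to find strings between quotes, unless the quote is preceded by a backslash, unless that backslash is itself preceded by another backslash.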

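You can keep going like this, but a regex built up this way cannot support an unlimited number of backslashes. To do this correctly, you need to build a lexer.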
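To give an idea of what that could look like, here is a minimal hand-written scanner rather than a full lexer. The function name replaceOrOutsideQuotes is hypothetical, and the sketch assumes that a backslash inside a quoted string escapes the character that follows it.

```php
<?php
// Hypothetical helper: a tiny character-by-character scanner, not a full lexer.
// Assumption: inside a quoted string, a backslash escapes the next character.
function replaceOrOutsideQuotes(string $input): string
{
    // True for characters that \w would match (letters, digits, underscore).
    $isWordChar = function (string $c): bool {
        return preg_match('/^\w$/', $c) === 1;
    };

    $output   = '';
    $inQuotes = false;
    $length   = strlen($input);

    for ($i = 0; $i < $length; $i++) {
        $char = $input[$i];

        if ($inQuotes) {
            if ($char === '\\' && $i + 1 < $length) {
                // Copy the backslash and the escaped character verbatim.
                $output .= $char . $input[++$i];
            } else {
                if ($char === '"') {
                    $inQuotes = false; // closing quote
                }
                $output .= $char;
            }
            continue;
        }

        if ($char === '"') {
            $inQuotes = true; // opening quote
            $output .= $char;
            continue;
        }

        // Outside quotes: replace a standalone "or" (no word character on either side).
        if (substr($input, $i, 2) === 'or'
            && ($i === 0 || !$isWordChar($input[$i - 1]))
            && ($i + 2 >= $length || !$isWordChar($input[$i + 2]))
        ) {
            $output .= '|';
            $i++; // skip the "r"
            continue;
        }

        $output .= $char;
    }

    return $output;
}

echo replaceOrOutsideQuotes('"Supermajority Vote for \"State Taxes\" or \"fees\"" or taxes or "ssd or ffF"'), PHP_EOL;
// "Supermajority Vote for \"State Taxes\" or \"fees\"" | taxes | "ssd or ffF"
```

Because the scanner consumes a backslash together with the character after it, any number of consecutive backslashes is handled without extra cases.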

A perfect use case for (*SKIP)(*FAIL):

"[^"]+"(*SKIP)(*FAIL)|\bor\b

Each match needs to be replaced with |; see a demo on regex101.com.


PHP:

<?php

$string = '"Supermajority Vote for State Taxes or fees" or taxes or "ssd or ffF"';
$regex = '~"[^"]+"(*SKIP)(*FAIL)|\bor\b~';

$string = preg_replace($regex, '|', $string);

echo $string;
?>

This yields

"Supermajority Vote for State Taxes or fees" | taxes | "ssd or ffF"


Broken down, the expression means:

"[^"]+"        # everything between "..."
(*SKIP)(*FAIL) # "forget" everything matched so far and resume after it
|              # or
\bor\b         # "or" with word boundaries on both sides (so neither "for" nor "nor", etc.)
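To see both parts of the breakdown at work, here is a small sketch with a made-up input in which "nor" and "for" appear outside quotes and "or" appears inside them. The input string and variable names are only illustrative.

```php
<?php
// Same pattern as in the breakdown above.
$regex = '~"[^"]+"(*SKIP)(*FAIL)|\bor\b~';

// Illustrative input: "nor" and "for" must stay intact, and so must the quoted part.
$sample = 'nor taxes or "for or against" or fees for all';

$result = preg_replace($regex, '|', $sample);

echo $result, PHP_EOL;
// nor taxes | "for or against" | fees for all
```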


As @mickmackusa points out, you can even account for backslash escapes; see a demo on regex101.com.
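The exact pattern from that demo is not reproduced here, so the following is only one possible way to do it: the quoted part is extended so that a backslash followed by any character counts as part of the string.

```php
<?php
// Assumed variant of the pattern: a quoted string may contain backslash escapes.
// In this single-quoted PHP string, \\\\ reaches the regex engine as \\ (one literal backslash).
$regex = '~"(?:[^"\\\\]|\\\\.)*"(*SKIP)(*FAIL)|\bor\b~s';

$string = '"Supermajority Vote for \"State Taxes\" or \"fees\"" or taxes or "ssd or ffF"';

$replaced = preg_replace($regex, '|', $string);

echo $replaced, PHP_EOL;
// "Supermajority Vote for \"State Taxes\" or \"fees\"" | taxes | "ssd or ffF"
```

Since an escaped backslash is consumed as a pair, a quoted string that ends in \\ is still closed by the quote that follows it.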