PHP regex: add a space between tags and text
I have some inline content, for example:
<p>"Geen nuwe inisiatief, bestuur verandering, of verkryging in<a href="http://business.time.com/2013/09/24/the-fatal-mistake-that-doomed-blackberry/">2007 kon gered het die BlackBerry</a>. Dit was te laat, en die kloof is te groot, "Arment geskryf.</p>
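I want to add a space before the <a> tag and other inline tags (strong, italic, etc.), but only when the tag is immediately preceded by a letter (which may also be a Japanese character). A space should be added after the tag only when the following character is also a letter rather than punctuation (e.g. ., !, ?...).
Do you know how I can achieve this?
My regex so far is: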
preg_replace('/<a(.*)>(.*)<\/a>?/', ' $0', $out);
Obviously it has no conditions yet... Thank you very much for your help.
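<h1>Description</h1>
<pre><code>\s?<(a|strong|italic)(?=[\s>])(?:[^>=]|=(?:'[^']*'|"[^"]*"|[^'"\s]*))*\s?\/?>.*?<\/\1>(?=[\s,.;?!]|(?=.*?(\s)))</code></pre>
<p>Replace with: <code>_$0$2</code></p>
<p>Note that this is a space (shown here as <code>_</code>), then <code>$0</code> and <code>$2</code>.</p>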
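<p>This regex will do the following:</p>
<ul>
<li>Match an optional leading space before the tag: if a space is there it is replaced, and if there is none, one is inserted.</li>
<li>Insert a space after the closing tag only if there is not one already and the next character is not punctuation.</li>
</ul>
<p>If there is no further whitespace later on the page, the last tag on the page will be a problem.</p>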
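<p>As a rough alternative sketch (a different technique, not the regex from this answer), the two conditions from the question can also be written as Unicode-aware lookarounds. It assumes UTF-8 input and reasonably well-formed inline tags; the helper name <code>spaceAroundInlineTags</code> and the default tag list are hypothetical and only serve the illustration.</p>

```php
<?php
// Hypothetical helper: pads inline tags with a space only where a letter touches them.
// Assumes UTF-8 input; \p{L} matches any Unicode letter, including Japanese scripts.
function spaceAroundInlineTags(string $html, array $tags = ['a', 'strong', 'em', 'i', 'b']): string
{
    $names = implode('|', array_map('preg_quote', $tags));

    // Zero-width match: previous character is a letter and an opening tag starts here.
    $html = preg_replace('~(?<=\p{L})(?=<(?:' . $names . ')\b)~u', ' ', $html) ?? $html;

    // Closing tag directly followed by a letter: keep the tag, append one space.
    return preg_replace('~(</(?:' . $names . ')>)(?=\p{L})~u', '$1 ', $html) ?? $html;
}

$in = '<p>verkryging in<a href="#">BlackBerry</a>. Dit was<strong>te laat</strong>en klaar.</p>';
echo spaceAroundInlineTags($in), "\n";
// prints: <p>verkryging in <a href="#">BlackBerry</a>. Dit was <strong>te laat</strong> en klaar.</p>
```

<p>Because both patterns only look at the single character next to the tag, existing spaces and punctuation are left untouched. Unlike the regex in this answer, the sketch does not parse attribute values, so tag-like text inside an attribute is not handled specially.</p>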
<h1>Example</h1>
<p><strong>Live Demo</strong></p>
<p><a href="https://regex101.com/r/bR2gZ3/1" rel="nofollow">https://regex101.com/r/bR2gZ3/1</a></p>
<p><strong>Sample text</strong></p>
<pre><code><p>"Geen nuwe inisiatief, bestuur verandering, of verkryging in<a href="http://business.time.com/2013/09/24/the-fatal-mistake-that-doomed-blackberry/">2007 kon gered het die BlackBerry</a>. Dit was te laat, <a href=Droid.jpg onmouseover=' var s=" <a href=NotTheDroidsYouAreLookingFor.jpg </a> "; ' >Not the Droid you are looking for</a>en die kloof is te groot, "Arment geskryf.</p></code></pre>
<p><strong>After replacement</strong></p>
<pre><code><p>"Geen nuwe inisiatief, bestuur verandering, of verkryging in <a href="http://business.time.com/2013/09/24/the-fatal-mistake-that-doomed-blackberry/">2007 kon gered het die BlackBerry</a>. Dit was te laat, <a href=Droid.jpg onmouseover=' var s=" <a href=NotTheDroidsYouAreLookingFor.jpg </a> "; ' >Not the Droid you are looking for</a> en die kloof is te groot, "Arment geskryf.</p></code></pre>
<h1>Explanation</h1>
<pre><code>NODE                     EXPLANATION
----------------------------------------------------------------------
\s?                      whitespace (\n, \r, \t, \f, and " ")
                         (optional (matching the most amount
                         possible))
----------------------------------------------------------------------
<                        '<'
----------------------------------------------------------------------
(                        group and capture to \1:
----------------------------------------------------------------------
a                        'a'
----------------------------------------------------------------------
|                        OR
----------------------------------------------------------------------
strong                   'strong'
----------------------------------------------------------------------
|                        OR
----------------------------------------------------------------------
italic                   'italic'
----------------------------------------------------------------------
)                        end of \1
----------------------------------------------------------------------
(?=                      look ahead to see if there is:
----------------------------------------------------------------------
[\s>]                    any character of: whitespace (\n, \r,
                         \t, \f, and " "), '>'
----------------------------------------------------------------------
)                        end of look-ahead
----------------------------------------------------------------------
(?:                      group, but do not capture (0 or more times
                         (matching the most amount possible)):
----------------------------------------------------------------------
[^>=]                    any character except: '>', '='
----------------------------------------------------------------------
|                        OR
----------------------------------------------------------------------
=                        '='
----------------------------------------------------------------------
(?:                      group, but do not capture:
----------------------------------------------------------------------
'                        '\''
----------------------------------------------------------------------
[^']*                    any character except: ''' (0 or more
                         times (matching the most amount
                         possible))
----------------------------------------------------------------------
'                        '\''
----------------------------------------------------------------------
|                        OR
----------------------------------------------------------------------
"                        '"'
----------------------------------------------------------------------
[^"]*                    any character except: '"' (0 or more
                         times (matching the most amount
                         possible))
----------------------------------------------------------------------
"                        '"'
----------------------------------------------------------------------
|                        OR
----------------------------------------------------------------------
[^'"\s]*                 any character except: ''', '"',
                         whitespace (\n, \r, \t, \f, and " ")
                         (0 or more times (matching the most
                         amount possible))
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
)*                       end of grouping
----------------------------------------------------------------------
\s?                      whitespace (\n, \r, \t, \f, and " ")
                         (optional (matching the most amount
                         possible))
----------------------------------------------------------------------
\/?                      '/' (optional (matching the most amount
                         possible))
----------------------------------------------------------------------
>                        '>'
----------------------------------------------------------------------
.*?                      any character except \n (0 or more times
                         (matching the least amount possible))
----------------------------------------------------------------------
<                        '<'
----------------------------------------------------------------------
\/                       '/'
----------------------------------------------------------------------
\1                       what was matched by capture \1
----------------------------------------------------------------------
>                        '>'
----------------------------------------------------------------------
(?=                      look ahead to see if there is:
----------------------------------------------------------------------
[\s,.;?!]                any character of: whitespace (\n, \r,
                         \t, \f, and " "), ',', '.', ';', '?',
                         '!'
----------------------------------------------------------------------
|                        OR
----------------------------------------------------------------------
(?=                      look ahead to see if there is:
----------------------------------------------------------------------
.*?                      any character except \n (0 or more
                         times (matching the least amount
                         possible))
----------------------------------------------------------------------
(                        group and capture to \2:
----------------------------------------------------------------------
\s                       whitespace (\n, \r, \t, \f, and " ")
----------------------------------------------------------------------
)                        end of \2
----------------------------------------------------------------------
)                        end of look-ahead
----------------------------------------------------------------------
)                        end of look-ahead
----------------------------------------------------------------------</code></pre>
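<p>The closing-tag node in the table refers back to the first capture group, so the closing tag has to repeat the name the opening tag matched. A minimal sketch of just that idea, using a deliberately simplified pattern (not the full regex from this answer) and made-up sample strings:</p>

```php
<?php
// Simplified pattern: group 1 captures the tag name, \1 demands the same name when closing.
$pattern = '~<(a|strong)\b[^>]*>(.*?)</\1>~';

$samples = [
    '<a href="#">link</a>',      // closing tag repeats "a": matches
    '<strong>bold</strong>',     // closing tag repeats "strong": matches
    '<a href="#">link</strong>', // names differ: no match
];

foreach ($samples as $s) {
    if (preg_match($pattern, $s, $m)) {
        echo "match: tag={$m[1]} text={$m[2]}\n";
    } else {
        echo "no match: $s\n";
    }
}
```

<p>Without the back-reference, a pattern ending in a generic closing tag would happily pair an opening <code>a</code> with a closing <code>strong</code>.</p>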