How do I use preg_replace to insert a text string between the 3rd and 4th paragraphs?
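I'm trying to figure out how to create a common newspaper device called a 'pullquote' in a Wordpress post. (This is not strictly a Wordpress question; it's more of a generic regex question.) I have a tag that surrounds some text in the post. I want to copy the text between the tags (which I know how to do) and insert it between the 3rd and 4th instances of the p tag in the post.
The function below finds the text and strips out the tags, but it merely prepends the matched text to the start. I need help targeting the 3rd/4th paragraph.
Or... maybe I'm overthinking this. Perhaps there is some way to target the element, as with jQuery's nth-child?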
Post:
<p>If you wanna improve yer German, don't try to read Heine or some elevated crap... watch old episodes of [callout]Tatort or Bukow & Konig[/callout].</p>
<p>If I were teaching a music appreciation I wouldn't teach Beethoven. I'd teach Stamitz and average composers.</p>
<p>And here is a 3rd paragraph.</p>
<p>And here is a 4th paragraph.</p>
Desired result:
<p>If you wanna improve yer German, don't try to read Heine or some elevated crap... watch old episodes of Tatort or Bukow & Konig.</p>
<p>If I were teaching a music appreciation I wouldn't teach Beethoven. I'd teach Stamitz and average composers.</p>
<p>And here is a 3rd paragraph.</p>
<blockquote class="pullquote">Tatort or Bukow & Konig</blockquote>
<p>And here is a 4th paragraph.</p>
Here is my code so far:
function jchwebdev_pullquote( $content ) {
    $newcontent = $content;
    // keep the quoted text, drop only the [callout] markers
    $replacement = '$1';
    $matches = array();
    $pattern = "~\[callout\](.*?)\[/callout\]~s";
    // strip out 'shortcode'
    $newcontent = preg_replace($pattern, $replacement, $content);
    if( preg_match($pattern, $content, $matches)) {
        // now have formatted pullquote
        $pullquote = '<blockquote class="pullquote">' .$matches[1] . '</blockquote>';
        // now how do I target and insert $pullquote
        // between 3rd and 4th paragraph?
        // placeholder for the missing step:
        // $newcontent = preg_replace(<3rd_4th_pattern>, <3rd_4th_replacement>, $newcontent);
        return $newcontent;
    }
    return $content;
}
add_filter( 'the_content' , 'jchwebdev_pullquote');
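EDIT: I'd like to amend my question to make it a bit more Wordpress-specific. Wordpress actually converts line breaks into <p> tags. Most Wordpress posts don't even use explicit 'p' tags, because they aren't needed. The problem with the solutions so far is that they seem to strip out the line breaks, so if the post (the source text) contains line breaks, the result looks odd.
A typical real-world Wordpress post: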
If you wanna improve yer German, don't try to read Heine or some elevated crap... watch old episodes of [callout]Tatort or Bukow & Konig[/callout].
If I were teaching a music appreciation I wouldn't teach Beethoven. I'd teach Stamitz and average composers.
And here is a 3rd paragraph.
And here is a 5th paragraph.
Wordpress renders it like this:
<p>If you wanna improve yer German, don't try to read Heine or some elevated crap... watch old episodes of [callout]Tatort or Bukow & Konig[/callout].</p>
<p>If I were teaching a music appreciation I wouldn't teach Beethoven. I'd teach Stamitz and average composers.</p>
<p>And here is a 3rd paragraph.</p>
<p></p>
<p>And here is a 5th paragraph.</p>
So in a perfect world, I'd like to take the 'typical real world post' and have preg_replace render it as:
If you wanna improve yer German, don't try to read Heine or some elevated crap... watch old episodes of Tatort or Bukow & Konig.
If I were teaching a music appreciation I wouldn't teach Beethoven. I'd teach Stamitz and average composers.
And here is a 3rd paragraph.
<blockquote class="callout">Tatort or Bukow & Konig</blockquote>
And here is a 5th paragraph.
...which Wordpress would then render as:
<p>If you wanna improve yer German, don't try to read Heine or some elevated crap... watch old episodes of Tatort or Bukow & Konig.</p>
<p>If I were teaching a music appreciation I wouldn't teach Beethoven. I'd teach Stamitz and average composers.</p>
<p>And here is a 3rd paragraph.</p>
<blockquote class="callout">Tatort or Bukow & Konig</blockquote>
<p>And here is a 5th paragraph.</p>
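Maybe this is drifting too far off topic and I should repost in a Wordpress forum, but I -think- what I need is to change preg_replace to use line breaks as the delimiter instead of <p>, and to figure out how -not- to strip those line breaks from the returned string.
Thanks for all the help so far!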
You can do this in a single preg_replace call.
$re = "~^(?:(?!/p).)*<p>(?:(?!/p).)*\[callout\](.*?)\[/callout\].*?</p>(?:[^<>]*<p>.*?</p>){2}[^<]*\K~s";
$str = "<p>If you wanna improve yer German, don't try to read Heine or some elevated crap... watch old episodes of [callout]Tatort or Bukow & Konig[/callout].</p>\n<p>If I were teaching a music appreciation I wouldn't teach Beethoven. I'd teach Stamitz and average composers.</p>\n<p>And here is a 3rd paragraph.</p>\n<p>And here is a 4th paragraph.</p>";
// $1 is the text captured between [callout] and [/callout].
// \K makes the match empty, so this inserts the blockquote after the 3rd paragraph;
// the [callout] markers themselves stay in the first paragraph.
$subst = "<blockquote class=\"pullquote\">$1</blockquote>\n";
$result = preg_replace($re, $subst, $str);
echo $result;
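Simply by using (.*?</p>){3}\K with the s modifier, you can achieve what you want. Pass a limit of 1 so that a post with six or more paragraphs does not get a second insert after the 6th:
preg_replace("@(.*?</p>){3}\K@s", $pullquote, $content, 1);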
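To make the role of \K concrete, here is a minimal sketch on a made-up four-paragraph string (the sample text and variable values are illustrative, not from the post). The group consumes everything up to the third </p>, and \K then drops that text from the reported match, so the replacement is inserted at that point instead of overwriting anything.

```php
<?php
// Three tiny paragraphs plus a fourth stand in for a rendered post (illustrative data).
$content   = "<p>one</p>\n<p>two</p>\n<p>three</p>\n<p>four</p>";
$pullquote = '<blockquote class="pullquote">quote</blockquote>';

// (.*?</p>){3} walks lazily to the end of the third </p>;
// \K resets the start of the match, leaving an empty match at that position.
// The limit of 1 stops the pattern from matching again further down the post.
$result = preg_replace('@(.*?</p>){3}\K@s', $pullquote, $content, 1);

echo $result;
```

The line break that followed the third paragraph ends up after the blockquote, because the insert happens immediately after the closing </p>.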
I made a few changes to your function to get it working:
function jchwebdev_pullquote( $content )
{
    $pattern = "~\[callout\](.*?)\[/callout\]~s";
    if(preg_match($pattern, $content, $matches))
    {
        // '$1' keeps the quoted text in place and removes only the [callout] markers
        $content = preg_replace($pattern, '$1', $content);
        $pullquote = '<blockquote class="pullquote">' .$matches[1] . '</blockquote>';
        // insert once, right after the third </p>
        $content = preg_replace("@(.*?</p>){3}\K@s", $pullquote, $content, 1);
        return $content;
    }
    return $content;
}
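One caveat with passing the pullquote as a replacement string: preg_replace interprets sequences such as $5 or \1 inside it as backreferences, so quoted text like "costs $5" could come out altered. A possible workaround, sketched here with a hypothetical helper name (insert_after_nth_paragraph is not part of the original answer), is to return the text from a preg_replace_callback callback, which is used verbatim.

```php
<?php
/**
 * Hypothetical helper, shown only as a sketch: insert $insert right after
 * the $n-th closing </p> in $html.
 */
function insert_after_nth_paragraph(string $html, string $insert, int $n = 3): string
{
    // Same idea as (.*?</p>){3}\K, with the count made configurable.
    $pattern = '@(?:.*?</p>){' . $n . '}\K@s';

    // The callback's return value is inserted as-is, so "$5" or "\1"
    // in $insert is not treated as a backreference.
    return preg_replace_callback(
        $pattern,
        function (array $m) use ($insert) {
            return $insert;
        },
        $html,
        1 // insert only once
    );
}
```

If the post has fewer than $n paragraphs the pattern does not match and the content comes back unchanged.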
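Update #1
Optimization: use a single preg_replace to avoid applying multiple patterns:
function jchwebdev_pullquote( $content )
{
    $pattern = "\[callout\](.*?)\[/callout\]";
    if(preg_match("@(?s)$pattern@", $content, $matches))
    {
        // $2 = the quoted text, $3 = everything up to and including the third </p> after the callout.
        // This counts closing tags from the callout onward, so it assumes the callout is in the first paragraph.
        $content = preg_replace("@(?s)($pattern)((.*?</p>){3})@", '$2$3<blockquote class="pullquote">$2</blockquote>', $content, 1);
        return $content;
    }
    return $content;
}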
If you would rather use PHP's HTML/XML parsing, see How do you parse and process HTML/XML in PHP?.
If you want a regex solution, here is one:
Find: (?s)((?:<p>.*?<\/p>\s*){3})
This regex captures only the first 3 <p> elements; the replacement puts them back and adds a node after them.
Replace: $1<blockquote class="pullquote">Tatort or Bukow & Konig</blockquote>\n
Code:
$re = "/(?s)((?:<p>.*?<\/p>\s*){3})/";
$str = "<p>If you wanna improve yer German, don't try to read Heine or some elevated crap... watch old episodes of [callout]Tatort or Bukow & Konig[/callout].</p>\n<p>If I were teaching a music appreciation I wouldn't teach Beethoven. I'd teach Stamitz and average composers.</p>\n<p>And here is a 3rd paragraph.</p>\n<p>And here is a 4th paragraph.</p>";
// $1 restores the three captured paragraphs; without it they would be replaced by the blockquote
$subst = "$1<blockquote class=\"pullquote\">Tatort or Bukow & Konig</blockquote>\n";
$result = preg_replace($re, $subst, $str, 1);
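For the case raised in the question's edit, where the source text has line breaks instead of <p> tags, one possible approach is to split on the line breaks while keeping them, then rejoin. The sketch below uses a hypothetical function name (pullquote_by_line_breaks) and assumes that every run of line breaks marks a paragraph boundary; if single line breaks inside a paragraph should not count, the split pattern would need to change. In Wordpress this would presumably have to run before the line breaks are converted to paragraphs (for example by registering the filter with an earlier priority), which is an assumption worth checking against your setup.

```php
<?php
/**
 * Hypothetical helper (not from the thread): works on raw post text whose
 * paragraphs are separated by line breaks rather than <p> tags.
 */
function pullquote_by_line_breaks(string $content, int $after = 3): string
{
    $pattern = '~\[callout\](.*?)\[/callout\]~s';
    if (!preg_match($pattern, $content, $matches)) {
        return $content; // no callout, nothing to do
    }

    // Keep the quoted text in place, drop only the [callout] markers.
    $content = preg_replace($pattern, '$1', $content);

    // Split on runs of line breaks but capture them, so the original
    // spacing survives when the pieces are joined again.
    $parts = preg_split('~(\R+)~', $content, -1, PREG_SPLIT_DELIM_CAPTURE);

    $quote = '<blockquote class="pullquote">' . $matches[1] . '</blockquote>';
    $seen  = 0;
    $out   = '';
    foreach ($parts as $i => $part) {
        $out .= $part;
        // Even indexes hold paragraph text, odd indexes hold the captured breaks.
        if ($i % 2 === 0 && trim($part) !== '' && ++$seen === $after) {
            $out .= "\n" . $quote;
        }
    }
    return $out;
}
```

The existing line breaks are passed through untouched; the only addition is one "\n" in front of the blockquote so that it sits on its own line.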