PHP - DOM - get text with tags

I'm trying to get all of the content (text plus tags) from a string like this:

'<div id='1' data-AAA='something1' data-BBB='something2'><em>My</em></div>
<div id='5' data-AAA='something5' data-BBB='something6'><span style='color:red;'>Web</span></div>
    ...'

When I do this:

        $dom = new DOMDocument;
        $dom->loadHTML($value);
        foreach ($dom->getElementsByTagName('div') as $Sub) {
            $valueSub = $Sub->nodeValue;
            var_dump($valueSub);
        }
        die;

I get:

string 'My' (length=2)
string 'Web' (length=3)

What I expect is the same output, but including the tags that wrap the text inside each div, like this:

string '<em>My</em>' (length=11)
string '<span style='color:red;'>Web</span>' (length=35)

How can I do this?

Thanks

You can use XPath, with the following code:

$string = <<<EOF
<div id='1' data-AAA='something1' data-BBB='something2'><em>My</em></div>
<div id='5' data-AAA='something5' data-BBB='something6'><span style='color:red;'>Web</span></div>
EOF;

$doc = new DOMDocument();
$doc->loadHTML($string);

$selector = new DOMXPath($doc);

// Select the parent elements of text nodes somewhere
// in div elements
foreach($selector->query('//div//text()/..') as $node) {
    var_dump($doc->saveHTML($node));
}

Output:

string(11) "<em>My</em>"
string(35) "<span style="color:red;">Web</span>"
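
Note that the XPath query above returns the parent element of each text node, so a div whose content is nested more deeply (an `<em>` containing a `<b>`, for instance) would yield several separate results. If the goal is the full inner markup of each div, one possible alternative is to serialize each child node of the div with `saveHTML()`. This is only a sketch: `innerHTML` is a hypothetical helper name, not part of PHP's DOM extension, and it assumes PHP 5.3.6 or later, where `saveHTML()` accepts a node argument.

```php
<?php
// Hypothetical helper (not part of the DOM extension): builds the markup
// inside an element by serializing each of its child nodes in turn.
function innerHTML(DOMNode $element)
{
    $html = '';
    foreach ($element->childNodes as $child) {
        $html .= $element->ownerDocument->saveHTML($child);
    }
    return $html;
}

$string = <<<EOF
<div id='1' data-AAA='something1' data-BBB='something2'><em>My</em></div>
<div id='5' data-AAA='something5' data-BBB='something6'><span style='color:red;'>Web</span></div>
EOF;

$doc = new DOMDocument();
$doc->loadHTML($string);

// Collect the inner markup of every div in document order.
$results = array();
foreach ($doc->getElementsByTagName('div') as $div) {
    $results[] = innerHTML($div);
}

var_dump($results);
```

For the sample input this should give the same two strings as the XPath version. As in the output above, the serializer writes attribute values with double quotes.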