PHP DOMDocument::getElementById fails with DOMDocumentFragment

I'd just like to know why this fails. Here is the test case:

<?php
error_reporting(E_ALL);
ini_set('display_errors', '1');

$doc = new DOMDocument();
$doc->loadHTML("<!DOCTYPE html><html><body><div id='testId'>Test</div></body></html>");
echo "This works: ".$doc->getElementById('testId')->nodeValue.'<br/>';

$fragment = $doc->createDocumentFragment();
$fragment->appendXML("<p id='testId2'>Test 2</p>");
$doc->getElementById('testId')->appendChild($fragment);

echo "This still works: ".$doc->getElementById('testId')->nodeValue.'<br/>';
echo "This doesn't work: ".$doc->getElementById('testId2')->nodeValue.'<br/>';

A workaround is to query by attribute with XPath instead:

$xpath = new \DOMXPath($doc);
// query() returns a DOMNodeList; take the first match and read its text
$value = $xpath->query('//*[@id="testId2"]')->item(0)->nodeValue;

The original DOM 2.0 specification says:

getElementById introduced in DOM Level 2

Returns the Element whose ID is given by elementId. If no such element exists, returns null. Behavior is not defined if more than one element has this ID.

Note: The DOM implementation must have information that says which attributes are of type ID. Attributes with the name "ID" are not of type ID unless so defined. Implementations that do not know whether attributes are of type ID or not are expected to return null.

The important part here is that attributes with the name "ID" are not of type ID unless so defined.

When you work with HTML, the built-in DTD defines "id" as an element ID attribute:

<!ENTITY % coreattrs
 "id          ID             #IMPLIED  -- document-wide unique id --
  class       CDATA          #IMPLIED  -- space-separated list of classes --
  style       %StyleSheet;   #IMPLIED  -- associated style info --
  title       %Text;         #IMPLIED  -- advisory title --"
  >

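However, when you append an element to a document fragment with DOMDocumentFragment::appendXML(), you are working with raw XML, which has no such DTD. (This seems unintuitive, since you have appended it to an HTML document, but the whole DOMDocument API is far from intuitive!)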
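To make the distinction concrete, here is a minimal sketch that reuses the markup from the question. It assumes the behaviour described above: the appended element is part of the tree, so an attribute-based XPath query matches it, while the ID lookup is expected to come back empty because nothing has declared that attribute to be of type ID. The variable names `$viaXPath` and `$viaId` are only illustrative.

```php
<?php
$doc = new DOMDocument();
$doc->loadHTML("<!DOCTYPE html><html><body><div id='testId'>Test</div></body></html>");

// Build the fragment from raw XML and attach it to the HTML document.
$fragment = $doc->createDocumentFragment();
$fragment->appendXML("<p id='testId2'>Test 2</p>");
$doc->getElementById('testId')->appendChild($fragment);

// XPath compares the attribute by name and value, so it needs no ID typing.
$xpath = new DOMXPath($doc);
$viaXPath = $xpath->query('//*[@id="testId2"]')->item(0);

// getElementById() relies on the attribute being registered as an ID.
$viaId = $doc->getElementById('testId2');

var_dump($viaXPath instanceof DOMElement); // the node is in the tree
var_dump($viaId);                          // the ID lookup result, for comparison
```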

PHP does address this in the documentation for DOMDocument::getElementById():

For this function to work, you will need either to set some ID attributes with DOMElement::setIdAttribute or a DTD which defines an attribute to be of type ID.

So the solution is simply to tell the parser that the id attribute really is an ID attribute:

$doc = new DOMDocument();
$doc->loadHTML("<!DOCTYPE html><html><body><div id='testId'>Test</div></body></html>");
echo "This works: ".$doc->getElementById('testId')->nodeValue.'<br/>';

$fragment = $doc->createDocumentFragment();
$fragment->appendXML("<p id='testId2'>Test 2</p>");
// here's the magic: mark "id" on the new element as an attribute of type ID
$fragment->childNodes[0]->setIdAttribute("id", true);
$doc->getElementById('testId')->appendChild($fragment);

echo "This still works: ".$doc->getElementById('testId')->nodeValue.'<br/>';
echo "And so does this: ".$doc->getElementById('testId2')->nodeValue.'<br/>';
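If a fragment contains several elements, or nested ones, the same call can be applied to each of them. The following is a rough sketch under that assumption, not part of the DOM API: `registerIdAttributes()` is a hypothetical helper name, and it only looks at attributes literally named `id`.

```php
<?php
/**
 * Hypothetical helper: walks the subtree below $root and marks every
 * "id" attribute it finds as an attribute of type ID.
 */
function registerIdAttributes(DOMNode $root): void
{
    foreach ($root->childNodes as $child) {
        if ($child instanceof DOMElement) {
            // setIdAttribute() throws if the attribute is missing, so check first.
            if ($child->hasAttribute('id')) {
                $child->setIdAttribute('id', true);
            }
            registerIdAttributes($child);
        }
    }
}

$doc = new DOMDocument();
$doc->loadHTML("<!DOCTYPE html><html><body><div id='testId'>Test</div></body></html>");

$fragment = $doc->createDocumentFragment();
$fragment->appendXML("<section id='outer'><p id='inner'>Nested</p></section>");

// Register the IDs before appending, as in the example above.
registerIdAttributes($fragment);
$doc->getElementById('testId')->appendChild($fragment);

$inner = $doc->getElementById('inner');
echo $inner === null ? 'not found' : $inner->nodeValue;
```

The helper is called before `appendChild()` because appending a fragment moves its children into the target node and leaves the fragment itself empty.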