How to convert XML string to PHP array with a different structure?

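I have this method that converts an XML string to a PHP array with separate keys for values and attributes, so that the XML is captured in full. However, when an element has several children of the same type, I don't get the array structure I need, and I'm stuck on how to change the method.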

Here is the method:

/**
 * Converts an XML string to an array
 *
 * @param $xmlString
 * @return array
 */
private function parseXml($xmlString)
{
    $doc = new DOMDocument;
    $doc->loadXML($xmlString);
    $root = $doc->documentElement;
    $output[$root->tagName] = $this->domNodeToArray($root, $doc);

    return $output;
}

/**
 * @param $node
 * @param $xmlDocument
 * @return array|string
 */
private function domNodeToArray($node, $xmlDocument)
{
    $output = [];
    switch ($node->nodeType)
    {
        case XML_CDATA_SECTION_NODE:
        case XML_TEXT_NODE:
            $output = trim($node->textContent);
            break;
        case XML_ELEMENT_NODE:
            for ($i = 0, $m = $node->childNodes->length; $i < $m; $i++)
            {
                $child = $node->childNodes->item($i);
                $v = $this->domNodeToArray($child, $xmlDocument);

                if (isset($child->tagName))
                {
                    $t = $child->tagName;

                    if (!isset($output['value'][$t]))
                    {
                        $output['value'][$t] = [];
                    }
                    $output['value'][$t][] = $v;
                }
                else if ($v || $v === '0')
                {
                    $output['value'] = htmlspecialchars((string)$v, ENT_XML1 | ENT_COMPAT, 'UTF-8');
                }
            }

            if (isset($output['value']) && $node->attributes->length && !is_array($output['value']))
            {
                $output = ['value' => $output['value']];
            }

            if (!$node->attributes->length && isset($output['value']) && !is_array($output['value']))
            {
                $output = ['attributes' => [], 'value' => $output['value']];
            }

            if ($node->attributes->length)
            {
                $a = [];
                foreach ($node->attributes as $attrName => $attrNode)
                {
                    $a[$attrName] = (string)$attrNode->value;
                }
                $output['attributes'] = $a;
            }
            else
            {
                $output['attributes'] = [];
            }

            if (isset($output['value']) && is_array($output['value']))
            {
                foreach ($output['value'] as $t => $v)
                {
                    if (is_array($v) && count($v) == 1 && $t != 'attributes')
                    {
                        $output['value'][$t] = $v[0];
                    }
                }
            }
            break;
    }

    return $output;
}

Here is some sample XML:

<?xml version="1.0" encoding="UTF-8"?>
<characters>
   <character>
      <name2>Sno</name2>
      <friend-of>Pep</friend-of>
      <since>1950-10-04</since>
      <qualification>extroverted beagle</qualification>
   </character>
   <character>
      <name2>Pep</name2>
      <friend-of>Sno</friend-of>
      <since>1966-08-22</since>
      <qualification>bold, brash and tomboyish</qualification>
   </character>
</characters>

Running the method with this XML as its argument returns the following array:

array:1 [▼
  "characters" => array:2 [▼
    "value" => array:1 [▼
      "character" => array:2 [▼
        0 => array:2 [▼
          "value" => array:4 [▼
            "name2" => array:2 [▼
              "attributes" => []
              "value" => "Sno"
            ]
            "friend-of" => array:2 [▼
              "attributes" => []
              "value" => "Pep"
            ]
            "since" => array:2 [▼
              "attributes" => []
              "value" => "1950-10-04"
            ]
            "qualification" => array:2 [▼
              "attributes" => []
              "value" => "extroverted beagle"
            ]
          ]
          "attributes" => []
        ]
        1 => array:2 [▼
          "value" => array:4 [▼
            "name2" => array:2 [▼
              "attributes" => []
              "value" => "Pep"
            ]
            "friend-of" => array:2 [▼
              "attributes" => []
              "value" => "Sno"
            ]
            "since" => array:2 [▼
              "attributes" => []
              "value" => "1966-08-22"
            ]
            "qualification" => array:2 [▼
              "attributes" => []
              "value" => "bold, brash and tomboyish"
            ]
          ]
          "attributes" => []
        ]
      ]
    ]
    "attributes" => []
  ]
]

I would like the result to be (the indentation may be off):

array:1 [▼
  "characters" => array:2 [▼
    "value" => array:2 [▼
      0 => array:1 [▼
        "character" => array:2 [▼
          "value" => array:4 [▼
            "name2" => array:2 [▼
              "attributes" => []
              "value" => "Sno"
            ]
            "friend-of" => array:2 [▼
              "attributes" => []
              "value" => "Pep"
            ]
            "since" => array:2 [▼
              "attributes" => []
              "value" => "1950-10-04"
            ]
            "qualification" => array:2 [▼
              "attributes" => []
              "value" => "extroverted beagle"
            ]
          ]
          "attributes" => []
        ]
      ]
      1 => array:1 [▼
        "character" => array:2 [▼
          "value" => array:4 [▼
            "name2" => array:2 [▼
              "attributes" => []
              "value" => "Pep"
            ]
            "friend-of" => array:2 [▼
              "attributes" => []
              "value" => "Sno"
            ]
            "since" => array:2 [▼
              "attributes" => []
              "value" => "1966-08-22"
            ]
            "qualification" => array:2 [▼
              "attributes" => []
              "value" => "bold, brash and tomboyish"
            ]
          ]
          "attributes" => []
        ]
      ]
    ]
    "attributes" => []
  ]
]

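So basically, I want the value key of the characters key to be an array of two items, each of which contains one character key. This should only happen when there are several identical elements on the same branch. As it stands, the character key is itself an array with 2 elements, which does not work in my case.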

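I haven't managed to change the method above to reflect my needs, and I'm not sure which approach to take. Reshaping the array like this from a DOMDocument instance seems complicated.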

I made a few changes to your function, but I'm not sure whether this is what you need.

private function domNodeToArray($node, $xmlDocument)
{
    $output = ['value' => [], 'attributes' => []];

    switch ($node->nodeType) {
    case XML_CDATA_SECTION_NODE:
    case XML_TEXT_NODE:
        $output = trim($node->textContent);
        break;
    case XML_ELEMENT_NODE:
        for ($i = 0, $m = $node->childNodes->length; $i < $m; $i++) {
            $child = $node->childNodes->item($i);
            $v = $this->domNodeToArray($child, $xmlDocument);

            if (isset($child->tagName)) {
                $t = $child->tagName;

                if (isset($output['value'][$t])) {
                    $output['value'][] = [$t => $output['value'][$t]];
                    $output['value'][] = [$t => $v];
                    unset($output['value'][$t]);
                } else {
                    $output['value'][$t] = $v;
                }
            } elseif (($v && is_string($v)) || $v === '0') {
                $output['value'] = htmlspecialchars((string)$v, ENT_XML1 | ENT_COMPAT, 'UTF-8');
            }
        }

        if ($node->attributes->length) {
            foreach ($node->attributes as $attrName => $attrNode) {
                $output['attributes'][$attrName] = (string) $attrNode->value;
            }
        }

        break;
    }

    return $output;
}

Output:

array:1 [▼
  "characters" => array:2 [▼
    "value" => array:2 [▼
      0 => array:1 [▼
        "character" => array:2 [▼
          "value" => array:4 [▼
            "name2" => array:2 [▼
              "value" => "Sno"
              "attributes" => []
            ]
            "friend-of" => array:2 [▼
              "value" => "Pep"
              "attributes" => []
            ]
            "since" => array:2 [▼
              "value" => "1950-10-04"
              "attributes" => []
            ]
            "qualification" => array:2 [▼
              "value" => "extroverted beagle"
              "attributes" => []
            ]
          ]
          "attributes" => []
        ]
      ]
      1 => array:1 [▼
        "character" => array:2 [▼
          "value" => array:4 [▼
            "name2" => array:2 [▼
              "value" => "Pep"
              "attributes" => []
            ]
            "friend-of" => array:2 [▼
              "value" => "Sno"
              "attributes" => []
            ]
            "since" => array:2 [▼
              "value" => "1966-08-22"
              "attributes" => []
            ]
            "qualification" => array:2 [▼
              "value" => "bold, brash and tomboyish"
              "attributes" => []
            ]
          ]
          "attributes" => []
        ]
      ]
    ]
    "attributes" => []
  ]
]
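
It may be worth checking how the duplicate-handling branch behaves once there are three or more siblings with the same tag. The sketch below isolates just that branch on plain arrays; addChild is a hypothetical helper written for this illustration, and the names stand in for the converted child data.

```php
<?php
// Hypothetical helper: mirrors only the if/else that decides where a child
// element's data is stored inside $output['value'].
function addChild(array $value, string $tag, $data): array
{
    if (isset($value[$tag])) {
        // Second occurrence: move the keyed entry into list items.
        $value[] = [$tag => $value[$tag]];
        $value[] = [$tag => $data];
        unset($value[$tag]);
    } else {
        // First occurrence, or any occurrence after the key was unset.
        $value[$tag] = $data;
    }

    return $value;
}

$value = [];
foreach (['Sno', 'Pep', 'Pep2'] as $name) {
    $value = addChild($value, 'character', $name);
}

// The third sibling is stored under the string key again, not as item 2.
var_dump(array_keys($value));
```

With exactly two siblings the result is the list shown above; from the third sibling on, list items and a keyed entry end up mixed in the same array.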

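The problem is knowing when to add a new level and when to just keep adding data. I've changed that logic and added comments in the code to help explain what is happening and when...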

private function domNodeToArray($node, $xmlDocument)
{
    $output = [];
    switch ($node->nodeType)
    {
        case XML_CDATA_SECTION_NODE:
        case XML_TEXT_NODE:
            $output = trim($node->textContent);
            break;
        case XML_ELEMENT_NODE:
            for ($i = 0, $m = $node->childNodes->length; $i < $m; $i++)
            {
                $child = $node->childNodes->item($i);
                $v = $this->domNodeToArray($child, $xmlDocument);

                if (isset($child->tagName))
                {
                    $t = $child->tagName;

//                     if (!isset($output['value'][$t]))
//                     {
//                         $output['value'][$t] = [];
//                     }
                    // If the element already exists
                    if (isset($output['value'][$t]))
                    {
                        // Copy the existing value to new level
                        $output['value'][] = [$t => $output['value'][$t]];
                        // Add in new value
                        $output['value'][] = [$t => $v];
                        // Remove old element
                        unset($output['value'][$t]);
                    }
                    // If this has already been added at a new level
                    elseif (isset($output['value'][0][$t]))
                    {
                        // Add it to existing extra level
                        $output['value'][] = [$t => $v];
                    }
                    else
                    {
                        $output['value'][$t] = $v;
                    }
                }
                else if ($v || $v === '0')
                {
                    $output['value'] = htmlspecialchars((string)$v, ENT_XML1 | ENT_COMPAT, 'UTF-8');
                }
            }

            if (isset($output['value']) && $node->attributes->length && !is_array($output['value']))
            {
                $output = ['value' => $output['value']];
            }

            if (!$node->attributes->length && isset($output['value']) && !is_array($output['value']))
            {
                $output = ['attributes' => [], 'value' => $output['value']];
            }

            if ($node->attributes->length)
            {
                $a = [];
                foreach ($node->attributes as $attrName => $attrNode)
                {
                    $a[$attrName] = (string)$attrNode->value;
                }
                $output['attributes'] = $a;
            }
            else
            {
                $output['attributes'] = [];
            }
            break;
    }

    return $output;
}

I tried it with the following XML:

<?xml version="1.0" encoding="UTF-8"?>
<characters>
   <character>
      <name2>Sno</name2>
      <friend-of>Pep</friend-of>
      <since>1950-10-04</since>
      <qualification>extroverted beagle</qualification>
   </character>
   <character>
      <name2>Pep</name2>
      <friend-of>Sno</friend-of>
      <since>1966-08-22</since>
      <qualification>bold, brash and tomboyish</qualification>
   </character>
   <character>
      <name2>Pep2</name2>
      <friend-of>Sno</friend-of>
      <since>1966-08-23</since>
      <qualification>boldish, brashish and tomboyish</qualification>
   </character>
</characters>

This checks that the <character> elements are all added at the correct level.
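
If the repeated elements are not guaranteed to come first among their siblings, one option is to count the child tag names before building the array, so the decision to wrap does not depend on the order in which children are met. The following is only a rough sketch under that assumption: nodeToArray is a hypothetical standalone function, not the method from either answer, and it deliberately ignores text mixed in between child elements.

```php
<?php
// Hypothetical standalone helper: repeated sibling tags become list items of
// the form [tag => data]; tags that occur once stay keyed by tag name.
function nodeToArray(DOMElement $node): array
{
    $output = ['value' => [], 'attributes' => []];

    foreach ($node->attributes as $name => $attr) {
        $output['attributes'][$name] = (string) $attr->value;
    }

    // First pass: count how often each child tag name occurs.
    $counts = [];
    foreach ($node->childNodes as $child) {
        if ($child instanceof DOMElement) {
            $counts[$child->tagName] = ($counts[$child->tagName] ?? 0) + 1;
        }
    }

    // Leaf element: keep the trimmed, escaped text as the value.
    if ($counts === []) {
        $output['value'] = htmlspecialchars(trim($node->textContent), ENT_XML1 | ENT_COMPAT, 'UTF-8');
        return $output;
    }

    // Second pass: wrap only the tags that occur more than once.
    foreach ($node->childNodes as $child) {
        if (!$child instanceof DOMElement) {
            continue; // assumption: text between child elements is ignored
        }
        $tag = $child->tagName;
        if ($counts[$tag] > 1) {
            $output['value'][] = [$tag => nodeToArray($child)];
        } else {
            $output['value'][$tag] = nodeToArray($child);
        }
    }

    return $output;
}

$xml = '<characters>'
     . '<character><name2>Sno</name2></character>'
     . '<character><name2>Pep</name2></character>'
     . '<character><name2>Pep2</name2></character>'
     . '</characters>';

$doc = new DOMDocument();
$doc->loadXML($xml);
$result = ['characters' => nodeToArray($doc->documentElement)];

// All three <character> elements should sit at the same level.
print_r(array_keys($result['characters']['value']));
```

The trade-off is a second loop over the children of each element, in exchange for not having to inspect what has already been written to the output array.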