Perl XML::LibXML, findnodes can only read the root of the XML file

I am trying to parse this .kml file:

<?xml version="1.0" encoding="utf-8" ?>
<kml xmlns="http://www.opengis.net/kml/2.2">
<Document id="root_doc">
<Schema name="PostalCodeCanada" id="PostalCodeCanada">
    <SimpleField name="ZIP" type="string"></SimpleField>
    <SimpleField name="VERTICES" type="int"></SimpleField>
</Schema>
<Folder><name>PostalCodeCanada</name>
  <Placemark>
    <Style><LineStyle><color>ff0000ff</color></LineStyle><PolyStyle><fill>0</fill></PolyStyle></Style>
    <ExtendedData><SchemaData schemaUrl="#PostalCodeCanada">
        <SimpleData name="ZIP">G1Y1B1</SimpleData>
        <SimpleData name="VERTICES">5</SimpleData>
    </SchemaData></ExtendedData>
      <Polygon><altitudeMode>relativeToGround</altitudeMode><outerBoundaryIs><LinearRing><altitudeMode>relativeToGround</altitudeMode><coordinates>-73.604399,45.545611 -73.603988,45.545886 -73.602861,45.547715 -73.602861,45.547715 -73.604399,45.545611 -73.604399,45.545611</coordinates></LinearRing></outerBoundaryIs></Polygon>
  </Placemark>
  <Placemark>
    <Style><LineStyle><color>ff0000ff</color></LineStyle><PolyStyle><fill>0</fill></PolyStyle></Style>
    <ExtendedData><SchemaData schemaUrl="#PostalCodeCanada">
        <SimpleData name="ZIP">G1Y1B2</SimpleData>
        <SimpleData name="VERTICES">5</SimpleData>
    </SchemaData></ExtendedData>
      <Polygon><altitudeMode>relativeToGround</altitudeMode><outerBoundaryIs><LinearRing><altitudeMode>relativeToGround</altitudeMode><coordinates>-73.604399,45.545611 -73.603988,45.545886 -73.602861,45.547715 -73.602861,45.547715 -73.604399,45.545611 -73.604399,45.545611</coordinates></LinearRing></outerBoundaryIs></Polygon>
  </Placemark>
</Folder>
</Document></kml>

I am using Perl with XML::LibXML, but findnodes does not match any node other than "/". Here is my code:

#!/usr/bin/env perl

use XML::LibXML;
use strict;
use warnings;

my $outputFilename = "PostalCodesCollegePro.kml";

my $intro = '<?xml version="1.0" encoding="utf-8" ?>'."\n".'<kml xmlns="http://www.opengis.net/kml/2.2">'."\n".'<Document id="root_doc">'."\n".'<Schema name="PostalCodeCanada" id="PostalCodeCanada">'."\n\t".'<SimpleField name="ZIP" type="string"></SimpleField>'."\n\t".'<SimpleField name="VERTICES" type="int"></SimpleField>'."\n".'</Schema>'."\n".'<Folder><name>PostalCodeCanada</name>'."\n";
my $outro = '</Folder>'."\n".'</Document></kml>'."\n";

open (my $fh, ">".$outputFilename) or die "Impossible d'ouvrir le fichier d'écriture";
print $fh $intro;

my $xml = XML::LibXML->new();
my $data = $xml->parse_file("PostalCodeCanada.kml");
foreach my $node ( $data->findnodes('//Folder') ) {
    print ($node->toString);
#   my($zip) = $node->findnodes('./ExtendedData/SchemaData/SimpleData');
#   print ($zip->to_literal."\n");
#   if ($zip->to_literal =~ /(^G1Y)|(^G3A)|(^G2G)|(^G3L)|(^G3H)|G0A2R0|G0A1T0|G0A1L0|G0A3H0|G0A3G0|G0A2Y0|G0A2Z0|G0A4N0|G0A2J0|G0A3M0|G0A4A0|G0A1A0|G0A1Y0|G0A1S0|G0A4B0|G0A3T0|G0A3B0|G0A4H0|G0A1W0|G0A3L0|G0A4L0|G0A3A0/){
#       print $fh $node->to_literal;
#   }
}

print $fh $outro; 
close $fh or warn "Impossible de fermer le fichier après écriture";

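Thanks to everyone willing to help! PS: This is a reduced .kml file; the real one holds the geographic information for every postal code in Canada. I am trying to generate another .kml that contains only the postal codes I need, so that I can generate a map with the Google Maps API.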

Your problem is that all of your nodes are in a namespace, so you need to account for that. The easiest way is probably to use an XML::LibXML::XPathContext object.

my $xml = XML::LibXML->new();
my $data = $xml->parse_file("PostalCodeCanada.kml");

my $xpc = XML::LibXML::XPathContext->new($data);
$xpc->registerNs('k', 'http://www.opengis.net/kml/2.2');

foreach my $node ( $xpc->findnodes('//k:Folder') ) {
  ...
}
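
To see the effect of the namespace in isolation, here is a minimal sketch that runs the same query three ways against a cut-down KML string. The inline document is invented for illustration, and the prefix k is an arbitrary choice; only the namespace URI has to match the file. The local-name() variant is an alternative that is not part of the answer above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use XML::LibXML;

# Cut-down, made-up KML document used only for this illustration.
my $kml = <<'XML';
<?xml version="1.0" encoding="utf-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
<Document id="root_doc">
<Folder><name>PostalCodeCanada</name></Folder>
</Document></kml>
XML

my $doc = XML::LibXML->load_xml(string => $kml);

# Unprefixed names in XPath 1.0 only match elements in no namespace,
# so this finds nothing in a document with a default namespace.
my @plain = $doc->findnodes('//Folder');

# Registering a prefix for the namespace URI makes the elements reachable.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs(k => 'http://www.opengis.net/kml/2.2');
my @prefixed = $xpc->findnodes('//k:Folder');

# local-name() compares only the local part of the element name,
# so no XPathContext is needed.
my @by_local_name = $doc->findnodes('//*[local-name()="Folder"]');

printf "plain: %d, prefixed: %d, local-name: %d\n",
    scalar @plain, scalar @prefixed, scalar @by_local_name;
```

The unprefixed query returns no nodes, while the other two each return the single Folder element.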

Your XML data uses a default namespace, which you must specify explicitly when you access the data with XPath. Where XML::LibXML is concerned, that means you must create an XML::LibXML::XPathContext object to search the data.

Here is an example program that does what you need:

#!/usr/bin/env perl

use strict;
use warnings;

use XML::LibXML;

my $doc = XML::LibXML->load_xml(location => 'PostalCodeCanada.kml');
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs( gis => 'http://www.opengis.net/kml/2.2');

for my $folder ( $xpc->findnodes('/gis:kml/gis:Document/gis:Folder') ) {

    my ($zip) = $xpc->findnodes('gis:Placemark/gis:ExtendedData/gis:SchemaData/gis:SimpleData', $folder);
    $zip = $zip->to_literal;

    print "$zip\n";

    if ( $zip =~ /(?:G0A(?:1A0|1L0|1S0|1T0|1W0|1Y0|2J0|2R0|2Y0|2Z0|3A0|3B0|3G0|3H0|3L0|3M0|3T0|4A0|4B0|4H0|4L0|4N0)|G1Y|G1Y1B1|G2G|G3A|G3H|G3L)/){
        print $folder->to_literal;
    }
}
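
If the goal is to keep only some placemarks, one possible variation is to loop over the Placemark elements rather than the Folder. The following is a sketch under stated assumptions, not the program above: the inline document, the second postal code (H2X1Y4) and the shortened filter pattern are all made up for illustration. Instead of printing a hand-written header and footer as the question does, it removes the unwanted placemarks from the parsed tree and serializes what is left, which is a different approach.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use XML::LibXML;

# Made-up sample with one wanted and one unwanted postal code.
my $kml = <<'XML';
<?xml version="1.0" encoding="utf-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
<Document id="root_doc">
<Folder><name>PostalCodeCanada</name>
  <Placemark><ExtendedData><SchemaData schemaUrl="#PostalCodeCanada">
    <SimpleData name="ZIP">G1Y1B1</SimpleData>
    <SimpleData name="VERTICES">5</SimpleData>
  </SchemaData></ExtendedData></Placemark>
  <Placemark><ExtendedData><SchemaData schemaUrl="#PostalCodeCanada">
    <SimpleData name="ZIP">H2X1Y4</SimpleData>
    <SimpleData name="VERTICES">5</SimpleData>
  </SchemaData></ExtendedData></Placemark>
</Folder>
</Document></kml>
XML

my $doc = XML::LibXML->load_xml(string => $kml);
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs(gis => 'http://www.opengis.net/kml/2.2');

# Hypothetical, shortened filter: keep codes that start with G1Y or G3A.
my $wanted = qr/^(?:G1Y|G3A)/;

my @kept;
for my $placemark ($xpc->findnodes('//gis:Placemark')) {
    # Select the ZIP field by its name attribute, relative to this placemark,
    # and take its string value.
    my $zip = $xpc->findvalue(
        'gis:ExtendedData/gis:SchemaData/gis:SimpleData[@name="ZIP"]',
        $placemark);

    if ($zip =~ $wanted) {
        push @kept, $zip;
    }
    else {
        # Detach the unwanted placemark from the document tree.
        $placemark->unbindNode;
    }
}

print "kept: @kept\n";

# Serialize the filtered document; the kml, Document, Schema and Folder
# wrappers are preserved because only Placemark nodes were removed.
print $doc->toString;
```

For a real file, `$doc->toFile($outputFilename)` could replace the final print; whether holding the complete tree in memory is acceptable depends on the size of the full data set.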

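You already have an answer for XML::LibXML. I would point out, though, that if you use XML::Twig you can ignore namespaces. (That is because it does not really support them; if you only have one, that does not matter!)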

#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig -> new -> parsefile ( 'input.kml');
foreach my $node ( $twig -> findnodes ( '//Folder') ) {
   print $node -> text,"\n";
}