Perl XML::LibXML, findnodes can only read the root of the XML file
I am trying to parse this .kml file:
<?xml version="1.0" encoding="utf-8" ?>
<kml xmlns="http://www.opengis.net/kml/2.2">
<Document id="root_doc">
<Schema name="PostalCodeCanada" id="PostalCodeCanada">
<SimpleField name="ZIP" type="string"></SimpleField>
<SimpleField name="VERTICES" type="int"></SimpleField>
</Schema>
<Folder><name>PostalCodeCanada</name>
<Placemark>
<Style><LineStyle><color>ff0000ff</color></LineStyle><PolyStyle><fill>0</fill></PolyStyle></Style>
<ExtendedData><SchemaData schemaUrl="#PostalCodeCanada">
<SimpleData name="ZIP">G1Y1B1</SimpleData>
<SimpleData name="VERTICES">5</SimpleData>
</SchemaData></ExtendedData>
<Polygon><altitudeMode>relativeToGround</altitudeMode><outerBoundaryIs><LinearRing><altitudeMode>relativeToGround</altitudeMode><coordinates>-73.604399,45.545611 -73.603988,45.545886 -73.602861,45.547715 -73.602861,45.547715 -73.604399,45.545611 -73.604399,45.545611</coordinates></LinearRing></outerBoundaryIs></Polygon>
</Placemark>
<Placemark>
<Style><LineStyle><color>ff0000ff</color></LineStyle><PolyStyle><fill>0</fill></PolyStyle></Style>
<ExtendedData><SchemaData schemaUrl="#PostalCodeCanada">
<SimpleData name="ZIP">G1Y1B2</SimpleData>
<SimpleData name="VERTICES">5</SimpleData>
</SchemaData></ExtendedData>
<Polygon><altitudeMode>relativeToGround</altitudeMode><outerBoundaryIs><LinearRing><altitudeMode>relativeToGround</altitudeMode><coordinates>-73.604399,45.545611 -73.603988,45.545886 -73.602861,45.547715 -73.602861,45.547715 -73.604399,45.545611 -73.604399,45.545611</coordinates></LinearRing></outerBoundaryIs></Polygon>
</Placemark>
</Folder>
</Document></kml>
I am using Perl with XML::LibXML, but findnodes cannot read any node other than "/". Here is my code:
#!/usr/bin/env perl
use XML::LibXML;
use strict;
use warnings;
my $outputFilename = "PostalCodesCollegePro.kml";
my $intro = '<?xml version="1.0" encoding="utf-8" ?>'."\n".'<kml xmlns="http://www.opengis.net/kml/2.2">'."\n".'<Document id="root_doc">'."\n".'<Schema name="PostalCodeCanada" id="PostalCodeCanada">'."\n\t".'<SimpleField name="ZIP" type="string"></SimpleField>'."\n\t".'<SimpleField name="VERTICES" type="int"></SimpleField>'."\n".'</Schema>'."\n".'<Folder><name>PostalCodeCanada</name>'."\n";
my $outro = '</Folder>'."\n".'</Document></kml>'."\n";
open (my $fh, ">".$outputFilename) or die "Impossible d'ouvrir le fichier d'écriture";
print $fh $intro;
my $xml = XML::LibXML->new();
my $data = $xml->parse_file("PostalCodeCanada.kml");
foreach my $node ( $data->findnodes('//Folder') ) {
    print ($node->toString);
    # my($zip) = $node->findnodes('./ExtendedData/SchemaData/SimpleData');
    # print ($zip->to_literal."\n");
    # if ($zip->to_literal =~ /(^G1Y)|(^G3A)|(^G2G)|(^G3L)|(^G3H)|G0A2R0|G0A1T0|G0A1L0|G0A3H0|G0A3G0|G0A2Y0|G0A2Z0|G0A4N0|G0A2J0|G0A3M0|G0A4A0|G0A1A0|G0A1Y0|G0A1S0|G0A4B0|G0A3T0|G0A3B0|G0A4H0|G0A1W0|G0A3L0|G0A4L0|G0A3A0/){
    #     print $fh $node->to_literal;
    # }
}
print $fh $outro;
close $fh or warn "Impossible de fermer le fichier après écriture";
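Thanks to everyone willing to help!
PS: This is a reduced .kml file; the real file contains the geographic information for every Canadian postal code. I am trying to generate another .kml that contains only the postal codes I need, so that I can build a map with the Google Maps API.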
Your problem is that your nodes are all in a namespace, so you need to handle that. The easiest way is probably to use an XML::LibXML::XPathContext object.
my $xml = XML::LibXML->new();
my $data = $xml->parse_file("PostalCodeCanada.kml");
my $xpc = XML::LibXML::XPathContext->new($data);
$xpc->registerNs('k', 'http://www.opengis.net/kml/2.2');
foreach my $node ( $xpc->findnodes('//k:Folder') ) {
    ...
}
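To make the cause visible, here is a minimal sketch that runs both queries against an inline stand-in for the KML file. The inline document and the prefix k are only illustrative; any prefix works as long as it is bound to the same namespace URI as the document's default namespace.

```perl
use strict;
use warnings;
use XML::LibXML;

# A tiny stand-in for the KML file, using the same default namespace.
my $kml = <<'XML';
<kml xmlns="http://www.opengis.net/kml/2.2">
<Document><Folder><name>PostalCodeCanada</name></Folder></Document>
</kml>
XML

my $doc = XML::LibXML->load_xml(string => $kml);

# In XPath 1.0 an unprefixed name matches only elements in NO namespace,
# so this finds nothing even though <Folder> is clearly there.
my @plain = $doc->findnodes('//Folder');

# Bind a prefix of our own choosing to the document's namespace URI.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs(k => 'http://www.opengis.net/kml/2.2');
my @prefixed = $xpc->findnodes('//k:Folder');

printf "unprefixed: %d, prefixed: %d\n", scalar @plain, scalar @prefixed;
```

The prefix used in the XPath expression does not have to appear anywhere in the file; only the namespace URI has to match.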
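Your XML data uses a default namespace, which you must specify explicitly when you access it with XPath. Where XML::LibXML is concerned, that means you must create an XML::LibXML::XPathContext object to search the data.
Here is a sample program that does what you need: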
#!/usr/bin/env perl
use strict;
use warnings;
use XML::LibXML;
my $doc = XML::LibXML->load_xml(location => 'PostalCodeCanada.kml');
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs( gis => 'http://www.opengis.net/kml/2.2');
for my $folder ( $xpc->findnodes('/gis:kml/gis:Document/gis:Folder') ) {
    # Take the first SimpleData value found under this folder
    my ($zip) = $xpc->findnodes('gis:Placemark/gis:ExtendedData/gis:SchemaData/gis:SimpleData', $folder);
    $zip = $zip->to_literal;
    print "$zip\n";
    # Prefixes are anchored at the start, as in the question; full codes may match anywhere
    if ( $zip =~ /^(?:G1Y|G2G|G3A|G3H|G3L)|G0A(?:1A0|1L0|1S0|1T0|1W0|1Y0|2J0|2R0|2Y0|2Z0|3A0|3B0|3G0|3H0|3L0|3M0|3T0|4A0|4B0|4H0|4L0|4N0)/ ) {
        print $folder->to_literal;
    }
}
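Since the stated goal is a new .kml holding only the wanted postal codes, it may be more useful to test each Placemark rather than each Folder. The following is a sketch under that assumption, not the answerer's code: matching_placemarks is a hypothetical helper, the inline document is a cut-down stand-in, and H2X1A1 is an invented code used only to show a non-match.

```perl
use strict;
use warnings;
use XML::LibXML;

my $KML_NS = 'http://www.opengis.net/kml/2.2';

# Hypothetical helper: return the Placemark nodes whose ZIP field
# matches $wanted (a compiled regex).
sub matching_placemarks {
    my ($doc, $wanted) = @_;
    my $xpc = XML::LibXML::XPathContext->new($doc);
    $xpc->registerNs(k => $KML_NS);

    my @keep;
    for my $pm ($xpc->findnodes('//k:Placemark')) {
        # Evaluate relative to this Placemark; pick the ZIP field by its
        # name attribute instead of relying on element order.
        my $zip = $xpc->findvalue('.//k:SimpleData[@name="ZIP"]', $pm);
        push @keep, $pm if $zip =~ $wanted;
    }
    return @keep;
}

# Cut-down stand-in for the real file (H2X1A1 is an invented non-match).
my $sample = <<'XML';
<kml xmlns="http://www.opengis.net/kml/2.2"><Document><Folder>
<Placemark><ExtendedData><SchemaData>
<SimpleData name="ZIP">G1Y1B1</SimpleData>
<SimpleData name="VERTICES">5</SimpleData>
</SchemaData></ExtendedData></Placemark>
<Placemark><ExtendedData><SchemaData>
<SimpleData name="ZIP">H2X1A1</SimpleData>
<SimpleData name="VERTICES">5</SimpleData>
</SchemaData></ExtendedData></Placemark>
</Folder></Document></kml>
XML

my $doc  = XML::LibXML->load_xml(string => $sample);
my @hits = matching_placemarks($doc, qr/^(?:G1Y|G2G)/);

# Serialise each kept Placemark; in the original script these strings
# would be printed to $fh between $intro and $outro.
print $_->toString, "\n" for @hits;
```

Selecting on @name="ZIP" keeps the filter correct even if the VERTICES field happens to come first in some Placemark.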
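You already have an answer that uses XML::LibXML. I will point out, though, that if you use XML::Twig you can ignore namespaces. (That is because it doesn't really support them; if you only have one, that doesn't matter!)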
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;
my $twig = XML::Twig -> new -> parsefile ( 'input.kml');
foreach my $node ( $twig -> findnodes ( '//Folder') ) {
    print $node -> text,"\n";
}