How to print a result from multiple root elements in XML using Perl

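I am starting out in Perl and doing all the bad things you can do with the language. The program uses XML::Simple and regular expressions, which the internet says not to do unless you hate yourself.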

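OK, the plan is to read an XML file from a website and list the packages that need to be updated for CentOS 6.6. For those who are not familiar with CentOS or Steve Meier's errata XML, it is arranged less than ideally, with each entry named after its CEBA number, as shown below...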

Example after parsing

<opt>
    <CEBA-2005--169 description="Not available" from="centos-announce@centos.org" issue_date="2005-04-07 01:27:35" notes="Not available" product="CentOS Linux" references="http://rhn.redhat.com/errata/RHBA-2005-169.html http://lists.centos.org/pipermail/centos-announce/2005-April/011555.html" release="1" solution="Not available" synopsis="CentOS and up2date - bugfix update" topic="Not available" type="Bug Fix Advisory">
        <os_arch>i386</os_arch>
        <os_arch>x86_64</os_arch>
        <os_release>4</os_release>
        <packages>up2date-4.4.5.6-2.centos4.i386.rpm</packages>
        <packages>up2date-4.4.5.6-2.centos4.src.rpm</packages>
    </CEBA-2005--169>
    <CEBA-2005--842 description="Not available" from="centos-announce@centos.org" issue_date="2005-11-18 17:52:49" multirelease="1" notes="Not available" product="CentOS Linux" references="https://rhn.redhat.com/errata/RHBA-2005-842.html http://lists.centos.org/pipermail/centos-announce/2005-November/012437.html http://lists.centos.org/pipermail/centos-announce/2005-November/012438.html" release="2" solution="Not available" synopsis="Important CentOS shadow-utils - bugfix update" topic="Not available" type="Bug Fix Advisory">
        <os_arch>i386</os_arch>
        <os_arch>x86_64</os_arch>
        <os_release>4</os_release> 
        <packages>shadow-utils-4.0.3-58.RHEL4.i386.rpm</packages>
    </CEBA-2005--842>

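As you can see, the root elements of the XML file change from entry to entry, so I had to use a regular expression to "read" through the file. But when I run my program, it does not print any result. The problem may be in the regular expression or in the way the search through the elements is written. I am not 100% sure where the problem is, and any help would be appreciated.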

The program

# Script to parse XML file to show updates.

use strict;
use XML::Simple;
use Data::Dumper;
use LWP::Simple;

my $parser = new XML::Simple;

my $url = 'http://cefs.steve-meier.de/errata.latest.xml';
my $content = get $url or die "Unable to get $url \n";
my $list = $parser->XMLin ($content);
my $CEBA = '(CEBA-([\d]+)--([\d]+))';

foreach my $CEBA (@{$list->{/(CEBA-([\d]+)--([\d]+))/}}) {
     if )$CEBS->{os_release eq '6') {
           print $CEBA->{packages} /. "\n";
     }
}

What you are doing in the foreach is not valid Perl, and you are missing some punctuation in the if. Something like this should work:

#!/usr/bin/perl

use warnings;
use 5.010;

use XML::Simple;
use Data::Dumper;
use LWP::Simple;

# "indirect object" notation (new XML::Simple) is frowned upon
my $parser = XML::Simple->new;

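# I used this for testing so I wouldn't have to download
# the file for every run. "local $/" enables slurp mode only
# inside this block instead of changing it globally.
my $content = do { open my $fh, '<', 'errata.latest.xml' or die "Cannot open errata.latest.xml: $!"; local $/; <$fh> };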
# my $url = 'http://cefs.steve-meier.de/errata.latest.xml';
# my $content = get $url or die "Unable to get $url \n";

my $list = $parser->XMLin($content);
# print Dumper($list);

for my $CEBA (keys %$list) {
    # if the key doesn't match what you want
    # and os_release != 6, then skip to the
    # next entry.
    next unless $CEBA =~ /\ACEBA-\d+--\d+\z/
            and $list->{$CEBA}{os_release} == 6;

    say for @{ $list->{$CEBA}{packages} };
    ## ^-- essentially the same as --v
    # for my $pkg (@{ $list->{$CEBA}{packages} }) {
    #     print "$pkg\n";
    # }
}
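
To see why the original loop printed nothing: a hash subscript takes exactly one key, so a pattern written inside `$list->{ ... }` is not applied to the keys. As far as I can tell, it is matched against `$_` and its true/false result is used as the key. The pattern has to be applied to the list of keys instead. Here is a minimal sketch of that step on a small hand-written hash, so it runs without downloading anything. The advisory IDs, the `meta` entry, and the package names are made up for illustration and are not taken from the real errata file.

```perl
use strict;
use warnings;

# Hand-written stand-in for what XMLin() returns; every entry here is
# illustrative and not copied from the real errata file.
my $list = {
    'CEBA-2015--0001' => { os_release => 6, packages => ['foo-1.0-1.el6.x86_64.rpm'] },
    'CEBA-2015--0002' => { os_release => 7, packages => ['bar-2.0-1.el7.x86_64.rpm'] },
    'meta'            => { author => 'someone' },
};

# Apply the pattern to the keys, not inside the hash subscript.
my @ceba_ids = sort grep { /\ACEBA-\d+--\d+\z/ } keys %$list;

# Keep only the advisories for release 6.
my @el6_ids = grep { $list->{$_}{os_release} == 6 } @ceba_ids;

# Print every package of the remaining advisories, one per line.
print "$_\n" for map { @{ $list->{$_}{packages} } } @el6_ids;
```

Splitting the key filter and the release filter into two grep calls is only a readability choice; the single `next unless ... and ...` in the loop above does the same job.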

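That said, even the XML::Simple documentation says not to use it. You may run into problems when a CEBA entry contains only one packages tag: XML::Simple then stores a plain string instead of an array reference, so dereferencing it with @{ ... } will not yield the package name.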
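
One way around the single-package case is to normalise the value before looping over it. The sketch below uses a hypothetical helper, `as_list`, which is not part of XML::Simple; the two sample entries reuse package names from the XML excerpt in the question. Alternatively, XML::Simple's documented `ForceArray` option (for example `ForceArray => ['packages']` passed to `XMLin`) should make the parser produce an array reference for that element even when it occurs only once.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::Simple): return a field's values
# as a list, whether the parser stored a single string or an array ref.
sub as_list {
    my ($value) = @_;
    return ()      unless defined $value;
    return @$value if ref $value eq 'ARRAY';
    return ($value);
}

# Entries shaped like XML::Simple output without ForceArray:
# one <packages> tag becomes a string, several become an array ref.
my $one  = { packages => 'shadow-utils-4.0.3-58.RHEL4.i386.rpm' };
my $many = { packages => [ 'up2date-4.4.5.6-2.centos4.i386.rpm',
                           'up2date-4.4.5.6-2.centos4.src.rpm' ] };

# Both shapes can now be printed the same way.
print "$_\n" for as_list( $one->{packages} ), as_list( $many->{packages} );
```

In the loop above, this would mean writing `say for as_list( $list->{$CEBA}{packages} );` in place of the direct array dereference.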