Nokogiri capturing the correct CSS selector, but the selector changes

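I'm writing a program that pulls our printers' information from the web and outputs the important parts.

The CSS I need to match changes depending on where the printer's maintenance/toner entries are positioned, and what I need to do is capture the element for Toner and not the one for Maintenance.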

I have successfully captured the information using this code: print "Toner left: ", page.css('.hpConsumableBlockHeaderText')[1].text, "\n"

The problem is that this only captures the 36%, not the 26%.

Example:

Note that both are in the same span, and I don't know how to capture one and not the other.

Example usage:

[]$ ruby clean_printer laser15
Toner left: 
Maintenance Kit����31%
110V-Q5421A, 220V-Q5422A

[]$ 

Source (some information omitted for security reasons):

#!/usr/local/bin/ruby

require 'colored'
require 'nokogiri'
require 'restclient'

class CleanPrinter

  attr_accessor :printer, :amount

  def initialize(printer, amount)
    @printer = printer
    @amount = amount.to_i
  end

  def check_argv
    if ARGV[0] == nil || ARGV[1] == nil
      puts <<-EOF

      USAGE: clean_printer <printer-name> <number-of-copies>
      EOF
      .yellow.bold
    else
      send_print_jobs
    end
  end

  def create_jobs
    system("lp -d #{@printer} test.txt")
  end

  def send_print_jobs
    @amount.times do
      create_jobs
    end
  end

  def parse_4100
    page = Nokogiri::HTML(RestClient.get("#{@printer}.com"))
    #page.css('font').each_with_index { |e,i| puts "Matched at #{i}" if e.text =~ /6%/ } <= Used to find the correct selector
    print "Toner left: ", page.css('font')[28].to_s[/\d[%]/], "\n"
    powersave = page.css('td')[9].to_s[/(?<=POWERSAVE\ )\w+(?=<)/]
    powersave == "ON" ? (puts "Powersave Mode: ON") : (puts "Powersave Mode: OFF")
  end

  def parse_4350
    page = Nokogiri::HTML(RestClient.get("#{@printer}.com/hp/device/this.LCDispatcher"))
    #page.css('hpConsumableBlockHeaderText').each_with_index { |e,i| puts "Matched at #{i}" if e.text =~ /26%/ }
    print "Toner left: ",  page.css('.hpConsumableBlockHeaderText')[1].text, "\n"
  end

  def parse_brother
  end
end

mr_clean = CleanPrinter.new(ARGV[0], ARGV[1])
mr_clean.parse_4350

Update:

I found that using this regex: [/\d{1,3}[%]/] will capture the 31% from Maintenance:

[]$ ruby clean_printer laser15
Toner left: 31%
[]$ 
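
The effect of the quantifier can be checked in isolation. This is a minimal sketch, assuming the element text looks roughly like the output shown earlier; the sample string is reconstructed from that output and the real text may differ.

```ruby
# Sample text modelled on the output above (an assumption, not the real page).
text = "Maintenance Kit 31%\n110V-Q5421A, 220V-Q5422A"

# \d[%] matches only the single digit directly before the percent sign.
single_digit = text[/\d[%]/]

# \d{1,3}% matches one to three digits, so the whole percentage is captured.
full_percent = text[/\d{1,3}%/]

puts single_digit
puts full_percent
```

Note that the regex only changes how much of the number is extracted; it does not change which element the text comes from.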

Presumably page.css('.hpConsumableBlockHeaderText') returns both elements, but page.css('.hpConsumableBlockHeaderText')[1], which you are using, returns the second one, and it looks like you want the first. Try this:

page.css('.hpConsumableBlockHeaderText')[0]