Nokogiri captures the correct CSS selector, but the selector changes
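I am writing a program that pulls information about our printers from the web and outputs the important parts.
I have some CSS that changes depending on the position of the printer's maintenance/toner entries, but what I need to do is capture the CSS for Toner and not for Maintenance.
I have successfully captured the information using this code: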
print "Toner left: ", page.css('.hpConsumableBlockHeaderText')[1].text, "\n"
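The problem is that this only captures the 36% and not the 26%.
Example:
Note that both are in the same span, and I don't know how to capture one and not the other.
Usage example: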
[]$ ruby clean_printer laser15
Toner left:
Maintenance Kit 31%
110V-Q5421A, 220V-Q5422A
[]$
Source (some information omitted for security reasons):
#!/usr/local/bin/ruby
require 'colored'
require 'nokogiri'
require 'restclient'

class CleanPrinter
  attr_accessor :printer, :amount

  def initialize(printer, amount)
    @printer = printer
    @amount = amount.to_i
  end

  # Print usage when an argument is missing, otherwise send the print jobs.
  def check_argv
    if ARGV[0] == nil || ARGV[1] == nil
      # Attach the color methods to the heredoc opener so they apply to the string.
      puts <<-EOF.yellow.bold
USAGE: clean_printer <printer-name> <number-of-copies>
      EOF
    else
      send_print_jobs
    end
  end

  def create_jobs
    system("lp -d #{@printer} test.txt")
  end

  def send_print_jobs
    @amount.times do
      create_jobs
    end
  end

  def parse_4100
    page = Nokogiri::HTML(RestClient.get("#{@printer}.com"))
    # Used to find the correct selector:
    #page.css('font').each_with_index { |e,i| puts "Matched at #{i}" if e.text =~ /6%/ }
    print "Toner left: ", page.css('font')[28].to_s[/\d[%]/], "\n"
    powersave = page.css('td')[9].to_s[/(?<=POWERSAVE\ )\w+(?=<)/]
    powersave == "ON" ? (puts "Powersave Mode: ON") : (puts "Powersave Mode: OFF")
  end

  def parse_4350
    page = Nokogiri::HTML(RestClient.get("#{@printer}.com/hp/device/this.LCDispatcher"))
    #page.css('.hpConsumableBlockHeaderText').each_with_index { |e,i| puts "Matched at #{i}" if e.text =~ /26%/ }
    print "Toner left: ", page.css('.hpConsumableBlockHeaderText')[1].text, "\n"
  end

  def parse_brother
  end
end

mr_clean = CleanPrinter.new(ARGV[0], ARGV[1])
mr_clean.parse_4350
Update:
I found that using this regex: [/\d{1,3}[%]/]
will capture the 31% from Maintenance:
[]$ ruby clean_printer laser15
Toner left: 31%
[]$
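A regex applied with String#[] returns only the first match in whatever text it is given, so it cannot by itself choose between Toner and Maintenance. The following is a minimal sketch of anchoring the pattern on a label instead. The sample text, the "Black Cartridge" label, and the percent_for helper are all hypothetical and not taken from the real printer page.

```ruby
# Hypothetical text modelled on the output above; the real page text may differ.
SAMPLE_TEXT = "Black Cartridge 26%\nMaintenance Kit 31%\n110V-Q5421A, 220V-Q5422A"

# Return the percentage that follows +label+ on the same line, or nil.
# percent_for is an illustrative helper, not part of the original script.
def percent_for(text, label)
  # [^\d\n]*? skips non-digit characters without crossing a line break.
  text[/#{Regexp.escape(label)}[^\d\n]*?(\d{1,3}%)/, 1]
end

# String#[] with a plain regex yields only the first percentage in the text.
first_match = SAMPLE_TEXT[/\d{1,3}%/]

# String#scan yields every percentage, in order of appearance.
all_matches = SAMPLE_TEXT.scan(/\d{1,3}%/)

puts "First match: #{first_match}"
puts "All matches: #{all_matches.join(', ')}"
puts "Toner left:  #{percent_for(SAMPLE_TEXT, 'Cartridge')}"
puts "Maintenance: #{percent_for(SAMPLE_TEXT, 'Maintenance Kit')}"
```

Anchoring on the label keeps working even if both values end up in the same element or swap positions, provided the label text itself is stable.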
Presumably page.css('.hpConsumableBlockHeaderText')
returns both elements, but your use of page.css('.hpConsumableBlockHeaderText')[1]
returns the second one, whereas you seem to want the first. Try this:
page.css('.hpConsumableBlockHeaderText')[0]