Accessing Nokogiri element children
After parsing an HTML table, I was able to get the first row of the table as a Nokogiri element.
2.2.1 :041 > pp content[1]; nil
#(Element:0x3feee917d1e0 {
name = "tr",
children = [
#(Element:0x3feee917cfd8 {
name = "td",
attributes = [
#(Attr:0x3feee917cf74 { name = "valign", value = "top" })],
children = [
#(Element:0x3feee917ca60 {
name = "a",
attributes = [
#(Attr:0x3feee917c9fc {
name = "href",
value = "/cgi-bin/own-disp?action=getowner&CIK=0001513362"
})],
children = [ #(Text "Maestri Luca")]
})]
}),
#(Text "\n"),
#(Element:0x3feee917c150 {
name = "td",
children = [
#(Element:0x3feee917d794 {
name = "a",
attributes = [
#(Attr:0x3feee9179fb8 {
name = "href",
value = "/cgi-bin/browse-edgar?action=getcompany&CIK=0001513362"
})],
children = [ #(Text "0001513362")]
})]
}),
#(Text "\n"),
#(Element:0x3feee91796a8 {
name = "td",
children = [ #(Text "2016-09-04")]
}),
#(Text "\n"),
#(Element:0x3feee9179194 {
name = "td",
children = [ #(Text "officer: Senior Vice President, CFO")]
}),
#(Text "\n")]
})
=> nil
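These are the contents of the row:
Maestri Luca 0001513362 2016-09-04 officer: Senior Vice President, CFO
I need to access the name, number, date, and title from the Nokogiri element.
One way to do it is as follows: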
2.2.1 :042 > pp content[1].text; nil
"Maestri Luca\n0001513362\n2016-09-04\nofficer: Senior Vice President, CFO\n"
However, I am looking for a way to access the elements individually, rather than as one long string with newlines. How can I do that?
name, number, date, title = *content[1].css('td').map(&:text)
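If content[1] is a tr, content[1].css('td') finds all the td elements beneath it, and .map(&:text) calls td.text on each of those td elements and collects the results into an array, which we then splat with * so that we can do multiple assignment.
(Note: next time, please include the original HTML snippet rather than the Nokogiri node inspection output.)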