Using rvest to properly check html_text
I am trying to use this function to get the gender for a list of names, based on the following elements: <a class="boy" href="/boys-names">Male</a> for boys
or <a class="girl" href="/girls-names">Female</a> for girls.
library(rvest)
gender_from_name <- function(name){
  name_url <- paste("https://nameberry.com/babyname/", name, sep = "")
  is_it_a_boy <- read_html(name_url) %>%
    html_nodes(".girl") %>%
    html_text(trim=TRUE) %>%
    length() == 0
  return (if(is_it_a_boy){"Male"}else{"Female"})
}
However, it does not work for gender_from_name("Aaron"). I tried length()<2 instead, but the result is still off...
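Here is a way to return the gender directly. You are looking for the text of the "a" node nested under a "span" node, which is itself nested under a "span" node with class=meta-section.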
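library(rvest)
gender_from_name <- function(name){
  # Build the URL of the name's page
  name_url <- paste0("https://nameberry.com/babyname/", name)
  # Download and parse the page
  page <- read_html(name_url)
  # Take the link text inside span.meta-section only, not every .girl/.boy link on the page
  gender <- page %>%
    html_nodes("span.meta-section span a") %>%
    html_text(trim=TRUE)
  return (gender)
}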
gender_from_name("Aaron")
gender_from_name("Mary")
gender_from_name("William")
gender_from_name("Dianne")
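To illustrate why scoping the selector may matter, here is a minimal offline sketch. The HTML string is invented for illustration and is only an assumption about what the real page might look like (a stray "girl" link outside the meta section, as a related-names list could produce). The helper name gender_from_page is also hypothetical. It is not a claim about the actual nameberry markup.

```r
library(rvest)

# Hypothetical markup, invented for this example; the real page may differ.
# The second link carries class "girl" but is unrelated to Aaron's gender.
aaron_html <- paste0(
  '<html><body>',
  '<span class="meta-section"><span>',
  '<a class="boy" href="/boys-names">Male</a>',
  '</span></span>',
  '<ul class="related"><li>',
  '<a class="girl" href="/babyname/erin">Erin</a>',
  '</li></ul>',
  '</body></html>'
)

page <- read_html(aaron_html)

# Class-only selector: the stray link matches, so length() is 1, not 0,
# and a test such as length() == 0 would misclassify this page.
length(html_nodes(page, ".girl"))

# Scoped selector: only links inside span.meta-section are considered.
# gender_from_page is a hypothetical helper that works on a parsed page.
gender_from_page <- function(page) {
  gender <- page %>%
    html_nodes("span.meta-section span a") %>%
    html_text(trim = TRUE)
  # Return NA when nothing matches, otherwise the first match only
  if (length(gender) == 0) NA_character_ else gender[[1]]
}

gender_from_page(page)
```

Returning NA_character_ for a page without a match keeps the result a single string per name, which is convenient if the function is later applied over a vector of names with vapply().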