Replacing values in a vector: number of items to replace is not a multiple of replacement length
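I'm running into a strange problem when replacing values in a data frame. I want to convert strings to a date format, and I have to do it in two ways because the data comes in two formats.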
library(rvest)
library(stringi)
urlOnetWybory <- "http://wiadomosci.onet.pl/wybory-prezydenckie/xcnpc"
htmlOnetWybory <- read_html(urlOnetWybory)
nodes <- html_nodes(htmlOnetWybory, ".datePublished , .itemTitle")
text <- html_text(nodes)
dataRaw <- text[seq(1, length(text), 2)]
#"dzisiaj" = "today", "wczoraj"="yesterday"
data <- sapply(dataRaw, function(x){
#converting strings of the first type to dates; keep the result of the
#first replacement so that it is not discarded
x <- stri_replace_all_fixed(x, "dzisiaj", as.character(Sys.Date()))
stri_replace_all_fixed(x, "wczoraj", as.character(Sys.Date() - 1))
})
#indexes in dataRaw where there's a word "dzisiaj" or "wczoraj"
indeksDzis <- stri_detect_regex(dataRaw, "dzisiaj [0-9]{2}:[0-9]{2}")
indeksWczo <- stri_detect_regex(dataRaw, "wczoraj [0-9]{2}:[0-9]{2}")
#indexes for cells where date is in the second format.
indDoZmiany <- !(indeksDzis | indeksWczo)
#I get a warning here. Why? The lengths are the same.
data[indDoZmiany] <- strptime(data[indDoZmiany], "%d %b, %H:%M")
Does anyone know how to fix this? And why do I end up with a list?
POSIXlt has some pitfalls because it is a list. Try this minimal example:
x <- strptime(c("12 mar, 15:40", "11 mar, 12:58"), "%d %b, %H:%M")
unclass(x)
Replace the last line with:
data[indDoZmiany] <- as.POSIXct( strptime(data[indDoZmiany], "%d %b, %H:%M"))
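To make the difference concrete, here is a minimal sketch that does not depend on the scraped page. The timestamps, the explicit "UTC" time zone and the names lt, ct and formatted are assumptions made for illustration only. It shows that a POSIXlt object is a list of component fields underneath, while as.POSIXct() yields one number per timestamp, which is probably why the plain assignment in the question reports a length mismatch.

```r
# Hypothetical sample timestamps; the format avoids %b so it is locale-independent.
lt <- strptime(c("2015-03-12 15:40", "2015-03-11 12:58"),
               "%Y-%m-%d %H:%M", tz = "UTC")

# Underneath, POSIXlt is a list with one entry per component (sec, min, hour, ...).
is.list(unclass(lt))
names(unclass(lt))

# POSIXct stores the same instants as a plain numeric vector, one value per timestamp.
ct <- as.POSIXct(lt)
is.numeric(unclass(ct))
length(ct)

# Assumption: if the target is a character vector, as in the question,
# formatting explicitly gives readable strings to store in it.
formatted <- format(ct, "%Y-%m-%d %H:%M")
formatted
```

If the goal is a real date-time column rather than text, building a separate POSIXct vector instead of writing back into the character vector may be the cleaner option.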