Removing zeros and adding them back in time series

I have the following data:

library(xts)
values<-c(2,2,2,4,2,3,0,0,0,0,0,1,2,3,2)
time1<-seq(from=as.POSIXct("2013-01-01 00:00"),to=as.POSIXct("2013-01-01 14:00"),by="hour")
data<-xts(values,order.by=time1)
data

  [,1]
2013-01-01 00:00:00    2
2013-01-01 01:00:00    2
2013-01-01 02:00:00    2
2013-01-01 03:00:00    4
2013-01-01 04:00:00    2
2013-01-01 05:00:00    3
2013-01-01 06:00:00    0
2013-01-01 07:00:00    0
2013-01-01 08:00:00    0
2013-01-01 09:00:00    0
2013-01-01 10:00:00    0
2013-01-01 11:00:00    1
2013-01-01 12:00:00    2
2013-01-01 13:00:00    3
2013-01-01 14:00:00    2

Now I want to remove all the zeros, which can easily be done with:

remove_zerro = apply(data, 1, function(row) all(row !=0 ))
data[remove_zerro,]

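The problem is that, after I work with the zero-free data and make some modifications, I want to insert the zeros back into my data at the same dates and times. Any ideas would be greatly appreciated.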

I am building on @zx8754's comment.

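One approach is to split the data frame. If you are worried about messing up the index or about joining the data frames back together, here is another approach.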

Create a logical (TRUE/FALSE) index.

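# TRUE for rows whose value in column "col" is not zero
idx <- df[, "col"] != 0
# modify only the non-zero rows; the zero rows stay where they are
df[idx, "col"] <- 2007 # or whatever operation.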
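Applied to the series from the question, the logical-index idea might look like the sketch below. It assumes the `xts` package is installed; the `tz = "UTC"` argument, the "add 10" operation, and the names `vals` and `modified` are illustrative choices, not part of the original answer.

```r
library(xts)

# Re-create the series from the question (UTC is assumed here for reproducibility)
values <- c(2, 2, 2, 4, 2, 3, 0, 0, 0, 0, 0, 1, 2, 3, 2)
time1 <- seq(from = as.POSIXct("2013-01-01 00:00", tz = "UTC"),
             by = "hour", length.out = length(values))
data <- xts(values, order.by = time1)

# Plain logical vector: TRUE where the observation is not zero
idx <- as.numeric(coredata(data)) != 0

# Work on the bare values and touch only the non-zero positions
vals <- as.numeric(coredata(data))
vals[idx] <- vals[idx] + 10   # example operation, replace with your own

# Rebuild the series on the original, unchanged time index
modified <- xts(vals, order.by = index(data))
modified
```

Because nothing is ever removed from the series, the zeros never have to be "put back": they simply keep their original timestamps.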

Here are two possible approaches:

# re-create your data set
library(xts)
values<-c(2,2,2,4,2,3,0,0,0,0,0,1,2,3,2)
time1<-seq(from=as.POSIXct("2013-01-01 00:00"),to=as.POSIXct("2013-01-01 14:00"),by="hour")
data<-xts(values,order.by=time1)
data

###############################################
# SOLUTION 1 : 
# make a union of the "zero" series and the "zero-free" series

# create a copy of data with no zero
isNotZero = apply(data, 1, function(row) all(row != 0 ))
zeroFreeSeries <- data[isNotZero,]
zeroSeries <- data[!isNotZero,]

# do your calculations on the "zero-free" series (e.g. add 10 to all values)
zeroFreeSeries <- zeroFreeSeries + 10

# union
unionSeries <- rbind(zeroSeries,zeroFreeSeries)

# now unionSeries contains what you desire
unionSeries

###############################################
# SOLUTION 2 : 
# keep the original series copy and after doing your operations
# on the "zero-free" series, update the original series copy
# with the new values (it doesn't work well if you remove some dates from the
# zero-free series)

# create a copy of data with no zero
isNotZero = apply(data, 1, function(row) all(row != 0 ))
zeroFreeSeries <- data[isNotZero,]

# do your operations on the "zero-free" series (e.g. add 10 to all values)
zeroFreeSeries <- zeroFreeSeries + 10

# modify the original data by setting the new values
data[time(zeroFreeSeries),] <- zeroFreeSeries

# now data contains what you desire
data

Apparently this is the solution:

no<-data[ data[,1] != 0, ] #data without zeros
yes<-data[ data[,1] == 0, ]# data with only zeros

together<-c(no, yes)# both data combined together
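A quick way to check that combining with `c()` puts the zeros back at their original times is sketched below. It relies on `c()` for `xts` objects binding rows and ordering them by the time index. The "multiply by 2" step, the `tz = "UTC"` argument, and the helper name `keep` are assumptions made for illustration only.

```r
library(xts)

# Re-create the series from the question (UTC is assumed here for reproducibility)
values <- c(2, 2, 2, 4, 2, 3, 0, 0, 0, 0, 0, 1, 2, 3, 2)
time1 <- seq(from = as.POSIXct("2013-01-01 00:00", tz = "UTC"),
             by = "hour", length.out = length(values))
data <- xts(values, order.by = time1)

# Split into the zero-free part and the zero-only part
keep <- as.numeric(coredata(data)) != 0
no  <- data[keep, ]    # data without zeros
yes <- data[!keep, ]   # data with only zeros

# Example modification of the zero-free part
no <- no * 2

# Combine again; rows come back ordered by timestamp
together <- c(no, yes)

# Same timestamps, in the same order, as the original series
all(index(together) == index(data))
```

Note that this only restores the original layout if no timestamps were dropped from, or added to, the zero-free part in between.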

It looks like you want to use sparse vectors/matrices:

install.packages("spam")
library(spam)
sx <- c(0,0,3, 3.2, 0,0,0,-3:1,0,0,2,0,0,5,0,0)
apply.spam(spam(sx), NULL, function(x){1 / x})
           [,1]
 [1,]  0.0000000
 [2,]  0.0000000
 [3,]  0.3333333
 [4,]  0.3125000
 [5,]  0.0000000
 [6,]  0.0000000
 [7,]  0.0000000
 [8,] -0.3333333
 [9,] -0.5000000
[10,] -1.0000000
[11,]  0.0000000
[12,]  1.0000000
[13,]  0.0000000
[14,]  0.0000000
[15,]  0.5000000
[16,]  0.0000000
[17,]  0.0000000
[18,]  0.2000000
[19,]  0.0000000
[20,]  0.0000000

If you perform the same operation on an ordinary matrix, with the zero values included:

> apply(matrix(sx), 1, function(x){1 / x})
 [1]        Inf        Inf  0.3333333  0.3125000        Inf        Inf
 [7]        Inf -0.3333333 -0.5000000 -1.0000000        Inf  1.0000000
[13]        Inf        Inf  0.5000000        Inf        Inf  0.2000000
[19]        Inf        Inf

So you can see that apply.spam ignores the zeros but automatically puts them back.

The drawback is that you have to reattach the time labels after processing.
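Reattaching the time labels could be done roughly as follows. This sketch does not call `spam` at all: the `ifelse()` line is only a stand-in for whatever zero-skipping computation you run on the bare values, and `tz = "UTC"` plus the names `v`, `processed`, and `result` are assumptions for illustration.

```r
library(xts)

# Re-create the series from the question (UTC is assumed here for reproducibility)
values <- c(2, 2, 2, 4, 2, 3, 0, 0, 0, 0, 0, 1, 2, 3, 2)
time1 <- seq(from = as.POSIXct("2013-01-01 00:00", tz = "UTC"),
             by = "hour", length.out = length(values))
data <- xts(values, order.by = time1)

# Strip the time index and keep only the numeric values
v <- as.numeric(coredata(data))

# Stand-in for the sparse computation: reciprocal of the non-zero
# entries, zeros left as zeros
processed <- ifelse(v == 0, 0, 1 / v)

# Reattach the original timestamps; this is valid because the length
# and order of the values did not change during processing
result <- xts(processed, order.by = index(data))
result
```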