How to count the number of months based on a date in R
I have the below mentioned dataframes:
DF_1
ID Date
123 18/03/2018 16:45
456 10/03/2018 20:15
DF_2
ID Date1 Date2
123 2018-03-18 06:37:22 1519109133704
123 2018-03-18 06:37:21 1520324827462
123 2018-03-16 04:03:01 1520690354458
456 2018-03-10 14:46:03 1517319313151
456 2018-03-10 14:46:04 1515143046429
456 2018-03-10 14:46:03 1515838021062
456 2018-03-10 14:46:15 1488092209241
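- Considering Date2, I want to count the number of months as Month for each ID in DF_2, by comparing the ID with DF_1.
- Avg: the average number of rows created per month (i.e. if there are 3 months, based on Date2, which contain 90 rows, then the average is 30).
- Day: the average number of rows per day (i.e. if there are 3 months which contain 90 rows, then the value of Day is 1).
- Last5: the number of rows created in the last 5 days (considering Date1) with respect to Sys.Date().
I have the below mentioned code for the same: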
library(tidyverse)
library(lubridate)
DF_2 <- tibble(ID = c(123L, 123L, 123L, 456L, 456L, 456L, 456L),
               Date1 = c("2018-03-18 06:37:22", "2018-03-18 06:37:21", "2018-03-16 04:03:01",
                         "2018-03-10 14:46:03", "2018-03-10 14:46:04", "2018-03-10 14:46:03",
                         "2018-03-10 14:46:15"),
               Date2 = c(1519109133704, 1520324827462, 1520690354458, 1517319313151,
                         1515143046429, 1515838021062, 1488092209241)
)
DF_2 <- DF_2 %>% mutate(Date1 = ymd_hms(Date1),
                        Date2 = as.POSIXct(Date2/1000, origin = "1970-01-01"))
DF_2_tab <- DF_2 %>% group_by(ID) %>%
  summarise(date1 = sum(date(Date1) == date(DF_1$Date1[DF_1$ID == ID])),
            Total = n(),
            Month = month(count(Date2)),
            Avg = mean # Don't know how to calculate
            Day = day(Date2),
            Last5 = sum((Sys.Date() - date(Date1)) < 5)
  )
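Your statement 1 is not very clear about what DF_1 is used for. Anyway, check the code below, which summarises DF_2 the way you want. Points 2 and 3 follow once you have the number of distinct months and the total number of records (assuming you count only 30 days per month, as you described above). Point 4 is done in the code -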
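library(data.table)
DF_2 = data.table(DF_2)
# Per ID: distinct year-months in Date2, total rows, and rows from the last 5 days.
# Note: the 5-day filter here uses Date2; the question's Last5 refers to Date1.
DF = DF_2[, list(num_mth = uniqueN(format(Date2, "%Y%m")),
                 num_rec = .N,
                 numrec_5d = length(ID[as.numeric(difftime(today(), Date2), units = "days") <= 5])),
          by = ID]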
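The arithmetic behind points 2 and 3 can be checked with the question's own example (3 months containing 90 rows). This is only a minimal sketch: the one-row table and the column names Avg and Day are made up for illustration, while num_mth and num_rec are the columns produced by the summary above.

```r
library(data.table)

# Hypothetical per-ID summary using the question's example numbers:
# 3 distinct months containing 90 rows in total.
DF <- data.table(ID = 1L, num_mth = 3L, num_rec = 90L)

# Avg: rows per month. Day: rows per day, assuming 30 days per month.
DF[, Avg := num_rec / num_mth]
DF[, Day := num_rec / (num_mth * 30)]

print(DF)
```

With these inputs Avg is 30 and Day is 1, matching the values given in the question.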
Since you explained how DF_1 is used, I have edited my code. Now it first merges the two datasets on ID and Date1, and then summarises -
library(data.table)
DF_2 <- tibble(ID = c(123L, 123L, 123L, 456L, 456L, 456L, 456L),
               Date1 = c("2018-03-18 06:37:22", "2018-03-18 06:37:21", "2018-03-16 04:03:01",
                         "2018-03-10 14:46:03", "2018-03-10 14:46:04", "2018-03-10 14:46:03",
                         "2018-03-10 14:46:15"),
               Date2 = c(1519109133704, 1520324827462, 1520690354458, 1517319313151,
                         1515143046429, 1515838021062, 1488092209241)
)
DF_2 <- DF_2 %>% mutate(Date1 = ymd_hms(Date1),
                        Date2 = as.POSIXct(Date2/1000, origin = "1970-01-01"))
DF_1 <- tibble(ID = c(123L, 456L),
               Date1 = c("18/03/2018 16:45", "10/03/2018 20:15"))
DF_1 <- DF_1 %>% mutate(Date1 = dmy_hm(Date1))
DF_2 = data.table(DF_2)
DF_1 = data.table(DF_1)
# Reduce the timestamps to calendar dates so the join compares days, not seconds
DF_2[, Date1 := date(Date1)]
DF_2[, Date2 := date(Date2)]
DF_1[, Date1 := date(Date1)]
# Keep only the DF_2 rows whose ID and Date1 match a row in DF_1
DF_2 = DF_1[DF_2, on = c("ID", "Date1"), nomatch = 0L]
DF = DF_2[, list(num_mth = uniqueN(format(Date2, "%Y%m")),
                 num_rec = .N,
                 num_day = uniqueN(format(Date2, "%Y%m%d")),
                 numrec_5d = length(ID[as.numeric(difftime(today(), Date2), units = "days") <= 5])),
          by = ID]
# recperday2 assumes 30 days per month, as described in the question
DF[, recpermonth := num_rec/num_mth][, recperday := num_rec/num_day][, recperday2 := num_rec/(num_mth*30)]
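Since the question started from tidyverse, the same summary could also be sketched with dplyr. This is an illustrative alternative, not the answer's method. It makes three assumptions: "comparing ID with DF_1" means keeping only the IDs present in DF_1, a month has 30 days, and Date2 is read as UTC via as_datetime(). The name summary_tbl is made up for the example.

```r
library(dplyr)
library(lubridate)

DF_1 <- tibble(ID = c(123L, 456L),
               Date1 = dmy_hm(c("18/03/2018 16:45", "10/03/2018 20:15")))

DF_2 <- tibble(
  ID = c(123L, 123L, 123L, 456L, 456L, 456L, 456L),
  Date1 = ymd_hms(c("2018-03-18 06:37:22", "2018-03-18 06:37:21",
                    "2018-03-16 04:03:01", "2018-03-10 14:46:03",
                    "2018-03-10 14:46:04", "2018-03-10 14:46:03",
                    "2018-03-10 14:46:15")),
  # Date2 is stored as milliseconds since the Unix epoch
  Date2 = as_datetime(c(1519109133704, 1520324827462, 1520690354458,
                        1517319313151, 1515143046429, 1515838021062,
                        1488092209241) / 1000)
)

summary_tbl <- DF_2 %>%
  semi_join(DF_1, by = "ID") %>%   # assumption: keep only IDs present in DF_1
  group_by(ID) %>%
  summarise(
    Total = n(),                                  # rows per ID
    Month = n_distinct(format(Date2, "%Y%m")),    # distinct year-months in Date2
    Avg   = Total / Month,                        # rows per month
    Day   = Total / (Month * 30),                 # rows per day, 30-day months assumed
    Last5 = sum(as.numeric(Sys.Date() - as.Date(Date1), units = "days") <= 5)
  )

print(summary_tbl)
```

Month counts distinct year-month combinations, so January 2018 and January 2017 would count as two separate months. Last5 depends on the date the code is run.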