dplyr: lubridate: How to count the number of occurrences by week and spread on daily data

Question

Hi, I have the following data frame:

    report_date Revenue Day_type
1   2017-01-01  260.96     Sale
2   2017-01-02  540.12     Sale
3   2017-01-03  511.59     Sale
4   2017-01-04  343.29     Sale
5   2017-01-05  507.09     Sale
6   2017-01-06 1023.32     Sale
7   2017-01-07  580.19     Sale
8   2017-01-08  826.74     Sale
9   2017-01-09  753.78     Sale
10  2017-01-10  468.44     Sale
11  2017-01-11  526.57     Sale
12  2017-01-12  419.10     Sale
13  2017-01-13  243.10  Avg day
14  2017-01-14  456.64  Avg day
15  2017-01-15  659.91  Avg day
16  2017-01-16  516.98  Avg day
17  2017-01-17  447.00     Sale
18  2017-01-18  222.70     Sale
19  2017-01-19  129.48     Sale
20  2017-01-20  205.44     Sale

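I am trying to group by week and sum Revenue, then spread the Day_type column so that I get the number of occurrences of each day type per week.

The final result should look like this: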

   year  week Revenue  Sale  Avg day
  <dbl> <dbl>   <dbl>
1  2017     1   3767.   7      0
2  2017     2   3694.   5      2
3  2017     3   2320.   5      2
4  2017     4   3315.   7      0
5  2017     5   1998.   7      0
6  2017     6   1757.   7      0

With this code I can group by week and sum the revenue, but I need help spreading and counting the Day_type column.

fulldata <- fulldata %>% 
  group_by(year = year(report_date), 
           week = week(report_date)) %>% 
  summarise_if(is.numeric, sum) %>% 
  summarise_if(is.factor, count)

Thanks for your help.

Answer 1

Create a logical vector for each day type and take its sum after grouping by 'year' and 'week', then call complete to add the combinations that are not observed in the data.
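As a small illustration of the first step, the base R sketch below shows how a comparison produces a logical vector and how sum counts its TRUE values. The vector day_type is a made-up week built from the labels in the question, and "Holiday" is a hypothetical label used only to show the zero case.

```r
# One hypothetical week of day types (labels taken from the question)
day_type <- c("Sale", "Sale", "Sale", "Sale", "Sale", "Avg day", "Avg day")

# The comparison returns a logical vector, one TRUE/FALSE per day
is_sale <- day_type == "Sale"

# sum() coerces TRUE to 1 and FALSE to 0, so it counts the matches
n_sale <- sum(is_sale)
n_avg  <- sum(day_type == "Avg day")

# A label that never occurs gives an all-FALSE vector, hence a count of 0
n_holiday <- sum(day_type == "Holiday")
```

Here n_sale is 5, n_avg is 2 and n_holiday is 0.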

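library(lubridate)
library(dplyr)
library(tidyr)  # provides complete()

df1 %>%
   mutate(report_date = ymd(report_date)) %>%
   group_by(year = year(report_date), week = week(report_date)) %>%
   # summing a logical vector counts its TRUE values
   summarise(Revenue = sum(Revenue),
             Sale = sum(Day_type == "Sale"),
             Avg_day = sum(Day_type == "Avg day")) %>%
   ungroup() %>%  # drop the leftover 'year' grouping before complete()
   # add rows for unobserved weeks in 1:6 (Revenue is left as NA there)
   complete(year, week = 1:6, fill = list(Sale = 7, Avg_day = 0))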

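If a day type has no elements in a particular group, the sum returns 0, because every element of the logical vector is FALSE (coerced to 0).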
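When Day_type has more levels than are practical to write out by hand, a possible alternative is to count the rows per week and type and then widen the result. The sketch below is not part of the answer above; it assumes tidyr 1.1.0 or later (for pivot_wider with a scalar values_fill) and rebuilds the 20 rows shown in the question by hand. The names keyed, revenue, counts and weekly are only illustrative.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# The 20 rows shown in the question, rebuilt by hand
df1 <- data.frame(
  report_date = seq(as.Date("2017-01-01"), as.Date("2017-01-20"), by = "day"),
  Revenue = c(260.96, 540.12, 511.59, 343.29, 507.09, 1023.32, 580.19,
              826.74, 753.78, 468.44, 526.57, 419.10, 243.10, 456.64,
              659.91, 516.98, 447.00, 222.70, 129.48, 205.44),
  Day_type = c(rep("Sale", 12), rep("Avg day", 4), rep("Sale", 4)),
  stringsAsFactors = FALSE
)

# Attach the grouping keys once so both summaries share them
keyed <- df1 %>%
  mutate(year = year(report_date), week = week(report_date))

# Weekly revenue totals
revenue <- keyed %>%
  group_by(year, week) %>%
  summarise(Revenue = sum(Revenue)) %>%
  ungroup()

# One row per (year, week, Day_type), then one column per Day_type;
# values_fill = 0 puts 0 where a type never occurs in a week
counts <- keyed %>%
  count(year, week, Day_type) %>%
  pivot_wider(names_from = Day_type, values_from = n, values_fill = 0)

weekly <- left_join(revenue, counts, by = c("year", "week"))
```

On these 20 rows, weekly has one row for each of weeks 1 to 3, with Sale counts 7, 5, 4 and `Avg day` counts 0, 2, 2. The widened column keeps the original label, so it has to be referred to with backticks or as weekly[["Avg day"]].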
