Grouping by rolling date interval in Netezza

I have a table in Netezza that looks like this:

Date         Stock    Return
2015-01-01   A        xxx
2015-01-02   A        xxx
2015-01-03   A        0
2015-01-04   A        0
2015-01-05   A        xxx
2015-01-06   A        xxx
2015-01-07   A        xxx
2015-01-08   A        xxx
2015-01-09   A        xxx
2015-01-10   A        0
2015-01-11   A        0
2015-01-12   A        xxx
2015-01-13   A        xxx
2015-01-14   A        xxx
2015-01-15   A        xxx
2015-01-16   A        xxx
2015-01-17   A        0
2015-01-18   A        0
2015-01-19   A        xxx
2015-01-20   A        xxx

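The data represents stock returns for various stocks and dates. What I need to do is group the data by a given interval and by the day within that interval. A further difficulty is that the weekends (the 0s) have to be left out (public holidays can be ignored). The start date of the first interval should be an arbitrary date.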

For example, my output should look like this:

Interval    Q01    Q02    Q03    Q04    Q05
1           xxx    xxx    xxx    xxx    xxx
2           xxx    xxx    xxx    xxx    xxx
3           xxx    xxx    xxx    xxx    xxx 
4           xxx    xxx    xxx    xxx    xxx

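This output would represent intervals of 5 working days each, with average returns as the values. Going by the raw data above, with a start date of 1 January, the first interval covers the 1st, 2nd, 5th, 6th and 7th (the 3rd and 4th are a weekend and are ignored). Q01 would be the 1st, Q02 the 2nd, Q03 the 5th, and so on. The second interval covers the 8th, 9th, 12th, 13th and 14th.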

I tried the following, without success:

CEIL(CAST(EXTRACT(DOY FROM DATE) AS FLOAT) / CAST (10 AS FLOAT)) AS interval
EXTRACT(DAY FROM DATE) % 10 AS DAYinInterval

I also tried using a rolling counter and a variable start date to reset my DOY to zero, something like this:

CEIL(CAST(EXTRACT(DOY FROM DATE) - EXTRACT(DOY FROM 'start-date') AS FLOAT) / CAST (10 AS FLOAT)) AS Interval

The closest I got to what I expected was this:

SUM(Number) OVER(PARTITION BY STOCK ORDER BY DATE ASC rows 10 preceding) AS Counter

Unfortunately, it goes from 1 to 10 and then carries on with 11s, where it should start over from 1 to 10.

I would love to see how this can be implemented in an elegant way. Thanks.

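I'm not entirely sure I understand the problem, but I think I might, so I'm going to take a swing at it with some windowed aggregates and subqueries.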

Here is the sample data, with some random non-zero values plugged in for the weekdays.

    DATE    | STOCK | RETURN
------------+-------+--------
 2015-01-01 | A     |     16
 2015-01-02 | A     |     80
 2015-01-03 | A     |      0
 2015-01-04 | A     |      0
 2015-01-05 | A     |     60
 2015-01-06 | A     |     25
 2015-01-07 | A     |     12
 2015-01-08 | A     |      1
 2015-01-09 | A     |     81
 2015-01-10 | A     |      0
 2015-01-11 | A     |      0
 2015-01-12 | A     |     35
 2015-01-13 | A     |     20
 2015-01-14 | A     |     69
 2015-01-15 | A     |     72
 2015-01-16 | A     |     89
 2015-01-17 | A     |      0
 2015-01-18 | A     |      0
 2015-01-19 | A     |    100
 2015-01-20 | A     |     67
(20 rows)

Here is my approach, with embedded comments.

select avg(return),
   date_period,
   day_period
from (
        -- use row_number to generate a sequential value for each DOW,
        -- with a WHERE to filter out the weekends
      select date,
         stock,
         return,
         date_period ,
         row_number() over (partition by date_period order by date asc) day_period
      from (
            -- bin out the entries by date_period using the first_value of the entire set as the starting point
            -- modulo 7
            select date,
               stock,
               return,
               date + (first_value(date) over (order by date asc) - date) % 7 date_period
            from stocks
            where date >= '2015-01-01'
            -- setting the starting period date here
         )
         foo
      -- Netezza numbers DOW from 1 (Sunday) to 7 (Saturday)
      where extract (dow from date) not in (1,7)
   )
   foo
group by date_period, day_period
order by date_period asc;

Result:

    AVG     | DATE_PERIOD | DAY_PERIOD
------------+-------------+------------
  16.000000 | 2015-01-01  |          1
  80.000000 | 2015-01-01  |          2
  60.000000 | 2015-01-01  |          3
  25.000000 | 2015-01-01  |          4
  12.000000 | 2015-01-01  |          5
   1.000000 | 2015-01-08  |          1
  81.000000 | 2015-01-08  |          2
  35.000000 | 2015-01-08  |          3
  20.000000 | 2015-01-08  |          4
  69.000000 | 2015-01-08  |          5
  72.000000 | 2015-01-15  |          1
  89.000000 | 2015-01-15  |          2
 100.000000 | 2015-01-15  |          3
  67.000000 | 2015-01-15  |          4
(14 rows)
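
For anyone who wants to check the binning arithmetic without a Netezza instance, here is a minimal Python sketch of the same three steps: drop the weekends, bin the dates into 7-day periods counted from the start date, and number the remaining days inside each period. The name `bin_returns` and the in-memory `rows` list are illustrative assumptions, not part of the query above. The sketch also assumes a row exists for the start date itself, so that the query's `first_value(date)` equals `start`, and it skips the `avg` because the sample holds only one stock.

```python
from datetime import date, timedelta
from itertools import groupby

# Sample data from the table above: returns for stock A, 2015-01-01 .. 2015-01-20.
RETURNS = [16, 80, 0, 0, 60, 25, 12, 1, 81, 0, 0, 35, 20, 69, 72, 89, 0, 0, 100, 67]
rows = [(date(2015, 1, 1) + timedelta(days=i), r) for i, r in enumerate(RETURNS)]


def bin_returns(rows, start):
    """Return (date_period, day_period, value) for each weekday on or after start."""
    # Keep rows from the start date onward and drop Saturdays/Sundays
    # (date.weekday() is 5 for Saturday and 6 for Sunday).
    kept = sorted((d, r) for d, r in rows if d >= start and d.weekday() < 5)

    def period(d):
        # First day of the 7-day period containing d, counted from start.
        return d - timedelta(days=(d - start).days % 7)

    out = []
    for p, group in groupby(kept, key=lambda row: period(row[0])):
        # Number the remaining days 1, 2, 3, ... inside each period.
        for n, (_, r) in enumerate(group, start=1):
            out.append((p, n, r))
    return out


result = bin_returns(rows, date(2015, 1, 1))
```

With the start date 2015-01-01, `result` holds the same 14 (period, day, value) combinations as the table above. Passing a different `start` shifts the periods accordingly.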

Changing the start date to '2015-01-03' to see whether it adjusts properly:

...
from stocks
where date >= '2015-01-03'
...

Result:

    AVG     | DATE_PERIOD | DAY_PERIOD
------------+-------------+------------
  60.000000 | 2015-01-03  |          1
  25.000000 | 2015-01-03  |          2
  12.000000 | 2015-01-03  |          3
   1.000000 | 2015-01-03  |          4
  81.000000 | 2015-01-03  |          5
  35.000000 | 2015-01-10  |          1
  20.000000 | 2015-01-10  |          2
  69.000000 | 2015-01-10  |          3
  72.000000 | 2015-01-10  |          4
  89.000000 | 2015-01-10  |          5
 100.000000 | 2015-01-17  |          1
  67.000000 | 2015-01-17  |          2
(12 rows)
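
One caveat, offered as an observation rather than as part of the approach above: the modulo-7 binning lines up with 5-working-day intervals only because a calendar week contains exactly five weekdays. For other lengths (the attempts in the question divide by 10), one option is to number the weekdays consecutively and derive both the interval and the position from that counter. This also makes the counter wrap back to 1 instead of running on to 11. The sketch below shows the arithmetic in Python. `interval_positions` and its parameters are hypothetical names, and in SQL the counter would presumably come from `row_number()` over the weekend-filtered rows.

```python
from datetime import date, timedelta


def interval_positions(dates, start, length):
    """Map each weekday on or after start to (interval, position), both 1-based."""
    weekdays = sorted(d for d in dates if d >= start and d.weekday() < 5)
    out = {}
    for i, d in enumerate(weekdays):
        # i counts working days from 0; integer division picks the interval,
        # the remainder picks the position inside it (wrapping back to 1).
        out[d] = (i // length + 1, i % length + 1)
    return out


days = [date(2015, 1, 1) + timedelta(days=i) for i in range(20)]
pos = interval_positions(days, date(2015, 1, 1), 5)
```

With `length` set to 5 and the start date 2015-01-01, the 5th lands in interval 1 at position 3 and the 8th opens interval 2, which matches the layout described in the question. With `length` set to 10, the first interval runs through the 14th and the 15th opens the second.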