BigQuery: compare two weeks' data
I have a table in BigQuery that looks like this:
date(week) vendor_name value
2021-11-14 rick 8000
2021-11-14 rose 7000
2021-11-14 axel 6500
2021-11-14 boris 6000
2021-11-14 cliff 5500
2021-11-07 rose 9500
2021-11-07 axel 8750
2021-11-07 rick 4000
2021-11-07 dean 3500
2021-11-07 evan 3000
.....
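The date column holds the start date of each week. The vendor_name column lists the top 5 vendors by sales for that week, and the value column shows their total sales. The top vendors can differ from week to week. Expected output: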
date(week) vendor_name value previous_date previous_top_vendors previous_value change
2021-11-14 rick 8000 2021-11-07 rose 9500 -26%
2021-11-14 rose 7000 2021-11-07 axel 8750 -25%
2021-11-14 axel 6500 2021-11-07 rick 4000 100%
2021-11-14 boris 6000 2021-11-07 dean 3500 Null
2021-11-14 cliff 5500 2021-11-07 evan 3000 Null
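Once the latest date is received, the previous week's data should be pulled in. Vendors are compared across these two weeks, and their percentage change is shown in a column named change.
Note: vendor rankings can change every week. A new vendor can enter the ranking (in which case the change column should show Null).
I tried this: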
SELECT * FROM (select*from `table.top20_vendor` where date = '2021-11-14'),
(select date as previous_date, vendor_name as previous_top_vendors,value as previous_value
from `table.top20_vendor` where date ='2021-11-07')
But the output is not correct:
date(week) vendor_name value previous_date previous_top_vendors previous_value
2021-11-14 rick 8000 2021-11-07 rose 9500
2021-11-14 rick 8000 2021-11-07 axel 8750
2021-11-14 rick 8000 2021-11-07 rick 4000
2021-11-14 rick 8000 2021-11-07 dean 3500
2021-11-14 rick 8000 2021-11-07 evan 3000
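Also, I don't know how to calculate the change column.
I still don't understand your change logic in terms of which records you are comparing, but this should get you most of the way toward comparing vendors and their values between weeks: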
with sample_data as (
select '2021-11-14' as week, 'rick' as vendor_name, 8000 as value UNION ALL
select '2021-11-14' as week, 'rose' as vendor_name, 7000 as value UNION ALL
select '2021-11-14' as week, 'axel' as vendor_name, 6500 as value UNION ALL
select '2021-11-14' as week, 'boris' as vendor_name, 6000 as value UNION ALL
select '2021-11-14' as week, 'cliff' as vendor_name, 5500 as value UNION ALL
select '2021-11-07' as week, 'rose' as vendor_name, 9500 as value UNION ALL
select '2021-11-07' as week, 'axel' as vendor_name, 8750 as value UNION ALL
select '2021-11-07' as week, 'rick' as vendor_name, 4000 as value UNION ALL
select '2021-11-07' as week, 'dean' as vendor_name, 3500 as value UNION ALL
select '2021-11-07' as week, 'evan' as vendor_name, 3000 as value
)
,
ranked_data as (
select dense_rank() over (order by week desc) week_rank
, rank() over (partition by week order by value desc) vendor_rank -- rank 1 = top seller of the week
, *
from sample_data
)
select curr_week.week
, curr_week.vendor_name
, curr_week.value
, prev_week.week as previous_week
, prev_week.vendor_name as previous_top_vendors
, prev_week.value as previous_value
from ranked_data curr_week
left join ranked_data prev_week
on curr_week.vendor_rank=prev_week.vendor_rank
and prev_week.week_rank=2
where curr_week.week_rank=1
order by 3 desc
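Ultimately I think you will need LEAD or LAG, depending on how the intermediate steps are calculated. If you can elaborate, I may be able to add the change component.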
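To illustrate the LAG idea, here is a minimal sketch. It assumes the change should be computed per vendor (each vendor against its own value in its previous row), which is only one reading of the question. The names `with_previous`, `previous_week` and `previous_value` are made up for this example, and the inline `sample_data` stands in for the real table with `week` typed as DATE.

```sql
with sample_data as (
  select date '2021-11-14' as week, 'rick' as vendor_name, 8000 as value union all
  select date '2021-11-14', 'rose', 7000 union all
  select date '2021-11-14', 'axel', 6500 union all
  select date '2021-11-14', 'boris', 6000 union all
  select date '2021-11-14', 'cliff', 5500 union all
  select date '2021-11-07', 'rose', 9500 union all
  select date '2021-11-07', 'axel', 8750 union all
  select date '2021-11-07', 'rick', 4000 union all
  select date '2021-11-07', 'dean', 3500 union all
  select date '2021-11-07', 'evan', 3000
),
with_previous as (
  select week
       , vendor_name
       , value
         -- previous row of the same vendor, ordered by week
       , lag(week)  over (partition by vendor_name order by week) as previous_week
       , lag(value) over (partition by vendor_name order by week) as previous_value
  from sample_data
)
select week
     , vendor_name
     , value
     , previous_week
     , previous_value
       -- safe_divide returns NULL instead of failing when previous_value is 0 or NULL
     , round(safe_divide(value - previous_value, previous_value) * 100, 1) as change
from with_previous
where week = date '2021-11-14'
order by value desc
```

With the sample rows, rick, rose and axel get a change value, while boris and cliff have no earlier row and so get NULL. Note that LAG picks the vendor's previous row in the table, which is not necessarily the week immediately before; if a vendor can skip weeks, an extra check such as `previous_week = date_sub(week, interval 7 day)` would be needed.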
What I did first was get the top 5 values for the desired week, using the following query:
select * from
(
select date, vendor_name, value , row_number() over(partition by date order by value desc) as rn
from `project.dataset.table`
)A
where rn<=5 and date='2021-11-14'
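As a possible shorthand for the same top-5 filter, BigQuery's QUALIFY clause can filter on a window function without the wrapping subquery. This is only a sketch: the inline `sample_data` is a stand-in for `project.dataset.table`, and the column is called `week` here (an assumption) rather than `date`.

```sql
with sample_data as (
  select date '2021-11-14' as week, 'rick' as vendor_name, 8000 as value union all
  select date '2021-11-14', 'rose', 7000 union all
  select date '2021-11-14', 'axel', 6500 union all
  select date '2021-11-14', 'boris', 6000 union all
  select date '2021-11-14', 'cliff', 5500 union all
  select date '2021-11-07', 'rose', 9500 union all
  select date '2021-11-07', 'axel', 8750
)
select week
     , vendor_name
     , value
     , row_number() over (partition by week order by value desc) as rn
from sample_data
where week = date '2021-11-14'
-- qualify filters on the window function result after it has been computed
qualify rn <= 5
order by rn
```

The subquery form in the answer and this form should return the same rows; QUALIFY just keeps the query flatter.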
The next step is to compute the change in value. For that, use the percentage-change formula and join the two weeks' values on vendor name.
SELECT prev_week.rn,((curr_week.value-prev_week.value)/prev_week.value)*100 as change FROM curr_week right join prev_week on curr_week.vendor_name = prev_week.vendor_name
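As a quick check of the formula, the sample values from the question can be plugged in directly. This sketch uses `safe_divide` and `round`, which the query above does not; they are added here as an assumption about how you might want to guard against a zero previous value and tidy the output. The column aliases are made up for the example.

```sql
select
    -- rick: 4000 -> 8000
    round(safe_divide(8000 - 4000, 4000) * 100, 1) as rick_change   -- 100.0
    -- rose: 9500 -> 7000
  , round(safe_divide(7000 - 9500, 9500) * 100, 1) as rose_change   -- -26.3
    -- axel: 8750 -> 6500
  , round(safe_divide(6500 - 8750, 8750) * 100, 1) as axel_change   -- -25.7
    -- boris has no previous-week row, so the previous value is NULL and so is the result
  , safe_divide(6000 - cast(null as int64), cast(null as int64)) as boris_change  -- NULL
```

A vendor with no match in the other week simply propagates NULL through the arithmetic, which is what the question asks for in the change column.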
Consider the following approach:
with curr_week as(
select * from
(
select date, vendor_name, value , row_number() over(partition by date order by value desc) as rn
from `project.dataset.table`
)A
where rn<=5 and date='2021-11-14'
),
prev_week as (
select * from
(
select date, vendor_name, value , row_number() over(partition by date order by value desc) as rn
from `project.dataset.table`
)A
where rn<=5 and date='2021-11-07'
),
changes as(
SELECT prev_week.rn,((curr_week.value-prev_week.value)/prev_week.value)*100 as change FROM curr_week right join prev_week on curr_week.vendor_name = prev_week.vendor_name
)
select curr_week.date,
curr_week.vendor_name,
curr_week.value,
prev_week.date as previous_date,
prev_week.vendor_name as previous_top_vendors,
prev_week.value as previous_value,
changes.change
from curr_week left join prev_week on curr_week.rn = prev_week.rn
left join changes on prev_week.rn = changes.rn
This is the result I got: