How to display changes in logins from one week to another, segmented by cohort
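Objective: I have to segment emails by subscription month, which determines the cohort. In other words, everyone who subscribed in January 2018 is in one cohort, everyone who subscribed in February 2018 is in another. I then need to see their login activity from one week to the next. If 100 subscribers from the January 2018 cohort logged in during ISO_WEEK 2 of 2019, and 70 of them also logged in during ISO_WEEK 3, the retention rate is 70%.
Problem: I am not sure how to write my query so that the cohorts (e.g. Jan2018, Feb2018, Mar2018) form the first column and the following columns hold the count of distinct emails with login activity for each ISO_WEEK, starting in 2019.
Sample data: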
CREATE TABLE member
    ([email] varchar(50), [creation_date] Datetime);

INSERT INTO member
VALUES
    ('player123@google.com', '2018-01-01 05:00:00'),
    ('player999@google.com', '2018-01-30 12:00:00'),
    ('player555@google.com', '2018-05-14 20:15:00');

CREATE TABLE login
    ([email] varchar(100), [login_date] Datetime);

INSERT INTO login
VALUES
    ('player123@google.com', '2019-01-07 05:30:00'),
    ('player123@google.com', '2019-01-07 09:30:00'),
    ('player123@google.com', '2019-01-08 08:30:00'),
    ('player123@google.com', '2019-01-15 06:30:00'),
    ('player999@google.com', '2019-01-08 11:30:00'),
    ('player999@google.com', '2019-01-10 07:30:00'),
    ('player555@google.com', '2019-01-08 04:30:00');
What I tried:
;with
cte1 AS (
    SELECT CAST(Creation_Date AS Date) AS Creation_Date
          ,CONCAT(DATEPART(MONTH,Creation_Date),'-',DATEPART(YEAR,Creation_Date)) AS Cohort
          ,email AS Emails
    FROM member
),
cte2 AS (
    SELECT Logins
          ,yy
          ,login_ISOWeeks
          ,Emails
    FROM (
        SELECT CAST(login_date as Date) AS Logins
              ,DATEPART(YEAR, login_date) AS yy
              ,DATEPART(ISO_WEEK,login_date) AS login_ISOWeeks
              ,email AS Emails
              ,ROW_NUMBER()
                   OVER(PARTITION BY DATEPART(YEAR, login_date), DATEPART(ISO_WEEK,login_date), email ORDER BY login_date ASC) AS week_count
        FROM login) as f_log
    WHERE f_log.week_count = 1
)
SELECT cte1.Creation_Date
      ,cte1.Cohort
      ,cte2.yy
      ,cte2.login_ISOWeeks
      ,cte1.Emails
FROM cte1
INNER JOIN cte2 ON cte1.Emails=cte2.Emails
Desired output:
Cohort      2019_2   2019_3
jan 2018    2        1
may 2018    1        0
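Your data has a lot of oddities. Why is the join key an email address rather than a member ID? Why are email members "created" multiple times?
To keep the join from getting out of hand, I aggregate each table before joining. This produces the results you want: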
select datename(year, m.creation_date) + '-' + datename(month, m.creation_date) as yyyymm,
       count(distinct m.email) as num_members,
       sum(case when l.yyyy = 2019 and l.isoweek = 2 then 1 else 0 end) as cnt_201902,
       sum(case when l.yyyy = 2019 and l.isoweek = 3 then 1 else 0 end) as cnt_201903
from (-- one row per member, keyed on the earliest creation date
      select m.email, min(creation_date) as creation_date
      from member m
      group by m.email
     ) m left join
     (-- one row per member per week with at least one login
      select distinct l.email, year(l.login_date) as yyyy, datepart(iso_week, l.login_date) as isoweek
      from login l
     ) l
     on m.email = l.email
group by datename(year, m.creation_date) + '-' + datename(month, m.creation_date)
order by yyyymm;
Here is a db<>fiddle.
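The question also describes a retention rate (members active in one week who are active again the next), which weekly counts alone do not give you. A minimal sketch of one way to compute it, assuming the same member and login tables and with ISO weeks 2 and 3 of 2019 hard-coded as a date range; the CTE names m and w and the output column names are illustrative choices, not part of the query above:

```sql
-- Sketch only: week-over-week retention per cohort.
-- Assumption: the two weeks compared are ISO weeks 2 and 3 of 2019,
-- i.e. 2019-01-07 (Monday) up to but not including 2019-01-21.
with m as (
    -- one row per member, keyed on the earliest creation date
    select email, min(creation_date) as creation_date
    from member
    group by email
),
w as (
    -- one row per member: 1/0 flags for a login in each of the two weeks
    select email,
           max(case when datepart(iso_week, login_date) = 2 then 1 else 0 end) as in_wk2,
           max(case when datepart(iso_week, login_date) = 3 then 1 else 0 end) as in_wk3
    from login
    where login_date >= '20190107' and login_date < '20190121'
    group by email
)
select convert(char(7), m.creation_date, 120) as cohort,        -- 'yyyy-mm'
       count(*) as num_members,
       sum(coalesce(w.in_wk2, 0)) as active_wk2,
       -- retained = active in week 2 AND active again in week 3
       sum(coalesce(w.in_wk2 * w.in_wk3, 0)) as retained_wk3,
       -- nullif avoids division by zero when nobody was active in week 2
       100.0 * sum(coalesce(w.in_wk2 * w.in_wk3, 0))
             / nullif(sum(coalesce(w.in_wk2, 0)), 0) as retention_pct
from m
left join w on w.email = m.email
group by convert(char(7), m.creation_date, 120)
order by cohort;
```

On the sample data this should give 2 active and 1 retained (50%) for 2018-01, and 1 active and 0 retained (0%) for 2018-05. Filtering on a date range rather than on year(login_date) plus the ISO week also sidesteps a year-boundary mismatch: 2018-12-31 falls in ISO week 1 of 2019, but year() reports 2018 for it.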