How to select 70 percent of a column based on a condition in SQL?
This is my existing table data:
C1 C2 C3
1 A 1
2 B 1
3 C 0
4 D 0
5 E 0
6 F 0
7 G 1
8 H 1
9 I 1
10 J 0
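This is what I want: I am trying to select 70% of the rows where column C3 has the value 1. C3 contains five 1s in total, so 70% of 5 is 3.5, which I round to 4. So I want my final data set to contain 70% of the rows where C3 is 1: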
C1 C2 C3
1 A 1
2 B 1
3 C 0
4 D 0
5 E 0
7 G 1
8 H 1
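The arithmetic in the question can be sketched in a few lines of Python. This is only an illustration, not part of the original post: the names `rows`, `target_count`, and `cutoff_c1` are made up here, and the sketch assumes that 3.5 is meant to round up to 4 and that rows are taken in C1 order.

```python
import math

# Sample rows from the question as (C1, C2, C3).
rows = [
    (1, "A", 1), (2, "B", 1), (3, "C", 0), (4, "D", 0), (5, "E", 0),
    (6, "F", 0), (7, "G", 1), (8, "H", 1), (9, "I", 1), (10, "J", 0),
]

percent = 70  # share of the C3 = 1 rows to keep

# Count the rows where C3 is 1, then round 70% of that count up (assumption).
ones = sum(1 for _, _, c3 in rows if c3 == 1)
target_count = math.ceil(ones * percent / 100)

# Walk the rows in C1 order and find where the target-th 1 is reached.
seen = 0
cutoff_c1 = None
for c1, _, c3 in sorted(rows):
    seen += c3 == 1
    if seen == target_count:
        cutoff_c1 = c1
        break

print(ones, target_count, cutoff_c1)  # 5 4 8
```

So the fourth 1 sits in the row with C1 = 8, which is the last row in the desired output.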
Hmm. You don't seem to want a random selection. The rows appear to be ordered by col1 (C1 in the question). So you can compute it like this:
select t.*
from (select t.*,
             sum(case when col3 = 1 then 1 else 0 end) over (order by col1) as running_col3,
             sum(case when col3 = 1 then 1 else 0 end) over () as total_col3
      from t
     ) t
-- keep a row while the number of 1s before it is still below 70% of all 1s
where (running_col3 - case when col3 = 1 then 1 else 0 end) < 0.7 * total_col3;
Note: if col3 only takes the values 0 and 1, the query above can be simplified to:
select t.*
from (select t.*,
             sum(col3) over (order by col1) as running_col3,
             sum(col3) over () as total_col3
      from t
     ) t
-- keep a row while the number of 1s before it is still below 70% of all 1s
where (running_col3 - col3) < 0.7 * total_col3;
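The running-total idea can be tried end to end with Python's built-in sqlite3 module. This is a minimal sketch under stated assumptions, not the answerer's own code: the table is called `t`, the columns use the question's names C1, C2, C3, the fraction is passed in as 0.7, and the bundled SQLite is version 3.25 or newer (needed for window functions).

```python
import sqlite3

# In-memory table holding the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (C1 INTEGER PRIMARY KEY, C2 TEXT, C3 INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(i + 1, chr(ord("A") + i), c3)
     for i, c3 in enumerate([1, 1, 0, 0, 0, 0, 1, 1, 1, 0])],
)

# A row is kept while the count of 1s *before* it is below fraction * total.
query = """
SELECT C1, C2, C3
FROM (SELECT t.*,
             SUM(C3) OVER (ORDER BY C1) AS running_c3,
             SUM(C3) OVER ()            AS total_c3
      FROM t) AS s
WHERE (running_c3 - C3) < ? * total_c3
ORDER BY C1
"""
kept = conn.execute(query, (0.7,)).fetchall()
for row in kept:
    print(row)
```

On the sample data this returns the rows with C1 from 1 to 8, four of which have C3 = 1. That includes row 6 (F), which the desired output in the question does not list.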
Here is the answer:
select *
from (select t.*,
             -- number of 1s in C3 up to and including this row
             (select sum(C3) from table_name t1 where t1.C1 <= t.C1) as cumulative_sum,
             -- number of 1s in C3 in the whole table
             (select sum(C3) from table_name) as total_sum
      from table_name t
     ) t
where (cumulative_sum - C3) < 0.7 * total_sum;
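The correlated-subquery form needs no window functions, so it also runs on older SQLite builds. The sketch below is an illustration only: the helper `kept_c1` and the choice of fractions are hypothetical, and the table is loaded with the question's sample data. It shows how the cutoff moves as the fraction changes.

```python
import sqlite3

# In-memory copy of the question's table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_name (C1 INTEGER PRIMARY KEY, C2 TEXT, C3 INTEGER)"
)
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?, ?)",
    [(1, "A", 1), (2, "B", 1), (3, "C", 0), (4, "D", 0), (5, "E", 0),
     (6, "F", 0), (7, "G", 1), (8, "H", 1), (9, "I", 1), (10, "J", 0)],
)


def kept_c1(fraction):
    """Return the C1 values kept for a given fraction of the C3 = 1 rows."""
    query = """
    SELECT C1
    FROM (SELECT t.*,
                 (SELECT SUM(C3) FROM table_name t1
                  WHERE t1.C1 <= t.C1) AS cumulative_sum,
                 (SELECT SUM(C3) FROM table_name) AS total_sum
          FROM table_name t) AS s
    WHERE (cumulative_sum - C3) < ? * total_sum
    ORDER BY C1
    """
    return [c1 for (c1,) in conn.execute(query, (fraction,))]


for fraction in (0.5, 0.7, 0.8):
    print(fraction, kept_c1(fraction))
```

With five 1s in the sample, 0.7 and 0.8 keep the same rows (C1 from 1 to 8), because both thresholds (3.5 and 4) admit the fourth 1 but not the fifth. A fraction of 0.5 stops one row earlier, at C1 = 7.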