Find the number of occurrences of a specific value in each column where the count is greater than a given frequency in R
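I am trying to get the frequency of a particular value in each column, keeping only the columns where that frequency exceeds a given number. My data has multiple columns, and I want code that reports the frequency of "0" in each column where the count of "0" is greater than 3.
My dataset looks like this:
```
a b c d e f g h
0 1 0 1 1 1 1 1
2 0 0 0 0 0 0 0
0 1 2 2 2 1 0 1
0 0 0 0 1 0 0 0
1 0 2 1 1 0 0 0
1 1 0 0 1 0 0 0
0 1 2 2 2 2 2 2
```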
The output I need is:
```
Variable Frequency
a 4
c 4
f 4
g 5
h 4
```
That is, it should show the number of "0" values in each column of the data frame, but only for the columns where that count is greater than 3.
Thank you.
You can use `colSums` to count the number of 0s in each column, then subset the counts that are greater than 3.
```
subset(stack(colSums(df == 0, na.rm = TRUE)), values > 3)
```
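To see why the one-liner works, it can be unpacked into intermediate steps. This is only a sketch: the data frame is rebuilt here with the same values as the question's table so the block runs on its own, and the intermediate names (`is_zero`, `zero_counts`, `long`, `result`) are illustrative, not part of the original answer.

```r
# Rebuild the example data (same values as the table in the question).
df <- data.frame(
  a = c(0L, 2L, 0L, 0L, 1L, 1L, 0L),
  b = c(1L, 0L, 1L, 0L, 0L, 1L, 1L),
  c = c(0L, 0L, 2L, 0L, 2L, 0L, 2L),
  d = c(1L, 0L, 2L, 0L, 1L, 0L, 2L),
  e = c(1L, 0L, 2L, 1L, 1L, 1L, 2L),
  f = c(1L, 0L, 1L, 0L, 0L, 0L, 2L),
  g = c(1L, 0L, 0L, 0L, 0L, 0L, 2L),
  h = c(1L, 0L, 1L, 0L, 0L, 0L, 2L)
)

# Step 1: compare every cell with 0; the result is a logical matrix.
is_zero <- df == 0

# Step 2: TRUE counts as 1, so the column sums are the zero counts.
zero_counts <- colSums(is_zero, na.rm = TRUE)

# Step 3: stack() turns the named vector into a data frame with
# columns `values` (the counts) and `ind` (the column names).
long <- stack(zero_counts)

# Step 4: keep only the columns with more than 3 zeros.
result <- subset(long, values > 3)
print(result)
```

Because `stack()` names its columns `values` and `ind`, the filter in the last step has to refer to `values`, not to a column name from the original data.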
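A `tidyverse` approach would be:
```
library(dplyr)

df %>%
  # count the zeros in every column (gives a single summary row)
  summarise(across(everything(), ~ sum(.x == 0, na.rm = TRUE))) %>%
  # reshape to one row per column, with columns `name` and `value`
  tidyr::pivot_longer(cols = everything()) %>%
  # keep only the columns with more than 3 zeros
  filter(value > 3)

#  name  value
#  <chr> <int>
#1 a         4
#2 c         4
#3 f         4
#4 g         5
#5 h         4
```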
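If the result should carry the `Variable` and `Frequency` headers from the question, and the value and cutoff should be adjustable, the base R approach can be wrapped in a small helper. The function name `count_value_above` and its arguments are hypothetical, chosen only for this sketch, and the `toy` data frame is made up to keep the block self-contained.

```r
# Hypothetical helper: count how often `value` occurs in each column and
# keep only the columns where that count is strictly above `threshold`.
count_value_above <- function(data, value = 0, threshold = 3) {
  counts <- colSums(data == value, na.rm = TRUE)
  keep <- counts > threshold
  data.frame(
    Variable = names(counts)[keep],
    Frequency = unname(counts[keep]),
    row.names = NULL
  )
}

# Small made-up example: `x` has two zeros, `y` has one.
toy <- data.frame(x = c(0, 0, 1), y = c(1, 1, 0))
res <- count_value_above(toy, value = 0, threshold = 1)
print(res)
```

With the defaults (`value = 0`, `threshold = 3`), calling `count_value_above(df)` on the data from this question should keep columns `a`, `c`, `f`, `g` and `h`.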
Data
```
df <- structure(list(a = c(0L, 2L, 0L, 0L, 1L, 1L, 0L), b = c(1L, 0L,
    1L, 0L, 0L, 1L, 1L), c = c(0L, 0L, 2L, 0L, 2L, 0L, 2L), d = c(1L,
    0L, 2L, 0L, 1L, 0L, 2L), e = c(1L, 0L, 2L, 1L, 1L, 1L, 2L), f = c(1L,
    0L, 1L, 0L, 0L, 0L, 2L), g = c(1L, 0L, 0L, 0L, 0L, 0L, 2L), h = c(1L,
    0L, 1L, 0L, 0L, 0L, 2L)), class = "data.frame", row.names = c(NA, -7L))
```