Using pivot_longer and pivot_wider in combination
I have a data frame with many NaN values scattered across its columns.
df <- data.frame(
Data1 = c(3,2,1,NaN, NaN, NaN),
Data2 = c(NaN, NaN, NaN, 3,5,3),
Data3 = c(NaN, NaN, 7,5,1, NaN)
)
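I tried to get rid of the NaN values by using pivot_longer, filtering out the NaN values, and then using pivot_wider to put the remaining numbers back into their original columns. However, this fails: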
library(dplyr)  # provides filter(); without it, stats::filter would be called
library(tidyr)
df %>%
  pivot_longer(c("Data1", "Data2", "Data3")) %>%
  filter(!is.na(value)) %>%
  pivot_wider(names_from = name,
              values_from = value)
# A tibble: 1 x 3
Data1 Data3 Data2
<list> <list> <list>
1 <dbl [3]> <dbl [3]> <dbl [3]>
Warning message:
Values are not uniquely identified; output will contain list-cols.
* Use `values_fn = list` to suppress this warning.
* Use `values_fn = length` to identify where the duplicates arise
* Use `values_fn = {summary_fun}` to summarise duplicates
What is wrong with the code, and how can I get this output?
Data1 Data2 Data3
3 3 7
2 5 5
1 3 1
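The code doesn't necessarily fail, but it returns a warning because there are multiple values in each cell: after filtering, nothing identifies which output row each value belongs to, so pivot_wider collapses them into a single row of list-columns. If every column has the same number of values, you can unnest the list output.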
library(dplyr)
library(tidyr)
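df %>%
  pivot_longer(starts_with('Data'), values_drop_na = TRUE) %>%
  arrange(name) %>%  # sort so the columns come out as Data1, Data2, Data3
  pivot_wider(names_from = name, values_from = value, values_fn = list) %>%
  unnest(cols = everything())  # unnest() without `cols` is deprecated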
# Data1 Data2 Data3
# <dbl> <dbl> <dbl>
#1 3 3 7
#2 2 5 5
#3 1 3 1
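An alternative to unnesting is to give pivot_wider something that uniquely identifies each value. The sketch below numbers the values within each original column and uses that number as the row key. The column name row_id is a hypothetical helper introduced only for this illustration, and the approach assumes you want the surviving values stacked in their original order.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  Data1 = c(3, 2, 1, NaN, NaN, NaN),
  Data2 = c(NaN, NaN, NaN, 3, 5, 3),
  Data3 = c(NaN, NaN, 7, 5, 1, NaN)
)

result <- df %>%
  pivot_longer(starts_with("Data"), values_drop_na = TRUE) %>%
  group_by(name) %>%
  mutate(row_id = row_number()) %>%  # hypothetical helper: position within each column
  ungroup() %>%
  pivot_wider(names_from = name, values_from = value) %>%
  select(Data1, Data2, Data3)        # drop row_id and restore the column order

result
```

Because each (row_id, name) pair now occurs only once, pivot_wider no longer warns. If the columns had different numbers of non-missing values, the shorter ones would be padded with NA rather than causing an error.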
This particular problem can be solved neatly with purrr:
library(purrr)
map_dfr(df, na.omit)
Data1 Data2 Data3
<dbl> <dbl> <dbl>
1 3 3 7
2 2 5 5
3 1 3 1
Base R:
I prefer sapply with na.omit:
sapply(df, na.omit)
Output (a numeric matrix, since sapply simplifies equal-length results):
     Data1 Data2 Data3
[1,]     3     3     7
[2,]     2     5     5
[3,]     1     3     1
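If you need a data frame rather than a matrix, one possible base R variant is sketched below. It is not from the answers above: it uses plain logical indexing instead of na.omit to avoid carrying the na.action attribute along, and it checks the assumption that every column keeps the same number of values before rebuilding the data frame.

```r
df <- data.frame(
  Data1 = c(3, 2, 1, NaN, NaN, NaN),
  Data2 = c(NaN, NaN, NaN, 3, 5, 3),
  Data3 = c(NaN, NaN, 7, 5, 1, NaN)
)

# Drop NA/NaN from each column; is.na() is TRUE for NaN as well
cleaned <- lapply(df, function(x) x[!is.na(x)])

# Assumption: all columns have the same number of remaining values
stopifnot(length(unique(lengths(cleaned))) == 1)

result <- as.data.frame(cleaned)
result
```

The stopifnot guard fails early with an error if the columns end up with different lengths, which is the case where none of the approaches here can produce a rectangular result without padding.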