Struggling with how to incorporate a reordering of dataframe rows based on a vector within my function

library(tidyverse)
library(ggplot2) # for the diamonds dataset

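I am having trouble with my function. In this example, I use the diamonds dataset from ggplot2 to dplyr::group_by "cut" and "color" separately, then dplyr::summarise to get the counts. I use rlang and purrr to produce the two count summaries, rename the grouping column in each, and bind them together with purrr::map_df. Finally, I want to reorder the rows by the "Cut" column according to another vector called "Order". The function works until I try to incorporate the row reordering...

This may not make sense for this data, but it is only an example; it does make sense for my real data.

Anyway, the code below works...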

Groups<-list("cut","color")

 Groups<-Groups%>%
 map_df(function(group){

     syms<-syms(group)

     diamonds%>%
         group_by(!!!syms)%>%
         summarise(Count=n())%>%
         set_names(c("Cut","Count"))
 })

Next, I want to reorder the rows based on the "Order" vector, and this works as well.

Order<-c("Good","Very Good","Premium","Ideal","Fair","E","F","G","D","H","J","I")

Groups%>%slice(match(Order, Cut))
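
For readers unfamiliar with the idiom: match(Order, Cut) gives, for each element of Order, the position of its first match in Cut, and slice() then keeps the rows at those positions in that order. A minimal sketch on a made-up tibble (the names toy, wanted, idx and reordered are hypothetical and only for illustration):

```r
library(dplyr)

# Hypothetical toy data, not taken from the diamonds dataset
toy <- tibble(Cut = c("a", "b", "c"), Count = c(10L, 20L, 30L))
wanted <- c("c", "a", "b")

# For each element of `wanted`, the row position of its first match in toy$Cut
idx <- match(wanted, toy$Cut)
idx
# [1] 3 1 2

# slice() keeps the rows at those positions, in that order
reordered <- toy %>% slice(idx)
reordered$Cut
# [1] "c" "a" "b"
```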

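However, this is where I am stuck. I am trying to do all of this inside one function, but it does not seem to work. I feel like I am missing something small...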

Fun<-function(df){

Order<-c("Good","Very Good","Premium","Ideal","Fair","E","F","G","D","H","J","I")

Groups<-list("cut","color")

 Groups<-Groups%>%
 map_df(function(group){

     syms<-syms(group)

     df%>%
         group_by(!!!syms)%>%
         summarise(Count=n())%>%
         set_names(c("Cut","Count"))%>%
         slice(match(Order,Cut))
return(df)
})
}

Here is another attempt...

Fun<-function(df){

Order<-c("Good","Very Good","Premium","Ideal","Fair","E","F","G","D","H","J","I")

Groups<-list("cut","color")

 Groups<-Groups%>%
 map_df(function(group){

     syms<-syms(group)

     df%>%
         group_by(!!!syms)%>%
         summarise(Count=n())%>%
         set_names(c("Cut","Count"))

df<-df%>%slice(match(Order,Cut))
return(df)
})
}

What am I missing here?

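Your first attempt at Fun is close. Two things keep it from giving the result you expect: the last statement assigns the result of map_df to the Groups variable, so the function only returns it invisibly, and return(df) inside the anonymous function returns the untouched input df instead of the summarised table. Try the following:

Fun<-function(df){

Order<-c("Good","Very Good","Premium","Ideal","Fair","E","F","G","D","H","J","I")

Groups<-list("cut","color")

 # no assignment here, so the bound result is returned visibly
 Groups%>%
 map_df(function(group){

     syms<-syms(group)

     # the pipeline is the last expression, so its value is what map_df binds
     df%>%
         group_by(!!!syms)%>%
         summarise(Count=n())%>%
         set_names(c("Cut","Count"))%>%
         slice(match(Order,Cut))
 })
}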

Fun(diamonds)
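
The point about the result being assigned rather than returned can be seen in isolation. In R, a function whose last expression is an assignment still returns the assigned value, but invisibly, so nothing is printed at the console. A small sketch with two hypothetical helper functions (f_assign and f_plain are made up for this illustration):

```r
# Hypothetical helpers, only to illustrate the return behaviour
f_assign <- function() {
  x <- 1 + 1   # last expression is an assignment: value is returned invisibly
}

f_plain <- function() {
  1 + 1        # last expression is a plain value: returned visibly
}

f_assign()     # prints nothing at the console
f_plain()      # prints [1] 2

# The value is still available if you capture it
res <- f_assign()
res
# [1] 2
```

So the original function did compute something; calling it as x <- Fun(diamonds) or wrapping the call in parentheses would have shown the returned value.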

A possible correction to the problem. For simplicity, I created a temp_df variable and returned it.

Fun<-function(df){

  Order<-c("Good","Very Good","Premium","Ideal","Fair","E","F","G","D","H","J","I")

  Groups<-list("cut","color")

  Groups<-Groups%>%
    map_df(function(group){

      syms<-syms(group)

      temp <- df%>%
        group_by(!!!syms)%>%
        summarise(Count=n())%>%
        set_names(c("Cut","Count"))
    })

  temp_df <- Groups%>%slice(match(Order, Cut))
  return(temp_df)
}

> x <- Fun(diamonds)
> x
# A tibble: 12 x 2
   Cut       Count
   <chr>     <int>
 1 Good       4906
 2 Very Good 12082
 3 Premium   13791
 4 Ideal     21551
 5 Fair       1610
 6 E          9797
 7 F          9542
 8 G         11292
 9 D          6775
10 H          8304
11 J          2808
12 I          5422

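We don't need to apply syms inside the loop. It can convert a vector/list of length greater than 1 into symbols. So, call syms once on the whole list, then loop over the resulting symbols with map and do the group_by on each symbol:
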
Fun<-function(df){

Order<-c("Good","Very Good","Premium","Ideal","Fair","E","F","G","D","H","J","I")

Groups<-list("cut","color")

Groups %>%
       syms %>%
       # .x is a single symbol here, so unquote it with !! rather than splicing with !!!
       map_df(~ df %>%
               group_by(!!.x) %>%
               summarise(Count=n()) %>%
               set_names(c("Cut","Count")) %>%
               slice(match(Order,Cut)) #%>%
               #mutate(Cut = as.character(Cut))
               #uncomment to avoid the factor-to-character coercion warning
      )
}

Fun(diamonds)
# A tibble: 12 x 2
#   Cut       Count
#   <chr>     <int>
# 1 Good       4906
# 2 Very Good 12082
# 3 Premium   13791
# 4 Ideal     21551
# 5 Fair       1610
# 6 E          9797
# 7 F          9542
# 8 G         11292
# 9 D          6775
#10 H          8304
#11 J          2808
#12 I          5422
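
As a quick check that syms() handles a whole list in one call, the sketch below converts the two group names at once and inspects the result (group_syms is a hypothetical name used only here):

```r
library(rlang)

# syms() takes a character vector or a list of strings
# and returns a list of symbols, one per element
group_syms <- syms(list("cut", "color"))

length(group_syms)
# [1] 2
is_symbol(group_syms[[1]])
# [1] TRUE
as_string(group_syms[[2]])
# [1] "color"
```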