Column-wise correspondence values in a data frame

Question

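Given a data frame, I need to take Age as the base and get the corresponding values from the other columns (Sport, Mode). Can someone help with R/Python code?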


Specifically, it would help if I could get, for age 15, a count of 2 for Baseball and 2 for Play; and for age 19, a count of 1 for Golf and 1 for Play.

The output should look like the following, with Age as the base variable.

Further, with Sport as the base variable, there should be a similar summary for Mode. Thanks.

Answer 1

df = data.frame(Age = c(15,15,16,17,18,18,19,20),
                Sport = c("Baseball","Baseball","Baseball","Baseball","Baseball","Golf","Golf","Golf"),
                Mode = c("Play","Play","Play","Watch","Watch","Play","Play","Watch"),
                stringsAsFactors = FALSE)

library(dplyr)
library(tidyr)

df %>%
  count(Age, Sport) %>%
  spread(Sport, n, fill = 0)

# # A tibble: 6 x 3
#     Age Baseball  Golf
# * <dbl>    <dbl> <dbl>
# 1    15        2     0
# 2    16        1     0
# 3    17        1     0
# 4    18        1     1
# 5    19        0     1
# 6    20        0     1


df %>%
  count(Age, Mode) %>%
  spread(Mode, n, fill = 0)

# # A tibble: 6 x 3
#     Age  Play Watch
# * <dbl> <dbl> <dbl>
# 1    15     2     0
# 2    16     1     0
# 3    17     0     1
# 4    18     1     1
# 5    19     1     0
# 6    20     0     1
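Since the question also asks for Python, here is a minimal pandas sketch of the same two tables. It is not part of the original answer; it assumes pandas is installed, rebuilds the same sample data, and uses `pd.crosstab` as the counterpart of `count()` followed by `spread()`. The variable names `sport_by_age` and `mode_by_age` are made up for illustration.

```python
import pandas as pd

# Same sample data as the R example above
df = pd.DataFrame({
    "Age": [15, 15, 16, 17, 18, 18, 19, 20],
    "Sport": ["Baseball", "Baseball", "Baseball", "Baseball",
              "Baseball", "Golf", "Golf", "Golf"],
    "Mode": ["Play", "Play", "Play", "Watch",
             "Watch", "Play", "Play", "Watch"],
})

# Rows are the distinct ages, columns are the distinct values of the
# second column, and each cell is a count (0 where a pair never occurs).
sport_by_age = pd.crosstab(df["Age"], df["Sport"])
mode_by_age = pd.crosstab(df["Age"], df["Mode"])

print(sport_by_age)
print(mode_by_age)
```

Unlike the R output, the result keeps Age as the index rather than as a regular column; calling `reset_index()` on either table would turn it back into a column.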

If you want to produce a single output, you can use this:

df = data.frame(Age = c(15,15,16,17,18,18,19,20),
                Sport = c("Baseball","Baseball","Baseball","Baseball","Baseball","Golf","Golf","Golf"),
                Mode = c("Play","Play","Play","Watch","Watch","Play","Play","Watch"),
                stringsAsFactors = FALSE)

library(dplyr)
library(tidyr)
library(purrr)

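# function that reshapes data based on a column name
# (uses Age column as an identifier/key)
# note: group_by_() and spread_() are deprecated, so the column name held
# in x is passed through all_of() and sym() instead
f = function(x) {
  df %>%
    group_by(across(all_of(c("Age", x)))) %>%   # group by Age and the chosen column
    summarise(n = n()) %>%                      # count rows per group
    spread(!!sym(x), n, fill = 0) %>%           # one column per distinct value
    ungroup()
}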


names(df)[names(df) != "Age"] %>%   # get all column names (different than Age)
  map(f) %>%                        # apply function to each column name
  reduce(left_join, by="Age")       # join datasets sequentially

# # A tibble: 6 x 5
#     Age Baseball  Golf  Play Watch
#   <dbl>    <dbl> <dbl> <dbl> <dbl>
# 1    15        2     0     2     0
# 2    16        1     0     1     0
# 3    17        1     0     0     1
# 4    18        1     1     1     1
# 5    19        0     1     1     0
# 6    20        0     1     0     1
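A possible pandas counterpart of the combined table, again only a sketch that assumes pandas is installed. The helper `summarize_by` and its `columns` parameter are hypothetical names introduced here, not part of any library. The second call covers the last part of the question, where Sport is the base variable and Mode is summarised.

```python
import pandas as pd

df = pd.DataFrame({
    "Age": [15, 15, 16, 17, 18, 18, 19, 20],
    "Sport": ["Baseball", "Baseball", "Baseball", "Baseball",
              "Baseball", "Golf", "Golf", "Golf"],
    "Mode": ["Play", "Play", "Play", "Watch",
             "Watch", "Play", "Play", "Watch"],
})


def summarize_by(frame, key, columns=None):
    """Count the values of each column in `columns` per value of `key`.

    Hypothetical helper: if `columns` is None, every column except
    `key` is summarised.
    """
    if columns is None:
        columns = [c for c in frame.columns if c != key]
    # One count table per column; all of them are indexed by the key values.
    tables = [pd.crosstab(frame[key], frame[c]) for c in columns]
    # Side-by-side concat aligns the tables on that shared index.
    # Caveat: if two columns share a value, the result has duplicate names.
    wide = pd.concat(tables, axis=1)
    wide.columns.name = None
    return wide.reset_index()


by_age = summarize_by(df, "Age")                         # Age as base variable
by_sport = summarize_by(df, "Sport", columns=["Mode"])   # Sport as base variable

print(by_age)
print(by_sport)
```

Concatenating on the shared index plays the role of the `reduce(left_join, by="Age")` step in the R version, because every count table has exactly one row per distinct key value.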

Tags: python, r, data-science