Finding Percentage Based on Two Columns
This is what the original data frame looks like:
PLACEMENT SIZE COST
1 placement1 LARGE 1838128.00
58 placement1 MEDIUM 10962048.00
117 placement1 SMALL 2622851.00
175 placement1 UNKNOWN 443.00
2 placement2 LARGE 598.00
59 placement2 MEDIUM 24358.00
118 placement2 SMALL 571802.00
176 placement2 UNKNOWN 1706.00
3 placement3 LARGE 8.00
60 placement3 MEDIUM 22.00
119 placement3 SMALL 502388.00
177 placement3 UNKNOWN 762.00
How can I create a column that shows the percentage of each SIZE within its PLACEMENT?
I would like it to end up looking like this:
PLACEMENT SIZE COST PERCENTAGE
1 placement1 LARGE 1838128.00 11.9
58 placement1 MEDIUM 10962048.00 71.1
117 placement1 SMALL 2622851.00 17.0
175 placement1 UNKNOWN 443.00 0.0
2 placement2 LARGE 598.00 0.1
59 placement2 MEDIUM 24358.00 4.07
118 placement2 SMALL 571802.00 95.54
176 placement2 UNKNOWN 1706.00 0.29
3 placement3 LARGE 8.00 0.0
60 placement3 MEDIUM 22.00 0.0
119 placement3 SMALL 502388.00 99.84
177 placement3 UNKNOWN 762.00 0.16
Any help would be great, thanks! I couldn't work this out with prop.table, although I feel I should be using it.
You can do this quickly with dplyr:
library(dplyr)
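df <- df %>% group_by(PLACEMENT) %>% mutate(PERCENTAGE = COST / sum(COST))
It looks like your desired result is also rounded; you can do that with the round() function if you like.
Edit: if you would rather have the percentages on a 0 to 100 scale, you can write 100 * COST / sum(COST) instead.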
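Putting the grouping, the 0 to 100 scaling, and the rounding together, a minimal sketch might look like the following. The toy data frame and the choice of two decimal places are assumptions for illustration; only the column names come from the question.

```r
library(dplyr)

# Hypothetical toy data using the same column names as in the question
df <- data.frame(
  PLACEMENT = c("placement1", "placement1", "placement2", "placement2"),
  SIZE      = c("LARGE", "SMALL", "LARGE", "SMALL"),
  COST      = c(30, 70, 5, 15)
)

res <- df %>%
  group_by(PLACEMENT) %>%                                  # one group per placement
  mutate(PERCENTAGE = round(100 * COST / sum(COST), 2)) %>% # share of the group total
  ungroup()                                                # drop grouping for later steps
```

Calling ungroup() at the end is optional, but it avoids later operations being silently applied per PLACEMENT.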
Assuming your input data frame is DF,
this will do it. No packages are needed.
transform(DF, PC = 100 * ave(COST, PLACEMENT, FUN = prop.table))
giving:
PLACEMENT SIZE COST PC
1 placement1 LARGE 1838128 11.917733169
58 placement1 MEDIUM 10962048 71.073811535
117 placement1 SMALL 2622851 17.005583050
175 placement1 UNKNOWN 443 0.002872246
2 placement2 LARGE 598 0.099922468
59 placement2 MEDIUM 24358 4.070086087
118 placement2 SMALL 571802 95.544928350
176 placement2 UNKNOWN 1706 0.285063095
3 placement3 LARGE 8 0.001589888
60 placement3 MEDIUM 22 0.004372193
119 placement3 SMALL 502388 99.842601057
177 placement3 UNKNOWN 762 0.151436862
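To see why this works: on a plain numeric vector, prop.table(x) is the same as x / sum(x), and ave() applies that function within each PLACEMENT group while keeping the original row order. Here is a small sketch of the same idea with the function spelled out, on hypothetical toy data (the helper name share and the rounding to two decimals are my own choices):

```r
# Hypothetical toy data using the same column names as in the question
DF <- data.frame(
  PLACEMENT = c("placement1", "placement1", "placement2", "placement2"),
  SIZE      = c("LARGE", "SMALL", "LARGE", "SMALL"),
  COST      = c(30, 70, 5, 15)
)

# Equivalent to prop.table() for a plain numeric vector
share <- function(x) x / sum(x)

# ave() evaluates share() separately for each PLACEMENT and returns
# a vector aligned with the rows of DF
DF$PC <- round(100 * ave(DF$COST, DF$PLACEMENT, FUN = share), 2)

# Sanity check: the percentages within each group should add up to 100
group_totals <- tapply(DF$PC, DF$PLACEMENT, sum)
```

With real data the rounded values may add up to slightly more or less than 100 per group, so run the check before rounding if you need it to be exact.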
Note: the input in reproducible form is:
Lines <- "PLACEMENT SIZE COST
1 placement1 LARGE 1838128.00
58 placement1 MEDIUM 10962048.00
117 placement1 SMALL 2622851.00
175 placement1 UNKNOWN 443.00
2 placement2 LARGE 598.00
59 placement2 MEDIUM 24358.00
118 placement2 SMALL 571802.00
176 placement2 UNKNOWN 1706.00
3 placement3 LARGE 8.00
60 placement3 MEDIUM 22.00
119 placement3 SMALL 502388.00
177 placement3 UNKNOWN 762.00"
DF <- read.table(text = Lines, header = TRUE)
Here is an option using data.table:
library(data.table)
setDT(df)[, PERCENTAGE := COST / sum(COST), by = PLACEMENT]
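Note that the line above gives fractions between 0 and 1. If you want rounded percentages as in the question, a sketch along these lines should work; the toy data and the two decimal places are assumptions for illustration.

```r
library(data.table)

# Hypothetical toy data using the same column names as in the question
df <- data.frame(
  PLACEMENT = c("placement1", "placement1", "placement2", "placement2"),
  SIZE      = c("LARGE", "SMALL", "LARGE", "SMALL"),
  COST      = c(30, 70, 5, 15)
)

setDT(df)  # converts df to a data.table in place (by reference)

# := adds the column by reference; by = PLACEMENT makes sum(COST)
# the total of each placement rather than of the whole table
df[, PERCENTAGE := round(100 * COST / sum(COST), 2), by = PLACEMENT]
```

Because := modifies df by reference, there is no need to assign the result back to df.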