R calculating yearly or normalized growth rates for different interval lengths
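I have a data frame structured as shown below. I want to calculate yearly growth rates. The problem is that the time step is not the same for all models. In the example below, REMIND provides data at 5-year intervals, while TIAM-ECN provides data at 10-year intervals.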
model scenario region year value
REMIND Base NORTH_AM 2010 314.1330
REMIND Base CHINA+ 2010 1325.9220
REMIND RefPol NORTH_AM 2010 314.1330
REMIND RefPol CHINA+ 2010 1325.9220
TIAM-ECN Base NORTH_AM 2010 344.4005
TIAM-ECN Base CHINA+ 2010 1341.3352
TIAM-ECN RefPol NORTH_AM 2010 344.4005
TIAM-ECN RefPol CHINA+ 2010 1341.3352
REMIND Base NORTH_AM 2015 327.6270
REMIND Base CHINA+ 2015 1354.3180
REMIND RefPol NORTH_AM 2015 327.6270
REMIND RefPol CHINA+ 2015 1354.3180
REMIND Base NORTH_AM 2020 340.8490
REMIND Base CHINA+ 2020 1372.4630
REMIND RefPol NORTH_AM 2020 340.8490
REMIND RefPol CHINA+ 2020 1372.4630
TIAM-ECN Base NORTH_AM 2020 374.2647
TIAM-ECN Base CHINA+ 2020 1387.7915
TIAM-ECN RefPol NORTH_AM 2020 374.2647
TIAM-ECN RefPol CHINA+ 2020 1387.7915
Calculating the growth rates over the differing time intervals is simple:
library(dplyr)
tmp_gr <- group_by(df, model, scenario, region) %>%
  mutate(value = value / lag(value) - 1) %>%
  ungroup()
This yields (I omitted the NAs for 2010):
model scenario region year value
REMIND Base NORTH_AM 2015 -0.7557456
REMIND Base CHINA+ 2015 3.1337191
REMIND RefPol NORTH_AM 2015 -0.7580871
REMIND RefPol CHINA+ 2015 3.1337191
REMIND Base NORTH_AM 2020 -0.7483242
REMIND Base CHINA+ 2020 3.0266012
REMIND RefPol NORTH_AM 2020 -0.7516516
REMIND RefPol CHINA+ 2020 3.0266012
TIAM-ECN Base NORTH_AM 2020 -0.7273044
TIAM-ECN Base CHINA+ 2020 2.7080483
TIAM-ECN RefPol NORTH_AM 2020 -0.7303164
TIAM-ECN RefPol CHINA+ 2020 2.7080483
But now, calculating the yearly growth rates by dividing the interval growth rate by the interval length:
tmp_gr_yearly <- group_by(df, model, scenario, region) %>%
  mutate(value = (value / lag(value) - 1) / (year - lag(year))) %>%
  ungroup()
yields:
model scenario region year value
REMIND Base NORTH_AM 2015 -0.1511491
REMIND Base CHINA+ 2015 Inf
REMIND RefPol NORTH_AM 2015 -Inf
REMIND RefPol CHINA+ 2015 Inf
REMIND Base NORTH_AM 2020 -0.1496648
REMIND Base CHINA+ 2020 Inf
REMIND RefPol NORTH_AM 2020 -Inf
REMIND RefPol CHINA+ 2020 Inf
TIAM-ECN Base NORTH_AM 2020 -Inf
TIAM-ECN Base CHINA+ 2020 Inf
TIAM-ECN RefPol NORTH_AM 2020 -Inf
TIAM-ECN RefPol CHINA+ 2020 Inf
I don't understand where the Inf values come from.
Any ideas?
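A hedged observation on the output above: the numbers are consistent with lag() having run over the whole data frame in row order rather than within each group. For example, 327.6270 / 1341.3352 - 1 is about -0.7557456, where 1341.3352 is the TIAM-ECN RefPol CHINA+ value for 2010, the row directly above the first 2015 row. Neighbouring rows then mostly share the same year, so year - lag(year) is 0, and a non-zero numerator divided by 0 gives Inf or -Inf in R. Why the grouping had no effect is not stated in the post; one possible cause (an assumption) is another package's mutate masking dplyr's. The sketch below reproduces the effect by deliberately omitting group_by() on three rows taken from the data; the column names dt, interval_rate and yearly_rate are illustrative only.

```r
library(dplyr)

# Three consecutive rows from the posted data, in the posted order:
# the last 2010 row followed by the first two 2015 rows.
toy <- data.frame(
  model    = c("TIAM-ECN", "REMIND", "REMIND"),
  scenario = c("RefPol", "Base", "Base"),
  region   = c("CHINA+", "NORTH_AM", "CHINA+"),
  year     = c(2010, 2015, 2015),
  value    = c(1341.3352, 327.6270, 1354.3180)
)

# No group_by() on purpose: this mimics lag() running over the whole frame.
res <- toy %>%
  mutate(
    dt            = year - dplyr::lag(year),         # 5, then 0 (same year twice)
    interval_rate = value / dplyr::lag(value) - 1,   # mixes models and regions
    yearly_rate   = interval_rate / dt               # non-zero / 0 gives Inf
  )

print(res)
```

The second row reproduces the posted -0.7557456 and -0.1511491, and the third row reproduces 3.1337191 and Inf.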
My example for calculating the simple unnormalized growth rates was already wrong.
Anyway, I think I figured it out myself:
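# Interval growth rate; order_by = year makes lag() pick the previous value by year
tmp_gr <- group_by(df, model, scenario, region) %>%
  mutate(value = lag(value, n = 0, order_by = year) / lag(value, order_by = year) - 1) %>%
  ungroup()
# Yearly growth rate: interval growth divided by the interval length in years
tmp_gr_yearly <- group_by(df, model, scenario, region) %>%
  mutate(value = (lag(value, n = 0, order_by = year) / lag(value, order_by = year) - 1) /
           (lag(year, n = 0, order_by = year) - lag(year, order_by = year))) %>%
  ungroup()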
By using the lag operator on all values and specifying the order explicitly, the whole thing becomes robust to unordered data.
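As an alternative sketch (not the code above, and resting on the assumption that dplyr's own group_by() and mutate() are the ones in effect), the rows can be sorted by year within each group before a plain lag is applied. The example uses the Base / NORTH_AM rows from the data, deliberately shuffled; the column names dt and growth are illustrative.

```r
library(dplyr)

# Base / NORTH_AM rows from the posted data, deliberately out of order.
df <- data.frame(
  model    = c("REMIND", "TIAM-ECN", "REMIND", "TIAM-ECN", "REMIND"),
  scenario = "Base",
  region   = "NORTH_AM",
  year     = c(2020, 2020, 2010, 2010, 2015),
  value    = c(340.8490, 374.2647, 314.1330, 344.4005, 327.6270)
)

tmp_gr_yearly <- df %>%
  group_by(model, scenario, region) %>%
  arrange(year, .by_group = TRUE) %>%                 # sort by year inside each group
  mutate(
    dt     = year - dplyr::lag(year),                 # 5 for REMIND, 10 for TIAM-ECN
    growth = (value / dplyr::lag(value) - 1) / dt     # interval growth per year
  ) %>%
  ungroup()

print(tmp_gr_yearly)
```

The first year of each model has no predecessor, so dt and growth are NA there. Keeping dt as its own column makes it easy to spot a zero or unexpected interval length before dividing by it.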