Calculating price returns within groups in a pandas DataFrame
I have a DataFrame df containing the following information:
DateTime MDate Fwd Type
1/4/2010 2/1/2010 61.17 A
1/5/2010 2/1/2010 59.73 A
1/6/2010 2/1/2010 62.2 A
1/7/2010 2/1/2010 61.1 A
1/8/2010 2/1/2010 60.25 A
1/11/2010 2/1/2010 57.12 A
1/12/2010 2/1/2010 57.35 A
1/13/2010 2/1/2010 58.12 B
1/14/2010 2/1/2010 57.12 B
1/15/2010 2/1/2010 59.38 B
8/1/2013 5/1/2014 57.67 B
8/2/2013 5/1/2014 57.25 B
8/3/2013 5/1/2014 57.9 B
8/4/2013 5/1/2014 59.25 B
8/5/2013 5/1/2014 57.67 B
I would like to create the following:
DateTime MDate Fwd Type pctChange
1/4/2010 2/1/2010 61.17 A
1/5/2010 2/1/2010 59.73 A (0.02)
1/6/2010 2/1/2010 62.2 A 0.04
1/7/2010 2/1/2010 61.1 A (0.02)
1/8/2010 2/1/2010 60.25 A (0.01)
1/11/2010 2/1/2010 57.12 A (0.05)
1/12/2010 2/1/2010 57.35 A 0.00
1/13/2010 2/1/2010 58.12 B
1/14/2010 2/1/2010 57.12 B (0.02)
1/15/2010 2/1/2010 59.38 B 0.04
8/1/2013 5/1/2014 57.67 B
8/2/2013 5/1/2014 57.25 B (0.01)
8/3/2013 5/1/2014 57.9 B 0.01
8/4/2013 5/1/2014 59.25 B 0.02
8/5/2013 5/1/2014 57.67 B (0.03)
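I want to isolate each time series by the group (MDate, Type) and compute pctChange within it.
So, in my example above, the first group looks like this. MDate and Type are the same for all of its rows: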
DateTime MDate Fwd Type pctChange
1/4/2010 2/1/2010 61.17 A
1/5/2010 2/1/2010 59.73 A (0.02)
1/6/2010 2/1/2010 62.2 A 0.04
1/7/2010 2/1/2010 61.1 A (0.02)
1/8/2010 2/1/2010 60.25 A (0.01)
1/11/2010 2/1/2010 57.12 A (0.05)
1/12/2010 2/1/2010 57.35 A 0.00
I compute pctChange as the current Fwd divided by the previous Fwd in the same group, minus 1. For the second row: 59.73/61.17 - 1 = (0.02)
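To make the arithmetic concrete, here is a minimal sketch for the first group only. The Fwd values are copied from the table above; the variable names (fwd, by_hand, builtin) are just illustrative. It suggests that the hand calculation and pandas' pct_change give the same numbers.

```python
import pandas as pd

# Fwd prices of the first group (MDate 2/1/2010, Type A), copied from the table above.
fwd = pd.Series([61.17, 59.73, 62.2, 61.1, 60.25, 57.12, 57.35])

# Return of each row relative to the previous row: p_t / p_(t-1) - 1.
by_hand = fwd / fwd.shift(1) - 1

# pct_change computes the same quantity; the first row has no predecessor, so it is NaN.
builtin = fwd.pct_change()

print(by_hand.round(4).tolist())
print(builtin.round(4).tolist())
```

The first entry is NaN in both cases, which corresponds to the blank pctChange cell on the first row of each group.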
I am thinking of implementing some version of the following:
import pandas as pd
df2 = pd.pivot_table(df, index=['MDate', 'Type'], values=['Fwd'], aggfunc=someFunction)
I don't know what function I could implement for someFunction.
This should do it:
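# Parse both date columns. The sample data is month/day/year (1/4/2010 is 2010-01-04 in the output below).
# infer_datetime_format is deprecated in pandas 2.x, so the format is given explicitly.
df[['MDate', 'DateTime']] = df[['MDate', 'DateTime']].apply(lambda x: pd.to_datetime(x, format='%m/%d/%Y'))
# Percentage change of Fwd within each (MDate, Type) group; the first row of each group is NaN.
pct = df.groupby(['MDate', 'Type'])['Fwd'].pct_change()
# Format for display: blank for NaN, negatives in parentheses, two decimals.
df['pctChange'] = pct.apply(lambda x: '' if pd.isna(x) else '({0:.2f})'.format(-x) if x < 0 else '{0:.2f}'.format(x))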
df
# DateTime Fwd MDate Type pctChange
#0 2010-01-04 61.17 2010-02-01 A
#1 2010-01-05 59.73 2010-02-01 A (0.02)
#2 2010-01-06 62.20 2010-02-01 A 0.04
#3 2010-01-07 61.10 2010-02-01 A (0.02)
#4 2010-01-08 60.25 2010-02-01 A (0.01)
#5 2010-01-11 57.12 2010-02-01 A (0.05)
#6 2010-01-12 57.35 2010-02-01 A 0.00
#7 2010-01-13 58.12 2010-02-01 B
#8 2010-01-14 57.12 2010-02-01 B (0.02)
#9 2010-01-15 59.38 2010-02-01 B 0.04
#10 2013-08-01 57.67 2014-05-01 B
#11 2013-08-02 57.25 2014-05-01 B (0.01)
#12 2013-08-03 57.90 2014-05-01 B 0.01
#13 2013-08-04 59.25 2014-05-01 B 0.02
#14 2013-08-05 57.67 2014-05-01 B (0.03)
The first line converts MDate and DateTime to datetime, because I wasn't sure they were already in the right format.
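If you want to keep working with the returns as numbers, a variation on the above might look like the sketch below. It rebuilds the sample data from the question so it runs on its own. The helper name accounting and the extra column pctChangeText are my own, not part of pandas, and sorting by DateTime inside each group is an assumption about what "previous row" should mean.

```python
import pandas as pd

# Sample data copied from the question.
rows = [
    ("1/4/2010", "2/1/2010", 61.17, "A"),
    ("1/5/2010", "2/1/2010", 59.73, "A"),
    ("1/6/2010", "2/1/2010", 62.2, "A"),
    ("1/7/2010", "2/1/2010", 61.1, "A"),
    ("1/8/2010", "2/1/2010", 60.25, "A"),
    ("1/11/2010", "2/1/2010", 57.12, "A"),
    ("1/12/2010", "2/1/2010", 57.35, "A"),
    ("1/13/2010", "2/1/2010", 58.12, "B"),
    ("1/14/2010", "2/1/2010", 57.12, "B"),
    ("1/15/2010", "2/1/2010", 59.38, "B"),
    ("8/1/2013", "5/1/2014", 57.67, "B"),
    ("8/2/2013", "5/1/2014", 57.25, "B"),
    ("8/3/2013", "5/1/2014", 57.9, "B"),
    ("8/4/2013", "5/1/2014", 59.25, "B"),
    ("8/5/2013", "5/1/2014", 57.67, "B"),
]
df = pd.DataFrame(rows, columns=["DateTime", "MDate", "Fwd", "Type"])
for col in ["DateTime", "MDate"]:
    df[col] = pd.to_datetime(df[col], format="%m/%d/%Y")

# Assumption: returns should follow date order within each group, so sort first.
df = df.sort_values(["MDate", "Type", "DateTime"]).reset_index(drop=True)

# Numeric return within each (MDate, Type) group; NaN on the first row of a group.
df["pctChange"] = df.groupby(["MDate", "Type"])["Fwd"].pct_change()


def accounting(x):
    """Hypothetical display helper: blank for NaN, parentheses for negatives."""
    if pd.isna(x):
        return ""
    return f"({-x:.2f})" if x < 0 else f"{x:.2f}"


# Keep the numeric column for further calculations and add a text column for display.
df["pctChangeText"] = df["pctChange"].map(accounting)
print(df)
```

Keeping pctChange numeric means you can still aggregate or compound it later, while pctChangeText carries the parenthesized formatting from the question.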