How can I look up values in df_s_t based on values in the rows of df and save the results in df['s_t']?
I have the following DataFrame (df):
print(df.head())
Date Contract_Name Maturity ... Call_Put Option_Price t
0 2016-01-04 Aalberts Industries 2017-10-20 ... C 12.29 0.049315
1 2016-01-05 Aalberts Industries 2017-10-20 ... P 0.01 0.049315
2 2016-01-06 Aalberts Industries 2017-10-20 ... C 11.29 0.049315
3 2016-01-04 WOLTERS-KLUWER 2017-10-20 ... P 0.01 0.049315
4 2016-01-05 WOLTERS-KLUWER 2017-10-20 ... C 9.29 0.049315
I would like to add a column df['s_t'], which needs data from df_s_t. That DataFrame looks like this:
print(df_s_t.head())
Date Aalberts Industries ... UNILEVER WOLTERS-KLUWER
0 2016-01-04 30.125 ... 38.785 30.150
1 2016-01-05 30.095 ... 39.255 30.425
2 2016-01-06 29.405 ... 38.575 29.920
3 2016-01-07 29.005 ... 37.980 30.690
4 2016-01-08 28.930 ... 37.320 30.070
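df['Date'] can be matched with df_s_t['Date'], and df['Contract_Name'] can be matched with the column names of df_s_t.
I hope someone can help me create df['s_t'] from the values in df_s_t, as described above. See the example of the desired df below: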
print(df.head())
Date Contract_Name Maturity ... Call_Put Option_Price t s_t
0 2016-01-04 Aalberts Industries 2017-10-20 ... C 12.29 0.049315 30.125
1 2016-01-05 Aalberts Industries 2017-10-20 ... P 0.01 0.049315 30.095
2 2016-01-06 Aalberts Industries 2017-10-20 ... C 11.29 0.049315 29.405
3 2016-01-04 WOLTERS-KLUWER 2017-10-20 ... P 0.01 0.049315 30.150
4 2016-01-05 WOLTERS-KLUWER 2017-10-20 ... C 9.29 0.049315 30.425
Solution
# Reshape df_s_t from wide to long: one row per (Date, Contract_Name) pair
df_s_t = pd.melt(df_s_t, id_vars=['Date'])
df_s_t = df_s_t.rename(columns={'variable': 'Contract_Name'})
print(df_s_t.head())
Date Contract_Name value
0 2016-01-04 Aalberts Industries 30.125
1 2016-01-05 Aalberts Industries 30.095
2 2016-01-06 Aalberts Industries 29.405
3 2016-01-07 Aalberts Industries 29.005
4 2016-01-08 Aalberts Industries 28.93
Now we can use merge:
# A left merge keeps every row of df and attaches the matching price
df = pd.merge(df, df_s_t, on=['Date', 'Contract_Name'], how='left')
df = df.rename(columns={'value': 's_t'})
print(df.head())
Date Contract_Name Maturity ... Option_Price t s_t
0 2017-10-02 Aalberts Industries 2017-10-20 ... 12.29 0.049315 41.29
1 2017-10-02 Aalberts Industries 2017-10-20 ... 0.01 0.049315 41.29
2 2017-10-02 Aalberts Industries 2017-10-20 ... 11.29 0.049315 41.29
3 2017-10-02 Aalberts Industries 2017-10-20 ... 0.01 0.049315 41.29
4 2017-10-02 Aalberts Industries 2017-10-20 ... 9.29 0.049315 41.29
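The melt-then-merge approach above can be tried end to end on the sample rows from the question. This is only a sketch: the two frames are rebuilt by hand from the printed output (only a few columns are kept), the names long_prices and result are made up for illustration, and converting both Date columns with pd.to_datetime is an assumption added so that the merge keys share one dtype.

```python
import pandas as pd

# Sample rows copied from the question (only the columns needed here).
df = pd.DataFrame({
    "Date": ["2016-01-04", "2016-01-05", "2016-01-06", "2016-01-04", "2016-01-05"],
    "Contract_Name": ["Aalberts Industries"] * 3 + ["WOLTERS-KLUWER"] * 2,
    "Option_Price": [12.29, 0.01, 11.29, 0.01, 9.29],
})
df_s_t = pd.DataFrame({
    "Date": ["2016-01-04", "2016-01-05", "2016-01-06"],
    "Aalberts Industries": [30.125, 30.095, 29.405],
    "WOLTERS-KLUWER": [30.150, 30.425, 29.920],
})

# Assumption: make sure both Date columns have the same dtype before merging.
df["Date"] = pd.to_datetime(df["Date"])
df_s_t["Date"] = pd.to_datetime(df_s_t["Date"])

# Wide -> long: one row per (Date, Contract_Name), with the price in 's_t'.
long_prices = df_s_t.melt(id_vars="Date", var_name="Contract_Name", value_name="s_t")

# Left merge keeps every option row and attaches the matching price.
result = df.merge(long_prices, on=["Date", "Contract_Name"], how="left")
print(result)
```

Passing var_name and value_name to melt names the new columns directly, so the two rename calls are not needed.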
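Here is a solution.
1) I simplified your data: df1 has only 2 columns (Date and Contract_Name) and df2 has only 4 columns (Date / A / B / C)
2) I melted df2 (the variable column is named 'Contract_Name'), then grouped by Date and Contract_Name
3) I merged the two DataFrames
4) Print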
import pandas as pd
df1 = pd.read_excel('Book1.xlsx', sheet_name='df1')
# Melt df2 from wide to long, then sum over any repeated (Date, Contract_Name) pairs
df2 = (
    pd.melt(pd.read_excel('Book1.xlsx', sheet_name='df2'),
            id_vars=['Date'], var_name='Contract_Name', value_name='Value')
    .groupby(['Date', 'Contract_Name']).sum().reset_index()
)
df = pd.merge(df1, df2, how='left', on=['Date', 'Contract_Name'])
print(df)
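Book1.xlsx is not included with the answer, so here is a rough in-memory version of the same steps. The values in df1 and df2_wide are invented for illustration, and the names df2_wide and merged are hypothetical. Note that the groupby(...).sum() step only changes anything when a (Date, Contract_Name) pair appears more than once, and summing prices is probably not what you want in that case. As an alternative, this sketch checks that the pairs are unique and lets merge enforce it through its validate argument.

```python
import pandas as pd

# Stand-ins for the two Excel sheets; all values are made up.
df1 = pd.DataFrame({
    "Date": ["2016-01-04", "2016-01-05", "2016-01-04"],
    "Contract_Name": ["A", "A", "C"],
})
df2_wide = pd.DataFrame({
    "Date": ["2016-01-04", "2016-01-05"],
    "A": [1.0, 2.0],
    "B": [3.0, 4.0],
    "C": [5.0, 6.0],
})

# Wide -> long, same melt call as in the answer.
df2 = df2_wide.melt(id_vars="Date", var_name="Contract_Name", value_name="Value")

# Each (Date, Contract_Name) pair should appear exactly once in the lookup table.
assert not df2.duplicated(["Date", "Contract_Name"]).any()

# validate="many_to_one" raises a MergeError if the right-hand keys are not unique.
merged = df1.merge(df2, how="left", on=["Date", "Contract_Name"], validate="many_to_one")
print(merged)
```

If duplicates are expected in the real data, deciding explicitly how to resolve them (for example keeping the last quote per day) is likely safer than summing.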