I have 12 pandas DataFrames. I want to extract one column from each, combine them into a new DataFrame, and rename each column after its source DataFrame

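**I need to extract the "adj close" column from every DataFrame into a new DataFrame, with each column renamed after its source and the rows matched by date.

new_DF = date AAL AAPL ALK ..... (containing adj close). Please help.**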

AAL = pd.read_csv("AAL.csv")

AAPL = pd.read_csv("AAPL.csv")

ALK = pd.read_csv("ALK.csv")

AMZN = pd.read_csv("AMZN.csv")

BHC = pd.read_csv("BHC.csv")

CS = pd.read_csv("CS.csv")

DB = pd.read_csv("DB.csv")

GS = pd.read_csv("GS.csv")

GOOG = pd.read_csv("GOOG.csv")

HA = pd.read_csv("HA.csv")

JNJ = pd.read_csv("JNJ.csv")

MRK = pd.read_csv("MRK.csv")

SP500 = pd.read_csv("S&P500.csv")

df = date | open | high | low | close | adj close | volume

Try this:

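import glob
import os

import pandas as pd

# define the path to the folder containing the csv files
files_folder = '/path/to/csv/'

# collect the selected column from each csv file
df_list = []
for file in glob.glob(os.path.join(files_folder, '*.csv')):
    df = pd.read_csv(file)
    # name the column after the file, e.g. '/path/to/csv/AAPL.csv' -> 'AAPL'
    name = os.path.splitext(os.path.basename(file))[0]
    # write here the column you want to select;
    # selecting one column gives a Series, so rename() takes the new name directly
    df_column = df['column_name'].rename(name)
    df_list.append(df_column)
# concatenate the list of columns into one dataframe
df_final = pd.concat(df_list, axis=1)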
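
Note that pd.concat with axis=1 lines rows up by index, so with the default integer index the files are matched by row position rather than by date. A minimal sketch of a date-aligned variant follows. It assumes each file has a 'Date' column and an 'Adj Close' column (the exact header spelling in your files may differ), and the helper name collect_column, the tickers AAA and BBB, and the numbers are made up for illustration:

```python
import os
import tempfile

import pandas as pd


def collect_column(folder, tickers, column="Adj Close", date_column="Date"):
    """Return one DataFrame holding `column` from each <ticker>.csv, aligned on date."""
    series = []
    for ticker in tickers:
        path = os.path.join(folder, f"{ticker}.csv")
        # index_col makes the date the index, so concat aligns rows by date
        df = pd.read_csv(path, index_col=date_column, parse_dates=True)
        # a single column is a Series; rename() sets its name to the ticker
        series.append(df[column].rename(ticker))
    return pd.concat(series, axis=1).sort_index()


# tiny synthetic files (made-up numbers) just to exercise the helper
folder = tempfile.mkdtemp()
pd.DataFrame(
    {"Date": ["2022-03-30", "2022-03-31"], "Adj Close": [1.0, 2.0]}
).to_csv(os.path.join(folder, "AAA.csv"), index=False)
pd.DataFrame(
    {"Date": ["2022-03-31", "2022-04-01"], "Adj Close": [3.0, 4.0]}
).to_csv(os.path.join(folder, "BBB.csv"), index=False)

new_df = collect_column(folder, ["AAA", "BBB"])
print(new_df)
```

Dates that are missing from one file show up as NaN in that file's column instead of shifting the other rows.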

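Here is an example. Not having your .csv files means we need to get creative about how to obtain the data, but assume you have a dict of DataFrames, one for each ticker.

Here we use Yahoo Finance to get similar data. The column we are looking for ('adj close') is not in that data, so for this example we will use Close instead.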

import yfinance as yf

tickers = ['AAL', 'AAPL', 'AMZN', 'GOOG']
sources = {ticker: yf.Ticker(ticker).history(period='5d') for ticker in tickers}

At this point, we have the data for each ticker. For example:

>>> sources['AAPL']
            Open        High        Low         Close       Volume     Dividends  Stock Splits
Date                                                                                          
2022-03-30  178.550003  179.610001  176.699997  177.770004   92633200  0          0           
2022-03-31  177.839996  178.029999  174.399994  174.610001  103049300  0          0           
2022-04-01  174.029999  174.880005  171.940002  174.309998   78699800  0          0           
2022-04-04  174.570007  178.490005  174.440002  178.440002   76468400  0          0           
2022-04-05  177.500000  178.300003  174.419998  175.059998   73311300  0          0           

In your case, you would get the data from your CSV files, so:

sources = {k: pd.read_csv(f'{k}.csv').set_index('Date') for k in tickers}

Now, to answer your question:

df = pd.concat([v['Close'].to_frame(k) for k, v in sources.items()], axis=1)

>>> df
                  AAL        AAPL         AMZN         GOOG
Date                                                       
2022-03-30  18.049999  177.770004  3326.020020  2852.889893
2022-03-31  18.250000  174.610001  3259.949951  2792.989990
2022-04-01  18.240000  174.309998  3271.199951  2814.000000
2022-04-04  18.230000  178.440002  3366.929932  2872.850098
2022-04-05  17.840000  175.059998  3281.100098  2821.260010

Again, in your case, you would select the 'adj close' column.
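
If the files do not all cover the same dates, the join argument of pd.concat decides what happens to the gaps. The sketch below is only an illustration: the CSV contents are made up, and the header 'Adj Close' is an assumption about how the column is spelled in your files.

```python
import io

import pandas as pd

# made-up CSV contents standing in for two of the files in the question
csv_text = {
    "AAL": "Date,Close,Adj Close\n2022-03-30,18.0,17.5\n2022-03-31,18.2,17.7\n",
    "AAPL": "Date,Close,Adj Close\n2022-03-31,174.6,174.0\n2022-04-01,174.3,173.8\n",
}
sources = {
    k: pd.read_csv(io.StringIO(text)).set_index("Date")
    for k, text in csv_text.items()
}

# default join='outer': keep every date, fill missing values with NaN
outer = pd.concat(
    [v["Adj Close"].to_frame(k) for k, v in sources.items()], axis=1
)

# join='inner': keep only the dates present in every source
inner = pd.concat(
    [v["Adj Close"].to_frame(k) for k, v in sources.items()], axis=1, join="inner"
)

print(outer)
print(inner)
```

The outer join is usually the safer default for price data, since it keeps every date and makes the gaps visible; the inner join is handy when later calculations need a value for every ticker on every row.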