Assign data to separate DataFrames, then combine and sort by year

I have 40 years of data, so I am trying to read each year into its own DataFrame, store them all in one new DataFrame, and then sort them. Here is what I have so far:

import pandas as pd
from pandas import DataFrame

year = 1976
count = 1

for i in range(0,40):

    df[count] = pd.read_excel('42003h'+str(year)+'.xlsx', sheet_name = 'Sheet1')

    count = count + 1
    year = 1976 + 1

I get this error:

Wrong number of items passed 12, placement implies 1

Any help, please?

I think you need to initialize your dictionary:

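df = {}  # a plain dict, so each df[count] can hold a whole DataFrame
for i in range(0, 40):
    df[count] = pd.read_excel('42003h' + str(year) + '.xlsx', sheet_name='Sheet1')
    count += 1  # year and count start at 1976 and 1, as in the question
    year += 1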
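
If you go the dictionary route, it may be handier to key it by year rather than by a running count, since the keys then give you the sort order for free. A minimal sketch, assuming every sheet has the same columns; the helper name `stack_in_year_order` and the small stand-in frames are hypothetical, and in practice each value would come from `pd.read_excel`:

```python
import pandas as pd


def stack_in_year_order(frames):
    """Stack a {year: DataFrame} dict row-wise, earliest year first."""
    # sorted() on a dict iterates over its keys, i.e. the years
    ordered = [frames[year] for year in sorted(frames)]
    # ignore_index=True gives the result a fresh 0..n-1 index
    return pd.concat(ordered, ignore_index=True)


# Stand-in data, deliberately inserted out of order
frames = {
    1977: pd.DataFrame({'value': [3, 4]}),
    1976: pd.DataFrame({'value': [1, 2]}),
}
result = stack_in_year_order(frames)
```

This way no separate sort step is needed, even if the sheets themselves carry no year column.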

I think you can first create a list of DataFrames, dfs, and then concat it into one df. count is not necessary. Last, if I understand correctly, use sort_values to sort by the column year:

import pandas as pd

year = 1976

dfs = []
for i in range(0,40):
    dfs.append(pd.read_excel('42003h'+str(year)+'.xlsx', sheet_name = 'Sheet1'))
    year += 1

# to concatenate by columns instead:
# df = pd.concat(dfs, axis=1)

# concatenate by rows
df = pd.concat(dfs)

# sort by the column `year`
df.sort_values(by='year', inplace=True)
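
Note that the loop in the question sets year = 1976 + 1 on every pass, so after the first file it would keep asking for 1977. One way to sidestep the manual counter is to build the file names straight from a range. A small sketch, assuming the files run over 40 consecutive years starting at 1976 (the end year 2015 is inferred from that, not stated in the question):

```python
FIRST_YEAR = 1976
N_YEARS = 40  # assumption: 40 consecutive years, as the question suggests

# range() stops before its end value, so this covers 1976..2015
filenames = [
    f'42003h{year}.xlsx'
    for year in range(FIRST_YEAR, FIRST_YEAR + N_YEARS)
]
```

Each name can then be passed to pd.read_excel inside the loop or a list comprehension.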

I would do it this way:

import glob
import pandas as pd

files = glob.glob('42003h*.xlsx')

# if you want to merge your DFs horizontally then add: `axis=1` parameter
df = pd.concat([pd.read_excel(f) for f in files], ignore_index=True).sort_values('year')

count = len(files)
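
Both sort_values calls above rely on the sheets already containing a year column. If they do not, the year can probably be recovered from the file name instead. A rough sketch under that assumption; the helpers `year_from_filename` and `tag_and_stack` are hypothetical names, and the stand-in frames take the place of what pd.read_excel would return:

```python
import re

import pandas as pd


def year_from_filename(name):
    """Pull the four-digit year out of a name like '42003h1976.xlsx'."""
    match = re.search(r'42003h(\d{4})\.xlsx$', name)
    if match is None:
        raise ValueError(f'no year found in {name!r}')
    return int(match.group(1))


def tag_and_stack(frames_by_name):
    """Add a 'year' column to each frame, stack them, and sort by year."""
    tagged = [
        frame.assign(year=year_from_filename(name))
        for name, frame in frames_by_name.items()
    ]
    combined = pd.concat(tagged, ignore_index=True)
    # mergesort is stable, so rows keep their original order within a year
    return combined.sort_values('year', kind='mergesort').reset_index(drop=True)


# Stand-in data; glob does not guarantee any particular file order
sample = {
    '42003h1977.xlsx': pd.DataFrame({'value': [3, 4]}),
    '42003h1976.xlsx': pd.DataFrame({'value': [1, 2]}),
}
combined = tag_and_stack(sample)
```

With real files, the dict could be built as {f: pd.read_excel(f) for f in files}.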