Reading a pickled file, unequal length error in pandas dataframe

Question

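I want to read a pickle file in Python 3.5. I am using the following code.

Below is my output, which I want to load as a pandas DataFrame.

When I try to convert it with df = pd.DataFrame(df), I get the following error.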

ValueError: arrays must all be same length

Link to the data: https://drive.google.com/file/d/1lSFBPLbUCluWfPjzolUZKmD98yelTSXt/view?usp=sharing

Answer 1

I think you need a dict comprehension with json_normalize and concat:

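import pickle

import pandas as pd
# In pandas 1.0 and later this function is also available as pd.json_normalize
from pandas.io.json import json_normalize

# Load the pickled dict (one entry per name)
with open("imdbnames40.pkl", 'rb') as fh:
    d = pickle.load(fh)

# Flatten each entry's 'scores' records, keep 'best' as a column, stack by key
df = pd.concat({k: json_normalize(v, 'scores', ['best']) for k, v in d.items()})
print(df.head())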

                         ethnicity score              best
'Aina Rapoza 0               Asian  0.89             Asian
             1      GreaterAfrican  0.05             Asian
             2     GreaterEuropean  0.06             Asian
             3  IndianSubContinent  0.11  GreaterEastAsian
             4    GreaterEastAsian  0.89  GreaterEastAsian
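
For context, here is a small sketch of why pd.DataFrame(d) probably fails and why concat does not. The data is a hypothetical stand-in (the names and values are made up, and the real pickle may be shaped differently); the assumption is only that different keys hold lists of different lengths. The exact wording of the ValueError message varies between pandas versions.

```python
import pandas as pd

# Hypothetical data: two names holding a different number of records each.
d = {
    "Alice Example": [{"best": "Asian"}, {"best": "GreaterEastAsian"}],
    "Bob Example": [{"best": "GreaterEuropean"}],
}

try:
    # Each key becomes a column, so the columns would need 2 and 1 rows.
    pd.DataFrame(d)
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)

# Building one frame per key and concatenating has no such constraint;
# the dict keys become the outer level of a MultiIndex.
df = pd.concat({k: pd.DataFrame(v) for k, v in d.items()})
print(df)
```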

Then, if you need the first level of the MultiIndex as a column:

df = df.reset_index(level=1, drop=True).rename_axis('names').reset_index()
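
If you want to try both steps without the pickle file, here is a minimal self-contained sketch. The dict below is hypothetical (made-up names and scores) and only assumes that each name maps to a list of records with a 'scores' list and a 'best' label, which is what the json_normalize call above implies. It uses pd.json_normalize, so it assumes pandas 1.0 or later.

```python
import pandas as pd

# Hypothetical stand-in for the unpickled dict.
d = {
    "Alice Example": [
        {"scores": [{"ethnicity": "Asian", "score": 0.89},
                    {"ethnicity": "GreaterAfrican", "score": 0.05},
                    {"ethnicity": "GreaterEuropean", "score": 0.06}],
         "best": "Asian"},
    ],
    "Bob Example": [
        {"scores": [{"ethnicity": "Asian", "score": 0.20},
                    {"ethnicity": "GreaterEuropean", "score": 0.80}],
         "best": "GreaterEuropean"},
    ],
}

# One frame per name: one row per entry in 'scores', with 'best' repeated.
# concat on a dict uses the keys as the outer index level.
df = pd.concat({k: pd.json_normalize(v, "scores", ["best"]) for k, v in d.items()})

# Drop the inner positional level, name the remaining level 'names',
# then turn it into an ordinary column.
flat = df.reset_index(level=1, drop=True).rename_axis("names").reset_index()
print(flat)
```

The result has one row per (name, ethnicity) pair, so names with different numbers of scores no longer need to line up.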

