Is there a way to prevent Pandas read_json (orient='split') from opportunistically converting a float64 column to int64?

Question

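I would like pandas' read_json to read a JSON file and interpret the columns the same way they were typed when the file was written with to_json. In the example below, the 'Dec' column has dtype float64 when written with to_json, and the JSON file shows the numbers as floats (2.0, 3.0, etc.). However, when the file is read back with read_json, the 'Dec' column in the resulting DataFrame ends up as int64. I want it to stay float64 even when the values all happen to be whole numbers. In case it matters, I am using orient='split'. Is there a way to do this? I am looking for a general approach that does not depend on specific column names, because in practice it needs to work for many different DataFrames.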

import pandas as pd

tmp_file = 'c:/Temp/in_df.json'
in_df = pd.DataFrame([['A', 2.0, 4], ['B', 3.0, 2], ['C', 4.0, 3]], columns=['Key', 'Dec', 'Num'])
dec_column_type_in = in_df['Dec'].dtype # float64
# to_json returns None when given a path, so there is nothing to assign
in_df.to_json(path_or_buf=tmp_file, orient='split', index=False)
out_df = pd.read_json(tmp_file, orient='split')
dec_column_type_out = out_df['Dec'].dtype # int64

Answer 1

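You can try turning off dtype inference manually. With dtype=False, read_json does not infer dtypes for the data, so a column whose values were written as floats stays float64: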

out_df = pd.read_json(tmp_file, orient='split', dtype=False)

print(out_df.dtypes)

Key     object
Dec    float64
Num      int64
dtype: object
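
For a version that can be run without a file on disk, here is a minimal sketch of the same round trip. It assumes an in-memory buffer (io.StringIO) in place of the tmp_file path from the question; the variable names json_text, inferred, and as_written are only illustrative.

```python
import io

import pandas as pd

# Same frame as in the question: 'Dec' holds whole-number floats.
in_df = pd.DataFrame(
    [['A', 2.0, 4], ['B', 3.0, 2], ['C', 4.0, 3]],
    columns=['Key', 'Dec', 'Num'],
)

# With no path given, to_json returns the JSON text as a string.
json_text = in_df.to_json(orient='split', index=False)

# Default behaviour: read_json infers dtypes for the data
# (the question reports 'Dec' coming back as int64 here).
inferred = pd.read_json(io.StringIO(json_text), orient='split')

# dtype=False: no dtype inference is applied to the data.
as_written = pd.read_json(io.StringIO(json_text), orient='split', dtype=False)

print(inferred.dtypes)
print(as_written.dtypes)
```

Since dtype=False applies to the whole frame rather than to named columns, it fits the requirement of not depending on specific column names.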

python

pandas