ArrowTypeError: ('Did not pass numpy.dtype object', 'Conversion failed for column X with type int32')

Question

I am trying to save a dataframe as a Parquet file on Databricks, and the write fails with an ArrowTypeError.

Databricks Runtime version: 7.6 ML (includes Apache Spark 3.0.1, Scala 2.12)

Log trace

ArrowTypeError: ('Did not pass numpy.dtype object', 'Conversion failed for column inv_yr with type int32')

The problem comes from using an old pyarrow wheel together with the recent numpy 1.20 release. You are running into the bug "PyArray_DescrCheck doesn't work anymore if the consumer library was compiled with an older NumPy version". Either update your pyarrow version or downgrade to numpy<1.20.
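To confirm that this mismatch applies to your cluster, it can help to print the installed versions first. The following is a minimal sketch, assuming Python 3.8+ (for `importlib.metadata`); the helper names `installed_version` and `is_numpy_1_20_or_newer` are made up for illustration and are not part of numpy or pyarrow.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def is_numpy_1_20_or_newer(version_string):
    """Compare only the leading major.minor part of a version string against 1.20."""
    match = re.match(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version_string!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (1, 20)


if __name__ == "__main__":
    numpy_version = installed_version("numpy")
    pyarrow_version = installed_version("pyarrow")
    print("numpy:", numpy_version)
    print("pyarrow:", pyarrow_version)
    if numpy_version is not None and is_numpy_1_20_or_newer(numpy_version):
        # Per the answer above: an old pyarrow wheel plus numpy >= 1.20 is the suspect combination.
        print("numpy is 1.20 or newer; upgrade pyarrow or pin numpy<1.20.")
```

On an older Python, reading `numpy.__version__` and `pyarrow.__version__` directly gives the same information. If the check reports numpy 1.20 or newer alongside an old pyarrow, either `pip install --upgrade pyarrow` or `pip install "numpy<1.20"` should apply the fix described above; restart the Python process afterwards so the new versions are picked up.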