AWS Sagemaker: AttributeError: module 'pandas' has no attribute 'core'

Question

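Let me preface this by saying that I'm new to TensorFlow, and even newer to AWS SageMaker.

I have some TensorFlow/Keras code that I wrote and tested in a local dockerized Jupyter notebook, and it runs fine. In it, I import a CSV file as input.

I used SageMaker to launch a Jupyter notebook instance with conda_tensorflow_p36. I modified the pandas.read_csv() call to point to my input file, which is now hosted in an S3 bucket.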

So I changed this code from

import pandas as pd

data = pd.read_csv("/input.csv", encoding="latin1")

to

import pandas as pd

data = pd.read_csv("https://s3.amazonaws.com/my-sagemaker-bucket/input.csv", encoding="latin1")

and I get this error:

AttributeError: module 'pandas' has no attribute 'core'

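I'm not sure whether this is a permissions issue. I read that as long as the bucket name contains the string "sagemaker", SageMaker should be able to access it.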

Answer 1

For example, you can pull the data from S3 with boto3 and hand the bytes to pandas:

import boto3
import io
import pandas as pd


# Set these to match your bucket and object key
bucket = '<bucket name>'
key = 'data/training/iris.csv'

# Pull the object from S3
s3 = boto3.client('s3')
f = s3.get_object(Bucket=bucket, Key=key)

# Read the downloaded bytes into a DataFrame
# (header=None because this example file has no header row)
df = pd.read_csv(io.BytesIO(f['Body'].read()), header=None)
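If you would rather keep the URL from the question as your starting point, here is a minimal sketch of the same approach. It assumes a path-style URL of the form https://s3.amazonaws.com/<bucket>/<key>, as in the question. The helper names split_s3_url and read_csv_from_s3 are made up for illustration and are not part of boto3 or pandas.

```python
import io
from urllib.parse import urlparse


def split_s3_url(url):
    """Split a path-style S3 URL into (bucket, key).

    Assumes the form https://s3.amazonaws.com/<bucket>/<key>.
    """
    path = urlparse(url).path.lstrip("/")
    bucket, _, key = path.partition("/")
    if not bucket or not key:
        raise ValueError(f"Expected a path-style S3 URL, got: {url!r}")
    return bucket, key


def read_csv_from_s3(url, **read_csv_kwargs):
    """Fetch the object with boto3 and parse it with pandas.read_csv."""
    # Imported here so split_s3_url works even without boto3/pandas installed.
    import boto3
    import pandas as pd

    bucket, key = split_s3_url(url)
    obj = boto3.client("s3").get_object(Bucket=bucket, Key=key)
    # Extra keyword arguments (e.g. encoding) are passed through to read_csv.
    return pd.read_csv(io.BytesIO(obj["Body"].read()), **read_csv_kwargs)
```

With that in place, the call from the question would become read_csv_from_s3("https://s3.amazonaws.com/my-sagemaker-bucket/input.csv", encoding="latin1"). The request then goes through boto3's credential lookup instead of a plain HTTPS fetch by pandas.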
