Error running a training job with AWS SageMaker
I am trying to use my own scikit-learn ML model with SageMaker, following a GitHub example.
The Python code is as follows:
# Define IAM role
import boto3
import re
import os
import numpy as np
import pandas as pd
from sagemaker import get_execution_role
import sagemaker as sage
from time import gmtime, strftime

role = get_execution_role()
sess = sage.Session()
account = sess.boto_session.client('sts').get_caller_identity()['Account']
region = sess.boto_session.region_name
image = '{}.dkr.ecr.{}.amazonaws.com/decision-trees-sample:latest'.format(account, region)
output_path = "s3://output"
sess
tree = sage.estimator.Estimator(image,
                                role, 1, 'ml.c4.2xlarge',
                                output_path='s3-eu-west-1.amazonaws.com/output',
                                sagemaker_session=sess)
tree.fit("s3://output/iris.csv")
But I get this error:
INFO:sagemaker:Creating training-job with name: decision-trees-sample-2018-04-24-13-13-38-281
---------------------------------------------------------------------------
ClientError                               Traceback (most recent call last)
in ()
     14 sagemaker_session=sess)
     15
---> 16 tree.fit("s3://inteldatastore-cyrine/iris.csv")

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/estimator.py in fit(self, inputs, wait, logs, job_name)
    161 self.output_path = 's3://{}/'.format(self.sagemaker_session.default_bucket())
    162
--> 163 self.latest_training_job = _TrainingJob.start_new(self, inputs)
    164 if wait:
    165 self.latest_training_job.wait(logs=logs)

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/estimator.py in start_new(cls, estimator, inputs)
    336 input_config=input_config, role=role, job_name=estimator._current_job_name,
    337 output_config=output_config, resource_config=resource_config,
--> 338 hyperparameters=hyperparameters, stop_condition=stop_condition)
    339
    340 return cls(estimator.sagemaker_session, estimator._current_job_name)

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/session.py in train(self, image, input_mode, input_config, role, job_name, output_config, resource_config, hyperparameters, stop_condition)
    242 LOGGER.info('Creating training-job with name: {}'.format(job_name))
    243 LOGGER.debug('train request: {}'.format(json.dumps(train_request, indent=4)))
--> 244 self.sagemaker_client.create_training_job(**train_request)
    245
    246 def create_model(self, name, role, primary_container):

~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
    312 "%s() only accepts keyword arguments." % py_operation_name)
    313 # The "self" in this scope is referring to the BaseClient.
--> 314 return self._make_api_call(operation_name, kwargs)
    315
    316 _api_call.name = str(py_operation_name)

~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
    610 error_code = parsed_response.get("Error", {}).get("Code")
    611 error_class = self.exceptions.from_code(error_code)
--> 612 raise error_class(parsed_response, operation_name)
    613 else:
    614 return parsed_response

ClientError: An error occurred (AccessDeniedException) when calling the CreateTrainingJob operation: User: arn:aws:sts::307504647302:assumed-role/default/SageMaker is not authorized to perform: sagemaker:CreateTrainingJob on resource: arn:aws:sagemaker:eu-west-1:307504647302:training-job/decision-trees-sample-2018-04-24-13-13-38-281
Can you help me solve this problem?
Thanks.
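It looks like you don't have access to the resource
arn:aws:sagemaker:eu-west-1:307504647302:training-job/decision-trees-sample-2018-04-24-13-13-38-281
Could you check that the resource URL is correct and that the appropriate permissions are set in the security group?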
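The error message names the IAM action `sagemaker:CreateTrainingJob`, so one permission worth checking is the IAM policy attached to the role the notebook runs as. Below is a minimal sketch of what such a policy document could look like. The function names, the policy name `AllowSageMakerTraining`, and the choice of actions are assumptions for illustration, not something taken from the question, and attaching the policy requires IAM permissions that the notebook role itself may not have.

```python
import json


def training_job_policy(region, account_id, job_name_pattern="*"):
    """Build an IAM policy document that allows creating and describing
    SageMaker training jobs (hypothetical helper, for illustration only)."""
    resource = "arn:aws:sagemaker:{}:{}:training-job/{}".format(
        region, account_id, job_name_pattern
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "sagemaker:CreateTrainingJob",
                    "sagemaker:DescribeTrainingJob",
                ],
                "Resource": resource,
            }
        ],
    }


def attach_inline_policy(role_name, policy, policy_name="AllowSageMakerTraining"):
    """Attach the policy to a role as an inline policy.

    Not called below: it needs AWS credentials that are allowed to edit IAM.
    """
    import boto3  # imported here so the rest of the sketch runs without boto3

    iam = boto3.client("iam")
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName=policy_name,
        PolicyDocument=json.dumps(policy),
    )


# Region and account ID are taken from the ARN in the error message.
policy = training_job_policy("eu-west-1", "307504647302")
print(json.dumps(policy, indent=2))
```

An administrator could review the printed document and, if it is appropriate, attach it to the role named in the error (`default`, assuming that is the role name). A real training job typically needs more than this, for example `iam:PassRole` on the execution role and access to the S3 buckets and ECR image involved.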
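I ran into similar problems when I started using SageMaker, so I developed this open-source project, sagify (https://github.com/Kenza-AI/sagify). It is a CLI tool that helps you train and deploy your own Machine Learning/Deep Learning models on SageMaker in a very simple way. I have managed to train and deploy all of my ML models regardless of the library I used (Keras, Tensorflow, scikit-learn, LightFM, spacy, etc.). Essentially, you specify all your dependencies in the classic Pythonic way, i.e. in requirements.txt, and sagify reads them and installs them in a Docker image. That Docker image can then be executed on SageMaker for training and deployment.
In addition, the sagify docs (https://kenza-ai.github.io/sagify/) describe a one-time process for setting up your AWS account to avoid permission-related problems.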
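You are probably using an AWS Educate account.
Currently, you cannot create training or modeling jobs with the SageMaker service from an AWS Educate Starter account.
For now, if you want to use SageMaker for training or deployment, you can use your own personal AWS account.
However, you can continue to use Jupyter notebooks through SageMaker with an AWS Educate account.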
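To see which identity the notebook is actually running as, the principal ARN in the error message can be split into its parts. The sketch below does this with a regular expression; the function name `parse_access_denied` is hypothetical, and the pattern only assumes the assumed-role message format shown in the question. The ARN alone does not say whether the account is an AWS Educate account, but it does show the account ID, role name, and session name to look up.

```python
import re

# The message text is copied from the error in the question.
ERROR_MESSAGE = (
    "User: arn:aws:sts::307504647302:assumed-role/default/SageMaker is not "
    "authorized to perform: sagemaker:CreateTrainingJob on resource: "
    "arn:aws:sagemaker:eu-west-1:307504647302:training-job/"
    "decision-trees-sample-2018-04-24-13-13-38-281"
)

# Assumed-role principals look like
# arn:aws:sts::<account>:assumed-role/<role-name>/<session-name>
ACCESS_DENIED_PATTERN = re.compile(
    r"User: arn:aws:sts::(?P<account>\d+):assumed-role/"
    r"(?P<role>[^/\s]+)/(?P<session>\S+) is not authorized to perform: "
    r"(?P<action>\S+) on resource: (?P<resource>\S+)"
)


def parse_access_denied(message):
    """Return the account, role, session, action and resource named in an
    AccessDeniedException message, or None if the format does not match."""
    match = ACCESS_DENIED_PATTERN.search(message)
    return match.groupdict() if match else None


info = parse_access_denied(ERROR_MESSAGE)
for key in ("account", "role", "session", "action", "resource"):
    print("{}: {}".format(key, info[key]))
```

With the role name in hand, the policies attached to that role can be inspected in the IAM console to confirm whether `sagemaker:CreateTrainingJob` is allowed or explicitly restricted.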