GKE ImagePullBackOff on gcr.io
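I am trying to set up my own container on GKE using gcr.io, but it keeps failing with ImagePullBackOff.
Thinking I was doing something wrong, I went back to the tutorial at https://cloud.google.com/kubernetes-engine/docs/tutorials/hello-app, followed every step, and got the same error. It looks like a credentials issue, yet I followed the tutorial exactly and still had no luck.
How do I debug this error? The logs do not seem to help.
Steps 1-4 of the tutorial work.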
kubectl run hello-web --image=gcr.io/${PROJECT_ID}/hello-app:v1 --port 8080
This fails with ImagePullBackOff.
I thought GKE and gcr.io handled credentials automatically.
What am I doing wrong, and how do I debug it?
kubectl describe pods hello-web-6444d588b7-tqgdm
Name:           hello-web-6444d588b7-tqgdm
Namespace:      default
Node:           gke-aia-default-pool-9ad6a2ee-j5g7/10.152.0.2
Start Time:     Sat, 27 Oct 2018 06:51:38 +1000
Labels:         pod-template-hash=2000814463
                run=hello-web
Annotations:    kubernetes.io/limit-ranger=LimitRanger plugin set: cpu request for container hello-web
Status:         Pending
IP:             10.12.2.5
Controlled By:  ReplicaSet/hello-web-6444d588b7
Containers:
  hello-web:
    Container ID:
    Image:          gcr.io/<project-id>/hello-app:v1
    Image ID:
    Port:           8080/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       ImagePullBackOff
    Ready:          False
    Restart Count:  0
    Requests:
      cpu:        100m
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-qgv8h (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          False
  PodScheduled   True
Volumes:
  default-token-qgv8h:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-qgv8h
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                 Age                 From                                         Message
  ----     ------                 ----                ----                                         -------
  Normal   Scheduled              45m                 default-scheduler                            Successfully assigned hello-web-6444d588b7-tqgdm to gke-aia-default-pool-9ad6a2ee-j5g7
  Normal   SuccessfulMountVolume  45m                 kubelet, gke-aia-default-pool-9ad6a2ee-j5g7  MountVolume.SetUp succeeded for volume "default-token-qgv8h"
  Normal   Pulling                44m (x4 over 45m)   kubelet, gke-aia-default-pool-9ad6a2ee-j5g7  pulling image "gcr.io/<project-id>/hello-app:v1"
  Warning  Failed                 44m (x4 over 45m)   kubelet, gke-aia-default-pool-9ad6a2ee-j5g7  Failed to pull image "gcr.io/<project-id>/hello-app:v1": rpc error: code = Unknown desc = Error response from daemon: repository gcr.io/<project-id>/hello-app not found: does not exist or no pull access
  Warning  Failed                 44m (x4 over 45m)   kubelet, gke-aia-default-pool-9ad6a2ee-j5g7  Error: ErrImagePull
  Normal   BackOff                5m (x168 over 45m)  kubelet, gke-aia-default-pool-9ad6a2ee-j5g7  Back-off pulling image "gcr.io/<project-id>/hello-app:v1"
  Warning  Failed                 48s (x189 over 45m) kubelet, gke-aia-default-pool-9ad6a2ee-j5g7  Error: ImagePullBackOff
Cluster permissions:
User info                    Disabled
Compute Engine               Read/Write
Storage                      Read Only
Task queue                   Disabled
BigQuery                     Disabled
Cloud SQL                    Disabled
Cloud Datastore              Disabled
Stackdriver Logging API      Write Only
Stackdriver Monitoring API   Full
Cloud Platform               Disabled
Bigtable Data                Disabled
Bigtable Admin               Disabled
Cloud Pub/Sub                Disabled
Service Control              Enabled
Service Management           Read Only
Stackdriver Trace            Write Only
Cloud Source Repositories    Disabled
Cloud Debugger               Disabled
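After reading some docs, I added access manually by following the instructions at
https://cloud.google.com/container-registry/docs/access-control
The sample code now deploys. It looks like automatic access from GKE to GCR is not working.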
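For anyone who wants to check the node scopes from the command line rather than the console, here is a minimal sketch. The function name `has_gcr_pull_scope` and the `CLUSTER` and `ZONE` variables are hypothetical. It assumes that any Cloud Storage read scope, or the broad cloud-platform scope, is enough to read images, since gcr.io keeps them in Cloud Storage. In the permission listing above, Storage is already Read Only, so this check alone would probably not have explained the failure in the question; it only rules one cause in or out.

```shell
#!/usr/bin/env bash
# Sketch: decide whether a list of OAuth scopes permits reading gcr.io images.
# Reads scope URLs from stdin, separated by newlines, semicolons, commas or spaces.
# Returns 0 if a suitable scope is present, non-zero otherwise.
has_gcr_pull_scope() {
  tr ';, ' '\n\n\n' |
    grep -Eq '/auth/(devstorage\.(read_only|read_write|full_control)|cloud-platform)$'
}

# Usage against a real cluster (CLUSTER and ZONE are placeholders):
#   gcloud container clusters describe "$CLUSTER" --zone "$ZONE" \
#     --format='value(nodeConfig.oauthScopes)' | has_gcr_pull_scope \
#     && echo "nodes can read gcr.io" || echo "storage scope missing"
```

If the scope is present and the pull still fails, the next thing to look at is probably the IAM permissions of the node's service account on the project that hosts the image.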
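The service account that kubectl uses should have the permissions needed for both the deployment and GCR access (Storage Admin).
Step 1. Create a service account on GCP and assign it roles with Kubernetes and GCR permissions.
Step 2. Save the generated service account JSON file.
Step 3. Authenticate gcloud with that JSON file.
Step 4. Run the deployment.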
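The four steps above could look roughly like this with gcloud. This is a sketch under assumptions, not the exact commands the answer had in mind: the function names and all arguments are placeholders, and `roles/container.developer` is my guess at what "Kubernetes permissions" means here; `roles/storage.admin` is the Storage Admin role the answer names.

```shell
#!/usr/bin/env bash
# Sketch of the four steps. All names are placeholders, not values from the post.

# Build the service account's email address from its name and the project ID.
sa_email() {
  printf '%s@%s.iam.gserviceaccount.com\n' "$1" "$2"
}

# Usage: setup_deployer PROJECT_ID SA_NAME KEY_FILE CLUSTER ZONE
# The body runs in a subshell so that `set -e` stops at the first failing
# step without affecting the calling shell.
setup_deployer() (
  set -e
  project="$1"; sa_name="$2"; key_file="$3"; cluster="$4"; zone="$5"
  email="$(sa_email "$sa_name" "$project")"

  # Step 1: create the service account and grant it roles.
  # roles/container.developer is an assumption; roles/storage.admin is Storage Admin.
  gcloud iam service-accounts create "$sa_name" --project "$project"
  for role in roles/container.developer roles/storage.admin; do
    gcloud projects add-iam-policy-binding "$project" \
      --member "serviceAccount:${email}" --role "$role"
  done

  # Step 2: save the JSON key file.
  gcloud iam service-accounts keys create "$key_file" --iam-account "$email"

  # Step 3: authenticate gcloud with that key file.
  gcloud auth activate-service-account --key-file "$key_file"

  # Step 4: fetch cluster credentials and run the deployment from the question.
  gcloud container clusters get-credentials "$cluster" --zone "$zone" --project "$project"
  kubectl run hello-web --image="gcr.io/${project}/hello-app:v1" --port 8080
)
```

Note that these steps change the identity used by gcloud and kubectl on the workstation; the image pull itself is performed by the node.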
When creating the GKE cluster, make sure the nodes have the Storage RO scope, that is, https://www.googleapis.com/auth/devstorage.read_only.
I got tripped up by this when creating a GKE cluster through Terraform, where I had:
node_config {
  oauth_scopes = [
    "https://www.googleapis.com/auth/logging.write",
    "https://www.googleapis.com/auth/monitoring",
  ]
  ...
instead of
node_config {
  oauth_scopes = [
    "https://www.googleapis.com/auth/logging.write",
    "https://www.googleapis.com/auth/monitoring",
    "https://www.googleapis.com/auth/devstorage.read_only"
  ]
  ...
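For clusters created with gcloud rather than Terraform, the same three scopes can be passed through the `--scopes` flag of `gcloud container clusters create`, which takes a comma-separated list of scope URLs or aliases. A small sketch follows; the helper `join_scopes` and the `CLUSTER` and `ZONE` variables are hypothetical names used only for illustration.

```shell
#!/usr/bin/env bash
# Sketch: join scope URLs with commas, the list format that --scopes expects.
# The body runs in a subshell so the IFS change does not leak to the caller.
join_scopes() (
  IFS=','
  printf '%s\n' "$*"
)

# Usage (CLUSTER and ZONE are placeholders):
#   gcloud container clusters create "$CLUSTER" --zone "$ZONE" \
#     --scopes "$(join_scopes \
#       https://www.googleapis.com/auth/logging.write \
#       https://www.googleapis.com/auth/monitoring \
#       https://www.googleapis.com/auth/devstorage.read_only)"
```

Building the list in one place makes it harder to drop the devstorage.read_only entry by accident, which is the mistake described above.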