`kubectl delete service` gets stuck in 'Terminating' state

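I'm trying to delete a service I wrote and deployed to Azure Kubernetes Service (along with the required Dask components that accompany it). When I run kubectl delete -f my_manifest.yml, my service gets stuck in the Terminating state. The console tells me it was deleted, but the command hangs: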

> kubectl delete -f my-manifest.yaml
service "dask-scheduler" deleted
deployment.apps "dask-scheduler" deleted
deployment.apps "dask-worker" deleted
service "my-service" deleted
deployment.apps "my-deployment" deleted

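I have to Ctrl+C out of this command. Looking at my services, the Dask one has been deleted successfully, but my custom service has not. If I try to delete it manually, it hangs and fails in the same way: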

> kubectl get services
NAME                TYPE           CLUSTER-IP   EXTERNAL-IP   PORT(S)                      AGE
kubernetes          ClusterIP      x.x.x.x      <none>        443/TCP                      18h
my-service          LoadBalancer   x.x.x.x      x.x.x.x       80:30786/TCP,443:31934/TCP   18h

> kubectl delete service my-service
service "my-service" deleted

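One answer says to delete the pods first, but all my pods are already deleted (kubectl get pods returns nothing). There's also this closed K8s issue suggesting that --wait=false might fix foreground cascading deletion, but that doesn't work, and it doesn't seem to be the problem here anyway (since the pods themselves have already been deleted).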

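I assume I could wipe my AKS cluster completely and re-create it, but that's a last resort. I don't know whether it's relevant, but my service uses the azure-load-balancer-internal: "true" annotation, and I have a web app deployed in my VNet that uses this service.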

Is there any other way to force this service to shut down?

Thanks to @4c74356b41's suggestion to look at kubectl describe service my-service (which, for some reason, I hadn't thought of), I saw this warning:

Code="LinkedAuthorizationFailed" Message="The client 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx' with object id 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx' has permission to perform action 'Microsoft.Network/loadBalancers/write' on scope '/subscriptions/<subscriptionId>/resourceGroups/<resourceGroup>/providers/Microsoft.Network/loadBalancers/kubernetes-internal'; however, it does not have permission to perform action 'Microsoft.Network/virtualNetworks/subnets/join/action' on the linked scope(s) '/subscriptions/<subscriptionId>/resourceGroups/<resourceGroup>/providers/Microsoft.Network/virtualNetworks/<vnet>/subnets/<subnet>' or the linked scope(s) are invalid.

(The client and object id GUIDs have the same value.)
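The warning is long, but three fields in it matter: which client was refused, which action it was missing, and on which scope. The following is a minimal sketch of pulling those fields out with sed, assuming the message has exactly the wording shown above. The variable names are illustrative, and the GUIDs and angle-bracket placeholders are the redacted values from the warning, not real identifiers.

```shell
#!/bin/sh
# Warning text as copied from `kubectl describe service my-service`
# (redacted placeholders kept exactly as they appear in the post).
msg="The client 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx' with object id 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx' has permission to perform action 'Microsoft.Network/loadBalancers/write' on scope '/subscriptions/<subscriptionId>/resourceGroups/<resourceGroup>/providers/Microsoft.Network/loadBalancers/kubernetes-internal'; however, it does not have permission to perform action 'Microsoft.Network/virtualNetworks/subnets/join/action' on the linked scope(s) '/subscriptions/<subscriptionId>/resourceGroups/<resourceGroup>/providers/Microsoft.Network/virtualNetworks/<vnet>/subnets/<subnet>' or the linked scope(s) are invalid."

# Each sed call captures the single-quoted value that follows a fixed phrase.
client_id=$(printf '%s\n' "$msg" | sed -n "s/.*The client '\([^']*\)'.*/\1/p")
missing_action=$(printf '%s\n' "$msg" | sed -n "s/.*does not have permission to perform action '\([^']*\)'.*/\1/p")
linked_scope=$(printf '%s\n' "$msg" | sed -n "s/.*on the linked scope(s) '\([^']*\)'.*/\1/p")

printf 'client:         %s\n' "$client_id"
printf 'missing action: %s\n' "$missing_action"
printf 'linked scope:   %s\n' "$linked_scope"
```

Read this way, the message says the client can write to the load balancer but cannot join the subnet the internal load balancer lives in.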

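This suggests that it's not strictly a Kubernetes problem, but rather a permissions problem within the Azure ecosystem. I looked in the portal but couldn't find that GUID among any of my users, groups, or applications, so I'm not sure what it refers to. However, I granted the Owner role to this client ID, and a few minutes later the service was deleted.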

az role assignment create `
    --role Owner `
    --assignee xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
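Owner is a broad role. A narrower alternative, offered as an untested sketch rather than what was actually run here, would be to grant only enough to cover the missing subnets/join/action permission, scoped to the subnet named in the warning. It assumes the built-in Network Contributor role includes that action, which is worth verifying for your subscription. The variable names are illustrative and the values are the redacted placeholders from the warning. Unlike the PowerShell command above, this uses POSIX shell, and it only prints the az command for review instead of running it.

```shell
#!/bin/sh
# Placeholders copied from the warning; replace them with your own values.
ASSIGNEE="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
SUBNET_SCOPE="/subscriptions/<subscriptionId>/resourceGroups/<resourceGroup>/providers/Microsoft.Network/virtualNetworks/<vnet>/subnets/<subnet>"

# Assumption: Network Contributor covers
# Microsoft.Network/virtualNetworks/subnets/join/action.
ROLE="Network Contributor"

# Dry run: build the command as text so it can be reviewed before use.
cmd=$(printf 'az role assignment create --role "%s" --assignee "%s" --scope "%s"' \
    "$ROLE" "$ASSIGNEE" "$SUBNET_SCOPE")
printf '%s\n' "$cmd"
```

Limiting the assignment with --scope keeps the grant to the one subnet the load balancer needs to join, instead of applying it at the default scope.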