Unable to deploy Spark History Server on a Kubernetes pod

I am trying to deploy the Spark History Server on a Kubernetes pod, using the following commands:

helm repo add stable https://kubernetes-charts.storage.googleapis.com
helm install stable/spark-history-server --generate-name

However, I ran into a problem when doing so. The error log is below:

Events:
  Type     Reason       Age                      From                               Message
  ----     ------       ----                     ----                               -------
  Warning  FailedMount  7m51s (x129 over 3h31m)  kubelet, aks-agentpool-20240184-1  (combined from similar events): MountVolume.SetUp failed for volume "nfs-pv" : mount failed: exit status 32
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/2bc91c0b-a9e8-4af6-9a6a-8e4781079afb/volumes/kubernetes.io~nfs/nfs-pv --scope -- mount -t nfs spark-history-server-1599813147-nfs.default.svc.cluster.local:/ /var/lib/kubelet/pods/2bc91c0b-a9e8-4af6-9a6a-8e4781079afb/volumes/kubernetes.io~nfs/nfs-pv
Output: Running scope as unit run-re958022a7250453abcd26d58efcbf360.scope.
mount.nfs: Failed to resolve server spark-history-server-1599813147-nfs.default.svc.cluster.local: Name or service not known
  Warning  FailedMount  2m51s (x17 over 3h31m)  kubelet, aks-agentpool-20240184-1  Unable to attach or mount volumes: unmounted volumes=[data], unattached volumes=[spark-history-server-1599813147-token-bglz7 data]: timed out waiting for the condition

Any help would be greatly appreciated!

Unfortunately, this is one of the known issues:

Kubernetes installs do not configure the nodes' resolv.conf files to use the cluster DNS by default, because that process is inherently distribution-specific. This should probably be implemented eventually.

There are a few workarounds to choose from:

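  • Specify the ClusterIP instead of the domain name; the NFS mount then succeeds. You can find an example here.

  • Manually update resolv.conf on every node so that the node can resolve cluster service names.

  • Manually add the service name to /etc/hosts on every node.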
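
As a rough sketch of the /etc/hosts workaround, the two helper functions below look up the ClusterIP of the NFS Service and add a matching hosts entry. The function names are hypothetical, and the Service name and namespace in the usage comment are taken from the error log above, so adjust them to your own release. This assumes you have kubectl access to the cluster and root access on the nodes.

```shell
#!/bin/sh
# Sketch only: helper names are made up for illustration.

# Print the ClusterIP of a Service (needs kubectl access to the cluster).
service_cluster_ip() {
  # $1 = service name, $2 = namespace
  kubectl get svc "$1" -n "$2" -o jsonpath='{.spec.clusterIP}'
}

# Append "IP NAME" to a hosts file unless NAME already appears in it.
add_hosts_entry() {
  # $1 = hosts file, $2 = IP address, $3 = host name
  if grep -qF -- "$3" "$1" 2>/dev/null; then
    return 0
  fi
  printf '%s %s\n' "$2" "$3" >> "$1"
}

# Example usage (run on each node, as root):
#   ip=$(service_cluster_ip spark-history-server-1599813147-nfs default)
#   add_hosts_entry /etc/hosts "$ip" \
#     spark-history-server-1599813147-nfs.default.svc.cluster.local
```

Keep in mind that the ClusterIP only stays the same for the lifetime of the Service. If the chart is reinstalled, the Service gets a new address (and, with --generate-name, a new name), so the entry would need to be updated.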