Errors trying to install Acumos

Background: VMware 15.0, Ubuntu 16.04 64-bit, 32 GB RAM, 16 CPU cores. /etc/hosts: 192.168.79.130 localhost

I ran the following (entering the sudo password when prompted):

git clone https://gerrit.acumos.org/r/system-integration
apt-get -y update
apt-get -y install docker-ce=18.06.3~ce~3-0~ubuntu
if [[ "$(id -nG "$USER" | grep docker)" == "" ]]; then sudo usermod -aG docker $USER; fi
# Logged out and in again and verified that my user is in the docker group
cd system-integration/tools/
bash setup_k8s_stack.sh setup
cd
bash system-integration/AIO/setup_prereqs.sh k8s localhost $USER generic 2>&1 | tee aio_prep.log
# When the "Prerequisites setup is complete" message is displayed, I continue with
cd system-integration/AIO
bash oneclick_deploy.sh 2>&1 | tee aio_deploy.log

The deployment fails with the following error message:

+ c='-l component=192.168.79.130'
++ kubectl get deployment -n acumos -l app=cds -l component=192.168.79.130 -o json
++ jq -r '.items[0].metadata.name'
+ dep=null
++ cat /tmp/a72a447b-df96-4fec-98c9-bb99e447d00d
+ kubectl patch deployment -n acumos null --patch 'spec:
  template:
    spec:
      hostAliases:
      - ip: "192.168.79.130"
        hostnames:
        - "ubuntu"'
Error from server (NotFound): deployments.extensions "null" not found

I opened the file "system-integration/AIO/utils.sh", which contains:

  if [[ "$component" != "" ]]; then c="-l component=$component"; fi
  dep=$(kubectl get deployment -n $namespace -l app=$app $c -o json | jq -r ".items[0].metadata.name")
  kubectl patch deployment -n $namespace $dep --patch "$(cat $tmp)"

and changed it to:

  #if [[ "$component" != "" ]]; then c="-l component=$component"; fi
  dep=$(kubectl get deployment -n $namespace -l app=$app $c -o json | jq -r ".items[0].metadata.name")
  kubectl patch deployment -n $namespace filebeat --patch "$(cat $tmp)"

That error went away, but the following error then appeared:

oneclick_deploy.sh setup_federation:233 (Tue Sep 17 02:32:31 PDT 2019) CDS API is not yet ready; waiting 10 seconds
+ t=300
+ sleep 10
++ curl -k -u ccds_client:187bbf19-40b9-45c8-9945-4903292d963d https://localhost/ccds/peer
++ grep -c numberOfElements
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   203  100   203    0     0    957      0 --:--:-- --:--:-- --:--:--   962
+ [[ 0 -eq 0 ]]
+ [[ 300 -eq 300 ]]
+ fail 'CDS API is not ready after 300 seconds'
+ set +x

oneclick_deploy.sh fail:42 (Tue Sep 17 02:32:41 PDT 2019) CDS API is not ready after 300 seconds
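
The trace above shows a polling loop: every 10 seconds the script requests https://localhost/ccds/peer, counts the lines containing numberOfElements, and gives up once its counter reaches 300. curl did receive a 203-byte response, but it did not contain numberOfElements, which suggests (this is an assumption) an error response rather than a reply from a healthy CDS. A minimal sketch of that polling pattern, using a hypothetical wait_until helper that is not part of oneclick_deploy.sh:

```bash
# Hypothetical helper: retry a command until it succeeds or the timeout elapses.
# Usage: wait_until <timeout_seconds> <interval_seconds> <command> [args...]
wait_until() {
  local timeout="$1" interval="$2"
  shift 2
  local waited=0
  until "$@"; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "not ready after ${timeout} seconds" >&2
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
}
```

With a hypothetical cds_ready function wrapping the curl and grep check, the call would look like `wait_until 300 10 cds_ready`. A loop like this can only report that the API never answered; the reason has to come from the pod itself.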

The cds pod is in an error state: CrashLoopBackOff. Log:

Events:
  Type     Reason     Age                  From               Message
  ----     ------     ----                 ----               -------
  Normal   Scheduled  13m                  default-scheduler  Successfully assigned acumos/cds-7474fccbc7-nm4jk to ubuntu
  Normal   Pulled     9m28s (x5 over 13m)  kubelet, ubuntu    Container image "nexus3.acumos.org:10002/common-dataservice:2.2.5" already present on machine
  Normal   Created    9m28s (x5 over 13m)  kubelet, ubuntu    Created container
  Normal   Started    9m28s (x5 over 13m)  kubelet, ubuntu    Started container
  Warning  BackOff    3m9s (x28 over 12m)  kubelet, ubuntu    Back-off restarting failed container

This problem is similar to an earlier question: "Errors trying to install Acumos Boreas release". Thanks!

Log (starting before 'Patch deployment for cds'):

...
oneclick_deploy.sh start_deployment:518 (Tue Sep 17 18:25:16 PDT 2019) Creating deployment cds
+ kubectl create -f deploy/cds-deployment.yaml
deployment.apps/cds created
+ get_host_ip_from_etc_hosts localhost
+ trap fail ERR
++ grep -v '^127\.'
++ awk '{print }'
++ grep -E '\slocalhost( |$)' /etc/hosts
+ HOST_IP='192.168.79.130
192.168.79.130'
+ [[ 192.168.79.130
192.168.79.130 != '' ]]
+ patch_deployment_with_host_alias acumos cds ubuntu 192.168.79.130 192.168.79.130
+ trap fail ERR
+ namespace=acumos
+ app=cds
+ name=ubuntu
+ ip=192.168.79.130
+ component=192.168.79.130
+ log 'Patch deployment for cds (192.168.79.130), to restart it with the changes'
+ setx=x
+ set +x

oneclick_deploy.sh patch_deployment_with_host_alias:448 (Tue Sep 17 18:25:17 PDT 2019) Patch deployment for cds (192.168.79.130), to restart it with the changes
++ uuidgen
+ tmp=/tmp/ef6cab06-ece6-436b-812c-1a00728bec01
+ cat
+ [[ 192.168.79.130 != '' ]]
+ c='-l component=192.168.79.130'
++ kubectl get deployment -n acumos -l app=cds -l component=192.168.79.130 -o json
...

Two suggestions:

(1) Give yourself passwordless sudo permission:

sudo visudo
(add to the end of the file and save)
<your username>   ALL=(ALL:ALL) NOPASSWD:ALL

(2) Re 'The deployment fails with the following error message:': '-l component=192.168.79.130' indicates a bug that causes patch_deployment_with_host_alias (in utils.sh) to think that a component (parameter 5) was specified. For the cds app, the call comes from start_acumos_core_app, in one of these lines:

patch_deployment_with_host_alias $ACUMOS_NAMESPACE $app $ACUMOS_MARIADB_HOST $HOST_IP  
or
patch_deployment_with_host_alias $ACUMOS_NAMESPACE $app $ACUMOS_HOST $HOST_IP

Since these calls pass at most four parameters (on master), can you provide more of the log, going back to the output line 'Patch deployment for cds'?
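
In the 'Patch deployment for cds' trace from the question, HOST_IP holds two lines (192.168.79.130 twice), presumably because two /etc/hosts lines match localhost. Expanded unquoted, it splits into two words, so the function receives five arguments and takes the second IP as the component. A minimal sketch that reproduces this; count_args, hosts_sample and FIRST_IP are hypothetical names used only for illustration:

```bash
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

# Sample data standing in for an /etc/hosts with two matching lines (an assumption).
hosts_sample='127.0.0.1 localhost
192.168.79.130 localhost
192.168.79.130 localhost ubuntu'

# Same kind of lookup as in the trace: one IP per matching non-loopback line.
HOST_IP=$(printf '%s\n' "$hosts_sample" \
  | grep -E '[[:space:]]localhost( |$)' \
  | grep -v '^127\.' \
  | awk '{print $1}')

count_args acumos cds ubuntu $HOST_IP     # unquoted: splits into two words, prints 5
count_args acumos cds ubuntu "$HOST_IP"   # quoted: one (two-line) word, prints 4

# One possible workaround: keep only the first match.
FIRST_IP=$(printf '%s\n' "$HOST_IP" | head -n 1)
count_args acumos cds ubuntu "$FIRST_IP"  # prints 4
```

If that reading is right, removing the duplicate localhost entry from /etc/hosts should avoid the problem without touching utils.sh.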

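Beyond that, there should be no need to update utils.sh, and I think the changes you made may have side effects, so I would revert them. You replaced $dep with filebeat, which may have let you make progress, but it defeats the purpose of the function: adding host aliases to components (not just filebeat) that refer to names which are not resolvable in DNS.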
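
The "null" in the original error is what jq -r prints when .items[0] does not exist, that is, when no deployment matched the label selector. Rather than hard-coding a deployment name, the lookup result could be checked before patching. A minimal sketch, assuming a hypothetical require_deployment_name helper that is not part of utils.sh:

```bash
# Hypothetical guard: rejects an empty or literal "null" deployment name.
require_deployment_name() {
  local dep="$1"
  if [ -z "$dep" ] || [ "$dep" = "null" ]; then
    echo "no deployment matched the label selector" >&2
    return 1
  fi
  # Echo the validated name so the caller can capture it.
  echo "$dep"
}
```

Calling this on $dep right after the kubectl/jq lookup would stop the script with a clear message instead of attempting `kubectl patch deployment ... null`.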