Errors trying to install Acumos
Background:
VMware 15.0
Ubuntu 16.04, 64-bit
32 GB RAM + 16 CPU cores
/etc/hosts:
192.168.79.130 localhost
I did the following (entering the sudo password when prompted):
git clone https://gerrit.acumos.org/r/system-integration
apt-get -y update
apt-get -y install docker-ce=18.06.3~ce~3-0~ubuntu
if [[ "$(id -nG "$USER" | grep docker)" == "" ]]; then sudo usermod -aG docker $USER; fi
# Logged out and in again and verified that my user is in the docker group
cd system-integration/tools/
bash setup_k8s_stack.sh setup
cd
bash system-integration/AIO/setup_prereqs.sh k8s localhost $USER generic 2>&1 | tee aio_prep.log
# When the "Prerequisites setup is complete" message is displayed, I continue with
cd system-integration/AIO
bash oneclick_deploy.sh 2>&1 | tee aio_deploy.log
The deployment fails with the following error message:
+ c='-l component=192.168.79.130'
++ kubectl get deployment -n acumos -l app=cds -l component=192.168.79.130 -o json
++ jq -r '.items[0].metadata.name'
+ dep=null
++ cat /tmp/a72a447b-df96-4fec-98c9-bb99e447d00d
+ kubectl patch deployment -n acumos null --patch 'spec:
  template:
    spec:
      hostAliases:
      - ip: "192.168.79.130"
        hostnames:
        - "ubuntu"'
Error from server (NotFound): deployments.extensions "null" not found
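The `dep=null` in the trace is what `jq -r` prints when `.items[0]` does not exist, that is, when the label selector matched no deployment. A minimal sketch of a guard that would stop before `kubectl patch` is called with that value; `deployment_found` is a hypothetical helper, not something present in utils.sh:

```shell
# Hypothetical guard; not part of utils.sh.
# jq -r prints the literal string "null" when .items[0] does not exist,
# so treat both "" and "null" as "no deployment matched".
deployment_found() {
  local dep="$1"
  [[ -n "$dep" && "$dep" != "null" ]]
}

dep="null"   # value seen in the trace above
if deployment_found "$dep"; then
  echo "patching deployment $dep"
else
  echo "no deployment matched the label selector; skipping patch" >&2
fi
```

A guard like this only makes the failure easier to read; it does not explain why the selector matched nothing.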
I opened the file "system-integration/AIO/utils.sh" and found:
if [[ "$component" != "" ]]; then c="-l component=$component"; fi
dep=$(kubectl get deployment -n $namespace -l app=$app $c -o json | jq -r ".items[0].metadata.name")
kubectl patch deployment -n $namespace $dep --patch "$(cat $tmp)"
I changed it to:
#if [[ "$component" != "" ]]; then c="-l component=$component"; fi
dep=$(kubectl get deployment -n $namespace -l app=$app $c -o json | jq -r ".items[0].metadata.name")
kubectl patch deployment -n $namespace filebeat --patch "$(cat $tmp)"
That error went away, but the following error appeared instead:
oneclick_deploy.sh setup_federation:233 (Tue Sep 17 02:32:31 PDT 2019) CDS API is not yet ready; waiting 10 seconds
+ t=300
+ sleep 10
++ curl -k -u ccds_client:187bbf19-40b9-45c8-9945-4903292d963d https://localhost/ccds/peer
++ grep -c numberOfElements
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 203 100 203 0 0 957 0 --:--:-- --:--:-- --:--:-- 962
+ [[ 0 -eq 0 ]]
+ [[ 300 -eq 300 ]]
+ fail 'CDS API is not ready after 300 seconds'
+ set +x
oneclick_deploy.sh fail:42 (Tue Sep 17 02:32:41 PDT 2019) CDS API is not ready after 300 seconds
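The trace suggests a readiness loop of roughly this shape: probe the CDS API, sleep 10 seconds, and fail once 300 seconds have passed. The sketch below reproduces that pattern with a stand-in probe so it can run without a cluster; `wait_until` and `probe` are hypothetical names, and the real check in the trace is `curl` piped into `grep -c numberOfElements`:

```shell
# Hypothetical helper: run a probe every $interval seconds until it succeeds
# or $timeout seconds have elapsed.
wait_until() {
  local timeout="$1" interval="$2"; shift 2
  local waited=0
  until "$@"; do
    if (( waited >= timeout )); then
      echo "not ready after ${timeout} seconds" >&2
      return 1
    fi
    sleep "$interval"
    waited=$(( waited + interval ))
  done
}

# Stand-in probe that succeeds on its third call.
calls=0
probe() { calls=$(( calls + 1 )); (( calls >= 3 )); }

wait_until 5 0 probe && echo "ready after $calls probes"
```

In the trace the probe never succeeds: the 203-byte response contains no `numberOfElements`, so the loop runs to its limit regardless of how long it waits.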
The cds pod is in an error state: CrashLoopBackOff
Log:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 13m default-scheduler Successfully assigned acumos/cds-7474fccbc7-nm4jk to ubuntu
Normal Pulled 9m28s (x5 over 13m) kubelet, ubuntu Container image "nexus3.acumos.org:10002/common-dataservice:2.2.5" already present on machine
Normal Created 9m28s (x5 over 13m) kubelet, ubuntu Created container
Normal Started 9m28s (x5 over 13m) kubelet, ubuntu Started container
Warning BackOff 3m9s (x28 over 12m) kubelet, ubuntu Back-off restarting failed container
This problem is similar to an earlier question:
Errors trying to install Acumos Boreas release
Thanks!
Log (starting from 'Patch deployment for cds'):
...
oneclick_deploy.sh start_deployment:518 (Tue Sep 17 18:25:16 PDT 2019) Creating deployment cds
+ kubectl create -f deploy/cds-deployment.yaml
deployment.apps/cds created
+ get_host_ip_from_etc_hosts localhost
+ trap fail ERR
++ grep -v '^127\.'
++ awk '{print }'
++ grep -E '\slocalhost( |$)' /etc/hosts
+ HOST_IP='192.168.79.130
192.168.79.130'
+ [[ 192.168.79.130
192.168.79.130 != '' ]]
+ patch_deployment_with_host_alias acumos cds ubuntu 192.168.79.130 192.168.79.130
+ trap fail ERR
+ namespace=acumos
+ app=cds
+ name=ubuntu
+ ip=192.168.79.130
+ component=192.168.79.130
+ log 'Patch deployment for cds (192.168.79.130), to restart it with the changes'
+ setx=x
+ set +x
oneclick_deploy.sh patch_deployment_with_host_alias:448 (Tue Sep 17 18:25:17 PDT 2019) Patch deployment for cds (192.168.79.130), to restart it with the changes
++ uuidgen
+ tmp=/tmp/ef6cab06-ece6-436b-812c-1a00728bec01
+ cat
+ [[ 192.168.79.130 != '' ]]
+ c='-l component=192.168.79.130'
++ kubectl get deployment -n acumos -l app=cds -l component=192.168.79.130 -o json
...
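One reading of this trace: the `grep` on /etc/hosts matched two lines, so `HOST_IP` holds the address twice, separated by a newline. Passed unquoted, that value is split into two words, and the second one lands in parameter 5 (`component`). The sketch below reproduces only the shell behaviour; `show_args` is a hypothetical stand-in for `patch_deployment_with_host_alias`, and taking the first line is an assumed workaround, not a confirmed fix:

```shell
# Stand-in for patch_deployment_with_host_alias: just reports its arguments.
show_args() {
  echo "argc=$# component=${5:-<unset>}"
}

# Two matching /etc/hosts lines produce a two-line value, as in the trace.
HOST_IP=$'192.168.79.130\n192.168.79.130'

# Unquoted: the shell splits on the newline, so a fifth argument appears.
unquoted=$(show_args acumos cds ubuntu $HOST_IP)

# Keeping only the first line yields exactly four arguments.
first_ip=$(printf '%s\n' "$HOST_IP" | head -n 1)
fixed=$(show_args acumos cds ubuntu "$first_ip")

echo "$unquoted"
echo "$fixed"
```

If this reading is right, removing the duplicate `localhost` entry from /etc/hosts would have the same effect as taking the first line.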
Two suggestions:
(1) Give yourself passwordless sudo rights
sudo visudo
(add to the end of the file and save)
<your username> ALL=(ALL:ALL) NOPASSWD:ALL
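(2) Re 'The deployment fails with the following error message:': '-l component=192.168.79.130' indicates a bug that makes patch_deployment_with_host_alias (in utils.sh) think a component (parameter 5) was specified. For the cds app, the specific calls to it come from start_acumos_core_app, in the lines
patch_deployment_with_host_alias $ACUMOS_NAMESPACE $app $ACUMOS_MARIADB_HOST $HOST_IP
or
patch_deployment_with_host_alias $ACUMOS_NAMESPACE $app $ACUMOS_HOST $HOST_IP
Since these pass at most four parameters (on master), can you provide more of the log, going back to the line that outputs 'Patch deployment for cds'?
Beyond that, utils.sh should not need any changes, and I think the edits you made may have side effects. So I would revert them:
- You replaced $dep with filebeat
This may have let you make progress, but it defeats the purpose of the function, which is to add host aliases to components (not just filebeat) that refer to names not resolvable in DNS.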
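To illustrate that purpose, here is a minimal sketch that builds a hostAliases patch like the one in the trace, for whichever deployment the label lookup returns. `host_alias_patch` is a hypothetical helper and the YAML layout is inferred from the trace, not copied from utils.sh:

```shell
# Hypothetical helper that prints a hostAliases patch for one name/IP pair.
host_alias_patch() {
  local name="$1" ip="$2"
  cat <<EOF
spec:
  template:
    spec:
      hostAliases:
      - ip: "$ip"
        hostnames:
        - "$name"
EOF
}

patch=$(host_alias_patch ubuntu 192.168.79.130)
echo "$patch"
# With a cluster available, the patch would be applied to the deployment found
# by the label lookup ($dep), not to a hard-coded name such as filebeat:
#   kubectl patch deployment -n acumos "$dep" --patch "$patch"
```

Hard-coding filebeat means the cds deployment never receives the alias, so the name it needs may remain unresolvable inside its pod.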