Ansible SSH error during play

Question

I'm running into a strange error with Ansible. The first role runs fine, but when Ansible tries to execute the second role, it fails with an SSH error.

Environment:

OS: CentOS 7
Ansible version: 2.2.1.0
Python version: 2.7.5
OpenSSH version: OpenSSH_6.6.1p1, OpenSSL 1.0.1e-fips 11 Feb 2013

Ansible command executed:

ansible-playbook -vvvv -i inventory/dev playbook_update_system.yml --limit "db[0]"

Playbook:

- name: "HUB Playbook | Updating system packages on {{ ansible_hostname }}"
  hosts: release_first_half
  roles:
    - upgrade_system_package
    - reboot_server

Role: upgrade_system_package:

- name: "upgrading CentOS system packages on {{ ansible_hostname }}"
  shell: sudo puppet apply -e 'exec{"upgrade-package":command => "/usr/bin/yum clean all; /usr/bin/yum -y update;"}'
  when: ansible_distribution == 'CentOS' and 'cassandra' not in group_names

Role: reboot_server:

- name: "reboot CentOS [{{ ansible_hostname }}] server"
  shell: sudo puppet apply -e 'exec{"reboot-os":command => "/usr/sbin/reboot"}'
  when: ansible_distribution == 'CentOS' and 'cassandra' not in group_names

Current behavior:

Connects to node "db1" and executes the role "upgrade_system_package" => OK
Tries to connect to "db1" and execute the role "reboot_server" => fails because of SSH.

Error message returned by Ansible:

fatal: [db1]: UNREACHABLE! => { "changed": false, "msg": "Failed to connect to the host via ssh: OpenSSH_6.6.1, OpenSSL 1.0.1e-fips 11 Feb 2013\r\ndebug1: Reading configuration data /USR/newtprod/.ssh/config\r\ndebug1: Reading configuration data /etc/ssh/ssh_config\r\ndebug1: /etc/ssh/ssh_config line 56: Applying options for *\r\ndebug1: auto-mux: Trying existing master\r\ndebug2: fd 3 setting O_NONBLOCK\r\ndebug2: mux_client_hello_exchange: master version 4\r\ndebug3: mux_client_forwards: request forwardings: 0 local, 0 remote\r\ndebug3: mux_client_request_session: entering\r\ndebug3: mux_client_request_alive: entering\r\ndebug3: mux_client_request_alive: done pid = 64994\r\ndebug3: mux_client_request_session: session request sent\r\ndebug1: mux_client_request_session: master session id: 2\r\ndebug3: mux_client_read_packet: read header failed: Broken pipe\r\ndebug2: Control master terminated unexpectedly\r\nShared connection to db1 closed.\r\n", "unreachable": true }

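I don't understand this, because the previous role executed successfully on this node. Moreover, we have many playbooks that use the same inventory file, and they work fine. I also tried another node, with the same result.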

Answer 1

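This is a simple and well-known problem: the shutdown process causes the SSH daemon to exit, which breaks the current SSH session (hence the "broken pipe" error). The server reboots fine, but the Ansible run is interrupted.

You need to add a delay to your shell command and run it with the async option, so that Ansible's SSH session can finish before it gets killed.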

shell: sleep 5; sudo puppet apply -e 'exec{"reboot-os":command => "/usr/sbin/reboot"}'
async: 60  # must be greater than 0 (and longer than the sleep) for the task to run asynchronously
poll: 0
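
For context, the fire-and-forget reboot described above could sit in the role roughly as follows. This is only a sketch under assumptions: it presumes `async` is set to a positive value, the follow-up wait task is not part of the original answer, the `delay` and `timeout` values are arbitrary, and it assumes the control machine can reach the node on port 22 under the name or address given in the inventory. `wait_for` is run through `local_action` because the target itself is unreachable while it reboots.

```yaml
# roles/reboot_server/tasks/main.yml (illustrative sketch, timings are assumptions)
- name: "reboot CentOS [{{ ansible_hostname }}] server"
  # The sleep gives Ansible time to close its SSH session before sshd goes down.
  shell: sleep 5; sudo puppet apply -e 'exec{"reboot-os":command => "/usr/sbin/reboot"}'
  async: 60   # run in the background; must be > 0
  poll: 0     # do not wait for the result (fire and forget)
  when: ansible_distribution == 'CentOS' and 'cassandra' not in group_names

- name: "wait for SSH on [{{ ansible_hostname }}] to come back"
  # Runs on the control machine, not on the rebooting node.
  local_action:
    module: wait_for
    host: "{{ ansible_host | default(inventory_hostname) }}"
    port: 22
    delay: 30      # assumed: give the node time to actually go down
    timeout: 300   # assumed: give up after five minutes
  become: no
  when: ansible_distribution == 'CentOS' and 'cassandra' not in group_names
```

Without the second task, any role that follows `reboot_server` in the same play would likely hit the same UNREACHABLE error while the node is still down.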

Tags: ssh, ansible, ansible-2.x