Docker-Compose: how to wait for other services to be ready?

I have the following docker-compose file, and I need to wait until the service jhipster-registry has started and is accepting connections before myprogram-app starts.

I tried the healthcheck approach described in the official documentation: https://docs.docker.com/compose/compose-file/compose-file-v2/

version: '2.1'
services:
    myprogram-app:
        image: myprogram
        mem_limit: 1024m
        environment:
            - SPRING_PROFILES_ACTIVE=prod,swagger
            - EUREKA_CLIENT_SERVICE_URL_DEFAULTZONE=http://admin:$${jhipster.registry.password}@jhipster-registry:8761/eureka
            - SPRING_CLOUD_CONFIG_URI=http://admin:$${jhipster.registry.password}@jhipster-registry:8761/config
            - SPRING_DATASOURCE_URL=jdbc:postgresql://myprogram-postgresql:5432/myprogram
            - JHIPSTER_SLEEP=0
            - SPRING_DATA_ELASTICSEARCH_CLUSTER_NODES=myprogram-elasticsearch:9300
            - JHIPSTER_REGISTRY_PASSWORD=53bqDrurQAthqrXG
            - EMAIL_USERNAME
            - EMAIL_PASSWORD
        ports:
            - 8080:8080
        networks:
          - backend
        depends_on:
          - jhipster-registry:
              "condition": service_started
          - myprogram-postgresql
          - myprogram-elasticsearch
    myprogram-postgresql:
        image: postgres:9.6.5
        mem_limit: 256m
        environment:
            - POSTGRES_USER=myprogram
            - POSTGRES_PASSWORD=myprogram
        networks:
          - backend
    myprogram-elasticsearch:
        image: elasticsearch:2.4.6
        mem_limit: 512m
        networks:
          - backend
    jhipster-registry:
        extends:
            file: jhipster-registry.yml
            service: jhipster-registry
        mem_limit: 512m
        ports:
            - 8761:8761
        networks:
          - backend
        healthcheck:
          test: "exit 0"
networks:
  backend:
    driver: "bridge"

But when I run docker-compose up, I get the following error:

ERROR: The Compose file './docker-compose.yml' is invalid because:
services.myprogram-app.depends_on contains {"jhipster-registry": {"condition": "service_started"}}, which is an invalid type, it should be a string

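Am I doing something wrong, or is this feature no longer supported? How can I achieve this kind of synchronization between services?

Updated version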

version: '2.1'
services:
    myprogram-app:
        image: myprogram
        mem_limit: 1024m
        environment:
            - SPRING_PROFILES_ACTIVE=prod,swagger
            - EUREKA_CLIENT_SERVICE_URL_DEFAULTZONE=http://admin:$${jhipster.registry.password}@jhipster-registry:8761/eureka
            - SPRING_CLOUD_CONFIG_URI=http://admin:$${jhipster.registry.password}@jhipster-registry:8761/config
            - SPRING_DATASOURCE_URL=jdbc:postgresql://myprogram-postgresql:5432/myprogram
            - JHIPSTER_SLEEP=0
            - SPRING_DATA_ELASTICSEARCH_CLUSTER_NODES=myprogram-elasticsearch:9300
            - JHIPSTER_REGISTRY_PASSWORD=53bqDrurQAthqrXG
            - EMAIL_USERNAME
            - EMAIL_PASSWORD
        ports:
            - 8080:8080
        networks:
          - backend
        depends_on:
          jhipster-registry:
            condition: service_healthy
          myprogram-postgresql:
            condition: service_started
          myprogram-elasticsearch:
            condition: service_started
        #restart: on-failure
    myprogram-postgresql:
        image: postgres:9.6.5
        mem_limit: 256m
        environment:
            - POSTGRES_USER=myprogram
            - POSTGRES_PASSWORD=tuenemreh
        networks:
          - backend
    myprogram-elasticsearch:
        image: elasticsearch:2.4.6
        mem_limit: 512m
        networks:
          - backend
    jhipster-registry:
        extends:
            file: jhipster-registry.yml
            service: jhipster-registry
        mem_limit: 512m
        ports:
            - 8761:8761
        networks:
          - backend
        healthcheck:
          test: ["CMD", "curl", "-f", "http://jhipster-registry:8761", "|| exit 1"]
          interval: 30s
          retries: 20
          #start_period: 30s
networks:
  backend:
    driver: "bridge"

The updated version gives me a different error:

ERROR: for myprogram-app  Container "8ebca614590c" is unhealthy.
ERROR: Encountered errors while bringing up the project.

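It says the jhipster-registry container is unhealthy, yet the registry is reachable from a browser. How can I fix the command in the healthcheck so that it works?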
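A possible explanation for the unhealthy status, offered as an assumption since the container's healthcheck log is not shown: with the exec form ["CMD", ...] Docker runs the program directly without a shell, so "|| exit 1" is not an operator but one more argument handed to curl, which treats it as an additional URL and fails. The sketch below uses a hypothetical show_args helper in place of curl to show what the program receives in each form.

```shell
#!/bin/sh
# Hypothetical stand-in for curl: prints every argument it receives in brackets.
show_args() {
  for arg in "$@"; do
    printf '[%s]' "$arg"
  done
  printf '\n'
}

# Exec form (["CMD", ...]): no shell parses the line, so "|| exit 1"
# arrives as one more literal argument.
exec_form=$(show_args -f http://jhipster-registry:8761 "|| exit 1")

# Shell form (["CMD-SHELL", "..."] or a plain string): a shell parses the line,
# so || is an operator and the program only sees the flag and the URL.
shell_form=$(show_args -f http://jhipster-registry:8761 || exit 1)

echo "exec form:  $exec_form"
echo "shell form: $shell_form"
```

If that is the cause, a shell-form test such as test: ["CMD-SHELL", "curl -f http://localhost:8761 || exit 1"] would let the shell interpret ||. This assumes curl is installed in the registry image and that the registry listens on port 8761 inside the container.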

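As the documentation suggests, in Docker Compose version 2 files specifically, depends_on: can be a list of strings, or a mapping where the keys are the service names and the values are the conditions. For services where you don't have (or don't need) health checks, there is a service_started condition.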

depends_on:
  # notice: these lines don't start with "-"
  jhipster-registry:
    condition: service_healthy
  myprogram-postgresql:
    condition: service_started
  myprogram-elasticsearch:
    condition: service_started

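Depending on how much control you have over your program and the libraries it uses, it is better if you can arrange for the service to start without its dependencies necessarily being available (equivalently, to keep functioning if its dependencies die while the service is running), and not use the depends_on: option at all. For example, you might return an HTTP 503 Service Unavailable error if the database is down. Another strategy that is often helpful is to exit immediately if your dependencies aren't available, but use a setting like restart: on-failure to ask the orchestrator to restart the service.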
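A minimal sketch of the exit-immediately strategy, assuming the check lives in a shell entrypoint. The function name run_if_ready and the probe string are hypothetical; the probe can be any command that succeeds only when the dependency is reachable (for example nc -z myprogram-postgresql 5432, assuming nc is installed in the image).

```shell
#!/bin/sh
# run_if_ready PROBE COMMAND...
# PROBE is a shell snippet that must succeed for COMMAND to be started.
run_if_ready() {
  probe=$1
  shift
  if ! sh -c "$probe" >/dev/null 2>&1; then
    echo "dependency check failed: $probe" >&2
    # A non-zero status lets a restart setting such as on-failure try again.
    return 1
  fi
  # Dependency looks available: start the real service command.
  "$@"
}
```

An entrypoint could then end with something like run_if_ready 'nc -z myprogram-postgresql 5432' "$@" || exit 1, so the container stops at once when the database is not reachable and the orchestrator decides when to retry.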

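Although you already got an answer, it should be pointed out that what you are trying to achieve carries some serious risks.

Ideally, a service should be self-sufficient and smart enough to retry and wait for its dependencies to become available (before giving up). Otherwise you are more exposed to a single failure propagating to other services. Also consider that, unlike a manual start, a system reboot may ignore the dependency order.

If the crash of one service brings down your whole system, you may have a tool that restarts everything, but it is better to have services that can withstand that situation.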

Update for version 3+.

Please follow the documentation for version 3:

There are several things to be aware of when using depends_on:

depends_on does not wait for db and redis to be “ready” before starting web - only until they have been started.
If you need to wait for a service to be ready, see Controlling startup order for more on this problem and strategies for solving it.
Version 3 no longer supports the condition form of depends_on.
The depends_on option is ignored when deploying a stack in swarm mode with a version 3 Compose file.

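I would consider using the restart_policy option (a sub-option of deploy in version 3 files) to configure your myprogram-app to restart until jhipster-registry is up and accepting connections: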

restart_policy:
  condition: on-failure
  delay: 3s
  max_attempts: 5
  window: 60s

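Best approach - resilient app starts

While Docker does support startup dependencies, they officially recommend updating your app's startup logic to test for the availability of external dependencies and retry. Beyond circumventing the race condition in docker compose up, this has many benefits for robust applications that may be restarted on the fly in the wild.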
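The test-and-retry idea can be sketched as a bounded retry loop. This is only an illustration in shell with a fixed delay between attempts; the function name retry and its parameters are hypothetical, and in a real application the same pattern would normally live in the application's own startup code.

```shell
#!/bin/sh
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND...
# Runs COMMAND until it succeeds or MAX_ATTEMPTS attempts have failed.
retry() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}
```

For example, retry 20 3 nc -z jhipster-registry 8761 would probe the registry port up to 20 times, 3 seconds apart, assuming nc is available in the image.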

Docker approach - wait-for-it.sh

According to their docs on Control startup and shutdown order in Compose, the method Docker recommends is to download wait-for-it.sh, which takes a domain:port to poll and, if that succeeds, executes the next set of commands.

version: "2"
services:
  web:
    build: .
    ports:
      - "80:8000"
    depends_on:
      - "db"
    command: ["./wait-for-it.sh", "db:5432", "--", "python", "app.py"]
  db:
    image: postgres

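Note: this requires overriding the image's startup command, so make sure you know what to pass in order to keep parity with the default startup.

Historical approach - depends_on - service_healthy (deprecated in 3+)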

Historically, you could define a healthcheck (still good practice) and then set the condition of depends_on to service_healthy, but the condition variant of depends_on was deprecated in 3.0:

version: '2.1'
services:
  php:
    build:
      context: .
      dockerfile: tests/Docker/Dockerfile-PHP
    depends_on:
      redis:
        condition: service_healthy
  redis:
    image: redis
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 1s
      timeout: 3s
      retries: 30

Further reading

  • How can I wait for a docker container to be up and running?

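The best approach I have found is to check for the required port in the entrypoint. There are different ways to do that, e.g. wait-for-it, but I like to use this solution, which is cross-platform between alpine and bash images and does not download a custom script from GitHub:

Install netcat-openbsd (available for both apt and apk). Then, in the entrypoint (works with both #!/bin/bash and #!/bin/sh):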

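#!/bin/bash

# wait_for TIMEOUT_SECONDS HOST PORT: poll until HOST:PORT accepts TCP connections.
wait_for()
{
  echo "Waiting $1 seconds for $2:$3"
  # Inside sh -c, $0 and $1 are the host and port passed after the script string.
  timeout "$1" sh -c 'until nc -z "$0" "$1"; do sleep 0.1; done' "$2" "$3" || return 1
  echo "$2:$3 available"
}


wait_for 10 db 5432
wait_for 10 redis 6379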

If you don't want to print anything, you can also turn it into a one-liner.
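One way the silent one-liner might look, as a sketch rather than the author's exact version. The name wait_quiet is hypothetical, and it assumes timeout (GNU coreutils or a recent BusyBox) and nc are present in the image.

```shell
#!/bin/sh
# wait_quiet TIMEOUT_SECONDS HOST PORT: same polling as wait_for, but prints nothing.
wait_quiet() { timeout "$1" sh -c 'until nc -z "$0" "$1" 2>/dev/null; do sleep 0.1; done' "$2" "$3"; }
```

In an entrypoint it would be called as wait_quiet 10 db 5432 || exit 1, which returns a non-zero status if the port does not open within 10 seconds.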