在 Elastic Beanstalk 上启动 SQS celery worker

Start SQS celery worker on Elastic Beanstalk

我正在尝试在 EB 上启动一个 celery worker,但出现了一个无法解释的错误。

.ebextensions dir配置文件中的命令:

03_celery_worker:
  command: "celery worker --app=config --loglevel=info -E --workdir=/opt/python/current/app/my_project/"

列出的命令在我的本地机器上运行良好(只需更改 workdir 参数)。

来自 EB 的错误:

Activity execution failed, because: /opt/python/run/venv/local/lib/python3.6/site-packages/celery/platforms.py:796: RuntimeWarning: You're running the worker with superuser privileges: this is absolutely not recommended!

Starting new HTTPS connection (1): eu-west-1.queue.amazonaws.com (ElasticBeanstalk::ExternalInvocationError)

我已经用参数 --uid=2 更新了 celery worker 命令,权限错误消失了,但命令执行仍然失败,原因是

ExternalInvocationError

对我做错了什么有什么建议吗?

ExternalInvocationError

据我了解,这意味着列出的命令不能是来自 EB 容器命令的 运行。需要在服务器上创建脚本并从脚本中创建 运行 celery。 描述了如何操作。

更新: 需要在 .ebextensions 目录中创建一个配置文件。我称之为celery.config。 Link 到上面的 post 提供了一个几乎可以正常工作的脚本。需要进行一些小的添加才能 100% 正确地工作。我在安排定期任务时遇到了问题 (celery beat)。以下是修复方法的步骤:

  1. Install (add to requirements) django-celery beat pip install django-celery-beat, 将其添加到已安装的应用程序中,并在使用时使用 --scheduler 参数开始芹菜节拍。指令是here.

  2. 在脚本中指定 运行 脚本的用户。对于 celery worker,它是 celery 用户,它是在脚本的前面添加的(如果不存在)。当我尝试启动 celery beat 时出现错误 PermissionDenied。这意味着 celery 用户没有所有必要的权限。使用 ssh 我登录到 EB,查看了所有用户的列表 (cat /etc/passwd) 并决定使用 daemon user.

列出的步骤解决了芹菜节拍错误。使用脚本更新的配置文件如下 (celery.config):

files:
  "/opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh":
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/usr/bin/env bash

      # Create required directories
      sudo mkdir -p /var/log/celery/
      sudo mkdir -p /var/run/celery/

      # Create group called 'celery'
      sudo groupadd -f celery
      # add the user 'celery' if it doesn't exist and add it to the group with same name
      id -u celery &>/dev/null || sudo useradd -g celery celery
      # add permissions to the celery user for r+w to the folders just created
      sudo chown -R celery:celery /var/log/celery/
      sudo chown -R celery:celery /var/run/celery/

      # Get django environment variables
      celeryenv=`cat /opt/python/current/env | tr '\n' ',' | sed 's/%/%%/g' | sed 's/export //g' | sed 's/$PATH/%(ENV_PATH)s/g' | sed 's/$PYTHONPATH//g' | sed 's/$LD_LIBRARY_PATH//g'`
      celeryenv=${celeryenv%?}

      # Create CELERY configuration script
      celeryconf="[program:celeryd]
      directory=/opt/python/current/app
      ; Set full path to celery program if using virtualenv
      command=/opt/python/run/venv/bin/celery worker -A config.celery:app --loglevel=INFO --logfile=\"/var/log/celery/%%n%%I.log\" --pidfile=\"/var/run/celery/%%n.pid\"

      user=celery
      numprocs=1
      stdout_logfile=/var/log/celery-worker.log
      stderr_logfile=/var/log/celery-worker.log
      autostart=true
      autorestart=true
      startsecs=10

      ; Need to wait for currently executing tasks to finish at shutdown.
      ; Increase this if you have very long running tasks.
      stopwaitsecs = 60

      ; When resorting to send SIGKILL to the program to terminate it
      ; send SIGKILL to its whole process group instead,
      ; taking care of its children as well.
      killasgroup=true

      ; if rabbitmq is supervised, set its priority higher
      ; so it starts first
      priority=998

      environment=$celeryenv"


      # Create CELERY BEAT configuraiton script
      celerybeatconf="[program:celerybeat]
      ; Set full path to celery program if using virtualenv
      command=/opt/python/run/venv/bin/celery beat -A config.celery:app --loglevel=INFO --scheduler django_celery_beat.schedulers:DatabaseScheduler --logfile=\"/var/log/celery/celery-beat.log\" --pidfile=\"/var/run/celery/celery-beat.pid\"

      directory=/opt/python/current/app
      user=daemon
      numprocs=1
      stdout_logfile=/var/log/celerybeat.log
      stderr_logfile=/var/log/celerybeat.log
      autostart=true
      autorestart=true
      startsecs=10

      ; Need to wait for currently executing tasks to finish at shutdown.
      ; Increase this if you have very long running tasks.
      stopwaitsecs = 60

      ; When resorting to send SIGKILL to the program to terminate it
      ; send SIGKILL to its whole process group instead,
      ; taking care of its children as well.
      killasgroup=true

      ; if rabbitmq is supervised, set its priority higher
      ; so it starts first
      priority=999

      environment=$celeryenv"

      # Create the celery supervisord conf script
      echo "$celeryconf" | tee /opt/python/etc/celery.conf
      echo "$celerybeatconf" | tee /opt/python/etc/celerybeat.conf

      # Add configuration script to supervisord conf (if not there already)
      if ! grep -Fxq "celery.conf" /opt/python/etc/supervisord.conf
        then
          echo "[include]" | tee -a /opt/python/etc/supervisord.conf
          echo "files: uwsgi.conf celery.conf celerybeat.conf" | tee -a /opt/python/etc/supervisord.conf
      fi

      # Enable supervisor to listen for HTTP/XML-RPC requests.
      # supervisorctl will use XML-RPC to communicate with supervisord over port 9001.
      # Source: https://askubuntu.com/questions/911994/supervisorctl-3-3-1-http-localhost9001-refused-connection
      if ! grep -Fxq "[inet_http_server]" /opt/python/etc/supervisord.conf
        then
          echo "[inet_http_server]" | tee -a /opt/python/etc/supervisord.conf
          echo "port = 127.0.0.1:9001" | tee -a /opt/python/etc/supervisord.conf
      fi

      # Reread the supervisord config
      supervisorctl -c /opt/python/etc/supervisord.conf reread

      # Update supervisord in cache without restarting all services
      supervisorctl -c /opt/python/etc/supervisord.conf update

      # Start/Restart celeryd through supervisord
      supervisorctl -c /opt/python/etc/supervisord.conf restart celeryd
      supervisorctl -c /opt/python/etc/supervisord.conf restart celerybeat

    commands:
      01_killotherbeats:
        command: "ps auxww | grep 'celery beat' | awk '{print }' | sudo xargs kill -9 || true"
        ignoreErrors: true
      02_restartbeat:
        command: "supervisorctl -c /opt/python/etc/supervisord.conf restart celerybeat"
        leader_only: true

需要注意一点:在我的项目中celery.py文件在config目录下,所以我在启动celery worker时写-A config.celery:app 芹菜击败