Chronos can't run a private Docker container

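I'm experimenting with DC/OS installed on a local host. Everything else works, but I can't seem to run a Docker image that lives in a private repository. I'm using Python to communicate with Chronos: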

@celery.task(name='add-job', soft_time_limit=5)
def add_job(job_id):
    job_document = mongo.jobs.find_one({
        '_id': job_id
    })

    if job_document:
        worker_document = mongo.workers.find_one({
            '_id': job_document['workerId']
        })

        if worker_document:
            job = {
                'async': True,
                'name': job_document['_id'],
                'owner': 'owner@gmail.com',
                'command': "python /code/run.py",
                "disabled": False,
                "shell": True,
                "cpus": worker_document['cpus'],
                "disk": worker_document['disk'],
                "mem": worker_document['memory'],
                'schedule': 'R1//PT300S',  # start now
                "epsilon": "PT60M",
                "container": {
                    "type": "DOCKER",
                    "forcePullImage": True,
                    "image": "quay.io/username/container",
                    "network": "HOST",
                    "volumes": [{
                        "containerPath": "/images/",
                        "hostPath": "/images/",
                        "mode": "RW"
                    }]
                },
                "uris": [
                    "file:///images/docker.tar.gz"
                ]
            }
            return chronos_client.add(job)
        else:
            return 'worker not found'
    else:
        return 'job not found'

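The job runs fine with a public image (alpine:latest), but with the private image it fails on the DC/OS installation without reporting any error.

The job is executed but fails immediately. The error log for the job in Chronos looks like this: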

I1212 12:39:11.141639 25058 fetcher.cpp:498] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/61d6d037-c9f5-482b-a441-11d85554461b-S1\/root","items":[{"action":"BYPASS_CACHE","uri":{"cache":false,"executable":false,"extract":false,"value":"file:\/\/\/images\/docker.tar.gz"}}],"sandbox_directory":"\/var\/lib\/mesos\/slave\/slaves\/61d6d037-c9f5-482b-a441-11d85554461b-S1\/docker\/links\/7029bbea-4c3d-439a-8720-411f6fe40eb9","user":"root"}
I1212 12:39:11.143575 25058 fetcher.cpp:409] Fetching URI 'file:///images/docker.tar.gz'
I1212 12:39:11.143587 25058 fetcher.cpp:250] Fetching directly into the sandbox directory
I1212 12:39:11.143602 25058 fetcher.cpp:187] Fetching URI 'file:///images/docker.tar.gz'
I1212 12:39:11.143612 25058 fetcher.cpp:167] Copying resource with command:cp '/images/docker.tar.gz' '/var/lib/mesos/slave/slaves/61d6d037-c9f5-482b-a441-11d85554461b-S1/docker/links/7029bbea-4c3d-439a-8720-411f6fe40eb9/docker.tar.gz'
I1212 12:39:11.146726 25058 fetcher.cpp:547] Fetched 'file:///images/docker.tar.gz' to '/var/lib/mesos/slave/slaves/61d6d037-c9f5-482b-a441-11d85554461b-S1/docker/links/7029bbea-4c3d-439a-8720-411f6fe40eb9/docker.tar.gz'

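The stdout is empty. When I run the same settings directly in Marathon as an application, authentication works and my image is downloaded and executed. Is this something Chronos doesn't support? It should... I mean, it has the commands for Docker.

Update: digging deeper into the agent logs, I found this: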

Failed to run 'docker -H unix:///var/run/docker.sock pull quay.io/username/container': exited with status 1; stderr='Error: Status 403 trying to pull repository username/container: "{\"error\": \"Permission Denied\"}"

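I tried the archive and its config.json file on the agent itself, and the image can be pulled when I trigger it from the command line. I can't understand why Chronos isn't using it correctly. Apart from this, I couldn't find any other reference on where to put my credentials.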
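One thing worth ruling out at this point is the layout of the archive itself. The sketch below is only a diagnostic idea, not something from the Chronos documentation: it assumes the Docker credentials are expected at `.docker/config.json` at the top level of the tarball (the layout the Marathon private-registry instructions describe), and `has_docker_config` is a hypothetical helper name.

```python
import tarfile


def has_docker_config(archive_path):
    """Return True if the gzipped tarball holds .docker/config.json at its root.

    Assumption: the credentials are expected at .docker/config.json,
    as in the Marathon private-registry instructions.
    """
    with tarfile.open(archive_path, "r:gz") as tar:
        # Normalise member names such as "./.docker/config.json".
        names = {
            name[2:] if name.startswith("./") else name
            for name in tar.getnames()
        }
    return ".docker/config.json" in names


# Example call on the agent (path taken from the question):
#     has_docker_config("/images/docker.tar.gz")
```

If this returns False, the archive would need to be rebuilt before looking at Chronos itself.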

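Your post looks a bit like this one, which turned out to be a volume issue.

It turns out... the uris parameter has been deprecated in favor of fetch. I started from scratch, applying my Marathon configuration to Chronos, and looked closely at the logs when I saw the following: {'message': 'Tried to add both uri (deprecated) and fetch parameters on aBPepwhG5z33e4teG', 'status': 'Bad Request'}. I then changed my uris parameter to: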

"fetch": [{
    "uri": "/images/docker.tar.gz",
    "extract": true,
    "executable": false,
    "cache": false
}]

...and it worked.
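For the Python job dictionary in the question, the same change could be sketched as below. This is a minimal illustration, not part of any Chronos client: `uris_to_fetch` is a hypothetical helper, and the defaults simply mirror the values from the fetch block above. Note that the fetcher log in the question shows "extract":false for the archive, so the tarball was copied into the sandbox but never unpacked, which would be consistent with the 403 on pull.

```python
import copy


def uris_to_fetch(job, extract=True, executable=False, cache=False):
    """Return a copy of a job dict with the deprecated `uris` list
    rewritten as `fetch` entries (hypothetical helper)."""
    migrated = copy.deepcopy(job)
    uris = migrated.pop("uris", [])
    fetch = list(migrated.get("fetch", []))
    for uri in uris:
        # The URI value is kept as given; the answer above used the
        # bare path "/images/docker.tar.gz" rather than a file:// URI.
        fetch.append({
            "uri": uri,
            "extract": extract,
            "executable": executable,
            "cache": cache,
        })
    if fetch:
        migrated["fetch"] = fetch
    return migrated


job = {"name": "example-job", "uris": ["/images/docker.tar.gz"]}
migrated = uris_to_fetch(job)
print(migrated)
```

The helper removes `uris` entirely because, according to the error message quoted above, Chronos rejects a job that carries both parameters.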