Cannot launch gunicorn flask app with torch model on the docker

Does anyone have a working example of Docker running a GPU, torch, gunicorn, and flask together in one application? Torch 1.4.0 throws an exception. Please find the configuration below.
Dockerfile:
FROM nvidia/cuda:10.2-base-ubuntu18.04
# Install some basic utilities
RUN apt-get update && apt-get install -y \
curl \
ca-certificates \
sudo \
git \
bzip2 \
libx11-6 \
&& rm -rf /var/lib/apt/lists/*
# Create a working directory
RUN mkdir /app
WORKDIR /app
RUN apt-get update
RUN apt-get install -y curl python3.7 python3.7-dev python3.7-distutils
# Register the version in alternatives
RUN update-alternatives --install /usr/bin/python python /usr/bin/python3.7 1
# Set python 3 as the default python
RUN update-alternatives --set python /usr/bin/python3.7
# Upgrade pip to latest version
RUN curl -s https://bootstrap.pypa.io/get-pip.py -o get-pip.py && \
python get-pip.py --force-reinstall && \
rm get-pip.py
# Install the application's Python dependencies
COPY requirements.txt .
RUN pip --no-cache-dir install -r requirements.txt
RUN pip install torch torchvision
WORKDIR /usr/src/app
COPY . ./
CMD python ./new_main.py --workers 1
And new_main.py:
if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument("--test", action='store_true')
    parser.add_argument("--workers", type=int, default=1)
    args = parser.parse_args()
    if check_test_mode(args.test):
        number_of_GPU_workers = args.workers or 1
        options = {
            'bind': '%s:%s' % ('0.0.0.0', str(port)),
            'workers': number_of_GPU_workers,
            'timeout': 300
        }
        StandaloneApplication(app, options).run()
init()
The route I'm using:
@app.route("/api/work", methods=["POST"])
def work():
    try:
        body = request.get_json()
        if app.worker is None:
            app.worker = worker()
            app.worker.load_models()
        ...
The exception is thrown here:
2020-04-09 11:33:33,544 loading file /mnt/models/best-model
Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
The command I'm using:
sudo docker run -p 8889:8888 -e MODELSLOCATION=/mnt/models --gpus all -v $MODELSLOCATION:/mnt/models cc14ffc68256
For torch 1.4.0, the solution that worked for me is the following. You need to launch the flask app inside a separate function, init. Most importantly, you need to move import torch into this function and remove any other occurrence of it from the flask startup file. Torch 1.4.0 has some issues with multiprocessing.
def init():
    if __name__ == '__main__':
        import torch
        parser = argparse.ArgumentParser()
        parser.add_argument("--test", action='store_true')
        parser.add_argument("--workers", type=int, default=1)
        args = parser.parse_args()
        torch.multiprocessing.set_start_method('spawn')
        if check_test_mode(args.test):
            number_of_GPU_workers = args.workers or 1
            options = {
                'bind': '%s:%s' % ('0.0.0.0', str(port)),
                'workers': number_of_GPU_workers,
                'timeout': 900
            }
            StandaloneApplication(app, options).run()

init()
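The error message itself points at the fix: with the default 'fork' start method a worker inherits the parent's already-initialized CUDA context, which CUDA cannot survive, while 'spawn' gives each worker a fresh interpreter. A minimal stdlib-only sketch of the spawn behavior (no torch or gunicorn required; the function names here are illustrative):

```python
import multiprocessing as mp


def child(q):
    q.put("ok")


def run_spawned():
    # 'spawn' starts the worker as a brand-new interpreter, so state that
    # cannot survive fork() (such as an initialized CUDA context) is never
    # inherited; this is what torch.multiprocessing.set_start_method('spawn')
    # arranges for torch workers.
    ctx = mp.get_context("spawn")
    q = ctx.Queue()
    p = ctx.Process(target=child, args=(q,))
    p.start()
    result = q.get()
    p.join()
    return result


if __name__ == "__main__":
    print(run_spawned())  # -> ok
```

The `if __name__ == '__main__'` guard matters with spawn: the child re-imports the main module, so any unguarded top-level code (including process creation) would run again in every worker.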