Execute Python scripts with Selenium via Crontab

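I have several Python scripts on a Debian server that use Selenium WebDriver. If I run them manually from a terminal (usually as root), everything works, but every time I try to run them via crontab I get an exception like this: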

WebDriverException: Message: Can't load the profile. Profile Dir: /tmp/tmpQ4vStP If you specified a log_file in the FirefoxBinary constructor, check it for details.

For example, take this script:

from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
from pyvirtualdisplay import Display
from selenium import webdriver
import datetime
import logging

FIREFOX_PATH = '/usr/bin/firefox'

if __name__ == '__main__':
    cur_date = datetime.datetime.now().strftime('%Y-%m-%d')
    logging.basicConfig(filename="./logs/download_{0}.log".format(cur_date),
                        filemode='w',
                        level=logging.DEBUG,
                        format='%(asctime)s - %(levelname)s - %(message)s')
    try:
        display = Display(visible=0, size=(800, 600))
        display.start()
        print 'start'
        logging.info('start')
        binary = FirefoxBinary(FIREFOX_PATH,
                               log_file='/home/egor/dev/test/logs/firefox_binary_log.log')
        driver = webdriver.Firefox()
        driver.get("http://google.com")
        logging.info('title: ' + driver.title)
        driver.quit()
        display.stop()
    except:
        logging.exception('')
    logging.info('finish')
    print 'finish'

The crontab command for it:

0 13 * * * cd "/home/egor/dev/test" && python test.py

The log file for this script looks like this:

2016-09-27 16:30:01,742 - DEBUG - param: "['Xvfb', '-help']" 
2016-09-27 16:30:01,743 - DEBUG - command: ['Xvfb', '-help']
2016-09-27 16:30:01,743 - DEBUG - joined command: Xvfb -help
2016-09-27 16:30:01,745 - DEBUG - process was started (pid=23042)
2016-09-27 16:30:01,747 - DEBUG - process has ended
2016-09-27 16:30:01,748 - DEBUG - return code=0
2016-09-27 16:30:01,748 - DEBUG - stdout=
2016-09-27 16:30:01,751 - DEBUG - param: "['Xvfb', '-br', '-nolisten', 'tcp', '-screen', '0', '800x600x24', ':1724']" 
2016-09-27 16:30:01,751 - DEBUG - command: ['Xvfb', '-br', '-nolisten', 'tcp', '-screen', '0', '800x600x24', ':1724']
2016-09-27 16:30:01,751 - DEBUG - joined command: Xvfb -br -nolisten tcp -screen 0 800x600x24 :1724
2016-09-27 16:30:01,753 - DEBUG - param: "['Xvfb', '-br', '-nolisten', 'tcp', '-screen', '0', '800x600x24', ':1725']" 
2016-09-27 16:30:01,753 - DEBUG - command: ['Xvfb', '-br', '-nolisten', 'tcp', '-screen', '0', '800x600x24', ':1725']
2016-09-27 16:30:01,753 - DEBUG - joined command: Xvfb -br -nolisten tcp -screen 0 800x600x24 :1725
2016-09-27 16:30:01,755 - DEBUG - process was started (pid=23043)
2016-09-27 16:30:01,755 - DEBUG - DISPLAY=:1725
2016-09-27 16:30:01,855 - INFO - start
2016-09-27 16:30:31,965 - ERROR - 
Traceback (most recent call last):
  File "test.py", line 31, in <module>
    driver = webdriver.Firefox()
  File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/firefox/webdriver.py", line 103, in __init__
    self.binary, timeout)
  File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/firefox/extension_connection.py", line 51, in __init__
    self.binary.launch_browser(self.profile, timeout=timeout)
  File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/firefox/firefox_binary.py", line 68, in launch_browser
    self._wait_until_connectable(timeout=timeout)
  File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/firefox/firefox_binary.py", line 106, in _wait_until_connectable
    % (self.profile.path))
WebDriverException: Message: Can't load the profile. Profile Dir: /tmp/tmpQ4vStP If you specified a log_file in the FirefoxBinary constructor, check it for details.

2016-09-27 16:30:31,966 - INFO - finish

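What I have tried:

  1. Making sure the script files are owned by root
  2. Using export DISPLAY=:0; or export DISPLAY=:99; in the crontab command
  3. Setting the HOME variable in the crontab to the home directory of the user the cron job runs as

I am really stuck on this problem.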

I have Python 2.7.10, Selenium 2.53.6 with Xvfb, and Firefox 47.0.1 on Debian 7.7.

Try hard-coding the Firefox binary and passing it to the driver explicitly: https://seleniumhq.github.io/selenium/docs/api/py/webdriver_firefox/selenium.webdriver.firefox.firefox_binary.html

from selenium import webdriver
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
binary = FirefoxBinary("/your/binary/location/firefox")
driver = webdriver.Firefox(firefox_binary=binary)

The problem is related to environment variables: cron is started by the system and knows nothing about the user's environment.
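One way to see which variables cron is missing is to dump the environment once from an interactive shell and once from a cron job, then compare the two. The following is only a minimal sketch of that idea; the helper names `dump_env` and `diff_env` and the JSON file format are my own assumptions, not something from the original scripts.

```python
import json
import os


def dump_env(path):
    """Write the current process environment to `path` as JSON.

    Run this once from a terminal and once from a cron job
    (using two different, hypothetical output paths).
    """
    with open(path, "w") as fh:
        json.dump(dict(os.environ), fh, indent=2, sort_keys=True)


def diff_env(interactive, cron):
    """Compare two environment dicts.

    Returns (missing, changed): names present interactively but absent
    under cron, and names present in both whose values differ.
    """
    missing = sorted(k for k in interactive if k not in cron)
    changed = sorted(k for k in interactive
                     if k in cron and interactive[k] != cron[k])
    return missing, changed
```

Loading the two JSON files with `json.load` and passing them to `diff_env` would typically show PATH among the changed variables and DISPLAY among the missing ones, though the exact result depends on the machine.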

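So the solution is to have cron run a shell script that first sets the required environment variables and then runs the Python script. In my case, I needed to set the PATH variable like this:

PATH=/root/anaconda3/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

Setting the HOME or DISPLAY variables may also be useful in some cases.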
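The same idea can also be sketched as a small Python wrapper instead of a shell script, for anyone who prefers to keep everything in one language. This is an illustration under assumptions, not the exact script I used: the function names `build_env` and `run_script` are hypothetical, the PATH value is the one quoted above, and HOME or DISPLAY are only overridden if you pass them.

```python
import os
import subprocess
import sys

# PATH value quoted in the answer above; adjust it for your own machine.
CRON_PATH = ("/root/anaconda3/bin:/usr/local/sbin:/usr/local/bin:"
             "/usr/sbin:/usr/bin:/sbin:/bin")


def build_env(base=None, path=None, home=None, display=None):
    """Return a copy of `base` (default: os.environ) with overrides applied.

    Only the variables that are not None are changed.
    """
    env = dict(os.environ if base is None else base)
    for name, value in (("PATH", path), ("HOME", home), ("DISPLAY", display)):
        if value is not None:
            env[name] = value
    return env


def run_script(script, env, cwd=None):
    """Run `script` with the current interpreter and the given environment.

    Returns the exit code of the child process.
    """
    return subprocess.call([sys.executable, script], env=env, cwd=cwd)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical file name): python run_with_env.py test.py
    sys.exit(run_script(sys.argv[1], build_env(path=CRON_PATH)))
```

The crontab entry would then call the wrapper with the target script as its argument instead of calling the script directly. Since pyvirtualdisplay sets DISPLAY itself once the virtual display has started, overriding DISPLAY here is probably only needed when no virtual display is used.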