2018/09/30

Trying headless Firefox with Selenium on Debian on GCP


In short, it did not work.
The same code runs on my local Mac, but not on GCP.


Below is a record of what I did.

Following the sites below, I installed geckodriver.
https://a-zumi.net/selenium-ubuntu-geckodriver/
https://github.com/mozilla/geckodriver/releases


Installed Firefox.
sudo apt-get install firefox-esr
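
Before starting Selenium, it may help to confirm that both executables are actually visible on PATH. The following is only a minimal sketch using the standard library. The helper name locate_binaries is hypothetical, and the executable names geckodriver and firefox-esr are assumptions that may differ on your system.

```python
import shutil


def locate_binaries(names):
    """Return {name: absolute path or None} for each executable name."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # Executable names are assumptions; adjust them to your installation.
    for name, path in locate_binaries(["geckodriver", "firefox-esr"]).items():
        print(name, "->", path or "NOT FOUND on PATH")
```

A None entry means that the shell, and therefore Selenium, would not find that executable by name.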

I also tried adding the Mozilla repository to get the latest version, and pulling down a deb file to install an older version, but in the end neither made much difference.

Ran the following in python3.

from selenium import webdriver
from selenium.webdriver.firefox.options import Options
options = Options()
# Start Firefox without a display
options.add_argument('-headless')
driver = webdriver.Firefox(firefox_options=options)


This produces the following error.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/bitnami/python/lib/python3.6/site-packages/selenium/webdriver/firefox/webdriver.py", line 170, in __init__
    keep_alive=True)
  File "/opt/bitnami/python/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 156, in __init__
    self.start_session(capabilities, browser_profile)
  File "/opt/bitnami/python/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 251, in start_session
    response = self.execute(Command.NEW_SESSION, parameters)
  File "/opt/bitnami/python/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 320, in execute
    self.error_handler.check_response(response)
  File "/opt/bitnami/python/lib/python3.6/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.SessionNotCreatedException: Message: Unable to find a matching set of capabilities


After some trial and error, setting marionette to False in the capabilities made the capabilities error go away.


from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from selenium.webdriver.firefox.options import Options
options = Options()
options.add_argument('-headless')
# Copy the default Firefox capabilities and turn marionette off
desiredcapabilities = DesiredCapabilities.FIREFOX.copy()
desiredcapabilities['marionette'] = False
driver = webdriver.Firefox(capabilities=desiredcapabilities, firefox_options=options)


However, the following error still appears.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/bitnami/python/lib/python3.6/site-packages/selenium/webdriver/firefox/webdriver.py", line 187, in __init__
    self.binary, timeout)
  File "/opt/bitnami/python/lib/python3.6/site-packages/selenium/webdriver/firefox/extension_connection.py", line 52, in __init__
    self.binary.launch_browser(self.profile, timeout=timeout)
  File "/opt/bitnami/python/lib/python3.6/site-packages/selenium/webdriver/firefox/firefox_binary.py", line 73, in launch_browser
    self._wait_until_connectable(timeout=timeout)
  File "/opt/bitnami/python/lib/python3.6/site-packages/selenium/webdriver/firefox/firefox_binary.py", line 104, in _wait_until_connectable
    "The browser appears to have exited "
selenium.common.exceptions.WebDriverException: Message: The browser appears to have exited before we could connect. If you specified a log_file in the FirefoxBinary constructor, check it for details.


The following writes a log to a file (geckodriver.log here).

from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
desiredcapabilities = DesiredCapabilities.FIREFOX.copy()
desiredcapabilities['marionette'] = False
# Output of the launched browser process is written to this file
binary = FirefoxBinary(log_file=open("geckodriver.log", "wb"))
binary.add_command_line_options('-headless')
driver = webdriver.Firefox(capabilities=desiredcapabilities, firefox_binary=binary)

The log contains the following.

XPCOMGlueLoad error for file /usr/lib/firefox-esr/libmozgtk.so:
/usr/lib/x86_64-linux-gnu/libpangoft2-1.0.so.0: undefined symbol: FcConfigReference
Couldn't load XPCOM.
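
When the log is long, the relevant line can be easy to miss. As a rough sketch, lines of the form "<library>: undefined symbol: <symbol>" can be pulled out with a regular expression. The function name find_undefined_symbols is hypothetical, and the pattern only assumes the line format seen in the log above.

```python
import re

# Matches dynamic-linker lines such as
# "/path/to/lib.so: undefined symbol: SomeSymbol"
_UNDEFINED = re.compile(r"^(?P<library>\S+): undefined symbol: (?P<symbol>\S+)")


def find_undefined_symbols(log_text):
    """Return a list of (library, symbol) pairs found in the log text."""
    pairs = []
    for line in log_text.splitlines():
        match = _UNDEFINED.match(line.strip())
        if match:
            pairs.append((match.group("library"), match.group("symbol")))
    return pairs

# Example call on the log file written above:
# find_undefined_symbols(open("geckodriver.log", errors="replace").read())
```

For the log shown here, this would report libpangoft2-1.0.so.0 together with the symbol FcConfigReference, which points at a mismatch between shared libraries and not at Selenium itself.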

I could not resolve this error no matter what I tried, so I gave up.
I decided to switch to Chrome.


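2018/10/8 update
Switching to Chrome produced the same kind of problem, and after trying all sorts of things without success I was at a loss.
In the end, the problem seems to have been that the VM was running Bitnami's djangostack. When I launched a new VM without Bitnami and tried again, it worked right away, which was quite an anticlimax.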
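
One plausible explanation, which is an assumption and was not verified in this post, is that the Bitnami environment put its own bundled shared libraries ahead of the system ones (for example through LD_LIBRARY_PATH), so that a libfontconfig without FcConfigReference was loaded. The sketch below lists candidate files in the LD_LIBRARY_PATH directories. The function name shadowing_libraries and the default file pattern are hypothetical choices for illustration.

```python
import glob
import os


def shadowing_libraries(pattern="libfontconfig.so*", env=None):
    """List files matching `pattern` in each LD_LIBRARY_PATH directory.

    A non-empty result means a copy of the library outside the system
    default locations may be picked up first by the dynamic linker.
    """
    env = os.environ if env is None else env
    hits = []
    for directory in env.get("LD_LIBRARY_PATH", "").split(os.pathsep):
        if directory:  # skip empty entries
            hits.extend(sorted(glob.glob(os.path.join(directory, pattern))))
    return hits


if __name__ == "__main__":
    for path in shadowing_libraries():
        print("possible override:", path)
```

An empty result does not rule out other causes. It only shows that LD_LIBRARY_PATH is not pointing at another copy of the library.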