Selenium WebDriver exception while retrieving Google Play reviews
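I am trying to retrieve app-related information from the Google Play Store using Selenium and BeautifulSoup. When I try to retrieve the information, I get a WebDriverException. I have checked the Chrome version and the ChromeDriver version, and the two are compatible. Below are the URL that triggers the problem, the code that retrieves the information, and the error the code throws:
Code: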
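# Imports assumed from the names used below; the original snippet omitted them.
from selenium import webdriver
import bs4 as bs
driver = webdriver.Chrome('path')
driver.get('https://play.google.com/store/apps/details?id=com.tudasoft.android.BeMakeup&hl=en&showAllReviews=true')
soup = bs.BeautifulSoup(driver.page_source, 'html.parser')
The error occurs on the last line, where driver.page_source is read. Below are parts of the error message:
Beginning of the error message: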
---------------------------------------------------------------------------
WebDriverException Traceback (most recent call last)
<ipython-input-280-4e8a1ef443f2> in <module>()
----> 1 soup = bs.BeautifulSoup(driver.page_source, 'html.parser')
~/anaconda3/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py in page_source(self)
676 driver.page_source
677 """
--> 678 return self.execute(Command.GET_PAGE_SOURCE)['value']
679
680 def close(self):
~/anaconda3/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py in execute(self, driver_command, params)
318 response = self.command_executor.execute(driver_command, params)
319 if response:
--> 320 self.error_handler.check_response(response)
321 response['value'] = self._unwrap_value(
322 response.get('value', None))
~/anaconda3/lib/python3.6/site-packages/selenium/webdriver/remote/errorhandler.py in check_response(self, response)
240 alert_text = value['alert'].get('text')
241 raise exception_class(message, screen, stacktrace, alert_text)
--> 242 raise exception_class(message, screen, stacktrace)
243
244 def _value_or_default(self, obj, key, default):
WebDriverException: Message: unknown error: bad inspector message:
End of the error message:
(Session info: chrome=79.0.3945.117)
Can anyone guide me on how to resolve this?
Try this:
driver = webdriver.Chrome('path')
driver.get('https://play.google.com/store/apps/details?id=com.tudasoft.android.BeMakeup&hl=en&showAllReviews=true')
# retrieve data you want, for example
review_user_list = driver.find_elements_by_class_name("X43Kjb")
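I think this is caused by a ChromeDriver encoding issue.
For more information about this error, see https://bugs.chromium.org/p/chromium/issues/detail?id=723592#c9.
Instead of going through Selenium, you can fetch the page source with requests and parse it with BeautifulSoup, as shown below.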
import requests
from bs4 import BeautifulSoup
r = requests.get('https://play.google.com/store/apps/details?id=com.tudasoft.android.BeMakeup&hl=en&showAllReviews=true')
soup = BeautifulSoup(r.content, "lxml")
print(soup)
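Once you have a soup object, you would normally pull out specific elements rather than print the whole document. The sketch below shows the lookup on a small hand-written HTML string. SAMPLE_HTML and extract_texts are hypothetical names, and whether the static response from Google Play actually contains elements with the X43Kjb class is an assumption this thread does not confirm.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for a downloaded page; not real Google Play markup.
SAMPLE_HTML = """
<div><span class="X43Kjb">Alice</span></div>
<div><span class="X43Kjb">Bob</span></div>
<div><span class="other">not a reviewer</span></div>
"""


def extract_texts(html, class_name):
    """Return the stripped text of every tag that has the given CSS class."""
    soup = BeautifulSoup(html, "html.parser")
    return [tag.get_text(strip=True) for tag in soup.find_all(class_=class_name)]


print(extract_texts(SAMPLE_HTML, "X43Kjb"))  # → ['Alice', 'Bob']
```

In the answer above, the same function could be applied to r.content in place of SAMPLE_HTML.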
You can use urllib with BeautifulSoup as follows:
Code block:
# -*- coding: UTF-8 -*-
from bs4 import BeautifulSoup
from urllib.request import urlopen as uReq
url = "https://play.google.com/store/apps/details?id=com.tudasoft.android.BeMakeup&hl=en&showAllReviews=true"
uClient = uReq(url)
page_html = uClient.read()
uClient.close()
page_soup = BeautifulSoup(page_html, "html.parser")
print(page_soup)
Console output:
<!DOCTYPE doctype html>
<html dir="ltr" lang="en"><head><base href="https://play.google.com/"/><meta content="origin" name="referrer"/><link href="/opensearch.xml" rel="search" title="Google Play" type="application/opensearchdescription+xml"/>
.
.
.
<style nonce="96JYwPKBYhVDb+ABipwCww">@font-face{font-family:'Roboto';font-style:normal;font-weight:100;src:local('Roboto Thin'),local('Roboto-Thin'),url(//fonts.gstatic.com/s/roboto/v18/KFOkCnqEu92Fr1MmgVxIIzc.ttf)format('truetype');}@font-face{font-family:'Roboto';font-style:normal;font-weight:300;src:local('Roboto Light'),local('Roboto-Light'),url(//fonts.gstatic.com/s/roboto/v18/KFOlCnqEu92Fr1MmSU5fBBc9.ttf)format('truetype');}@font-face{font-family:'Roboto';font-style:normal;font-weight:400;src:local('Roboto Regular'),local('Roboto-Regular'),url(//fonts.gstatic.com/s/roboto/v18/KFOmCnqEu92Fr1Mu4mxP.ttf)format('truetype');}@font-face{font-family:'Roboto';font-style:normal;font-weight:500;src:local('Roboto Medium'),local('Roboto-Medium'),url(//fonts.gstatic.com/s/roboto/v18/KFOlCnqEu92Fr1MmEU9fBBc9.ttf)format('truetype');}@font-face{font-family:'Roboto';font-style:normal;font-weight:700;src:local('Roboto Bold'),local('Roboto-Bold'),url(//fonts.gstatic.com/s/roboto/v18/KFOlCnqEu92Fr1MmWUlfBBc9.ttf)format('truetype');}@font-face{font-family:'Material Icons Extended';font-style:normal;font-weight:400;src:url(//fonts.gstatic.com/s/materialiconsextended/v50/kJEjBvgX7BgnkSrUwT8UnLVc38YydejYY-oE_LvM.ttf)format('truetype');}.material-icons-extended{font-family:'Material Icons Extended';font-weight:normal;font-style:normal;font-size:24px;line-height:1;letter-spacing:normal;text-transform:none;display:inline-block;white-space:nowrap;word-wrap:normal;direction:ltr;}@font-face{font-family:'Product Sans';font-style:normal;font-weight:400;src:local('Product Sans'),local('ProductSans-Regular'),url(//fonts.gstatic.com/s/productsans/v9/pxiDypQkot1TnFhsFMOfGShVF9eL.ttf)format('truetype');}</style><script nonce="96JYwPKBYhVDb+ABipwCww">(function(){/*
Copyright The Closure Library Authors.
SPDX-License-Identifier: Apache-2.0
*/
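A slightly more defensive variant of the urllib approach is sketched below. It builds the query string from parameters, sends a User-Agent header, sets a timeout, and decodes the response using the charset the server reports. The function names, the User-Agent value, and the 10-second timeout are assumptions chosen for illustration. Keep in mind that a plain HTTP fetch only returns the initial HTML, so anything the page loads later through JavaScript will not be in it.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

PLAY_DETAILS_URL = "https://play.google.com/store/apps/details"


def build_details_request(app_id, lang="en", show_all_reviews=True):
    """Build a Request for an app's details page without sending it."""
    params = {"id": app_id, "hl": lang}
    if show_all_reviews:
        params["showAllReviews"] = "true"
    url = f"{PLAY_DETAILS_URL}?{urlencode(params)}"
    # Assumption: a generic browser-like User-Agent; adjust as needed.
    return Request(url, headers={"User-Agent": "Mozilla/5.0"})


def fetch_html(request, timeout=10):
    """Send the request and return the body decoded to text."""
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling fetch_html(build_details_request("com.tudasoft.android.BeMakeup")) performs the actual network request, and the returned string can be passed to BeautifulSoup(..., "html.parser") as in the answer above.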