An error occurs in the remoteDriver function of RSelenium

I want to scrape data from a dynamic web page with the following code:

> URL<- "http://www.cbooo.cn/realtime"
> library(bitops)
> library(RCurl)
> library(XML)
> library(RSelenium)
> library(magrittr)
> checkForServer()
Warning message:
checkForServer is deprecated.
Users in future can find the function in file.path(find.package("RSelenium"), "example/serverUtils").
The sourcing/starting of a Selenium Server is a users responsiblity. 
Options include manually starting a server see vignette("RSelenium-basics", package = "RSelenium")
and running a docker container see  vignette("RSelenium-docker", package = "RSelenium") 
> startServer()
$stop
function () 
{
    tools::pskill(selPID)
}
<environment: 0x10991af0>

$getPID
function () 
{
    return(selPID)
}
<environment: 0x10991af0>

Warning message:
startServer is deprecated.
Users in future can find the function in file.path(find.package("RSelenium"), "example/serverUtils").
The sourcing/starting of a Selenium Server is a users responsiblity. 
Options include manually starting a server see vignette("RSelenium-basics", package = "RSelenium")
and running a docker container see  vignette("RSelenium-docker", package = "RSelenium") 
> remDrv <- remoteDriver()
> remDrv$browserName="Internet Explorer"
> remDrv$open()
[1] "Connecting to remote server"

Selenium message: The best matching driver provider org.openqa.selenium.ie.InternetExplorerDriver can't create a new driver instance for Capabilities [{nativeEvents=true, browserName=Internet Explorer, javascriptEnabled=true, version=, platform=ANY}]
Build info: version: '2.53.1', revision: 'a36b8b1', time: '2016-06-30 17:37:03'
System info: host: 'DESKTOP-J0D980N', ip: '10.36.17.76', os.name: 'Windows 10', os.arch: 'x86', os.version: '10.0', java.version: '1.8.0_77'
Driver info: driver.version: unknown 
Error:   Summary: UnknownError
     Detail: An unknown server-side error occurred while processing the command.
     class: org.openqa.selenium.WebDriverException
     Further Details: run errorDetails method

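I have the following problems that I cannot solve:

1. checkForServer and startServer are deprecated.
2. Connecting to the remote server always fails, and I don't know how to set the parameters of this function or what I should do.

I hope to get an answer soon. Thanks.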

To get a working solution, I would use an older version of RSelenium and set everything up with this code.

if (!require("XML")) {
  install.packages("XML",repos= 'https://cloud.r-project.org') 
  library("XML")
}
#XML is a dependency
if (!require("RSelenium")) {
  install.packages("https://cran.r-project.org/src/contrib/Archive/RSelenium/RSelenium_1.3.5.tar.gz", repos=NULL, type="source", dependencies = TRUE)
  library("RSelenium")
}
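# expand "~" up front: it is not expanded inside the quoted path passed to java
jar_path <- path.expand("~/Documents/R/library/RSelenium/bin/selenium-server-standalone.jar")
# mode = "wb" keeps the binary jar intact on Windows
download.file('http://selenium-release.storage.googleapis.com/2.53/selenium-server-standalone-2.53.1.jar', destfile = jar_path, mode = "wb")

#start server (wait = FALSE so the R session is not blocked)
system(paste0('java -jar "', jar_path, '"'), wait = FALSE)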

library(RSelenium)
checkForServer()
startServer()

This is not the best solution, but it works.
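Since the question reports that connecting to the remote server always fails, it may help to check that something is listening on the server port before calling remDrv$open(). The sketch below uses only base R. The helper name server_is_up is hypothetical, and port 4444 is assumed to be the port the standalone server was started on.

```r
# Hypothetical helper (not part of RSelenium): TRUE if a TCP connection
# to host:port can be opened within `timeout` seconds, FALSE otherwise.
server_is_up <- function(host = "localhost", port = 4444L, timeout = 2) {
  con <- tryCatch(
    suppressWarnings(
      socketConnection(host = host, port = port, open = "r+",
                       blocking = TRUE, timeout = timeout)
    ),
    error = function(e) NULL  # connection refused or timed out
  )
  if (is.null(con)) {
    return(FALSE)
  }
  close(con)
  TRUE
}

# Assumption: the Selenium server was started on port 4444
if (server_is_up()) {
  message("Something is listening on port 4444; remDrv$open() can be tried.")
} else {
  message("No server reachable on port 4444; start the Selenium jar first.")
}
```

This only tells you that the port is open. It does not confirm that the listener is a Selenium server or that a browser driver is available to it.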

The author of RSelenium gives the following solution (https://github.com/ropensci/RSelenium/issues/81):

From Firefox 48 onwards, the marionette gecko driver is needed to run Firefox with Selenium.

If you have Firefox 48, you can run the gecko driver as follows:

Refer to the guide

https://developer.mozilla.org/en-US/docs/Mozilla/QA/Marionette/WebDriver

Download the relevant gecko driver from https://github.com/mozilla/geckodriver/releases

Add it to your PATH, or refer to its location when starting the binary (see below).

# get beta selenium standalone
RSelenium::checkForServer(beta = TRUE)
# assume gecko driver is not in our path (assume windows and we downloaded to docs folder)
# if the driver is in your PATH the javaargs call is not needed
selServ <- RSelenium::startServer(javaargs = c("-Dwebdriver.gecko.driver=\"C:/Users/john/Documents/geckodriver.exe\""))
remDr <- remoteDriver(extraCapabilities = list(marionette = TRUE))
remDr$open()
....
....
remDr$close()
selServ$stop()
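The comments in the snippet say the javaargs call is only needed when the gecko driver is not in your PATH. A minimal sketch of that decision in base R follows. The helper name gecko_javaargs and its on_path parameter are hypothetical, and it assumes the executable is named geckodriver.

```r
# Hypothetical helper (not part of RSelenium): build the extra Java argument
# that points Selenium at geckodriver, or NULL if the driver is on the PATH.
gecko_javaargs <- function(driver_path = NULL,
                           on_path = unname(nzchar(Sys.which("geckodriver")))) {
  # Sys.which() returns "" when the executable cannot be found on the PATH
  if (on_path) {
    return(NULL)
  }
  if (is.null(driver_path)) {
    stop("geckodriver is not on the PATH and no driver_path was given")
  }
  # Quote the path so that directories containing spaces survive
  paste0("-Dwebdriver.gecko.driver=\"", driver_path, "\"")
}

# Same path as in the snippet above, forcing the "not on PATH" branch
gecko_javaargs("C:/Users/john/Documents/geckodriver.exe", on_path = FALSE)
```

When the result is not NULL, it is the string that the snippet above passes as javaargs to startServer.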