Scraping "load more" button giving error - unable to locate element

I am trying to reproduce the code from the accepted answer to this post, on the website https://www.coindesk.com/. However, the following line gives an error:

#original    
#load_btn <- ffd$findElement(using = "css selector", ".load-more .btn")
#modified
load_btn <- ffd$findElement(using = "css selector", ".load-more-stories .btn")

Selenium message:Unable to locate element: load-more-stories For documentation on this error, please visit: https://www.seleniumhq.org/exceptions/no_such_element.html Build info: version: '4.0.0-alpha-2', revision: 'f148142cf8', time: '2019-07-01T21:30:10' System info: host: 'LAPTOP-sdsds9L', ip: 'sdssd', os.name: 'Windows 10', os.arch: 'x86', os.version: '10.0', java.version: '1.8.0_211' Driver info: driver.version: unknown

Error: Summary: NoSuchElement Detail: An element could not be located on the page using the given search parameters. class: org.openqa.selenium.NoSuchElementException Further Details: run errorDetails method

I guessed the button name from lines 449-452 of the page source:

 </div>
            <div id="load-more-stories">
    <button>Load More Stories</button>
</div>        </div>

Any idea how to adjust this approach correctly?

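Diagnosis: you are running into this problem because the button does not redirect to another page; it appends more article links to the current page. I wrote this in Web Scraping Language: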

GOTO www.coindesk.com >> CRAWL ['#load-more-stories', 3] .stream-article >> EXTRACT {'title':'.meta h1', 'article':'.article-content'}

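Explanation: this should crawl all articles up to the 3rd page by clicking #load-more-stories, the "Load More Stories" link at the bottom. It then visits each link using the selector .stream-article and, on each subsequent page, extracts the title and article using the corresponding selectors.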
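For readers who want to stay in R, the link-collection step described above could be sketched roughly as follows. This is only an illustration, not a tested scraper: collect_links is a hypothetical helper name, and it assumes that the elements matched by .stream-article are the article links themselves and carry an href attribute. The driver argument stands for an RSelenium remote driver such as ffd in the question.

```r
# Hypothetical helper: gather the unique article URLs currently on the page.
# Assumption: elements matching `.stream-article` expose an `href` attribute.
collect_links <- function(driver, css = ".stream-article") {
  elements <- driver$findElements(using = "css selector", value = css)
  hrefs <- vapply(elements, function(el) {
    href <- el$getElementAttribute("href")  # RSelenium returns a list
    if (length(href) == 0) NA_character_ else as.character(href[[1]])
  }, character(1))
  unique(hrefs[!is.na(hrefs)])
}

# Usage (needs a live RSelenium session in `ffd`; not run here):
# links <- collect_links(ffd)
# ffd$navigate(links[1])
# title <- ffd$findElement(using = "css selector", value = ".meta h1")$getElementText()[[1]]
```

Collecting the URLs first and navigating afterwards avoids holding on to element references that go stale once the browser leaves the listing page.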

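An HTML id= is not the same as a CSS class.

So your selector is wrong and does not match: .load-more-stories looks for class="load-more-stories", whereas the page uses id="load-more-stories", which is selected with #load-more-stories.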

You first need to dismiss the cookie bar by clicking its accept button, then go on to use load-more-stories as an ID, not a class. I can't test this in R, but something like:

cookie_button  <- ffd$findElement("css selector", '#CybotCookiebotDialogBodyLevelButtonAccept')
cookie_button$clickElement()
load_more_button  <- ffd$findElement("css selector", '#load-more-stories')
load_more_button$clickElement()
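If you need several batches of stories, the same click can be repeated in a loop. The sketch below is untested against the live site and makes a few assumptions: find_or_null and click_load_more are hypothetical helper names, the selector targets the button inside the #load-more-stories div shown in the question, and a fixed pause is enough for the new stories to load. It relies on findElement raising an R error when nothing matches, which is what the error in the question shows.

```r
# Hypothetical helper: return the element, or NULL if it cannot be located.
find_or_null <- function(driver, css) {
  tryCatch(
    driver$findElement(using = "css selector", value = css),
    error = function(e) NULL
  )
}

# Hypothetical helper: click "Load More Stories" up to `times` times.
# Returns the number of clicks actually performed.
click_load_more <- function(driver, times = 3, pause = 2) {
  clicks <- 0
  for (i in seq_len(times)) {
    # Assumption: the clickable element is the <button> inside the id'd <div>.
    btn <- find_or_null(driver, "#load-more-stories button")
    if (is.null(btn)) break      # button gone: nothing more to load
    btn$clickElement()
    clicks <- clicks + 1
    Sys.sleep(pause)             # crude wait for the new stories to render
  }
  clicks
}

# Usage (needs a live RSelenium session in `ffd`; not run here):
# cookie <- find_or_null(ffd, "#CybotCookiebotDialogBodyLevelButtonAccept")
# if (!is.null(cookie)) cookie$clickElement()
# click_load_more(ffd, times = 3)
```

Wrapping the lookup in tryCatch also means the script keeps going when the cookie bar is not shown, for example on a second run in the same browser session.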

References:

  1. https://cran.r-project.org/web/packages/RSelenium/RSelenium.pdf