Submit form with no submit button in rvest

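I am trying to write a crawler to download some information. An earlier answer was useful for creating the filled-in form, but I am struggling to find a way to submit the form when the submit button is not part of the form. Here is an example: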

library(rvest)

session <- html_session("www.chase.com")
form <- html_form(session)[[3]]

filledform <- set_values(form, `user_name` = user_name, `usr_password` = usr_password)
session <- submit_form(session, filledform)

At this point, I get this error:

Error in names(submits)[[1]] : subscript out of bounds

How can I submit this form?

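Here is a dirty hack that works for me. After studying the submit_form source code, I figured I could work around the problem by injecting a fake submit button into my copy of the form, which the submit_form function then picks up. It works, except that it emits a warning that often names an inappropriate input object (though not in the example below). Despite the warning, the code works for me: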

library(rvest)

session <- html_session("www.chase.com")
form <- html_form(session)[[3]]

# Form on home page has no submit button,
# so inject a fake submit button or else rvest cannot submit it.
# When I do this, rvest gives a warning "Submitting with '___'", where "___" is
# often an irrelevant field item.
# This warning might be an rvest (version 0.3.2) bug, but the code works.
fake_submit_button <- list(name = NULL,
                           type = "submit",
                           value = NULL,
                           checked = NULL,
                           disabled = NULL,
                           readonly = NULL,
                           required = FALSE)
attr(fake_submit_button, "class") <- "input"
form[["fields"]][["submit"]] <- fake_submit_button

user_name <- "user"
usr_password <- "password"

filledform <- set_values(form, `user_name` = user_name, `usr_password` = usr_password)
session <- submit_form(session, filledform)

A successful run prints the following warning, which I simply ignore:

> Submitting with 'submit'
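
To see why the fake button helps, here is a minimal base R sketch that needs neither rvest nor network access. It assumes, based on the error message and the reading of the submit_form source described above, that the function picks the name of the first field whose type is "submit". The names mock_form and find_submit_names are hypothetical and only mimic that behaviour; they are not part of rvest.

```r
# Mock of a parsed form: a named list of fields, none of type "submit".
# The field names follow the example above; the structure is an assumption
# for illustration, not rvest's actual form object.
mock_form <- list(
  fields = list(
    user_name    = list(name = "user_name", type = "text", value = NULL),
    usr_password = list(name = "usr_password", type = "password", value = NULL)
  )
)

# Hypothetical helper: names of all fields whose type is "submit".
find_submit_names <- function(form) {
  is_submit <- function(field) identical(tolower(field$type), "submit")
  names(Filter(is_submit, form$fields))
}

# Without a submit field the result is empty, so taking [[1]] raises an error.
no_button <- find_submit_names(mock_form)
first_try <- tryCatch(no_button[[1]], error = function(e) conditionMessage(e))

# Inject the fake submit button, built the same way as in the answer.
fake_submit_button <- structure(
  list(name = NULL, type = "submit", value = NULL, checked = NULL,
       disabled = NULL, readonly = NULL, required = FALSE),
  class = "input"
)
mock_form[["fields"]][["submit"]] <- fake_submit_button

# Now there is a first submit field to pick.
with_button <- find_submit_names(mock_form)

print(first_try)
print(with_button[[1]])
```

The first print shows the error message captured from indexing an empty result, and the second shows the name under which the fake button was stored. Whether a given rvest version behaves exactly like this mock is worth checking against its own source.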