Fetch href from webpage after selecting from combobox

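I am trying to scrape data from "https://beacon.schneidercorp.com/" and need to do the following:

  1. Set "Iowa" in the state combobox and "Adair County, IA" in the County/city/area combobox
  2. Get the Property Search button
  3. Click the Property Search button and go to the next page

Once all of that is done, the browser reaches "https://beacon.schneidercorp.com/Application.aspx?AppID=1034&LayerID=22042&PageTypeID=2&PageID=9328", which is my main goal.

I populated the comboboxes (tagname="option"), but then ran into the following problems:

a. The Property Search button that I want to click to go to the next page does not appear until I actually click an option in the County/city/area select combobox.

This is the routine that populates the comboboxes: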

Sub extraccionCondados2()
   Dim IE As New SHDocVw.InternetExplorer
   Dim htmlDoc As MSHTML.HTMLDocument
   Dim htmlElementos As MSHTML.IHTMLElementCollection
   Dim htmlElemento As MSHTML.IHTMLElement
   
   IE.Visible = True
   IE.navigate "https://beacon.schneidercorp.com/"
    
   Do While IE.readyState <> READYSTATE_COMPLETE
      DoEvents
   Loop
   
   Set htmlDoc = IE.document
   Set htmlElementos = htmlDoc.getElementsByClassName("form-control input-lg")
   htmlElementos(0).Value = "Iowa" 'POPULATES THE STATE COMBOBOX
   htmlElementos(1).Value = "1034" 'POPULATES THE COUNTY/CITY/AREA WITH THE RIGHT VALUE
   htmlElementos(1).Click 'IN THIS CASE THIS LINE DOESN'T DO ANYTHING
   'I'VE TRIED WORKING WITH htmlElementos CHILDREN BUT DIDN'T FIND A WAY TO DO IT
End Sub

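b. The href I am looking for does not appear until Property Search comes into view.

Before Property Search is shown, id="quickstartList" is empty.

After Property Search is shown, id="quickstartList" has new children, including my target URL.

How can I get to the Property Search button or, better yet, read the href that appears in quickstartList?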

You must trigger the change event after each selection from a combobox:

Sub extraccionCondados2()
  Dim IE As New SHDocVw.InternetExplorer
  Dim htmlDoc As MSHTML.htmlDocument
  Dim htmlElementos As MSHTML.IHTMLElementCollection
  Dim htmlElemento As MSHTML.IHTMLElement
  Dim urlFromPropertySearchButton As String

  IE.Visible = True
  IE.navigate "https://beacon.schneidercorp.com/"
  Do While IE.readyState <> 4: DoEvents: Loop

  Set htmlDoc = IE.document
  Set htmlElementos = htmlDoc.getElementsByClassName("form-control input-lg")

  'Select state and trigger html change event of the combobox
  htmlElementos(0).Value = "Iowa"
  Call TriggerEvent(htmlDoc, htmlElementos(0), "change")

  'Select county/city/area and trigger html change event of the combobox
  htmlElementos(1).Value = "1034"
  Call TriggerEvent(htmlDoc, htmlElementos(1), "change")

  'Get property search button
  Set htmlElemento = htmlDoc.getElementsByClassName("list-group-item track-mru")(0)

  'If needed as string read url
  urlFromPropertySearchButton = htmlElemento.href
  'You have the url before clicking the button
  MsgBox urlFromPropertySearchButton

  'If you want to open the page for selection
  htmlElemento.Click
End Sub

This procedure triggers an HTML event:

Private Sub TriggerEvent(htmlDocument As Object, htmlElementWithEvent As Object, eventType As String)

  Dim theEvent As Object

  htmlElementWithEvent.Focus
  Set theEvent = htmlDocument.createEvent("HTMLEvents")
  theEvent.initEvent eventType, True, False
  htmlElementWithEvent.dispatchEvent theEvent
End Sub
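
The change event may cause the page to fill in the Property Search link a moment later, so reading it immediately can fail on a slow connection. Below is a minimal sketch of a polling helper for that case. `WaitForElementsByClass` and its `timeoutSeconds` parameter are names I made up for illustration; they are not part of the answer above, and whether the site actually needs the wait is an assumption.

```vba
' Hypothetical helper: polls the document until at least one element with
' the given class name(s) exists, or until the timeout expires.
' Requires a reference to "Microsoft HTML Object Library".
Private Function WaitForElementsByClass(htmlDoc As MSHTML.HTMLDocument, _
                                        className As String, _
                                        Optional timeoutSeconds As Single = 10) _
                                        As MSHTML.IHTMLElementCollection
  Dim startTime As Single
  Dim elapsed As Single
  Dim found As MSHTML.IHTMLElementCollection

  startTime = Timer
  Do
    DoEvents   ' let the browser process its pending work
    Set found = htmlDoc.getElementsByClassName(className)
    If found.Length > 0 Then
      Set WaitForElementsByClass = found
      Exit Function
    End If

    elapsed = Timer - startTime
    If elapsed < 0 Then elapsed = elapsed + 86400   ' Timer restarts at midnight
  Loop While elapsed < timeoutSeconds

  ' Nothing was found within the timeout
  Set WaitForElementsByClass = Nothing
End Function
```

To use it, call `WaitForElementsByClass(htmlDoc, "list-group-item track-mru")` after the second `TriggerEvent` call, check that the result is not `Nothing`, and then take item `(0)` as the Property Search link.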

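Here are some suggestions for automating web scraping with the MSXML2.ServerXMLHTTP60 object, using the target site as an example.

First, you can reach the page you want from your question like this: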

Sub Example1()

    ' A web request object - must add project reference to "Microsoft XML, v6.0" in Tools > References
    Dim con As New MSXML2.ServerXMLHTTP60

    ' Opens a new synchronous GET request (no hidden info) for the url
    con.Open "GET", "https://beacon.schneidercorp.com/Application.aspx?AppID=1034&PageTypeID=2", False
    ' A GET request carries no body, so nothing is passed to send
    con.send

    MsgBox con.responseText

End Sub

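Note that in the URL I only needed to include AppID=1034 for Adair County and PageTypeID=2 for Property Search (I think PageTypeID 1 is the map). You can get the full list of AppIDs from the home page just by looking at the HTML (I think you already know how to do this). The MsgBox simply shows that the con object has returned the response as an HTML document.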
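
If you want to reuse this for other areas, the URL can be assembled from the two parameters. This is only a sketch: `BuildBeaconUrl` is a hypothetical helper name, and I have only seen the two-parameter URL work for Adair County, so treat it as an assumption that other AppIDs behave the same way.

```vba
' Hypothetical helper: builds the Application.aspx URL from an AppID and a
' PageTypeID. PageTypeID defaults to 2, which is Property Search.
Function BuildBeaconUrl(appId As Long, Optional pageTypeId As Long = 2) As String
  BuildBeaconUrl = "https://beacon.schneidercorp.com/Application.aspx" & _
                   "?AppID=" & CStr(appId) & _
                   "&PageTypeID=" & CStr(pageTypeId)
End Function
```

For example, `BuildBeaconUrl(1034)` gives the same URL that is passed to `con.Open` above.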

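While working on your project, it helps with debugging to be able to look at the response to any request at your leisure. I use the following procedure to save a string as a text file: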

Sub WriteToFile(s As String, n As String)
Dim fso As Object
Set fso = CreateObject("Scripting.FileSystemObject")
Dim oFile As Object
Set oFile = fso.CreateTextFile(n)
oFile.WriteLine s
oFile.Close
Set fso = Nothing
Set oFile = Nothing
End Sub

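So for the code above, I would call that procedure at the end to save the response as a text file, which I can then view as HTML in Notepad++. You can also view the HTML in the F12 developer tools without saving it.

Below I have also included an HTMLDocument object that I put the response into.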

Sub Example2()

    ' A web request object - must add project reference to "Microsoft XML, v6.0" in Tools > References
    Dim con As New MSXML2.ServerXMLHTTP60
    ' An html document to hold responses, used to parse info - add reference to "Microsoft HTML Object Library"
    Dim html As New HTMLDocument

    ' Opens a new synchronous GET request (no hidden info) for the url
    con.Open "GET", "https://beacon.schneidercorp.com/Application.aspx?AppID=1034&PageTypeID=2", False
    ' A GET request carries no body, so nothing is passed to send
    con.send

    WriteToFile con.responseText, "C:\Users\JamHeadArt\Documents\responseText.txt"
    ' responseText is the decoded string; responseBody is a raw byte array
    html.body.innerHTML = con.responseText

End Sub

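Once the HTML document is populated, you can use things like getElementById to help parse the results. The document is a tree of nodes, much like XML, so you can walk the nodes and find things through child/parent relationships.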
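
As a rough sketch of that kind of parsing, the function below looks up a container by id and collects the text of every link inside it. `LinksInsideElement` and the `elementId` argument are hypothetical names for illustration. Keep in mind that content a page adds with JavaScript will not be present in a document filled from a plain HTTP response.

```vba
' Hypothetical helper: returns the text of every <a> element found inside
' the element with the given id, or an empty Collection if the id is missing.
' Requires a reference to "Microsoft HTML Object Library".
Function LinksInsideElement(html As MSHTML.HTMLDocument, elementId As String) As Collection
  Dim result As New Collection
  Dim container As Object   ' late bound so getElementsByTagName is available
  Dim anchors As Object
  Dim i As Long

  Set container = html.getElementById(elementId)
  If Not container Is Nothing Then
    ' Walk from the parent element down to its descendant links
    Set anchors = container.getElementsByTagName("a")
    For i = 0 To anchors.Length - 1
      result.Add anchors.Item(i).innerText
    Next i
  End If

  Set LinksInsideElement = result
End Function
```

Swapping `innerText` for `getAttribute("href")` collects the link targets instead; relative hrefs in an offline document may come back prefixed with `about:`, so check them before use.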


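Using the F12 developer tools

I was able to work all of this out with the F12 developer tools, under the Network tab. Before clicking the search button (or any other button), clear the network traffic; then, when you click Search, you will see a batch of requests. The first one is usually the request you want to look at and essentially imitate (the rest will be JavaScript triggers, CSS, images and general content). Every request has a URL and, if it is a POST request, sometimes a body as well.

Without going into too much detail, you can usually skip a whole series of search steps and pages. Once you know the structure and parameters of the final search, a single call to the website gets the information you need, and the returned information can be parsed straight into Excel. No browser is involved, so it is much faster.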


After selecting Iowa, did you find the HTML in the dropdown list that contains all of the option values?

<optgroup label="Iowa">
    <option value="1034">Adair County,  IA</option>
    <option value="78">Allamakee County, IA</option>
    <option value="165">Ames, IA</option>
    <option value="96">Audubon County, IA</option>
    <option value="83">Benton County, IA</option>
    <option value="84">Boone County, IA</option>
    <option value="330">Bremer County, IA</option>
    <option value="1015">Buena Vista County,  IA</option>
    <option value="215">Cass County, IA</option>
    <option value="408">Cerro Gordo County, IA</option>
    <option value="501">Cherokee County, IA</option>
    <option value="47">Chickasaw County, IA</option>
    <option value="29">City of Ames, IA - Traffic Accident Database</option>
    <option value="933">City of Cascade, IA</option>
    <option value="516">City of Estherville, IA</option>
    <option value="1061">City of Sigourney, IA</option>
    <option value="1043">Clay County,  IA</option>
    <option value="227">Clayton County, IA</option>
    <option value="375">Clinton County, IA</option>
    <option value="909">Dallas County,  IA</option>
    <option value="49">Davis County, IA</option>
    <option value="72">Delaware County, IA</option>
    <option value="376">Dickinson County, IA</option>
    <option value="93">Dubuque County, IA</option>
    <option value="15">Emmet County, IA</option>
    <option value="79">Fayette County, IA</option>
    <option value="82">Floyd County, IA</option>
    <option value="150">Franklin County, IA</option>
    <option value="825">Fremont County,  IA</option>
    <option value="1064">Greene County,  IA</option>
    <option value="3">Grundy County, IA</option>
    <option value="395">Guthrie County, IA</option>
    <option value="140">Hardin County, IA</option>
    <option value="44">Harrison County, IA</option>
    <option value="60">Henry County, IA</option>
    <option value="617">Humboldt County, IA</option>
    <option value="80">Jackson County, IA</option>
    <option value="325">Jasper County, IA</option>
    <option value="1037">Jefferson County,  IA</option>
    <option value="86">Johnson County, IA</option>
    <option value="164">Jones County, IA</option>
    <option value="81">Keokuk County, IA</option>
    <option value="177">Lee County, IA</option>
    <option value="54">Louisa County, IA</option>
    <option value="594">Lyon County, IA</option>
    <option value="406">Madison County, IA</option>
    <option value="25">Mahaska County, IA</option>
    <option value="70">Marion County, IA</option>
    <option value="1026">Marshall County,  IA</option>
    <option value="410">Mason City, IA</option>
    <option value="153">Mills County, IA</option>
    <option value="929">Mitchell County,  IA</option>
    <option value="21">Montgomery County, IA</option>
    <option value="12">Muscatine Area Geographic Information Consortium (MAGIC)</option>
    <option value="331">O'Brien County, IA</option>
    <option value="611">Osceola County, IA</option>
    <option value="220">Page County, IA</option>
    <option value="218">Palo Alto County, IA</option>
    <option value="1012">Plymouth County,  IA</option>
    <option value="144">Pocahontas County, IA</option>
    <option value="135">Poweshiek County, IA</option>
    <option value="508">Ringgold County, IA</option>
    <option value="75">Sac County, IA</option>
    <option value="1024">Scott County / City of Davenport, Iowa</option>
    <option value="11">Shelby County, IA</option>
    <option value="10">Sioux City, IA</option>
    <option value="984">Sioux County,  IA</option>
    <option value="165">Story County, IA / City of Ames</option>
    <option value="225">Union County, IA</option>
    <option value="595">Wapello County, IA</option>
    <option value="9">Warren County, IA</option>
    <option value="1036">Washington County,  IA</option>
    <option value="723">Webster County, IA</option>
    <option value="73">Winnebago County, IA</option>
    <option value="110">Winneshiek County, IA</option>
    <option value="10">Woodbury County, IA / Sioux City</option>
    <option value="588">Worth County, IA</option>
    <option value="399">Wright County, IA</option>
</optgroup>
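
If it helps, here is a minimal sketch for turning that markup into a lookup from area name to option value. `AreaOptions` is a hypothetical name, and it assumes you already have a document containing the option elements (for example IE.document after the state has been selected). The lookup is keyed by the option text rather than the value because some values appear twice in the list (165 and 10).

```vba
' Hypothetical helper: maps each option's visible text to its value.
' Requires a reference to "Microsoft HTML Object Library".
Function AreaOptions(html As MSHTML.HTMLDocument) As Object
  Dim lookup As Object
  Dim options As MSHTML.IHTMLElementCollection
  Dim opt As Object
  Dim i As Long

  Set lookup = CreateObject("Scripting.Dictionary")
  Set options = html.getElementsByTagName("option")

  For i = 0 To options.Length - 1
    Set opt = options.Item(i)
    ' Assigning by key overwrites rather than raising an error on a repeat
    lookup(Trim$(opt.innerText)) = opt.Value
  Next i

  Set AreaOptions = lookup
End Function
```

With the document loaded, `AreaOptions(htmlDoc)("Allamakee County, IA")` would give "78", which you can then use as the AppID.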