从组合框中选择后从网页获取 href
Fetch href from webpage after selecting from combobox
我正在尝试从“https://beacon.schneidercorp.com/”抓取数据并需要实现:
- 在州组合框中设置“爱荷华”,在 County/city/area 组合框中设置“爱荷华州亚代尔县”
- 使用 属性 搜索按钮
- 单击 属性 搜索按钮并转到下一页
完成这一切后,浏览器到达“https://beacon.schneidercorp.com/Application.aspx?AppID=1034&LayerID=22042&PageTypeID=2&PageID=9328”,这是我的主要目标。
我填充了组合框 (tagname="option"),但出现了下一个问题:
一个。 属性 我想点击进入下一页的搜索,在我实际点击 select County/city/area 组合框
上的一个选项之前不会弹出
这是填充组合框的例程
Sub extraccionCondados2()
Dim IE As New SHDocVw.InternetExplorer
Dim htmlDoc As MSHTML.HTMLDocument
Dim htmlElementos As MSHTML.IHTMLElementCollection
Dim htmlElemento As MSHTML.IHTMLElement
IE.Visible = True
IE.navigate "https://beacon.schneidercorp.com/"
Do While IE.readyState <> READYSTATE_COMPLETE
DoEvents
Loop
Set htmlDoc = IE.document
Set htmlElementos = htmlDoc.getElementsByClassName("form-control input-lg")
htmlElementos(0).Value = "Iowa" 'POPULATES THE STATE COMBOBOX
htmlElementos(1).Value = "1034" 'POPULATES THE COUNTY/CITY/AREA WITH THE RIGHT VALUE
htmlElementos(1).Click 'IN THIS CASE THIS LINE DOESN'T DO ANYTHING
'I'VE TRIED WORKING WITH htmlElementos CHILDREN BUT DIDN'T FIND A WAY TO DO IT
End Sub
b。在 属性 搜索进入视图
之前,我正在寻找的 href 不会出现
显示 属性 搜索之前 id="quickstartList" 为空
显示 属性 搜索后,id="quickstartList" 得到了新的 children 并且有我的目标 URL
如何使用 属性 搜索按钮,或者更好地获取第二张图片上的 href?
您必须在每次从组合框中选择后触发更改事件:
Sub extraccionCondados2()
Dim IE As New SHDocVw.InternetExplorer
Dim htmlDoc As MSHTML.htmlDocument
Dim htmlElementos As MSHTML.IHTMLElementCollection
Dim htmlElemento As MSHTML.IHTMLElement
Dim urlFromPropertySearchButton As String
IE.Visible = True
IE.navigate "https://beacon.schneidercorp.com/"
Do While IE.readyState <> 4: DoEvents: Loop
Set htmlDoc = IE.document
Set htmlElementos = htmlDoc.getElementsByClassName("form-control input-lg")
'Select state and trigger html change event of the combobox
htmlElementos(0).Value = "Iowa"
Call TriggerEvent(htmlDoc, htmlElementos(0), "change")
'Select country/city/area and trigger html change event of the combobox
htmlElementos(1).Value = "1034"
Call TriggerEvent(htmlDoc, htmlElementos(1), "change")
'Get property search button
Set htmlElemento = htmlDoc.getElementsByClassName("list-group-item track-mru")(0)
'If needed as string read url
urlFromPropertySearchButton = htmlElemento.href
'You have the url before clicking the button
MsgBox urlFromPropertySearchButton
'If you want to open the page for selection
htmlElemento.Click
End Sub
此程序触发一个 html 事件:
Private Sub TriggerEvent(htmlDocument As Object, htmlElementWithEvent As Object, eventType As String)
Dim theEvent As Object
htmlElementWithEvent.Focus
Set theEvent = htmlDocument.createEvent("HTMLEvents")
theEvent.initEvent eventType, True, False
htmlElementWithEvent.dispatchEvent theEvent
End Sub
以目标网站为例,使用 MSXML2.ServerHTTP 对象自动抓取网页的一些建议。
首先,您可以这样进入问题中您想要的页面:
Sub Example1()
Dim con As New MSXML2.ServerXMLHTTP60 ' A web request object - must add project reference to "Microsoft XML, V6.0" in Tools > References
' Opens a new GET request (no hidden info) for the url
con.Open "GET", "https://beacon.schneidercorp.com/Application.aspx?AppID=1034&PageTypeID=2"
con.setRequestHeader "Content-type", "application/x-www-form-urlencoded" ' set a standard content-type for the request
con.send searchBody ' Send the request
MsgBox con.responseText
End Sub
注意 URL 我只需要包括 AppID=1034
用于亚代尔县和 PageTypeID=2
用于 属性 搜索(我认为 pagetypeId 1 是地图)。您只需查看 HTML 即可从主页获得完整的 AppID 列表(我想您已经知道如何执行此操作了)。 MsgBox 仅显示 con
对象已将响应 return 编辑为 html 文档。
在处理您的项目并帮助调试和查看 html 时,如果您想在闲暇时查看请求的任何响应,我使用以下函数将字符串保存为文本文件:
Sub WriteToFile(s As String, n As String)
Dim fso As Object
Set fso = CreateObject("Scripting.FileSystemObject")
Dim oFile As Object
Set oFile = fso.CreateTextFile(n)
oFile.WriteLine s
oFile.Close
Set fso = Nothing
Set oFile = Nothing
End Sub
所以对于上面的代码,我会在最后调用该函数以将我的响应保存为文本文件,我可以使用记事本++将其查看为 HTML。您也可以在 F12 开发工具中查看 html 而无需保存它。
我还在下面包含了一个 HTMLdocument
对象,我将响应放入其中。
Sub Example2()
Dim con As New MSXML2.ServerXMLHTTP60 ' A web request object - must add project reference to "Microsoft XML, V6.0" in Tools > References
Dim html As New HTMLDocument ' An html document to hold responses, used to parse info - add reference to "Microsoft HTML Object Library"
' Opens a new GET request (no hidden info) for the url
con.Open "GET", "https://beacon.schneidercorp.com/Application.aspx?AppID=1034&PageTypeID=2"
con.setRequestHeader "Content-type", "application/x-www-form-urlencoded" ' set a standard content-type for the request
con.send searchBody ' Send the request
WriteToFile con.responseText, "C:\Users\JamHeadArt\Documents\responseText.txt"
html.body.innerhtml = con.responseBody
End Sub
填充 html
文档后,您可以使用 getElementByID
之类的东西来帮助解析结果等。这只是 XML 的另一种形式,因此您可以遍历节点并找到东西通过 child/parent 关系等
使用 F12 开发工具
我可以使用网络下的 F12 开发人员工具弄清楚这些东西。在单击搜索按钮或其他任何按钮之前,只需清除网络流量,然后当您单击搜索时,您会看到一堆请求。第一个通常是您想要查看并基本上模仿的请求(其余请求将是 javascript 触发、css、图像和一般内容)。任何请求都有一个 URL,如果是 post 请求,有时还有一个 BODY。
无需深入了解太多细节,您通常可以跳过一大堆搜索步骤和页面,并通过了解最终搜索的结构和参数来获取所需的信息,只需调用一次网站即可, return 信息直接解析为 Excel。没用浏览器,快多了。
选择 Iowa 后,您是否在包含所有选项值的 html 下拉列表中找到 html?
<optgroup label="Iowa">
<option value="1034">Adair County, IA</option>
<option value="78">Allamakee County, IA</option>
<option value="165">Ames, IA</option>
<option value="96">Audubon County, IA</option>
<option value="83">Benton County, IA</option>
<option value="84">Boone County, IA</option>
<option value="330">Bremer County, IA</option>
<option value="1015">Buena Vista County, IA</option>
<option value="215">Cass County, IA</option>
<option value="408">Cerro Gordo County, IA</option>
<option value="501">Cherokee County, IA</option>
<option value="47">Chickasaw County, IA</option>
<option value="29">City of Ames, IA - Traffic Accident Database</option>
<option value="933">City of Cascade, IA</option>
<option value="516">City of Estherville, IA</option>
<option value="1061">City of Sigourney, IA</option>
<option value="1043">Clay County, IA</option>
<option value="227">Clayton County, IA</option>
<option value="375">Clinton County, IA</option>
<option value="909">Dallas County, IA</option>
<option value="49">Davis County, IA</option>
<option value="72">Delaware County, IA</option>
<option value="376">Dickinson County, IA</option>
<option value="93">Dubuque County, IA</option>
<option value="15">Emmet County, IA</option>
<option value="79">Fayette County, IA</option>
<option value="82">Floyd County, IA</option>
<option value="150">Franklin County, IA</option>
<option value="825">Fremont County, IA</option>
<option value="1064">Greene County, IA</option>
<option value="3">Grundy County, IA</option>
<option value="395">Guthrie County, IA</option>
<option value="140">Hardin County, IA</option>
<option value="44">Harrison County, IA</option>
<option value="60">Henry County, IA</option>
<option value="617">Humboldt County, IA</option>
<option value="80">Jackson County, IA</option>
<option value="325">Jasper County, IA</option>
<option value="1037">Jefferson County, IA</option>
<option value="86">Johnson County, IA</option>
<option value="164">Jones County, IA</option>
<option value="81">Keokuk County, IA</option>
<option value="177">Lee County, IA</option>
<option value="54">Louisa County, IA</option>
<option value="594">Lyon County, IA</option>
<option value="406">Madison County, IA</option>
<option value="25">Mahaska County, IA</option>
<option value="70">Marion County, IA</option>
<option value="1026">Marshall County, IA</option>
<option value="410">Mason City, IA</option>
<option value="153">Mills County, IA</option>
<option value="929">Mitchell County, IA</option>
<option value="21">Montgomery County, IA</option>
<option value="12">Muscatine Area Geographic Information Consortium (MAGIC)</option>
<option value="331">O'Brien County, IA</option>
<option value="611">Osceola County, IA</option>
<option value="220">Page County, IA</option>
<option value="218">Palo Alto County, IA</option>
<option value="1012">Plymouth County, IA</option>
<option value="144">Pocahontas County, IA</option>
<option value="135">Poweshiek County, IA</option>
<option value="508">Ringgold County, IA</option>
<option value="75">Sac County, IA</option>
<option value="1024">Scott County / City of Davenport, Iowa</option>
<option value="11">Shelby County, IA</option>
<option value="10">Sioux City, IA</option>
<option value="984">Sioux County, IA</option>
<option value="165">Story County, IA / City of Ames</option>
<option value="225">Union County, IA</option>
<option value="595">Wapello County, IA</option>
<option value="9">Warren County, IA</option>
<option value="1036">Washington County, IA</option>
<option value="723">Webster County, IA</option>
<option value="73">Winnebago County, IA</option>
<option value="110">Winneshiek County, IA</option>
<option value="10">Woodbury County, IA / Sioux City</option>
<option value="588">Worth County, IA</option>
<option value="399">Wright County, IA</option>
</optgroup>
我正在尝试从“https://beacon.schneidercorp.com/”抓取数据并需要实现:
- 在州组合框中设置“爱荷华”,在 County/city/area 组合框中设置“爱荷华州亚代尔县”
- 使用 属性 搜索按钮
- 单击 属性 搜索按钮并转到下一页
完成这一切后,浏览器到达“https://beacon.schneidercorp.com/Application.aspx?AppID=1034&LayerID=22042&PageTypeID=2&PageID=9328”,这是我的主要目标。
我填充了组合框 (tagname="option"),但出现了下一个问题:
一个。 属性 我想点击进入下一页的搜索,在我实际点击 select County/city/area 组合框
上的一个选项之前不会弹出这是填充组合框的例程
Sub extraccionCondados2()
Dim IE As New SHDocVw.InternetExplorer
Dim htmlDoc As MSHTML.HTMLDocument
Dim htmlElementos As MSHTML.IHTMLElementCollection
Dim htmlElemento As MSHTML.IHTMLElement
IE.Visible = True
IE.navigate "https://beacon.schneidercorp.com/"
Do While IE.readyState <> READYSTATE_COMPLETE
DoEvents
Loop
Set htmlDoc = IE.document
Set htmlElementos = htmlDoc.getElementsByClassName("form-control input-lg")
htmlElementos(0).Value = "Iowa" 'POPULATES THE STATE COMBOBOX
htmlElementos(1).Value = "1034" 'POPULATES THE COUNTY/CITY/AREA WITH THE RIGHT VALUE
htmlElementos(1).Click 'IN THIS CASE THIS LINE DOESN'T DO ANYTHING
'I'VE TRIED WORKING WITH htmlElementos CHILDREN BUT DIDN'T FIND A WAY TO DO IT
End Sub
b。在 属性 搜索进入视图
之前,我正在寻找的 href 不会出现显示 属性 搜索之前 id="quickstartList" 为空
显示 属性 搜索后,id="quickstartList" 得到了新的 children 并且有我的目标 URL
如何使用 属性 搜索按钮,或者更好地获取第二张图片上的 href?
您必须在每次从组合框中选择后触发更改事件:
Sub extraccionCondados2()
Dim IE As New SHDocVw.InternetExplorer
Dim htmlDoc As MSHTML.htmlDocument
Dim htmlElementos As MSHTML.IHTMLElementCollection
Dim htmlElemento As MSHTML.IHTMLElement
Dim urlFromPropertySearchButton As String
IE.Visible = True
IE.navigate "https://beacon.schneidercorp.com/"
Do While IE.readyState <> 4: DoEvents: Loop
Set htmlDoc = IE.document
Set htmlElementos = htmlDoc.getElementsByClassName("form-control input-lg")
'Select state and trigger html change event of the combobox
htmlElementos(0).Value = "Iowa"
Call TriggerEvent(htmlDoc, htmlElementos(0), "change")
'Select country/city/area and trigger html change event of the combobox
htmlElementos(1).Value = "1034"
Call TriggerEvent(htmlDoc, htmlElementos(1), "change")
'Get property search button
Set htmlElemento = htmlDoc.getElementsByClassName("list-group-item track-mru")(0)
'If needed as string read url
urlFromPropertySearchButton = htmlElemento.href
'You have the url before clicking the button
MsgBox urlFromPropertySearchButton
'If you want to open the page for selection
htmlElemento.Click
End Sub
此程序触发一个 html 事件:
Private Sub TriggerEvent(htmlDocument As Object, htmlElementWithEvent As Object, eventType As String)
Dim theEvent As Object
htmlElementWithEvent.Focus
Set theEvent = htmlDocument.createEvent("HTMLEvents")
theEvent.initEvent eventType, True, False
htmlElementWithEvent.dispatchEvent theEvent
End Sub
以目标网站为例,使用 MSXML2.ServerHTTP 对象自动抓取网页的一些建议。
首先,您可以这样进入问题中您想要的页面:
Sub Example1()
Dim con As New MSXML2.ServerXMLHTTP60 ' A web request object - must add project reference to "Microsoft XML, V6.0" in Tools > References
' Opens a new GET request (no hidden info) for the url
con.Open "GET", "https://beacon.schneidercorp.com/Application.aspx?AppID=1034&PageTypeID=2"
con.setRequestHeader "Content-type", "application/x-www-form-urlencoded" ' set a standard content-type for the request
con.send searchBody ' Send the request
MsgBox con.responseText
End Sub
注意 URL 我只需要包括 AppID=1034
用于亚代尔县和 PageTypeID=2
用于 属性 搜索(我认为 pagetypeId 1 是地图)。您只需查看 HTML 即可从主页获得完整的 AppID 列表(我想您已经知道如何执行此操作了)。 MsgBox 仅显示 con
对象已将响应 return 编辑为 html 文档。
在处理您的项目并帮助调试和查看 html 时,如果您想在闲暇时查看请求的任何响应,我使用以下函数将字符串保存为文本文件:
Sub WriteToFile(s As String, n As String)
Dim fso As Object
Set fso = CreateObject("Scripting.FileSystemObject")
Dim oFile As Object
Set oFile = fso.CreateTextFile(n)
oFile.WriteLine s
oFile.Close
Set fso = Nothing
Set oFile = Nothing
End Sub
所以对于上面的代码,我会在最后调用该函数以将我的响应保存为文本文件,我可以使用记事本++将其查看为 HTML。您也可以在 F12 开发工具中查看 html 而无需保存它。
我还在下面包含了一个 HTMLdocument
对象,我将响应放入其中。
Sub Example2()
Dim con As New MSXML2.ServerXMLHTTP60 ' A web request object - must add project reference to "Microsoft XML, V6.0" in Tools > References
Dim html As New HTMLDocument ' An html document to hold responses, used to parse info - add reference to "Microsoft HTML Object Library"
' Opens a new GET request (no hidden info) for the url
con.Open "GET", "https://beacon.schneidercorp.com/Application.aspx?AppID=1034&PageTypeID=2"
con.setRequestHeader "Content-type", "application/x-www-form-urlencoded" ' set a standard content-type for the request
con.send searchBody ' Send the request
WriteToFile con.responseText, "C:\Users\JamHeadArt\Documents\responseText.txt"
html.body.innerhtml = con.responseBody
End Sub
填充 html
文档后,您可以使用 getElementByID
之类的东西来帮助解析结果等。这只是 XML 的另一种形式,因此您可以遍历节点并找到东西通过 child/parent 关系等
使用 F12 开发工具
我可以使用网络下的 F12 开发人员工具弄清楚这些东西。在单击搜索按钮或其他任何按钮之前,只需清除网络流量,然后当您单击搜索时,您会看到一堆请求。第一个通常是您想要查看并基本上模仿的请求(其余请求将是 javascript 触发、css、图像和一般内容)。任何请求都有一个 URL,如果是 post 请求,有时还有一个 BODY。
无需深入了解太多细节,您通常可以跳过一大堆搜索步骤和页面,并通过了解最终搜索的结构和参数来获取所需的信息,只需调用一次网站即可, return 信息直接解析为 Excel。没用浏览器,快多了。
选择 Iowa 后,您是否在包含所有选项值的 html 下拉列表中找到 html?
<optgroup label="Iowa">
<option value="1034">Adair County, IA</option>
<option value="78">Allamakee County, IA</option>
<option value="165">Ames, IA</option>
<option value="96">Audubon County, IA</option>
<option value="83">Benton County, IA</option>
<option value="84">Boone County, IA</option>
<option value="330">Bremer County, IA</option>
<option value="1015">Buena Vista County, IA</option>
<option value="215">Cass County, IA</option>
<option value="408">Cerro Gordo County, IA</option>
<option value="501">Cherokee County, IA</option>
<option value="47">Chickasaw County, IA</option>
<option value="29">City of Ames, IA - Traffic Accident Database</option>
<option value="933">City of Cascade, IA</option>
<option value="516">City of Estherville, IA</option>
<option value="1061">City of Sigourney, IA</option>
<option value="1043">Clay County, IA</option>
<option value="227">Clayton County, IA</option>
<option value="375">Clinton County, IA</option>
<option value="909">Dallas County, IA</option>
<option value="49">Davis County, IA</option>
<option value="72">Delaware County, IA</option>
<option value="376">Dickinson County, IA</option>
<option value="93">Dubuque County, IA</option>
<option value="15">Emmet County, IA</option>
<option value="79">Fayette County, IA</option>
<option value="82">Floyd County, IA</option>
<option value="150">Franklin County, IA</option>
<option value="825">Fremont County, IA</option>
<option value="1064">Greene County, IA</option>
<option value="3">Grundy County, IA</option>
<option value="395">Guthrie County, IA</option>
<option value="140">Hardin County, IA</option>
<option value="44">Harrison County, IA</option>
<option value="60">Henry County, IA</option>
<option value="617">Humboldt County, IA</option>
<option value="80">Jackson County, IA</option>
<option value="325">Jasper County, IA</option>
<option value="1037">Jefferson County, IA</option>
<option value="86">Johnson County, IA</option>
<option value="164">Jones County, IA</option>
<option value="81">Keokuk County, IA</option>
<option value="177">Lee County, IA</option>
<option value="54">Louisa County, IA</option>
<option value="594">Lyon County, IA</option>
<option value="406">Madison County, IA</option>
<option value="25">Mahaska County, IA</option>
<option value="70">Marion County, IA</option>
<option value="1026">Marshall County, IA</option>
<option value="410">Mason City, IA</option>
<option value="153">Mills County, IA</option>
<option value="929">Mitchell County, IA</option>
<option value="21">Montgomery County, IA</option>
<option value="12">Muscatine Area Geographic Information Consortium (MAGIC)</option>
<option value="331">O'Brien County, IA</option>
<option value="611">Osceola County, IA</option>
<option value="220">Page County, IA</option>
<option value="218">Palo Alto County, IA</option>
<option value="1012">Plymouth County, IA</option>
<option value="144">Pocahontas County, IA</option>
<option value="135">Poweshiek County, IA</option>
<option value="508">Ringgold County, IA</option>
<option value="75">Sac County, IA</option>
<option value="1024">Scott County / City of Davenport, Iowa</option>
<option value="11">Shelby County, IA</option>
<option value="10">Sioux City, IA</option>
<option value="984">Sioux County, IA</option>
<option value="165">Story County, IA / City of Ames</option>
<option value="225">Union County, IA</option>
<option value="595">Wapello County, IA</option>
<option value="9">Warren County, IA</option>
<option value="1036">Washington County, IA</option>
<option value="723">Webster County, IA</option>
<option value="73">Winnebago County, IA</option>
<option value="110">Winneshiek County, IA</option>
<option value="10">Woodbury County, IA / Sioux City</option>
<option value="588">Worth County, IA</option>
<option value="399">Wright County, IA</option>
</optgroup>