Trying to scrape data from a website using Python and BS4

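I am trying to import data from the URL given in the code below. When I run the code, I do not get any of the information I want (such as the plan name and rates); I only get the container div tags, not their content. I also tried response.text, but it gave me the same result. I do not want to use Selenium. Is there a way to solve this?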

from urllib.request import urlopen

import requests
from bs4 import BeautifulSoup

URL = "https://www.energymadeeasy.gov.au/plan?id=POW15475MBE3&postcode=2000"

# Attempt 1: fetch the page with urllib and parse it
response = urlopen(URL)
html_content = BeautifulSoup(response, "lxml")
print(html_content)

# Attempt 2: fetch the page with requests and parse it
soup = BeautifulSoup(requests.get(URL).text, "lxml")
print(soup)

I tried the following to extract the header:
h1=html_content.find("div", {"class":"header-left"})
print(h1)

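The website makes AJAX calls to load its data, so the plan details are not in the HTML that the server returns for the page itself.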
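
To illustrate why the `header-left` lookup in the question comes back without content, here is a minimal sketch. The HTML string is a hypothetical stand-in for what a JavaScript-driven page might return before its scripts run; it is not the actual markup of the site.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the HTML a server returns before any JavaScript runs.
STATIC_HTML = """
<html><body>
  <div class="header-left"></div>
  <script src="/app.js"></script>
</body></html>
"""

soup = BeautifulSoup(STATIC_HTML, "html.parser")
header = soup.find("div", {"class": "header-left"})

# The container is found, but it holds no text: nothing has filled it in yet.
header_text = header.get_text(strip=True)
print(repr(header_text))
```

BeautifulSoup only parses the text it is given and never executes scripts, so anything a page loads later through AJAX has to be requested separately.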

Two XHR calls are made to load the data. The one below is probably the one you are looking for.

import requests, json
res = requests.get("https://api.energymadeeasy.gov.au/plans/dpids/POW15475MBE3")
with open("data.json", "w") as f:
    json.dump(res.json(), f)

This saves the JSON response to a file.
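
The endpoint above ends in the same ID that appears in the `id` query parameter of the page URL from the question. Assuming that pattern holds for other plans (which is not confirmed here), the API URL could be derived from a page URL roughly like this. `plan_api_url` and `API_BASE` are hypothetical names introduced for illustration.

```python
from urllib.parse import parse_qs, urlparse

# Base of the endpoint used in the request above.
API_BASE = "https://api.energymadeeasy.gov.au/plans/dpids/"


def plan_api_url(page_url: str) -> str:
    """Build the API URL from the `id` query parameter of a plan page URL."""
    query = parse_qs(urlparse(page_url).query)
    plan_id = query["id"][0]  # raises KeyError if the URL has no `id` parameter
    return API_BASE + plan_id


page_url = "https://www.energymadeeasy.gov.au/plan?id=POW15475MBE3&postcode=2000"
api_url = plan_api_url(page_url)
print(api_url)
```

The result can then be passed to `requests.get` exactly as in the snippet above.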

Sample data from the file:

[{"planData": {"planType": "M", "tariffType": "TOU", "contract": [{"pricingModel": "TOU", "benefitPeriod": "1 year", "coolingOffDays": 10, "solarFit": [{"type": "R", "description": "Powerdirect Retailer Feed-in Tariff (exc. GST if any)", "rate": 9.5}], "additionalFeeInformation": "Additional fees and charges may apply. Please see the Powerdirect fee schedules at powerdirect.com.au/fees", "fee": [{"description": "Fee may be charged when reconnecting or reading your meter when you move into a property or change retailer. Includes GST. Fees may vary.", "amount": 12.55, "feeType": "ConnF", "feeTerm": "F"}, {"description": "Fee may be charged when reconnecting in other circumstances, such as after disconnection for non-payment. Includes GST. Fees may vary.", "amount": 12.55, "feeType": "RecoF", "feeTerm": "F"}, {"description": "Fee may be charged when disconnecting or reading your meter when you move out of a property or change retailer. Includes GST. Fees may vary.", "amount": 12.55, "feeType": "DiscoFMO", "feeTerm": "F"}, {"description": 
...
...
...
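
Once the JSON is available, the fields can be read with ordinary dictionary and list access. The sketch below uses a trimmed copy of the sample above, keeping only keys that are visible in that excerpt; the rest of the response is not shown here, so `summarize` (a hypothetical helper) uses `.get` throughout and may need adjusting for the full data.

```python
import json

# Trimmed copy of the sample above; only keys visible in that excerpt are used.
sample = json.loads("""
[{"planData": {"planType": "M", "tariffType": "TOU", "contract": [{
    "pricingModel": "TOU", "benefitPeriod": "1 year", "coolingOffDays": 10,
    "solarFit": [{"type": "R", "rate": 9.5}],
    "fee": [
        {"amount": 12.55, "feeType": "ConnF", "feeTerm": "F"},
        {"amount": 12.55, "feeType": "RecoF", "feeTerm": "F"}
    ]}]}}]
""")


def summarize(plans):
    """Collect one row per contract with its tariff type, solar rates and fees."""
    rows = []
    for plan in plans:
        data = plan.get("planData", {})
        for contract in data.get("contract", []):
            rows.append({
                "tariffType": data.get("tariffType"),
                "benefitPeriod": contract.get("benefitPeriod"),
                "solarRates": [s.get("rate") for s in contract.get("solarFit", [])],
                "fees": {f.get("feeType"): f.get("amount")
                         for f in contract.get("fee", [])},
            })
    return rows


summary = summarize(sample)
print(summary)
```

To run this against the saved file instead of the inline sample, load it with `json.load(open("data.json"))` and pass the result to `summarize`.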