Web scraping with python, javascript output

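I want to scrape job listings from this website and have been stuck for a few days. When I print soup.text, I get a short piece of JavaScript, which is not what I want; I want the HTML elements. I have seen similar solutions that suggest headless browsing, but when I tried that approach I only got several errors. I am new to web scraping and have gone through various tutorials and videos, but I still cannot get the output I want and do not know what I am doing wrong.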

import requests
from bs4 import BeautifulSoup



def aSwiftScraper():

    jobLinks = []
    pages = []
    URL = "https://www.amiqus.com/jobs?options=,20993,20877,20876&page=1"
    page = requests.get(URL)
    soup = BeautifulSoup(page.content, "html.parser")
    print(soup.text)


aSwiftScraper()

Try changing the User-Agent HTTP header when making the request to the server:

import requests
from bs4 import BeautifulSoup

headers = {
    "User-Agent": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
}

url = "https://www.amiqus.com/jobs?options=,20993,20877,20876&page=1"

soup = BeautifulSoup(requests.get(url, headers=headers).content, "html.parser")
for title in soup.select(".attrax-vacancy-tile__title"):
    print(title.get_text(strip=True))

Prints:

Engine Programmer C++ AAA opportunity - Remote working
Senior Programmer
Gameplay Programmer

...
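
If you want to check the extraction logic without hitting the site on every run, here is a rough standard-library sketch of the same idea. It swaps in `urllib.request` and `html.parser` for requests and BeautifulSoup, so it is not the code above, just an approximation of it. The names `TitleCollector`, `extract_titles`, `build_request` and `fetch_titles` are made up for illustration, and it assumes the site keeps using the `attrax-vacancy-tile__title` class and keeps serving full HTML to this User-Agent, which may change.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen

# Same crawler-style User-Agent string as in the answer above.
USER_AGENT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
# Assumption: job titles still carry this class on the live page.
TITLE_CLASS = "attrax-vacancy-tile__title"
# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TitleCollector(HTMLParser):
    """Collects the text of every element whose class list contains css_class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.titles = []
        self._depth = 0    # > 0 while inside a matching element
        self._chunks = []  # text fragments of the current match

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._depth:
            self._depth += 1
        elif self.css_class in (dict(attrs).get("class") or "").split():
            self._depth = 1
            self._chunks = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            # Collapse runs of whitespace, similar in spirit to get_text(strip=True).
            self.titles.append(" ".join("".join(self._chunks).split()))

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def extract_titles(html, css_class=TITLE_CLASS):
    """Return the text of all elements with css_class found in an HTML string."""
    parser = TitleCollector(css_class)
    parser.feed(html)
    parser.close()
    return parser.titles


def build_request(url, user_agent=USER_AGENT):
    """Build (but do not send) a request that carries a custom User-Agent."""
    return Request(url, headers={"User-Agent": user_agent})


def fetch_titles(url):
    """Send the request and extract titles; needs network access."""
    with urlopen(build_request(url), timeout=30) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    return extract_titles(html)
```

An empty list from `extract_titles` on a fetched page is a quick sign that the server returned the JavaScript stub rather than the job listing markup, which is the symptom described in the question.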